The .poly file

The file S3/BY_RM_gxcomp.poly lists the sequence polymorphisms detected by the program from the alignments. There are three kinds of sequence polymorphisms: SNP, insertion (relative to the reference sequence) and deletion (relative to the reference sequence). The third column of the file holds the kind of polymorphism, so we can count how many SNPs (snp), insertions (ins) and deletions (del) are present in our alignment as follows:
$cut -d ' ' -f 3 S3/BY_RM_gxcomp.poly | sort | uniq -c
   3252 del
   2973 ins
  47817 snp
So here we have 47,817 SNPs, 2,973 insertions (of at least 1 bp) and 3,252 deletions (of at least 1 bp). If we now look at how each polymorphism is described in the file S3/BY_RM_gxcomp.poly, we see something like this:

$grep -w snp S3/BY_RM_gxcomp.poly | head -1
cis chrVIII snp 12719 T supercontig_1.12 2874 A 1
where the columns indicate
  1. the nature of the alignment (see above),
  2. the name of the reference sequence,
  3. the kind of polymorphism (here a SNP),
  4. the position of the SNP in the reference sequence,
  5. the allele of the SNP in the reference sequence,
  6. the name of the query sequence,
  7. the position of the SNP in the query sequence,
  8. the allele of the SNP in the query sequence,
  9. the length in bp of the polymorphism (always 1 for a SNP).
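The column layout above can be turned into a small filter. The following is a minimal sketch, assuming the columns are separated by whitespace as in the example line; the function name poly_snps is hypothetical and is not part of the program:

```shell
# Hypothetical helper (not provided by the program): print every SNP of a
# .poly file as "reference:position ref_allele>query_allele query:position".
# Assumes whitespace-separated columns in the order listed above.
poly_snps() {
    awk '$3 == "snp" {
        # $2 = reference name, $4 = reference position, $5 = reference allele
        # $6 = query name,     $7 = query position,     $8 = query allele
        printf "%s:%s %s>%s %s:%s\n", $2, $4, $5, $8, $6, $7
    }' "$1"
}

# Usage: poly_snps S3/BY_RM_gxcomp.poly | head -3
```

For the SNP shown above, this would print chrVIII:12719 T>A supercontig_1.12:2874.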
For an insertion we have
$grep -w ins S3/BY_RM_gxcomp.poly | head -1
cis chrVIII ins 14780 14780 supercontig_1.12 4934 - 1
where the fifth column, which held the reference allele for a SNP, now gives the ending position of the insertion in the reference sequence; the eighth column, which held the query allele for a SNP, is now empty (-); and the last column gives the length of the insertion (here a single-base insertion). Finally, for a deletion we have:
$grep -w del S3/BY_RM_gxcomp.poly | head -1
cis chrVIII del 17132 - supercontig_1.12 7286 7286 1
where the fifth column is now empty (-), the eighth column gives the ending position of the deletion in the query sequence, and the last column gives the length of the deletion (here a single-base deletion).
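
Since the ninth column holds the length in bp for all three kinds of polymorphism, the counts can be complemented with the total number of base pairs involved per kind. The following is a minimal sketch under the same whitespace-separated layout assumption; the function name poly_length_summary is hypothetical and is not part of the program:

```shell
# Hypothetical helper (not provided by the program): for each kind of
# polymorphism, print "kind count total_bp", where total_bp is the sum of
# the ninth column (length in bp) over all lines of that kind.
poly_length_summary() {
    awk '{
        n[$3]++         # number of polymorphisms of this kind
        bp[$3] += $9    # cumulative length in bp for this kind
    }
    END {
        for (k in n) printf "%s %d %d\n", k, n[k], bp[k]
    }' "$1" | sort      # awk array order is unspecified, so sort by kind
}

# Usage: poly_length_summary S3/BY_RM_gxcomp.poly
```

For SNPs the two numbers are equal, since the length of a SNP is always 1; for insertions and deletions the total is at least the count.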

Jean-Baptiste Veyrieras 2010-05-28