$ cut -d ' ' -f 3 S3/BY_RM_gxcomp.poly | sort | uniq -c
   3252 del
   2973 ins
  47817 snp

So here we have 47,817 SNPs, 2,973 insertions (of at least 1 bp) and 3,252 deletions (of at least 1 bp). If we now look at how each polymorphism is described in the file S3/BY_RM_gxcomp.poly, we see something like this:
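$ grep -w snp S3/BY_RM_gxcomp.poly | head -1
cis chrVIII snp 12719 T supercontig_1.12 2874 A 1

where, among the columns, the third indicates the type of polymorphism, the fifth the allele of the SNP in the reference sequence, the eighth the allele of the SNP in the query sequence, and the last the length of the polymorphism.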
For an insertion we have:

$ grep -w ins S3/BY_RM_gxcomp.poly | head -1
cis chrVIII ins 14780 14780 supercontig_1.12 4934 - 1

where the fifth column, which for a SNP holds the allele in the reference sequence, now indicates the ending position of the insertion in the reference sequence; the eighth column, which for a SNP holds the allele in the query sequence, is now empty (-); and the last column indicates the length of the insertion (here, a single-base insertion). Finally, for a deletion we have:
$ grep -w del S3/BY_RM_gxcomp.poly | head -1
cis chrVIII del 17132 - supercontig_1.12 7286 7286 1

where the fifth column is now empty (-), the eighth column indicates the ending position of the deletion in the query sequence, and the last column indicates the length of the deletion (here, a single-base deletion).
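The column layout described above lends itself to simple per-type summaries. The following is only a minimal sketch, assuming that the third column holds the polymorphism type and the last column its length, as in the three example lines. The function name poly_length_by_type and the temporary demo file are hypothetical and not part of the original pipeline; the demo reuses the three lines quoted above rather than the real S3/BY_RM_gxcomp.poly file.

```shell
# Hypothetical helper: sum the last column (length) per polymorphism type
# (third column). Assumes space-separated fields as in the examples above.
poly_length_by_type() {
    awk '{ total[$3] += $NF } END { for (t in total) print t, total[t] }' "$1" | sort
}

# Demo on the three example lines quoted in the text (not the real file).
demo=$(mktemp)
cat > "$demo" <<'EOF'
cis chrVIII snp 12719 T supercontig_1.12 2874 A 1
cis chrVIII ins 14780 14780 supercontig_1.12 4934 - 1
cis chrVIII del 17132 - supercontig_1.12 7286 7286 1
EOF

poly_length_by_type "$demo"
rm -f "$demo"
```

On the real file, one would call poly_length_by_type S3/BY_RM_gxcomp.poly to get the total number of inserted and deleted bases alongside the SNP count.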
Jean-Baptiste Veyrieras 2010-05-28