Extracting probe sequences in fasta format

First, go into the Affymetrix directory in Data and generate the fasta file corresponding to the .1lq file provided by Affymetrix:

$NM1lq2fsa -i S.cerevisiae_tiling.1lq --reverse 1>S.cerevisiae_tiling.fasta

This generally takes a few minutes, so be patient. Note that the --reverse option is required here because the probe sequences in the Affymetrix file are stored in reverse order. If you then look at the fasta file generated by NM1lq2fsa, you should see something like this:

$head -10 S.cerevisiae_tiling.fasta
>0_0
TCCTGAACGGTAGCATCTTGACGAC
>1_0
GTCGTCAAGATGCTACCGTTCAGGA
>2_0
TCCTGAACGGTAGCATCTTGACGAC
>3_0
GTCGTCAAGATGCTACCGTTCAGGA
>4_0
NNNNNNNNNNNNNNNNNNNNNNNNN

where the unique identifier of each probe is the concatenation of the (x, y) coordinates of the probe on the array, separated by an underscore (i.e. x_y). Note that sequence 4_0 is just a series of Ns. This happens when the corresponding sequence in the .1lq file is not defined.
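
The identifier convention described here can be checked with a few lines of Python. The following is only a minimal sketch, not part of the NM1lq2fsa tools: the helper names read_fasta, probe_coordinates and is_undefined are hypothetical, and the sample records are copied from the head output shown earlier. It assumes every identifier has exactly the form x_y.

```python
from typing import Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (identifier, sequence) pairs from fasta-formatted lines."""
    name = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def probe_coordinates(identifier: str) -> Tuple[int, int]:
    """Split an 'x_y' identifier into integer (x, y) array coordinates.

    Assumes the identifier contains exactly one underscore.
    """
    x, y = identifier.split("_")
    return int(x), int(y)


def is_undefined(sequence: str) -> bool:
    """Return True when the sequence is non-empty and made only of Ns."""
    return len(sequence) > 0 and set(sequence.upper()) == {"N"}


# Sample records copied from the head output of the tutorial.
sample = """\
>0_0
TCCTGAACGGTAGCATCTTGACGAC
>1_0
GTCGTCAAGATGCTACCGTTCAGGA
>2_0
TCCTGAACGGTAGCATCTTGACGAC
>3_0
GTCGTCAAGATGCTACCGTTCAGGA
>4_0
NNNNNNNNNNNNNNNNNNNNNNNNN
""".splitlines()

# Coordinates of probes whose sequence is not defined in the .1lq file.
undefined = [
    probe_coordinates(name)
    for name, seq in read_fasta(sample)
    if is_undefined(seq)
]
print(undefined)  # [(4, 0)]
```

To run the same check on the real output, the sample lines could be replaced by an open file handle on S.cerevisiae_tiling.fasta, since read_fasta accepts any iterable of lines.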



Jean-Baptiste Veyrieras 2010-05-28