An interesting application of nucleosome occupancy profiles is to examine the
pattern of nucleosome density around specific genomic features, for example
transcription start sites (TSS) or end sites (TES). We may also want to
compute average nucleosome occupancy profiles over larger regions, such as
the entire cis-region of genes. To do so, we first need to reformat the output
of NMhmmvit into a format resembling a tiling array dataset. This is done by
the Perl script NMnuc2toc as follows:
$NMnuc2toc -i S2/BY_S288c/NMhmmvit 1>S2/BY_S288c/BY_S288c_nuc.txt
Then, as with the original dataset, we have to convert this file into a .db
file using NMtdb:
$NMtdb -i S2/BY_S288c/BY_S288c_nuc.txt \
  -c S2/BY_S288c/BY_S288c_nuc.dbconf \
  -o S2/BY_S288c/BY_S288c_nuc.db
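The two conversion steps above share one directory and one file-name stem, so they can be parameterized when repeating the analysis for another strain. The following is only a sketch: the variables STRAIN_DIR and NAME are illustrative names that are not part of the tool suite, and the script merely prints the two commands shown above with the paths filled in, without running them.

```shell
#!/bin/sh
# Hypothetical helper variables (assumptions, not defined by the tools).
STRAIN_DIR="S2/BY_S288c"
NAME="BY_S288c"

# File names follow the convention used in the commands above.
NUC_TXT="$STRAIN_DIR/${NAME}_nuc.txt"
NUC_CONF="$STRAIN_DIR/${NAME}_nuc.dbconf"
NUC_DB="$STRAIN_DIR/${NAME}_nuc.db"

# Print the commands only (dry run); nothing is executed here.
echo "NMnuc2toc -i $STRAIN_DIR/NMhmmvit 1>$NUC_TXT"
echo "NMtdb -i $NUC_TXT -c $NUC_CONF -o $NUC_DB"
```

Changing STRAIN_DIR and NAME is then enough to obtain the corresponding commands for a different dataset laid out in the same way.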
Now, we will create a new folder S2/BY_S288c/Features to store the results for any feature we want to investigate. Let's look first at the gene transcript boundaries (TSS and TES):
$mkdir -p S2/BY_S288c/Features/Transcript
$echo "transcript ." >S2/BY_S288c/Features/Transcript/feature_id.txt
The file S2/BY_S288c/Features/Transcript/feature_id.txt is required by NMt2feat. It tells the program that we are interested only in features of type transcript, whatever the ID of the feature (hence the wildcard '.' after the transcript tag).
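Because a misspelled feature type in feature_id.txt would presumably select nothing, it can be worth checking the file right after writing it. The sketch below assumes only what is stated above: the file holds a feature type followed by an ID pattern, separated by a space. The variable FEATURE_DIR is an illustrative name, and the check is not part of the tool suite.

```shell
#!/bin/sh
# Illustrative variable (assumption): folder for the transcript feature.
FEATURE_DIR="S2/BY_S288c/Features/Transcript"
mkdir -p "$FEATURE_DIR"

# One line: the feature type, then the ID pattern ('.' as in the text above).
printf '%s %s\n' "transcript" "." > "$FEATURE_DIR/feature_id.txt"

# Sanity check: count the whitespace-separated fields on the line.
FIELDS=$(awk '{ print NF }' "$FEATURE_DIR/feature_id.txt")
echo "fields: $FIELDS"
```

With the single line written above, the count is 2: the type and the pattern.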