README

isotigs_all_full_length.fa - gene models
-----------------------------------------

The transcript sequences were obtained by genome-independent de novo
transcriptome assembly (Adamidi et al. (2011)) combined with genome-based
transcript predictions inferred with CUFFLINKS (Trapnell et al. (2010)).
For details, see Adamidi et al. (2011) and Oenal et al. (2012).

isotigs_all_unique.gtf - genomic transcript coordinates
-------------------------------------------------------

If a transcript maps to multiple genomic contigs, only the longest matching
region is reported.
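
The selection rule above can be sketched in Python as follows. This is only
an illustration, not the pipeline used to build the file: the Match record,
the function name, and the transcript and contig names are hypothetical, and
"longest" is assumed here to mean the largest genomic span (end - start + 1),
which may differ from the criterion actually applied.

```python
from collections import namedtuple

# Hypothetical record type: one genomic match of a transcript.
Match = namedtuple("Match", ["transcript_id", "contig", "start", "end"])


def span(match):
    """Length of a match in bases (GTF coordinates are 1-based, inclusive)."""
    return match.end - match.start + 1


def longest_match_per_transcript(matches):
    """Keep, for each transcript, the match with the largest genomic span.

    Assumption: ties are resolved in favour of the match seen first.
    """
    best = {}
    for m in matches:
        current = best.get(m.transcript_id)
        if current is None or span(m) > span(current):
            best[m.transcript_id] = m
    return best


# Made-up example data; the IDs and coordinates are not from the real files.
matches = [
    Match("isotig00001", "contigA", 100, 400),  # 301 bases
    Match("isotig00001", "contigB", 50, 900),   # 851 bases, kept
    Match("isotig00002", "contigC", 10, 200),   # only match, kept
]
unique = longest_match_per_transcript(matches)
```

With this input, isotig00001 is reported once, on contigB, and isotig00002
keeps its single match.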

Oenal_et_al_2012_mRNA_and_protein_expression_data.csv - expression data
-----------------------------------------------------------------------

The file contains transcript and protein expression data for all genes.
If the protein expression was quantified for peptides in different translated
frames (due to frame-shifting errors in the genome-independent de novo
annotation), the table contains multiple entries for the respective gene.
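
Because a gene can appear in several rows, it may be convenient to reduce
the table to one row per quantified transcript using the
best_protein_quantification column from the column list. The sketch below
assumes a comma-separated file with a header row (the delimiter is not
stated in this README); the sample rows and the function name best_rows are
made up for illustration.

```python
import csv
import io

# Made-up rows; only the column names come from the table description.
SAMPLE = """transcript_id,norm_protein_intensity_X1,best_protein_quantification
isotig00001,0.8,1
isotig00001,0.3,0
isotig00002,1.5,1
"""


def best_rows(handle):
    """Return the rows flagged as the best protein quantification.

    Assumption: the flag is stored as the literal text "1" or "0".
    """
    reader = csv.DictReader(handle)
    return [row for row in reader if row["best_protein_quantification"] == "1"]


rows = best_rows(io.StringIO(SAMPLE))
best_ids = [row["transcript_id"] for row in rows]
```

For the real file, open it with open(path, newline="") and pass the handle
to best_rows. How transcripts without any protein quantification are encoded
in this column is not described here, so such rows may need separate
handling.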

Table column description:
=========================

* transcript_id unique transcript ID

* merged_gene_locus transcript group with overlapping genomic coordinates

* read_count_X1 number of mapped mRNA deep-sequencing reads for the
X1 fraction (non-unique reads were distributed uniformly among loci)

* read_count_X2 number of mapped mRNA deep-sequencing reads for the
X2 fraction (non-unique reads were distributed uniformly among loci)

* read_count_Xins number of mapped mRNA deep-sequencing reads for the
Xins fraction (non-unique reads were distributed uniformly among loci)

* log2_rpkm_X1 transcript expression quantification (log2RPKM) for the
X1 fraction (RPKM = reads per kilobase of transcript per million mapped reads)

* log2_rpkm_X2 transcript expression quantification (log2RPKM) for the
X2 fraction (RPKM = reads per kilobase of transcript per million mapped reads)

* log2_rpkm_Xins transcript expression quantification (log2RPKM) for the
Xins fraction (RPKM = reads per kilobase of transcript per million mapped reads)

* expression_cluster expression cluster of the transcript (transcripts with
fewer than 25 deep-sequencing reads were discarded before clustering)

* norm_protein_intensity_X1 norm. protein intensity measured by mass
spectrometry for the X1 fraction (data are normalized by the sum of

* norm_protein_intensity_X2 norm. protein intensity measured by mass
spectrometry for the X2 fraction

* norm_protein_intensity_Xins norm. protein intensity measured by mass
spectrometry for the Xins fraction

* best_protein_quantification if the protein expression was quantified for
peptides in different translated frames (due to frame-shifting errors in the
genome-independent de novo annotation), the quantification with the largest
number of peptides is assigned a 1 (and 0 otherwise)
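
The log2_rpkm_* columns follow the RPKM definition given above (reads per
kilobase of transcript per million mapped reads). A minimal Python sketch of
that formula is shown below. The function and parameter names are
hypothetical, transcript lengths and library sizes are not part of the table
and would have to be supplied separately, and whether a pseudocount was added
before taking the logarithm is not stated in this README.

```python
import math


def log2_rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """log2 of reads per kilobase of transcript per million mapped reads.

    Assumption: no pseudocount is added, so read_count must be positive.
    """
    kilobases = transcript_length_bp / 1e3
    millions_mapped = total_mapped_reads / 1e6
    rpkm = read_count / kilobases / millions_mapped
    return math.log2(rpkm)


# Made-up numbers: 500 reads on a 2,000 bp transcript in a library of
# 10 million mapped reads give an RPKM of 25.
value = log2_rpkm(500, 2000, 10_000_000)
```

A read count of zero raises ValueError in this sketch, because the logarithm
of zero is undefined.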