[Archived] Liriodendron tulipifera - Transcriptome Assembly v120313
Resource Type
Transcriptome Assembly
Ontology Browser
View Gene Ontology browser or KEGG Ontology browser for Liriodendron tulipifera
Data Source
Source Name
: de novo assembly
Source Version
: 120313
Date Performed
Monday, December 2, 2013 - 20:00
Number of transcripts
: 53,346
Average Transcript Length
: 749 bp
Program, Pipeline, Workflow or Method Name
Trinity; CD-HIT-EST
Program Version
trinityrnaseq_r2013-11-10, cd-hit-v4.6.1-2012-08-27
Description and Download
This project aims to elucidate the molecular response of hardwood tree seedlings to varying ozone concentrations. Ozone pollution places environmental stress on forest trees, resulting in early leaf senescence and loss of photosynthetic capacity. MiSeq reads from seven libraries were cleaned with Trimmomatic and assembled with Trinity. CD-HIT-EST with parameter -c 0.95 (95% sequence identity threshold) was used to collapse highly similar sequences into a single representative sequence. Protein sequences were predicted using Trinity.

Assembly Statistics

Number of Transcripts: 53,346
Transcript N50: 1,237 bp
Transcript Average Length: 749 bp
Number of Proteins: 26,248
Protein N50: 369 aa
Protein Average Length: 292 aa
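The N50 and average length reported above can be recomputed from the downloadable FASTA files. The following is a minimal sketch, assuming a plain multi-FASTA input; the function names and the toy sequences are illustrative only and are not part of the original pipeline.

```python
from io import StringIO


def fasta_lengths(handle):
    """Return the sequence length of each record in a FASTA stream."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


def mean_length(lengths):
    """Arithmetic mean of the sequence lengths (0.0 for an empty list)."""
    return sum(lengths) / len(lengths) if lengths else 0.0


# Toy input standing in for the real transcript or ORF FASTA file (hypothetical data).
toy = StringIO(">t1\nACGTACGTAC\n>t2\nACGTAC\nGT\n>t3\nACGT\n>t4\nAC\n")
lengths = fasta_lengths(toy)
print(len(lengths), n50(lengths), mean_length(lengths))
```

For the real data, the `StringIO` object would be replaced by an open file handle on the downloaded FASTA file.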
Download assembled data:
Putative Transcripts (fasta format)
Predicted ORFs (fasta format)


BLAST against the Swiss-Prot protein database:
Blastx, 1e-5 cutoff - 46% of transcripts matched a Swiss-Prot entry
Blastp, 1e-5 cutoff - 69% of proteins matched a Swiss-Prot entry
BLAST against the TrEMBL protein database, only plant entries:
Blastx, 1e-5 cutoff - 63% of transcripts matched a TrEMBL plant entry
Blastp, 1e-5 cutoff - 91% of proteins matched a TrEMBL plant entry
HMMER search against Pfam database
Excel output of all hits
Proteins assigned to GO terms inferred from Pfam hits
SSR Pipeline
Excel file with statistics, SSR motifs and primers (388 predicted high quality markers)
Fasta file of sequences with an SSR repeat and primers (388 sequences)
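
The match percentages quoted for the BLAST searches above are fractions of query sequences with at least one hit at or below the 1e-5 cutoff. The sketch below shows how such a fraction could be computed. It assumes the BLAST results are in the standard 12-column tabular format (BLAST+ `-outfmt 6`), where the query ID is the first column and the E-value is the eleventh; the page does not state which output format was actually used, and the function name and toy records are hypothetical.

```python
def fraction_with_hit(blast_lines, n_queries, max_evalue=1e-5):
    """Fraction of query sequences with at least one hit at E-value <= max_evalue.

    blast_lines: iterable of tab-separated BLAST tabular lines (assumed outfmt 6).
    n_queries:   total number of query sequences searched, including those without hits.
    """
    queries_with_hit = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query_id = fields[0]
        evalue = float(fields[10])
        if evalue <= max_evalue:
            queries_with_hit.add(query_id)
    return len(queries_with_hit) / n_queries


# Toy records (hypothetical IDs and scores), four queries searched in total.
toy_hits = [
    "t1\tP12345\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-30\t200",
    "t1\tQ67890\t60.0\t100\t40\t0\t1\t300\t1\t100\t1e-10\t120",
    "t2\tP11111\t35.0\t50\t32\t1\t1\t150\t1\t50\t0.5\t30",
    "t3\tP22222\t70.0\t90\t27\t0\t1\t270\t1\t90\t1e-5\t90",
]
print(fraction_with_hit(toy_hits, n_queries=4))
```

Counting each query once, however many hits it has, matches how a "percent of transcripts matched" figure is normally defined.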

Read Statistics

RNA was sampled from leaves of seedlings exposed to one of four ozone levels (control, 80 ppb, 125 ppb, or 225 ppb) for 7 hours or 14 days. Raw data is being uploaded to the NCBI Short Read Archive. Links will be added when they are available.
Library Description   MiSeq Reads      MiSeq Bases
14Day 125ppb              682,722       93,967,711
14Day 225ppb              340,810       47,948,411
14Day 225ppb              340,810       47,995,569
14Day 80ppb               731,463       97,514,441
14Day 80ppb               731,463       97,648,782
14Day control             785,027      106,569,437
14Day control             785,027      106,703,869
7hr 125ppb              1,139,738      159,465,574
7hr 125ppb              1,139,738      159,633,857
7hr 225ppb                  2,670          340,408
7hr 225ppb                  2,670          341,766
7hr 80ppb               1,073,063      148,872,614
7hr 80ppb               1,073,063      149,052,856
7hr control             1,023,282      142,480,844
7hr control             1,023,282      142,646,177
TOTAL                  11,557,550    1,595,033,658
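
The TOTAL line can be checked against the listed rows with simple column sums. The sketch below copies the row values from the table above; the variable and function names are illustrative only.

```python
# Rows copied from the read statistics table: (library, MiSeq reads, MiSeq bases).
ROWS = [
    ("14Day 125ppb", 682_722, 93_967_711),
    ("14Day 225ppb", 340_810, 47_948_411),
    ("14Day 225ppb", 340_810, 47_995_569),
    ("14Day 80ppb", 731_463, 97_514_441),
    ("14Day 80ppb", 731_463, 97_648_782),
    ("14Day control", 785_027, 106_569_437),
    ("14Day control", 785_027, 106_703_869),
    ("7hr 125ppb", 1_139_738, 159_465_574),
    ("7hr 125ppb", 1_139_738, 159_633_857),
    ("7hr 225ppb", 2_670, 340_408),
    ("7hr 225ppb", 2_670, 341_766),
    ("7hr 80ppb", 1_073_063, 148_872_614),
    ("7hr 80ppb", 1_073_063, 149_052_856),
    ("7hr control", 1_023_282, 142_480_844),
    ("7hr control", 1_023_282, 142_646_177),
]

# Values from the TOTAL line of the table: (reads, bases).
LISTED_TOTAL = (11_557_550, 1_595_033_658)


def column_sums(rows):
    """Sum the reads and bases columns over all rows."""
    reads = sum(r for _, r, _ in rows)
    bases = sum(b for _, _, b in rows)
    return reads, bases


def gap_to_total(rows, total):
    """Return (reads, bases) by which the stated total exceeds the summed rows."""
    reads, bases = column_sums(rows)
    return total[0] - reads, total[1] - bases


def mean_read_length(rows):
    """Average bases per read across all listed rows."""
    reads, bases = column_sums(rows)
    return bases / reads


print(column_sums(ROWS))
print(gap_to_total(ROWS, LISTED_TOTAL))
print(round(mean_read_length(ROWS), 1))
```

On these values the listed rows fall short of the TOTAL line by 682,722 reads, which equals the read count of the single 14Day 125ppb row. That would be consistent with a second 14Day 125ppb row missing from the listing, but this is an inference from the arithmetic and is not stated on the page.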
In addition to the MiSeq reads from the ozone experiments, two other sources of transcripts were used for the assembly: 2.3 million 454 reads from the ancestral genome project and 24,663 reads from the EST division of NCBI.