[Archived] Liriodendron tulipifera - Transcriptome Assembly v120313
Resource Type
Transcriptome Assembly
Data Source
Source Name: de novo assembly
Source Version: 120313
Date Performed
Monday, December 2, 2013 - 20:00
Number of transcripts
Average Transcript Length
Program, Pipeline, Workflow or Method Name
Trinity; CD-HIT-EST
Program Version
trinityrnaseq_r2013-11-10, cd-hit-v4.6.1-2012-08-27
Description and Download

This project aims to elucidate the molecular response of hardwood tree seedlings to varying levels of ozone concentration. Ozone pollution places environmental stress on forest trees, resulting in early leaf senescence and loss of photosynthetic capacity.
MiSeq reads from seven libraries were cleaned with Trimmomatic and assembled with Trinity. CD-HIT-EST with parameter -c 0.95 was used to collapse highly similar transcripts into a single sequence. Protein sequences were predicted using Trinity.
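
As a rough illustration of the clustering step, the sketch below builds a cd-hit-est command line with the 0.95 identity threshold mentioned above. The file names are hypothetical, and all other options are left at the program defaults because only -c 0.95 is stated here; this is not necessarily the exact invocation used for this assembly.

```python
import shlex


def cd_hit_est_command(input_fasta, output_fasta, identity=0.95):
    """Build the argument list for a cd-hit-est run at the given identity threshold."""
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be a fraction in (0, 1]")
    return [
        "cd-hit-est",
        "-i", input_fasta,    # input FASTA of assembled transcripts
        "-o", output_fasta,   # output FASTA of cluster representatives
        "-c", str(identity),  # sequence identity threshold
    ]


# Hypothetical file names, for illustration only.
cmd = cd_hit_est_command("Trinity.fasta", "transcripts_c95.fasta")
print(shlex.join(cmd))
```

The list can be passed to subprocess.run(cmd, check=True) on a machine where cd-hit-est is installed.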

Assembly Statistics

Number of Transcripts 53,346
Transcript N50 1,237 bp
Transcript Average Length 749 bp
Number of Proteins 26,248
Protein N50 369 aa
Protein Average Length 292 aa
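
For readers who want to recompute statistics like these from the downloaded FASTA files, here is a minimal sketch. It uses the common definition of N50: the length of the sequence at which the running total, counted from the longest sequence down, first reaches half of the total assembly length. The function names and the toy records are illustrative and not part of the original pipeline.

```python
def fasta_lengths(lines):
    """Yield the sequence length of each record in FASTA-formatted lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths given")


# Toy example with two made-up records.
toy_fasta = [">t1", "ACGTAC", ">t2", "ACG"]
lengths = list(fasta_lengths(toy_fasta))
print(len(lengths), n50(lengths), sum(lengths) / len(lengths))
```

For a real file, pass an open file handle to fasta_lengths in place of the toy list.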

Download assembled data:

Putative Transcripts (fasta format)

Predicted ORFs (fasta format)


BLAST against the Swiss-Prot protein database:

Blastx, 1e-5 cutoff - 46% of transcripts matched a Swiss-Prot entry

Blastp, 1e-5 cutoff - 69% of proteins matched a Swiss-Prot entry

BLAST against the TrEMBL protein database, only plant entries:

Blastx, 1e-5 cutoff - 63% of transcripts matched a TrEMBL plant entry

Blastp, 1e-5 cutoff - 91% of proteins matched a TrEMBL plant entry
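
Match percentages like the ones above can be derived from BLAST tabular output. The sketch below assumes the default 12-column tabular format (-outfmt 6), in which the query ID is the first column and the e-value is the eleventh; whether the original searches were saved in that format is an assumption. The total number of queries has to come from the FASTA file, because queries without any hit do not appear in tabular output.

```python
def hit_fraction(blast_lines, total_queries, max_evalue=1e-5):
    """Fraction of queries with at least one hit at or below max_evalue."""
    matched = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query_id = fields[0]
        evalue = float(fields[10])  # column 11 in default tabular output
        if evalue <= max_evalue:
            matched.add(query_id)
    return len(matched) / total_queries


# Made-up rows: t1 has two hits, t2 has one weak hit, t3 has no hits at all.
rows = [
    "\t".join(["t1", "sp|P1", "80.0", "100", "20", "0", "1", "300", "1", "100", "1e-30", "120"]),
    "\t".join(["t1", "sp|P2", "60.0", "90", "36", "0", "1", "270", "5", "94", "2e-08", "55"]),
    "\t".join(["t2", "sp|P3", "35.0", "40", "26", "0", "1", "120", "9", "48", "0.5", "28"]),
]
print(f"{hit_fraction(rows, total_queries=3):.0%}")
```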

HMMER search against Pfam database

Excel output of all hits

Proteins assigned to GO terms inferred from Pfam hits

SSR Pipeline

Excel file with statistics, SSR motifs and primers (388 predicted high quality markers)

Fasta file of sequences with an SSR repeat and primers (388 sequences)

Read Statistics

RNA was sampled from leaves of seedlings exposed to one of four ozone levels (control, 80 ppb, 125 ppb, or 225 ppb) for either 7 hours or 14 days. Raw data are being uploaded to the NCBI Short Read Archive. Links will be added when they are available.

Library Description    MiSeq Reads     MiSeq Bases
14Day 125ppb               682,722      93,967,711
14Day 225ppb               340,810      47,948,411
14Day 225ppb               340,810      47,995,569
14Day 80ppb                731,463      97,514,441
14Day 80ppb                731,463      97,648,782
14Day control              785,027     106,569,437
14Day control              785,027     106,703,869
7hr 125ppb               1,139,738     159,465,574
7hr 125ppb               1,139,738     159,633,857
7hr 225ppb                   2,670         340,408
7hr 225ppb                   2,670         341,766
7hr 80ppb                1,073,063     148,872,614
7hr 80ppb                1,073,063     149,052,856
7hr control              1,023,282     142,480,844
7hr control              1,023,282     142,646,177
TOTAL                   11,557,550   1,595,033,658
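
One quick way to read the table is to divide bases by reads, which gives the mean read length after cleaning for each row. The sketch below does this for three rows copied from the table; the helper name is illustrative.

```python
def mean_read_length(reads, bases):
    """Mean read length in bases for one table row."""
    if reads <= 0:
        raise ValueError("read count must be positive")
    return bases / reads


# (library, reads, bases) copied from three rows of the table above.
rows = [
    ("14Day 125ppb", 682_722, 93_967_711),
    ("7hr 225ppb", 2_670, 340_408),
    ("7hr control", 1_023_282, 142_480_844),
]

for name, reads, bases in rows:
    print(f"{name:<14} {mean_read_length(reads, bases):6.1f} bp")
```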

In addition to the MiSeq reads from the ozone experiments, two other sources of transcripts were used for the assembly: 2.3 million 454 reads from the ancestral genome project and 24,663 reads from the EST division of NCBI.
