Alnus rubra - Transcriptome Assembly
Resource Type: Transcriptome Assembly
Data Source
Source Name: de novo assembly
Source Version: 021816
Date Performed: Wednesday, February 17, 2016 - 20:00
Number of transcripts
Average Transcript Length
Program, Pipeline, Workflow or Method Name: Trinity, built under bowtie-1.0.1 and samtools-0.1.19; CD-HIT-EST
Program Version: trinityrnaseq_r20131110, cd-hit-v4.6.1-2012-08-27
Cross Reference
Description and Download

MiSeq reads from a single library were cleaned with Trimmomatic and assembled with Trinity. CD-HIT-EST with parameter -c 0.95 (a 95% sequence identity threshold) was used to collapse highly similar transcripts into a single representative sequence. Protein sequences were predicted using Trinity. The data have been uploaded to NCBI (go to NCBI BioProject page).
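The clustering step can be sketched as a command line assembled in Python. This is a minimal sketch, not the exact invocation used for this assembly: the file names Trinity.fasta and Trinity.cdhit95.fasta are hypothetical, and only the -c 0.95 setting comes from the description above. The -i, -o and -c options are the standard CD-HIT-EST input, output and identity-threshold flags.

```python
def cd_hit_est_command(in_fasta, out_fasta, identity=0.95):
    """Build the argument list for a CD-HIT-EST clustering run.

    in_fasta and out_fasta are hypothetical file names chosen for
    illustration; identity corresponds to the -c parameter.
    """
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be a fraction in (0, 1]")
    return [
        "cd-hit-est",
        "-i", in_fasta,       # assembled transcripts (FASTA)
        "-o", out_fasta,      # representative sequences after clustering
        "-c", str(identity),  # sequence identity threshold
    ]


# Assumed file names, for illustration only.
cmd = cd_hit_est_command("Trinity.fasta", "Trinity.cdhit95.fasta")
print(" ".join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on a machine where cd-hit-est is installed.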

Assembly Statistics

Number of Transcripts: 29,113
Transcript N50: 617 bp
Transcript Average Length: 496 bp
Number of Proteins: 14,657
Protein N50: 209 aa
Protein Average Length: 194 aa
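
For readers who want to recompute statistics like these from the downloadable FASTA files, the following is a minimal sketch using the usual definition of N50: the length L such that sequences of length L or longer contain at least half of all bases. The function names are illustrative, and this is not the script that produced the numbers above.

```python
def fasta_lengths(lines):
    """Return the length of each sequence in an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a non-empty collection of sequence lengths."""
    if not lengths:
        raise ValueError("no sequences given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def mean_length(lengths):
    """Return the arithmetic mean of the sequence lengths."""
    return sum(lengths) / len(lengths)


# Toy example: two records of 6 and 3 bases.
toy = fasta_lengths([">a", "ACGT", "AC", ">b", "GGG"])
print(toy, n50(toy), mean_length(toy))
```

To apply it to a real file, pass an open file handle, for example fasta_lengths(open(path)) with path pointing at the downloaded transcript or ORF FASTA.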

Download assembled data:

Putative Transcripts (fasta format)

Predicted ORFs (fasta format)


BLAST against the Swiss-Prot protein database:

BLASTX, E-value cutoff 1e-4 - 59% of transcripts matched a Swiss-Prot entry

BLASTP, E-value cutoff 1e-4 - 74% of proteins matched a Swiss-Prot entry

BLAST against the TrEMBL protein database, plant entries only:

BLASTX, E-value cutoff 1e-4 - 82% of transcripts matched a TrEMBL entry

BLASTP, E-value cutoff 1e-4 - 94% of proteins matched a TrEMBL entry
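
Match percentages of this kind can be derived from BLAST tabular output by counting the distinct query sequences with at least one hit at or below the E-value cutoff. The sketch below assumes the default 12-column tabular format (-outfmt 6), where the query ID is the first column and the E-value is the eleventh. The output format actually used for this resource is not stated, and the example rows are made up for illustration.

```python
def fraction_with_hit(blast_lines, n_queries, max_evalue=1e-4):
    """Return the fraction of queries with at least one hit at or below max_evalue.

    blast_lines: iterable of tab-separated lines in the default 12-column
                 BLAST tabular format (an assumption about the input).
    n_queries:   total number of query sequences that were searched.
    """
    matched = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query_id = fields[0]
        evalue = float(fields[10])
        if evalue <= max_evalue:
            matched.add(query_id)
    return len(matched) / n_queries


# Made-up rows: t1 has two significant hits, t2 has only a weak one.
rows = [
    "t1\tsp|P1\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-50\t200",
    "t1\tsp|P2\t60.0\t80\t32\t0\t1\t240\t5\t84\t1e-10\t75",
    "t2\tsp|P3\t30.0\t40\t28\t0\t1\t120\t9\t48\t0.5\t30",
]
print(fraction_with_hit(rows, n_queries=4))
```

Using a set means a query with many hits is counted once, so the result is a per-sequence match rate rather than a hit count.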

HMMER search against the Pfam database:

Excel output of all hits

SSR Pipeline

Excel file with statistics, SSR motifs and primers (239 high-quality markers)

Read Statistics

RNA was isolated and sequenced from pooled leaf tissues.

Illumina MiSeq Data

Library Description: Red Alder - pooled seedling leaf RNAs from ozone treatments (round 1, round 2)
Library Code: RA1
Platform: Illumina MiSeq
MiSeq Reads: 3,088,680
MiSeq Bases: 783,469,947
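
The read and base counts for library RA1 imply a mean read length, which the short calculation below works out. Whether the counts refer to raw or trimmed reads is not stated, so the result should be read as an average over whichever set was counted.

```python
# Counts taken from the RA1 library entry above.
reads = 3_088_680
bases = 783_469_947

# Mean read length in bases: total bases divided by number of reads.
mean_read_length = bases / reads
print(f"{mean_read_length:.1f} bp per read on average")
```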