Alnus rhombifolia - Transcriptome Assembly
Resource Type
Transcriptome Assembly
Data Source
Source Name: de novo assembly
Source Version: 021816
Date Performed
Wednesday, February 17, 2016 - 20:00
Number of transcripts
Average Transcript Length
Program, Pipeline, Workflow or Method Name
Trinity, built under bowtie-1.0.1 and samtools-0.1.19; CD-HIT-EST
Program Version
trinityrnaseq_r20131110, cd-hit-v4.6.1-2012-08-27
Cross Reference
Description and Download

MiSeq reads from a single library were cleaned with Trimmomatic and assembled with Trinity. CD-HIT-EST with parameter -c 0.95 (a 95% sequence identity threshold) was used to collapse highly similar assembled sequences into a single representative sequence. Protein sequences were predicted using Trinity. The data have been uploaded to NCBI (go to the NCBI BioProject page).
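As a rough illustration of the clustering step described above, the following Python sketch builds a cd-hit-est command line with the -c 0.95 identity threshold. Only that threshold comes from this page. The file names and the helper function are hypothetical, and any other options used in the original run are not recorded here. The sketch prints the command rather than running it.

```python
import shlex


def cd_hit_est_command(in_fasta, out_fasta, identity=0.95):
    """Build (but do not run) a cd-hit-est command line.

    -i, -o and -c are standard cd-hit-est options (input FASTA, output
    FASTA, sequence identity threshold). The helper itself is hypothetical.
    """
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be in the interval (0, 1]")
    return ["cd-hit-est", "-i", in_fasta, "-o", out_fasta, "-c", str(identity)]


# Hypothetical file names, for illustration only.
cmd = cd_hit_est_command("Trinity.fasta", "Trinity.cdhit95.fasta")
print(shlex.join(cmd))
```

The resulting list could be passed to subprocess.run on a machine where cd-hit-est is installed.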

Assembly Statistics

Number of Transcripts 26,186
Transcript N50 705 bp
Transcript Average Length 542 bp
Number of Proteins 14,263
Protein N50 225 aa
Protein Average Length 204 aa
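
For readers who want to recompute statistics like these from a downloaded FASTA file, here is a minimal Python sketch of the usual N50 and average length definitions. It is not the script used to produce the numbers above. The function names and the toy sequences are made up for illustration.

```python
def fasta_lengths(lines):
    """Return the sequence length of each record in FASTA-formatted lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the length L such that sequences of length >= L
    together contain at least half of all bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length


# Toy input (not real data): five records of length 6, 5, 4, 3 and 2.
toy = ">t1\nACG\nTAC\n>t2\nACGTA\n>t3\nACGT\n>t4\nACG\n>t5\nAC\n".splitlines()
lengths = fasta_lengths(toy)
print(n50(lengths))                 # 5
print(sum(lengths) / len(lengths))  # 4.0
```

For a real file, pass an open file handle to fasta_lengths in place of the toy list.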

Download assembled data:

  • Putative Transcripts (FASTA format): Download

  • Predicted ORFs (FASTA format): Download


BLAST against the Swiss-Prot protein database:

Blastx, E-value cutoff 1e-4 - 61% of transcripts matched a Swiss-Prot entry

Blastp, E-value cutoff 1e-4 - 75% of proteins matched a Swiss-Prot entry

BLAST against the TrEMBL protein database, only plant entries:

Blastx, E-value cutoff 1e-4 - 84% of transcripts matched a TrEMBL entry

Blastp, E-value cutoff 1e-4 - 94% of proteins matched a TrEMBL entry
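
The match percentages above are the share of query sequences with at least one hit at or below the cutoff. The sketch below shows one way such a figure could be computed. It assumes BLAST tabular output (-outfmt 6), where the query ID is the first column and the E-value is the eleventh. The format actually used for this assembly is not stated, and the function name and toy hits are hypothetical.

```python
def fraction_with_hit(blast_lines, n_queries, max_evalue=1e-4):
    """Fraction of queries with at least one hit at or below max_evalue.

    Assumes tab-separated BLAST output in -outfmt 6 column order:
    qseqid is column 1, evalue is column 11.
    """
    if n_queries <= 0:
        raise ValueError("n_queries must be positive")
    hit_ids = set()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        if float(fields[10]) <= max_evalue:
            hit_ids.add(fields[0])
    return len(hit_ids) / n_queries


# Toy hits (not real data): t1 has two hits, t2 fails the cutoff, t3 passes.
toy_hits = [
    "t1\tP1\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-30\t200",
    "t1\tP2\t80.0\t100\t20\t0\t1\t300\t1\t100\t1e-10\t120",
    "t2\tP3\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t30",
    "t3\tP4\t70.0\t80\t24\t0\t1\t240\t1\t80\t1e-5\t60",
]
print(fraction_with_hit(toy_hits, n_queries=4))  # 0.5
```

Queries with no hit at all do not appear in the BLAST output, so the total number of query sequences has to be supplied separately.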

HMMER search against Pfam database

Excel output of all hits

SSR Pipeline

Excel file with statistics, SSR motifs and primers (191 high quality markers)

Read Statistics

RNA was isolated and sequenced from pooled leaf tissues.

Illumina MiSeq Data

Library Description: White Alder - pooled seedling leaf RNAs from ozone treatments (round 1, round 2)
Library Code: WA1
Platform: Illumina MiSeq
MiSeq Reads: 1,997,227
MiSeq Bases: 539,065,469
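
As a quick sanity check on the library figures, the mean read length follows from dividing total bases by total reads. The short Python calculation below uses only the two counts reported for library WA1. The variable names are illustrative.

```python
# Counts reported for library WA1 on this page.
miseq_reads = 1_997_227
miseq_bases = 539_065_469

# Mean read length in bases.
mean_read_length = miseq_bases / miseq_reads
print(f"{mean_read_length:.1f} bases per read")  # 269.9 bases per read
```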