Overview

Welcome to the Hardwood Genomics Help Center!

Click on the tabs on the left to read documentation for different parts of the site.

This guide covers browsing and downloading the data types available on Hardwood Genomics (HWG). To learn more about the tools hosted on HWG, read the Tools guide.

Organism Page

The organism hub is the most direct way to access data for a species of interest. Use the "Trees" dropdown to select the tree you are interested in, or visit the list of organisms on HWG.

[Image: trees dropdown]

The resources on an organism's page vary with the data that have been loaded for that species. In the example below, Quercus lobata has Reference Genome, JBrowse, Feature Search, and Analyses tabs.

[Image: Quercus lobata organism page]

The Summary tab will display basic information about the page you are on. For a tree, this includes the scientific and common names, and a link to the NCBI Taxon entry for that species.

To download the entire genome assembly, or simply to learn more about it, click on the Reference Genome tab (or Transcriptome Assembly, for organisms lacking genomes) and follow the link to the genome analysis page. This page has links to download the whole-genome FASTA file and annotations.

[Image: Q. lobata genome page]
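Once the whole-genome FASTA file is on your machine, a quick summary can confirm that the download is complete and give a feel for how fragmented the assembly is. The following is a minimal sketch, not an HWG tool: the function names are hypothetical, and the demo uses a tiny in-memory FASTA in place of a real download.

```python
import io
from typing import Dict, Iterable, Iterator, TextIO


def sequence_lengths(handle: TextIO) -> Iterator[int]:
    """Yield the length of each record in a FASTA stream."""
    length = None  # None until the first '>' header is seen
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def n50(lengths: Iterable[int]) -> int:
    """Length of the scaffold at which the running total, taken
    from longest to shortest, reaches half of the assembly."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for scaffold_length in ordered:
        running += scaffold_length
        if running >= half:
            return scaffold_length
    return 0  # empty input


def assembly_summary(handle: TextIO) -> Dict[str, int]:
    """Collect scaffold count, total size, longest scaffold, and N50."""
    lengths = list(sequence_lengths(handle))
    return {
        "scaffolds": len(lengths),
        "total_bases": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }


# Demo with made-up sequences; replace with open("<your download>.fasta").
demo = io.StringIO(">s1\nACGT\nAC\n>s2\nACGT\n>s3\nACG\n>s4\nAC\n")
print(assembly_summary(demo))
```

For a real assembly, pass an open file handle instead of the in-memory demo. The parser streams line by line, so it does not need to hold the whole genome in memory.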

In addition to offering data downloads, we provide a variety of tools for exploring and analyzing a genome. By clicking on the JBrowse tab and then the Go to JBrowse button, you can manually explore the JBrowse instance for a tree. Because a genome often has thousands of scaffolds, the Feature Search tool is usually a more helpful starting point. The Feature Search tab lets you search the selected organism's set of mRNAs or mRNA contigs by name or by annotation. In the example below, we search Q. lobata for kinase and find 5,165 results. Clicking Download results as FASTA downloads all features matching the search as a single FASTA file.

[Image: searching for kinase]

You can also click on a feature name to go to that feature's page. The individual feature page has the InterPro and GO terms associated with that feature, its nucleotide and translated amino acid sequences, and the reference BLAST alignment.

[Image: feature blast]

The JBrowse tool on the feature page allows you to navigate directly to that feature in the organism's JBrowse instance.

[Image: feature JBrowse]

Feature Page

Individual nucleotide and polypeptide sequences are found on the feature page. Features on HWG are either mRNA/polypeptide or mRNA_contig/polypeptide, corresponding to genomes and transcriptomes, respectively.

[Image: example feature]

  • The Summary pane includes the feature type, name, and associated organism.
  • The Relationship pane tells you which other features relate to this one. Generally, an mRNA or mRNA contig has a polypeptide that is derived from it, and sometimes a gene from which it is derived.
  • The Sequences pane has the nucleotide and polypeptide sequence associated with the mRNA or mRNA contig.
  • The Homology pane has the BLAST annotations for a feature.
  • The JBrowse pane will load that organism's JBrowse pointed at that feature.
  • The Annotation pane will show any GO, InterPro, or KEGG annotations available for that organism.

    You can click on the links at the top of the Homology pane to see the specific analyses used to generate these annotations. Clicking on a BLAST hit name will link out to the UniProt entry for that hit.

    [Image: feature blast]
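The Sequences pane pairs each mRNA or mRNA contig with its polypeptide. To illustrate how the two relate, here is a minimal sketch that translates a coding sequence using the standard genetic code. It is not the method HWG uses to produce its polypeptides: the function name is hypothetical, and the sketch assumes that the reading frame starts at the first base, which is often not true of a full mRNA with untranslated regions.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str, to_stop: bool = True) -> str:
    """Translate a nucleotide string in frame 0.

    Accepts DNA or RNA in either case. Codons containing ambiguous
    bases become 'X'; a trailing partial codon is ignored.
    """
    seq = nucleotides.upper().replace("U", "T")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        amino_acid = CODON_TABLE.get(seq[i:i + 3], "X")
        if amino_acid == "*" and to_stop:
            break  # stop codon ends the polypeptide
        protein.append(amino_acid)
    return "".join(protein)


print(translate("ATGGCCTAA"))  # ATG -> M, GCC -> A, TAA -> stop
```

If the result differs from the polypeptide shown on the feature page, the most likely cause is the reading frame assumption rather than an error in the hosted data.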

Analysis Page

The genomic data for every organism is linked to an analysis, which refers to a bioinformatic program or workflow. You can visit the page for an analysis to learn more about how its data were generated.

Fields

  • Resource Type - Analyses are grouped based on the general type of data they accept and produce. See below for the different analysis types on HWG.
  • Name - The name of the analysis. This usually includes the name of the tree and the tool used.
  • Program Pipeline & Version - These fields tell you more about the software used to generate the data. This should include the software name and its version.
  • Date Performed - The date that the computational analysis was run.
  • Data Source - The file that the analysis was performed on.

Transcriptome and genome assemblies might have an additional Description field with full information about the analysis, including data downloads and publications where applicable.

Analysis Types

There are many different analysis types, matching the different computational procedures used to generate different data sets. Hardwood Genomics currently has the following analysis types:

  • InterProScan Annotation - Features were annotated using InterProScan.
  • BLAST Annotation - Features were annotated using BLAST. mRNA or mRNA_contig sequences are searched against abbreviated UniProt/Swiss-Prot and UniProt/TrEMBL databases.
  • Genome Assembly - The analysis resulting in a genome assembly. Input data is DNA and RNA. This typically includes read trimming, alignment, assembly, and annotation.
  • Transcriptome Assembly - The analysis resulting in a transcriptome assembly given sequenced RNA. This typically includes read trimming, alignment, assembly, and annotation.
  • Gene Expression Profile - An analysis where Biosamples had their RNA sequenced and quantified to estimate gene expression.

Account Creation

  • Visit the registration page. You will need to provide a username, email, and password.
  • If you visit the My Account page, you'll see information about your account.

Creating an account allows you to create and store data collections. These collections can be downloaded in a variety of formats, or they can be sent to Galaxy for custom analyses.

FAQ

Can you host my tree's genome?

We'd love to! Please Contact Us and we can discuss loading your data into Hardwood Genomics.

What is the difference between the different SSR tabs?

A tree species might have up to three different sets of SSRs (simple sequence repeats) associated with it.

Predicted SSRs are SSRs identified from a genome sequencing experiment. We have generated primers for amplifying these SSRs, but they have not been tested and have not been confirmed as polymorphic.

Polymorphic SSRs are the subset of the predicted set that have been screened for polymorphism.

Genomic SSRs are SSRs that have been experimentally validated: they were amplified from multiple individuals and found to vary in length.
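To make the idea of a predicted SSR concrete: an SSR is a short motif repeated in tandem, so candidates can be found by scanning a sequence for such runs. The sketch below is a toy illustration only, not the pipeline used to produce the SSR sets on HWG. The function name, the motif size range, and the repeat threshold are all assumptions chosen for the example.

```python
import re
from typing import List, NamedTuple


class SSR(NamedTuple):
    start: int    # 0-based position of the first base of the repeat
    motif: str    # the repeated unit
    repeats: int  # number of tandem copies


def find_ssrs(sequence: str, min_motif: int = 2, max_motif: int = 6,
              min_repeats: int = 4) -> List[SSR]:
    """Find perfect tandem repeats of a short motif.

    The lazy quantifier makes the regex try the shortest motif
    first, so ATATATAT is reported as AT x4 rather than ATAT x2.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip single-base runs such as AAAAAAAA
        hits.append(SSR(match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Made-up sequence containing (AG)x5 and (CAT)x4.
demo = "TTGAC" + "AG" * 5 + "CTTACG" + "CAT" * 4 + "GG"
for hit in find_ssrs(demo):
    print(hit)
```

A candidate found this way corresponds to the predicted category above. Whether it is polymorphic can only be established by screening or by amplifying it from multiple individuals.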

How do I use the JBrowse tracks hosted on HWG?

Please see the JBrowse guide in our tools section.
