RNA Analysis
The goal of many RNA sequencing projects is to determine how genes are differentially expressed between experimental groups. With the ever-decreasing cost of sequencing and increasing flowcell capacities, processing large amounts of RNA-seq data can require vast computational resources not readily available to the average researcher and may hinder researchers from starting their analyses.
We offer two levels of RNA analysis packages, Basic and Intermediate, which can help our customers start analyzing their differential expression studies. These standard packages can be a great starting point for burgeoning bioinformaticians, but they do not detect non-coding RNA and are not intended for metatranscriptomes.
Basic RNA Analysis
The first level of analysis we offer, Basic Analysis, maps RNA sequencing reads to a provided annotated reference genome. Read mapping and counting can be computationally expensive, so this package is ideal for customers who wish to perform their own comparisons and more in-depth analysis, but may lack the computational resources necessary to produce the raw counts in a reasonable amount of time.
Raw counts of CDS-annotated genes are returned in table format, with multi-mapping reads discarded. Examples of methods can be found here for prokaryotic samples and here for eukaryotic samples. For experiments with multiple organisms, we generally recommend mapping each organism independently.
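To illustrate what "raw counts with multi-mapping reads discarded" means, here is a minimal Python sketch. It is not the pipeline's actual code: the function name `count_unique_reads`, the input layout (one `(read_id, gene_id)` pair per alignment), and the toy data are all assumptions made for the example. Real counting tools work from alignment files rather than in-memory lists.

```python
from collections import Counter

def count_unique_reads(alignments):
    """Count reads per gene, discarding any read that has more than one alignment.

    `alignments` is a list of (read_id, gene_id) pairs, one pair per alignment
    (a hypothetical, simplified input layout).
    """
    # Number of alignments reported for each read.
    alignments_per_read = Counter(read_id for read_id, _ in alignments)

    # Keep only reads that aligned exactly once, then tally them by gene.
    counts = Counter(
        gene_id
        for read_id, gene_id in alignments
        if alignments_per_read[read_id] == 1
    )
    return dict(counts)

# Hypothetical toy data: read4 aligns twice, so it is treated as multi-mapping.
alignments = [
    ("read1", "geneA"),
    ("read2", "geneA"),
    ("read3", "geneB"),
    ("read4", "geneA"),
    ("read4", "geneB"),
]
raw_counts = count_unique_reads(alignments)
```

In this toy example, `read4` contributes to neither gene, which is why discarding multi-mappers can lower counts for genes with close paralogs.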
Intermediate RNA Analysis
The second level we offer, Intermediate RNA Analysis, expands on the Basic package. After mapping and counting, the pipeline normalizes the raw CDS counts using edgeR's trimmed mean of M-values (TMM) method and then makes pairwise comparisons between experimental group means. If the reference organism is in the KEGG Pathway Catalog, pathway information is included in the output. An overall PCA plot for general assessment of the samples is provided, along with high-level heatmaps and full and filtered lists of the differentially expressed genes for each comparison.
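The idea behind TMM normalization can be sketched in a few lines of Python. This is a simplified, unweighted illustration and not edgeR's implementation: edgeR additionally weights each gene by its estimated precision, selects the reference sample automatically, and rescales the factors across samples. The function name `tmm_factor`, the trimming defaults, and the example counts are assumptions for this sketch.

```python
import math

def _ranks(values):
    """Return the 0-based rank of each value in sorted order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks

def tmm_factor(sample, reference, trim_m=0.3, trim_a=0.05):
    """Simplified, unweighted TMM scaling factor of `sample` relative to `reference`.

    Both arguments are lists of raw counts for the same genes in the same order.
    """
    n_s, n_r = sum(sample), sum(reference)
    m_values, a_values = [], []
    for y_s, y_r in zip(sample, reference):
        if y_s > 0 and y_r > 0:  # log ratios are undefined for zero counts
            p_s, p_r = y_s / n_s, y_r / n_r
            m_values.append(math.log2(p_s / p_r))        # M: log fold change
            a_values.append(0.5 * math.log2(p_s * p_r))  # A: mean log abundance

    n = len(m_values)
    cut_m, cut_a = math.floor(n * trim_m), math.floor(n * trim_a)
    m_rank, a_rank = _ranks(m_values), _ranks(a_values)

    # Keep genes that fall outside neither the trimmed M tails nor the trimmed A tails.
    kept = [
        m for m, rm, ra in zip(m_values, m_rank, a_rank)
        if cut_m <= rm < n - cut_m and cut_a <= ra < n - cut_a
    ]
    if not kept:
        return 1.0
    return 2 ** (sum(kept) / len(kept))

# Hypothetical counts: the sample is sequenced twice as deeply as the reference,
# and its last gene is strongly up-regulated.
reference = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
sample = [20, 40, 60, 80, 100, 120, 140, 160, 180, 2000]
factor = tmm_factor(sample, reference)
effective_library_size = sum(sample) * factor
```

Trimming the extreme M-values keeps a handful of strongly changed genes from dominating the scaling factor, which is why this approach tends to be more robust than scaling by total library size alone.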
This package is most effective with multiple experimental replicates for each group. Multi-dimensional analyses are not supported.
Controlling for Batch Effect
Sequencing data, and RNA analysis by extension, are very sensitive to batch effects. For this reason, if additional sequencing runs or replicates will be added to an analysis, we strongly recommend adding samples symmetrically across treatment groups: each round of library preparation and sequencing should contain representatives from each group in equal numbers. This type of symmetric batch effect can be controlled for during Intermediate analysis, which minimizes skew resulting from differences between batches.
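Before submitting additional samples, it can help to check that a planned design is symmetric in the sense described above. The following Python sketch is one way to do that. The function name `is_balanced`, the `(batch, group)` sample-sheet layout, and the example designs are hypothetical and are not part of our pipeline.

```python
from collections import Counter

def is_balanced(samples):
    """Return True if every batch contains every group in equal numbers.

    `samples` is a list of (batch, group) pairs, one pair per sample.
    """
    all_groups = {group for _, group in samples}

    # Tally how many samples of each group appear in each batch.
    per_batch = {}
    for batch, group in samples:
        per_batch.setdefault(batch, Counter())[group] += 1

    for counts in per_batch.values():
        if set(counts) != all_groups:        # a group is missing from this batch
            return False
        if len(set(counts.values())) != 1:   # groups are unequally represented
            return False
    return True

# Hypothetical designs: two rounds of library preparation, two treatment groups.
symmetric_design = [
    ("round1", "control"), ("round1", "treated"),
    ("round2", "control"), ("round2", "treated"),
]
confounded_design = [
    ("round1", "control"), ("round1", "control"),
    ("round2", "treated"), ("round2", "treated"),
]
symmetric_ok = is_balanced(symmetric_design)
confounded_ok = is_balanced(confounded_design)
```

In the second design, batch and treatment are completely confounded, so a difference between rounds cannot be separated from a difference between groups.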
Contact:
91 43rd Street, Ste. 250
Pittsburgh, PA 15201
(878) 227-4915