StarNet 2
Category Cross-Omics>Pathway Analysis/Gene Regulatory Networks/Tools
Abstract StarNet 2 is a new web-based tool that allows post hoc ‘visual analysis’ of correlations that are derived from expression microarray data.
StarNet 2 facilitates user discovery of putative ‘gene regulatory networks’ in a variety of species (human, rat, mouse, chicken, zebrafish, Drosophila, C. elegans, S. cerevisiae, Arabidopsis and rice) by graphing networks of genes that are closely co-expressed across a large heterogeneous set of pre-selected microarray experiments.
For each of the represented organisms, raw microarray data were retrieved from NCBI's Gene Expression Omnibus (GEO) - (see G6G Abstract Number 20013) for a selected Affymetrix platform. All pairwise Pearson correlation coefficients were computed for expression profiles measured on each platform, respectively.
These precompiled results were stored in a MySQL database and supplemented by additional data retrieved from NCBI.
The product's web-based tool allows user-specified queries of the database, centered on a gene of interest. The result of a query includes graphs of ‘correlation networks’, graphs of ‘known interactions’ involving genes and gene products that are present in the correlation networks, and initial statistical analyses.
Two analyses may be performed in parallel to compare networks, which is facilitated by the new HeatSeeker module (see below...).
StarNet's Main functions -- StarNet is an advanced ‘visual data mining’ front end for exploring ‘correlation networks’ constructed from microarray data. This software serves two (2) main functions:
1) It readily provides new hypotheses via the standard “guilt by association model”, where genes that participate in the same pathways frequently have similar ‘expression profiles’; and
2) It provides a starting place for reconstructing ‘biological networks’ using modeling approaches such as dynamic ‘Bayesian networks’, by providing lists of candidate genes to use in such approaches.
StarNet asks the user for a gene of interest, some parameters to specify how the network will be constructed, and some parameters for drawing preferences. StarNet then draws correlation networks local to the gene of interest.
Users can specify:
- a) that only one network be drawn,
- b) that two (2) networks from the same species be drawn, or
- c) that two (2) networks from different species be drawn.
StarNet's Features -- StarNet offers several useful features, including:
1) Network ‘gene lists’ linked to Entrez Gene, with flagging and listing of genes found in both networks, if two (2) networks are drawn (homologous genes are highlighted, if networks are from different species);
2) Edge (correlation) lists with 95% and 99% confidence intervals;
3) Optional highlighting and listing of nodes with specified Gene Ontology (GO) keyword matching (default = ‘transcription’);
4) Lists of GO terms (and associated genes) that are enriched in the network compared with the entire array; and
5) To easily compare the correlation network to current knowledge, networks of known interactions (from Entrez’s Gene RIFs) involving genes in the drawn correlation networks.
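The abstract does not say how the edge confidence intervals in feature 2 are obtained. A minimal sketch, assuming the common Fisher z-transformation approach and a known number of array samples per cohort (the function name and parameters are illustrative, not part of StarNet):

```python
import math
from statistics import NormalDist

def correlation_confidence_interval(r, n_samples, level=0.95):
    """Approximate confidence interval for a Pearson correlation r.

    Assumption: Fisher z-transformation with standard error 1/sqrt(n - 3).
    Requires -1 < r < 1 and n_samples > 3.
    """
    z = math.atanh(r)                               # Fisher r-to-Z transform
    se = 1.0 / math.sqrt(n_samples - 3)             # standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2.0)  # two-sided critical value
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

# Hypothetical edge: r = 0.8 observed across 100 array samples.
ci95 = correlation_confidence_interval(0.8, 100, level=0.95)
ci99 = correlation_confidence_interval(0.8, 100, level=0.99)
```

With the same inputs, the 99% interval is always wider than the 95% interval.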
StarNet's HeatSeeker Module -- A recently added tool, HeatSeeker, will also draw false color maps comparing the two networks (again, only when two (2) networks have been drawn).
Specifically, HeatSeeker draws false color maps of correlations between genes in the network, for each cohort. The genes in each heatmap are hierarchically clustered using complete linkage, by correlation distance, within their cohort, and the appropriate dendrogram is drawn on the heatmap.
Each cohort’s heatmap is redrawn using the clustering of the other cohort, for comparison purposes.
HeatSeeker also draws false color maps of the difference between the correlations in the first and the second cohort. One heatmap of the difference is drawn for each of the two cohorts’ clusterings.
The set of genes used in this procedure is the union of all genes in both networks. (Only genes with homologs in both networks are considered in a two-species analysis.)
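The clustering step that orders each heatmap can be sketched as follows. This is an illustration only: it assumes that ‘correlation distance’ means 1 - r, and the gene names and correlation values are hypothetical.

```python
from itertools import combinations

def correlation_distance(corr, a, b):
    """Distance between two genes, assumed here to be 1 - r."""
    r = corr[(a, b)] if (a, b) in corr else corr[(b, a)]
    return 1.0 - r

def complete_linkage(genes, corr):
    """Agglomerative clustering; returns the merges in the order they occur.

    Each merge is (cluster_a, cluster_b, distance), where clusters are tuples
    of gene names. This merge order is what a dendrogram would display.
    """
    clusters = [(g,) for g in genes]
    merges = []
    while len(clusters) > 1:
        best = None
        for c1, c2 in combinations(clusters, 2):
            # Complete linkage: cluster distance is the LARGEST pairwise distance.
            d = max(correlation_distance(corr, a, b) for a in c1 for b in c2)
            if best is None or d < best[0]:
                best = (d, c1, c2)
        d, c1, c2 = best
        clusters = [c for c in clusters if c not in (c1, c2)]
        clusters.append(c1 + c2)
        merges.append((c1, c2, d))
    return merges

# Hypothetical correlations among four genes in one cohort.
corr = {
    ("A", "B"): 0.9, ("C", "D"): 0.8,
    ("A", "C"): 0.1, ("A", "D"): 0.0,
    ("B", "C"): 0.2, ("B", "D"): -0.1,
}
merges = complete_linkage(["A", "B", "C", "D"], corr)
```

Redrawing one cohort's heatmap ‘using the clustering of the other cohort’ would then amount to reusing the gene order from the other cohort's merges.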
HeatSeeker allows retrieval of data in a tabular format, one table per image. Each table presents numerical values of correlation distances, or differences in correlation distances, as appropriate.
Differences are tested for statistical significance: correlations are transformed (Fisher r to Z transform) and differences compared against a standard normal. Significant differences are flagged in the text files.
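The significance test described above can be sketched as follows. The standard error formula (based on the number of samples in each cohort) and the two-sided p-value are assumptions; the abstract only states that Fisher-transformed differences are compared against a standard normal.

```python
import math
from statistics import NormalDist

def correlation_difference_test(r1, n1, r2, n2):
    """Test whether two independent correlations differ.

    r1, r2: correlations for the same gene pair in cohort 1 and cohort 2
    n1, n2: number of array samples in each cohort (must exceed 3)
    Returns (z statistic, two-sided p-value).
    """
    z1, z2 = math.atanh(r1), math.atanh(r2)           # Fisher r-to-Z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))   # assumed standard error
    z = (z1 - z2) / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))        # compare to standard normal
    return z, p

# Hypothetical gene pair: strongly correlated in one cohort, weakly in the other.
z_stat, p_value = correlation_difference_test(0.9, 200, 0.2, 150)
```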
Getting started with StarNet -- The easiest way to see what StarNet does is to enter the symbol for your favorite gene, and your favorite species, and click submit.
This will draw networks using the manufacturer's default parameters. If you don’t know the official symbol or Entrez ID for a gene, you can use the Gene ID lookup tool on the StarNet web-site front page to search by keyword or by partial symbols.
StarNet Data sets -- The data used (as stated above...) was collected from NCBI's Gene Expression Omnibus (GEO). The manufacturer chose ten (10) species to examine, and for each chose a suitable Affymetrix microarray platform.
Microarray samples were chosen for each species. Within a species, where possible, the manufacturer has selected a subset of arrays pertaining to development; thus users can compare networks drawn from a full cohort and from a development-specific cohort of array samples.
The number of array samples per species ranges from 100 to 3,000, with each array platform containing between 5,000 and 23,000 genes. A list of species, with numbers of full and development cohort samples, GEO identifiers, and numbers of genes, is available on the Species page.
Further information regarding which GEO series were chosen for each platform and each cohort is available on the GSE descriptions page. Lists of specific samples chosen for each species and cohort can be found at the cel files used page.
StarNet Data processing -- Raw microarray data were normalized using the RMA normalization method available in BioConductor, and pairwise Pearson correlation coefficients (as stated above...) were calculated between the expression patterns of the genes within each array platform, using Octave. The resulting correlations were (as stated above...) loaded into a MySQL database for further processing, to obtain more tractable and meaningful subsets of correlations.
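The pairwise correlation step can be illustrated with a short sketch. The manufacturer used Octave; Python is used here only for illustration, and the gene names and expression values are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles.

    Note: a constant profile has zero variance and would raise
    ZeroDivisionError here.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def pairwise_correlations(profiles):
    """Return {(gene_a, gene_b): r} for every unordered pair of genes."""
    return {
        (g1, g2): pearson(profiles[g1], profiles[g2])
        for g1, g2 in combinations(sorted(profiles), 2)
    }

# Hypothetical normalized expression values:
# one list per gene, one entry per array sample.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}
correlations = pairwise_correlations(profiles)
```

For a platform with G genes this produces G(G-1)/2 coefficients, which is why the results are precomputed and then reduced to smaller subsets.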
For each species and each cohort, the manufacturer first selected the 100,000 largest negative and the 100,000 largest positive correlations, and also combined these two distributions into one. Because many genes are highly ‘connected’ in these tails, the 100K tails contain only a small subset of the genes that actually appear on an array.
To get complete coverage of features on the array, the manufacturer created a ‘Gene-centric’ distribution, which contains the top 10 positive and top 10 negative correlations for every feature. The manufacturer used the same ‘top 10 positive/negative’ approach to construct two (2) specialty distributions: one where both genes have a GO annotation matching ‘transcription’ and one where each gene matches either ‘transcription’ or ‘signal’.
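One way to build such a ‘Gene-centric’ subset is sketched below. It assumes the correlations are held as a mapping from gene pairs to coefficients and that ‘top negative’ means most negative; the function name and the example data are hypothetical.

```python
from collections import defaultdict

def gene_centric(correlations, k=10):
    """Keep, for every gene, its k strongest positive and k strongest
    negative correlations. Returns {(gene_a, gene_b): r} with sorted pairs."""
    partners = defaultdict(list)
    for (g1, g2), r in correlations.items():
        partners[g1].append((r, g2))
        partners[g2].append((r, g1))

    kept = {}
    for gene, scored in partners.items():
        top_positive = sorted((s for s in scored if s[0] > 0), reverse=True)[:k]
        top_negative = sorted(s for s in scored if s[0] < 0)[:k]
        for r, other in top_positive + top_negative:
            # An edge is kept if it ranks in the top k for EITHER endpoint.
            kept[tuple(sorted((gene, other)))] = r
    return kept

# Hypothetical correlations; k=1 keeps the example small.
correlations = {
    ("a", "b"): 0.90, ("a", "c"): 0.80, ("a", "d"): 0.70,
    ("b", "c"): 0.95, ("b", "d"): 0.60, ("c", "d"): 0.85,
    ("d", "e"): -0.50, ("a", "e"): -0.10,
}
subset = gene_centric(correlations, k=1)
```

Unlike the 100K tails, this selection guarantees that every gene with at least one correlation appears in the subset.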
StarNet draws sub-graphs of larger correlation networks starting with your gene of interest in the center of the graph. Networks are drawn in concentric ‘levels’, where the first level consists of genes that are directly connected to the gene of interest. The second level consists of genes directly connected to genes in the first level, and so on.
StarNet can draw the following types of networks:
1) Levels - this draws every connection in the specified distribution for N levels, starting from your gene.
2) Levels with Internal Edges - same as above, but connections within a level or back to a lower level are allowed.
3) Weight - the user supplies a cutoff; the product of the coefficients in the path from your selected gene to any other gene must be higher than this cutoff; up to N levels are drawn, where N is user specified.
4) Highest - draws the top n connections for your specified gene, then does the same for the next level, and so on; up to N levels are drawn.
5) Highest with Internal Edges - same as above, but connections within a level or back to a lower level are allowed.
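The ‘Levels’ and ‘Levels with Internal Edges’ options can be approximated by a breadth-first search from the gene of interest. The sketch below is an interpretation of the description above, not StarNet's actual code; the function name and example data are hypothetical.

```python
from collections import defaultdict

def levels_network(correlations, start, n_levels, internal_edges=False):
    """Collect genes within n_levels of `start` and the edges between them.

    correlations: {(gene_a, gene_b): r}, the chosen distribution
    Returns (level_of, edges): the level of each gene and the kept edges.
    """
    neighbors = defaultdict(set)
    for g1, g2 in correlations:
        neighbors[g1].add(g2)
        neighbors[g2].add(g1)

    # Breadth-first search: level k holds genes first reached in k steps.
    level_of = {start: 0}
    frontier = [start]
    for level in range(1, n_levels + 1):
        next_frontier = []
        for gene in frontier:
            for other in sorted(neighbors[gene]):
                if other not in level_of:
                    level_of[other] = level
                    next_frontier.append(other)
        frontier = next_frontier

    edges = {}
    for (g1, g2), r in correlations.items():
        if g1 in level_of and g2 in level_of:
            # Edges between different levels are always drawn; edges within
            # one level are drawn only when internal edges are requested.
            if level_of[g1] != level_of[g2] or internal_edges:
                edges[(g1, g2)] = r
    return level_of, edges

# Hypothetical distribution centered on gene "g".
correlations = {
    ("g", "a"): 0.9, ("g", "b"): 0.8, ("a", "b"): 0.7,
    ("a", "c"): 0.6, ("c", "d"): 0.5,
}
levels, plain_edges = levels_network(correlations, "g", 2)
_, all_edges = levels_network(correlations, "g", 2, internal_edges=True)
```

The ‘Highest’ options would differ by following only the top n connections of each gene at every step, and the ‘Weight’ option by pruning paths whose product of coefficients falls below the user's cutoff.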
Clicking on a graph drawn by StarNet will spawn a new page for that cohort, where the nodes are linked to NCBI's gene description. Below the graphs, on both the main page and cohort-specific pages, there is supporting information and analysis, including a gene list; an edge list with confidence intervals; a list of the genes in the network that match the user-supplied GO search terms (default = ‘transcription’), with matching nodes also highlighted in red on the graph; GO terms enriched in the network compared with the whole array platform; and small networks of known interactions for genes in the correlation networks.
On the main page, in addition, there are lists of genes common to both networks drawn, or homologous genes in the case of two species; and a link to HeatSeeker, if two (2) networks were drawn.
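The abstract does not state which statistical test is used to find GO terms enriched in a network. A minimal sketch, assuming a one-sided hypergeometric test (a common choice for this kind of comparison against the whole array); all names and counts are hypothetical:

```python
from math import comb

def enrichment_p_value(array_size, array_hits, network_size, network_hits):
    """Probability of seeing at least `network_hits` annotated genes in a
    network of `network_size` genes drawn from an array of `array_size`
    genes, of which `array_hits` carry the GO term.

    Assumption: genes are treated as drawn without replacement
    (hypergeometric upper tail).
    """
    total = comb(array_size, network_size)
    upper = min(array_hits, network_size)
    tail = sum(
        comb(array_hits, k) * comb(array_size - array_hits, network_size - k)
        for k in range(network_hits, upper + 1)
    )
    return tail / total

# Hypothetical counts: 500 of 10,000 array genes carry the term,
# and 10 of the 30 genes in the drawn network carry it.
p_enriched = enrichment_p_value(10000, 500, 30, 10)
```

A small p-value suggests the term is over-represented in the network relative to the array; in practice a correction for testing many GO terms would also be needed.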
System Requirements
Web-based.
Manufacturer
- VanBuren Lab
- Department of Systems Biology and Translational Medicine
- College of Medicine
- Texas A & M Health Science Center
- Temple, TX 76504, USA
Manufacturer Web Site StarNet 2
Price Contact manufacturer.
G6G Abstract Number 20569
G6G Manufacturer Number 104176




