Web site and design © 2008 by G6G Consulting Group. All Rights Reserved. Most product content has been taken directly from manufacturer's web sites;
other product content is assembled by G6G Consulting Group. G6G welcomes any corrections and/or comments.
    GenePattern Gene Expression Analysis Module

    Category  Genomics>Gene Expression Analysis/Profiling/Tools

    Abstract  GenePattern combines an advanced scientific workflow
    platform with more than 90 computational and visualization tools for the
    analysis of genomic data. GenePattern provides support for four (4)
    broad categories of gene expression analysis:

    1) Differential Analysis/Marker Selection; 2) Class Prediction
    (Supervised Learning); 3) Class Discovery (Unsupervised Learning);
    and 4) Pathway Analysis. GenePattern also supports several data
    conversion tasks (see G6G Product Number 20183), such as filtering
    and normalizing, which are standard prerequisites for genomic data
    analysis.

    1) Differential Analysis/Marker Selection -- Differential analysis, also
    known as 'marker selection', is the search for genes that are
    differentially expressed in distinct phenotypes. GenePattern can assess
    differential expression using either the signal-to-noise ratio or t-test
    statistic. GenePattern provides the following support for differential
    analysis:

    a) Comparative Marker Selection - ranks the genes based on the value
    of the statistic being used to assess differential expression and uses
    permutation testing to compute the significance (nominal p-value) of the
    rank assigned to each gene.

    Because so many genes are tested against the null hypothesis of no
    differential expression, many genes are likely to have significant
    p-values by chance alone. The analysis adjusts for multiple hypothesis
    testing using a number of statistical approaches, including false
    discovery rate (FDR) and family-wise error rate (FWER). You can control
    the ranking based on the statistic most appropriate for your data.
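
    The scoring, permutation test, and multiple-testing adjustment described
    above can be sketched in a few lines of Python. This is an illustrative
    sketch only, not GenePattern's implementation: the function names are
    hypothetical, the signal-to-noise form (difference of class means over
    the sum of class standard deviations) is assumed, and Benjamini-Hochberg
    is assumed here as one common FDR procedure.

```python
import random
import statistics


def signal_to_noise(values, labels):
    """(mean_0 - mean_1) / (sd_0 + sd_1) for one gene; labels are 0 or 1."""
    a = [v for v, c in zip(values, labels) if c == 0]
    b = [v for v, c in zip(values, labels) if c == 1]
    # Assumes each class has at least two samples and some spread.
    return (statistics.mean(a) - statistics.mean(b)) / (
        statistics.stdev(a) + statistics.stdev(b)
    )


def permutation_p_value(values, labels, n_permutations=1000, seed=0):
    """Nominal two-sided p-value obtained by shuffling the class labels."""
    rng = random.Random(seed)
    observed = abs(signal_to_noise(values, labels))
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(signal_to_noise(values, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


def benjamini_hochberg(p_values):
    """FDR-adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to keep adjusted values monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

    In use, one would compute a nominal p-value per gene with
    permutation_p_value and then pass the full list of p-values to
    benjamini_hochberg before choosing a cutoff.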

    b) Class Neighbors - helps you identify genes whose expression
    pattern is strongly correlated with a phenotype. This analysis, developed
    by scientists at the Broad Institute, “defines an ‘idealized expression
    pattern’ corresponding to a gene that is uniformly high in one class and
    uniformly low in the other. [It] tests whether there is an unusually high
    density of genes ‘nearby’ (that is, similar to) this idealized pattern, as
    compared to equivalent random patterns.” [Golub T.R., Slonim D.K., et
    al. “Molecular Classification of Cancer: Class Discovery and Class
    Prediction by Gene Expression Monitoring,” Science 286, 531-537 (1999).]
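
    The idea of an idealized expression pattern can be illustrated with a
    short Python sketch. It builds a pattern that is 1 in one class and 0 in
    the other and ranks genes by their similarity to it. The function names
    and sample data are hypothetical, and Pearson correlation is an assumed
    similarity measure; the comparison against random patterns described in
    the quotation is not covered here.

```python
import math


def pearson(x, y):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def neighbors_of_ideal(expression, labels, top_n=2):
    """Return the top_n genes most correlated with the idealized pattern.

    expression maps a gene name to its values across samples;
    labels holds the class (0 or 1) of each sample.
    """
    # Idealized pattern: uniformly high in class 0, uniformly low in class 1.
    ideal = [1.0 if c == 0 else 0.0 for c in labels]
    scored = [(pearson(profile, ideal), gene)
              for gene, profile in expression.items()]
    scored.sort(reverse=True)
    return [gene for _, gene in scored[:top_n]]


expression = {
    "geneA": [9.0, 8.5, 9.2, 1.0, 1.3, 0.8],
    "geneB": [1.1, 0.9, 1.2, 7.9, 8.3, 8.0],
    "geneC": [5.0, 4.0, 6.0, 5.5, 4.5, 5.2],
}
labels = [0, 0, 0, 1, 1, 1]
closest = neighbors_of_ideal(expression, labels, top_n=1)
```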

    c) Heat Map Viewer - shows you differential expression by displaying
    gene expression values in a heat map format. Each colored cell in the
    heat map represents the gene expression value for a probe in a
    sample. The largest gene expression values are displayed in red (hot),
    the smallest values in blue (cool), and intermediate values in shades of
    red (pink) or blue.
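
    A minimal Python sketch of such a color scale follows. It assumes a
    linear mapping from the smallest to the largest value, passing through
    white at the midpoint; the function name and the exact scaling are
    assumptions and may differ from what the viewer actually does.

```python
def heat_color(value, low, high):
    """Map a value in [low, high] to an (R, G, B) tuple.

    Low values are blue, high values are red, and values in between
    fade through lighter shades toward white at the midpoint.
    """
    if high == low:
        return (255, 255, 255)
    t = (value - low) / (high - low)
    t = min(1.0, max(0.0, t))  # clamp values outside the range
    if t >= 0.5:
        fade = round(255 * (1.0 - (t - 0.5) * 2.0))
        return (255, fade, fade)  # shades of red, pink near the middle
    fade = round(255 * (t * 2.0))
    return (fade, fade, 255)  # shades of blue
```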

    2) Class Prediction (Supervised Learning) -- Supervised learning, also
    known as class prediction, is the search for a gene expression
    signature that predicts class (phenotype) membership. The basic
    methodology for class prediction is to start with two (2) data sets, a
    training set and test set; use your training data set to build a classifier
    (class predictor) based on your chosen classification method; and use
    your test data set to test the classifier. GenePattern provides the
    following support for class prediction:

    a) GenePattern supports class prediction based on several
    classification methods, including classification and regression trees
    (CART), K-nearest neighbors (KNN), probabilistic neural network
    (PNN), Weighted Voting, and Support Vector Machines (SVM). Most of
    the class prediction methods supported by GenePattern have been
    used in research published by scientists at the Broad Institute.

    b) For each classification method, GenePattern also supports class
    prediction based on leave-one-out cross-validation. For small data sets,
    rather than creating training and test data sets, cross-validation divides
    a data set into n folds; in the leave-one-out case, each fold holds a
    single sample. For each fold, the analysis trains on the other n-1 folds
    and tests on the held-out fold. After iteratively training and testing on
    all folds, the analysis combines the results to determine the classifier.
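
    Leave-one-out cross-validation can be sketched in Python with a simple
    k-nearest-neighbors classifier. The function names and data are
    hypothetical, and "combining the results" is assumed here to mean
    reporting the fraction of held-out samples that were predicted correctly.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, sample, k=3):
    """Majority vote among the k training samples closest to sample."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], sample),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(samples, labels, k=3):
    """Hold out each sample once, train on the rest, and score the guesses."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, samples[i], k) == labels[i]:
            correct += 1
    return correct / len(samples)
```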

    c) GenePattern provides a tool for splitting a single data set into non-
    overlapping training and test data sets.
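
    As a rough illustration of such a split, the following Python sketch
    shuffles sample identifiers and cuts the list in two. The function name,
    the default fraction, and the seed are assumptions made for the example.

```python
import random


def split_train_test(sample_ids, test_fraction=0.25, seed=0):
    """Return (train, test) lists that share no sample."""
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    # Keep at least one sample in the test set.
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]
```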

    3) Class Discovery (Unsupervised Learning) -- Unsupervised learning,
    also known as class discovery, is the search for a biologically relevant
    unknown taxonomy identified by a gene expression signature or a
    biologically relevant set of co-expressed genes.

    The basic methodology for class discovery is clustering: you cluster the
    data based on your chosen clustering method and then validate the
    clusters through gene annotations, enrichment analysis (are the
    clusters enriched for genes from functionally important categories,
    pathways, or processes?), or by replicating the results in other data sets.
    GenePattern provides the following support for clustering:

    a) GenePattern supports several traditional clustering methods,
    including consensus clustering, hierarchical clustering, and self-
    organizing maps (SOM clustering).

    b) For validating clusters, GenePattern provides tools for retrieving
    annotations and for splitting a single data set into non-overlapping
    training and test data sets.

    Clustering is the traditional method for class discovery. GenePattern
    also supports the following less traditional methods:

    c) Non-negative matrix factorization (NMF) is an algorithm used in
    various fields, such as text mining and music analysis, to decompose
    multivariate data.
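
    To make the decomposition concrete, here is a small pure-Python sketch
    of NMF using multiplicative updates (the Lee and Seung scheme for squared
    error). It factors a non-negative matrix V into non-negative W and H with
    V approximately equal to W times H. The choice of update rule, the
    function names, and the parameters are assumptions for illustration, not
    a description of the GenePattern module.

```python
import random


def matmul(a, b):
    """Plain list-of-lists matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor v (n x m) into w (n x rank) and h (rank x m), all >= 0."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # Update h: h * (w^T v) / (w^T w h), element by element.
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # Update w: w * (v h^T) / (w h h^T), using the new h.
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def reconstruction_error(v, w, h):
    """Sum of squared differences between v and w times h."""
    approx = matmul(w, h)
    return sum((a - b) ** 2
               for row_v, row_a in zip(v, approx)
               for a, b in zip(row_v, row_a))
```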

    d) Principal components analysis (PCA) is a statistical technique used
    in various fields, such as face recognition and image compression, to
    determine the key variables in a multi-dimensional data set that can
    explain the differences in observations.
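
    The following pure-Python sketch finds the first principal component of
    a small data set by power iteration on the covariance matrix. It is a
    teaching example under stated assumptions (hypothetical function name,
    fixed iteration count, a start vector that is not orthogonal to the
    leading direction), not the method GenePattern necessarily uses.

```python
import math


def first_principal_component(rows, iterations=200):
    """rows is samples x features; returns (unit direction, sample scores)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the features (needs n >= 2).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Uneven start vector; assumed not orthogonal to the leading direction.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered sample onto the component.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores
```

    The returned scores give each sample's position along the direction of
    greatest variance, which is the single variable that best explains the
    differences between observations in this sketch.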

    4) Pathway Analysis -- Pathway analysis is the search for sets of genes
    differentially expressed in distinct phenotypes. GenePattern provides
    the following support for pathway analysis:

    a) KSscore computes a Kolmogorov-Smirnov non-parametric rank
    statistic representing the positional distribution of a set of genes within
    an ordered list of genes. You can use this analysis to examine the
    enrichment of a set of genes at the top of an ordered list; the KSscore is
    high when the genes in the gene set appear near the top of the ordered
    list.
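
    One simple way to compute a statistic of this kind is a running sum
    over the ordered list, as in the Python sketch below. The step sizes
    (1/hits for genes in the set, 1/misses for the rest) and the function
    name are assumptions; the KSscore module may use a different
    normalization.

```python
def ks_score(ordered_genes, gene_set):
    """Maximum of a running sum walked down the ordered gene list.

    The sum rises when a gene belongs to gene_set and falls otherwise,
    so it peaks early when the set is concentrated near the top.
    Assumes each gene appears once in ordered_genes.
    """
    members = set(gene_set) & set(ordered_genes)
    n_hits = len(members)
    n_miss = len(ordered_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, genes")
    running = 0.0
    best = 0.0
    for gene in ordered_genes:
        if gene in members:
            running += 1.0 / n_hits
        else:
            running -= 1.0 / n_miss
        best = max(best, running)
    return best
```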

    b) Gene Set Enrichment Analysis (GSEA) determines whether an a
    priori defined set of genes shows statistically significant, concordant
    differences between two biological states (e.g. phenotypes). The GSEA
    software packages the method, making it easy to run the analysis and
    review the results. GSEA will soon be available as a GenePattern
    module.

    c) In addition, GenePattern provides tools for retrieving annotations,
    which aid in understanding gene sets and gene set enrichment results.

    System Requirements  (from GenePattern 3.1 release notes)

    Supported operating systems: GenePattern installers are available for
    Windows, Mac OS X, and Linux. GenePattern should work with any
    operating system that has a Java 1.5 virtual machine installed. We have
    tested it on the following OS platforms:

    Windows        XP, Vista
    Mac        OS X 10.4 (Tiger), OS X 10.5 (Leopard)
    Linux        Ubuntu 7.10, SuSE

    Users are also running GenePattern on the Red Hat, Debian, Gentoo,
    Mandrake and Fedora distributions of Linux.

    Supported browsers: The GenePattern Web Client has been tested on
    the following browsers:

    Windows        Firefox 2.0, MS Internet Explorer 6.0 and 7.0
    Mac        Firefox 2.0, Safari 2.0
    Linux        Firefox 2.0

    Safari: By default, Safari enables the Open "safe" files after
    downloading preference. This setting prevents GenePattern from correctly
    exporting and importing zip files. To clear this preference: open Safari,
    select Safari>Preferences, select General preferences, and clear the Open
    "safe" files after downloading check box.

    Current technology versions: The following technology versions are
    used in GenePattern 3.1:
    o        Java 1.5
    o        R 2.5.0
    o        Perl 5.8.8
    o        Tomcat 5.5.* series
    o        HSQL 1.8.0

    Hardware requirements: GenePattern's hardware requirements are met by
    almost all currently available machines:
    o        256 MB RAM
    o        500 MHz Pentium 3 or equivalent
    o        Hard drive space:
            Server: 252 MB
            Client: 84 MB

    As of December 2007, installing all GenePattern modules from the
    Broad repository requires approximately 1 GB of hard drive space. The
    SNPFileCreator module may require additional RAM depending on the
    chip type and number of CEL files being processed.

    Manufacturer   

    Broad Institute of MIT and Harvard
    7 Cambridge Center
    Cambridge, MA 02142
    Ph: 617.452.3000
    Fax: 617.452.4588
    or
    320 Charles Street
    Cambridge, MA 02141-2023
    Ph: 617.258.0900
    Fax: 617.258.0901
    E-mail: gp-help@broad.mit.edu

    Manufacturer's Web Site  www.broad.mit.edu/cancer/software/genepattern/desc/expression.html

    Price   Contact manufacturer

    G6G Product Number  20181

    G6G Manufacturer Number 101795
The G6G Directory of Omics and Intelligent Software