OSCAR: Open System for Clustering Analysis of Microarray Data
Category Genomics>Gene Expression Analysis/Profiling/Tools
Abstract OSCAR (Open System for Clustering Analysis of Microarray Data) is an online interactive server for clustering analysis and cross-species analysis of microarray data, with an automated procedure to incorporate and manage all clustering algorithms.
It provides a comprehensive and friendly environment to both users and algorithm developers.
A database system was developed to manage all the algorithms, including their documentation, their parameters, each parameter's description, type, bounds and default value.
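As a rough illustration of what such an algorithm record might hold, here is a minimal Python sketch. The class names (ParameterSpec, AlgorithmRecord), the field names and the documentation path are hypothetical and are not taken from OSCAR's actual database schema. The sketch only shows how a description, type, bounds and default value per parameter can drive validation of submitted values.

```python
from dataclasses import dataclass, field
from typing import Optional, Union

Number = Union[int, float]


@dataclass
class ParameterSpec:
    """One parameter entry: description, type, bounds and default (hypothetical layout)."""
    name: str
    description: str
    kind: type                      # e.g. int or float
    default: Number
    lower: Optional[Number] = None  # None means "no lower bound"
    upper: Optional[Number] = None  # None means "no upper bound"

    def validate(self, value):
        """Coerce a submitted value to the declared type and check the bounds."""
        value = self.kind(value)
        if self.lower is not None and value < self.lower:
            raise ValueError(f"{self.name} must be >= {self.lower}")
        if self.upper is not None and value > self.upper:
            raise ValueError(f"{self.name} must be <= {self.upper}")
        return value


@dataclass
class AlgorithmRecord:
    """One algorithm entry: name, documentation link and parameter list."""
    name: str
    documentation_url: str
    parameters: list = field(default_factory=list)

    def resolve(self, user_values):
        """Fill in defaults for missing parameters and validate every value."""
        return {
            spec.name: spec.validate(user_values.get(spec.name, spec.default))
            for spec in self.parameters
        }


# Hypothetical entry; the bounds and the documentation path are placeholders.
kmeans = AlgorithmRecord(
    name="K-means",
    documentation_url="docs/kmeans.html",
    parameters=[ParameterSpec("k", "number of clusters", int, default=5, lower=2, upper=100)],
)
```

With a record like this, a web form can be generated from the parameter list alone, and values typed into the form (which arrive as strings) are checked before any computation starts.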
When a user accesses the OSCAR website, the server will automatically list all the algorithms currently available, together with a URL to the documentation for each of the algorithms listed.
When a user chooses a particular algorithm, all information about the parameters and input files of the algorithm is retrieved from the algorithm database and automatically displayed to the user.
Users can use the interactive web forms to adjust the parameters, upload input data and execute the computation on the server.
Algorithm developers can use the interactive web forms to incorporate their own algorithms into OSCAR without revealing their source code.
The submitted algorithm will be managed by OSCAR's database, share the same output format as the other algorithms, and be accessible to all users.
OSCAR for users -- OSCAR provides an intuitive web interface to users.
When a user accesses the OSCAR main page, all currently available algorithms and hyperlinks to their documentation will be retrieved from the algorithm database and displayed.
The user can select any algorithm listed. Upon selection, the specifications and default values for all the parameters required by the user-selected algorithm will be retrieved from the database and displayed.
The user can modify the default parameters, upload input data and execute the computation.
Sample input files are provided. Users who want to quickly try out an algorithm and get a sense of what it does can click the ‘Submit using Sample Files’ button.
Output is provided in two formats: Text and an Interactive Heat map.
The user can save both outputs to a local computer by clicking the disk icon in the upper right corner of the web page.
Users can alter the color scheme used by the Heat map by clicking the ‘Change Color’ button. Two schemes are provided: red-green and blue-yellow.
Mousing over a sample name or any spot within the heat map will invoke a small pop-up window next to the cursor, containing either information about the sample or the gene expression value used to draw the color of the area under the cursor.
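The mapping from an expression value to a heat map color can be sketched as follows. This is an assumption-laden illustration, not OSCAR's rendering code: the function name expression_to_rgb, the saturation limit of 2.0, and the convention that positive values take the first color of each scheme are all hypothetical.

```python
def expression_to_rgb(value, limit=2.0, scheme="red-green"):
    """Map an expression value to an (r, g, b) tuple with channels in 0-255.

    Values are clamped to [-limit, limit]; zero is drawn black and the
    color intensity grows linearly with the absolute value.
    """
    t = max(-limit, min(limit, value)) / limit   # scaled to [-1, 1]
    level = round(255 * abs(t))
    if scheme == "red-green":
        # Assumed convention: positive -> red, negative -> green.
        return (level, 0, 0) if t >= 0 else (0, level, 0)
    if scheme == "blue-yellow":
        # Assumed convention: positive -> yellow, negative -> blue.
        return (level, level, 0) if t >= 0 else (0, 0, level)
    raise ValueError(f"unknown color scheme: {scheme}")
```

Switching schemes then only changes which channels carry the intensity; the underlying expression value shown in the pop-up window stays the same.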
Clustering Algorithms -- The algorithms currently available to OSCAR users are:
1) Hierarchical clustering,
2) K-means,
3) partition around medoids (PAM),
4) self-organizing map (SOM),
5) tight clustering,
6) Consensus Tight-clustering (new), and
7) two-species coherent clustering (new).
Users can choose any of the following distance metrics to be used in hierarchical clustering, K-means and PAM:
- a) Pearson correlation,
- b) absolute value of the Pearson correlation,
- c) uncentered Pearson correlation,
- d) absolute uncentered Pearson correlation,
- e) Spearman's rank correlation,
- f) Kendall's tau,
- g) Euclidean distance, and
- h) city-block distance.
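For readers unfamiliar with these metrics, the following Python sketch computes all eight for a pair of expression profiles. It is a minimal illustration and makes two assumptions that the text above does not confirm: correlation-type measures are turned into distances as 1 minus the correlation, and Kendall's tau is computed in its simple tau-a form. The dictionary keys are illustrative names, not OSCAR option names.

```python
import math


def pearson(x, y, centered=True):
    """Pearson correlation; centered=False treats both means as zero (uncentered)."""
    mx = sum(x) / len(x) if centered else 0.0
    my = sum(y) / len(y) if centered else 0.0
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den  # undefined (division by zero) for constant profiles


def ranks(v):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0.0] * len(v)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and v[order[j + 1]] == v[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


# Assumed convention: distance = 1 - correlation for the correlation-based metrics.
DISTANCES = {
    "pearson": lambda x, y: 1 - pearson(x, y),
    "abs_pearson": lambda x, y: 1 - abs(pearson(x, y)),
    "uncentered": lambda x, y: 1 - pearson(x, y, centered=False),
    "abs_uncentered": lambda x, y: 1 - abs(pearson(x, y, centered=False)),
    "spearman": lambda x, y: 1 - pearson(ranks(x), ranks(y)),
    "kendall": lambda x, y: 1 - kendall_tau(x, y),
    "euclidean": lambda x, y: math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y))),
    "city_block": lambda x, y: sum(abs(a - b) for a, b in zip(x, y)),
}
```

Under this convention, the absolute-value variants treat perfectly anti-correlated profiles as identical (distance 0), whereas the plain Pearson distance places them as far apart as possible (distance 2).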
Three linkage definitions are allowed in hierarchical clustering:
1) single linkage,
2) complete linkage, and
3) pairwise average.
For the purpose of comparison, hierarchical clustering will give the usual co-expression groups as outputs instead of hierarchical trees.
This is achieved by trimming the hierarchical tree to the maximum number of clusters specified by the user.
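The idea of turning a hierarchical tree into flat co-expression groups can be sketched in Python as below. This is a naive illustration under stated assumptions, not OSCAR's implementation: the function name hierarchical_groups is hypothetical, Euclidean distance is used only for brevity, and instead of building the full tree and cutting it, the merging simply stops once the requested number of groups remains, which gives the same groups for the three linkages listed above.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Each linkage reduces the list of pairwise distances between two clusters
# to a single between-cluster distance.
LINKAGES = {
    "single": min,                               # closest pair
    "complete": max,                             # farthest pair
    "average": lambda d: sum(d) / len(d),        # pairwise average
}


def hierarchical_groups(profiles, max_clusters, linkage="average", dist=euclidean):
    """Agglomerative clustering that stops merging at max_clusters groups.

    Returns a list of clusters, each a list of row indices into profiles.
    """
    combine = LINKAGES[linkage]
    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > max_clusters:
        best = None
        # Find the two clusters with the smallest linkage distance.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairwise = [dist(profiles[i], profiles[j])
                            for i in clusters[a] for j in clusters[b]]
                d = combine(pairwise)
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge b into a
        del clusters[b]
    return clusters
```

Returning flat groups rather than a tree puts the hierarchical result in the same form as the output of K-means or PAM, so the results can be compared directly.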
Three (3) new algorithms are offered to users through OSCAR: one (1) General Clustering tool and two (2) Two-species analysis tools.
General Clustering tool - ‘Consensus Tight-Clustering’ is a new algorithm that blends the advantages of two recently published non-parametric clustering algorithms: tight clustering and robust multi-scale clustering.
Two-species analysis tools - the two (2) Cross-species Clustering tools are: Coherence Clustering (coherentCluster) and Coherent Subset (coherentSubset).
Both of these tools identify clusters in which homologous genes show co-expression in both species.
The biological motivation is: if a group of genes shows conserved expression patterns in data sets from two different species, then it is very likely that the group is under evolutionary constraint and is thus a functionally related group.
This will allow one to distinguish between gene clusters with biological significance and the clusters that are consequences of experimental or computational artifacts.
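The notion of co-expression that is conserved across two species can be made concrete with a small Python sketch. This is not the coherentCluster or coherentSubset algorithm, whose details are not given here. It is only a hypothetical scoring function (coherence) that assumes expression profiles are stored per gene and that a homolog map between the two species is available. A candidate group scores highly only if its genes are co-expressed in both species.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Centered Pearson correlation between two expression profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den


def mean_coexpression(expr, group):
    """Mean pairwise correlation among the genes of group within one species."""
    pairs = list(combinations(group, 2))
    return sum(pearson(expr[g], expr[h]) for g, h in pairs) / len(pairs)


def coherence(expr_a, expr_b, homologs, group):
    """Score a group of species-A genes by the weaker of its two co-expression levels.

    expr_a, expr_b: dicts mapping gene name -> expression profile (hypothetical layout).
    homologs: dict mapping each species-A gene to its species-B homolog.
    """
    in_a = mean_coexpression(expr_a, group)
    in_b = mean_coexpression(expr_b, [homologs[g] for g in group])
    return min(in_a, in_b)


# Toy data: a1/a2 are co-expressed in both species; a1/a3 only in species A.
expr_a = {"a1": [1, 2, 3, 4], "a2": [2, 4, 6, 8], "a3": [1, 2, 3, 5]}
expr_b = {"b1": [1, 2, 3], "b2": [2, 4, 6], "b3": [3, 2, 1]}
homologs = {"a1": "b1", "a2": "b2", "a3": "b3"}
```

In the toy data, the group of a1 and a3 looks like a cluster in species A alone but scores poorly once species B is taken into account, which is the kind of group the two-species tools are meant to filter out.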
System Requirements
Web based
Manufacturer
- Sheng Zhong Lab
- Computational Biology Lab
- Department of Bioengineering
- University of Illinois at Urbana Champaign
Manufacturer Web Site OSCAR
Price Contact manufacturer.
G6G Abstract Number 20329
G6G Manufacturer Number 102859




