The DICS repository

Category Cross-Omics>Knowledge Bases/Databases/Tools

Abstract The DICS (Dense modules from protein interaction networks) repository (database) is a dynamic web repository (server) of computationally predicted 'functional modules' from the human protein-protein interaction network.

It provides references to the CORUM, DrugBank, KEGG, and Reactome (see G6G Abstract Number 20267) pathway databases.

DICS can be accessed for retrieving sets of ‘overlapping modules’ and ‘protein complexes’ that are significantly enriched in a gene list, thereby providing valuable information about the functional context.

DICS offers experimental researchers carefully ‘benchmarked modules’ at multiple granularities that are compiled from the human protein-protein interaction network.

The web-server supports the exploration of measurement data, such as those resulting from genome-wide expression, proteomics and whole genome association studies, by providing enriched 'functional modules' and protein complexes, as well as disease-related annotation.

The DICS Server -- At the heart of the DICS server is an algorithm that exhaustively enumerates all modules from the human protein-protein interaction network whose density exceeds a pre-specified threshold (Uno, 2007).

The density of a module is defined by the number of known direct interactions between genes within the module divided by the number of interactions in a clique formed using those genes.
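The density definition above can be illustrated with a minimal Python sketch. The function name `module_density` and the toy network are hypothetical and are not part of DICS; interactions are assumed to be undirected gene pairs.

```python
from itertools import combinations

def module_density(genes, interactions):
    """Known interactions inside the module divided by the edges of a clique."""
    genes = set(genes)
    n = len(genes)
    if n < 2:
        return 0.0  # a single gene cannot form an interaction
    # Store each interaction as an unordered pair so (a, b) equals (b, a).
    known = {frozenset(pair) for pair in interactions}
    observed = sum(
        1 for a, b in combinations(sorted(genes), 2)
        if frozenset((a, b)) in known
    )
    clique_edges = n * (n - 1) // 2  # interactions in a clique of n genes
    return observed / clique_edges

# Hypothetical toy network (assumption, for illustration only).
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
triangle_density = module_density(["A", "B", "C"], edges)
four_gene_density = module_density(["A", "B", "C", "D"], edges)
```

In this toy network the module A, B, C is fully interconnected, while adding D lowers the density because D interacts only with C.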

Briefly, the algorithm adopts the reverse search paradigm to organize the modules efficiently in a search tree such that their density is monotonically decreasing.

Human protein-protein interaction data are collected from the IntAct (a protein interaction database and analysis system), BIND (Biomolecular Interaction Network Database), MINT (database of functional relationships between proteins, DNA and RNA), and HPRD (Human Protein Reference Database) databases.

Confidence scores are assigned to each interaction by assessing the corresponding set of experimental techniques (Jansen et al., 2003).

DICS is updated on a 3-month basis according to updates in the reference databases.

DICS Collection of modules -- To obtain modules, one needs to determine cut-off values for the 'module density' and for the 'interaction confidence score'.

There are two competing goals: (1) to most accurately recover the protein complexes in the CORUM database (a resource of manually annotated protein complexes from mammalian organisms) (Ruepp et al., 2008) and (2) to extend the coverage of disease-related genes as much as possible.

By default, the manufacturer sets the density threshold to 1.0, i.e. fully interconnected modules, and removes the 30% of the interactions considered to be least reliable.
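The confidence cut-off can be pictured with the following Python sketch. It assumes that each interaction carries a numeric confidence score and that "least reliable" means lowest-scoring; the function name `drop_least_reliable` and the toy data are hypothetical, and the actual DICS filtering may differ.

```python
def drop_least_reliable(scored_interactions, percent=30):
    """Remove the lowest-scoring `percent` of (gene_a, gene_b, score) tuples."""
    ranked = sorted(scored_interactions, key=lambda item: item[2])
    n_drop = len(ranked) * percent // 100  # integer arithmetic avoids rounding surprises
    return ranked[n_drop:]

# Hypothetical interactions with confidence scores 0.1, 0.2, ..., 1.0.
toy = [(f"P{i}", f"P{i + 1}", i / 10) for i in range(1, 11)]
kept = drop_least_reliable(toy)
```

With ten toy interactions the three lowest-scoring ones are removed and seven remain for module enumeration.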

The resulting 9,859 modules cover 598 out of 1,077 protein complexes with an average reliability of 0.58 for complex prediction.

Furthermore, the modules cover 40% of the disease-related genes listed in the HGMD database (Stenson et al., 2003).

Protein complexes cover only 11% of the disease genes.

Optionally, the user can choose three (3) pre-computed module sets with selected parameter combinations.

DICS Enrichment analysis -- DICS can be accessed for identifying 'dense modules' and known 'protein complexes' that are significantly enriched in gene lists provided by high-throughput studies.

The 'significance of association' between a given gene set and each module or complex is estimated by a Monte-Carlo simulation procedure (Antonov et al., 2008).
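A generic Monte-Carlo enrichment estimate can be sketched in Python as follows. This is not the procedure of Antonov et al. (2008), whose details are not given here; it is an assumed permutation scheme that draws random gene sets of the same size from a background and counts how often they overlap the module at least as much as the input list. All names and parameters are hypothetical.

```python
import random

def monte_carlo_p_value(gene_list, module, background, n_trials=2000, seed=0):
    """Empirical p-value for the overlap between a gene list and a module."""
    rng = random.Random(seed)            # fixed seed for reproducible estimates
    background = sorted(set(background))
    module = set(module)
    query = set(gene_list) & set(background)
    observed = len(query & module)
    hits = 0
    for _ in range(n_trials):
        sample = rng.sample(background, len(query))
        if len(module.intersection(sample)) >= observed:
            hits += 1
    # The add-one correction keeps the estimate from being exactly zero.
    return (hits + 1) / (n_trials + 1)

# Hypothetical background of 100 genes and a five-gene module.
background = [f"g{i}" for i in range(100)]
module = background[:5]
p_enriched = monte_carlo_p_value(background[:5], module, background)
p_unrelated = monte_carlo_p_value(background[50:55], module, background)
```

The smallest p-value such a simulation can report is 1/(n_trials + 1), so the number of trials limits the resolution of the estimate.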

Protein complexes and modules can be listed individually, or unions of the significant modules can be provided for all pairs of modules whose overlap score exceeds a specified threshold.

The overlap score is defined as (N*N)/(N1*N2), where N is the number of proteins in the overlap, and N1 and N2 are the numbers of proteins in modules 1 and 2, respectively.
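The overlap score can be computed as in the Python sketch below. It assumes the formula is read as the squared overlap size divided by the product of the two module sizes; the function names, the threshold value and the toy modules are hypothetical.

```python
from itertools import combinations

def overlap_score(module_1, module_2):
    """Squared overlap size divided by the product of the two module sizes."""
    m1, m2 = set(module_1), set(module_2)
    if not m1 or not m2:
        return 0.0
    n = len(m1 & m2)
    return (n * n) / (len(m1) * len(m2))

def unions_of_overlapping_pairs(modules, threshold):
    """Union of every pair of modules whose overlap score exceeds the threshold."""
    return [
        set(a) | set(b)
        for a, b in combinations(modules, 2)
        if overlap_score(a, b) > threshold
    ]

# Hypothetical modules (assumption, for illustration only).
toy_modules = [{"A", "B", "C"}, {"B", "C", "D"}, {"X", "Y"}]
score = overlap_score(toy_modules[0], toy_modules[1])
merged = unions_of_overlapping_pairs(toy_modules, threshold=0.4)
```

The score ranges from 0 for disjoint modules to 1 for identical modules, so a threshold between these values controls how similar two modules must be before their union is reported.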

The web-server (DICS) provides modules that are significantly associated with the disease mutations extracted from the HGMD database and mouse phenotypes from the MPD database (Bogue et al., 2007).

Due to the small number of interactions experimentally determined for mouse proteins, orthologous modules are inferred in the mouse using the groups of orthologous proteins provided by the InParanoid database (O’Brien et al., 2005).

DICS Analysis - Examples --

Disease-related modules- Human functional modules that are associated with a disease. Genes reported to cause a disease when mutated are compiled according to the Human Gene Mutation Database (HGMD).

Proteomics studies- Human functional modules obtained for 171 studies from the Proteomics journal. The results tables were extracted automatically and analyzed for enriched modules.

Phenotype-related modules- Mouse functional modules that are orthologous to human modules and associated with a phenotype. Genes associated with a phenotype on knock-out are compiled according to the Mouse Phenome Database (MPD).

The 'significance of association' is estimated by a hypergeometric test. The results are corrected for multiple hypotheses testing using a false discovery rate of 0.05.
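A hypergeometric enrichment test followed by a multiple-testing correction can be sketched in Python as follows. The text does not name the correction method, so the Benjamini-Hochberg step-up procedure used here is an assumption, as are the function names and the example numbers.

```python
from math import comb

def hypergeom_p_value(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the module."""
    upper = min(K, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

def benjamini_hochberg(p_values, fdr=0.05):
    """Flag p-values that pass the Benjamini-Hochberg step-up rule at `fdr`."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            cutoff_rank = rank  # largest rank that meets its own threshold
    significant = [False] * m
    for idx in order[:cutoff_rank]:
        significant[idx] = True
    return significant

# Hypothetical example: 2 of 2 disease genes fall in a 2-gene module out of 4 genes.
p_small = hypergeom_p_value(2, 2, 2, 4)
flags = benjamini_hochberg([0.01, 0.02, 0.03, 0.5])
```

Each module yields one p-value from the hypergeometric test, and the correction is then applied across all modules tested for the same disease or phenotype gene set.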

Legends - Info -- The interactions are visualized via an interactive applet (Hooper and Bork, 2005, Bioinformatics, 21:4432-33).

Input genes are indicated by dark blue, further genes in the module by light blue.

System Requirements

Web-based.

Manufacturer

The Institute of Bioinformatics and Systems Biology (IBI) is part of the German Research Center for Environmental Health (HZM) and hosts the Munich Information Center for Protein Sequences (MIPS), Neuherberg, Germany.

Manufacturer Web Site The DICS repository

Price Contact manufacturer.

G6G Abstract Number 20336

G6G Manufacturer Number 101822