ProdoNet

Category Cross-Omics>Pathway Analysis/Gene Regulatory Networks/Tools

Abstract ProdoNet is a web-based application for the mapping of prokaryotic genes and the corresponding proteins to common ‘gene regulatory’ and metabolic networks.

For a given list of genes, the system detects shared operons, identifies co-expressed genes and deduces joint regulators. In addition, the contribution to shared ‘metabolic pathways’ becomes visible on KEGG maps.

Furthermore, the co-occurrence of genes of interest in ‘gene expression’ profiles can be added to the visualization of the global network. In this way, ProdoNet provides the basis for functional genomics approaches and for the interpretation of transcriptomics and proteomics data.

The ProdoNet dataset on transcriptional regulation is based on the PRODORIC Prokaryotic Database of Gene Regulation and the Virtual Footprint tool (see below...).

ProdoNet visualizes the functional relations within a set of prokaryotic genes or proteins with regard to the joined gene regulatory network. ProdoNet uses data derived from the PRODORIC database (as stated above…) and displays the hierarchical structure of the underlying network of genes, operons and regulators.

Moreover, information is provided on the co-occurrence of analyzed genes and proteins in ‘gene expression’ profiles and metabolic pathways, respectively.

To further support the functional exploration of the obtained results, hyperlinks to UniProtKB, PRODORIC and the KEGG pathway maps are provided.

To complement the PRODORIC dataset, which consists exclusively of carefully curated data derived from reliable publications, predictions on operons and regulons are added. For transparency, the ProdoNet visualization allows a clear distinction between experimentally proven and predicted data.

The current version of ProdoNet comprises data from the well-characterized model organisms E. coli, B. subtilis and Pseudomonas aeruginosa.

Virtual Footprint tool --

Virtual Footprint is a software suite for recognizing single or composite DNA patterns. It was especially designed to analyze transcription factor (TF) binding sites in whole bacterial genomes and their underlying regulatory networks.

A pattern can consist of up to two (2) sub-patterns separated by a variable spacer region. Virtual Footprint can deal with three (3) types of sub-patterns: Position Weight Matrices, IUPAC Code and Regular Expressions.
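The idea of a composite pattern can be illustrated with a minimal Python sketch. It is not Virtual Footprint's code: it assumes both sub-patterns are given in IUPAC code, translates them to regular expressions, and searches the forward strand only. The function names `iupac_to_regex` and `find_composite` are hypothetical, and Position Weight Matrix scoring is not covered.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(pattern):
    """Translate an IUPAC sub-pattern into an equivalent regex fragment."""
    return "".join(IUPAC[base] for base in pattern.upper())


def find_composite(sequence, left, right, min_spacer, max_spacer):
    """Return (start, matched_text) for each composite hit.

    Assumption: forward strand only; the spacer may be any unambiguous base.
    A lookahead is used so that overlapping hits are also reported.
    """
    regex = re.compile(
        "(?=(%s[ACGT]{%d,%d}?%s))"
        % (iupac_to_regex(left), min_spacer, max_spacer, iupac_to_regex(right))
    )
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]
```

For example, `find_composite(seq, "TTGACA", "TATAAT", 15, 19)` reports every position in `seq` where the two hexamers occur 15 to 19 bases apart.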

ProdoNet's Input data and query options --

ProdoNet identifies and displays the gene regulatory network and metabolic pathways for a user-defined list of genes or proteins. The input list is accepted in most common formats, including comma separated or tab delimited lists.

The use of gene or protein symbols within the input list is highly flexible, which means that ProdoNet accepts short names, locus tags and accession numbers from UniProtKB, GenBank, RefSeq and other databases.

In a first step, ProdoNet matches the input data with its own gene dataset and returns a ‘table’ of recognized genes, comprising their full names. Where one query name matches more than one gene from the database, the corresponding matches are flagged as ambiguous.
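The matching step described above could look roughly like the following Python sketch. The gene table, its identifiers and the function names are hypothetical placeholders rather than ProdoNet data or code, and case-insensitive matching is an assumption.

```python
import re

# Hypothetical miniature gene table: (locus_tag, short_name, full_name).
GENES = [
    ("LT0001", "abcA", "example protein A"),
    ("LT0002", "abcB", "example protein B"),
    ("XY0107", "abcA", "example protein A homolog"),
]


def parse_gene_list(text):
    """Split a comma-, tab-, semicolon- or whitespace-separated list."""
    return [tok for tok in re.split(r"[,;\s]+", text.strip()) if tok]


def build_index(genes):
    """Map every known identifier (lower-cased) to the genes it names."""
    index = {}
    for locus, name, full_name in genes:
        for key in (locus, name):
            index.setdefault(key.lower(), []).append((locus, full_name))
    return index


def match_genes(text, genes):
    """Return one (query, status, hits) row per input identifier."""
    index = build_index(genes)
    rows = []
    for query in parse_gene_list(text):
        hits = index.get(query.lower(), [])
        if not hits:
            status = "unrecognized"
        elif len(hits) > 1:
            status = "ambiguous"  # one name, several genes
        else:
            status = "unique"
        rows.append((query, status, hits))
    return rows
```

Calling `match_genes("abcA, LT0002\tzzz9", GENES)` yields one row per identifier, so the user can see which entries need to be re-selected before the analysis starts.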

In the next step, the user can re-select the genes to query and choose the type of analysis to perform. The default selection (network of operon and regulon settings) results in the visualization of the operons and regulons that map the selected genes.

Optionally, displayed results can include the expression profiles of genes found in DNA array experiments and predicted transcriptional regulations. The query can be limited to the search for matches in the ‘gene expression’ profile dataset, which generates a table of experimental conditions where candidate genes found in the user list were co-regulated.

Alternatively, the user can limit the request to the occurrence of the corresponding proteins in metabolic pathways. All tables can be downloaded as tab-delimited files, allowing for a convenient export and local storage of the requested data.
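As an illustration of the expression-profile query and the tab-delimited export, the following Python sketch lists the conditions under which at least two genes from the user list occur together. The data layout, the `min_genes` threshold and the function names are assumptions made for this example, not ProdoNet's actual schema.

```python
import csv
import io


def coregulated_conditions(user_genes, profiles, min_genes=2):
    """List conditions in which at least `min_genes` user genes co-occur.

    profiles: {condition_name: iterable of regulated gene names}
    """
    wanted = set(user_genes)
    rows = []
    for condition, regulated in profiles.items():
        shared = sorted(wanted & set(regulated))
        if len(shared) >= min_genes:
            rows.append((condition, shared))
    # Conditions sharing the most genes first, then alphabetical.
    rows.sort(key=lambda row: (-len(row[1]), row[0]))
    return rows


def to_tsv(rows):
    """Serialize the result table as tab-delimited text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["condition", "genes"])
    for condition, genes in rows:
        writer.writerow([condition, ",".join(genes)])
    return buffer.getvalue()
```

The text returned by `to_tsv` can be written to a file and opened in any spreadsheet program.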

ProdoNet's Network of Operons and Regulons --

In the default mode, the user-defined set of genes is processed and integrated into a directed graph that represents the corresponding gene regulatory network, using the ‘Prefuse toolkit’ for interactive information visualization.

Hereby, the nodes of the network are regulators, operons or genes while the edges symbolize transcription factor-operon interactions, genes belonging to one operon or the co-occurrence of genes in expression profiles.

The result is temporarily stored as GraphML and GML formatted files, which can be downloaded by the user and re-used in other network analysis applications, e.g. in Cytoscape (see G6G Abstract Number 20092).
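To give an idea of what such a GraphML file contains, the following Python sketch serializes a small directed network using only the standard library. The `kind` attribute and the function name `to_graphml` are assumptions for illustration; the attributes ProdoNet actually writes may differ.

```python
import xml.etree.ElementTree as ET

GRAPHML_NS = "http://graphml.graphdrawing.org/xmlns"


def to_graphml(nodes, edges):
    """Serialize a directed network as a GraphML string.

    nodes: {node_id: kind}, e.g. kind in {"regulator", "operon", "gene"}
    edges: [(source_id, target_id, kind)]
    """
    root = ET.Element("graphml", {"xmlns": GRAPHML_NS})
    # Declare one string attribute, "kind", usable on nodes and edges.
    ET.SubElement(root, "key", {
        "id": "kind", "for": "all",
        "attr.name": "kind", "attr.type": "string",
    })
    graph = ET.SubElement(root, "graph", {"id": "G", "edgedefault": "directed"})
    for node_id, kind in nodes.items():
        node = ET.SubElement(graph, "node", {"id": node_id})
        ET.SubElement(node, "data", {"key": "kind"}).text = kind
    for source, target, kind in edges:
        edge = ET.SubElement(graph, "edge", {"source": source, "target": target})
        ET.SubElement(edge, "data", {"key": "kind"}).text = kind
    return ET.tostring(root, encoding="unicode")
```

A file written from this string should load in tools that read GraphML, with regulators, operons and genes distinguishable by the `kind` attribute.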

For the functional exploration of a particular gene of the network, a gene context menu provides the full name of the corresponding protein and links to the PRODORIC and UniProtKB databases. In the case of a gene coding for an Enzyme, the menu offers the ‘metabolic pathways’ option, leading to a table that provides links to the KEGG pathway maps for this Enzyme.

Similarly, expression profile connections are featured with a clickable dot that displays the name of the experiment, the link to its entry in PRODORIC and a ‘select’ option to hide all other expression profiles shown in the graph.

By default, the outlined network view presents only the part of the complete ‘gene regulatory network’ that was queried by the selected input genes. For deeper insights into the network, the view can be interactively expanded by choosing a higher expansion level.

In this case, all transcription factor nodes are extended both by upstream regulators and further downstream target operons. At a higher expansion level, regulatory cascades or circuits are fully shown and thus provide a broader view on the overall network involved.
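The effect of the expansion level can be sketched in Python as a level-by-level growth of the visible node set. This is an illustrative assumption about the behavior, not ProdoNet's implementation; the function name `expand` and the edge-list layout are hypothetical.

```python
def expand(seed_nodes, regulates, levels):
    """Grow the visible node set by `levels` expansion steps.

    regulates: list of (regulator, target) edges of the full network.
    Each step adds the upstream regulators and the downstream targets
    of every node that is currently visible.
    """
    visible = set(seed_nodes)
    for _ in range(levels):
        added = set()
        for regulator, target in regulates:
            if target in visible:
                added.add(regulator)  # upstream regulator
            if regulator in visible:
                added.add(target)     # downstream target
        if added <= visible:
            break  # nothing new: the whole connected part is shown
        visible |= added
    return visible
```

With a cascade such as R3 -> R1 -> R2 -> operon, level 1 reveals only the direct regulator of the queried operon, while higher levels bring in R1 and R3 step by step.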

Furthermore, the network can be expanded by requesting additional genes with the ‘add genes’ field. Convenient and interactive navigation within the network is ensured by options to search for gene names, zoom in and out, ‘drag and drop’ genes and operons, move the graph within the screen and re-establish the original view.

In addition, nodes and their corresponding edges can be hidden by a right-click of the mouse on the node. The reset visibility button makes the hidden nodes reappear.

ProdoNet's Data sources and System structure --

For the outlined purposes, the ProdoNet web application utilizes several different data sources. The main source is the PRODORIC database, which provides manually curated information on transcription factor binding sites, operon annotation and regulatory interactions between transcription factors and their corresponding binding sites.

Further, PRODORIC supplies processed ‘gene expression’ profiles that were thoroughly evaluated from the literature. In addition to this experimental evidence, predictions for operons and regulons were introduced into the ProdoNet dataset.

Operon predictions were included based on the distance between various genes. Regulon predictions were performed with the ‘Virtual Footprint’ algorithm (see above...) using stringent specificity parameters to decrease the false prediction rate.
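A distance-based operon prediction can be sketched in Python as follows. The source does not state the exact rule or cutoff, so this example assumes that neighboring genes on the same strand are grouped when the gap between them does not exceed `max_gap`; the 50 bp default and the function name `predict_operons` are hypothetical.

```python
def predict_operons(genes, max_gap=50):
    """Group neighboring same-strand genes into putative operons.

    genes: list of (name, start, end, strand) tuples in any order.
    max_gap: hypothetical cutoff in base pairs between one gene's end
             and the next gene's start.
    """
    operons = []
    previous = None
    for gene in sorted(genes, key=lambda g: g[1]):
        name, start, end, strand = gene
        same_unit = (
            previous is not None
            and strand == previous[3]
            and start - previous[2] <= max_gap
        )
        if same_unit:
            operons[-1].append(name)
        else:
            operons.append([name])
        previous = gene
    return operons
```

A stricter cutoff produces fewer and shorter predicted operons, which trades sensitivity for a lower false prediction rate.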

The general gene annotation for the involved organisms was extracted from the ‘Genome Review’ database and the ‘Pseudomonas Genome’ Database (this database has been replaced by OGeR, see below...). For each gene, this includes the name, the unique locus tag and identifiers in other databases, e.g. the UniProtKB accession number.

Open Genome Resource (OGeR) - OGeR is a newly developed open source database and tool platform for the web-based storage, distribution, visualization and comparison of prokaryotic genome data.

Furthermore, identifiers from GenBank, RefSeq and the NCBI GeneID were gathered from the NCBI Microbial Genomes. Although these identifiers are not directly presented on the ProdoNet website, they are used to match the user's input to the ProdoNet gene dataset. Enzyme and pathway information was extracted from KEGG, BioCyc (see G6G Abstract Number 20230) and ENZYME.

The extracted data were imported into an integrated database, which provides the basis for all offered analyses. This ProdoNet database is freely available for download on the manufacturer's website. The data import is fully automated and regular releases ensure an up-to-date dataset.

System Requirements

Contact manufacturer.

Manufacturer

Manufacturer Web Site ProdoNet

Price Contact manufacturer.

G6G Abstract Number 20590

G6G Manufacturer Number 104193