G6G Directory of Omics and Intelligent Software

Pathway Interaction Database

Category Cross-Omics>Pathway Knowledge Bases/Databases/Tools

Abstract The Pathway Interaction Database (PID) is a highly structured, curated collection of information about known ‘biomolecular interactions’ and key cellular processes assembled into 'signaling pathways'.

It is a collaborative project between the US National Cancer Institute (NCI) and Nature Publishing Group (NPG), and is an open access online resource.

PID is aimed at the cancer research community and others interested in cellular pathways, such as neuroscientists, developmental biologists, and immunologists.

The database focuses on the biomolecular interactions that are known or believed to take place in human cells.

It can be browsed as an online encyclopedia, used to run computational analyses, or employed in ways that combine these two approaches. In addition to PID's pre-defined pathways, search results are displayed as dynamically constructed interaction networks.

These features of PID render it a useful tool for both biologists and bioinformaticians.

The database is supplemented by a concise editorial section that includes specially written synopses of recent important research articles in areas related to cancer research, and specially commissioned ‘Bioinformatics Primers’ that provide practical advice on how to make the most of other relevant online resources.

The database and editorial content are updated monthly, and users can opt to receive a monthly email alert to stay informed about new content.

Database Content -- The PID contains information about molecular interactions and biological processes in biomolecular pathways.

All interactions are assembled into pathways, and can be accessed by performing searches for biomolecules, or processes, or by viewing predefined pathways.

The 'Browse Pathways' page lists the predefined pathways and provides a good overview of the database content.

Source of Data in the PID -- There are three (3) sources: NCI-Nature curated data, BioCarta data and Reactome data. The NCI-Nature curated data are created by Nature Publishing Group editors and reviewed by experts in the field.

Biomolecules are annotated with UniProt protein identifiers and relevant post-translational modifications. Interactions are annotated with evidence codes and references.

In contrast, BioCarta data from June 2004 was imported without expert review, and biomolecules are annotated with 'Entrez Gene' identifiers without associated post-translational modifications, evidence codes or references.

Data from Reactome is updated when new content is made available. Reactome data is annotated with UniProt identifiers, post-translational modifications and references.

Data Representation in the Database -- Data is represented in a highly structured and granular data model.

Network maps - The graphics are interactive in both Graphics Interchange Format (GIF) and Scalable Vector Graphics (SVG) formats, allowing users to click on items in the predefined pathways or dynamically generated 'network maps' for further information.

Network maps are displayed in the GIF format by default. The networks can be large and complex, so the manufacturers recommend viewing them in SVG format, which can be easily zoomed and panned.

Searching the PID -- The simplest way to search the database is to browse the list of pre-defined pathways.

In addition, a 'search box' on the PID homepage allows users to query multiple object types within the database.

To 'query biomolecules', UniProt protein accession numbers, HUGO gene symbols, Entrez Gene identifiers, aliases listed in Entrez Gene, CAS numbers and compound names may be used.

To 'query biological processes', Gene Ontology (GO) identifiers (entered as GO:xxxxxxx), GO biological process terms, NCI thesaurus terms, and NCI thesaurus identifiers may be used. A user may enter any combination of the above-mentioned terms and identifiers.
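The identifier types listed above have recognizable shapes, so a mixed list of query terms can be sorted by type before it is entered. The following is a minimal sketch, not PID's own parser: the function name `classify_query_term` and the category labels are hypothetical, the UniProt pattern follows the accession format published by UniProt, and the CAS and Entrez Gene patterns are simple shape checks that neither validate check digits nor confirm that an identifier exists.

```python
import re

# Shape checks only; none of these confirm that the identifier is real.
GO_ID = re.compile(r"GO:\d{7}")  # entered as GO:xxxxxxx
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
CAS_NUMBER = re.compile(r"\d{2,7}-\d{2}-\d")
ENTREZ_GENE_ID = re.compile(r"\d+")


def classify_query_term(term: str) -> str:
    """Return a hypothetical category label for one query term."""
    text = term.strip()
    if GO_ID.fullmatch(text):
        return "go_id"
    if UNIPROT_ACCESSION.fullmatch(text):
        return "uniprot_accession"
    if CAS_NUMBER.fullmatch(text):
        return "cas_number"
    if ENTREZ_GENE_ID.fullmatch(text):
        return "entrez_gene_id"
    # Gene symbols, aliases, compound names, GO terms and NCI thesaurus
    # terms cannot be told apart by shape alone, so they share one label.
    return "name_or_term"


if __name__ == "__main__":
    for term in ["GO:0006915", "P04637", "7157", "50-00-0", "TP53"]:
        print(term, classify_query_term(term))
```

Free-text entries all fall through to a single label because distinguishing a gene symbol from a process term would need a lookup against the underlying vocabularies, which this sketch does not attempt.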

An 'Advanced Search' can be performed using biomolecule names or identifiers, pathway names, GO biological process terms or identifiers, NCI thesaurus terms or identifiers or any combination of these, with an option to limit the search by evidence-type tags, called evidence codes, and an option to include upstream and/or downstream interactions.

A 'Connected molecules' search finds one path connecting two or more query biomolecules.
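The source does not describe how the 'Connected molecules' search is implemented. As an illustration of the general idea of finding one path between two biomolecules, the sketch below runs a breadth-first search over an undirected toy network. The function name `find_one_path`, the molecule labels A to F and the edges between them are all hypothetical and do not come from PID.

```python
from collections import deque


def find_one_path(interactions, start, goal):
    """Return one shortest path between two nodes, or None if unconnected.

    `interactions` is an iterable of (molecule, molecule) pairs that are
    treated as undirected edges. This is an assumed simplification.
    """
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    if start not in neighbours or goal not in neighbours:
        return None

    previous = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for nxt in sorted(neighbours[current]):  # sorted for repeatable output
            if nxt not in previous:
                previous[nxt] = current
                queue.append(nxt)
    return None


# Hypothetical toy network: A-B-C-D with a shortcut B-D, plus a separate E-F.
TOY_INTERACTIONS = [("A", "B"), ("B", "C"), ("C", "D"), ("B", "D"), ("E", "F")]

if __name__ == "__main__":
    print(find_one_path(TOY_INTERACTIONS, "A", "D"))
    print(find_one_path(TOY_INTERACTIONS, "A", "F"))
```

Breadth-first search is used here because it returns a path with the fewest interactions; whether PID prefers shortest paths is not stated in the source.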

A 'Batch query' allows users to upload potentially long lists of biomolecules, entered in a single column, and view their relationships in pathways or interaction network maps.

Two (2) lists can be uploaded simultaneously, with the biomolecules colored accordingly in the search results.
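Preparing the two single-column lists can be sketched as follows. This is an assumed local pre-processing step, not part of PID: the helper names `parse_single_column` and `label_by_list` and the labels "list_1", "list_2" and "both" are hypothetical, and the source does not say how PID colors a biomolecule that appears in both lists.

```python
def parse_single_column(text):
    """Read one entry per line, dropping blank lines and repeated entries."""
    seen = set()
    entries = []
    for line in text.splitlines():
        entry = line.strip()
        if entry and entry not in seen:
            seen.add(entry)
            entries.append(entry)
    return entries


def label_by_list(list_one, list_two):
    """Map each entry to the list it came from (hypothetical labels)."""
    first, second = set(list_one), set(list_two)
    labels = {}
    for entry in list(list_one) + list(list_two):
        if entry in first and entry in second:
            labels[entry] = "both"
        elif entry in first:
            labels[entry] = "list_1"
        else:
            labels[entry] = "list_2"
    return labels


if __name__ == "__main__":
    # Placeholder gene symbols, used only to show the input shape.
    up = parse_single_column("TP53\n\nMDM2\nTP53\n")
    down = parse_single_column("MDM2\nATM\n")
    print(label_by_list(up, down))
```

Checking for overlap before uploading makes it easier to interpret the coloring in the results, since an entry present in both lists would otherwise be ambiguous.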

Computational Analyses that can be performed -- On the 'Batch query' page, you can upload long lists of UniProt and/or Entrez Gene names or identifiers derived from, for example, high throughput expression data, mass spectrometry data or any other high throughput data.

You may then select to obtain 'interaction network maps' for all biomolecules in your list or overlay your biomolecules onto predefined pathway(s). Biomolecules from the list(s) will be color-coded within the pathway or network map.

Available Data formats -- Pathways are available in graphical GIF and SVG formats, as well as text-based PID Extensible Markup Language (XML) and Biological Pathways Exchange (BioPAX) Level 2 pathway data exchange formats. The SVG format requires an SVG-aware web browser.
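A downloaded BioPAX Level 2 file is RDF/XML, so it can be inspected with a standard XML parser. The sketch below counts the top-level BioPAX elements in a small hand-written document. The sample is not an actual PID export, the function name `count_biopax_classes` is hypothetical, and the namespace URI and element names (`pathway`, `protein`, `NAME`) are assumed from the BioPAX Level 2 vocabulary; a real file should be checked against the specification.

```python
import xml.etree.ElementTree as ET
from collections import Counter

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BIOPAX2_NS = "http://www.biopax.org/release/biopax-level2.owl#"  # assumed

# Hand-written illustration only; not taken from PID.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:bp="{BIOPAX2_NS}">
  <bp:pathway rdf:ID="pathway_1"><bp:NAME>example pathway</bp:NAME></bp:pathway>
  <bp:protein rdf:ID="protein_1"><bp:NAME>example protein A</bp:NAME></bp:protein>
  <bp:protein rdf:ID="protein_2"><bp:NAME>example protein B</bp:NAME></bp:protein>
</rdf:RDF>"""


def count_biopax_classes(xml_text):
    """Count direct children of the root that are in the BioPAX namespace."""
    root = ET.fromstring(xml_text)
    prefix = "{" + BIOPAX2_NS + "}"
    counts = Counter()
    for child in root:
        if child.tag.startswith(prefix):
            counts[child.tag[len(prefix):]] += 1
    return dict(counts)


if __name__ == "__main__":
    print(count_biopax_classes(SAMPLE))
```

For anything beyond a quick inventory, a dedicated RDF or BioPAX library would be a better fit than plain element counting, because BioPAX objects reference each other through RDF identifiers.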

System Requirements

Contact manufacturer.

Manufacturer

US National Cancer Institute (NCI) and
Nature Publishing Group (NPG)

Manufacturer Web Site Pathway Interaction Database

Price Contact manufacturer.

G6G Abstract Number 20245

G6G Manufacturer Number 101845
