PathSys

Category Cross-Omics>Pathway Knowledge Bases/Databases/Tools

Abstract PathSys is a graph-based system for creating a combined database of biological pathways, gene regulatory networks and protein interaction maps. PathSys is also a general-purpose, scalable warehouse of biological information, complete with a graph manipulation and query language, a storage mechanism and a generic data-importing mechanism based on schema mapping. The PathSys integrated database consists of over 20 curated and publicly contributed data sources for the eight (8) representative organisms (see list below), as well as Gene Ontology (GO) information, which is structured as a directed acyclic graph.

The organisms are:

1) Budding Yeast (Saccharomyces cerevisiae);

2) Schizosaccharomyces pombe;

3) Fly (Drosophila melanogaster);

4) Caenorhabditis elegans;

5) Arabidopsis thaliana;

6) Mouse (Mus musculus);

7) Human (Homo sapiens);

8) Zebrafish (Danio rerio).

PathSys System Architecture -

The PathSys system consists of six (6) major parts: Client Side Application, Graph Query Engines, Data Importer, Integration Client/Manager, Schema Mapping Tools and Data Warehouse.

The Client Side Application implements all business logic and a significant part of the user interface. Two (2) novel Graph Query Engines store and query the molecular interaction network and directed acyclic graphs such as ontologies and taxonomies, using specialized algorithms customized for each kind of graph. The Data Importer can accept data from an external data source, validate it against the schema, and then store the new data in the data warehouse. The Integration Client/Manager is used to specify a new database that needs to be integrated into the system. The end-user provides the system with the schema of a new source, and the schema is validated and stored in the Schema Library.
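The validate-then-store step of the Data Importer can be illustrated with a minimal Python sketch. This is not the PathSys implementation: the schema format (field name mapped to a required type), the names NODE_SCHEMA, validate and import_records, and the use of a plain list as the warehouse are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Any

# Assumed schema format: field name -> required Python type (hypothetical).
NODE_SCHEMA: dict[str, type] = {"id": str, "name": str, "node_type": str}


def validate(record: dict[str, Any], schema: dict[str, type]) -> list[str]:
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for field, expected in schema.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"wrong type for {field}")
    return problems


def import_records(records, schema, warehouse: list):
    """Append valid records to the warehouse; return rejected ones with reasons."""
    rejected = []
    for record in records:
        problems = validate(record, schema)
        if problems:
            rejected.append((record, problems))
        else:
            warehouse.append(record)
    return rejected


# Usage: one well-formed record and one that is missing required fields.
warehouse: list = []
rejected = import_records(
    [{"id": "P1", "name": "A", "node_type": "Protein"}, {"id": "P2"}],
    NODE_SCHEMA,
    warehouse,
)
```

Returning the rejected records together with their reasons, rather than raising on the first failure, lets a bulk import continue past bad rows; whether PathSys behaves this way is not stated in the source.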

Database Schema -

The PathSys database was designed to represent generic network data. The current implementation defines three (3) classes of vertices: primary nodes (primary objects), connector nodes (events of interaction or regulation between primary objects) and graph nodes (complex objects, such as protein complexes and cell processes, that might contain graphs). Connector nodes are identified by mechanism and effect type. Vertices are stored in the table Nodes. Nodes themselves can be of several types (Proteins, Small Molecules, Cell Processes, Expression Controls, Binding, Protein Modification etc.), which are recorded in the table NodeType.

The current implementation also supports three (3) classes of links between vertices: directed and non-directed (as defined by the field Direction in the table Edges) and membership (as defined by the field Relation in the table Edges). Directed links can describe biological notions of regulation, such as “protein A activates protein B”; non-directed links are used to describe binding events, such as “protein A binds to protein B”; membership links describe situations such as “protein complex P contains protein A”. The database structure does not limit the number of different classes of nodes and edges. Also, there can be any number of node types and attribute types.
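The table layout described above can be sketched with Python's built-in sqlite3 module. Only the table names Nodes, NodeType and Edges and the fields Direction and Relation come from the description; every other column, the value encodings, the "Protein Complex" type name and the helper members_of are assumptions for illustration, and connector nodes are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE NodeType (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE Nodes (
    id INTEGER PRIMARY KEY,
    name TEXT,
    node_class TEXT,    -- assumed column: 'primary', 'connector' or 'graph'
    type_id INTEGER REFERENCES NodeType(id)
);
CREATE TABLE Edges (
    id INTEGER PRIMARY KEY,
    source INTEGER REFERENCES Nodes(id),
    target INTEGER REFERENCES Nodes(id),
    Direction INTEGER,  -- assumed encoding: 1 = directed, 0 = non-directed
    Relation TEXT       -- assumed encoding: 'membership' or NULL
);
""")

conn.executemany(
    "INSERT INTO NodeType (id, name) VALUES (?, ?)",
    [(1, "Protein"), (2, "Protein Complex")],
)
conn.executemany(
    "INSERT INTO Nodes (id, name, node_class, type_id) VALUES (?, ?, ?, ?)",
    [(1, "A", "primary", 1), (2, "B", "primary", 1), (3, "P", "graph", 2)],
)
conn.executemany(
    "INSERT INTO Edges (source, target, Direction, Relation) VALUES (?, ?, ?, ?)",
    [
        (1, 2, 1, None),          # directed: protein A activates protein B
        (1, 2, 0, None),          # non-directed: protein A binds to protein B
        (3, 1, 1, "membership"),  # membership: complex P contains protein A
    ],
)


def members_of(conn: sqlite3.Connection, complex_name: str) -> list:
    """Return the names of nodes linked to a complex by membership edges."""
    rows = conn.execute(
        """SELECT member.name FROM Edges e
           JOIN Nodes parent ON parent.id = e.source
           JOIN Nodes member ON member.id = e.target
           WHERE parent.name = ? AND e.Relation = 'membership'
           ORDER BY member.name""",
        (complex_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling members_of(conn, "P") on this sample data returns the single member "A". Because node types live in their own table, a new type is one more row in NodeType rather than a schema change, which matches the statement that the number of node types is not limited.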

Data Integration -

While the logical model of information in PathSys is graph based, in reality a variety of information sources provide different components of the integrated graph. PathSys uses a very generic internal model to accommodate different kinds of sources, such that a 'new source', providing a new set of nodes, edges, or node/edge properties, can be dynamically incorporated into an existing integrated database.

Source Definition -

Currently, the external data source(s) can be:

1) a relational database schema;

2) a tree-structured Extensible Markup Language (XML) document;

3) a Resource Description Framework (RDF)-styled triplet that describes an edge set of a graph;

4) a Directed Acyclic Graph (DAG) structured Web Ontology Language (OWL) document.
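To show how an RDF-styled triplet source (item 3 above) describes the edge set of a graph, here is a minimal Python sketch. The (subject, predicate, object) tuple format, the sample predicates and the function name triples_to_graph are assumptions for illustration, not the PathSys import format.

```python
from collections import defaultdict


def triples_to_graph(triples):
    """Group (subject, predicate, object) triples into an adjacency mapping.

    Each subject maps to the list of (predicate, object) pairs it points to,
    so every triple becomes one labeled edge of the graph.
    """
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return dict(graph)


# Hypothetical sample source: three labeled edges.
sample_triples = [
    ("protein A", "activates", "protein B"),
    ("protein A", "binds", "protein C"),
    ("complex P", "contains", "protein A"),
]
graph = triples_to_graph(sample_triples)
```

Here graph["protein A"] holds the two outgoing edges of protein A, each labeled with its predicate.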

Data Visualization -

PathSys data visualization is done through the Client Side Application, BiologicalNetworks (see G6G Abstract Number 20067U), which implements all business logic and a significant part of the user interface.

System Requirements

BiologicalNetworks pathways analysis software supports Windows 2000 and XP, Linux and Macintosh operating systems.

Manufacturer

Manufacturer Web Site PathSys

Price Contact manufacturer.

G6G Abstract Number 20068

G6G Manufacturer Number 101833