Biologic Interactions and Network Analysis (BIANA)

Category Cross-Omics>Pathway Analysis/Gene Regulatory Networks/Tools and Cross-Omics>Knowledge Bases/Databases/Tools

Abstract BIANA (Biologic Interactions and Network Analysis) is a biological database integration and network management framework written in Python.

BIANA is designed to achieve two (2) major goals:

1) The integration of multiple sources of biological information, including biological entities and their relationships; and

2) The management of biological information as a network where entities are nodes and relationships are edges.

Moreover, BIANA uses properties of proteins and genes to infer latent biomolecular relationships by transferring edges to entities sharing similar properties.

BIANA is also provided as a plug-in for Cytoscape (see G6G Abstract Number 20092), which allows users to visualize and interactively manage the data. A web interface to BIANA providing basic functionality is also available.

How BIANA works --

BIANA uses a high level abstraction schema to define databases providing any kind of biological information (both individual entries and their relationships).

Any data source containing biologic or chemical data that BIANA parses is defined as an external database. Similarly, BIANA's integration approach adopts the concept of an external entity, which corresponds to an entry in an external database.

For example, a UniProt entry (a protein), a GenBank entry (a gene), an IntAct interaction (an interaction), a KEGG pathway (a metabolic relation) or a PFAM alignment are all represented as external entities.

To achieve data uniformity, when a data repository supplies relations, both the participants and the relation itself are considered external entities. The relation is additionally annotated as an external entity relation, which is a subtype of external entity.

External entity objects are characterized by several attributes, such as database identifiers, sequence, taxonomy, description or function. Each external entity relation object is further characterized by attributes such as detection method and reliability.

In addition, the participants in an external entity relation can have their own attributes, such as role and cardinality.
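The data model described above can be sketched in a few lines of Python. This is a minimal illustration only: the class names, field names and identifier values below are hypothetical and are not BIANA's actual API or database schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExternalEntity:
    """One entry from an external database (hypothetical class, not BIANA's API)."""
    source_database: str
    entity_type: str  # e.g. "protein", "gene", "interaction"
    attributes: Dict[str, List[str]] = field(default_factory=dict)

    def add_attribute(self, name: str, value: str) -> None:
        # An attribute (identifier, taxonomy, ...) may hold several values.
        self.attributes.setdefault(name, []).append(value)


@dataclass
class Participant:
    """A participant in a relation, with attributes specific to that relation."""
    entity: ExternalEntity
    attributes: Dict[str, str] = field(default_factory=dict)  # e.g. role, cardinality


@dataclass
class ExternalEntityRelation(ExternalEntity):
    """A relation is itself an external entity that also lists its participants."""
    participants: List[Participant] = field(default_factory=list)


# Illustrative values only; the accession codes are placeholders.
protein_a = ExternalEntity("uniprot", "protein")
protein_a.add_attribute("accession", "ACC0001")
protein_b = ExternalEntity("uniprot", "protein")
protein_b.add_attribute("accession", "ACC0002")

interaction = ExternalEntityRelation("intact", "interaction")
interaction.add_attribute("detection_method", "two hybrid")
interaction.participants.append(Participant(protein_a, {"role": "bait", "cardinality": "1"}))
interaction.participants.append(Participant(protein_b, {"role": "prey", "cardinality": "1"}))
```

Modelling the relation as a subclass of the entity means that relations and their participants can be stored and annotated through the same attribute mechanism.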

BIANA unifies the external data that its parsers insert into its database according to a specific protocol. This protocol, called the unification protocol, consists of a set of rules that determine how data in various data sources are combined (crossed).

Each rule is composed of the attributes that have to be crossed and the external databases to which the rule applies. The set of external entities that are deemed "equivalent" with respect to a given unification protocol is called a user entity.

User entities inherit all the attributes of the external entities they include. Thus, BIANA works with user entities that are specific to the unification protocol chosen by the user.

The user can either use provided built-in unification protocols or create his/her own unification protocols.

As an example, a user may be interested in creating a unification protocol defined by crossing similar sequences and the same taxonomy between two (2) or more databases and crossing entities with a UniProt accession code.
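A unification protocol of this kind can be sketched as follows. This is not BIANA's implementation: the function name, the dictionary layout and the database name "other_db" are assumptions made for illustration, each attribute is single-valued, and exact sequence equality stands in for sequence similarity. Equivalent entities are merged with a small union-find structure.

```python
from collections import defaultdict


def unify(entities, rules):
    """Group external entities into user entities (hypothetical helper).

    entities: list of dicts with keys "id", "database", "attributes".
    rules: list of (attribute_names, databases). Two entities are equivalent
    under a rule if both come from the listed databases and have equal values
    for every listed attribute.
    """
    parent = {e["id"]: e["id"] for e in entities}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for attribute_names, databases in rules:
        buckets = defaultdict(list)
        for e in entities:
            attrs = e["attributes"]
            if e["database"] not in databases:
                continue
            if not all(name in attrs for name in attribute_names):
                continue
            key = tuple(attrs[name] for name in attribute_names)
            buckets[key].append(e["id"])
        for ids in buckets.values():
            for other in ids[1:]:
                union(ids[0], other)

    groups = defaultdict(set)
    for e in entities:
        groups[find(e["id"])].add(e["id"])
    return sorted(sorted(group) for group in groups.values())


# Placeholder data; identifiers and sequences are illustrative only.
entities = [
    {"id": "e1", "database": "uniprot",
     "attributes": {"accession": "ACC0001", "sequence": "MKT", "taxonomy": "9606"}},
    {"id": "e2", "database": "genbank",
     "attributes": {"sequence": "MKT", "taxonomy": "9606"}},
    {"id": "e3", "database": "other_db",
     "attributes": {"accession": "ACC0001"}},
    {"id": "e4", "database": "genbank",
     "attributes": {"sequence": "MKT", "taxonomy": "10090"}},
]
rules = [
    (("sequence", "taxonomy"), {"uniprot", "genbank"}),  # same sequence and taxonomy
    (("accession",), {"uniprot", "other_db"}),           # same accession code
]
user_entities = unify(entities, rules)
# e1, e2 and e3 form one user entity; e4 differs in taxonomy and stays alone.
```

Because the raw entities are never modified, running the same data through a different rule set yields a different grouping without re-importing anything.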

The advantages of this integration approach are:

1) The BIANA database contains only raw data (with exactly the same nomenclature and identifiers as the original data source); therefore it does not entail any assumption about the data integration process, allowing the user to specify how the integration should be done;

2) The user can use information from a single database or the combination of multiple databases, selecting which ones he/she wants to use at each process; and

3) The user can know exactly how the original data was processed, and is able to backtrack through his/her integration approach.

BIANA Software architecture --

BIANA is a Python framework composed of four (4) different modules:

1) Database Management (handles communication between BIANA and the MySQL database);

2) Parser Management (imports data into the BIANA database);

3) Network Management (performs networking operations using the NetworkX package); and

4) Session Management (to manage biological data sets and their networks).

The Cytoscape plug-in is a separate, user-friendly interface to BIANA (the plug-in communicates with BIANA and Cytoscape through a socket).

BIANA Analysis of networks --

BIANA grants access to most of the existing methods for the analysis of networks through NetworkX and Cytoscape: finding the shortest paths and connected components, calculating node degrees and network connectivity, etc.

In addition, BIANA includes new methods such as network randomization, node and edge tagging, calculation of linker degree based on node tags, and intersection and merging of networks.
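As an illustration of a tag-based measure, the sketch below computes a linker degree over a plain adjacency dictionary. It assumes that the linker degree of a node is the number of its neighbors carrying a given tag; that definition, the function names and the sample network are assumptions made for this example, not BIANA's or NetworkX's API.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency mapping from an iterable of (a, b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def linker_degree(adjacency, node_tags, tag):
    """Count, for every node, how many of its neighbors carry `tag`.

    Assumed definition for illustration; see the lead-in paragraph.
    """
    return {
        node: sum(1 for neighbor in neighbors if tag in node_tags.get(neighbor, set()))
        for node, neighbors in adjacency.items()
    }


# Hypothetical four-node network in which nodes "a" and "c" are tagged as seeds.
adjacency = build_adjacency([("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")])
node_tags = {"a": {"seed"}, "c": {"seed"}}
degrees = linker_degree(adjacency, node_tags, "seed")
# Node "b" touches both seeds, "d" touches one, and the seeds themselves touch none.
```

A node with a high linker degree connects several tagged nodes, which makes it a candidate "linker" between them even when it carries no tag itself.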

Recently, BIANA has been used to improve fold recognition using protein-protein interactions, and to model and analyze aneurysm-related molecular interactions using text-mining seed nodes.

BIANA Prediction of edges --

BIANA predicts novel relationships by transferring existing edges between nodes with common properties. Basically, let x, y, z be biological entities obtained with the unification approach. An interaction is predicted between x and y if:

1) x is observed to interact with z; and

2) y shares some attributes with z (the attributes are chosen by the user, e.g. PFAM domains, SCOP domains, or sequence similarity using cut-offs based on e-value or percentage of identity).

This extends the definition of interologs to relationships other than orthology.

(An interolog is a conserved interaction between a pair of proteins which have interacting homologs in another organism.)

For example, the manufacturers generated protein-protein interaction (PPI) networks from proteins and compared them with networks of predicted protein-protein interactions. The predictions were based on transferring interactions between proteins (i.e. y and z) that could be aligned over at least 90% of their sequence with at least 90% sequence similarity.
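The transfer rule described above can be sketched as follows. This is a simplified illustration rather than BIANA's code: the function name and data layout are hypothetical, "sharing an attribute" is reduced to having at least one attribute value in common (standing in for shared domains or a similarity cut-off), and self-interactions are not predicted.

```python
from collections import defaultdict


def predict_interactions(observed, attributes):
    """Predict x-y when x interacts with z and y shares an attribute with z.

    observed: iterable of (a, b) pairs of entities known to interact.
    attributes: dict mapping each entity to a set of attribute values
    (for example domain identifiers).
    Returns a set of frozensets, one per newly predicted undirected edge.
    """
    observed = list(observed)
    known = {frozenset(pair) for pair in observed}

    partners = defaultdict(set)
    for a, b in observed:
        partners[a].add(b)
        partners[b].add(a)

    predicted = set()
    for z, interactors in partners.items():
        z_attributes = attributes.get(z, set())
        for y, y_attributes in attributes.items():
            # y must be a different entity that shares at least one value with z.
            if y == z or not (y_attributes & z_attributes):
                continue
            for x in interactors:
                edge = frozenset((x, y))
                if x != y and edge not in known:
                    predicted.add(edge)
    return predicted


# Placeholder entities and attribute values, for illustration only.
observed = [("x", "z")]
attributes = {
    "x": {"DOMAIN_B"},
    "y": {"DOMAIN_A"},
    "z": {"DOMAIN_A"},
    "w": {"DOMAIN_C"},
}
predicted = predict_interactions(observed, attributes)
# y shares DOMAIN_A with z, and z interacts with x, so the edge x-y is predicted.
```

Replacing the set intersection with a sequence-similarity test (for instance an e-value or percentage-of-identity cut-off) would give the other transfer criteria mentioned in the text.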

BIANA's approach to data unification solves many of the nomenclature issues common to systems dealing with biological data. BIANA can easily be extended to handle new specific data repositories and new specific data types.

The unification protocol makes BIANA a flexible tool suitable for different user requirements: non-expert users can use a suggested unification protocol, while expert users can define their own specific unification rules.

System Requirements

Contact manufacturer.

Manufacturer

Manufacturer Web Site BIANA

Price Contact manufacturer.

G6G Abstract Number 20679

G6G Manufacturer Number 104257