## Bayesian Network Finder (BNFinder)

** Category** Intelligent Software>Bayesian Network Systems/Tools and Cross-Omics>Pathway Analysis/Gene Regulatory Networks/Tools

** Abstract** BNFinder is an exact and efficient software method for learning Bayesian networks.

It allows you to do Bayesian network reconstruction from experimental data. BNFinder supports dynamic Bayesian networks and, if the variables are partially ordered, also supports static Bayesian networks.

The main advantage of BNFinder is the use exact algorithm, which is a very efficient (polynomial with respect to the number of observations).

The BNFinder system is based on a novel polynomial-time algorithm (as stated above...) for learning an optimal Bayesian network structure.

The algorithm was designed to save reasonable time (speed) and perfect quality of learning in a wide class of problems occurring in the computational molecular biology field.

It works under the assumption that there is No need to examine the acyclicity of the graph, which is satisfied in the following cases:

1) When dealing with dynamic Bayesian networks, a dynamic Bayesian network describes stochastic evolution of a set of random variables over discretized time.

Therefore, conditional distributions refer to random variables in neighboring time points and the graph is always acyclic.

2) In case of static Bayesian networks the set of possible network structures must be restricted.

BNFinder lets the user divide the set of variables into an ordered set of disjointed subsets of variables, where edges can only lead from upstream to downstream subsets.

If such ordering is Not known beforehand, one can try to run BNFinder with different orderings and choose a network with the best overall score.

BNFinder learns optimal networks with respect to two (2) generally used scoring criteria: Bayesian-Dirichlet equivalence (BDe) and minimal description length (MDL).

The (default) BDe score originates from Bayesian statistics and corresponds to the posterior probability of a network-given data.

The MDL score originates from information theory and corresponds to the length of the data compressed with the compression model derived from the network structure.

It also has a statistical interpretation as an approximation of the posterior probability. The algorithm works in polynomial time for both scores, but computations with the MDL are faster, especially for large datasets.

However, the manufacturer recommends using the BDe score due to its exactness in the statistical interpretation.

Both MDL and BDe scores were originally designed for discrete variables.

Continuous variables are handled with corresponding scores, derived under the assumption that conditional distributions belong to a family of Gaussian mixtures.

BNFinder may learn either dynamic Bayesian networks (from time series data) or static Bayesian networks (from independent experimental data). In the second case it is necessary to specify constraints on the network’s structure forcing its acyclicity.

A special treatment is required for experiments, in which the values of some variables were perturbed (e.g. knockout experiments). Since perturbations change the structure of interactions, the learning procedures have to use data selectively.

BNFinder handles perturbations following (Dojer N, et al. Applying dynamic Bayesian networks to perturbed gene expression data. BMC Bioinformatics 2006;7:249), i.e. for scoring sets of parents of a variable v, it takes into account only the experiments where v was Not perturbed.

A prior distribution on the network structure may be specified through assigning weights to potential variable interactions following Tamada Y, et al. Estimating gene networks from gene expression data by combining Bayesian network model with promoter element detection. Bioinformatics 2003;1 (Suppl. 2):ii227-ii236.

Moreover, the size of the regulator sets of each variable may be bound to a given number and the spaces of possible conditional probability distributions of selected variables may be restricted to noisy-and or noisy-or distributions.

There are important biological applications of Bayesian networks, in which the usual amount of learning data is small relative to the network size (e.g. reconstruction of gene regulatory networks (GRNs) from microarray data).

Typically in such cases suboptimal models explain the data nearly as well as the optimal (highest scoring) ones.

For this reason, Friedman N, Koller D. (Being Bayesian about network structure. a bayesian approach to structure discovery in Bayesian networks. Mach. Learn 2003;50:95-125), propose that you should pay attention to network features frequently appearing in suboptimal networks.

Following this idea, BNFinder splits a potential network structure into independently learned features, each one composed of a vertex and its parents set.

For each vertex BNFinder returns as an output, a user-specified number of suboptimal parent features with their relative posterior probabilities.

Setting this parameter to 1, causes BNFinder to learn the optimal network structure composed of the highest scoring features. Otherwise returned features constitute a class of suboptimal networks.

Note: Output may be written in a few formats, supported in various graph and Bayesian network applications.

BNFinder Implemetation and Documentation --

The BNFinder software is implemented in the Python programming language so it can be installed and run on all popular operating systems. The only requirement is the availability of a recent version (>2.4) of the Python interpreter.

Besides the stand-alone version of BNFinder, the manufacturer has created a publicly available web server which allows you to use BNFinder running on their servers, using your input data.

The server uses a very simple web form for input and sends the results to the E-mail address provided.

To save computing resources, the manufacturer has limited the web version to handle at most 20 variables and 500 observations.

Extensive documentation and installation instructions are also available on the manufacturer's web-site.

*System Requirements*

Contact manufacturer.

*Manufacturer*

- Institute of Informatics
- University of Warsaw
- Poland

** Manufacturer Web Site**
BNFinder

** Price** Contact manufacturer.

** G6G Abstract Number** 20697

** G6G Manufacturer Number** 104270