## Visual Integration for Bayesian Evaluation (VIBE) 2.0

** Category** Cross-Omics>Agent-Based Modeling/Simulation/Tools

** Abstract** Visual Integration for Bayesian Evaluation (VIBE) 2.0 is a stand-alone software tool that allows a user to explore the effects of including or excluding specific data sources, such as transcriptomics, proteomics, metabolomics, biochemistry, and function, in a Bayesian fusion analysis.

It offers a simple and flexible approach to fuse complementary datasets and dynamically evaluate the contribution of each dataset.

VIBE works by integrating probability models from multiple data streams. The software can either ingest precomputed ‘probability models’ or create them from the raw data.

The statistical methods used to derive the probability models and the data that is included in the fusion can be modified on the fly to analyze the system dynamically.

VIBE 2.0 Analysis Capabilities --

VIBE 2.0 takes as input either raw datasets or precomputed probability models for each data type.

The probability model is a matrix where each value (i, j) is the probability of observing a specific known experimental group (j) given a sample (i) associated with one or more datasets.

To create these probability models VIBE uses statistical learning algorithms, including naïve Bayes classification, degree of association, k-nearest neighbors, and multinomial logistic regression.

These statistical learning algorithms compute the probability of observing the specific data associated with a sample given a particular experimental group.

Bayesian statistics are used to generate the posterior probabilities that are represented in the probability matrices used by VIBE.
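The step from class likelihoods to the posterior probabilities stored in a probability model can be sketched as follows. This is a minimal illustration of Bayes' rule, not VIBE's own code: the function name `posterior_matrix` and the default of equal priors for every experimental group are assumptions made for the example.

```python
def posterior_matrix(likelihoods, priors=None):
    """Turn likelihoods P(data_i | group_j) into posteriors P(group_j | data_i).

    likelihoods: list of rows, one row per sample, one column per group.
    priors: optional list of prior probabilities, one per group.
    """
    n_groups = len(likelihoods[0])
    if priors is None:
        # Assumption for this sketch: every experimental group is equally likely.
        priors = [1.0 / n_groups] * n_groups

    posteriors = []
    for row in likelihoods:
        # Bayes' rule numerator: likelihood times prior for each group.
        joint = [lik * prior for lik, prior in zip(row, priors)]
        evidence = sum(joint)
        if evidence == 0:
            raise ValueError("all class likelihoods are zero for a sample")
        # Dividing by the evidence makes each row sum to one.
        posteriors.append([value / evidence for value in joint])
    return posteriors


# Hypothetical likelihoods for two samples and two experimental groups.
likelihoods = [
    [0.8, 0.2],
    [0.1, 0.3],
]
posteriors = posterior_matrix(likelihoods)
```

Each row of the result sums to one, which matches the description of the probability model as a matrix of group probabilities per sample.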

For each data source, VIBE 2.0 then calculates the classification accuracy (the fraction of samples assigned to the correct experimental group), providing the user with a baseline that quantifies the effectiveness of each individual analysis platform.

A class assignment table is also displayed graphically, showing, for the samples that truly belong to each experimental group, the groups into which they are classified.
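As a rough sketch of the accuracy calculation, assuming each sample is assigned to the group with the highest probability in its row (the function names and the example values are hypothetical):

```python
def predict_groups(prob_model):
    """Assign each sample to the group with the largest probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in prob_model]


def classification_accuracy(prob_model, true_groups):
    """Fraction of samples assigned to their true experimental group."""
    predicted = predict_groups(prob_model)
    correct = sum(p == t for p, t in zip(predicted, true_groups))
    return correct / len(true_groups)


# Hypothetical probability model: four samples, two experimental groups.
prob_model = [
    [0.9, 0.1],
    [0.4, 0.6],
    [0.7, 0.3],
    [0.2, 0.8],
]
true_groups = [0, 1, 1, 1]
accuracy = classification_accuracy(prob_model, true_groups)
```

Here three of the four samples land in their true group, so the accuracy is 0.75.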

The visualization allows the user to gain insight into the efficacy of the individual platforms, for example, showing that a particular data type is unable to distinguish between two (2) of the experimental groups.
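A class assignment table of this kind can be approximated as a row-normalized confusion matrix. The sketch below is an illustration under that assumption; the function name and the data are invented for the example, and the values were chosen so that one data type cannot separate the second and third groups.

```python
def class_assignment_table(prob_model, true_groups, n_groups):
    """Row = true group, column = predicted group, value = fraction of samples."""
    counts = [[0] * n_groups for _ in range(n_groups)]
    for row, true in zip(prob_model, true_groups):
        predicted = max(range(n_groups), key=row.__getitem__)
        counts[true][predicted] += 1

    table = []
    for count_row in counts:
        total = sum(count_row)
        # Groups with no samples are reported as all zeros.
        table.append([c / total if total else 0.0 for c in count_row])
    return table


# Hypothetical model: group 0 is well separated, groups 1 and 2 are confused.
prob_model = [
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.5, 0.4],
    [0.1, 0.4, 0.5],
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
]
true_groups = [0, 0, 1, 1, 2, 2]
table = class_assignment_table(prob_model, true_groups, n_groups=3)
```

The first row of `table` has all its mass on the diagonal, while the second and third rows are split evenly between groups 1 and 2, which is the pattern that signals a data type unable to distinguish two groups.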

The user then selects a subset (or the full set) of the data sources to be included in the integrated analysis and VIBE 2.0 performs a Bayesian fusion and gives the classification accuracy based on the integrated probability model.
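The description above does not spell out the fusion rule. One common choice, shown here purely as an assumption, is naive Bayes fusion: if the data sources are conditionally independent given the experimental group, the fused posterior is proportional to the product of the per-source posteriors divided by the prior raised to the number of sources minus one. The function name `fuse` and the example values are hypothetical.

```python
def fuse(prob_models, priors=None):
    """Fuse per-source posterior matrices under conditional independence.

    prob_models: list of matrices, each with one row per sample and one
    column per group. All matrices must have the same shape.
    """
    n_samples = len(prob_models[0])
    n_groups = len(prob_models[0][0])
    n_sources = len(prob_models)
    if priors is None:
        # Assumption for this sketch: equal priors for every group.
        priors = [1.0 / n_groups] * n_groups

    fused = []
    for i in range(n_samples):
        scores = []
        for j in range(n_groups):
            # P(c | x_1..x_K) is proportional to P(c)^(1-K) * prod_k P(c | x_k).
            score = priors[j] ** (1 - n_sources)
            for model in prob_models:
                score *= model[i][j]
            scores.append(score)
        total = sum(scores)
        if total == 0:
            raise ValueError(f"sample {i} has zero probability in every group")
        fused.append([s / total for s in scores])
    return fused


# Hypothetical posteriors for one sample from two data sources.
source_a = [[0.6, 0.4]]
source_b = [[0.7, 0.3]]
fused = fuse([source_a, source_b])
```

With a single source the function returns that source's matrix unchanged, and two sources that agree reinforce each other: the fused probability of the first group here is higher than either input.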

As the fusion calculation is almost instantaneous, the user can experiment with multiple combinations of the input data sets to evaluate the impact of including each data set in the fused analysis.
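This kind of exploration can also be imitated in a script by scoring every subset of data sources. The sketch below assumes a simple product-rule fusion with equal priors, which may differ from VIBE's calculation; the dataset names and probabilities are invented.

```python
from itertools import combinations
from math import prod


def fused_accuracy(models, true_groups):
    """Accuracy of a product-rule fusion of the given models (equal priors assumed)."""
    correct = 0
    for i, true in enumerate(true_groups):
        n_groups = len(models[0][i])
        # With equal priors the fused ranking depends only on the product.
        scores = [prod(m[i][j] for m in models) for j in range(n_groups)]
        correct += scores.index(max(scores)) == true
    return correct / len(true_groups)


def rank_combinations(named_models, true_groups):
    """Score every non-empty subset of data sources, best accuracy first."""
    names = sorted(named_models)
    results = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            models = [named_models[name] for name in combo]
            results.append((combo, fused_accuracy(models, true_groups)))
    return sorted(results, key=lambda item: -item[1])


# Hypothetical probability models for four samples and two groups.
named_models = {
    "transcriptomics": [[0.9, 0.1], [0.6, 0.4], [0.7, 0.3], [0.2, 0.8]],
    "proteomics": [[0.4, 0.6], [0.1, 0.9], [0.6, 0.4], [0.3, 0.7]],
}
true_groups = [0, 1, 0, 1]
ranked = rank_combinations(named_models, true_groups)
```

In this invented example each source misclassifies a different sample, so each scores 0.75 alone while the fused pair classifies all four samples correctly.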

VIBE 2.0 Implementation --

The application consists of three (3) graphical user interface (GUI) screens. The first two screens are associated with the user ‘input’ where the data sources are specified, as well as the statistical methods that will be used to analyze each dataset.

The third or ‘analysis’ screen then provides visualization of the data integration analysis.

VIBE 2.0 Input screen --

On the input screen, the user specifies the source data files containing the class matrix (defining the true experimental group of each sample in the experiment) and the raw data or probability matrices for each individual data source to be used in the integration.

The data handling screen is used to select the type of data for upload and the statistical method to be used to create the probability model.

VIBE 2.0 does not perform any data quality checks beyond ensuring that dataset sample sizes match and that the data have appropriate values for the statistical method to be employed. The assumption is that the data are of adequate quality and have been properly normalized prior to analysis.
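Checks of this limited kind might look like the sketch below. The specific tests (matching sample counts and finite numeric values) and the function name `check_inputs` are assumptions for illustration; the checks VIBE applies for each statistical method are not described here.

```python
import math


def check_inputs(class_labels, datasets):
    """Return a list of problems found; an empty list means the inputs pass."""
    problems = []
    n_samples = len(class_labels)
    for name, rows in datasets.items():
        # Every dataset must describe the same samples as the class file.
        if len(rows) != n_samples:
            problems.append(f"{name}: {len(rows)} samples, expected {n_samples}")
        # Hypothetical value check: reject NaN and infinite entries.
        if any(not math.isfinite(value) for row in rows for value in row):
            problems.append(f"{name}: contains NaN or infinite values")
    return problems


# Hypothetical inputs: one clean dataset and one with two problems.
class_labels = [0, 0, 1, 1]
datasets = {
    "proteomics": [[1.0], [2.0], [3.0], [4.0]],
    "metabolomics": [[1.0], [2.0], [float("nan")]],
}
problems = check_inputs(class_labels, datasets)
```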

VIBE 2.0 does offer auto-scaling of the data, which normalizes each variable to a mean of zero and unit variance.

The uploaded files can be MATLAB® (.mat), Microsoft Excel (.xls or .xlsx) or flat text (.txt) files. The input screen also has fields where the user may enter a name, an abbreviation and a brief description for each dataset, as well as names for each experimental group if they are not specified in the class file.
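Auto-scaling of this kind is commonly implemented as a per-variable z-score. The sketch below assumes samples are rows, variables are columns, and the sample standard deviation (n - 1 denominator) is used; which denominator VIBE uses is not stated, so treat that choice as an assumption.

```python
from statistics import mean, stdev


def autoscale(data):
    """Scale each column to zero mean and unit variance.

    data: list of rows (samples), each a list of variable values.
    Requires at least two samples, since stdev needs two data points.
    """
    scaled_columns = []
    for column in zip(*data):
        center = mean(column)
        spread = stdev(column)  # assumption: n - 1 denominator
        if spread == 0:
            # A constant variable carries no information; map it to zeros.
            scaled_columns.append([0.0] * len(column))
        else:
            scaled_columns.append([(v - center) / spread for v in column])
    # Transpose back so that rows are samples again.
    return [list(row) for row in zip(*scaled_columns)]


# Hypothetical raw data: three samples, two variables on different scales.
raw = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
]
scaled = autoscale(raw)
```

After scaling, both variables take the same values for each sample even though they started on different scales, so neither dominates a distance-based method such as k-nearest neighbors.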

Once all information is entered, the user presses the ‘Continue’ button to launch the ‘analysis’ screen.

VIBE 2.0 Analysis screen --

The analysis screen has two (2) visualization sections, one displaying the analysis results of each individual data source and a second displaying the results of the data fusion. Up to six data sources are viewable simultaneously.

Upon launch, the classification accuracy and the class assignment table for each individual data source are calculated and displayed.

The class assignment table is displayed as a plot with true class along the left axis and predicted class along the top axis, where the color at each location represents the fraction of samples classified into the associated class. Thus, a diagonal line of red boxes running top left to bottom right represents perfect classification.

The user selects the subset of datasets to use in the integrated analysis via the ‘Use in Integration’ buttons adjacent to each data source (default is all selected).

The ‘Integrate’ button calculates the integrated probability model and displays the classification accuracy and the class assignment table for the fused analysis. Because the calculation is nearly instantaneous, multiple combinations can be explored interactively.

Additional features are available to facilitate the use of the integrated results in further analysis. Optional annotations can be added, and the ‘Save Screen’ button saves a JPEG image of the current state of the analysis screen.

The ‘Output File’ button exports an .xls file containing results from the integrated analysis, giving, for each sample in the experiment, the true class, the predicted class and the probability of being assigned to the predicted class.
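The per-sample content of such an export can be sketched as follows. CSV text is used here as a stand-in for the .xls format, and the column names, function names and values are hypothetical rather than taken from VIBE's actual output.

```python
import csv
import io


def export_rows(fused_model, true_groups, group_names):
    """Build one record per sample: true class, predicted class, probability."""
    rows = []
    for index, (probs, true) in enumerate(zip(fused_model, true_groups), start=1):
        predicted = probs.index(max(probs))
        rows.append({
            "sample": index,
            "true_class": group_names[true],
            "predicted_class": group_names[predicted],
            "probability": round(probs[predicted], 4),
        })
    return rows


def to_csv(rows):
    """Serialize the records as CSV text (stand-in for the .xls export)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical fused probability model for two samples.
fused_model = [[0.78, 0.22], [0.35, 0.65]]
true_groups = [0, 0]
group_names = ["control", "treated"]
rows = export_rows(fused_model, true_groups, group_names)
csv_text = to_csv(rows)
```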

*System Requirements*

VIBE 2.0 was built in MATLAB R2009b from The MathWorks, Inc.® and is packaged, using MATLAB Compiler version 4.11, as a stand-alone executable for the Windows platform.

*Manufacturer*

- Computational Mathematics and Computational Biology and Bioinformatics
- Pacific Northwest National Laboratory (PNNL)
- Richland, WA 99352, USA

** Manufacturer Web Site**
VIBE

** Price** Contact manufacturer.

** G6G Abstract Number** 20680

** G6G Manufacturer Number** 104258