## Pointillist

**Category** Cross-Omics > Pathway Analysis/Tools

**Abstract** Pointillist is a collection of programs (a system) for inferring
the set of elements affected by a perturbation of a biological system,
based on multiple types of evidence.

It contains four (4) programs: Data Manager, Data Normalizer, Significance Calculator, and Evidence-Weighted Inferer.

The software implements the full set of 'data integration methods' discussed in the following article:

Hwang D, Rust AG, Ramsey S, Smith JJ, Leslie DM, Weston AD, de
Atauri P, Aitchison JD, Hood L, Siegel AF, Bolouri H. "A data integration
methodology for systems biology." *Proc Natl Acad Sci U S A* 2005;
102(48):17296-17301.

Pointillist allows users to select the 'integration method' most appropriate to their needs.

The manufacturer's methodology provides a simple and efficient means for combining multiple sets of noisy data to produce probabilistic models.

The final outcome of this data integration procedure is a 'network model' in which nodes represent 'biomolecular species' (e.g., genes or proteins) and edges represent interactions (e.g., transcriptional regulation).

The manufacturer's methodology associates a P value with each node and edge in the network model.

These P values indicate the degree of confidence in a node or edge being a true component of the system of interest (compared with background/control).

1) Data Manager -- The Data Manager program enables merging of data from multiple data files that contain observations for different, partially overlapping collections of network elements.

In addition, the Data Manager allows averaging over multiple measurements of the same evidence type for the same network element (whether across multiple files or within a single data file).

The Data Manager displays the data loaded from one or more data files in a data table, with each row corresponding to a single element and each column corresponding to a single evidence type. The table may be sorted by evidence or element name.

Either the entire table, or specific selected columns, may be saved to a data file. The data file of observations that is loaded into the Data Manager must conform to a specific format.

Beyond the file format itself, the Data Manager may require that element names be unique within the input data file, depending on the state of the "average duplicates" check-box.
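The merging and averaging behaviour described for the Data Manager can be illustrated with a minimal Python sketch. This is not Pointillist code: the function name `merge_observations`, the in-memory `(element, evidence, value)` layout, and the choice to raise an error on duplicates when averaging is switched off are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_observations(files, average_duplicates=True):
    """Merge observations from several files into one element-by-evidence table.

    `files` is a list of record lists; each record is a hypothetical
    (element, evidence, value) tuple, not Pointillist's actual file format.
    """
    cells = defaultdict(list)
    for records in files:
        for element, evidence, value in records:
            if value is None:  # skip missing observations
                continue
            cells[(element, evidence)].append(float(value))

    table = defaultdict(dict)
    for (element, evidence), values in cells.items():
        if len(values) > 1 and not average_duplicates:
            # Assumed behaviour when the "average duplicates" option is off.
            raise ValueError(f"duplicate observation for {element!r}/{evidence!r}")
        table[element][evidence] = sum(values) / len(values)

    # Rows sorted by element name; elements may lack some evidence types.
    return {element: table[element] for element in sorted(table)}


file_a = [("geneA", "mRNA", 2.0), ("geneB", "mRNA", 1.0), ("geneA", "mRNA", 4.0)]
file_b = [("geneB", "protein", 0.5), ("geneC", "protein", 1.5)]
merged = merge_observations([file_a, file_b])
```

Here the two `geneA` measurements of the `mRNA` evidence type are averaged, while `geneC` appears with only the `protein` column filled, reflecting the partially overlapping element sets.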

2) Data Normalizer -- The Data Normalizer is a program that can be used to perform a normalization of microarray expression data that is arranged in a matrix format in a single data file. The data file must conform to a specific format.

Currently, the only normalization method supported is Quantile Normalization. The quantile normalization algorithm implemented here is based on a prototype written by Daehee Hwang at the Institute for Systems Biology, and it is similar to the quantile normalization algorithm proposed by Bolstad et al. in their paper:

Bolstad, B.M., Irizarry, R.A., Astrand, M., and Speed, T.P. (2003), "A
Comparison of Normalization Methods for High Density Oligonucleotide
Array Data Based on Bias and Variance." *Bioinformatics* 19(2):185-193.

Note that only the quantile normalization step of the RMA (Robust Multi-Chip Average) procedure is implemented in this program; background adjustment is not implemented here, and is assumed to have been applied to the raw observations before this program is applied to the data.

Each column of the data file corresponds to a different microarray experiment, and each row of the data file corresponds to a different probe.
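As a rough illustration of quantile normalization on such a probes-by-experiments matrix, the sketch below follows the general idea in Bolstad et al.: sort each column, average across columns at each rank, and assign those averages back by rank. It is not the Pointillist implementation; the function name is hypothetical, and the sketch assumes a complete matrix with no missing values and breaks ties by row order.

```python
def quantile_normalize(matrix):
    """Quantile-normalize a matrix given as a list of rows (probes).

    Each column is one microarray experiment. Assumes no missing values;
    tied values within a column are ranked in row order.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]

    # Mean of the k-th smallest value across all experiments.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[k] for col in sorted_cols) / n_cols for k in range(n_rows)
    ]

    normalized = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        # Row indices of this column in ascending order of value.
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[i][j] = rank_means[rank]
    return normalized


data = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 4.0, 6.0],
    [4.0, 2.0, 8.0],
]
normalized = quantile_normalize(data)
```

After normalization every column contains the same set of values, so the experiments share one empirical distribution while each probe keeps its within-experiment rank.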

3) Significance Calculator -- The Significance Calculator is a program that can analyze the 'probability distribution' of a set of observations, and compute the statistical significance of each observation on the basis of the distribution.

Alternatively, the significances can be computed based on the distribution of a separate set of "negative control" observations.

The "significance" of an observation is here defined as the probability that the observation would occur by chance, given either the global or the "negative control" distribution for observations.

The input to the program is a matrix of observations, which must conform to a specific format. Missing observations are allowed, and are denoted by an empty cell or the string "null".
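The "negative control" variant can be sketched as an empirical tail probability. This is only an illustration under stated assumptions: the function name is hypothetical, the test is one-sided (larger observations count as more extreme), and an add-one correction is applied so that no significance is exactly zero. The distribution estimate that the Significance Calculator actually uses is not described here.

```python
from bisect import bisect_left


def empirical_significance(observations, controls):
    """Return one-sided empirical significances against negative controls.

    Missing observations (None, an empty string, or "null") yield None.
    """
    controls_sorted = sorted(float(c) for c in controls)
    n = len(controls_sorted)
    result = []
    for obs in observations:
        if obs is None or obs == "" or obs == "null":
            result.append(None)
            continue
        # Number of control values greater than or equal to the observation.
        at_least = n - bisect_left(controls_sorted, float(obs))
        # Add-one correction (an assumption of this sketch).
        result.append((at_least + 1) / (n + 1))
    return result


controls = [0.1, 0.4, 0.2, 0.9, 0.3, 0.5, 0.7, 0.6, 0.8]
significances = empirical_significance([0.95, "null", 0.45], controls)
```

An observation larger than every control receives the smallest attainable significance, 1/(n+1), and the missing entry is passed through as `None`.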

4) Evidence-Weighted Inferer -- The Evidence-Weighted Inferer is a 'classification program' that attempts to divide a set of elements into two (2) sets, affected and unaffected.

It compares multiple evidences to determine which elements of a network are most likely affected by a perturbation of a system.

The input to this program is a file containing a matrix of significances for observations for evidence types (columns) and network elements (rows), in a specific format. Missing data is allowed; a missing significance value is denoted by the value "-1", a "null" string, or (in certain cases) an empty cell in the input file.

The smaller a significance value for an observation, the more likely it is that the associated element is affected by the perturbation of the system. In this sense, the significance is analogous to a probability that a given observation would occur, given that the associated element is not a member of the set of affected elements.

The significances may be calculated using the Significance Calculator (see above), or they may be generated by any other procedure that can assign a statistical likelihood or probability.

Each evidence type is assigned a weight, based on consistency with the other types of evidence. The weights are used to compute an 'effective significance' for each significance value in the matrix.
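To make the idea of weighting evidence types concrete, the sketch below combines one row of significances with a weighted Z (Liptak-Stouffer) method and then applies a cutoff. This is a generic, well-known technique swapped in for illustration and is not claimed to be Pointillist's own algorithm: the weights are supplied by the caller rather than derived from consistency among evidence types, and the names `combine_significances` and `alpha` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()
_MISSING = (None, "", "null", -1, "-1")  # missing-value markers


def combine_significances(row, weights):
    """Combine one element's significances into a single value.

    Uses a weighted Z (Liptak-Stouffer) combination; missing entries
    are skipped. Returns None if every entry is missing.
    """
    numerator = 0.0
    weight_sq = 0.0
    for p, w in zip(row, weights):
        if p in _MISSING:
            continue
        # Keep p strictly inside (0, 1) so the inverse CDF is defined.
        p = min(max(float(p), 1e-12), 1.0 - 1e-12)
        numerator += w * _STD_NORMAL.inv_cdf(1.0 - p)
        weight_sq += w * w
    if weight_sq == 0.0:
        return None
    return 1.0 - _STD_NORMAL.cdf(numerator / sqrt(weight_sq))


# Rows are network elements, columns are evidence types.
matrix = {
    "geneA": [0.01, 0.03, -1],
    "geneB": [0.6, "null", 0.4],
    "geneC": [-1, "", "null"],
}
weights = [1.0, 0.5, 0.8]  # hypothetical per-evidence weights
alpha = 0.05               # hypothetical cutoff

combined = {name: combine_significances(row, weights) for name, row in matrix.items()}
affected = sorted(n for n, p in combined.items() if p is not None and p < alpha)
```

With these inputs only `geneA` falls below the cutoff, and `geneC`, which has no usable evidence, is left unclassified rather than forced into either set.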

External Libraries -- The Pointillist system relies upon a number of external open-source libraries. These libraries are bundled with the Pointillist program and are installed within the Pointillist directory when you install Pointillist on your system.

*Note: The Pointillist system has multiple versions in both MATLAB and
Java (contact the manufacturer for details).*

*System Requirements*

Contact manufacturer.

*Manufacturer*

- Bolouri Group
- Institute for Systems Biology
- 1441 North 34th Street
- Seattle, WA 98103
- USA

**Manufacturer Web Site**
Pointillist

**Price** Contact manufacturer.

**G6G Abstract Number** 20408

**G6G Manufacturer Number** 104038