The OpenMS Proteomics Pipeline (TOPP)

Category Cross-Omics>Workflow Knowledge Bases/Systems/Tools and Proteomics>Mass Spectrometry Analysis/Tools

Abstract TOPP (The OpenMS Proteomics Pipeline) is a pipeline for the analysis of high-performance liquid chromatography/mass spectrometry (HPLC/MS) data.

It consists of several small applications that can be chained to create analysis pipelines (workflows) tailored for a specific problem.
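The chaining idea can be sketched as a short dry run that builds one command line per step, each reading the previous step's output. This is only an illustration: the tool names come from the lists in this entry, the `-in` and `-out` options are assumed to be accepted by every tool, the file names are hypothetical, and most tools need further parameters in practice.

```python
import shutil
import subprocess


def build_pipeline(first_input, steps):
    """Build one command per step; each step reads the previous step's output."""
    commands = []
    current = first_input
    for index, (tool, extension) in enumerate(steps, start=1):
        output = f"step{index}_{tool}.{extension}"
        # Assumption: every tool accepts -in and -out; most need more options.
        commands.append([tool, "-in", current, "-out", output])
        current = output
    return commands


def run_pipeline(commands):
    """Run the commands in order, stopping at the first failure."""
    for command in commands:
        if shutil.which(command[0]) is None:
            raise FileNotFoundError(f"{command[0]} is not on the PATH")
        subprocess.run(command, check=True)


# Hypothetical three-step chain: convert, smooth, then pick peaks.
steps = [("FileConverter", "mzML"), ("NoiseFilter", "mzML"), ("PeakPicker", "mzML")]
commands = build_pipeline("sample.mzXML", steps)
for command in commands:
    print(" ".join(command))  # dry run: print only, do not execute
```

Calling `run_pipeline(commands)` would execute the chain, provided the tools are installed and each one is given the additional parameters it requires.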

TOPP also includes the MS data viewer TOPPView --

TOPPView is an integrated data visualization and analysis tool for mass spectrometric data sets. It allows the visualization and comparison of individual mass spectra, two-dimensional liquid chromatography/mass spectrometry (LC/MS) data sets and their accompanying metadata.

By supporting standardized XML-based data exchange formats, data import is possible from any type of mass spectrometer. The integrated analysis tools of TOPP allow efficient data analysis from within TOPPView through a convenient graphical user interface (GUI).

TOPP also includes TOPPAS, an assistant for GUI-driven TOPP workflow design --

TOPPAS allows you to create, edit, open, save, and run TOPP workflows.

Pipelines can be created conveniently in the GUI by means of mouse interactions. The parameters of all involved tools can be edited within the application and are also saved as part of the pipeline definition in a .toppas file.

Furthermore, TOPPAS interactively performs validity checks during the pipeline editing process, in order to make it more difficult to create an invalid workflow.

The TOPP tools are divided into a number of subgroups:

TOPP File Handling --

1) DTAExtractor - Extracts scans of an mzML file to several files in the DTA format.

The retention time, the m/z ratio (for MS level > 1) and the file extension are appended to the output file name. You can limit the exported spectra by m/z range, retention time range or MS level.

2) FileInfo - Shows basic information about the data in a file. This tool can show basic information about the data in several peak, feature and consensus feature files.

3) FileConverter - Converts between different MS file formats. This converter tries to determine the file type from the file extension or from the first few lines of the file. If the file type cannot be determined, you have to give the input or output file type explicitly.

4) FileFilter - Extracts or manipulates portions of the data from an mzML, featureXML or consensusXML file.

With this tool it is possible to extract m/z, retention time and intensity ranges from an input file and to write all data that lies within the given ranges to an output file.

5) FileMerger - Merges several files into an mzML file. The meta information that is valid for the whole experiment (e.g. MS instrument and sample) is taken from the first file.

The retention times for the individual scans are taken from the input file metadata or from the input file names, or they are auto-generated.

6) IDMerger - Merges several (protein/peptide identification) idXML files into one idXML file. You can merge an unlimited number of files into one idXML file. This tool is typically applied before ConsensusID or IDMapper (see below).

7) IDFileConverter - Converts identification engine file formats. Conversion from the TPP file formats pepXML and protXML to OpenMS' idXML is quite comprehensive, to the extent that the original data can be represented in the simpler idXML format.

8) SpectraMerger - Merges spectra from an LC/MS map, either by precursor or by RT blocks.

9) TextExporter - Exports various XML formats to a text file.

This application converts several OpenMS XML formats (namely featureXML, consensusXML and idXML) to text files. The primary goal of this tool is to create a readable format for Excel and OpenOffice.
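The range extraction described for FileFilter can be illustrated with a small in-memory sketch. It does not read mzML files or reproduce the tool's options; the `Peak` record and the `filter_peaks` function are hypothetical names introduced only to show the idea of keeping data that lies within given retention time, m/z and intensity ranges.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    rt: float         # retention time in seconds
    mz: float         # mass-to-charge ratio
    intensity: float  # signal intensity


def filter_peaks(peaks, rt_range=None, mz_range=None, intensity_range=None):
    """Keep peaks whose values fall inside every range that is given."""
    def inside(value, bounds):
        # A missing range (None) places no restriction on that dimension.
        return bounds is None or bounds[0] <= value <= bounds[1]

    return [
        peak for peak in peaks
        if inside(peak.rt, rt_range)
        and inside(peak.mz, mz_range)
        and inside(peak.intensity, intensity_range)
    ]


peaks = [
    Peak(100.0, 400.2, 1500.0),
    Peak(250.0, 812.4, 90.0),
    Peak(300.0, 650.1, 7000.0),
]
kept = filter_peaks(
    peaks,
    rt_range=(200.0, 400.0),
    mz_range=(600.0, 900.0),
    intensity_range=(100.0, 1e9),
)
print(kept)
```

Here only the third peak survives: the first lies outside the retention time range and the second falls below the intensity minimum.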

TOPP Signal Processing and Preprocessing --

1) BaselineFilter - Executes the top-hat filter to remove the baseline of an MS experiment. This nonlinear filter, known as the top-hat operator in mathematical morphology, is independent of the underlying baseline shape.

It can detect a local excess of intensity (a peak) even when the surrounding background is not uniform. The principle is to subtract from the signal its opening (erosion followed by dilation).

2) NoiseFilter - Removes noise from profile spectra by using different smoothing techniques. It executes a Savitzky-Golay or a Gaussian filter to reduce the noise in an MS experiment.

The idea of the Savitzky-Golay filter is to find filter coefficients that preserve higher moments; that is, the underlying function within the moving window is approximated by a higher-order polynomial, typically quadratic or quartic (see A. Savitzky and M. J. E. Golay, “Smoothing and Differentiation of Data by Simplified Least Squares Procedures”).

The Gaussian filter is a peak-area-preserving low-pass filter and is characterized by narrow bandwidths, sharp cutoffs, and low pass-band ripple.

3) PeakPicker - Can be used to find mass spectrometric peaks in profile mass spectra.

4) Resampler - Can be used to transform an LC/MS map into a re-sampled map or a Portable Network Graphics (PNG) image.

5) SpectraFilter - Can be used to apply different spectrum modification filters to the data.

6) MapNormalizer - Can be used to normalize peak intensities to the percentage of the maximum intensity in the HPLC-MS map.

7) InternalCalibration - Can be used to perform an internal calibration on an MS experiment.

8) TOFCalibration - Can be used to perform an external calibration for Time of Flight (TOF) spectra.

9) PrecursorMassCorrector - Can be used to correct the precursor entries of tandem MS scans.
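The two filtering ideas described for BaselineFilter and NoiseFilter can be sketched in a few lines. This is a minimal illustration on a plain list of intensities, not the OpenMS implementation: the flat structuring element, the window sizes and the function names are assumptions, and the Savitzky-Golay part uses the standard 5-point quadratic coefficients (-3, 12, 17, 12, -3)/35.

```python
def erode(signal, half_width):
    """Moving minimum over a window of 2 * half_width + 1 points."""
    n = len(signal)
    return [min(signal[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def dilate(signal, half_width):
    """Moving maximum over a window of 2 * half_width + 1 points."""
    n = len(signal)
    return [max(signal[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def top_hat(signal, half_width):
    """Signal minus its opening (erosion followed by dilation)."""
    opening = dilate(erode(signal, half_width), half_width)
    return [s - o for s, o in zip(signal, opening)]


# 5-point Savitzky-Golay smoothing coefficients for a quadratic fit.
SG_QUADRATIC_5 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def savitzky_golay_5(signal):
    """Smooth interior points; the two points at each edge are left unchanged."""
    smoothed = list(signal)
    for i in range(2, len(signal) - 2):
        smoothed[i] = sum(c * signal[i + k - 2]
                          for k, c in enumerate(SG_QUADRATIC_5))
    return smoothed


# A sloping baseline (1.0 + 0.5 * i) with one narrow peak at index 4.
raw = [1.0, 1.5, 2.0, 2.5, 13.0, 3.5, 4.0, 4.5, 5.0]
corrected = top_hat(raw, half_width=1)
print(corrected)

# A pure quadratic passes through the Savitzky-Golay filter unchanged.
parabola = [float(x * x) for x in range(7)]
print(savitzky_golay_5(parabola))
```

The structuring element must be wider than the peaks of interest, otherwise the opening follows the peak and the top-hat removes it together with the baseline.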

TOPP Quantitation --

1) AdditiveSeries - Computes an additive series to quantify a peptide in a set of samples.

2) Decharger - Decharges and merges different feature charge variants of the same chemical entity.

3) FeatureFinder - Detects two-dimensional features in LC-MS data.

4) ProteinQuantifier - Computes protein abundances from annotated feature/consensus maps.

5) SILACAnalyzer - Determines the ratio of peak pairs in LC-MS data.

6) ITRAQAnalyzer - Extracts and normalizes iTRAQ information from an MS experiment.

7) SeedListGenerator - Generates seed lists for feature detection.

TOPP Protein/Peptide Identification --

1) CompNovo - Performs a peptide/protein identification with the CompNovo engine.

2) InspectAdapter - Identifies MS/MS spectra using Inspect (external).

3) MascotAdapter - Identifies MS/MS spectra using Mascot (external) - (see G6G Abstract Number 20087).

4) MascotAdapterOnline - Identifies MS/MS spectra using Mascot (external).

5) OMSSAAdapter - Identifies MS/MS spectra using OMSSA (external).

6) PepNovoAdapter - Identifies MS/MS spectra using PepNovo (external).

7) XTandemAdapter - Identifies MS/MS spectra using X!Tandem (external).

8) SpecLibSearcher - Identifies peptide MS/MS spectra by spectral matching with a searchable spectral library.

TOPP Protein/Peptide Processing --

1) ConsensusID - Computes consensus identification from peptide identifications of several identification engines.

2) FalseDiscoveryRate - Estimates the false discovery rate on peptide and protein level using decoy searches.

3) IDDecoyProbability - Estimates peptide probabilities using a decoy search strategy.

4) IDFilter - Filters results from protein or peptide identification engines based on different criteria.

5) IDMapper - Assigns protein/peptide identifications to features or consensus features.

6) IDPosteriorErrorProbability - Estimates posterior error probabilities using a mixture model.

7) IDRTCalibration - Can be used to calibrate RTs of peptide hits linearly to standards.

8) PeptideIndexer - Refreshes the protein references for all peptide hits.

9) ProteinInference - Infers proteins from a list of (high-confidence) peptides.
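The decoy-based estimate mentioned for FalseDiscoveryRate can be illustrated with one common target-decoy estimator: the number of decoy hits divided by the number of target hits at or above a score threshold. This is a sketch under stated assumptions, not the tool's actual formula; the `decoy_fdr` function, the score scale (higher is better) and the example hits are hypothetical.

```python
def decoy_fdr(hits, threshold):
    """Estimate the FDR as decoys / targets among hits scoring >= threshold.

    hits is a list of (score, is_decoy) pairs; higher scores are better.
    """
    targets = sum(1 for score, is_decoy in hits
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in hits
                 if score >= threshold and is_decoy)
    if targets == 0:
        return 0.0  # no accepted target hits, so nothing to be wrong about
    return decoys / targets


# Hypothetical search results: (score, is_decoy)
hits = [
    (95, False), (90, False), (88, True), (80, False),
    (75, False), (60, True), (55, False),
]
print(decoy_fdr(hits, threshold=70))
print(decoy_fdr(hits, threshold=90))
```

With a threshold of 70, four target hits and one decoy hit pass, giving an estimate of 0.25; raising the threshold to 90 leaves two targets and no decoys.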

TOPP Targeted Experiments --

1) InclusionExclusionListCreator - Creates inclusion and/or exclusion lists for LC-MS/MS experiments.

2) PrecursorIonSelector - A tool for precursor ion selection based on identification results.

TOPP Peptide Property Prediction --

1) RTPredict - Predicts retention times for peptides using a model trained by RTModel.

2) RTModel - Trains a model for the retention time prediction of peptides from a training set.

3) PTPredict - Predicts the likelihood of peptides to be proteotypic using a model trained by PTModel.

4) PTModel - Trains a model for the prediction of proteotypic peptides from a training set.

TOPP Map Alignment --

1) MapAligner - Corrects retention time distortions between maps.

2) FeatureLinker - Groups corresponding features in one map or across maps (after alignment).

TOPP Misc --

1) GenericWrapper - Allows the generic wrapping of external tools.

2) ExecutePipeline - Executes workflows created by TOPPAS.

System Requirements

Contact manufacturer.

Manufacturer

Manufacturer Web Site TOPP

Price Contact manufacturer.

G6G Abstract Number 20660

G6G Manufacturer Number 104253