SUSPECTS

Category Genomics>Genetic Data Analysis/Tools

Abstract SUSPECTS is a web-based server that combines annotation and sequence-based approaches to prioritize 'disease candidate genes' in large regions of interest.

It uses multiple lines of evidence to rank genes quickly and effectively while limiting the effect of ‘annotation bias’ to significantly improve performance.

SUSPECTS is a novel, consolidated approach that combines the increased precision of annotation-based methods with the better recall of sequence-based methods.

Given a set of existing ‘candidate genes’ for a particular complex or oligogenic disease, it effectively automates further candidate gene selection from large regions on the principle that genes involved in that disease will tend to share the same or similar annotation, reflecting common ‘biological pathways’.

The aim of SUSPECTS is to efficiently automate the first steps of the ‘candidate gene’ approach.

In more depth, SUSPECTS is a system for matching Gene Ontology (GO) terms, InterPro domains and 'gene expression' data, built on top of the PROSPECTR ‘candidate prioritization’ system.

PROSPECTR (see G6G Abstract Number 20428) uses 'sequence features' to rank genes in order of their likelihood of involvement in disease; with SUSPECTS you can drill down further to 'rank genes' involved in specific complex traits and syndromes.

How SUSPECTS works --

SUSPECTS operates on the assumption that genes involved in a complex trait will belong to 'similar pathways' and should thus be more likely to share domains, annotation and 'patterns of expression'.

The server takes two inputs:

The first input is the coordinates of the genomic region that you are interested in.

You can specify these using markers, bands, chromosomal coordinates or genes.

The second input is a list of genes already known to be involved in the 'complex disease' you are interested in.

As a shortcut, you may simply enter the name of the disease; SUSPECTS will then find appropriate genes for you from the Online Mendelian Inheritance in Man (OMIM) database, the Human Gene Mutation Database (HGMD) and the Genetic Association Database (GAD) (see G6G Abstract Number 20314).

This list is known as the "match set".

SUSPECTS retrieves a ‘list of genes’ in the region requested and scores each one for its likelihood of involvement in disease by looking at its 'sequence features'.

For each gene SUSPECTS then looks for Gene Ontology (GO) terms that are 'semantically similar at a significant level' to terms associated with genes in the match set.

Each gene is scored according to how well its GO annotation compares to the annotation found in the match set.

SUSPECTS uses the information content of the terms in question to determine how large a score to give each match.
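
The information-content weighting can be illustrated with a minimal sketch. It assumes the common definition IC(t) = -ln p(t), where p(t) is the fraction of annotated genes that carry term t. This entry does not state which formula or semantic-similarity measure SUSPECTS actually uses, so the sketch simplifies "semantically similar" to "identical term". The function names, term labels and counts are all hypothetical.

```python
import math

def information_content(term_count, total_annotated):
    """IC of a term: -ln(fraction of annotated genes carrying it).

    Assumed definition; the formula SUSPECTS uses is not given in this entry.
    """
    if term_count <= 0 or total_annotated <= 0:
        raise ValueError("counts must be positive")
    return -math.log(term_count / total_annotated)

def go_match_score(candidate_terms, match_set_terms, term_counts, total_annotated):
    """Sum the IC of every term the candidate shares with the match set.

    Simplification: only exact term matches count, not semantic similarity.
    """
    shared = set(candidate_terms) & set(match_set_terms)
    return sum(information_content(term_counts[t], total_annotated) for t in shared)

# Hypothetical annotation counts: how many genes carry each term.
term_counts = {"term_broad": 5000, "term_mid": 50, "term_rare": 5}
total_genes = 10000

# A match on a rare (informative) term outweighs a match on a broad one.
rare_match = go_match_score({"term_rare"}, {"term_rare"}, term_counts, total_genes)
broad_match = go_match_score({"term_broad"}, {"term_broad"}, term_counts, total_genes)
print(rare_match > broad_match)
```

The point of the weighting is that sharing a very general term says little about shared biology, while sharing a rarely used term says much more.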

SUSPECTS then looks for InterPro domains shared with the match set.

Note: InterPro is a database of protein families, domains, regions, repeats and sites in which identifiable features found in known proteins can be applied to new protein sequences.

The score given to each gene depends on how significant the match is, based on how often the domain in question is found in the genome.
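
As a rough illustration of frequency-based domain scoring, the sketch below gives each shared domain a score of -ln(frequency), so domains found in few genes count for more. This entry says only that the score depends on how often the domain occurs in the genome. The negative-log form, the function name and the domain labels are assumptions made for the example.

```python
import math

def domain_match_scores(candidate_domains, match_set_domains,
                        genome_domain_counts, genome_gene_count):
    """Return {domain: score} for each domain shared with the match set.

    Assumed scoring: -ln(genes with the domain / genes in the genome).
    """
    scores = {}
    for domain in set(candidate_domains) & set(match_set_domains):
        frequency = genome_domain_counts[domain] / genome_gene_count
        scores[domain] = -math.log(frequency)
    return scores

# Hypothetical counts of genes containing each domain.
genome_domain_counts = {"domain_common": 2000, "domain_rare": 20}

scores = domain_match_scores(
    candidate_domains={"domain_common", "domain_rare"},
    match_set_domains={"domain_common", "domain_rare", "domain_other"},
    genome_domain_counts=genome_domain_counts,
    genome_gene_count=20000,
)
print(scores["domain_rare"] > scores["domain_common"])
```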

Finally, SUSPECTS examines the ‘gene expression profile’ and compares it to the profiles from the ‘match set’ using Spearman's rho rank-order correlation.

Scores depend on how well correlated any matching profiles are.
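
Spearman's rho is the Pearson correlation of the ranks of two profiles, so it rewards any monotonic relationship, not only a linear one. The sketch below is a self-contained illustration of the statistic. It is not the server's code, and the expression values are made up.

```python
def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined rank correlation")
    return cov / (var_x * var_y) ** 0.5

# Made-up expression levels across five tissues.
candidate_profile = [1.0, 2.0, 3.0, 4.0, 5.0]
match_profile = [1.0, 4.0, 9.0, 16.0, 25.0]  # monotonic but not linear
print(spearman_rho(candidate_profile, match_profile))
```

Because only the ordering of expression levels matters, two genes measured on different scales can still be compared directly.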

A weighted average is then calculated and a ranked list of genes is displayed.
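
The final combination step can be sketched as follows. The weights, gene names and component scores are hypothetical, and the sketch assumes each component score has already been scaled to the range 0 to 1 and that missing evidence counts as zero. The actual weights SUSPECTS applies are not given in this entry.

```python
# Hypothetical weights for the four lines of evidence.
WEIGHTS = {"sequence": 0.4, "go": 0.3, "domain": 0.2, "expression": 0.1}

def combined_score(scores, weights=WEIGHTS):
    """Weighted average of component scores; missing components count as 0."""
    total_weight = sum(weights.values())
    return sum(w * scores.get(name, 0.0) for name, w in weights.items()) / total_weight

def rank_genes(gene_scores, weights=WEIGHTS):
    """Return gene names ordered from best to worst combined score."""
    return sorted(gene_scores,
                  key=lambda gene: combined_score(gene_scores[gene], weights),
                  reverse=True)

# Made-up component scores for three genes in the region of interest.
gene_scores = {
    "GENE_A": {"sequence": 0.9, "go": 0.8, "domain": 0.5, "expression": 0.2},
    "GENE_B": {"sequence": 0.5, "go": 0.1, "domain": 0.0, "expression": 0.9},
    "GENE_C": {"sequence": 0.7, "go": 0.6, "domain": 0.9, "expression": 0.4},
}

for gene in rank_genes(gene_scores):
    print(gene, round(combined_score(gene_scores[gene]), 2))
```

Changing the weights can reorder the list, which is worth keeping in mind when comparing genes whose combined scores are close.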

Genes near the top of the list are - in theory - better candidates than those further down.

SUSPECTS limitations --

Disease genes are far better annotated than other genes and you should bear this in mind when interpreting results.

Different types of matches are weighted differently; the weights assigned are arbitrary.

The ‘SUSPECTS server’ will sometimes time out when dealing with complex queries. If this happens to you, try restricting the match set to fewer genes.

SUSPECTS similar systems --

You could also try ENDEAVOUR (see G6G Abstract Number 20416).

System Requirements

Contact manufacturer.

Manufacturer

Manufacturer Web Site SUSPECTS

Price Contact manufacturer.

G6G Abstract Number 20427

G6G Manufacturer Number 104056