Category Genomics>Genetic Data Analysis/Tools

Abstract GenoWatch is a 'disease gene' mining browser for association studies. This real-time batch single nucleotide polymorphism (SNP) and short tandem repeat polymorphism (STRP) overview system can be used to effectively extract up-to-date information from public domain websites.

Up to 100 markers can be processed in a batch so that researchers do not have to repeat tedious information retrieval steps.

GenoWatch uses real-time web integration so that researchers obtain the same information they would get by browsing manually.

The system greatly increases the throughput of candidate region analysis, avoids acquiring obsolete data following public database updates and reduces possible errors in manual operations.

GenoWatch is well suited to ‘disease candidate gene’ selection from candidate regions that are defined either by markers or by chromosome physical positions plus a flanking region.

The system accepts two common types of genomic markers, SNPs and STRPs, and can batch process SNP inputs.
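
A candidate region defined by a physical position plus a flanking region can be sketched as a simple interval calculation. The class, record, and method names below are hypothetical, and the 1-based coordinates with clamping at position 1 are assumptions for illustration, not GenoWatch's documented behavior.

```java
/** Hypothetical helper: builds a candidate region from a position and a flank size. */
final class CandidateRegion {

    /** A closed interval on one chromosome, in base pairs (assumed 1-based). */
    record Region(String chromosome, long start, long end) {}

    /** Extends the position by flankBp on both sides, never going below position 1. */
    static Region around(String chromosome, long position, long flankBp) {
        if (position < 1 || flankBp < 0) {
            throw new IllegalArgumentException("position must be >= 1 and flank must be >= 0");
        }
        long start = Math.max(1L, position - flankBp);
        long end = position + flankBp;
        return new Region(chromosome, start, end);
    }
}
```

For example, `CandidateRegion.around("chr6", 31_000_000L, 500_000L)` would describe a region of 1 Mbp centred on the given position.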

A SNP marker name can be input via a 'dbSNP rs ID' or Affymetrix Probe ID.
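
The two accepted SNP identifier forms could be told apart by pattern before a batch is processed. This is a minimal sketch: the class and method names are hypothetical, and the identifier patterns ("rs" followed by digits, and "SNP_A-" followed by digits for Affymetrix probe IDs) are assumptions about the common formats rather than GenoWatch's own validation rules. The batch limit of 100 comes from the description above.

```java
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper: classifies SNP marker names by identifier format. */
final class MarkerIdClassifier {

    enum IdType { DBSNP_RS_ID, AFFYMETRIX_PROBE_ID, UNKNOWN }

    // Batch limit stated in the GenoWatch description.
    static final int MAX_BATCH_SIZE = 100;

    // Assumption: dbSNP reference SNP IDs are "rs" followed by digits.
    private static final Pattern RS_ID = Pattern.compile("rs\\d+");
    // Assumption: Affymetrix SNP probe IDs look like "SNP_A-1234567".
    private static final Pattern AFFY_ID = Pattern.compile("SNP_A-\\d+");

    /** Returns the identifier type, ignoring surrounding whitespace. */
    static IdType classify(String markerName) {
        if (markerName == null) {
            return IdType.UNKNOWN;
        }
        String name = markerName.trim();
        if (RS_ID.matcher(name).matches()) {
            return IdType.DBSNP_RS_ID;
        }
        if (AFFY_ID.matcher(name).matches()) {
            return IdType.AFFYMETRIX_PROBE_ID;
        }
        return IdType.UNKNOWN;
    }

    /** True if the list is non-empty and small enough for a single batch. */
    static boolean fitsInOneBatch(List<String> markerNames) {
        return !markerNames.isEmpty() && markerNames.size() <= MAX_BATCH_SIZE;
    }
}
```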

Once the markers are entered, GenoWatch first locates the targeted chromosome regions. Each target region may be defined and displayed by a single marker or by a group of markers that lie close to one another, within a 1 Mbp range.

Subsequently, GenoWatch extracts gene information from major public websites such as the National Center for Biotechnology Information (NCBI), Universal Protein Resource (UniProt), Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO).
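
One way to form such target regions is to sort the markers by position and start a new group whenever a marker falls outside the 1 Mbp range. The sketch below assumes the range is measured from the first marker of each group on the same chromosome; the source does not say how GenoWatch measures it, and all names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper: groups nearby markers into target regions. */
final class RegionGrouper {

    /** A marker placed on a chromosome (position in base pairs). */
    record Marker(String name, String chromosome, long position) {}

    // 1 Mbp range, as stated in the description.
    static final long RANGE_BP = 1_000_000L;

    /** Groups markers so that each group spans at most RANGE_BP on one chromosome. */
    static List<List<Marker>> group(List<Marker> markers) {
        List<Marker> sorted = new ArrayList<>(markers);
        sorted.sort(Comparator.comparing(Marker::chromosome)
                .thenComparingLong(Marker::position));

        List<List<Marker>> regions = new ArrayList<>();
        List<Marker> current = new ArrayList<>();
        for (Marker m : sorted) {
            if (!current.isEmpty()) {
                Marker first = current.get(0);
                boolean sameRegion = first.chromosome().equals(m.chromosome())
                        && m.position() - first.position() <= RANGE_BP;
                if (!sameRegion) {
                    // Close the current group and start a new one.
                    regions.add(current);
                    current = new ArrayList<>();
                }
            }
            current.add(m);
        }
        if (!current.isEmpty()) {
            regions.add(current);
        }
        return regions;
    }
}
```

Measuring from the first marker keeps every group within the stated range; chaining on adjacent markers instead would allow a group to grow beyond 1 Mbp.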

The information made available includes gene function, tissue specificity, disease, subcellular location, pathway, ontology and related PubMed articles from relevant journals.

During processing, the system continuously reports the process status for every subtask.

The system integrates extracted information from different databases into a carefully designed results page, which is displayed when all processes have been completed.

For a batch task, the system positions all input markers on chromosomes and places them on an overview map called the 'Genome View', located at the top of the results page. This map gives researchers a clear overall picture of their markers, which are color-coded according to impact risk, and also displays nearby genes across the whole human genome.

When a marker on the 'Genome View' is clicked, a summary map, the 'Gene View' (described below), displays not only the marker and its nearby genes but also the distances between them in the region.

Marker and gene information can easily be accessed with a mouse click, including positions on the current assembly, gene ontology from GO, pathways from KEGG, function and disease annotation from UniProt, and related articles in PubMed.
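
The distance between a marker and a nearby gene can be illustrated with a small interval calculation. The source does not define how GenoWatch measures this distance, so the sketch assumes the distance is zero when the marker lies inside the gene and otherwise the gap to the nearer gene boundary; the names are hypothetical.

```java
/** Hypothetical helper: distance in base pairs between a marker and a gene. */
final class MarkerGeneDistance {

    /** A gene occupying the closed interval [start, end] on a chromosome. */
    record Gene(String symbol, long start, long end) {}

    /** Returns 0 if the marker lies within the gene, else the gap to the nearer boundary. */
    static long distance(long markerPosition, Gene gene) {
        if (markerPosition < gene.start()) {
            return gene.start() - markerPosition;
        }
        if (markerPosition > gene.end()) {
            return markerPosition - gene.end();
        }
        return 0L;
    }
}
```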

Clicking on a marker leads to ‘Gene View’ --

Gene View shows the marker's physical location on a chromosome, the relative positions of neighboring markers and structured genes, and a gene list with different levels of gene annotation information.

Genes on both the forward and reverse strands are colored blue for a single isoform and purple for multiple isoforms. In addition, pseudogenes are shown in red (for ‘unknown gene structure’).

The annotation for each gene is accessed through colored squares listed below the summary map.

The red letter ‘F’ square is for ‘Gene Function’ and gives a general description of functions of the gene or related proteins. The orange letter ‘T’ describes the ‘Tissue-specific expression’ of mRNA and protein from the gene.

The pink letter ‘D’ lists ‘Diseases’ associated with the gene or a deficiency of a protein from the gene. The green ‘L’ gives the ‘Subcellular Location’ of mature proteins related to the gene.

The blue ‘P’ for ‘Pathway’ describes the metabolic pathway with which the gene is associated. The purple ‘O’ (‘Gene Ontology’) provides information from the GO database.

These colored squares provide an overview of all gene annotations in a particular region.
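
The letter, color, and meaning of each square described above map naturally onto an enumeration. This is only an illustrative data structure built from the description, with hypothetical names; it is not taken from the GenoWatch source code.

```java
import java.util.Optional;

/** Hypothetical model of the six annotation squares shown below the summary map. */
enum AnnotationSquare {
    FUNCTION('F', "red", "Gene Function"),
    TISSUE('T', "orange", "Tissue-specific expression"),
    DISEASE('D', "pink", "Diseases"),
    LOCATION('L', "green", "Subcellular Location"),
    PATHWAY('P', "blue", "Pathway"),
    ONTOLOGY('O', "purple", "Gene Ontology");

    private final char letter;
    private final String color;
    private final String label;

    AnnotationSquare(char letter, String color, String label) {
        this.letter = letter;
        this.color = color;
        this.label = label;
    }

    char letter() { return letter; }
    String color() { return color; }
    String label() { return label; }

    /** Looks up a square by its letter, ignoring case. */
    static Optional<AnnotationSquare> fromLetter(char letter) {
        char upper = Character.toUpperCase(letter);
        for (AnnotationSquare square : values()) {
            if (square.letter == upper) {
                return Optional.of(square);
            }
        }
        return Optional.empty();
    }
}
```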

In front of these colored squares is a special cross-reference square with links that let the user submit gene data directly to other online query systems, such as PrimerZ for primer design of a single gene, CrossPath for pathway mapping of a group of genes, VisualSNP for SNP prioritization of SNP markers or genes, HapMap for haplotype analysis, and other sites for microRNA and genomic variant searches.

Furthermore, clicking on any of the markers, genes or squares on the maps leads to the 'Table View' for additional detailed information.

All extracted and displayed information can be linked to its original source page for verification.

GenoWatch provides researchers with an efficient and convenient way to analyze their markers or candidate regions in batch mode with associated gene annotation data, and to proceed to downstream assays.

Implementation --

GenoWatch, written in Java, takes advantage of the Jakarta Struts framework technology to implement the Model-View-Controller (MVC) architecture.

To accommodate most users’ needs at the GenoWatch input stage, JavaScript was used to implement a dynamic form with an interactive graph that provides various query options to designate a genome region.

To effectively extract data in real-time from different public sources, information-gathering processes are executed in parallel using multiple threads.
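
Parallel information gathering of this kind can be sketched with the standard `java.util.concurrent` thread pool. The example assumes one task per source and uses plain functions as stand-ins for the real web requests; the class and method names are hypothetical, and no network access is performed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

/** Hypothetical sketch: runs one lookup per source on its own thread. */
final class ParallelAnnotationFetcher {

    /**
     * Runs every source lookup in parallel and returns results keyed by source name.
     * Each Function stands in for a real web request (an assumption for illustration).
     */
    static Map<String, String> fetchAll(String geneSymbol,
                                        Map<String, Function<String, String>> sources) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, sources.size()));
        try {
            List<String> names = new ArrayList<>(sources.keySet());
            List<Callable<String>> tasks = new ArrayList<>();
            for (String name : names) {
                Function<String, String> lookup = sources.get(name);
                tasks.add(() -> lookup.apply(geneSymbol));
            }
            // invokeAll blocks until all tasks finish and keeps the task order.
            List<Future<String>> futures = pool.invokeAll(tasks);
            Map<String, String> results = new LinkedHashMap<>();
            for (int i = 0; i < names.size(); i++) {
                results.put(names.get(i), futures.get(i).get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while fetching annotations", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A source lookup failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With this layout the total wait is close to that of the slowest source rather than the sum of all sources, which is the main benefit of running the lookups in parallel.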

An asynchronous process is implemented to offer users a request ID for retrieving results later, or to send an email notification when the request is completed.
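
The request-ID pattern can be sketched with `CompletableFuture`: the job starts in the background, the caller receives an ID immediately, and a callback fires when the job finishes. The names are hypothetical, the callback is a stand-in for the email notification, and the in-memory map is an assumption; how GenoWatch stores requests is not described in the source.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Hypothetical sketch: hands out request IDs for jobs that finish later. */
final class RequestRegistry {

    private final Map<String, CompletableFuture<String>> requests = new ConcurrentHashMap<>();

    /**
     * Starts the job in the background and returns its request ID at once.
     * The notifier receives the request ID when the job completes
     * (a stand-in for sending an email notification).
     */
    String submit(Supplier<String> job, Consumer<String> notifier) {
        String requestId = UUID.randomUUID().toString();
        CompletableFuture<String> future = CompletableFuture.supplyAsync(job)
                .whenComplete((result, error) -> notifier.accept(requestId));
        requests.put(requestId, future);
        return requestId;
    }

    /** Returns the result if the request finished successfully, otherwise empty. */
    Optional<String> poll(String requestId) {
        CompletableFuture<String> future = requests.get(requestId);
        if (future == null || !future.isDone() || future.isCompletedExceptionally()) {
            return Optional.empty();
        }
        return Optional.ofNullable(future.join());
    }

    /** Blocks until the request finishes and returns its result. */
    String awaitResult(String requestId) {
        CompletableFuture<String> future = requests.get(requestId);
        if (future == null) {
            throw new IllegalArgumentException("Unknown request ID: " + requestId);
        }
        return future.join();
    }
}
```

A caller would keep the returned ID and call `poll` later, which mirrors retrieving results with a request ID after leaving the page.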

System Requirements



Manufacturer Web Site GenoWatch

Price Contact manufacturer.

G6G Abstract Number 20479

G6G Manufacturer Number 104104