CLC Genomics Workbench

Category Cross-Omics>Next Generation Sequence Analysis/Tools

Abstract CLC Genomics Workbench is a new solution for analyzing and visualizing Next Generation Sequencing (NGS) data. It incorporates cutting-edge technology and algorithms, while also supporting and integrating with the rest of your typical NGS workflow.

CLC Genomics Workbench includes all features of 'CLC Main Workbench' (see G6G Abstract Number 20096A) and the following additional functionalities:

1) De novo assembly -- The de novo assembly of CLC Genomics Workbench supports both short and long reads as well as paired-end reads, and it accepts Sanger, 454, Illumina Genome Analyzer, Helicos, and SOLiD sequencing data.

The de novo assembly process has two stages: first, contig sequences are created by aligning all the reads; second, all the reads are assembled using the contig sequences as reference.

2) Reference assembly -- The reference assembly of CLC Genomics Workbench supports both short and long reads as well as paired-end reads, and it accepts Sanger, 454, Solexa, Helicos, and SOLiD sequencing data.

3) Reference assembly of mixed datasets (e.g. 454 and Illumina Genome Analyzer).

4) Reference assembly of genomes of any size.

5) Assembly of standard read data and support for assembly of paired end reads / mate pair reads of any sequencing technology.

6) Advanced graphical tools for the detection of large-scale mutations and rearrangements.

7) Multiplex Sequencing by Name - When you do batch sequencing of different samples, you can use multiplexing techniques to run different samples in the same run.

The data analysis challenge is then to separate the sequencing reads, so that the reads from one sample are assembled together.

8) Support for Multiplex Sequencing by Tag - Many of the new high-throughput processes require several different samples to be loaded into the same sequencing run.

One method is to tag the sequences with a unique identifier during the preparation of the sample for sequencing [Meyer et al., 2007].
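As a rough illustration of tag-based separation, the sketch below sorts reads into per-sample groups by a leading tag and strips the tag before assembly. It is not CLC Genomics Workbench code: the function name, the tag table, and the reads are all hypothetical, and it assumes exact tag matches at the start of each read with no tag being a prefix of another.

```python
def demultiplex_by_tag(reads, tags):
    """Group reads by their leading tag and strip the tag from each read.

    reads: list of read sequences (strings)
    tags:  dict mapping tag sequence -> sample name (hypothetical table)
    """
    bins = {sample: [] for sample in tags.values()}
    bins["unassigned"] = []
    for read in reads:
        for tag, sample in tags.items():
            if read.startswith(tag):
                # Keep only the part of the read that follows the tag.
                bins[sample].append(read[len(tag):])
                break
        else:
            # No tag matched: keep the read aside instead of guessing.
            bins["unassigned"].append(read)
    return bins


tags = {"ACGT": "sample_A", "TGCA": "sample_B"}
reads = ["ACGTGGGATTC", "TGCACCCTTAG", "ACGTTTTAAAC", "GGGGAAAACCC"]
bins = demultiplex_by_tag(reads, tags)
```

A real workflow would usually also tolerate sequencing errors inside the tag; that is left out here to keep the idea visible.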

9) Masking of reference assembly based on annotations such as exons.

10) Integration with CLC bio’s High Performance computing solutions (see below), making assemblies very fast.

11) Interactive and zoomable viewing of genome assemblies, including sequencing reads, quality data, and reference sequences. The viewers are fully integrated into the downstream analyses.

12) Quality reporting and statistics on raw data, and reporting of assembly output - CLC Genomics Workbench allows for three (3) types of output reporting.

13) Trimming and filtering sequences - CLC Genomics Workbench offers a number of ways to trim and filter out sequence reads prior to assembly.

14) Single Nucleotide Polymorphism (SNP) detection - Instead of manually checking all the conflicts of a contig to discover significant single-nucleotide variations, CLC Genomics Workbench offers automated SNP detection.

The SNP detection in CLC Genomics Workbench is based on the Neighborhood Quality Standard (NQS) algorithm of [Altshuler et al., 2000] (also see [Brockman et al., 2008] for more information).

Based on your specifications on what you consider a valid SNP, the SNP detection will scan through the entire contig and report all the SNPs that meet the requirements.
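The idea behind a neighborhood quality filter can be sketched as follows: a base only counts as evidence when its own quality and the quality of its flanking bases are high enough, and a position is reported when enough qualifying reads disagree with the reference. This is a simplified illustration, not the CLC Genomics Workbench implementation. The function names, the ungapped read format, and all threshold values are assumptions chosen for the example, and the limit on mismatches within the neighborhood that the NQS approach also applies is omitted.

```python
from collections import Counter, defaultdict


def passes_nqs(quals, pos, radius=5, min_central=20, min_flank=15):
    """True if the base at pos and its flanking bases meet the thresholds."""
    lo, hi = pos - radius, pos + radius
    if lo < 0 or hi >= len(quals):
        return False  # the window would run off the end of the read
    if quals[pos] < min_central:
        return False
    return all(q >= min_flank
               for i, q in enumerate(quals[lo:hi + 1], lo) if i != pos)


def candidate_snps(reference, reads, min_coverage=2, min_fraction=0.5):
    """Report positions where qualifying bases disagree with the reference.

    reads: list of (start, sequence, qualities) for ungapped alignments.
    Returns a dict mapping reference position -> variant base.
    """
    counts = defaultdict(Counter)
    for start, seq, quals in reads:
        for i, base in enumerate(seq):
            if passes_nqs(quals, i):
                counts[start + i][base] += 1
    snps = {}
    for pos, bases in sorted(counts.items()):
        total = sum(bases.values())
        base, n = bases.most_common(1)[0]
        if (total >= min_coverage and base != reference[pos]
                and n / total >= min_fraction):
            snps[pos] = base
    return snps


reference = "ACGTACGTACGTACGT"
variant = "ACGTACGGACGTACGT"      # differs from the reference at position 7
good = [30] * 16
poor = [30] * 7 + [10] + [30] * 8  # low quality at the variant position
reads = [(0, variant, good), (0, variant, good), (0, variant, poor)]
snps = candidate_snps(reference, reads)
```

In this example the third read contributes nothing, because its low-quality base falls either at the center or in the flank of every window that fits on the read.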

15) Support for integration with the CLC Bioinformatics Database.

16) CLC Genomics Workbench is fully integrated with 'CLC NGS Cell', CLC bio’s command line solution for ‘super fast assembly’ of Next Generation Sequencing data.

The command-line interface of CLC NGS Cell enables the functionalities to be included in scripts and other Next Generation Sequencing workflows.

CLC NGS Cell uses SIMD instructions to parallelize and accelerate the assembly algorithms, making the program one of the fastest Next Generation Sequencing assemblers at present.

Note: SIMD (Single Instruction, Multiple Data) is a technique employed to achieve data level parallelism, as in a vector processor.
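To give a feel for data-level parallelism, the sketch below packs each base into two bits of one integer and compares two sequences with a single XOR, so that all positions are processed by the same operation at once. This is only an analogy written in plain Python ("SIMD within a register"), not real SIMD instructions and not how CLC NGS Cell is implemented. The encoding and function names are assumptions made for the example, and the sequences must be non-empty and of equal length.

```python
# Two-bit code per base (an assumed encoding for this example).
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}


def pack(seq):
    """Pack a DNA string into one integer, two bits per base."""
    word = 0
    for base in seq:
        word = (word << 2) | CODE[base]
    return word


def mismatches(a, b):
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    # One XOR compares every two-bit lane at the same time.
    diff = pack(a) ^ pack(b)
    # Fold each lane onto its low bit: set if either bit of the lane differs.
    low_bits = int("01" * len(a), 2)
    per_base = (diff | (diff >> 1)) & low_bits
    return bin(per_base).count("1")


n = mismatches("ACGTACGT", "ACGTTCGA")
```

Hardware SIMD applies the same principle with fixed-width vector registers, which is what makes it attractive for alignment-heavy work such as assembly.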

System Requirements

CLC Genomics Workbench is available on Windows, Mac OS X, and Linux platforms.

Manufacturer

Manufacturer Web Site CLC Genomics Workbench

Price Academic license US $4,995; Industrial license US $9,990. VAT number: DK 28 30 50 87

G6G Abstract Number 20277

G6G Manufacturer Number 100520