BIRCH (Biological Research Computing Hierarchy)

Category Cross-Omics>Workflow Knowledge Bases/Systems/Tools

Abstract BIRCH (Biological Research Computing Hierarchy) is an organizational framework for delivering bioinformatics resources to a user group, scaling from a single lab to a large institution.

The BIRCH core distribution includes many popular bioinformatics programs, unified within the bioLegato graphic interface (see below).

Of equal importance, BIRCH provides the system administrator with tools that simplify the job of managing a multi-user bioinformatics system across different platforms and operating systems.

These include tools for integrating locally-installed programs and databases into BIRCH, and for customizing the local BIRCH system to meet the needs of the user base.

BIRCH can also act as a front end that provides a unified view of existing collections of bioinformatics software.

Documentation for BIRCH and for locally added programs is merged into a hierarchical set of web pages. In addition to manual pages for individual programs, BIRCH tutorials employ step-by-step examples, with screenshots and sample files, to illustrate both the theoretical and the practical considerations behind complex analytical tasks.

Because of its network-centric design, BIRCH makes it possible for any user to do any task from anywhere.

The user's perspective --

A seamless view of the software -

The BIRCH core distribution comes with a wide range of commonly used software packages pre-configured and ready to run. These include the NCBI tools network BLAST, Cn3D, and Sequin, as well as FASTA, PHYLIP, TCOFFEE, and Taverna (see G6G Abstract Number 20514).

All programs can be run from the command line, and most can also be launched from bioLegato.

bioLegato - bioLegato is best thought of as a program that runs other programs. bioLegato is the primary ‘graphic interface’ for launching programs in BIRCH.

For each program, bioLegato provides a menu that lets you set program parameters, launch the program, and view the output. bioLegato takes care of all of the background details, such as translating files from one format into another.

In many cases, output also goes to a new bioLegato window, making it possible to do ad hoc pipelining (workflow processing). This is one of the most advanced aspects of bioLegato, because it means that you can usually run additional programs using the output of one program as the input of the next.
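The pipelining idea can be illustrated with a minimal Python sketch. bioLegato itself does this interactively through its windows; the function below only shows the underlying principle of feeding one program's output to the next. The name run_pipeline is hypothetical and is Not part of BIRCH or bioLegato.

```python
import subprocess

def run_pipeline(commands, data):
    """Run each command in turn, passing the previous output in on stdin.

    `commands` is a list of argument lists, one per program.
    `data` is the text handed to the first program.
    Hypothetical helper for illustration only; Not a bioLegato API.
    """
    for argv in commands:
        result = subprocess.run(
            argv,
            input=data,           # output of the previous step
            capture_output=True,  # collect stdout for the next step
            text=True,
            check=True,           # stop the chain if a program fails
        )
        data = result.stdout
    return data
```

Each entry in commands would be the command line of one analysis program, so a chain such as "align, then build a tree" becomes a two-element list.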

There are four (4) bioLegato interfaces:

1) biolegato - DNA/protein sequence data;

2) dbiolegato - lists of data items (currently de-implemented, pending revision);

3) mbiolegato - molecular marker data; and

4) tbiolegato - phylogenetic tree data.

Any user can do anything from anywhere --

BIRCH is scalable from a single workstation to a server cluster. By choosing binaries and libraries at login, BIRCH makes it transparent to the user which platform they are actually using. Most importantly, a UNIX graphic desktop can be redirected from the server to be displayed anywhere.

Local customization of a BIRCH system --

No software package does everything, and each lab, department, or institution has different needs.

BIRCH has numerous mechanisms for adding programs and documentation that are Not part of the BIRCH core, and for customizing an installation to take advantage of the strengths of the local UNIX/Linux system and to work around problems specific to it.

The $BIRCH/local directory --

BIRCH is downloaded as a hierarchical directory structure which is usually installed in the $HOME directory of an account specifically used for administering BIRCH. This directory is referred to by the $BIRCH environment variable.

Local customization is made possible through the $BIRCH/local directory. Analogous to /usr/local in UNIX, $BIRCH/local is a part of the system that does Not change when an updated version of BIRCH is installed.

During an update, the birchconfig install wizard automatically merges programs, documentation, and settings from $BIRCH/local into the new version of BIRCH.
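The property that $BIRCH/local survives an update can be sketched in Python as follows. This is Not the birchconfig implementation, and it does Not model the merge step; it only illustrates replacing the core tree while leaving one directory untouched. The function name update_core and the directory layout are assumptions made for the example.

```python
import shutil
from pathlib import Path

def update_core(birch_root: Path, new_release: Path, keep: str = "local") -> None:
    """Replace the contents of birch_root with new_release, except `keep`.

    Hypothetical illustration of a /usr/local-style protected directory;
    Not the actual birchconfig logic.
    """
    # Remove the old core, skipping the protected directory.
    for entry in birch_root.iterdir():
        if entry.name == keep:
            continue  # site-specific content survives the update
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    # Copy in the new core, never overwriting the protected directory.
    for entry in new_release.iterdir():
        if entry.name == keep:
            continue
        target = birch_root / entry.name
        if entry.is_dir():
            shutil.copytree(entry, target)
        else:
            shutil.copy2(entry, target)
```

Keeping all site-specific material under one directory name is what makes the update step this simple: the updater needs to know only that one name.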

Working in a heterogeneous computing platform --

BIRCH is designed to work in a heterogeneous operating environment consisting of workstations and hosts with different operating system/hardware platforms.

At login, BIRCH determines the Operating System (OS)/hardware platform. Depending on the platform, BIRCH then chooses binaries and libraries appropriate for that system. The BIRCH implementation of bioLegato can also handle cases in which a program is Not available on all platforms.
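A minimal Python sketch of platform-dependent selection is given below. BIRCH performs this step in its login scripts; the tag format, the bin-/lib- directory names, and all function names here are assumptions chosen for illustration, Not the actual BIRCH layout.

```python
import platform

def platform_tag(system=None, machine=None):
    """Build a tag such as 'linux-x86_64' from the OS name and CPU type.

    With no arguments, the values are read from the running host.
    """
    system = (system or platform.system()).lower()
    machine = (machine or platform.machine()).lower()
    return f"{system}-{machine}"

def select_paths(birch_root, tag):
    """Return per-platform binary and library directories (hypothetical layout)."""
    return {
        "bin": f"{birch_root}/bin-{tag}",
        "lib": f"{birch_root}/lib-{tag}",
    }

def is_available(program, tag, availability):
    """Report whether `program` has a build for the platform `tag`.

    `availability` maps program names to sets of platform tags.
    """
    return tag in availability.get(program, set())
```

A launcher could call is_available before showing a menu item, which is one way to handle a program that exists on some platforms but Not others.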

In a heterogeneous system, some hosts may have single CPUs and others multiple CPUs. At login BIRCH sets environment variables specifying whether or Not threaded applications can take advantage of multiple CPUs.
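The CPU check can be sketched in Python as follows. The variable names EXAMPLE_MULTITHREADED and EXAMPLE_NCPUS are placeholders invented for this example; the names BIRCH actually sets are Not given here.

```python
import os

def threading_env(cpu_count=None):
    """Return environment settings describing multi-CPU support.

    The variable names are hypothetical placeholders, Not BIRCH's own.
    """
    if cpu_count is None:
        # os.cpu_count() may return None when the count is unknown.
        cpu_count = os.cpu_count() or 1
    return {
        "EXAMPLE_MULTITHREADED": "1" if cpu_count > 1 else "0",
        "EXAMPLE_NCPUS": str(cpu_count),
    }
```

A login script could pass the returned dictionary to os.environ.update so that threaded applications launched later can read the settings.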

A single view of all documentation and data files --

One of the problems facing users on a system with many bioinformatics packages is that documentation is often scattered across many locations on the system. The software included with BIRCH is from a wide variety of authors, and documentation is written in different styles (e.g. UNIX manual pages, tutorials, user's guides), and in many formats (e.g. PDF, HTML, text).

To make documentation easy to find, documentation for the core BIRCH programs is catalogued in the birchdb database, and documentation for locally-installed programs is catalogued in the lbirchdb database. Both databases are implemented using ACeDB, a small database engine which includes an easy-to-use graphic interface.

When BIRCH is installed or updated, the contents of both databases are merged, and a hierarchical set of web documentation pages is generated, including programs listed by category, programs listed by package, and a program index.
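The merge-and-index step can be sketched in Python with plain dictionaries standing in for the two databases. This does Not reflect the ACeDB schema or the real page generator; the record fields ("package", "category") and the rule that a local entry replaces a core entry of the same name are assumptions made for the example.

```python
from collections import defaultdict

def merge_catalogs(core, local):
    """Combine core and local program records into one catalog.

    Assumption: on a name clash the local record wins.
    """
    merged = dict(core)
    merged.update(local)
    return merged

def build_indexes(catalog):
    """Return (programs by category, programs by package, program index)."""
    by_category = defaultdict(list)
    by_package = defaultdict(list)
    for name in sorted(catalog):
        record = catalog[name]
        by_category[record["category"]].append(name)
        by_package[record["package"]].append(name)
    return dict(by_category), dict(by_package), sorted(catalog)
```

The three return values correspond to the three listings described above; each could then be rendered as one web page.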

For each program a separate web page is generated, listing the name and short description of the program, information on how to launch the program, links to documentation and sample data files, a listing of OS/hardware platforms for which the program is available, as well as a link to the web page describing the package to which the program belongs.

The user doesn't care whether programs are part of the BIRCH core, or are locally-installed. One of the goals of BIRCH is to make the documentation web pages appear as if they were written specifically for the local BIRCH site. This is particularly useful because most first-time BIRCH users will also be using UNIX for the first time.

Rather than giving the user a generic set of web pages, the BIRCH documentation pages have sections earmarked for system-specific information, such as how to log in or which desktops are available.

During installation and updating, these sections of the BIRCH web pages are replaced with local content.
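Replacing earmarked sections can be sketched in Python as follows. The comment-style markers (LOCAL:name) are invented for this example; how BIRCH actually marks these sections is Not specified here.

```python
import re

# Hypothetical marker pair: <!-- LOCAL:name --> ... <!-- /LOCAL:name -->
_SECTION = re.compile(r"<!-- LOCAL:(\w+) -->.*?<!-- /LOCAL:\1 -->", re.DOTALL)

def fill_local_sections(page, local_content):
    """Replace each earmarked section of `page` with site-specific text.

    `local_content` maps section names to replacement text. Sections
    with no local text are left unchanged.
    """
    def substitute(match):
        name = match.group(1)
        if name not in local_content:
            return match.group(0)
        # Keep the markers so the next update can replace the section again.
        return f"<!-- LOCAL:{name} -->{local_content[name]}<!-- /LOCAL:{name} -->"
    return _SECTION.sub(substitute, page)
```

Because the markers are kept in the output, the same function can be run again at the next update without losing track of where the local sections are.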

Simplifying BIRCH system administration --

BIRCH provides the system administrator with an organizational framework and tools that keep programs and documentation easily accessible to users. Because startup scripts are read from a central location, the user never needs to perform configuration steps when new software or databases are installed.

By the same token, installation and updating of a BIRCH site is automated by birchconfig, the BIRCH install wizard.

The BIRCH Administrator's Guide spans numerous topics, including customization of the BIRCH web site, managing systems with multiple servers or operating platforms, installing and merging third-party applications into BIRCH, and setting default applications for viewing and displaying data.

BIRCH tries to minimize the skill set needed for being a BIRCH administrator. Where a computer specialist is Not available, a biologist with basic knowledge of perhaps 20 of the most common UNIX commands, some knowledge of how to write and edit web pages, and some knowledge of ‘shell scripting’ should be able to install and update a BIRCH system for a lab or working group.

Minimizing this skill set has guided the design of BIRCH. Recognizing that tutorials are as important for the system administrator as they are for the user, the BIRCH Administrator's Guide covers all aspects of local customization and addition of new programs with step-by-step instructions, illustrated with screenshots.

System Requirements

Web-based.

Manufacturer

Manufacturer Web Site BIRCH

Price Contact manufacturer.

G6G Abstract Number 20530

G6G Manufacturer Number 104147