Volume 9, Number 11 February 8, 2002

New supercomputer tackles biology’s big questions

GNOME and friend, Prof. Anthony Kusalik.  

Photo by Colleen MacPherson

By Michael Robin

Despite its cuddly name, GNOME is a rather imposing presence. Caged in two black cabinets two metres high, it sits at the heart of the university’s new Bioinformatics & Computational Biology Research Laboratory.

GNOME is a ‘Beowulf cluster’, a parallel-processing supercomputer that arrived on campus last fall. Like the hero of the ancient epic, it too has monsters to slay: enormously difficult calculations and immense data sets. In the emerging field of bioinformatics, these problems are legion.

“One, the amount of data is growing so rapidly and two, a lot of the problems are inherently extremely difficult,” says Prof. Anthony Kusalik, director of the project.

To illustrate, he uses a typical problem brought forward by genomics researchers: multiple sequence alignment, in which two or more samples of DNA are compared to find areas of similarity. The assumption is that if these areas exist in different species, they are being conserved by nature and are therefore significant.
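For readers who want to see what "comparing samples" means in practice, here is a minimal sketch of the simplest case, an alignment of just two sequences. It uses the textbook Needleman-Wunsch dynamic-programming method; the article does not say which algorithms the lab runs, and the scoring values (match, mismatch, gap) and the function name are illustrative assumptions.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of two sequences (Needleman-Wunsch).

    The scoring values are illustrative assumptions, not taken from the article.
    """
    # Row 0: aligning an empty prefix of `a` against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against an empty prefix of `b` costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            # Either pair the two letters, or insert a gap in one sequence.
            pair = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap
            gap_in_a = curr[j - 1] + gap
            curr.append(max(pair, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


# Two short, made-up DNA fragments that differ by one missing base.
print(alignment_score("ACGT", "AGT"))
```

The table this method fills has one cell for every pair of positions, so two sequences are cheap to compare. Each additional sequence adds another dimension to that table, which is why an exact alignment of many sequences becomes so costly.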

Kusalik posed a problem to his students in an exam: determine how long it could take to do a completely correct multiple sequence alignment of 50 DNA samples of 350 base pairs each.

“If you want to do it precisely and properly, it could take about 10 to the power 131 centuries to do this on a uni-processor machine that can perform the basic operation in 10 to the minus six of a second,” Kusalik says. “That’s how hard these problems are that Mother Nature’s given us.”
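The article does not give the cost model behind this figure. As a rough check, the sketch below assumes one common textbook estimate for exact multiple alignment by dynamic programming: about L^N table cells, 2^N neighbouring cells examined per cell, and N^2 pairwise comparisons per step, for N sequences of length L. The function name and the cost model are assumptions; under them the estimate comes out on the order of 10^130 centuries, the same general magnitude as the number Kusalik quotes.

```python
import math


def log10_centuries(num_seqs, seq_len, op_seconds=1e-6):
    """Base-10 exponent of the running time in centuries.

    Assumed cost model (not stated in the article):
    operations = seq_len**num_seqs * 2**num_seqs * num_seqs**2.
    Logarithms keep the arithmetic readable for numbers this large.
    """
    log_ops = (num_seqs * math.log10(seq_len)   # cells in the N-dimensional table
               + num_seqs * math.log10(2)       # neighbours examined per cell
               + 2 * math.log10(num_seqs))      # pairwise scoring per step
    log_seconds = log_ops + math.log10(op_seconds)
    seconds_per_century = 100 * 365.25 * 24 * 3600
    return log_seconds - math.log10(seconds_per_century)


# The exam problem: 50 samples of 350 base pairs, one operation per microsecond.
exponent = log10_centuries(50, 350)
print(f"roughly 10^{exponent:.0f} centuries")
```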

Often, the best that can be done is an approximation. This may miss some matches in a multiple sequence alignment, but at least an answer will be available before the passing of geological epochs. Of course, sheer muscle still matters. The faster the computer, the better the approximations, and the more useful the results. This is where GNOME comes in.

Multi-processor supercomputers can be extremely expensive. For instance, shared-memory parallel-processing units typically cost from $500,000 to $1 million. GNOME (a play on ‘genome’) came in at about $100,000. A networked cluster of 32 Pentium-III processors, each with its own memory, GNOME offers supercomputer performance at a fraction of the price.

As with any bargain solution, there is a catch. A Beowulf cluster depends heavily on its network connections, and programs written without this in mind will deliver substandard performance. Still, when programmers allow for the limitation, the results can be dramatic. Shortly after its delivery, GNOME had some free time, and the Department of Geology had a suitable problem. Even though the researchers had to spend about 10 hours preparing their data and recombining it afterward, they had their answers in three days, a saving of 12 to 15 weeks.
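Those figures imply a rough speedup, worked out in the sketch below. It assumes that "a saving of 12 to 15 weeks" means the job would otherwise have run that much longer than three days, and that all 32 processors were used. The article does not say what machine the comparison was made against, so the result is only indicative, and the function name is illustrative.

```python
def implied_speedup(weeks_saved, parallel_days=3):
    """Speedup implied by finishing in `parallel_days` and saving `weeks_saved`.

    Assumes the original run time equals the time saved plus the new run time.
    """
    original_days = weeks_saved * 7 + parallel_days
    return original_days / parallel_days


processors = 32  # cluster size given in the article
for weeks in (12, 15):
    speedup = implied_speedup(weeks)
    print(f"{weeks} weeks saved: about {speedup:.0f}x faster, "
          f"{speedup / processors:.2f}x per processor")
```

A per-processor figure near 1 is what one would hope for from a well-divided problem. A value above 1 would suggest the original estimate was for a slower machine, which the article leaves unstated.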

Kusalik sees bioinformatics as essential to life sciences research.

“If you’re going to be on the cutting edge in molecular biology today, you need a bioinformatics component in your lab,” he says. “You can’t be state of the art without it.”

As its name implies, the Bioinformatics & Computational Biology Research Laboratory spans several disciplines and claims participants right across campus. Funds for the lab were provided through a $360,000 Canada Foundation for Innovation grant, plus matching provincial government funds and money from research and industry partners.

Much of the budget has gone into high-speed network connections to move data across campus and to various partners such as the National Research Council’s Plant Biotechnology Institute (PBI) and the Agriculture & Agri-Food Canada Saskatoon Research Centre. Fibre-optic connections carry data at 1,000 megabits per second, a hundred times faster than the typical on-campus connection, and about 1,000 times faster than a high-speed home Internet connection. GNOME also has some serious storage capacity – 1,600 gigabytes, or 1.6 terabytes.
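To put those speeds in perspective, the sketch below estimates how long it would take to move GNOME's full 1.6 terabytes over each kind of link. The 10 and 1 megabit-per-second rates are derived from the article's "hundred times" and "1,000 times" comparisons. Decimal units (1 terabyte = 10^12 bytes) are assumed and protocol overhead is ignored, so real transfers would be slower.

```python
def transfer_hours(size_bytes, megabits_per_second):
    """Hours needed to move `size_bytes` at the given link speed (no overhead)."""
    bits = size_bytes * 8
    seconds = bits / (megabits_per_second * 1_000_000)
    return seconds / 3600


storage_bytes = 1.6e12  # 1.6 terabytes, assuming decimal units

links = [
    ("fibre-optic link", 1000),         # stated in the article
    ("typical on-campus link", 10),     # derived: a hundred times slower
    ("high-speed home connection", 1),  # derived: a thousand times slower
]
for label, mbps in links:
    hours = transfer_hours(storage_bytes, mbps)
    print(f"{label}: {hours:,.1f} hours ({hours / 24:,.1f} days)")
```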

Already, researchers from PBI are working with GNOME to develop tools for high-throughput genetic analysis.

“The bioinformatics cluster has already proven to be a valuable facility to the University and to PBI,” according to PBI researcher Stephen O’Hearn. “It will become even more valuable as we collaborate further with the University and develop other cutting-edge, high-throughput software.”

Prof. Bruce Waygood, University Co-ordinator of Health Research, says it’s essential for the university to have a bioinformatics program, or risk falling behind. Other Canadian universities are aggressively hiring faculty in this area.

“In a little while, there’s going to be a huge demand for graduates both in academia, industry, and otherwise,” he says. “It’s exciting intellectually as well.”

The U of S bioinformatics program will be turning out an unusual breed of professional, one comfortable with both computational and biological sciences. Applications in the health sciences such as cardiology, oncology, and neurology are just beginning to be explored. Plus, a huge new area, structural genomics, awaits. This area of inquiry examines which genes make which proteins, and how they interact in three-dimensional space. Research planned for the Canadian Light Source will be looking at some of these questions.

“Genes, proteins, biological populations and ecology – all of this is part of bioinformatics,” Waygood says. “The program on this campus has just started. I think it’s just the first of several programs that will evolve as people bring in different expertises. Bioinformatics is an evolution.”


For more information, contact communications.office@usask.ca

