RESEARCH AND INNOVATION AT THE UNIVERSITY OF TORONTO
Summer 2012 · VOL.14, NO.1
Your life brought to you by...
computers
The unseen force that is modern-day supercomputing
by Jenny Hall

Did you drive to work? Take public transit?
Traffic lights are controlled by high performance computers, using information gathered from sensors and fibre optic cables throughout the city. Even your car itself is the product of computing. Today, thanks to computer modeling, cars are much more efficient — and they’re full of microprocessors. If you left the car at home and took public transit, the vehicles that carried you to work were scheduled, and in the case of some trains operated, by complex computer systems.

Did you check on the mutual funds in your RRSP?
Whether you manage your investments yourself or you have a portfolio of products like mutual funds, someone is modeling what should be bought and what should be sold. In fact, the financial world is one of the biggest users of large computing systems.

Did you take any prescription medication?
Computers are used to model how molecules behave and how they interact with enzymes and proteins to make new drugs.

Did you listen to the weather forecast?
Weather prediction is done on high performance computing systems. And it’s getting better and better, thanks to faster machines. In the early days of high performance computing, forecasts could be done for the day ahead — but it took five days to run the calculation, rendering the resulting “forecast” obsolete.

Did you stop at the grocery store on the way home?
Complex inventory control systems told the company that a person like you was going to be interested in buying exactly what you bought — and so there it was, on the shelf at your local store, waiting for you.


Computers are everywhere.

Yes, you’re probably thinking. I have several myself. My kid has a handheld gaming device. And my phone is a computer of sorts. Out in the world, I see computers everywhere: at stores, in the office, when I go to the doctor.

And personal computing is only becoming more widespread, as the technology becomes easier to access. The BBC reported in 2010 that there were 5 billion mobile phone connections. Microsoft estimates that a billion of them are smart phones.

But there are other computers, ones you don’t see, that arguably have a bigger impact on your life. Big computers. Invisible computers. And they’re everywhere. Housed in nondescript buildings you’d never take any notice of, they make much of life as we know it possible. They’re the unseen infrastructure of our lives.

A group of scientists from the University of Southern California and the Open University of Catalonia looked at more than 1,000 sources, including UN statistics, and reported in Science in February 2011 that the fraction of the world’s data that was stored digitally in 2007 was 94 per cent. Back in 1986, the corresponding number was 0.8 per cent.

We’re responsible for producing much of the world’s digital information. We take digital photos, we post to Facebook. But we don’t have such a direct relationship with most of it. It’s not something we see or think much about. But it, along with the systems that store and analyze it, makes life as we know it possible. That’s what this issue of Edge is about.

Professor Emeritus Calvin Gotlieb of the Department of Computer Science says, “I’m not going to say we need computers every minute of our lives in order to be safe or happy. But the civilization we have built wouldn’t be what it is without computers.”

Recall the Science data: the fraction of the world’s data that was stored digitally in 1986 was 0.8 per cent. By 2007, it had risen to 94 per cent.

Where is all this data coming from?

“We’ve wired the world,” says Dr. Chris Loken, Chief Technical Officer at SciNet, the consortium that runs Canada’s largest supercomputer, housed at U of T. “Think of the stuff that Google and Facebook are storing. If you make a query to Google, it replies in no time. Before you’re even done typing it, it’s already filling in what you want. It knows the whole Internet.” All that data is housed somewhere.

Or, he says, “Imagine that you’re driving down the street. So are lots of other people. Cell phones can be used to track this movement. If you were to add everyone’s data together for the entire city, you’d know exactly how traffic is right now. You could start routing it as needed. But it’s a huge data problem because something has to do all that analysis in real time.”
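
To make that concrete, here is a minimal Python sketch of the aggregation step Loken describes: pooling anonymized speed reports into a live picture of traffic, segment by segment. The road names and numbers are invented for illustration.

```python
from collections import defaultdict

def average_speeds(pings):
    """Average reported speed per road segment from anonymized
    (segment, speed_kmh) pings -- a crude live traffic picture."""
    totals = defaultdict(lambda: [0.0, 0])  # segment -> [speed sum, count]
    for segment, speed in pings:
        totals[segment][0] += speed
        totals[segment][1] += 1
    return {seg: total / count for seg, (total, count) in totals.items()}

# Invented pings from a handful of phones on two road segments.
pings = [("King St E", 22.0), ("King St E", 18.0), ("DVP southbound", 95.0)]
print(average_speeds(pings))  # {'King St E': 20.0, 'DVP southbound': 95.0}
```

At city scale, the same computation has to run continuously over millions of pings, which is exactly the real-time data problem Loken is pointing to.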

Researchers face the same problems. “The Human Genome Project took 10 years to sequence the human genome,” says Dr. Daniel Gruner, Chief Technical Officer, Software, at SciNet. “Today, you can analyze a genome in a day and end up with 200 gigabytes of data. You have to store that data and process it.”

Much of the data collected by large-scale scientific experiments, he says, is lost, because there’s just too much of it. “If you look at the big telescopes, for example, 10 per cent or less of the data is actually kept and looked at.”

The list could go on: Imagine the computing power required by large research projects that analyze genetic interactions in the genomes of large populations, looking for clues to diseases like cancer.

Or think of the Large Hadron Collider, the particle accelerator scientists used to discover the Higgs boson. Its 150 million sensors generate 700 megabytes of data per second — enough to fill 100,000 DVDs every year.

The result is what’s often called “Big Data.” The much-cited analogy is that dealing with all this data is like trying to take a drink of water from a fire hose. It has the potential to lead us to huge scientific breakthroughs, but it’s hard to analyze, visualize and store. It’s one of the central problems of research today — and solving it will be about building bigger, smarter computing systems.

The computers we own have changed our lives. Most of us understand that. What’s less obvious is how big computing systems have changed our lived experience. We don’t realize that, as Gruner says, “we interact with supercomputers every day of our lives.”

Calvin Gotlieb, the “father of computing in Canada”

Calvin Gotlieb, professor emeritus in the Department of Computer Science, is widely acknowledged as the father of computing in Canada. His computing career began when he volunteered as a consultant to the British navy in World War II, helping design bombs — early computers helped predict their trajectories.

After the war, universities, including U of T, started designing their own computers. But in 1951, “the opportunity came up to buy what turned out to be the second computer ever sold in the world. It was a machine the British government had ordered for their atomic energy group, but when a new Labour government came in, they cancelled every order over $100,000, so the computer suddenly came on the market.”

U of T bought it for $300,000. It filled a room and had 3,200 bytes of disk storage.

Gotlieb went on to great acclaim and today, at 91, is still an active scholar and teacher. A member of the Order of Canada and a fellow of the Royal Society of Canada, he is well known for his work on databases and on social issues related to computing. His many accomplishments include creating simulations of computer-controlled traffic lights that led to Toronto adopting the first such system in the world; participating in a 1969 United Nations panel on how computers could help developing countries; and writing the report that led to Canada’s first privacy legislation.

Of the massive technological change he has witnessed, he says, “People always ask me, ‘did you expect this to happen?’ No! No one ever could have predicted that computers would replace typewriters. The computer we bought filled a huge room, had 10,000 vacuum tubes and its own team of maintenance engineers. Who ever thought that computers would be small enough and cheap enough so that instead of buying a typewriter, you’d rather have a computer or a phone?”

Will a computer save your life someday?
Fritz Roth is using bioinformatics to untangle genetic data
by Patchen Barss

Some scientists call them “ridiculograms.” Others use the term “hairballs.”

They are scientific diagrams that contain important information, but that are so complex that no human being could decipher their secrets.

Hairballs often turn up in the field of genetics, especially among researchers who study genetic interactions. The human genome has between 20,000 and 25,000 genes. That’s complicated enough, but many hereditary characteristics are caused not by a single gene, but by two, or 20 or 200 mutations conspiring together.

These relationships can change how individual genes affect an organism. What’s more, any given gene can exhibit thousands of possible characteristics and functions. Researchers create genetic network maps to document the nature and intensity of these interactions.

But the data is so complex, nuanced and interrelated that the sheer volume quickly becomes unmanageable. The maps become hairballs, which is a problem, given that genetic interactions can make the difference between health and sickness or even life and death. As mutations change how genes behave, they alter the likelihood of their owner falling prey to hereditary conditions like heart disease, autism or many cancers.

Enter Fritz Roth and his research team.

“One of the obsessions in my lab has been with capturing shades of grey in what we know about genes,” says Roth, who is appointed in U of T’s Banting and Best Department of Medical Research and holds the Canada Excellence Research Chair in Integrative Biology. He specializes in bioinformatics research, solving problems of previously intractable complexity by means of powerful computers that do sophisticated analysis.

Much of his research uses “model organisms” — simple forms of life like yeast, fruit flies, fish or worms. These organisms share many genes — and many gene interactions — with human beings. That allows researchers to study gene mutations that conspire to, for example, damage a yeast cell, and draw conclusions about where to look for parallels in a human being.

Roth’s computers access huge — and growing — databases containing the combined results of many lifetimes of research into how genes behave in different species, what other genes they interact with and what kind of problems they cause when they get together. Bioinformatics turns hairballs into manageable information that can lead to the genes that, when mutated, perpetrate some of the most insidious attacks on human health.

Roth likens it to a detective story where illness is the crime and the genetic suspects rarely work alone.

“We have facts about a gene, and we have facts about its relationships with other genes,” Roth says. “Sometimes, there’s guilt by association: genes expressed in heart muscle are more likely to be involved in heart disease. Genes that regulate cell division might be implicated in causing cancer.”

Roth’s work isn’t quite like that of a police officer combing the databases for likely suspects. In his world, the computer itself is the detective, analyzing patterns, identifying suspicious-looking behaviour and shady acquaintances. The computer does the analysis and tells the researchers where they might best hunt for their culprits.
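
As a rough illustration of “guilt by association,” here is a toy Python sketch that scores each candidate gene by the fraction of its interaction partners already linked to a disease. The genes, edges and disease list are all invented, and real bioinformatics pipelines draw on far richer evidence and machine learning.

```python
# A toy gene-interaction network: gene -> set of interaction partners.
# Every gene name and edge here is invented for illustration.
network = {
    "GENE_A": {"GENE_B", "GENE_C", "GENE_D"},
    "GENE_B": {"GENE_A", "GENE_C"},
    "GENE_C": {"GENE_A", "GENE_B", "GENE_E"},
    "GENE_D": {"GENE_A"},
    "GENE_E": {"GENE_C"},
}
known_disease_genes = {"GENE_B", "GENE_C"}

def association_score(gene):
    """Fraction of a gene's interaction partners already implicated
    in the disease -- 'guilt by association'."""
    partners = network[gene]
    return len(partners & known_disease_genes) / len(partners)

candidates = set(network) - known_disease_genes
for gene in sorted(candidates, key=association_score, reverse=True):
    print(gene, round(association_score(gene), 2))
# GENE_E scores 1.0 (its only partner is disease-linked);
# GENE_A scores 0.67 (two of its three partners are).
```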

Bioinformatics arguably got its start with rudimentary databases created in the 1950s. Just in the past decade, though, a massive increase in computing power, a flood of new data and major advances in machine learning have transformed the field. As researchers like Roth turn hairball gene maps into solvable mysteries, scientists find themselves on the brink of a new level of sophistication in their understanding of how genes work together to make us who we are.


One pandemic. 5 million people. Now what?
Dionne Aleman’s pandemic management model is built on hundreds of factors — and a big computer
by Paul Fraumeni

Think back to 2009, when we were told that a new strain of flu — called H1N1 — was about to besiege the population of the Greater Toronto Area (GTA).

“Get vaccinated,” said the authorities. And then the line-ups began and you could feel the tension throughout the GTA.

Fortunately, H1N1 didn’t turn out to be the massive killer that was feared. But it — like the SARS outbreak of 2002–2003 — reinforced the need for public health agencies to find ways to help large populations cope with a pandemic. And, as we are seeing in so many aspects of modern living, computing is playing a huge role.

“We could plan for a pandemic without computing, but it would be wildly impractical,” says Dionne Aleman, a professor in Mechanical and Industrial Engineering. “It would be like flipping a coin to determine how a disease would spread. It could take two days of flipping coins over and over to simulate a day of activity.”

Aleman and her research team in the Medical Operations Research Lab are far past the coin-flipping methodology. The lab is in the final stages of building a computer model that will be of vital assistance to Ontario public health agencies.

How? “The simple answer is that we ask and answer as many ‘What if?’ questions as possible,” says Aleman. “That may sound simplistic, but it takes a lot of computational effort to make it happen.”

In building the model, Aleman’s team focuses on the GTA’s five million citizens. Each citizen is considered to be “an individual agent who has various properties, such as age, gender, where they live, do they ride public transit, what is their route, where do they work or go to school?” They get this information from sources such as the TTC, Statistics Canada and census studies.

The goal is to model whether — and how — these agents will come into contact with each other, thus spreading the disease at the core of the pandemic. “Each agent is like a block of memory. So to model the GTA, we need a computer with a lot of memory. Every time one agent has contact with another, that’s a computational effort that has to be made. We keep adding up these contacts and at the end of a simulated day, we want to see what’s the probability of each person being infected tomorrow.”
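
Here is a minimal sketch of that agent-based idea, with every number invented for illustration; the real model tracks far more detail per agent and runs over five million of them.

```python
import random

random.seed(1)

class Agent:
    """One citizen. The real model stores age, gender, home, transit
    route, school or workplace; this toy version keeps one flag."""
    def __init__(self):
        self.infected = False

def simulate_day(agents, contacts_per_day=10, p_transmit=0.05):
    """Tally each agent's contacts for the day, convert them into a
    probability of being infected tomorrow, then realize it."""
    p_tomorrow = []
    for agent in agents:
        if agent.infected:
            p_tomorrow.append(1.0)
            continue
        p_escape = 1.0  # chance of escaping infection across all contacts
        for _ in range(contacts_per_day):
            if random.choice(agents).infected:
                p_escape *= 1.0 - p_transmit
        p_tomorrow.append(1.0 - p_escape)
    for agent, p in zip(agents, p_tomorrow):
        agent.infected = agent.infected or random.random() < p
    return sum(a.infected for a in agents)

agents = [Agent() for _ in range(10_000)]  # the real model holds 5 million
agents[0].infected = True                  # a single index case
for _ in range(30):
    infected = simulate_day(agents)
print(f"infected after 30 simulated days: {infected}")
```

The memory point becomes obvious at scale: five million agents, each carrying demographic and travel data, plus tens of millions of contact events per simulated day, are what push this kind of model onto a dedicated cluster.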

The technology the Aleman team uses is a small cluster of 256 processors, wired together in a dedicated room. “With the level of detail we have in our model, it takes about one to two seconds to simulate one day. We can simulate about 60 days in just a few minutes. This is great, because planning for a pandemic is all about probabilities.”

So, the researchers can introduce new information — as they are doing now in adding statistics about the number of children in various Toronto school systems — and then have the computer assess the probabilities of how disease will spread because of these additions.

One of the key questions Ontario public health officials asked the team to analyze is how much the speed of vaccination matters in halting a pandemic. “We targeted 60 per cent of the population and asked if it mattered if those people were vaccinated over 10 weeks or 15 weeks. Then we ran a simulation and we found that, yes, if people got vaccinated earlier, it would stop the spread of disease much more quickly.”
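
The rollout question can be illustrated with an even simpler sketch: a toy SIR-style model in Python with a steady vaccination rate, comparing a 10-week and a 15-week campaign. All rates are invented and the model is far cruder than Aleman’s agent-based one, but it shows the same qualitative effect: the faster rollout leaves fewer people ever infected.

```python
def total_infected(rollout_weeks, coverage=0.60, beta=0.25, gamma=0.10, days=365):
    """Toy SIR epidemic with a steady vaccination rollout. Every rate
    here is invented for illustration, not drawn from Aleman's model."""
    s, i = 0.999, 0.001              # susceptible and infected fractions
    vaccinated, cumulative = 0.0, 0.001
    daily_vax = coverage / (rollout_weeks * 7)
    for _ in range(days):
        vax = min(daily_vax, s) if vaccinated < coverage else 0.0
        new_cases = beta * s * i     # new infections today
        s = max(s - new_cases - vax, 0.0)
        i += new_cases - gamma * i   # infections recover at rate gamma
        vaccinated += vax
        cumulative += new_cases
    return cumulative

for weeks in (10, 15):
    print(f"{weeks}-week rollout: {total_infected(weeks):.1%} ever infected")
```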

Do the researchers have to be talented at using computers?

“You have to know how to program and code. There’s definitely room for growth at making supercomputing more accessible. I wonder why there isn’t more of a push to bring more people into this level of computing. It doesn’t have to be as easy as using an iPad, but there’s too much of a gap now. So that should be the next big step — enabling researchers to become more adept at actually using high performance computing.”

Changing lives in the NICU
Andrew James and the Artemis Project are analyzing big data to help the smallest patients
by Jenny Hall

The neonatal intensive care unit (NICU) at Toronto’s Hospital for Sick Children (SickKids) is home to 32 babies.

While the hospital’s littlest patients battle the complications of prematurity — problems with breathing, heart function, gastrointestinal function and eyes — a team of first-rate nurses and physicians looks after them.

Could a computer help them do a better job?

Quite possibly, says Dr. Andrew James, a professor of paediatrics at the University of Toronto and a neonatologist at SickKids. He’s working with collaborators on the Artemis Project, which aims to capture and make better use of the enormous volume of physiological data collected about babies in intensive care.

“A critically ill baby in the NICU usually has heart rate, respiratory rate and blood oxygen saturation continuously monitored by a bedside medical device,” James says. Doctors and nurses can watch the data as it’s being monitored, and in paper-based units, it’s usually recorded once an hour.

“But the data is available by the second,” explains James. “So most of it gets lost.” But even if more data were being captured, no human being could ever make sense of it all. There’s simply too much and it’s too complex.

Enter Artemis, which is being tested at SickKids and a few other hospitals around the world. It captures data at a very high frequency — 1,259 recordings per second for each baby — and streams it through the hospital’s network into the Artemis framework. The team is currently analyzing the results of the first phase of a clinical research study into earlier detection of bloodstream infections occurring after admission to the NICU and expects the results to be available by the end of the year.

The overall goal of the project is a clinical decision support system that would provide regular reports to nurses and doctors, allow them to make queries and, ultimately, provide alerts when something is about to go wrong.

For example, one of the biggest risks preterm babies face is infection. “To provide intensive care we use invasive technologies,” says James. “We have tubes into these babies’ lungs. We run catheters into their veins.” For the past decade or so, doctors have known that changes in a baby’s heart rate variability can signal that she might be developing an infection. These subtle heart rate changes can occur up to 24 hours before she actually becomes ill. Artemis could detect them and provide an alert to the baby’s caregivers.
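
A hypothetical sketch of what such an alert could look like, assuming a stream of inter-beat (RR) intervals: the window size, threshold and data below are invented, and Artemis’s actual analytics are far more sophisticated.

```python
import random
from collections import deque
from statistics import stdev

def hrv_alerts(rr_intervals_ms, window=50, threshold_ms=10.0):
    """Yield beat numbers where heart-rate variability (the standard
    deviation of recent inter-beat intervals) drops below a threshold,
    the kind of subtle change that can precede infection."""
    recent = deque(maxlen=window)
    for beat, rr in enumerate(rr_intervals_ms):
        recent.append(rr)
        if len(recent) == window and stdev(recent) < threshold_ms:
            yield beat

# Invented stream: healthy variability, then an abnormally flat stretch.
random.seed(0)
stream = [random.gauss(400, 20) for _ in range(200)]  # sd ~20 ms: normal
stream += [random.gauss(400, 3) for _ in range(200)]  # sd ~3 ms: warning sign
alerts = list(hrv_alerts(stream))
print(f"first alert at beat {alerts[0]}" if alerts else "no alerts")
```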

Artemis was the brainchild of Dr. Carolyn McGregor, a computer scientist and Canada Research Chair in Health Informatics at the University of Ontario Institute of Technology. After finding herself in the NICU as a mother of a preterm baby, she switched her focus from business processes to health informatics. IBM provided researchers to work on the project and access to software that was still in development.

The team hopes to deploy Artemis across the 30 hospitals that are members of the Canadian Neonatal Network. James estimates that’s seven to eight years off.

Life in the NICU can be stressful, says James. “Our goal is to deliver meaningful, high-quality information to clinicians. We have a lot of babies who have such complex problems that we can’t resolve them. We can do some good, but there’s a lot more we’d like to be able to do.”

A message from the Vice-President, Research and Innovation

From the wheel to the computer
Professor R. Paul Young

Is computing essential? After all, human civilization grew and diversified and did rather well before the modern computer came along about 75 years ago. We learned to walk. We invented the wheel, language, medicine and agriculture. We learned to control fire. We developed religion and the great philosophies by which we live today. We built the pyramids of Egypt and the Taj Mahal, created boats and explored every corner of the planet. And hundreds of thousands of writers, artists and musicians have given us the foundations of art and culture.

And all this was done without the computer.

So, I ask again, is computing essential? Does it deserve all the attention that it gets?

Of course it does.

Human development has been marked by many milestones, but at the core of our history is progress that helps people live better lives. Progress needs tools. The wheel was a tool that opened up a new realm of possibilities for civilization. I think the computer has taken on the same importance as the wheel. Computing is, without question, a true enabler.

Health care, transportation planning and management, banking, pharmaceutical development, crime prevention and detection, environmental protection and, of course, how we communicate with each other are only a few of the areas that have been revolutionized by computers.

Then there’s the “What’s next?” aspect. What is amazing today in computing will be outdone by new innovations that will make the impact of this incredible force even greater. And with supercomputers, the pace of change gets faster still.

University research is central to the development of computing. And the good that our research does — in all fields — is made even more profound by one of the most powerful tools ever created.


Professor R. Paul Young, PhD, FRSC
Vice-President, Research and Innovation


