Ashley Washburn, October 28, 2016
New doctoral program links life sciences with big data
A stalk of corn contains more genes than a human being. A handful of the soil anchoring that stalk contains more microorganisms than the worldwide population of people depending on the food it nurtures.
The potential to overcome drought while feeding an ever-growing population may reside in those genes and microbes. But their dizzying complexity, diversity and sheer numbers make studying them – and the ways they interact – a challenge that demands computing power and statistical analyses on a massive scale.
A newly approved doctoral program at the University of Nebraska-Lincoln has assembled more than 100 faculty from across four colleges to train future life scientists in the acquisition, evaluation and analysis of this so-called big data. In doing so, the program’s creators are aiming to equip its graduates with the skills to answer questions and solve problems that only big data can handle.
“We came to the conclusion that these students really needed to get a foundation in data science, because it was pretty much central to almost all of this research that was going on (at the university),” said Brian Larkins, associate vice chancellor for life sciences.
Known as Complex Biosystems, the program will expose incoming doctoral students to five specializations that encompass a broad swath of quantitative approaches and life sciences – from mathematical models, simulations and algorithms to the ecosystem dynamics, disease-driving cellular mechanisms and host-microbe interactions they describe.
In their first year, students will also take two core courses that explore prominent questions in the life sciences, along with the ways that quantitative methods are addressing them. Among those questions: Why has a cure for cancer proven so elusive? Can scarce water supplies be optimized by developing drought-resistant crops? How do bacteria in the human gut affect the incidence of obesity and disease?
Melanie Simpson, a Willa Cather Professor of biochemistry and co-director of Complex Biosystems, said this interdisciplinary approach sets the program apart from nearly all others she has encountered. That breadth especially appealed to the program’s first cohort of students, Simpson said. The four students, all women, unofficially entered the program in 2015 and are on track to become its first graduates. One has decided to pursue engineering, with another studying plant pathology and a third delving into food science.
“Our philosophy is that if they start learning those (disciplinary) vocabularies from the get-go, at the same time that they’re improving their skill sets in data analysis, they won’t be afraid to talk to anybody,” Simpson said. “At the end of the (first) year, they can have a conversation with anyone in any of the specializations and be at least competent.”
Simpson and Larkins said the connections that students establish with each other during their first year in the program will help them avoid the pigeonholes that have confined students to a particular major, department or campus.
“It was really gratifying to see, as the (first four) were taking their courses, how they tutored one another in areas that, individually, they didn’t have expertise in,” Larkins said.
“They’re still very aware of each other,” Simpson said. “They still talk to each other and share what they’re learning. I like seeing that.”
The program’s interdisciplinary structure has already shown signs of attracting students who might not otherwise attend the university, said Larkins, who described it as a “transformational” step toward the next tier in life sciences scholarship and education.
“The other thing that’s striking is that I don’t think any of the initial students ended up doing what they expected they would be interested in when they came (to Nebraska),” Larkins said. “Rotating through different departments and seeing opportunities did make a difference. It’s exactly what you hope for.”
Big demand for big data
Exploring the application of big data across multiple disciplines bodes well for both the academic and career prospects of students, said program co-director Jennifer Clarke, who also heads the university’s Quantitative Life Sciences Initiative.
“We really need to impress upon students that there are many options, and they should explore each one, because it’s not a one-size-fits-all (situation),” Clarke said. “Some of them will say, right from the beginning, ‘I’m sure I want to be an academic.’ And other people go, ‘I really don’t know. What are my options?’ That should be part of their learning process: going around and finding out what all these people do. There’s a whole world out there.”
The program should help certain students overcome a discomfort with statistics and data analysis that may have nudged them toward life sciences in the first place, Clarke said. The rising prominence of large data sets in life sciences research, she said, has upended conventional notions of how that research is conducted and what it will look like going forward.
Employers outside of academia, meanwhile, are increasingly seeking graduates who have both the foundational knowledge of life sciences and a practiced familiarity with the techniques of data science, Clarke said. According to the National Science Foundation and the National Institutes of Health, more than 60 percent of graduates in the fields of science, technology, engineering and math are now taking non-academic positions after graduation.
“Lots of companies that are interested in life sciences, from startups to large corporations, say, ‘What are you guys doing in data? We need more data skills.’ In every conversation, it’s just a recurring theme,” Clarke said. “So we thought we could train students to have both skill sets. They’re not purely quantitative – they know life sciences and can speak that language – but they have enough data skills to make them very attractive to employers.”
‘A moving target’
The Complex Biosystems program follows in the wake of a wave of new faculty hires in the life sciences, Simpson said. Those faculty have shaped the development of the program by sharing insights gained from learning and applying computational approaches throughout their own graduate school and postdoctoral experiences.
“We made sure that we built a consensus and involved a lot of people in the conversation,” Simpson said. “With this type of an interdisciplinary training structure, it was important to bring in all of the early-career faculty who had just been hired en masse – who had much less traditionally discipline-based structures in their own graduate training – so that this was (better) meeting the needs of a contemporary graduate training program.”
Larkins and Clarke said the program also stands to benefit senior faculty who often have little time to familiarize themselves with new approaches while constantly juggling the responsibilities of research, teaching and advising.
“They’re all busy,” Clarke said. “They don’t have time to re-teach themselves statistics or math or computational biology and then try to keep up. Those fields move very fast, particularly these days. It’s a moving target.”
Over time, Larkins said, the program should cultivate a network of computationally savvy graduate students who can acclimate their advisors – and fellow life science students not in the Complex Biosystems program – to emerging software and statistical techniques.
Cradle of life sciences
The program’s developers emphasized that it would never have reached fruition without the support of Chancellor Ronnie Green, former Chancellor Harvey Perlman and the late Prem Paul, the university’s longtime vice chancellor for research and economic development.
“Visionary, progressive administrators are absolutely critical,” Simpson said. “This represents the biggest interdisciplinary graduate program that the university’s had.
“That’s so exciting. It’s a huge deal. This is transformative for our graduate education, but it’s also representative of a transformation in scientific approaches that’s being reflected in our training efforts.”
The Complex Biosystems program also stands poised to leverage recent facility and technology upgrades exemplified by the supercomputer cluster at the Holland Computing Center, which bears the computational load of the university’s big data research. Those resources, combined with the personnel hires made in recent years, have Simpson optimistic that more students will cast their eyes toward Lincoln when considering where to earn their doctorates.
“This is a really good place to do research,” she said. “We have strong mentors. We have supportive communities. We have big new labs and great computational power. If they’re focused on studying with really strong people and having someone care about them as a student, this is a great place for them to be.”