Computing in Earth science: a non-linear path
Machine learning is undeniably a tool that most disciplines like to have in their toolbox. However, scientists are still investigating the limits and barriers to incorporating machine learning into their research. Junior Sonia Reilly spent her summer opening up the machine learning black box to better understand how information flows through neural networks as part of the Undergraduate Research Opportunities Program (UROP). Her project, which investigates how machine learning works with the intention of improving its application to the observation of natural phenomena, was overseen by Sai Ravela in the Department of Earth, Atmospheric and Planetary Sciences (EAPS). As a major in Course 18C (Mathematics with Computer Science), Reilly is uniquely equipped to help investigate these connections.
“In recent years, deep learning has become an immensely popular tool in all kinds of research fields, but the mathematics of how and why it is so effective is still very poorly understood,” says Reilly. “Having that knowledge will enable the design of better-performing learning machines.” To do that, she looks more closely at how the algorithms evolve to produce their final most-probable conclusions, with the end goal of providing insights on information flow, bottlenecks, and maximizing gain from neural networks.
“We don’t want to be drowning in big data. On the contrary, we want to transform big data into perhaps what we might call smart data,” Ravela says of how machine learning must proceed. “The end goal is always a sensing agent that gathers data from our environment, but one that is knowledge-driven and does just enough work to gather just enough information for meaningful inferences.”
For Ravela, who leads the Earth Signals and Systems Group (ESSG), better-performing learning machines mean more robust early predictions of potential disasters. His group’s research centers on how the Earth works as a system, primarily focusing on climate and natural hazards. They observe natural phenomena to build effective predictive models of dynamic natural processes, such as hurricanes, clouds, volcanoes, earthquakes, and glaciers, and to inform wildlife conservation strategies, while also making advances in engineering and learning itself.
“In all these projects, it’s impossible to gather dense data in space and time. We show that actively mining the environment through a systems analytic approach is promising,” he says. Ravela recently presented his group’s latest work, including Reilly’s contributions, at the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining conference (SIGKDD 2019) in early August. He also teaches an “infinite course,” a pair of classes offered in the spring and fall semesters that provides an overview of machine learning foundations for natural systems science and that anyone can follow online.
According to Ravela, if Reilly succeeds in advancing the mathematical basis for computational learning models, she will be one of the “early pioneers of learning that can be explained,” an achievement that could open a promising career path.
That fits Reilly’s goals of earning a PhD in mathematics after graduating from MIT and continuing to contribute to research that can positively impact the world. She is starting by fitting as much research as she can manage into her final two undergraduate years at MIT, including her experience this summer.
Although this was Reilly’s first UROP experience, it is her second time undertaking a research project that blends mathematics, computer science, and Earth science. Previously, at the Johns Hopkins University Applied Physics Laboratory, Reilly helped develop signal processing techniques and software that would improve the retrieval of useful climate change information from low-quality satellite data.
“I’ve always wanted to be part of an interdisciplinary research environment where I could use my knowledge of math to contribute to the work of scientists and engineers,” Reilly says of working within EAPS. “It’s encouraging to see that type of environment and get a taste of what it would be like to work in one.”
Ravela explains that the ESSG values the mutual benefit of including UROP students. “For me, UROPs are better than grad students and postdocs if, and only if, one can create the right-sized questions for them to run with. But then they run the fastest and are the most clever of all.” He says he feels UROP is invaluable and could benefit all students, as it offers a chance to learn about other fields and interdisciplinary research, as well as how to turn what they learn into tangible results.
For Reilly, research builds on the foundation she gained from her classes at MIT. Classes are a controlled and predictable environment, she says, “but research is nowhere near so linear.” During her UROP experience, she has relied on the mathematics and computer science from her courses while learning to connect and apply them to new fields and to consider topics often outside an undergraduate education. “It often feels like every step I take requires me to learn about an entirely new field of mathematics, and it’s difficult to know where to start. I definitely feel lost sometimes, but I’m also learning an incredible amount.”