I started my research journey working in AI for science. At the Jensen Group, I collaborated with (now Professor) Andrew Zahrt on machine-learning-guided discovery of electrochemical reactions. Through this work, I developed a deeper appreciation of the major bottlenecks in ML development. First, by working in a wet lab and curating a training dataset myself, I learned how hard it is to collect high-quality data at scale, raising the question: how can we improve unsupervised learning techniques to make effective use of the vast amounts of unlabeled data? Second, real-world deployment often requires models to discover or process new information. That is inherently difficult, as unfamiliar data often exhibits distribution shifts. This leads to another question: how can we train models to generalize robustly beyond their training distribution?
Driven by these questions, I joined the Coley Group, working primarily with Wenhao Gao. My time there was pivotal in shaping my research philosophy. For over two years, I had the chance to understand the challenges of using AI in science, which, in my opinion, is one of the most fascinating research frontiers today. For example, given that many scientific benchmarks suffer from biases, how can ML models learn effectively from datasets shaped by strategic behavior? In chemistry journals, chemical reactions are often grouped by reactivity patterns. Leveraging this, we introduced ContraScope, an algorithm that trains ML models with contrastive learning, treating reactions from different groups as negative pairs; a minimal sketch of the idea follows.
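To give a flavor of how that negative-pair construction works, here is a generic supervised-contrastive sketch, not the actual ContraScope implementation. The function name, the choice of PyTorch, and the assumption that reactions are already embedded by some encoder are all mine for illustration:

```python
import torch
import torch.nn.functional as F

def group_contrastive_loss(embeddings: torch.Tensor,
                           group_ids: torch.Tensor,
                           temperature: float = 0.1) -> torch.Tensor:
    """Contrastive loss where reactions sharing a reactivity group are
    positives and reactions from different groups act as negatives.

    embeddings: (N, d) reaction embeddings from any encoder.
    group_ids:  (N,) integer reactivity-group labels.
    """
    z = F.normalize(embeddings, dim=1)           # unit-norm embeddings
    sim = z @ z.T / temperature                  # pairwise cosine similarities
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=z.device)

    # Positive pairs: same group, excluding each point paired with itself.
    pos = (group_ids.unsqueeze(0) == group_ids.unsqueeze(1)) & ~eye

    # Row-wise log-softmax over all other points (self excluded).
    sim = sim.masked_fill(eye, float("-inf"))
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)

    # Average the log-probability of the positives for each anchor;
    # anchors with no positive in the batch are skipped.
    pos_counts = pos.sum(dim=1)
    valid = pos_counts > 0
    pos_log_prob = log_prob.masked_fill(~pos, 0.0).sum(dim=1)
    loss = -pos_log_prob[valid] / pos_counts[valid]
    return loss.mean()

# Example: 6 reactions from 3 reactivity groups.
emb = torch.randn(6, 128)
groups = torch.tensor([0, 0, 1, 1, 2, 2])
print(group_contrastive_loss(emb, groups))
```

The appeal of this setup is that the group structure already published in the literature supplies the supervision for free: no extra labeling is needed to decide which pairs should be pulled together and which pushed apart.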
My long-term goal is to apply ML to accelerate scientific discovery, as it's a key driver of technological innovation and economic growth. Realizing that many of the challenges I encountered in AI for science are shared across other settings where learning is applied (e.g., robotics and decision making more broadly), I pivoted to studying how to develop ML models that understand the world from the most fundamental sensory inputs available, such as vision. These high-bandwidth signals are abundant and are the closest approximation we have to unprocessed observations of the universe, making them an ideal foundation for training models that can recreate and understand the world. Ideally, these world models should transfer easily across modalities and generalize to out-of-distribution samples, helping us come up with new solutions.
And that brings us to the present! Currently, I'm working with Professor Kaiming He and Professor Stephen Bates. I had the pleasure of working with Dr. Tianhong Li, and I'm currently working with Yossi Gandelsman (from UC Berkeley) on something extremely cool. At the same time, I'm pursuing research in statistics in collaboration with Charlie Cowen-Breen on a new protocol for rare event estimation. I discovered my interest in statistics during my sophomore year in college, when I also had the pleasure of being an undergraduate teaching assistant for the fundamentals of statistics class at MIT. For some reason I can't explain, this is my favorite type of math. Fortunately, it blends perfectly with my interest in deep learning. I believe that AI should be developed responsibly, which is why I like reasoning from first principles when I work.
Fun fact: I'm one of the first undergraduate students to intern in Professor He's group and Professor Bates's group after they were established at MIT. Their mentorship has been crucial in shaping how I do research, inspiring me to integrate both mathematical frameworks and practical tools in my work and giving me valuable insights into the "meta-level" of research. A great researcher, in my opinion, knows how to strike a delicate balance between what is feasible and what is interesting. I'm very fortunate to work with people who have truly mastered both.
Wrapping up with some high-level thoughts: I believe that the best inventions are as mathematically elegant as they are practically relevant. This has roots in the unreasonable effectiveness of mathematics in the natural sciences. It amazes me that learning in high-dimensional spaces is possible despite the curse of dimensionality; the Johnson-Lindenstrauss lemma is a cool idea to think about here (its standard statement appears at the end of this note). It could also be that the universe is, in fact, a degenerate distribution; if so, we can unfold it and converge to the truth using the scientific method. I like to reverse-engineer research ideas: I often think about the future I'd like to see and take iterative forward steps to get there. I learn by backpropagation, primarily through the feedback of peers who complement my shortcomings and help me apply an exploratory, epsilon-greedy policy. I (try to) work on ideas that take us from Zero to One. I like this quote: "learn the rules like a pro so you can break them like an artist" ~ Pablo Picasso.
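For the curious reader, here is the textbook statement of the Johnson-Lindenstrauss lemma mentioned above (the standard form, nothing novel of mine):

```latex
% Johnson–Lindenstrauss lemma (standard statement): pairwise distances
% among n points survive projection to O(eps^{-2} log n) dimensions.
For every $0 < \varepsilon < 1$ and every set of $n$ points
$x_1, \dots, x_n \in \mathbb{R}^d$, there exists a linear map
$f : \mathbb{R}^d \to \mathbb{R}^k$ with $k = O(\varepsilon^{-2} \log n)$
such that, for all $i, j$,
\[
  (1-\varepsilon)\,\lVert x_i - x_j \rVert^2
  \;\le\; \lVert f(x_i) - f(x_j) \rVert^2
  \;\le\; (1+\varepsilon)\,\lVert x_i - x_j \rVert^2 .
\]
```

What I find striking is that the target dimension $k$ depends only on the number of points and the tolerated distortion, not on the ambient dimension $d$, which is part of why learning in high-dimensional spaces is less hopeless than it first appears.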