Molecules and Machines
Hello, my name is Anya! I’m a high school junior passionate about biology, computer science, and solving real-world problems in medicine.
This blog will chronicle my independent research project over the next 8 months, where I’ll be diving deep into the intersection of genomics, cancer biology, and machine learning. It’s a mix of curiosity, challenge, and the hope that a student like me can contribute — even a little — to the world of science.
What I’m Working On
Over the course of this project, I’ll be building a machine learning model to predict whether a breast cancer patient is at high or low risk of a poor outcome, based on the activity of their genes.
Specifically, I’ll be studying something called Transcriptional Regulatory Networks (TRNs): the networks of genes and gene regulators that decide how genes are expressed. In cancer, these networks often break down. My project asks:
“Can we use patterns in these networks to help predict which patients are likely to do worse — and might need more aggressive treatment?”
This idea brings together public cancer data from The Cancer Genome Atlas (TCGA), machine learning tools like Random Forests and XGBoost, and network-based biology using tools like GENIE3 or pySCENIC.
I’ll be doing this independently with mentorship, learning the tools of computational biology as I go: from Python and Jupyter notebooks to scientific writing and data visualization.
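To make the network idea a little more concrete, here is a minimal Python sketch of one way a regulatory network could be turned into per-patient features. Everything in it is an assumption for illustration: the gene names, regulator names, and expression values are made up, and the scoring rule (averaging the z-scored expression of a regulator's target genes) is a simple stand-in, not the method used by GENIE3 or pySCENIC.

```python
from statistics import mean, pstdev

# Toy expression values (made up for illustration): gene -> one value per patient.
expression = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0],
    "GENE_B": [1.0, 1.0, 3.0, 3.0],
    "GENE_C": [5.0, 3.0, 3.0, 1.0],
}

# Hypothetical regulator -> target genes mapping, standing in for the output
# of a network-inference tool.
regulons = {
    "TF_1": ["GENE_A", "GENE_B"],
    "TF_2": ["GENE_C"],
}


def z_scores(values):
    """Standardize one gene across patients (mean 0, standard deviation 1)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A gene with no variation carries no signal, so score it as all zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def regulon_activity(expression, regulons):
    """Return {regulator: [score per patient]} by averaging target-gene z-scores."""
    standardized = {gene: z_scores(vals) for gene, vals in expression.items()}
    activity = {}
    for regulator, targets in regulons.items():
        # Skip targets that are missing from the expression data.
        present = [standardized[g] for g in targets if g in standardized]
        if not present:
            continue
        # zip(*present) walks patient by patient across the target genes.
        activity[regulator] = [mean(column) for column in zip(*present)]
    return activity


features = regulon_activity(expression, regulons)
for regulator, scores in features.items():
    print(regulator, [round(s, 2) for s in scores])
```

Each regulator ends up with one score per patient. In a real pipeline, a table of scores like this could be the input to a classifier such as a Random Forest, with each patient labeled high-risk or low-risk.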
Why I’m Doing This — and Why Now
I’ve been fascinated by biology since middle school — not just the facts, but the patterns behind them. Over the years, my interests expanded into neuroscience, genetics, and most recently, how complex systems like the human body can be understood with data and models.
But this project is personal, too.
I’ve seen family members affected by cancer. I’ve read the survival statistics. I know that even with decades of progress, we’re still far from being able to predict which patients will respond well and which won’t. And I keep wondering:
“Could we do better if we looked not just at individual genes or mutations, but at how entire gene networks behave?”
This summer felt like the right time to begin. I’ve just completed AP Biology and some advanced math and coding work. I’m a Junior Health Scholar at a hospital, where I’ve seen firsthand how unpredictable cancer can be. And I want to take all that inspiration and start contributing to science myself.
What This Blog Will Cover
Every week, I’ll be writing a short post:
What I learned that week
What I built or coded
What questions I’m still wondering about
From posts about survival plots or gene networks to complaints about debugging Python code at 11pm, this is where I’ll share the journey.
Future Goals
My hopes with this project:
Learn how real biomedical research works
Deepen my understanding of cancer biology and AI
Share this journey through blog posts and visualizations
Eventually enter a science competition or publish what I find
Whether you’re a student, scientist, mentor, or just curious, I’d love to have you follow along!
Bringing together public datasets, biology, and machine learning is exactly the kind of interdisciplinary thinking that drives innovation in cancer research. You’re asking the right kind of 'big' question, one that doesn’t have an easy answer but is worth pursuing. As you continue to develop your model, perhaps your work could eventually feed into diagnostic tools or clinical decision-making? One of the biggest gaps today is in diagnostics. For example, could the patterns you uncover in transcriptional regulatory networks help stratify patients into different risk groups earlier in their treatment journey? Could your model be tested against existing diagnostic frameworks or incorporated into a decision-support tool that physicians might one day use?
Even if clinical application is a longer-term goal, thinking about the diagnostic potential of your research now might help guide how you frame your results, evaluate performance, or even visualize your findings. It’s exciting to see this level of depth and ambition at such an early stage — keep pushing forward!