Dream-High 2025


Syona Arora

About

 Syona Arora is a member of the Class of 2026 at Bellevue High School in Bellevue, Washington. Interested in the intersection of art, science, and healthcare, Syona leads weekly watercolor and sketching sessions for residents at a local care home. There, she has seen how creativity fosters connection and healing. She brings this perspective into science as well, viewing coding and systems biology as creative processes that integrate data to reveal new insights. She hopes to pursue a career in life sciences research focused on understanding diseases and developing treatments to support patients.


Through DREAM-High, Syona explored breast cancer data using the R programming language and heat maps, learning how clinical and genomic information can be combined to guide research in real-world settings. She has enjoyed collaborating with mentors and peers to see how coding can turn large datasets into meaningful discoveries. Outside of DREAM-High, Syona enjoys painting portraits, listening to music, and watching French detective shows.

Summer 2025 DREAM-High Scholar

Through hands-on programming, DREAM-High Scholars visualize and analyze genomics, clinical, and physical data from breast cancer cells. DREAM-High is a partnership between the Columbia Center for Cancer Systems Therapeutics, the Palazzo Strozzi Foundation USA, the Stanford Center for Cancer Systems Biology, and the Institute for Systems Biology. 

R and RStudio

In the DREAM-High program, Scholars learn to program in R, a language for statistical computing and graphics. They manipulate and write code in a cloud-based RStudio environment to analyze a wide range of data on breast cancer patients and cancer cell lines. 

Heat Maps

I created heat maps as colorized representations of data matrices. I reordered features and observations so that similar entities sit next to each other in the plot. Heat maps make it easy to visualize and understand complex data.
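
As an illustration of this kind of reordering (a minimal sketch, not the program's actual code), the base R example below clusters the rows and columns of a small made-up matrix and plots the result. The sample and gene names, the values, and the choice of Euclidean distance with `hclust` defaults are all assumptions made for the example.

```r
# Toy matrix (made-up values): 4 samples in rows, 4 genes in columns.
mat <- rbind(
  sample1 = c(1, 2, 9, 8),
  sample2 = c(9, 8, 1, 2),
  sample3 = c(2, 1, 8, 9),
  sample4 = c(8, 9, 2, 1)
)
colnames(mat) <- c("geneA", "geneB", "geneC", "geneD")

# Cluster rows and columns on Euclidean distance. The $order element is the
# leaf order of the dendrogram, which places similar items next to each other.
row_order <- hclust(dist(mat))$order
col_order <- hclust(dist(t(mat)))$order
reordered <- mat[row_order, col_order]

# Draw the already-reordered matrix without further clustering or scaling.
heatmap(reordered, Rowv = NA, Colv = NA, scale = "none")
```

Calling `heatmap(mat)` with its defaults performs a comparable clustering internally; doing it by hand here only makes the reordering step visible.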

Breast Cancer Clinical Data

I loaded and examined a data frame of clinical information from 1,082 breast cancer patients from The Cancer Genome Atlas (TCGA). I summarized clinical measurements describing both the patients, such as gender and age, and their tumors, such as estrogen receptor status and histology.
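
A minimal sketch of what examining and summarizing such a data frame can look like in base R is shown below. The five rows are invented for illustration, and the column names (`patient_id`, `age`, `gender`, `er_status`) are hypothetical; they are not the TCGA field names.

```r
# Toy clinical table (invented values, hypothetical column names).
clinical <- data.frame(
  patient_id = c("P1", "P2", "P3", "P4", "P5"),
  age        = c(45, 62, 58, 71, 39),
  gender     = c("female", "female", "male", "female", "female"),
  er_status  = c("positive", "negative", "positive", "positive", "negative"),
  stringsAsFactors = FALSE
)

# Size and structure of the table.
dim(clinical)
str(clinical)

# Numeric summary of a patient-level measurement.
summary(clinical$age)

# Counts for a tumor-level categorical measurement.
er_counts <- table(clinical$er_status)

# Mean age within each estrogen receptor group.
age_by_er <- tapply(clinical$age, clinical$er_status, mean)
```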

Clinically Relevant Gene Expression Patterns in Breast Cancer

I performed an integrative analysis of clinical measurements and gene expression data for 1,082 patients in the TCGA Breast Cancer cohort. By generating heat maps and annotating them with clinical information, I detected patterns in the patients' expression profiles across 18,351 genes that correspond to luminal and triple-negative breast cancers.
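
One way to annotate a heat map with clinical information in base R is the `ColSideColors` argument of `heatmap`, which draws a colored bar above the columns. The sketch below assumes genes in rows and patients in columns; the expression values, patient labels, subtype assignments, and colors are made up for the example.

```r
# Toy expression matrix (made-up values): genes in rows, patients in columns.
expr <- cbind(
  patient1 = c(8, 7, 1, 2),
  patient2 = c(1, 2, 9, 8),
  patient3 = c(9, 8, 2, 1),
  patient4 = c(2, 1, 8, 9)
)
rownames(expr) <- c("geneA", "geneB", "geneC", "geneD")

# Hypothetical clinical annotation: one subtype label per patient.
subtype <- c(
  patient1 = "luminal",
  patient2 = "triple negative",
  patient3 = "luminal",
  patient4 = "triple negative"
)

# Map each subtype to a color, in the same order as the matrix columns.
subtype_colors <- c("luminal" = "steelblue", "triple negative" = "firebrick")
side_colors <- unname(subtype_colors[subtype[colnames(expr)]])

# The color bar follows the columns as heatmap() reorders them by clustering.
heatmap(expr, ColSideColors = side_colors, scale = "none")
```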

Differential Gene Expression Across Cancer Cell Lines

I discovered biological processes that distinguish cancer cell lines based on the aggressiveness of the cancers they model. For both breast cancer and colon cancer cell lines, I calculated, visualized, and functionally annotated differential gene expression profiles with data from the Physical Sciences in Oncology Cell Line Characterization Study.
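
The page does not say which statistical method was used to calculate differential expression. As one common approach, the sketch below compares two groups of cell lines gene by gene with Welch's t-test in base R. The expression values (treated as log2 scale), the gene names, and the group labels are invented for illustration.

```r
# Toy log2 expression matrix (made-up values): genes in rows, cell lines in columns.
expr <- rbind(
  geneA = c(5.1, 4.9, 5.0, 8.0, 8.2, 7.9),
  geneB = c(6.0, 6.1, 5.9, 6.0, 5.9, 6.1)
)

# Hypothetical grouping of the six cell lines by aggressiveness.
group <- c("less", "less", "less", "more", "more", "more")
is_more <- group == "more"

# On a log2 scale, the difference in group means is the log2 fold change.
log2_fc <- rowMeans(expr[, is_more]) - rowMeans(expr[, !is_more])

# Welch's t-test per gene (the default for t.test with two samples).
p_values <- apply(expr, 1, function(x) t.test(x[is_more], x[!is_more])$p.value)

# Adjust for testing many genes at once (Benjamini-Hochberg).
p_adjusted <- p.adjust(p_values, method = "BH")
```

With thousands of genes, the adjustment step matters because many small p-values will appear by chance alone.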

Predictive Modeling of Breast Cancer Prognosis

I built linear regression models from the METABRIC breast cancer dataset that predict breast cancer survival. I found that gene expression profiles of certain cancer genes are predictive of prognosis. Including additional features in my model increased its explanatory power.
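
To illustrate how adding features changes explanatory power, the sketch below fits two nested linear models with base R's `lm` and compares their R-squared values. The data frame is invented, and the column names (`survival_months`, `gene_expr`, `age`) are hypothetical; they are not METABRIC fields.

```r
# Toy data (invented values, hypothetical column names).
toy <- data.frame(
  survival_months = c(60, 48, 72, 30, 55, 80, 25, 66),
  gene_expr       = c(2.1, 3.0, 1.5, 4.2, 2.8, 1.0, 4.8, 1.9),
  age             = c(50, 63, 45, 70, 58, 41, 75, 49)
)

# Model with a single gene expression feature.
fit_small <- lm(survival_months ~ gene_expr, data = toy)

# Model with one additional feature.
fit_full <- lm(survival_months ~ gene_expr + age, data = toy)

# Fraction of variance in the outcome explained by each model.
r2_small <- summary(fit_small)$r.squared
r2_full  <- summary(fit_full)$r.squared

# Adjusted R-squared penalizes extra features and is a fairer comparison.
adj_small <- summary(fit_small)$adj.r.squared
adj_full  <- summary(fit_full)$adj.r.squared
```

In-sample R-squared never decreases when a feature is added to a linear model, so adjusted R-squared or held-out data is the usual check that an added feature helps.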

Copyright © 2025 Dream High 2025 - All Rights Reserved.
