I'm a data scientist with 6+ years of expertise in statistical design, modeling and analysis of high-dimensional datasets. At Serimmune, I most recently worked on statistical modeling of the Long COVID immune response using unbiased ML models. Currently, I'm also building OmixHub, a streamlined codebase for different ML, DL and bioinformatics analysis tools. I have an MS in Biophysics from the University of California (supervised by the awesome Professor Hao Cheng) and a Bachelor's in Biochemical Engineering from IIT-BHU, India. My master's thesis project involved statistical analysis of GWAS using Bayesian methods.
I pride myself on end-to-end development of software tools for statistical models. I'm experienced in Python, R, Julia, SQL and Google Cloud infrastructure.
I'm a Real Madrid fan (Hala Madrid), and I enjoy scientific entrepreneurial debates and problem solving.
I grew up in Mumbai, India, moved to Davis, California for my graduate studies, and currently live in Santa Barbara, CA.
Updates:
- September 2024: I released the documentation for OmixHub, a one-stop application of well-known ML- and DL-based classification, feature selection and other bioinformatics analyses for high-dimensional NGS cancer datasets from the Genomic Data Commons.
- August 2024: I successfully completed Google Summer of Code (GSoC) 2024 as a Contributor! I look forward to writing a paper on metadata harmonization with Sehyun Oh, Jonathan Davenport and Michele W.
- May 2024: I joined cBioPortal through Google Summer of Code (GSoC) 2024 as a Contributor! I will be working on clinical metadata harmonization using advanced NLP models, mentored by Sehyun Oh, Jonathan Davenport and Michele W.
- March 2024: Our Moderna paper, "Profiling antibody epitopes induced by mRNA-1273 vaccination and boosters", was accepted at Frontiers in Immunology.
- November 2023: I started OmixHub, an open-source tool that serves as a one-stop application of well-known ML- and DL-based classification, feature selection and other bioinformatics analyses for high-dimensional NGS cancer datasets from the Genomic Data Commons.
- September 2023: Our Long COVID paper was published in Nature. The paper used unbiased machine learning models to identify the key features most strongly associated with Long COVID status (led by Jon Klein's group).
- April 2022: I was invited by my master's PI, Hao Cheng, to give a talk on the application of Serum Epitope Repertoire Analysis (SERA) to early detection of renal cell carcinoma, for the Animal Science group at UC Davis.