Overview
OmixHub is a Python platform that interfaces with the Genomic Data Commons (GDC) to help users apply ML-based analysis to sequencing data. Currently, only RNA-Seq datasets from GDC are supported.
- Cohort creation of bulk RNA-Seq tumor and normal samples from GDC.
- Bioinformatics analysis: Application of PyDESeq2 and GSEA in a single pipeline.
- Classical ML analysis: Applying clustering, supervised ML, and outlier sum statistics.
- Custom API Connections:
  - Search and retrieval of Cancer Data cohorts from GDC using complex JSON filters.
  - Interacting with MongoDB in a Pythonic manner (DOCS coming soon).
  - Interacting with Google Cloud BigQuery in a Pythonic manner (DOCS coming soon).
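As a rough illustration of the "outlier sum statistics" mentioned in the feature list, the sketch below computes one common form of the statistic (the outlier sum of Tibshirani and Hastie) for a single gene. It is not taken from the OmixHub code base: the function name `outlier_sum`, its arguments, and the toy expression values are assumptions made for illustration, and OmixHub's own implementation may differ.

```python
from statistics import median, quantiles

def outlier_sum(tumor, normal):
    """Outlier sum for one gene (hypothetical helper, not OmixHub's API).

    Values are standardized by the median and MAD of all samples; tumor
    values above q75 + IQR of the standardized data are then summed.
    """
    all_values = list(tumor) + list(normal)
    med = median(all_values)
    # 1.4826 rescales the MAD to match the standard deviation under normality.
    mad = 1.4826 * median([abs(v - med) for v in all_values])
    if mad == 0:
        return 0.0  # constant gene: nothing can be flagged as an outlier

    z_all = [(v - med) / mad for v in all_values]
    q1, _, q3 = quantiles(z_all, n=4, method="inclusive")
    threshold = q3 + (q3 - q1)

    z_tumor = [(v - med) / mad for v in tumor]
    return sum((z for z in z_tumor if z > threshold), 0.0)

# Toy expression values: one tumor sample is strongly over-expressed.
tumor_expr = [1.0, 1.1, 0.9, 1.0, 9.0]
normal_expr = [1.0, 1.1, 0.9, 1.0, 1.0]
score = outlier_sum(tumor_expr, normal_expr)
print(f"outlier sum: {score:.1f}")
```

A large score means that a subset of tumor samples is over-expressed relative to the pooled distribution, which is the pattern this statistic is designed to detect.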
Getting Started
- Clone the repository:
git clone https://github.com/adhal007/OmixHub.git
- Create the correct conda environment for OmixHub:
conda env create -f environment.yaml
API Reference
For detailed API reference, please visit the full documentation:
View Full API Reference
Examples
Useful query filters for GDC API endpoints
For examples of using various filters from the GDCQueryFilters class, please refer to the examples section in the full documentation.
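To give a sense of what a GDC JSON filter looks like, here is a minimal sketch that builds one with the standard library only. It does not use the GDCQueryFilters class, whose methods are not shown in this README; the helper names `in_filter` and `and_filter`, the project ID `TCGA-KIRC`, and the choice of returned fields are assumptions made for illustration. The filter structure (`op`, `content`, `field`, `value`) follows the public GDC API.

```python
import json
from urllib.parse import urlencode

def in_filter(field, values):
    """One GDC 'in' clause (hypothetical helper, not part of OmixHub)."""
    return {"op": "in", "content": {"field": field, "value": list(values)}}

def and_filter(*clauses):
    """Combine clauses with a GDC 'and' operator (hypothetical helper)."""
    return {"op": "and", "content": list(clauses)}

# Bulk RNA-Seq gene expression files for tumor and normal samples.
# The project ID is only an example value.
filters = and_filter(
    in_filter("cases.project.project_id", ["TCGA-KIRC"]),
    in_filter("files.experimental_strategy", ["RNA-Seq"]),
    in_filter("files.data_type", ["Gene Expression Quantification"]),
    in_filter("cases.samples.sample_type",
              ["Primary Tumor", "Solid Tissue Normal"]),
)

params = {
    "filters": json.dumps(filters),
    "fields": "file_id,file_name,cases.samples.sample_type",
    "format": "JSON",
    "size": "10",
}

# Build the request URL for the GDC files endpoint without sending it.
url = "https://api.gdc.cancer.gov/files?" + urlencode(params)
print(url)
```

The resulting URL can be passed to any HTTP client. Keeping the filter as a plain dictionary until the last step makes it easy to add or remove clauses before serializing.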
View Query Filter Examples
Data Processing and Analysis Examples
Applications
Running Gradio App
Access to the app is currently restricted. To contribute or try it out, contact adhalbiophysics@gmail.com.
Running the app:
python3 app_gradio.py
For app navigation documentation, refer to: ../tutorial_notebooks/docs/UI Prototype/gradio_use.md