graph LR
    Genomic_Data_Models["Genomic Data Models"]
    Sequence_Processing["Sequence Processing"]
    Genomic_Data_Loaders["Genomic Data Loaders"]
    Genomic_Feature_Extractors["Genomic Feature Extractors"]
    Variant_Data_Management["Variant Data Management"]
    Genomic_Data_Models -- "provides data models to" --> Sequence_Processing
    Genomic_Data_Models -- "provides data models to" --> Genomic_Data_Loaders
    Genomic_Data_Models -- "provides data models to" --> Genomic_Feature_Extractors
    Genomic_Data_Models -- "provides data models to" --> Variant_Data_Management
    Sequence_Processing -- "applies transformations to data for" --> Genomic_Data_Loaders
    Sequence_Processing -- "applies transformations to extracted data" --> Genomic_Feature_Extractors
    Genomic_Data_Loaders -- "loads data using" --> Genomic_Feature_Extractors
    Genomic_Data_Loaders -- "integrates variant information from" --> Variant_Data_Management
    Genomic_Feature_Extractors -- "provides sequences and features to" --> Genomic_Data_Loaders
    Genomic_Feature_Extractors -- "extracts sequences for variant analysis" --> Variant_Data_Management
    Variant_Data_Management -- "provides variant data to" --> Genomic_Data_Loaders
    Variant_Data_Management -- "queries and filters variants for extraction" --> Genomic_Feature_Extractors
    click Genomic_Data_Models href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/kipoiseq/Genomic Data Models.md" "Details"
    click Sequence_Processing href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/kipoiseq/Sequence Processing.md" "Details"
    click Genomic_Data_Loaders href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/kipoiseq/Genomic Data Loaders.md" "Details"
    click Genomic_Feature_Extractors href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/kipoiseq/Genomic Feature Extractors.md" "Details"
    click Variant_Data_Management href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/kipoiseq/Variant Data Management.md" "Details"

Component Details

The kipoiseq library handles and processes genomic sequence and variant data. A typical flow loads biological data from standard file formats (FASTA, GTF, VCF), transforms the sequences for downstream analysis (e.g., one-hot encoding), extracts specific genomic features or the sequences around variants, and manages genetic variant information. Its purpose is to prepare and manipulate genomic data for machine learning and other bioinformatics applications.

Genomic Data Models

Provides the fundamental data structures, such as Variant for genetic variations and Interval for genomic regions, that are used throughout the kipoiseq library to represent and manipulate biological data.
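To make the two concepts concrete, here is a minimal sketch of what such data models might look like. The class layout, field names, and the `to_interval` helper are assumptions made for illustration only; they are not kipoiseq's actual definitions. The sketch assumes 0-based, end-exclusive intervals and 1-based variant positions (as in VCF).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical genomic region: 0-based start, end-exclusive."""
    chrom: str
    start: int
    end: int
    strand: str = "."

    def width(self) -> int:
        return self.end - self.start


@dataclass(frozen=True)
class Variant:
    """Hypothetical genetic variation with a 1-based position, as in VCF."""
    chrom: str
    pos: int
    ref: str
    alt: str

    def to_interval(self) -> Interval:
        # Convert the 1-based position into a 0-based half-open interval
        # that covers the reference allele.
        start = self.pos - 1
        return Interval(self.chrom, start, start + len(self.ref))


variant = Variant("chr1", 101, "AC", "A")
region = variant.to_interval()
print(region, region.width())
```

Keeping both types immutable (`frozen=True`) lets them be hashed and shared safely between loaders, extractors, and variant utilities.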

Related Classes/Methods:

Sequence Processing

Offers functions and classes for transforming biological sequences, including one-hot encoding, resizing intervals, and handling sequence axes. These transformations prepare data for machine learning models. The component also includes general-purpose helper functions.
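The two transformations named above can be sketched in plain Python. This is an illustrative sketch, not the library's implementation: the function names, the `anchor` parameter, and the choice to encode unknown bases as all-zero rows are assumptions.

```python
def one_hot_dna(seq, alphabet="ACGT"):
    """Encode a DNA string as one row per base; bases outside the
    alphabet (e.g. N) become all-zero rows in this sketch."""
    index = {base: i for i, base in enumerate(alphabet)}
    rows = []
    for base in seq.upper():
        row = [0] * len(alphabet)
        if base in index:
            row[index[base]] = 1
        rows.append(row)
    return rows


def resize_interval(start, end, width, anchor="center"):
    """Return a (start, end) pair of the requested width, keeping the
    original start, end, or center fixed."""
    if anchor == "start":
        return start, start + width
    if anchor == "end":
        return end - width, end
    if anchor == "center":
        center = (start + end) // 2
        new_start = center - width // 2
        return new_start, new_start + width
    raise ValueError(f"unknown anchor: {anchor!r}")


encoded = one_hot_dna("ACGN")
resized = resize_interval(100, 200, 20)
print(encoded, resized)
```

Resizing every interval to a fixed width before encoding gives all samples the same shape, which is what most models expect as input.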

Related Classes/Methods:

Genomic Data Loaders

Provides data loaders that produce genomic sequences from intervals, BED files, and GTF annotations. Loaders support both string and one-hot encoded sequence outputs, and specialized loaders cover splicing and protein data.
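As a rough illustration of the interval-based loading pattern, the sketch below reads BED-style lines and yields one sample per interval. The names `iter_bed` and `sequence_loader`, the in-memory `genome` dictionary standing in for a FASTA file, and the `inputs`/`metadata` sample layout are all assumptions for this example, not the library's API.

```python
def iter_bed(lines):
    """Yield (chrom, start, end) from tab-separated BED lines,
    skipping blank lines and comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, start, end = line.split("\t")[:3]
        yield chrom, int(start), int(end)


def sequence_loader(bed_lines, genome):
    """Yield one sample per BED interval: the sequence string plus
    the interval it came from."""
    for chrom, start, end in iter_bed(bed_lines):
        yield {
            "inputs": genome[chrom][start:end],
            "metadata": {"chrom": chrom, "start": start, "end": end},
        }


# Toy genome held in memory instead of a FASTA file.
genome = {"chr1": "ACGTACGTAC"}
bed = ["# demo intervals", "chr1\t0\t4", "chr1\t2\t8"]
samples = list(sequence_loader(bed, genome))
print(samples)
```

Writing the loader as a generator keeps memory use flat for large BED files, and returning metadata alongside the sequence lets predictions be mapped back to genomic coordinates.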

Related Classes/Methods:

Genomic Feature Extractors

Extracts genomic sequences from FASTA files and features (e.g., CDS, UTRs) from GTF files. It also extracts sequences that span multiple genomic intervals, as well as protein sequences.
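Extracting a sequence across multiple intervals (for example, the exons of a CDS) amounts to concatenating the pieces in genomic order and reverse-complementing the result for minus-strand features. The sketch below shows that idea under stated assumptions: `extract_spliced` and `reverse_complement` are hypothetical helper names, and the genome is a plain dictionary rather than a FASTA file.

```python
# Translation table for complementing DNA bases, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_spliced(genome, chrom, intervals, strand="+"):
    """Concatenate the sequence over 0-based half-open (start, end)
    intervals in genomic order; reverse-complement for the minus strand."""
    pieces = [genome[chrom][start:end] for start, end in sorted(intervals)]
    seq = "".join(pieces)
    return reverse_complement(seq) if strand == "-" else seq


genome = {"chr1": "AAACCCGGGTTT"}
exons = [(6, 9), (0, 3)]  # deliberately unsorted
plus = extract_spliced(genome, "chr1", exons)
minus = extract_spliced(genome, "chr1", exons, strand="-")
print(plus, minus)
```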

Related Classes/Methods:

Variant Data Management

Handles core operations on Variant Call Format (VCF) files, including fetching, querying, filtering, and matching genetic variants. It also extracts genomic sequences around variants and generates variant combinations.
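The query-then-extract pattern described above can be sketched without any VCF library. Everything here is illustrative: the function names, the dictionary-based variant records, and the `flank` parameter are assumptions, and only the first five VCF columns are read.

```python
def parse_vcf_records(lines):
    """Yield one record per alternate allele from VCF text lines,
    skipping header lines. Only CHROM, POS, REF, and ALT are kept."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            yield {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": allele}


def query(variants, chrom, start, end):
    """Keep variants whose 0-based start falls inside [start, end)."""
    return [v for v in variants
            if v["chrom"] == chrom and start <= v["pos"] - 1 < end]


def sequence_around_variant(genome, variant, flank):
    """Return the alternate allele with up to `flank` reference bases
    on each side."""
    chrom_seq = genome[variant["chrom"]]
    start = variant["pos"] - 1  # VCF positions are 1-based
    end = start + len(variant["ref"])
    if chrom_seq[start:end] != variant["ref"]:
        raise ValueError("reference allele does not match the genome")
    left = chrom_seq[max(0, start - flank):start]
    right = chrom_seq[end:end + flank]
    return left + variant["alt"] + right


genome = {"chr1": "ACGTACGTAC"}
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT",
    "chr1\t3\t.\tG\tT,A",
    "chr1\t9\t.\tA\tG",
]
variants = list(parse_vcf_records(vcf_lines))
hits = query(variants, "chr1", 0, 5)
alt_seq = sequence_around_variant(genome, hits[0], flank=2)
print(len(variants), len(hits), alt_seq)
```

Checking the reference allele against the genome before substituting catches mismatched genome builds early, which is a common source of silent errors in variant-effect pipelines.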

Related Classes/Methods: