Variant Data Manipulation

Glow provides functions to extract, transform, and load (ETL) genomic variant data into Spark DataFrames. Once loaded, the data can be manipulated, filtered, quality-controlled, and converted between file formats with standard Spark operations.

  • Read and Write VCF, Plink, and BGEN with Spark
    • VCF
    • BGEN
    • PLINK
  • Read Genome Annotations (GFF3) as a Spark DataFrame
    • Schema
  • Create a Genomics Delta Lake
    • VCF to Delta Lake table notebook
  • Variant Quality Control
    • Notebook
  • Sample Quality Control
    • Computing user-defined sample QC metrics
  • Liftover
    • Create a liftOver cluster
    • Coordinate liftOver
    • Variant liftOver
  • Variant Normalization
    • normalize_variants Transformer
    • Usage
    • Options
    • normalize_variant Function
  • Split Multiallelic Variants
    • Usage
  • Merging Variant Datasets
    • Aggregating INFO fields
    • Joint genotyping
  • Hail Interoperation
    • Create a Hail cluster
    • Convert to a Glow DataFrame
    • Schema mapping
  • Utility Functions
    • Struct transformations
    • Spark ML transformations
    • Variant data transformations
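
As a rough illustration of the ETL flow described above, the following minimal sketch reads a VCF file into a Spark DataFrame, filters it, and writes it out in another format. It assumes a SparkSession whose classpath includes the Glow package (for the "vcf" data source) and Delta Lake (for the "delta" writer). The function name vcf_to_delta, its parameters, and the quality threshold are hypothetical choices made for this example, not part of Glow.

```python
def vcf_to_delta(spark, vcf_path, delta_path, min_qual=30.0):
    """Load a VCF, keep higher-quality sites, and save them as a Delta table.

    Assumptions (not taken from this page):
      * `spark` is a SparkSession with Glow and Delta Lake available.
      * The loaded DataFrame exposes a numeric `qual` column.
    """
    # Extract: Glow's "vcf" data source turns each variant row into a DataFrame row.
    variants = spark.read.format("vcf").load(vcf_path)

    # Transform: an ordinary Spark SQL filter expression on the site quality.
    filtered = variants.filter(f"qual >= {float(min_qual)}")

    # Load: write the filtered variants in a different format.
    filtered.write.format("delta").mode("overwrite").save(delta_path)
    return filtered
```

With a suitably configured session, a call such as vcf_to_delta(spark, "/path/to/input.vcf", "/path/to/delta_table") would run the whole pipeline; both paths are placeholders. The pages listed above cover each step (readers, quality control, Delta Lake) in detail.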

© Copyright 2019, Glow Authors. Revision 5dba8cc4.
