Contributing
Glow began as an industry collaboration between Databricks and the Regeneron Genetics Center. Glow enables scientists and engineers to work together to solve genomics problems with data.
Contributing is easy, and we will collaborate with you to extend the project.
The sections below detail how to contribute.
Raise Issues
If you get stuck or hit errors when using Glow, please raise an issue. Even if you solve the problem yourself, there’s a good chance someone else will encounter it.
Important
Please raise issues!
Join the monthly office hours
Monthly office hours provide an opportunity to keep up to date with new developments in Glow. Please send an email to glow [dot] contributors [at] gmail.com to join.
Contribute to the codebase
To contribute to Glow, please fork the repository and create a branch. Make your changes and create a pull request. It’s easy to get started.
Important
Please sign off all commits!
git commit -m "initial commit" --signoff
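The `--signoff` flag appends a `Signed-off-by:` trailer with the committer's name and email to the commit message. As a minimal sketch, the check below confirms that a commit message carries such a trailer before you open a pull request. The helper name `has_signoff` and the sample messages are hypothetical and not part of Glow.

```python
import re

# Hypothetical helper, not part of Glow: detects the trailer that
# `git commit --signoff` appends to a commit message.
SIGNOFF_PATTERN = re.compile(
    r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE
)


def has_signoff(commit_message: str) -> bool:
    """Return True if the message contains a Signed-off-by trailer line."""
    return SIGNOFF_PATTERN.search(commit_message) is not None


if __name__ == "__main__":
    signed = "initial commit\n\nSigned-off-by: Jane Doe <jane@example.com>\n"
    unsigned = "initial commit\n"
    print(has_signoff(signed))    # True
    print(has_signoff(unsigned))  # False
```

To inspect real commits, the message text could be taken from `git log --format=%B` and passed to the helper.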
1. Modify or add notebooks
As you work through the example notebooks in the docs, please document any issues you find. If you solve problems or improve code, please contribute your changes back so that others benefit and become more productive.
Export your notebook as HTML into the relevant directory under docs/source/_static/notebooks.
Then run this Python script, replacing the HTML file with your own:
python3 docs/dev/gen-nb-src.py --html docs/source/_static/notebooks/tertiary/pipe-transformer-vep.html
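If you export several notebooks at once, the same command can be repeated for each file. The sketch below only builds the command lines, one per exported HTML file, using the script path and `--html` flag shown above. The function name `build_gen_commands` is hypothetical, and the sketch assumes the script handles one HTML file per invocation.

```python
import shlex
from pathlib import Path
from typing import List

# Script path and flag are taken from the command shown above.
GEN_SCRIPT = "docs/dev/gen-nb-src.py"


def build_gen_commands(notebook_dir: str) -> List[List[str]]:
    """Build one gen-nb-src.py command per exported HTML notebook.

    Hypothetical helper: assumes one HTML file per invocation.
    """
    html_files = sorted(Path(notebook_dir).rglob("*.html"))
    return [
        ["python3", GEN_SCRIPT, "--html", path.as_posix()]
        for path in html_files
    ]


if __name__ == "__main__":
    # Print the commands for review; each list could instead be
    # passed to subprocess.run to execute it.
    for command in build_gen_commands("docs/source/_static/notebooks"):
        print(shlex.join(command))
```

Printing the commands first makes it easy to confirm that only the notebooks you changed are regenerated.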
The Glow workflow is tested in a nightly integration test in Databricks. If you add or rename notebooks, please also edit the workflow definition JSON located in docs/dev/.
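One way to catch a notebook that was added or renamed without updating the workflow definition is to compare notebook names against the strings in that JSON file. The sketch below makes no assumption about the workflow schema: it collects every string value in the JSON and reports notebooks whose file name (without `.html`) appears in none of them. The function names and the matching rule are assumptions for illustration, and the workflow file path is left for you to supply.

```python
import json
from pathlib import Path
from typing import Any, Iterator, List


def iter_strings(node: Any) -> Iterator[str]:
    """Yield every string value found anywhere in parsed JSON."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)


def unreferenced_notebooks(notebook_dir: str, workflow_json: str) -> List[str]:
    """Return notebook names that no string in the workflow JSON mentions.

    Assumption: the workflow refers to a notebook by a path or name
    that contains the HTML file's stem.
    """
    workflow = json.loads(Path(workflow_json).read_text())
    strings = list(iter_strings(workflow))
    stems = sorted({p.stem for p in Path(notebook_dir).rglob("*.html")})
    return [stem for stem in stems if not any(stem in s for s in strings)]
```

An empty result suggests every exported notebook is mentioned somewhere in the workflow definition. It does not confirm that the entry itself is correct.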
2. Improve the documentation
If you add a notebook, please reference it in the documentation, either on an existing docs page or on a new one. Other contributions to the docs include:
Tips for Glow
Spark cluster configuration and tuning
Glow use cases
Troubleshooting guides and gotchas
Fix typos, hyperlinks or paths
Better explanations of:
what code snippets in the docs mean
what cells in notebooks mean
Unit tests for notebook code
New use cases
To build the docs locally,
first create the conda environment:
cd docs
conda env create -f source/environment.yml
activate the glow docs conda environment:
conda activate glow-docs
build the docs:
make livehtml
connect to the local server via your browser at: http://127.0.0.1:8000
3. Add libraries to the Glow Docker environment
Please edit the Glow Docker files to add libraries that integrate with Glow. Only include libraries that are used directly upstream or downstream of Glow, or with the Glow pipe transformer.
Set up a Docker Hub account
Edit the genomics Docker file on your fork
This file contains command line tools, Python and R packages
Build and push the container
Use this bash script as a template
Test the container in your environment in a notebook
Once you are happy with the container and the test, open a pull request
We will build and push the container to the official projectglow Docker Hub
Point to this container in the Glow nightly continuous integration test jobs definition
Once the CircleCI continuous integration test passes, we will incorporate it into the project
4. Contribute new features / bug fixes
Here are example pull requests for new features or bug fixes that touch different aspects of the codebase,
Much of the codebase is in Scala; however, we are increasingly moving to Python. The near-term focus is on integrating with Delta streaming and sharing. In the future we will optimize code in C++.