Versioning & Containerization

Level: Beginner

Versioning code, data, and analysis software is essential to rigorous, open neuroscience workflows that maximize reproducibility and minimize errors.

Version control systems such as Git, code-capable notebooks such as Jupyter, and containerization tools such as Docker have become essential tools in data science.

Course Features
Versioning
Git
DataLad
Why containers?
Docker/Singularity
Lessons of this Course
1
Duration: 00:57:52

This talk highlights a set of platform technologies, software, and data collections that close and shorten the feedback cycle in research. 

2
Duration: 00:51:55

In this presentation by the OHBM OpenScienceSIG, Saskia Bollmann and Steffen Bollmann cover common scenarios where Git can be extremely valuable. The essentials covered include cloning a repository and keeping it up to date, how to create and use your own repository, and how to contribute to other projects via forking and pull requests.

3
Duration: 02:15:50

Tutorial on collaborating with Git and GitHub. This tutorial was part of the 2019 Neurohackademy, a 2-week hands-on summer institute in neuroimaging and data science held at the University of Washington eScience Institute.

4
Duration: 00:59:34

In this lesson, Yaroslav O. Halchenko describes how DataLad allows you to track and manage both your data and analysis code, thereby facilitating reliable, reproducible, and shareable research.

5
Duration: 01:29:08

DataLad is a versatile multi-tool for data management and data publication. In this session, you can learn the basic concepts and commands for version control and reproducible data analysis. You'll get to see, create, and install DataLad datasets of many shapes and sizes, master local version control workflows and provenance-captured analysis execution, and get ideas for your next data analysis project.

6
Duration: 00:56:49

Introduction to the Brain Imaging Data Structure (BIDS): a standard for organizing human neuroimaging datasets. This lecture was part of the 2018 Neurohackademy, a 2-week hands-on summer institute in neuroimaging and data science held at the University of Washington eScience Institute.

7
Duration: 01:21:59

In this presentation by the OHBM OpenScienceSIG, Tom Shaw and Steffen Bollmann cover how containers can be useful for running the same software on different platforms and sharing analysis pipelines with other researchers. They demonstrate how to build Docker containers from scratch using Neurodocker, and cover how to use containers on an HPC system with Singularity.

8

Peer Herholz gives a tour of how popular virtualization tools like Docker and Singularity are playing a crucial role in improving reproducibility and enabling high-performance computing in neuroscience.