This lesson continues with the second workshop on reproducible science, focusing on additional open source tools for researchers and data scientists, such as the R programming language for data science and associated tools like RStudio and R Markdown. Users are also introduced to Python, IPython notebooks, and Google Colab, and are given hands-on tutorials on how to create a Binder environment and how to build containers in Docker and Singularity.
This demonstration walks through how to import your data into MATLAB.
This lesson covers the factors to consider when preprocessing data in preparation for statistical exploration and analysis.
This tutorial outlines, step by step, how to perform analysis by group and how to do change-point detection.
This tutorial walks through several common methods for visualizing your data in different ways depending on your data type.
This tutorial illustrates several ways to approach predictive modeling and machine learning with MATLAB.
This brief tutorial goes over how to work with big data as easily as you would with data of any other size.
In this tutorial, you will learn how to deploy your models outside of your local MATLAB environment, enabling wider sharing and collaboration.
This lesson provides a brief overview of the Python programming language, with an emphasis on tools relevant to data scientists.
This tutorial teaches users how to use Pandas objects to help store and manipulate various datasets in Python.
This lesson gives a quick walkthrough of the Tidyverse, an "opinionated" collection of R packages designed for data science, including the use of readr, dplyr, tidyr, and ggplot2.
This lesson describes the BrainHealth Databank, a repository of many types of health-related data that aims to accelerate research, improve care, help better understand and diagnose mental illness, and support the development of new treatments and prevention strategies.
This lesson corresponds to slides 46-78 of the PDF below.
This lesson provides an overview of how to conceptualize, design, implement, and maintain neuroscientific pipelines via the cloud-based computational reproducibility platform Code Ocean.
This lesson provides an overview of how to construct computational pipelines for neurophysiological data using DataJoint.
This hands-on tutorial walks you through the DataJoint platform, highlighting features and schemas which can be used to build robust neuroscientific pipelines.
This lesson provides an introduction to DataLad, a free and open source distributed data management system that keeps track of your data, creates structure, ensures reproducibility, supports collaboration, and integrates with widely used data infrastructure.
This lesson introduces several open science tools, such as Docker and Apptainer, which can be used to develop portable and reproducible software environments.
This lecture provides a detailed description of how to incorporate HED annotation into your neuroimaging data pipeline.
This lecture covers a wide range of aspects regarding neuroinformatics and data governance, describing both their historical developments and current trajectories. Particular tools, platforms, and standards to make your research more FAIR are also discussed.
This talk highlights a set of platform technologies, software, and data collections that close and shorten the feedback cycle in research.