
This lesson continues with the second workshop on reproducible science, focusing on additional open source tools for researchers and data scientists, such as the R programming language for data science and associated tools like RStudio and R Markdown. Users are also introduced to Python, IPython notebooks, and Google Colab, and are given hands-on tutorials on how to create a Binder environment as well as various containers in Docker and Singularity.

Difficulty level: Beginner
Duration: 1:16:04

This is a tutorial introducing participants to the basics of RNA-sequencing data and how to analyze its features using Seurat. 

Difficulty level: Intermediate
Duration: 1:19:17
Speaker: Sonny Chen

This tutorial demonstrates how to perform cell-type deconvolution in order to estimate how the proportions of cell types in the brain change in response to various conditions. While these techniques may be useful in addressing a wide range of scientific questions, this tutorial focuses on the cellular changes associated with major depressive disorder (MDD).

Difficulty level: Intermediate
Duration: 1:15:14
Speaker: Keon Arbabi

Longitudinal Online Research and Imaging System (LORIS) is a web-based data and project management software for neuroimaging research studies. It is an open source framework for storing and processing behavioural, clinical, neuroimaging and genetic data. LORIS also makes it easy to manage large datasets acquired over time in a longitudinal study, or at different locations in a large multi-site study.

Difficulty level: Beginner
Duration: 0:35
Speaker: Samir Das

This talk highlights a set of platform technologies, software, and data collections that close and shorten the feedback cycle in research. 

Difficulty level: Beginner
Duration: 57:52
Speaker: Satrajit Ghosh

This demonstration walks through how to import your data into MATLAB.

Difficulty level: Beginner
Duration: 6:10
Speaker: MATLAB®

This lesson covers the factors one must consider when preprocessing data to prepare it for statistical exploration and analysis.

Difficulty level: Beginner
Duration: 15:10
Speaker: MATLAB®

This tutorial outlines, step by step, how to perform analysis by group and how to do change-point detection.

Difficulty level: Beginner
Duration: 2:49
Speaker: MATLAB®

This tutorial walks through several common methods for visualizing your data in different ways depending on your data type.

Difficulty level: Beginner
Duration: 6:10
Speaker: MATLAB®

This tutorial illustrates several ways to approach predictive modeling and machine learning with MATLAB.

Difficulty level: Beginner
Duration: 6:27
Speaker: MATLAB®

This brief tutorial shows how you can work with big data as easily as you would with data of any other size.

Difficulty level: Beginner
Duration: 3:55
Speaker: MATLAB®

In this tutorial, you will learn how to deploy your models outside of your local MATLAB environment, enabling wider sharing and collaboration.

Difficulty level: Beginner
Duration: 3:52
Speaker: MATLAB®

This tutorial teaches users how to use Pandas objects to store and manipulate various datasets in Python.

Difficulty level: Beginner
Duration: 1:21:40
Speaker: Tal Yarkoni

This lesson gives a quick walkthrough of the Tidyverse, an "opinionated" collection of R packages designed for data science, including the use of readr, dplyr, tidyr, and ggplot2.

Difficulty level: Beginner
Duration: 1:01:39
Speaker: Thomas Mock

This lecture covers the rationale for developing the DAQCORD, a framework for the design, documentation, and reporting of data curation methods in order to advance the scientific rigour, reproducibility, and analysis of data.

Difficulty level: Intermediate
Duration: 17:08
Speaker: Ari Ercole

The Medical Informatics Platform (MIP) provides federated analytics for diagnosis and research in clinical neuroscience. Federated analytics is made possible by a distributed engine that executes computations and transfers information between the members of the federation (hospital nodes). In this talk, the speaker describes the process of designing and implementing new analytical tools, i.e. statistical and machine learning algorithms. Mr. Sakellariou further describes the environment in which these federated algorithms run, the challenges and the available tools, the principles that guide its design, and the general methodology followed for each new algorithm. One of the most important challenges is to design these tools in a way that does not compromise the privacy of the clinical data involved. The speaker shows how to address the main questions when designing such algorithms: how to decompose and distribute the computations, and what kind of information to exchange between nodes, in order to comply with this privacy constraint. Finally, the validation of these federated algorithms is briefly touched upon.

Difficulty level: Intermediate
Duration: 20:26
Speaker: Jason Sakellariou
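
The idea of decomposing a computation so that nodes exchange only aggregates can be illustrated with a minimal sketch. This is not the MIP engine or its API: the names `LocalSummary`, `summarize_locally`, and `combine` are hypothetical, and the example assumes a pooled mean and sample variance are the quantities of interest. Each hospital node shares only a count, a sum, and a sum of squares, never patient-level values.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class LocalSummary:
    """Aggregates a node shares with the coordinator (hypothetical type)."""
    count: int
    total: float
    total_sq: float


def summarize_locally(values: Sequence[float]) -> LocalSummary:
    """Runs inside a hospital node; raw values never leave this function."""
    return LocalSummary(
        count=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def combine(summaries: List[LocalSummary]) -> Tuple[float, float]:
    """Runs on the coordinator; returns the pooled mean and sample variance."""
    n = sum(s.count for s in summaries)
    if n < 2:
        raise ValueError("need at least two observations across all nodes")
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from pooled sufficient statistics.
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


# Illustrative data only: two nodes holding made-up measurements.
node_a = summarize_locally([1.0, 2.0, 3.0])
node_b = summarize_locally([4.0, 5.0])
pooled_mean, pooled_variance = combine([node_a, node_b])
```

The pooled result matches what a single site would compute on the combined data. Sharing aggregates alone is not a privacy guarantee (very small counts can still leak information), so a real system would need additional safeguards.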