

All the cool data science kids seem to be using Docker these days, and being able to instantly spin up a pre-built computer with a complete development or production environment is magic.

The R community has also jumped on the Docker whale, and rOpenSci maintains dozens of pre-built Docker images. These images are well documented and there are helpful guides explaining how to get started.

This post doesn’t explain how Docker works. Instead, it’s a quick, super basic beginner’s guide about how I use these Docker images in real-world R development and research.

I’ve found two general reasons for running R in a Docker container:

Reproducibility and consistent development environments: Python virtual environments are awesome, since anyone can install all the packages/libraries your script needs in a local Python installation. R has packrat, which is incorporated into RStudio, but it’s a hassle and I hate using it and can never get it working right. Instead of using packrat, you can develop an R project within a Docker container. (This idea comes from a Twitter conversation with noamross.) Create a custom Dockerfile for your project where you install any additional packages your project needs (more on that below), and then develop your project in the browser-based RStudio from the container. (Or, be lazy like me and keep developing in your local R installation without containers and periodically check to make sure it all works in a pristine development environment in a container.) All someone needs to do to run your project is pull the Docker image for your project, which will already have all the packages and dependencies and extra files installed. In an ideal world, you can create a Dockerfile for a specific project, develop everything in RStudio in the browser, and then distribute both the Dockerfile and the project repository so others can reproduce everything.

Offloading computationally intensive stuff: I often build Bayesian models with Stan and rstanarm. Complicated Bayesian models take forever to run, though, because of long Markov chain Monte Carlo (MCMC) runs (it takes hours to run dozens of Bayesian random effects models with rstanarm::stan_glmer(), for instance). Rather than tie up my computer and make it so I can’t put it to sleep, I create a virtual server on DigitalOcean and run these intensive scripts in a Docker container there.

The easiest possible way to play with Docker is to use Kitematic, Docker’s own cross-platform GUI. Kitematic automatically installs Docker and all its dependencies, and it runs a virtual Linux machine in the background, which then runs Docker inside. With Kitematic you can search for public images and builds on Docker Hub and manage all your running containers without having to deal with the command line.

1. Search for a Docker image, like rocker/tidyverse, and create it.
2. Access a containerized RStudio installation at the URL shown in Kitematic (in this case 192.168.99.100:32769; the default username and password are both rstudio).
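
To make the custom-Dockerfile idea from the reproducibility workflow a bit more concrete, here is a minimal sketch of what a project Dockerfile might look like. It is an illustration under assumptions, not the Dockerfile from any actual project: it starts from the rocker/tidyverse image mentioned in this post, uses rstanarm as a stand-in for whatever extra packages a project needs, and copies the project to a hypothetical `/home/rstudio/project` folder.

```dockerfile
# Sketch of a project-specific image. The package name and the
# destination path are placeholders, not taken from a real project.

# Start from the rocker image mentioned in the post
# (RStudio Server plus the tidyverse packages).
FROM rocker/tidyverse

# Install the extra CRAN packages the project needs.
# install2.r is a helper script that ships with the rocker images;
# --error makes the build fail if a package cannot be installed.
RUN install2.r --error rstanarm

# Copy the project files into the default rstudio user's home directory.
# Assumes this Dockerfile sits at the root of the project repository.
COPY . /home/rstudio/project
```

Assuming this file lives at the root of the project, something like `docker build -t my-project .` followed by `docker run -p 8787:8787 my-project` should serve RStudio on port 8787 of the Docker host (`my-project` is just a made-up tag). Depending on the image version, you may also need to supply a password through a `PASSWORD` environment variable.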
