Roland Stevenson is a data scientist and consultant who may be reached on LinkedIn.

When setting up R and RStudio Server on a cloud Linux instance, some thought should be given to implementing a workflow that facilitates collaboration and ensures R project reproducibility. There are many possible workflows to accomplish this. In this post, we offer an “opinionated” solution based on what we have found to work in a production environment. We assume all development takes place on an RStudio Server cloud Linux instance, ensuring that only one operating system needs to be supported.

We will keep the motivation for good versioning and reproducibility short: R projects evolve over time, as do the packages that they rely on. R projects that do not control package versions will eventually break and/or not be shareable or reproducible [1].

Since R is a slowly evolving language, it might be reasonable to require that a particular Linux instance have only one version of R installed. However, requiring all R users to use the same versions of all packages to facilitate collaboration is clearly out of the question. The solution is to control package versions at the project level.

We use packrat to control package versions. Already integrated with RStudio Server, packrat ensures that all installed packages are stored with the project [2], and that these packages are available when a project is opened. With packrat, we know that project A will always be able to use ggplot2 2.5.0 and project B will always be able to use ggplot2 3.1.0.
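
As a rough illustration of project-level version control with packrat, the sketch below wraps the three packrat calls involved in a typical cycle. The helper names (`start_tracking`, `record_versions`, `rebuild_library`) are hypothetical and not part of packrat. The underlying functions `packrat::init()`, `packrat::snapshot()`, and `packrat::restore()` are real. The sketch assumes packrat is installed and that `project` points at the root of an R project.

```r
# Hypothetical helpers (the names are ours); each wraps one real packrat call.
# Assumes the packrat package is installed.

# Run once per project: sets up a private package library under
# <project>/packrat and writes the lockfile packrat/packrat.lock.
start_tracking <- function(project = ".") {
  packrat::init(project = project)
}

# Run after installing or upgrading a package inside the project:
# records the versions currently in use in the project's lockfile.
record_versions <- function(project = ".") {
  packrat::snapshot(project = project)
}

# Run on a fresh checkout of the project: reinstalls the package
# versions recorded in the lockfile into the private library.
rebuild_library <- function(project = ".") {
  packrat::restore(project = project)
}
```

Because each project has its own lockfile, calling `record_versions()` in project A does not touch the versions recorded for project B, which is what lets the two projects stay on different ggplot2 releases.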