Skip to main content

R 4.0 Migration Retrospective

ยท 6 min read
Christopher J. 'CJ' Wright
Matthew R. Becker

While the R 4.0 migration has been functionally complete for quite a while, the recent migration of r-java and its dependents gives a good opportunity to write a retrospective on the technical issues with large-scale migrations in conda-forge and how we solved them.

The R 4.0 migration rebuilt every package in conda-forge that had r-base as a requirement, including more than 2200 feedstocks. A migration of this size in conda-forge faces several hurdles. First, since every feedstock is a separate GitHub repository, one needs to merge more 2200 pull requests (PRs). Second, conda-forge's packages on anaconda.org are behind a CDN (content delivery network). This service reduces web hosting costs for Anaconda Inc. but introduces an approximately 30 minute delay from when a package is uploaded to anaconda.org and when it will appear as available using conda from the command line. Thus, even if the dependencies of a package have been built, we have to wait until they appear on the CDN before we can successfully issue the next PR and have it build correctly. Finally, the existing bot and conda infrastructure limited the throughput of the migrations, due in part to the speed of the conda solver.

Given the size of the R 4.0 migration, we took this opportunity to try out a bunch of new technology to speed up large-scale migrations. The main enhancements were using GitHub Actions to automerge PRs, using mamba to quickly check for solvability of package environments, and enabling long-running migration jobs for the autotick bot. All told, the bulk of the feedstocks for R 4.0 were rebuilt in less than a week, with many PRs being merged in 30 minutes or less from when they were issued. These enhancements to the autotick bot and conda-forge infrastructure can be used to enhance future migrations (e.g., Python 3.9) and reduce maintenance burdens for feedstocks.

Automerging conda-forge PRsโ€‹

In a typical migration on conda-forge, we issue a PR to a feedstock and then ask the feedstock maintainers to make sure it passes and merge it. In the case of the R 4.0 migration, the maintainers of R packages on conda-forge use a maintenance team (i.e., @conda-forge/r) on the vast majority of feedstocks. This team is small and so merging over 2000 PRs by hand is a big undertaking. Thus, with their permission, we added the conda-forge automerge functionality to all R feedstocks that they maintain. The automerge bot, which relies on GitHub Actions, is able to automatically merge any PR from the autotick bot that passes the recipe linter, the continuous integration services, and has the special [bot automerge] slug in the PR title. This feature removed the bottleneck of waiting for maintainers to merge PRs and reduced the maintenance burden on the R maintenance team.

Checking Solvability with mambaโ€‹

While being able to automatically merge PRs removed much of the work of performing the R 4.0 migration, it relied on the PR building correctly the first time it was issued. Due to the CDN delays and the build times of a package's dependencies, the dependencies of a package may not be immediately available after all of their migration PRs are merged. If the bot issued the packages migration PR before the dependents are available, the PR would fail with an unsolvable environment and have to be restarted manually. This failure would negate any of the benefits of using automerge in the first place.

To control for this edge case, we employed the mamba package to check for the solvability of a PR's environments before the PR was issued. mamba is a fast alternative to conda that produces solutions for environments orders of magnitude more quickly. Since, we have to perform our checks of PR environments many times, an extremely fast solver was essential for making the code efficient enough to run as part of the autotick bot. We ended up using mamba to try to install the dependencies for every variant produced by the feedstock to be migrated. With this check in place, the autotick bot was able to issue migration PRs that passed on the first try and were thus automatically merged, many within 30 minutes or less.

Improving the Autotick Bot's Efficiencyโ€‹

Finally, we made several upgrades to the autotick bot infrastructure to increase the uptime of the bot and its efficiency. First, we moved from an hourly cron job to a set of chained CI jobs. This change eliminated downtime between the runs of the bot. Second, we started to refactor the autotick bot from one monolithic piece of code into a distributed set of microservices which perform various independent tasks in parallel. These independent tasks, used for things like checking the statuses of previously issued PRs, are run separately allowing the bot to spend more time issuing PRs. Finally, we optimized the internal prioritization of the PRs to make sure the bot was spending more time on larger migrations where there is more work to do. More work on the autotick bot infrastructure, including work done by Vinicius Cerutti as part of the Google Summer of Code program, will further streamline the bot's operation.

Despite some initial hiccups with the bot infrastructure, the migration ran quite smoothly for an endeavor of its size. The vast majority of migration PRs were completed within a week from when we started, which is a first for a migration of this size on conda-forge. The largest issue was solved recently, with the fixing of the openjdk recipe and the removal of aarch64 and ppc64le builds from r-java, enabling the last large piece of the R ecosystem to be updated.

Looking forward, the improvements we made for the R 4.0 migration seem broadly applicable to other migration tasks, including the yearly python minor version bump. These kinds of large-scale migrations are particularly suitable, since they usually involve few changes to the feedstock itself and usually fail on CI when a broken package would be produced. Faster migrations will help to provide the latest features to downstream users and keep transition times to a minimum, helping to foster greater stability of the ecosystem and the seamless experience users have come to expect from conda-forge.