Nadia Makarevich

Three simple tricks to speed up yarn install

Dev productivity and quality of life improvements are a passion of mine. And speeding up working with code is not only fun, but also essential for building products fast. Happier and faster developers equal happier customers, who get their features and bug fixes sooner!

When dealing with the npm-based ecosystem and its myriad packages, dependency install time, especially in large projects, can be unreasonably long. Below are three simple tricks that can help you shave off a minute or two, or sometimes even cut your yarn install time in half 😋

Not anymore!

Bloated yarn.lock problem

If your project runs on yarn, then more likely than not some of your dependencies will be duplicated, even when their version ranges are semver-compatible. Although yarn promises that deduplication isn’t necessary, that is not exactly true.

Imagine the situation: you’re adding the @awesome/tools library to your dependencies. It depends on utils@^1.0.0, and 1.0.0 is the latest version of utils at that point. After installing @awesome/tools, your yarn.lock file will contain a utils@^1.0.0 entry resolved to version 1.0.0.
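For illustration, the relevant part of yarn.lock at this point might look roughly like the fragment below. The @awesome/tools version is made up for the example, and the resolved and integrity fields that yarn normally writes are omitted for brevity.

```text
"@awesome/tools@^2.0.0":
  version "2.0.0"
  dependencies:
    utils "^1.0.0"

utils@^1.0.0:
  version "1.0.0"
```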

After a few months you want to add another library that depends on utils, let’s say @simple/button. The utils library has released a few bug fixes and features in the meantime, its latest version is now 1.5.1, and @simple/button depends on it. If you just run yarn add @simple/button, yarn.lock will end up with two separate utils entries: the old one, still resolved to 1.0.0, and a new one resolved to 1.5.1.

Even though version 1.5.1 satisfies the ^1.0.0 range, yarn will not merge the two entries into one as you’d expect, and you’ll end up with 2 versions of the same utils in the repo.
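To see this in a real lockfile, a small check like the one below can help. It is only a sketch: it writes an illustrative yarn.lock to a temporary file (the @awesome/tools and @simple/button versions are made up, and the resolved and integrity fields are left out), then uses awk to print every package name that is locked to more than one version.

```shell
#!/bin/sh
# Illustrative lockfile: package names come from the article, the
# @awesome/tools and @simple/button versions are invented for this sketch.
lockfile=$(mktemp)
cat > "$lockfile" <<'EOF'
"@awesome/tools@^2.0.0":
  version "2.0.0"
  dependencies:
    utils "^1.0.0"

"@simple/button@^1.0.0":
  version "1.0.0"
  dependencies:
    utils "^1.5.1"

utils@^1.0.0:
  version "1.0.0"

utils@^1.5.1:
  version "1.5.1"
EOF

# Print every package name that is locked to more than one version.
duplicates=$(awk '
  /^[^ #]/ {                   # entry header, e.g. utils@^1.0.0:
    name = $0
    gsub(/"/, "", name)        # drop quotes around scoped names
    sub(/,.*$/, "", name)      # keep only the first range of a merged entry
    sub(/@[^@]*$/, "", name)   # strip the trailing @range part
  }
  /^  version / { if (!seen[name, $2]++) count[name]++ }
  END { for (n in count) if (count[n] > 1) print n }
' "$lockfile")

echo "$duplicates"
rm -f "$lockfile"
```

To run the same check against a real project, point the awk program at your own yarn.lock instead of the temporary file.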

It gets worse than that. If you have a few different libraries that depend on version 1.0.0 and a few that depend on 1.5.1, yarn will hoist one of those versions to the root of the node_modules folder, but the other one will have no place to go (only one version can sit at the root), so it will be installed as copies in the node_modules folders of the libraries that use it. You’ll end up with one copy at the root plus a separate nested copy inside every library that needs the other version.

And although it seems like you only have 2 versions of the utils library, in reality there can be 5, 6, or a practically unlimited number of copies of it living in your project, all of which need to be copied into place, and all of which eat into your yarn install time.
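A quick way to get a feel for the real number is to count the copies on disk. The sketch below builds a fake node_modules layout in a temporary folder and counts the utils copies with find. The make_pkg helper and the @awesome/forms package are hypothetical, and which version ends up hoisted is an assumption made for the example.

```shell
#!/bin/sh
root=$(mktemp -d)

# Hypothetical helper: create a fake installed copy of utils at a given version.
make_pkg() {
  mkdir -p "$1"
  printf '{ "name": "utils", "version": "%s" }\n' "$2" > "$1/package.json"
}

# One version is hoisted to the root of node_modules...
make_pkg "$root/node_modules/utils" 1.5.1
# ...and every library that needs the other version gets its own nested copy.
make_pkg "$root/node_modules/@awesome/tools/node_modules/utils" 1.0.0
make_pkg "$root/node_modules/@awesome/forms/node_modules/utils" 1.0.0

# Count every copy of utils on disk, hoisted or nested.
copies=$(find "$root" -path '*/node_modules/utils/package.json' | wc -l | tr -d ' ')
echo "copies of utils on disk: $copies"

rm -rf "$root"
```

In a real repository, running the same find command from the project root (with . in place of the temporary folder) shows how many copies yarn actually has to write.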

Solution

Deduplicate! Either manually ensure that all semver-compatible libraries resolve to just one version, or use tools like yarn-deduplicate to automate it for you. What you want in your yarn.lock is a single utils entry that covers both ranges and resolves to one version.
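As a rough sketch of that target state, a deduplicated lockfile lists both ranges in a single entry that resolves to one version (the resolved and integrity fields are omitted here). With the yarn-deduplicate package, running npx yarn-deduplicate yarn.lock followed by yarn install is usually enough to get there.

```text
utils@^1.0.0, utils@^1.5.1:
  version "1.5.1"
```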

Just to give you a taste of what’s possible: in one of the projects I was working on, radical deduplication of all dependencies dropped yarn install time from 3 min to ~1.5 min. 50%!

The dark side of monorepo and workspaces

If you subscribe to the monorepo school of thought and use yarn workspaces (or Lerna, which uses workspaces underneath) to manage your packages, you can easily find yourself in a situation where dependencies in your local packages are out of sync with each other. Similar to the situation above, if one of your packages depends on utils@^1.0.0 and others on utils@^1.5.1, and those are not deduplicated in the root yarn.lock, yarn will create multiple copies of the version that is not hoisted to the root, and install them in your workspaces’ node_modules folders. In this case you can end up not only with multiple copies of the same dependency in the root node_modules, but also with further copies of it scattered across the node_modules folders of workspaces all over the repo.

In a really big project that I was working on, with more than 600 workspaces, if we had allowed as little as 10% of the dependencies to go out of sync, yarn install would have taken an hour instead of 5 min. At 50% it would just fail with an “out of memory” error halfway through :)

Solution

If migrating to Yarn 2 or anything pnpm-based is not an option (they solve this particular problem), then the only other choice is to keep dependencies in your workspaces strictly in sync with each other and the root package.json. That means matching down to the patch version number, with no reliance on semver ranges!
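A check along these lines can catch drift before it reaches the lockfile. The sketch below is an assumption-heavy example: it creates two hypothetical workspaces (app and ui) under a packages folder in a temporary directory, then compares the version strings they declare for utils using grep and sed. A real setup would loop over every dependency, or use a dedicated tool.

```shell
#!/bin/sh
repo=$(mktemp -d)
mkdir -p "$repo/packages/app" "$repo/packages/ui"

# Two hypothetical workspaces that declare utils differently.
cat > "$repo/packages/app/package.json" <<'EOF'
{
  "name": "app",
  "dependencies": {
    "utils": "^1.0.0"
  }
}
EOF
cat > "$repo/packages/ui/package.json" <<'EOF'
{
  "name": "ui",
  "dependencies": {
    "utils": "1.5.1"
  }
}
EOF

# Collect every distinct version string declared for utils.
declared=$(grep -h '"utils":' "$repo"/packages/*/package.json \
  | sed 's/.*"utils": *"\([^"]*\)".*/\1/' | sort -u)
count=$(printf '%s\n' "$declared" | wc -l | tr -d ' ')

if [ "$count" -gt 1 ]; then
  echo "utils is out of sync across workspaces:"
  printf '%s\n' "$declared"
else
  echo "utils is in sync"
fi

rm -rf "$repo"
```

Run in CI, a check like this can fail the build whenever the count is greater than one.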

Nuke node_modules situation

For some reason, when something weird is happening after updating dependencies in yarn, the very first solution that everyone recommends is to nuke the node_modules folder and do a fresh yarn install. It’s usually the last suggestion as well, since it magically fixes 90% of the weirdness cases. After a while, doing rm -rf node_modules && yarn install develops into a habit that you repeat every time you check out the main branch and start working on a new feature. All those boring, boring, booooring minutes watching yarn reinstall everything…

If only it could be sped up a little bit, at least by 50%…

Solution

Inside your root node_modules folder there is a secret file, .yarn-integrity, where yarn writes down everything it needs to know about your repository and its installed dependencies. If the content of this file doesn’t match the situation in the repo, yarn will update it and refresh the installed packages. And if the file is missing, yarn will regenerate it based on yarn.lock and update the content of node_modules with whatever is supposed to be there according to yarn.lock. It does this without removing node_modules first, so if most of the content there is already correct, it just replaces the incorrect parts.

See where this is going? 😉 Instead of the rm -rf node_modules && yarn install habit, do

rm -rf node_modules/.yarn-integrity && yarn install

🙃 This usually halves the time it takes to run yarn install (tested on at least 3 projects of various sizes).

I hope you enjoyed these three little secrets, have applied them to your repo, and your yarn install is now faster than ever! 🥳