How we run NPM packages in the browser

JavaScript has never had an official solution for distributing packages, and every web platform (Rails, Django, etc.) has its own idea of how to structure and package JavaScript. In the last few years NPM has increasingly become the canonical way of distributing packages, with Webpack as the build system, but there's no way to load NPM packages in the browser without a server-side component.

Scrimba is a platform for interactive coding screencasts where you can run the code at any moment in time. Being able to use NPM packages has been a frequently requested feature, but it turns out to be surprisingly difficult to implement. In this article I will explain how we tackled the problem and the final solution we ended up with.

Our solution (inspired by StackBlitz) runs almost entirely on the client side. Dependency resolution is done server-side, but fetching NPM files, parsing require() calls, resolving require paths and creating a bundle are all done on the client. This (combined with IndexedDB caching) gives a smooth experience with very low latency.

What is involved in supporting NPM?

There are two very distinct steps in supporting NPM: Dependency resolution and module loading.

Dependency resolution is about going from a dependency requirement,

"dependencies": {
  "react": "^16.0.0",
  "react-dom": "^16.0.0"
}

to a concrete description of which packages need to be installed, and where:

node_modules/react react@16.0.0
node_modules/react-dom react-dom@16.0.0
node_modules/fbjs fbjs@0.8.16
...
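
To make this step concrete, here is a minimal sketch of resolving caret ranges against a list of available versions. It is not how Yarn does it: the `registry` object, its contents and all function names are made up for illustration, prerelease versions are ignored, and the first match for a package wins (no conflict handling or hoisting).

```javascript
// Hypothetical in-memory registry: package -> version -> its own dependencies.
// The entries are illustrative; a real resolver would query the NPM registry.
const registry = {
  react: { "15.6.2": {}, "16.0.0": { fbjs: "^0.8.16" } },
  "react-dom": { "16.0.0": { fbjs: "^0.8.16" } },
  fbjs: { "0.8.16": {}, "1.0.0": {} },
};

// Parse "1.2.3" into [1, 2, 3].
function parseVersion(version) {
  return version.split(".").map(Number);
}

// Negative if a < b, positive if a > b, zero if equal.
function compareVersions(a, b) {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 3; i++) {
    if (pa[i] !== pb[i]) return pa[i] - pb[i];
  }
  return 0;
}

// True if `version` satisfies a caret range such as "^16.0.0".
// The caret allows changes that keep the left-most non-zero component fixed.
function satisfiesCaret(version, range) {
  const min = range.slice(1);
  if (compareVersions(version, min) < 0) return false;
  const minParts = parseVersion(min);
  const parts = parseVersion(version);
  const firstNonZero = minParts.findIndex((n) => n !== 0);
  const lastFixed = firstNonZero === -1 ? 2 : firstNonZero;
  for (let i = 0; i <= lastFixed; i++) {
    if (parts[i] !== minParts[i]) return false;
  }
  return true;
}

// Pick the highest matching version of each package, then recurse into
// the dependencies of the version that was picked.
function resolveDependencies(dependencies, resolved = {}) {
  for (const [name, range] of Object.entries(dependencies)) {
    if (resolved[name]) continue;
    const matching = Object.keys(registry[name] || {})
      .filter((version) => satisfiesCaret(version, range))
      .sort(compareVersions);
    if (matching.length === 0) {
      throw new Error("No version of " + name + " satisfies " + range);
    }
    const best = matching[matching.length - 1];
    resolved[name] = best;
    resolveDependencies(registry[name][best], resolved);
  }
  return resolved;
}

const resolved = resolveDependencies({ react: "^16.0.0", "react-dom": "^16.0.0" });
console.log(resolved);
```

With the sample registry above, the two direct dependencies pull in fbjs as well, which mirrors the shape of the listing shown earlier.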

Module loading is more involved and includes:

  • Figuring out that require("react") in app.js refers to node_modules/react/index.js
  • Making sure dependencies are available before the module body is executed
  • Providing the exports, module and process variables inside the module body
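
The last point can be sketched in a few lines. This is only an illustration of the general technique (wrapping the source in a function whose parameters are the expected variables); `runModule`, `processShim` and the toy `greeter` dependency are hypothetical names, and the `process` stand-in is far smaller than what real packages may expect.

```javascript
// Minimal stand-in for Node's `process` object (assumption: only env is needed).
const processShim = { env: { NODE_ENV: "development" } };

// Run a CommonJS-style source string and return whatever it exported.
function runModule(source, requireFn) {
  const module = { exports: {} };
  // The source becomes the body of a function whose parameters are the
  // variables a Node module expects to find in scope.
  const body = new Function("require", "module", "exports", "process", source);
  body(requireFn, module, module.exports, processShim);
  return module.exports;
}

// A toy `require` that only knows about one pre-loaded dependency.
const loaded = { greeter: { greet: (name) => "Hello, " + name } };
const requireFn = (name) => {
  if (!(name in loaded)) throw new Error("Module not loaded: " + name);
  return loaded[name];
};

const result = runModule(
  'var greeter = require("greeter");\n' +
    "exports.message = greeter.greet(process.env.NODE_ENV);",
  requireFn
);
console.log(result.message); // "Hello, development"
```

Note that the toy `require` throws if a dependency is missing, which is why the second point in the list matters: everything must be loaded before the body runs.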

Dependency resolution

Yarn is a package manager for NPM which you can also use as a library. There is no public API available at the moment, but as long as you depend on a fixed version everything works out nicely. Yarn has a PackageResolver which resolves all dependencies, and with the PackageHoister you can find the paths where packages need to be installed. Yarn is decently decoupled, and although it lacks documentation about the internal classes it's quite easy to figure things out.

Module loader: Webpack

There are some online coding sandboxes (Webpackbin and CodeSandbox) which support NPM packages through server-side Webpack. Their approach is essentially as follows:

First they have a packager which takes a list of fully resolved packages (e.g. react@16.0.0,react-dom@16.0.0), generates a package.json, installs the packages and their dependencies with Yarn, and then uses Webpack to build a DLL package. A DLL package is a way of bundling all files into a single Webpack bundle. Packaging can take a while (>10s), but it's only needed once per unique list of packages and you can cache the produced package indefinitely.

When you want to execute your app.js you send it to the server, which generates a Webpack configuration (including a reference to the DLL package) and lets Webpack resolve everything for you.

There are two big disadvantages with this approach:

  • Building the initial package can take a long time. This needs to be done on the server-side and its scalability is limited by how many servers you have.
  • Packages can only be cached as a whole. This means that if you add one small dependency you must still create a whole new package.

We created a prototype together with Christian Alfoni (the creator of Webpackbin), but had some pending work when we discovered a different approach.

UPDATE: CodeSandbox recently revamped its NPM support and now uses a different approach.

Module loader: Unpkg + SystemJS

StackBlitz is an online coding sandbox which supports NPM without using Webpack at all. They briefly described their approach in a comment on the unpkg repo:

In short, SystemJS & Unpkg are an incredible duo for dev UX because they reflect what makes local dev environments great: the client is doing all of the downloading, installing, bundling, and even serving the application.

They haven't open-sourced their solution yet, but from their description we can still go through how it works. Let's say we have an app.js which contains the following:

import React from 'react';
import ReactDOM from 'react-dom';

This can be defined as a module in SystemJS and it will start doing its magic. The required paths (react and react-dom) will be extracted, and SystemJS will attempt to resolve them. By writing a SystemJS plugin we can integrate with Unpkg and fetch files directly from NPM packages.
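
The Unpkg half of that integration boils down to mapping a package, version and file path to a URL. The sketch below shows only that mapping and deliberately leaves out the SystemJS plugin wiring; `unpkgUrl` and `fetchPackageFile` are hypothetical helper names, and the fetch implementation is injectable so the function can be tried without network access.

```javascript
// Build the unpkg URL for one file inside a published package version.
function unpkgUrl(name, version, filePath) {
  const cleanPath = filePath.replace(/^\/+/, "");
  return "https://unpkg.com/" + name + "@" + version + "/" + cleanPath;
}

// Fetch a file's source text. `fetchImpl` defaults to the global fetch
// (available in browsers), but any function with the same shape will do.
async function fetchPackageFile(name, version, filePath, fetchImpl = globalThis.fetch) {
  const url = unpkgUrl(name, version, filePath);
  const response = await fetchImpl(url);
  if (!response.ok) throw new Error("Failed to fetch " + url);
  return response.text();
}

console.log(unpkgUrl("react", "16.0.0", "/index.js"));
// https://unpkg.com/react@16.0.0/index.js
```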

Based on StackBlitz's description we implemented a prototype in Scrimba, and this is what we learned:

  • It turns out that SystemJS's built-in resolve logic doesn't completely match Node's, and there are quite a few packages which depend on Node's behaviour. We ended up writing our own path resolver.
  • SystemJS is slow. Even though we cached the content of the files and made no HTTP requests at all, it took around 150ms just to resolve all dependencies. This seems to be caused by heavy use of Promises, which adds overhead when there's no asynchronicity.
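
The Promise overhead is easy to see in isolation. The snippet below is not SystemJS code, just a small illustration with made-up names: even when the answer is already sitting in a cache, a Promise-based lookup only delivers it after the current code has finished, whereas a synchronous lookup returns it immediately. Repeated across every module in a dependency graph, those deferrals may add up.

```javascript
const cache = new Map([["react", "/node_modules/react/index.js"]]);

// Promise-based lookup: the value is already known, but callers still wait.
async function resolveAsync(name) {
  return cache.get(name);
}

// Synchronous lookup: the value is returned right away.
function resolveSync(name) {
  return cache.get(name);
}

const events = [];
resolveAsync("react").then((path) => events.push("async: " + path));
events.push("sync: " + resolveSync("react"));

// Only the synchronous result has been recorded at this point; the promise
// callback runs later, once the currently running code has finished.
console.log(events.length); // 1
```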

In the end we felt we needed to re-invent many parts of SystemJS in order to get the correct behaviour, and the parts of SystemJS that remained were either slow or unused.

Module loader: Custom loader (our approach)

Our final solution uses a module loader we wrote for our use case. It consists of several decoupled parts which combine into a complete solution. The project is called "mrmanager" and is currently not open-sourced, but if there's interest we can make it happen.

First there's a virtual file system. The file system handles require path resolution and integrates with Unpkg to fetch NPM files.

var fs = new FileSystem();

// Set up the file system: mount each NPM package at its install path.
var npm = new PackageSet();
fs.mount("/node_modules/react/", npm.get("react@16.0.0"));
fs.mount("/node_modules/react-dom/", npm.get("react-dom@16.0.0"));

// Mount the user's own files at the root.
var staticMount = new StaticMount({"app.js": "require('react')"});
fs.mount("/", staticMount);

// Resolve paths relative to a directory:
var path = await fs.resolve("react", "/");

// result: /node_modules/react/index.js

// Fetch the body:
var body = await fs.fetch(path);
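
For readers curious what the path resolution involves, here is a rough, synchronous sketch of Node-style resolution over an in-memory file map. It is not mrmanager's implementation: the `files` object and every function name are assumptions made for illustration, and several of Node's rules are left out.

```javascript
// Hypothetical in-memory file system: absolute path -> file contents.
const files = {
  "/app.js": "require('react')",
  "/node_modules/react/package.json": JSON.stringify({ main: "index.js" }),
  "/node_modules/react/index.js": "module.exports = {};",
  "/node_modules/react/lib/React.js": "module.exports = {};",
};

// Join and normalise path segments, handling "." and "..".
function joinPath(...parts) {
  const out = [];
  for (const segment of parts.join("/").split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") out.pop();
    else out.push(segment);
  }
  return "/" + out.join("/");
}

// Try a path as a file first, then as a directory (package.json "main", index.js).
function resolveFileOrDirectory(path) {
  for (const candidate of [path, path + ".js", path + ".json"]) {
    if (candidate in files) return candidate;
  }
  const manifest = joinPath(path, "package.json");
  if (manifest in files) {
    const entry = joinPath(path, JSON.parse(files[manifest]).main || "index.js");
    for (const candidate of [entry, entry + ".js"]) {
      if (candidate in files) return candidate;
    }
  }
  const index = joinPath(path, "index.js");
  return index in files ? index : null;
}

// Resolve `request` as seen from the directory `fromDir`.
function resolve(request, fromDir) {
  if (/^(\.\.?\/|\/)/.test(request)) {
    // Relative or absolute request.
    const base = request.startsWith("/") ? request : joinPath(fromDir, request);
    const found = resolveFileOrDirectory(base);
    if (found) return found;
  } else {
    // Bare request: look in node_modules of fromDir and of every parent directory.
    let dir = joinPath(fromDir);
    while (true) {
      const found = resolveFileOrDirectory(joinPath(dir, "node_modules", request));
      if (found) return found;
      if (dir === "/") break;
      dir = joinPath(dir, "..");
    }
  }
  throw new Error("Cannot resolve '" + request + "' from " + fromDir);
}

console.log(resolve("react", "/")); // /node_modules/react/index.js
```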

Then we have the module system. This provides a way to register and execute modules:

var system = new ModuleSystem();

// Define module
system.define("/app.js", function(require, module, exports) {
  var React = require("react");
});

// Register mappings
system.registerResolve("react", "/", "/node_modules/react/index.js");

// Execute module
var result = system.moduleResult("/app.js");

Note that the module system does not parse require paths or fetch dependencies; you must handle this at a different level. It's your responsibility to define all modules before you attempt to execute them. This simplifies the module system to its most basic features and allows for different approaches to dependency handling.

The final piece of the puzzle is the loader. The loader combines the module system with the file system such that:

  • The loader will use the file system to fetch the code
  • It will parse the code for dependencies, fetch the dependencies, resolve all paths and make sure everything is available. This is done recursively.
  • The code will be wrapped inside a system.define(name, function() { … }) block
  • The wrapped code for all files is concatenated together to one big string
  • Once all dependencies are loaded, it will execute the code once. This will register every file to the module system, and we're now ready to invoke the main file.
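
The steps above can be sketched end to end in a simplified form. Everything here is an assumption made for illustration: the file contents, the pre-computed `resolutions` table (standing in for the file system's resolver), the regex-based require scanner (a real loader would want a proper parser), and a `define` that takes the resolution map as an extra argument instead of using separate registerResolve calls.

```javascript
// Hypothetical inputs: file contents plus pre-computed path resolutions.
const files = {
  "/app.js": 'var lib = require("tiny-lib"); exports.answer = lib.double(21);',
  "/node_modules/tiny-lib/index.js": "exports.double = function (n) { return n * 2; };",
};
const resolutions = {
  "/app.js": { "tiny-lib": "/node_modules/tiny-lib/index.js" },
};

// Naive scanner for require("...") calls.
function findRequires(source) {
  const requests = [];
  const pattern = /require\(\s*["']([^"']+)["']\s*\)/g;
  let match;
  while ((match = pattern.exec(source)) !== null) requests.push(match[1]);
  return requests;
}

// Recursively collect every file reachable from `entry`, wrap each one in a
// system.define(...) call and concatenate the result into one string.
function buildBundle(entry) {
  const seen = new Set();
  const chunks = [];
  (function visit(path) {
    if (seen.has(path)) return;
    seen.add(path);
    const source = files[path];
    const map = {};
    for (const request of findRequires(source)) {
      const target = (resolutions[path] || {})[request];
      if (!target) throw new Error("Unresolved: " + request + " from " + path);
      map[request] = target;
      visit(target);
    }
    chunks.push(
      "system.define(" + JSON.stringify(path) + ", " + JSON.stringify(map) +
        ", function (require, module, exports) {\n" + source + "\n});"
    );
  })(entry);
  return chunks.join("\n");
}

// Tiny module registry so the bundle has something to register into.
function createSystem() {
  const defs = {};
  const cache = {};
  return {
    define(path, map, factory) {
      defs[path] = { map, factory };
    },
    run(path) {
      if (cache[path]) return cache[path].exports;
      const def = defs[path];
      const module = (cache[path] = { exports: {} });
      def.factory((request) => this.run(def.map[request]), module, module.exports);
      return module.exports;
    },
  };
}

const bundle = buildBundle("/app.js"); // one big string
const system = createSystem();
new Function("system", bundle)(system); // execute once: registers every file
console.log(system.run("/app.js").answer); // 42
```

The `bundle` string is the interesting artifact: it already contains the parsed and resolved result for every file, so it can be stored and executed again without repeating that work.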

The crucial part here (compared to SystemJS) is that we combine all the processed files into one big string and (1) execute it at once and (2) cache it all together. In SystemJS the only unit you can cache is a file, so every time it runs SystemJS has to re-parse require paths, resolve those paths and so on.
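
As a rough illustration of caching the bundle as a single unit, the sketch below keys the cache on the resolved dependency set. The key format, the function names and the Map-backed store (standing in for IndexedDB, which has a different API) are all assumptions, not a description of how mrmanager stores its bundles.

```javascript
// Derive a stable cache key from the fully resolved dependency set.
function bundleCacheKey(resolved) {
  return Object.keys(resolved)
    .sort()
    .map((name) => name + "@" + resolved[name])
    .join(",");
}

// Any object with async get/set works as a store; this one keeps
// everything in memory.
function createMemoryStore() {
  const map = new Map();
  return {
    async get(key) {
      return map.get(key);
    },
    async set(key, value) {
      map.set(key, value);
    },
  };
}

// Return the cached bundle string if there is one, otherwise build and store it.
async function loadBundle(resolved, store, build) {
  const key = bundleCacheKey(resolved);
  const cached = await store.get(key);
  if (cached !== undefined) return cached;
  const bundle = await build(resolved);
  await store.set(key, bundle);
  return bundle;
}

console.log(bundleCacheKey({ "react-dom": "16.0.0", react: "16.0.0" }));
// react@16.0.0,react-dom@16.0.0
```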

This is a good example of how decoupling two systems (the module system and the dependency system) gives us greater flexibility even though we're always going to use them together. By decoupling them we have clearly defined what the module system needs in order to function. After careful consideration we found a way to cache this in the most performant way.

The end result? From a hot cache, mrmanager is capable of executing React (with ReactDOM) in 9ms instead of SystemJS' 150ms. Those milliseconds matter when you want to quickly iterate on small projects.

I must again clarify that it was not only performance which led us to a custom approach. In our SystemJS prototype we ended up using just a fraction of SystemJS, reinventing parts of it, and monkey-patching various functions. That's not a criticism of SystemJS itself, but a sign that it wasn't designed for our use case.