Using codemods to upgrade legacy code

Last week I helped a client plan an upgrade from an internal custom JavaScript library to a more modern UI framework. The existing code uses a tool called “jingo” to manage dependencies. We wanted to support modern bundlers that use the ECMAScript Modules (import and export) syntax. A straightforward mechanical transformation exists, but the codebase consists of hundreds of files to convert. Doing it by hand would be tedious, boring work spread over days. Instead, I could write a codemod: a script that converts the syntax automatically.

In jingo, engineers write code in modules, and each module provides a list of dependencies. The developer specifies dependencies in a dot-separated format, just like a Java package namespace. At runtime, jingo converts those namespaces to URLs and adds the relevant <script> tags to the DOM before executing the module code. Anything loaded as a dependency is expected to create new global variables on window, generally inside a namespace object.

Here’s a simple example:

jingo.declare({
  require: [
    'root:lib.common.utilities',
    'root:lib.common.textBox'
  ],
  as: function() {
    // find or create a namespace for our code
    var MY_APP = window.MY_APP || (window.MY_APP = {});

    // Add a class to the namespace
    MY_APP.HelloWorldController = function(rootNode) {
      this.rootNode = rootNode;

      // TextBox is a dependency, and is expected to put a class into the MY_APP namespace
      this.textBox = new MY_APP.TextBox(this.rootNode);
    }

    // TODO: Add other class methods here
  }
})

This module depends on utilities and textBox and requires that the browser load them first. Jingo makes sure they get initialized before executing the as function. At that point, the code of this module can depend on the methods and classes its dependencies create. The engineers chose to have dependencies create known classes and functions inside of a global namespace.

If you stand back and squint at this, you’ll notice that this looks kinda like a regular ES Module, where the require array is equivalent to import statements, and the as function is the main code in the module.

An engineer could translate it into modules like this:

import '~/lib/common/utilities';
import '~/lib/common/textBox';

// find or create a namespace for our code
var MY_APP = window.MY_APP || (window.MY_APP = {});

MY_APP.HelloWorldController = function(rootNode) {
  this.rootNode = rootNode;

  // TextBox is a dependency, and is expected to put a class into the MY_APP namespace
  this.textBox = new MY_APP.TextBox(this.rootNode);
}

// Add other methods here

It would take a developer three or four minutes to translate one file from the previous syntax to the new one, and it doesn’t take much thought once they’ve completed a couple of files. But across hundreds of files it would be tedious, boring, and prone to typos. It’s also complex enough that even a well-crafted Find+Replace with regular expressions is unlikely to do the right thing.

Luckily, great tooling for this kind of transformation exists in the form of codemods: scripts that automate the transformation of JavaScript and TypeScript code.

I used a library called codemod-cli to set up a project for developing the mod. It generates a sample script and some mechanisms for writing tests against the conversions. I also found this article helpful while getting started: Getting Started with Code Mods.

The trick to actually writing a codemod is to figure out which Abstract Syntax Tree (AST) elements need modification. An AST is a data structure that represents the code itself. Rather than operating at the text layer of characters and words, we can operate at the syntax layer and look for particular code constructs like a CallExpression (function call).
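
To make that concrete, here’s a hand-written, simplified sketch of the tree a parser would produce for the statement jingo.declare({}). The node type names follow the common ESTree conventions; real parser output also carries source locations and other metadata, and the describeCallee helper is made up for this illustration.

```javascript
// Simplified, hand-written AST for the statement: jingo.declare({});
// Real parser output also includes source locations and other metadata.
const ast = {
  type: 'ExpressionStatement',
  expression: {
    type: 'CallExpression',
    callee: {
      type: 'MemberExpression',
      computed: false,
      object: { type: 'Identifier', name: 'jingo' },
      property: { type: 'Identifier', name: 'declare' },
    },
    arguments: [{ type: 'ObjectExpression', properties: [] }],
  },
};

// Hypothetical helper: turn a simple (non-computed) callee back into dotted text.
function describeCallee(callee) {
  if (callee.type === 'Identifier') {
    return callee.name;
  }
  if (callee.type === 'MemberExpression' && !callee.computed) {
    return describeCallee(callee.object) + '.' + callee.property.name;
  }
  // Anything else (computed access, nested calls, ...) is reported by node type.
  return '<' + callee.type + '>';
}

console.log(describeCallee(ast.expression.callee)); // jingo.declare
```

The point is that "a call to jingo.declare" becomes a question about node types and names, not about whitespace, line breaks, or quoting.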

In this case I wanted to inspect and replace the call to jingo.declare. I’d need to pull out the require and as properties from its function argument and use them to generate the replacement code. Through a library called jscodeshift, codemod authors can search for nodes in the AST that match the requested pattern.

A great way to work out that pattern is https://astexplorer.net. You can paste your code in and see the AST that corresponds to it. From that I could determine that I needed to find a CallExpression whose callee property is a MemberExpression with an object named jingo and a property named declare.
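
Matching a node against a pattern like this boils down to a partial, recursive comparison: every field named in the pattern has to match, and anything else on the node is ignored. Here’s a rough plain-JavaScript sketch of that idea. The matchesShape function is my own illustration, not jscodeshift’s implementation, and the nodes are hand-written rather than parsed.

```javascript
// Hypothetical helper: true when every field named in `pattern`
// matches the same field in `node`. Extra fields on `node` are ignored.
function matchesShape(node, pattern) {
  if (pattern === null || typeof pattern !== 'object') {
    return node === pattern;
  }
  if (node === null || typeof node !== 'object') {
    return false;
  }
  return Object.keys(pattern).every((key) => matchesShape(node[key], pattern[key]));
}

// The shape described above: a call whose callee is jingo.declare
const jingoDeclarePattern = {
  type: 'CallExpression',
  callee: {
    type: 'MemberExpression',
    object: { name: 'jingo' },
    property: { name: 'declare' },
  },
};

// Hand-written node for: jingo.declare()
const declareCall = {
  type: 'CallExpression',
  callee: {
    type: 'MemberExpression',
    object: { type: 'Identifier', name: 'jingo' },
    property: { type: 'Identifier', name: 'declare' },
  },
  arguments: [],
};

// Hand-written node for: console.log()
const logCall = {
  type: 'CallExpression',
  callee: {
    type: 'MemberExpression',
    object: { type: 'Identifier', name: 'console' },
    property: { type: 'Identifier', name: 'log' },
  },
  arguments: [],
};

console.log(matchesShape(declareCall, jingoDeclarePattern)); // true
console.log(matchesShape(logCall, jingoDeclarePattern)); // false
```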

Once I found that, I could inspect the function call arguments, find the array of requirements and rewrite them to import statements.
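
The rewrite of a single requirement is plain string manipulation. As a worked example, here’s a sketch that turns a list of namespaces into import lines as text. The real codemod builds AST nodes instead of strings, and toImportLines is a name I made up for this illustration.

```javascript
// Convert "root:lib.common.utilities" to "~/lib/common/utilities"
function buildImportPathFromNamespace(namespace) {
  return namespace.replace(/\./g, '/').replace(/^root:/, '~/');
}

// Hypothetical helper: one side-effect import line per required namespace
function toImportLines(namespaces) {
  return namespaces.map((ns) => `import '${buildImportPathFromNamespace(ns)}';`);
}

const lines = toImportLines([
  'root:lib.common.utilities',
  'root:lib.common.textBox',
]);

console.log(lines.join('\n'));
// import '~/lib/common/utilities';
// import '~/lib/common/textBox';
```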

Then I could find the as function and extract its body. Last, I could combine the new import statements and the function body into the replacement for the original method call.

It looks something like this:

const { getParser } = require('codemod-cli').jscodeshift;
const { getOptions } = require('codemod-cli');

/**
 * Convert "root:lib.common.utilities" to "~/lib/common/utilities"
 * @param namespace
 */
function buildImportPathFromNamespace(namespace) {
  return namespace.replace(/\./g, '/').replace(/^root:/, '~/');
}

module.exports = function transformer(file, api) {
  const j = getParser(api);
  const options = getOptions();

  const root = j(file.source);

  // we're looking for a call to jingo.declare.
  root
    .find(j.ExpressionStatement, {
      expression: {
        type: 'CallExpression',
        callee: { type: 'MemberExpression', object: { name: 'jingo' }, property: { name: 'declare' } },
      },
    })
    .replaceWith((path) => { // path wraps the matched ExpressionStatement

      // if we found one, we need to do some munging:
      // 1. Extract the require array from the configuration parameter
      // 2. Extract the "as" function from the configuration parameter
      // 3. For all the required "namespaces" (dotted names) convert them to full file paths
      // 4. Rewrite the require as ES module imports
      // 5. Stick the body of the "as" function at the end of the module

      const configAst = path.value.expression.arguments[0];
      const requireArray = configAst.properties.find((prop) => prop.key.name === 'require');
      const asFunction = configAst.properties.find((prop) => prop.key.name === 'as');

      /** The list of es module imports as AST ImportDeclaration nodes */
      let imports = requireArray.value.elements
            .filter((e) => e.type === 'StringLiteral')
            .map((e) => buildImportPathFromNamespace(e.value))
            .map((importPath) => j.importDeclaration([], j.literal(importPath)));

      // return replacement for jingo.declare, which is the list of imports and the code of the module
      return [...imports, ...asFunction.value.body.body];
    });


  return root.toSource();
};

module.exports.type = 'js';
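
One thing the script above doesn’t handle is a jingo.declare call that has no require array or no as function: the property lookup returns undefined and the transform throws. Whether such modules exist in a given codebase is an assumption on my part, but a more defensive extraction step could look roughly like this sketch. The extractDeclareParts helper is hypothetical, and it works on plain AST-shaped objects so it can run without jscodeshift.

```javascript
// Hypothetical helper: pull the namespaces and module body out of the
// object passed to jingo.declare. Returns null when the call doesn't
// have the expected shape, so the caller can leave the file untouched.
function extractDeclareParts(configAst) {
  if (!configAst || configAst.type !== 'ObjectExpression') {
    return null;
  }

  // Accept both `as: ...` and `'as': ...` style keys
  const findProp = (name) =>
    configAst.properties.find(
      (prop) => prop.key && (prop.key.name === name || prop.key.value === name)
    );

  const requireProp = findProp('require');
  const asProp = findProp('as');

  // The "as" property must be a function with a block body
  const fn = asProp && asProp.value;
  if (!fn || !fn.body || !Array.isArray(fn.body.body)) {
    return null;
  }

  // A missing or non-array "require" is treated as "no dependencies"
  const hasArray =
    requireProp && requireProp.value && requireProp.value.type === 'ArrayExpression';
  const elements = hasArray ? requireProp.value.elements : [];

  return {
    namespaces: elements
      .filter((e) => e && typeof e.value === 'string')
      .map((e) => e.value),
    bodyStatements: fn.body.body,
  };
}

// Hand-written config node for: { as: function() { ; } } (no require array)
const configWithoutRequire = {
  type: 'ObjectExpression',
  properties: [
    {
      key: { type: 'Identifier', name: 'as' },
      value: {
        type: 'FunctionExpression',
        body: { type: 'BlockStatement', body: [{ type: 'EmptyStatement' }] },
      },
    },
  ],
};

const parts = extractDeclareParts(configWithoutRequire);
console.log(parts.namespaces.length); // 0
console.log(parts.bodyStatements.length); // 1
```

Inside replaceWith, a null result would mean returning the original node unchanged instead of crashing partway through a run over hundreds of files.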

All in, the codemod wasn’t too scary. The AST Explorer website made it a breeze to inspect jscodeshift’s data model and understand what I was looking for. The docs for jscodeshift are a bit lacking, though. For example, it was hard to know what builder methods existed for generating new AST nodes (such as j.importDeclaration). But in the end I got through it.

I’ll probably be reaching for these tools again sometime. It’s good to know it’s possible to write scripts that move code to newer constructs without a lot of manual transformation.