typedspace.com: tidbits on software

parsing JavaScript source code for fun

2021-08-16

A snippet for parsing some Ember.js or random JavaScript source code on disk. Deno. Acorn. TypeScript.

why

I read about machine learning on code once. That sparked my curiosity. How might I parse codebases I maintain to discover better software designs? I work in a six-plus-year-old Ember.js application. It has Ember classic patterns such as mixins, so I wanted to know what methods those mixins had.

tools for the job

I chose the following to slice and dice the codebase: Deno, Acorn, and TypeScript.

Why these, you ask?

Just because.

I've used Acorn for trivial JavaScript parsing in the past. I did not feel like trudging through the many alternatives in the JavaScript ecosystem.

I have been curious about using Deno for something too. I've started using Deno to learn TypeScript and to explore its design and ergonomics as a JavaScript and TypeScript runtime. To me, this kind of program suits Deno quite well. A small program with dashes of complexity. Today parsing a few files. Tomorrow an entire platform to handle all your JavaScript parsing needs.

the parsing program code

I version all snippets on my blog so you can download them from source control.

// we do not check the types because acorn and esprima
// types are somewhat off
// See https://github.com/acornjs/acorn/issues/946
import { parse } from "https://esm.sh/acorn@8.4?no-check";
/**
 * Extract method statements from an Ember.js classic
 * class.
 * This will not extract actions from an `actions` hash.
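 * @param {object} options
 * @param {string} options.absPath absolute path of the file to parse
 * @param {Map<string, string[]>} options.map collects method names per file
 * @returns {void}
 */
const extractMethods = (
  { absPath, map }: { absPath: string; map: Map<string, string[]> },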
) => {
  let file: Uint8Array;
  try {
    file = Deno.readFileSync(absPath);
  } catch (err) {
    console.error("could not read file", err);
    return;
  }
  const text = new TextDecoder().decode(file);
  const program = parse(text, { ecmaVersion: 2021, sourceType: "module" });

  const grabMethods = () => {
    const dec = program.body.filter(({ type }: { type: string }) =>
      type === "ExportDefaultDeclaration"
    );
    if (dec.length === 0) {
      return dec;
    }
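    // Assumes the default export is a call such as Mixin.create({ ... });
    // optional chaining avoids a crash on other kinds of default export.
    const properties = dec[0].declaration.arguments?.[0]?.properties;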
    if (!properties) {
      return [];
    }
    const methods = properties
      .filter(({ method }: { method: boolean }) => method)
      .map(({ key: { name } }: { key: { name: string } }) => name);
    return methods;
  };
  // TODO retrieve actions as well?
  const foundMethods = grabMethods();
  if (!foundMethods.length) {
    return;
  }
  map.set(absPath, foundMethods);
};

const collected = new Map();
const [root] = Deno.args;
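for (const entry of Deno.readDirSync(root)) {
  // only parse JavaScript files; tolerate a root without a trailing slash
  if (!entry.isFile || !entry.name.endsWith(".js")) continue;
  const sep = root.endsWith("/") ? "" : "/";
  extractMethods({ absPath: root + sep + entry.name, map: collected });
}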
console.log(Array.from(collected.values()).flatMap((val) => val).sort());

The program reads the files in the directory you pass in, extracts method names from each file's default export, and prints the sorted names to your terminal. Nothing fancy. This might come in handy for building more sophisticated pipelines or just exploring your codebase.
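To make the extraction step more concrete, here is a minimal sketch of the node shapes involved. It runs against a hand-built object rather than real Acorn output, so the `Property` and `ObjectExpression` interfaces below are a trimmed-down assumption of the ESTree shapes, and `prop`, `methodNames`, and `actionNames` are hypothetical helpers, not part of the snippet above. It also shows one way the `actions` TODO might be approached.

```typescript
// Trimmed-down ESTree shapes (assumption: only the fields used here).
interface Identifier { type: "Identifier"; name: string }
interface ObjectExpression { type: "ObjectExpression"; properties: Property[] }
interface Property {
  type: "Property";
  key: Identifier;
  method: boolean; // true for shorthand methods such as `save() {}`
  value: ObjectExpression | { type: string };
}

// Same idea as grabMethods: keep only properties flagged as methods.
const methodNames = (obj: ObjectExpression): string[] =>
  obj.properties.filter((p) => p.method).map((p) => p.key.name);

// Hypothetical extension: look inside an `actions: { ... }` hash as well.
const actionNames = (obj: ObjectExpression): string[] => {
  const actions = obj.properties.find((p) => p.key.name === "actions");
  if (!actions || actions.value.type !== "ObjectExpression") return [];
  return methodNames(actions.value as ObjectExpression);
};

// Hypothetical helper to build Property nodes by hand.
const prop = (
  name: string,
  method: boolean,
  value: Property["value"] = { type: "FunctionExpression" },
): Property => ({ type: "Property", key: { type: "Identifier", name }, method, value });

// Roughly the object argument of:
// Mixin.create({ title: "x", save() {}, actions: { submit() {} } })
const mixinBody: ObjectExpression = {
  type: "ObjectExpression",
  properties: [
    prop("title", false, { type: "Literal" }),
    prop("save", true),
    prop("actions", false, {
      type: "ObjectExpression",
      properties: [prop("submit", true)],
    }),
  ],
};

console.log(methodNames(mixinBody)); // ["save"]
console.log(actionNames(mixinBody)); // ["submit"]
```

One caveat with this approach: the `method` flag is only set for shorthand methods, so a property written as `save: function () {}` would be skipped by both helpers.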

You run the program with something like:

#!/usr/bin/env sh
deno run --allow-read static/snippets/search.ts ~/an-ember-app/app/mixins/

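The run snippet demonstrates a key feature of Deno. Deno denies security-related capabilities such as file system access by default and requires you to explicitly grant those capabilities to your programs. Here, `--allow-read` grants the program permission to read from disk.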

This was a nice way to explore Deno for scripting data extraction from a JavaScript codebase. Along the way I learned about its TypeScript support, npm package interoperability, and some limitations of Acorn's bundled types.
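As one possible follow-up in the spirit of the design question that started this, the collected data could be inverted to show which method names appear in more than one mixin. This is only a sketch: `findSharedMethods` and the sample paths are hypothetical, and the input is assumed to have the same path-to-method-names shape as the `collected` map in the snippet.

```typescript
// Invert a "path -> method names" map into "method name -> paths",
// keeping only the names defined in more than one file.
const findSharedMethods = (
  collected: Map<string, string[]>,
): Map<string, string[]> => {
  const byName = new Map<string, string[]>();
  collected.forEach((names, path) => {
    names.forEach((name) => {
      byName.set(name, [...(byName.get(name) ?? []), path]);
    });
  });

  const shared = new Map<string, string[]>();
  byName.forEach((paths, name) => {
    if (paths.length > 1) shared.set(name, paths);
  });
  return shared;
};

// Hypothetical sample data; real input would come from the parsing program.
const sample = new Map<string, string[]>([
  ["mixins/a.js", ["save", "reset"]],
  ["mixins/b.js", ["save"]],
]);

const shared = findSharedMethods(sample);
console.log(Array.from(shared.keys())); // ["save"]
```

Names that show up in several mixins are candidates for a closer look, since mixins applied to the same class can silently override each other's methods.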