A Preview of JHM

JHM is yet another build system. JHM understands not only how to compile code, but also how to run it, how to find tests for it, and how to check it for cleanliness. It is a general framework for building, testing, and packaging code. You tell it what you want, and it will do its best to build that for you.

We currently use JHM to compile our C++ codebase, and are working on converting older projects to use it. It presents a unified interface for building, testing, and packaging code, which reduces the number of tools a programmer needs to know to get started on a project. In the rest of this post I will outline the goals behind JHM and the machinery that makes it all work. We also plan on open-sourcing JHM.

Major Goals of JHM

  1. Programmers should not have to repeat themselves.
    1. Programmers should have to write dependencies at most once.
    2. Compilation knowledge (e.g., how to compile C source to an object module) should be specified per-OS rather than per-project.
  2. Source code should not mingle with generated files.
  3. Allow for rapid iteration.
    1. Builds should be fast; rebuilding after a change should be faster.
    2. Running tests should be trivial, and possibly automatic.
  4. There should be a unified interface for compiling, testing, and checking code regardless of language.

How it works

JHM utilizes a large vocabulary to modularize the problem into well-defined pieces, which are then used by JHM’s core algorithms to build the code. The vocabulary makes the core pieces of JHM almost trivial.

Vocabulary

Each vocabulary term translates to a Python class in JHM. Each term is given along with what it inherits from (indicated using the C++ ‘:’ convention), followed by a definition, a brief explanation, and an example (if applicable). After that, there is a section listing the attributes and functions of the class.

BuildableKind:

Describes the general form of a Buildable (which we will define next). A buildable kind knows how to extract or generate certain information for a buildable, such as the buildable’s run time dependencies (requires). For example, a buildable kind could know how to read a C++ source file and list the buildables it depends on. Other buildable kinds could know how to figure out what files need to be linked together to generate an output file.

GetRequires(Buildable): Examine the given buildable and return the set of buildables which are the run-time dependencies of the buildable.

GetRunner(Buildable): Return a function which, when called, will run the buildable. A buildable kind does not have to implement this function.
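To make the BuildableKind interface concrete, here is a minimal sketch of a kind that reads C++ source text and lists its quoted includes. The names CppSourceKind, FakeBuildable, and source_text are hypothetical, and returning include paths as plain strings (rather than Buildables) is a simplification, not JHM's actual behavior.

```python
import re

# Matches lines such as: #include "thorium.h" (quoted includes only)
INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.MULTILINE)


class BuildableKind:
    """Describes the general form of a buildable."""

    def GetRequires(self, buildable):
        raise NotImplementedError

    def GetRunner(self, buildable):
        # Optional: a kind that cannot be run simply returns None here.
        return None


class CppSourceKind(BuildableKind):
    """Hypothetical kind that derives requires from quoted #include lines."""

    def GetRequires(self, buildable):
        # Assumption: the buildable exposes its text as `source_text`.
        return set(INCLUDE_RE.findall(buildable.source_text))


class FakeBuildable:
    """Stand-in for a real Buildable, used only for this illustration."""

    def __init__(self, source_text):
        self.source_text = source_text


src = FakeBuildable(
    '#include "thorium.h"\n#include <vector>\n#include "base/util.h"\n'
)
requires = CppSourceKind().GetRequires(src)
```

System headers in angle brackets are skipped in this sketch, on the assumption that only project-local files would be treated as buildables.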

Buildable:

Something which can be “built”.  All Buildables are interned, so that if the same buildable is asked for twice, both times the same object is returned. Buildable is an abstract class, and should not be instantiated.

availability: Whether or not the buildable could ever possibly exist. The function that discovers whether or not a buildable is available, FindAvailability, is implemented by the classes which inherit from Buildable.

builder: A builder is a buildable which, when run, will create this buildable.

exists: Whether or not the buildable physically exists in the file system. This tells us whether or not the Buildable needs to have a builder to exist.

kind: An instance of BuildableKind

requires: A list of buildables which need to exist before the current buildable can be run. The list is constructed by calling kind.GetRequires. Note that as individual requires are built, the set of requires which are needed may change. This means that until all requires are finished, the full set is not known.

type: The underlying type of the buildable, such as Job or Item.
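Interning, as described for Buildable, can be sketched in a few lines. This is only an illustration under stated assumptions: the class-level _interned dictionary and the key attribute are hypothetical, and in JHM the interning happens at the level of the environment rather than on the class.

```python
class Buildable:
    """Abstract base: asking for the same buildable twice yields one object."""

    _interned = {}  # hypothetical cache: (class, key) -> instance

    def __new__(cls, key):
        if cls is Buildable:
            raise TypeError("Buildable is abstract and cannot be instantiated")
        cache_key = (cls, key)
        instance = Buildable._interned.get(cache_key)
        if instance is None:
            instance = super().__new__(cls)
            instance.key = key
            Buildable._interned[cache_key] = instance
        return instance


class Item(Buildable):
    """Simplified stand-in for JHM's Item."""


a = Item("src/thorium.cc")
b = Item("src/thorium.cc")
c = Item("src/other.cc")
```

Here a and b are the same object, while c is a distinct one, so identity comparison is enough to tell whether two references name the same buildable.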

Name:

A buildable within the environment, represented primarily as a relative filesystem path. The path can be constructed from something such as a Java or Haskell module name.

rel_path: The unique identity within the environment. This is a relative path within the environment.

branch: The directory within the tree where the named file resides; the portion of the path from the start of rel_path to the last slash.

name: The portion of the path after the last slash, the “filename” when you’re talking in general about file systems.

base: The portion of the name up to, but not including the last dot.

ext_list: The portion of the name after the last dot, split by ‘.’ and represented as a list. Note that the list always has length greater than or equal to one. An empty extension is valid within the extension list (such an extension is given to Linux executables).

prefix: The portion of the base which is always the same for a name of the same Kind. For example, static libraries always start with ‘lib’. This is often an empty string.

atom: The portion of the base after the prefix. This is the kernel of the file’s identity and the portion of the name which is shared between different but related names.

kind: The buildable kind. Normally this is derived from the last extension in the extension list, although it may be manually specified at name construction time.

Overall:
    <rel_path> -> <branch> <name>
    <name> -> <base> <ext_list>
    <base> -> <prefix> <atom>
Example:
    rel_path = ‘/there/everywhere/libthorium.a’
    branch   = ‘/there/everywhere’
    name     = ‘libthorium.a’
    base     = ‘libthorium’
    ext_list = [‘a’]
    prefix   = ‘lib’
    atom     = ‘thorium’
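The decomposition above can be sketched as a small function. This is a minimal illustration, not JHM's code: split_name and NameParts are hypothetical names, and the prefix is passed in by hand, whereas in JHM it comes from the kind. The sketch splits at the last dot exactly as the definitions state, so it always yields a one-element ext_list; how names with several dots are handled is not covered here.

```python
import posixpath
from collections import namedtuple

NameParts = namedtuple(
    "NameParts", "rel_path branch name base ext_list prefix atom"
)


def split_name(rel_path, prefix=""):
    """Split a path into the pieces described in the grammar above."""
    branch, name = posixpath.split(rel_path)
    if "." in name:
        base, _, ext = name.rpartition(".")
    else:
        # No dot at all: the extension is the (valid) empty extension.
        base, ext = name, ""
    ext_list = ext.split(".")
    if not base.startswith(prefix):
        raise ValueError("%r does not start with prefix %r" % (base, prefix))
    atom = base[len(prefix):]
    return NameParts(rel_path, branch, name, base, ext_list, prefix, atom)


library = split_name("/there/everywhere/libthorium.a", prefix="lib")
executable = split_name("bin/jhm")
```

For the static library this reproduces the example values listed above, and for the extensionless executable the ext_list is a list holding one empty string.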

Repository:

A location where JHM will look when trying to turn a name into an absolute path. A repository could be a directory, a language-specific module repository (Python PyPI, Ruby Gem, Haskell Hackage, etc.), another JHM project, or a collection of JHM projects.

Contains(Item): Returns whether or not the given item is contained within the repository.
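As an illustration, the simplest kind of repository, a plain directory, might look like the sketch below. DirectoryRepository is a hypothetical name, and Contains takes a relative path string here instead of an Item to keep the example self-contained.

```python
import os
import tempfile


class DirectoryRepository:
    """Hypothetical repository backed by a directory on disk."""

    def __init__(self, root):
        self.root = root

    def Contains(self, rel_path):
        # Assumption: an entry is "contained" if a regular file exists
        # at root/rel_path.
        return os.path.isfile(os.path.join(self.root, rel_path))


# Build a throwaway directory tree to query.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "base"))
    with open(os.path.join(root, "base", "thorium.cc"), "w") as handle:
        handle.write("// placeholder source file\n")

    repo = DirectoryRepository(root)
    found = repo.Contains("base/thorium.cc")
    missing = repo.Contains("base/absent.cc")
```

Other repository types (a package index, another JHM project) would answer the same question through a different lookup, which is why the interface is a single membership test.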

Item: Buildable

A name which has been tied to a repository. Names can only ever be associated with a single repository. Items are interned on their name.

name: The instance of Name which the buildable represents

repo: The repository in which the item resides.

Job: Buildable.Item

A manipulation, transformation, and/or generator which can be run to create one or more Items. Jobs are interned on each individual output they produce. A job could be linking .o_pic files (object files compiled with -fPIC) together to build a static library, such as libthorium.a.

GetRequires(): Examines the output of the job and determines what the job will need in order to create the desired output. For example, seeing that libthorium.a comes from thorium.cc, then finding the .o_pic files needed based on the includes of thorium.cc.

output_set: The set of buildables the job will produce when it is run.
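One way to read "interned on each individual output" is that looking up a job by any of its outputs always returns the same object. The sketch below shows that reading only; the Get and Run methods, the _by_output registry, and the use of plain strings for outputs are all assumptions made for illustration.

```python
class Job:
    """Simplified stand-in for a JHM job with an output_set."""

    _by_output = {}  # hypothetical registry: output name -> Job

    def __init__(self, output_set, run):
        self.output_set = frozenset(output_set)
        self._run = run

    @classmethod
    def Get(cls, output_set, run):
        outputs = frozenset(output_set)
        # Reuse an existing job if any requested output is already claimed.
        for output in outputs:
            if output in cls._by_output:
                return cls._by_output[output]
        job = cls(outputs, run)
        for output in outputs:
            cls._by_output[output] = job
        return job

    def Run(self):
        return self._run()


def link_static_library():
    # Stand-in for the real link step; it only reports what it would produce.
    return {"libthorium.a"}


first = Job.Get({"libthorium.a"}, link_static_library)
second = Job.Get({"libthorium.a"}, link_static_library)
produced = first.Run()
```

With this arrangement two parts of a build that both need libthorium.a end up sharing one job, so the library is only ever scheduled once.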

Environment:

JHM needs a complete worldview of your project and its dependencies in order to operate. This worldview is stored in the JHM Environment (Env). An environment contains a collection of Config files, Repositories, Jobs, and Items. It is the level at which Buildables are interned. The heart of the environment is the project root. The project root is a directory which is auto-discovered by locating the nearest parent folder containing a folder named .jhm (this folder contains the JHM configuration). The project root must have the source directory as a subfolder, although it may be arbitrarily nested. The output directory should be inside the project root as well, but that is not required. Generally speaking, the folders are laid out like so:

$(project root)/
      .jhm/
      $(source_dir)/
      $(out_dir)/
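The project root discovery described above can be sketched as a walk up the directory tree. This is a minimal illustration, assuming the search starts at a given directory and checks that directory first; find_project_root is a hypothetical name, not JHM's API.

```python
import tempfile
from pathlib import Path


def find_project_root(start):
    """Return the nearest directory at or above `start` holding a .jhm folder."""
    current = Path(start).resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".jhm").is_dir():
            return candidate
    return None  # no enclosing JHM project was found


# Build a throwaway project layout and search from a nested source folder.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / ".jhm").mkdir()
    nested = root / "src" / "base"
    nested.mkdir(parents=True)

    discovered = find_project_root(nested)
    matches = discovered == root
```

Because the walk stops at the first match, a project nested inside another project resolves to the inner one.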

Algorithms

Because JHM is written in Python, the algorithms below are written in Python-like pseudocode.

Finding item availability

Producing Producables

Moving Forward

As we work to convert more projects to JHM, we continue to improve it and develop new features for it. The current version of JHM used in development at Tagged is available on GitHub at https://github.com/tagged/jhm. If you’re interested in keeping up with the Tagged team, be sure to follow us on GitHub and/or Twitter.

Cody Maloney was a software engineering intern this summer as part of the Stig team.