A common occurrence in legacy modernization (and other projects) is the partial replacement of a system. This typically takes shape when a coherent grouping of functionality is identified, then rebuilt as a separate system that integrates back with the original. This approach tends to be successful when the original system is overloaded, as it can (if built correctly) take a lot of the burden off the source system, especially if the new system integrates with the source system via batched data. The more a user can do in the new system without it having to interact with the original system, the more burden is taken off that source system. The trade-off is that the new system holds more of the truth, which increases the odds of conflicts between the two systems. It’s not uncommon, for example, for data to change in the old system, invalidating some of the choices made in the new. Storing data as an event log is one way to help mitigate the pain of reconciling these conflicts.

Consider the following use case:

You are building the new coverage comparison tool for an auto insurance company. Demographic data for covered individuals (age, SSN, driver’s license #, etc.) is collected in the old system. Data about covered vehicles (VIN, mileage, etc.) is also entered in the old system. Your system needs to cover the following features:

  • Grouping covered individuals in 1..n policies in order to maximize discounts (a user might not want a high-risk driver pulling up the rate for other drivers, for example)
  • Attaching vehicles to the above policies
  • Collecting coverage limits (liability, etc.) for the given policy
  • Collecting exclusions for vehicles and drivers (a user might not want a high-risk driver to be covered for a shiny new car)

Some likely sources of conflict between the old and new system include:

  • Removing a covered individual that has already been grouped into a policy
  • Removing a vehicle for which an individual has been excluded
  • Adding a new vehicle after grouping is complete
  • Adding a new covered individual after grouping is complete

By business policy, this shopping experience needs to retain in-flight comparisons for 30 days, leaving a sizeable window where this system is the only one that knows certain facts pertaining to who the user wants covered on what policy with what vehicles.

Let’s take a quick look at how a rough version of this could work in TypeScript:

Given the following edge types describing the input from the legacy system and expected output…

export type Person = {
  id: number,
  age: number,
  years_licensed: number,
  name: string,
  ssn: string
}

export type Vehicle = {
  id: number,
  make: string,
  model: string,
  year: number,
  vin: string
}

export type LegacyInput = {
  people: Person[],
  vehicles: Vehicle[]
}

export type Exclusion = {
  driver_id: number,
  vehicle_id: number
}

export type Group = {
  id: number,
  vehicles: Vehicle[],
  drivers: Person[],
  limit?: number
}

export type Output = {
  groups: Group[],
  exclusions: Exclusion[]
}

…the following input from the legacy system…

import {LegacyInput} from '../edge_types';

export let payload: LegacyInput = {
  people: [
    {
      id: 1,
      age: 35,
      years_licensed: 18,
      name: "Jimmy Jams",
      ssn: "111-11-1111"
    },
    {
      id: 2,
      age: 34,
      years_licensed: 18,
      name: "Joanie Jams",
      ssn: "222-22-2222"
    },
    {
      id: 3,
      age: 17,
      years_licensed: 1,
      name: "Jimmy Jams II",
      ssn: "333-33-3333"
    }
  ],
  vehicles: [
    {
      id: 1,
      make: "GMC",
      model: "Sonoma",
      year: 2005,
      vin: "12345678901234"
    },
    {
      id: 2,
      make: "Chevy",
      model: "Sonoma",
      year: 2005,
      vin: "43210987654321"
    }
  ]
};

…and the following code…

import {Person, Vehicle, LegacyInput, Exclusion, Group, Output} from '../shared/edge_types'

import {payload as legacyInput} from '../shared/legacy_input_payloads/initial.js';

type Command = "set" | "remove";
type TargetType = 'exclusion' | 'group' | 'groupLimit' | 'vehicleGroupAssignment';
type Target = number;
type Value = any;

enum EventIndex {Command, Type, Target, Value}

type event = [Command, TargetType, Target, Value];

let events: event[] = [
  ["set", "exclusion", 1, 2],
  ["set", "group", 1, [1, 2]],
  ["set", "group", 2, [3]],
  ["set", "groupLimit", 1, 30000],
  ["set", "groupLimit", 2, 50000],
  ["set", "vehicleGroupAssignment", 1, 1],
  ["set", "vehicleGroupAssignment", 1, 2],
  ["remove", "exclusion", 1, 2]
];

let output: Output = {
  groups: [],
  exclusions: []
};

// Force exhaustiveness checking via the `never` type
function assertNever(x: never, label: string): never {
  throw new Error("Unexpected " + label + ": " + x);
}

let handleSet = (e: event) => {
  let eventType = e[EventIndex.Type];
  switch (eventType) {
    case "exclusion":
      output.exclusions.push({ driver_id: e[EventIndex.Target], vehicle_id: e[EventIndex.Value] });
      break;
    case "group":
      output.groups.push({
        id: e[EventIndex.Target],
        vehicles: [],
        drivers: e[EventIndex.Value].map((pid: number) => {
          return legacyInput.people.find(p => p.id === pid);
        })
      });
      break;
    case "groupLimit":
      // Assumes the group was created by an earlier "group" event
      output.groups.find(g => g.id === e[EventIndex.Target])!.limit = e[EventIndex.Value];
      break;
    case "vehicleGroupAssignment":
      // Target is the group id, Value is the legacy vehicle id
      output.groups
        .find(g => g.id === e[EventIndex.Target])!
        .vehicles.push(legacyInput.vehicles.find(v => v.id === e[EventIndex.Value])!);
      break;
    default:
      assertNever(eventType, "parameter");
  }
};

let handleRemove = (e: event) => {
  let eventType = e[EventIndex.Type];
  switch(eventType) {
    case "exclusion":
      // Keep every exclusion except the one matching this driver/vehicle pair
      output.exclusions = output.exclusions.filter(r => {
        return r.driver_id !== e[EventIndex.Target] || r.vehicle_id !== e[EventIndex.Value];
      });
      break;
    case "group":
    case "groupLimit":
    case "vehicleGroupAssignment":
      throw new Error(`Parameter ${eventType} is invalid for command Remove`);
    default:
      assertNever(eventType, "parameter");
  }
}

events.forEach(e => {
  switch (e[EventIndex.Command]) {
    case "set":
      handleSet(e);
      break;
    case "remove":
      handleRemove(e);
      break;
    default:
      throw new Error("Invalid command: " + e[EventIndex.Command]);
  }
});

console.log(JSON.stringify(output));

…we get output like:

{"groups":[{"id":1,"vehicles":[{"id":1,"make":"GMC","model":"Sonoma","year":2005,"vin":"12345678901234"},{"id":2,"make":"Chevy","model":"Sonoma","year":2005,"vin":"43210987654321"}],"drivers":[{"id":1,"age":35,"years_licensed":18,"name":"Jimmy Jams","ssn":"111-11-1111"},{"id":2,"age":34,"years_licensed":18,"name":"Joanie Jams","ssn":"222-22-2222"}],"limit":30000},{"id":2,"vehicles":[],"drivers":[{"id":3,"age":17,"years_licensed":1,"name":"Jimmy Jams II","ssn":"333-33-3333"}],"limit":50000}],"exclusions":[]}

There are a few key elements here.

  1. The data sourced from the legacy system only intermingles with the data stored in the new system on demand
  2. The output (this could, for example, be used to calculate pricing for these insurance policies or fed into another system to process enrollments) can be regenerated at any time by replaying the event log over the legacy input, even if a name correction occurred in the old system (as that’s not an alterable field in the new system)
  3. The PII stored for an indeterminate amount of time in the new system is lessened - you can throw away the payload from the old system after a timeout and drastically reduce the attack surface area here
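
To make the second point a little more concrete, here is a minimal sketch of replay as a pure function. The `replay` helper, the trimmed-down `Person` and `Group` shapes, and the corrected name are all assumptions made for illustration, not part of the code above. It only handles group events, and it simply drops ids the legacy system no longer returns, which is just one possible choice.

```typescript
// Trimmed-down shapes for illustration; the real edge types carry more fields.
type Person = { id: number; name: string };
type LegacyInput = { people: Person[] };
type GroupEvent = ["set", "group", number, number[]];
type Group = { id: number; drivers: Person[] };

// Hypothetical helper: rebuild the groups from the event log plus whatever
// the legacy system currently says. Only ids live in the log.
function replay(events: GroupEvent[], input: LegacyInput): Group[] {
  return events.map(([, , groupId, personIds]) => ({
    id: groupId,
    // Assumption: ids missing from the legacy input are dropped silently.
    drivers: personIds
      .map(pid => input.people.find(p => p.id === pid))
      .filter((p): p is Person => p !== undefined)
  }));
}

const eventLog: GroupEvent[] = [["set", "group", 1, [1, 2]]];

const originalInput: LegacyInput = {
  people: [{ id: 1, name: "Jimmy Jams" }, { id: 2, name: "Joanie Jams" }]
};
// The legacy system corrects a name; the event log is untouched.
const correctedInput: LegacyInput = {
  people: [{ id: 1, name: "James Jams" }, { id: 2, name: "Joanie Jams" }]
};

const groupsBefore = replay(eventLog, originalInput);
const groupsAfter = replay(eventLog, correctedInput);
console.log(groupsBefore[0].drivers[0].name); // Jimmy Jams
console.log(groupsAfter[0].drivers[0].name);  // James Jams
```

Because the log stores only ids, the corrected name shows up on the next replay without any change to the events themselves.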

Some of this may look familiar if you’re used to using Redux in front-end apps and for good reason - storing an abbreviated set of the events processed through a Redux store is a totally valid way to handle generating the event log. This approach might also be familiar if you’ve used tools like dot-prop-immutable - the shift to immutable data in the JS world really jibes well with this approach.
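
As a rough illustration of the Redux parallel, the sketch below folds an event log with a reducer that returns a new state instead of mutating a shared `output`. No Redux import is involved. The `State` shape, the `reducer` function, and the exclusion-only event type are assumptions made for the example.

```typescript
// Hypothetical, trimmed-down event and state shapes for illustration only.
type ExclusionEvent = ["set" | "remove", "exclusion", number, number];
type Exclusion = { driver_id: number; vehicle_id: number };
type State = { readonly exclusions: readonly Exclusion[] };

const initialState: State = { exclusions: [] };

// A Redux-style reducer: (state, event) => new state, with no mutation.
function reducer(state: State, [command, , driverId, vehicleId]: ExclusionEvent): State {
  if (command === "set") {
    return {
      ...state,
      exclusions: [...state.exclusions, { driver_id: driverId, vehicle_id: vehicleId }]
    };
  }
  // command === "remove": keep everything except the matching pair
  return {
    ...state,
    exclusions: state.exclusions.filter(
      x => x.driver_id !== driverId || x.vehicle_id !== vehicleId
    )
  };
}

const exclusionLog: ExclusionEvent[] = [
  ["set", "exclusion", 1, 2],
  ["set", "exclusion", 3, 1],
  ["remove", "exclusion", 1, 2]
];

// Replaying the log is just a fold over the events.
const finalState = exclusionLog.reduce(reducer, initialState);
console.log(JSON.stringify(finalState)); // {"exclusions":[{"driver_id":3,"vehicle_id":1}]}
```

Since every step produces a fresh state, any prefix of the log can be replayed to see what the comparison looked like at that point.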

This gives you a rough start on implementing an event-log-based system for truth storage, but it only scratches the surface - in the next post, we’ll talk about strategies for conflict resolution! In the meantime, I’ll be collecting the code from this series here