Making code-generation in Go more powerful with generics

As Wikipedia puts it:

Generic programming is a style of computer programming in which algorithms are written in terms of types to-be-specified-later that are then instantiated when needed for specific types provided as parameters.

More simply: you can write a function whose parameter and/or return types are left unspecified, and callers of the function provide the concrete types. A common example is a Map function:

func Map[Source, Result any](apply func(Source) Result, source []Source) []Result {
	var res []Result
	for _, item := range source {
		res = append(res, apply(item))
	}
	return res
}

samber/lo is a popular library that brings the functional helpers common in other languages to Go, almost all of which depend on generics to be type-safe.
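As a quick illustration of the kind of helper lo provides, here’s lo.Map doubling a slice of numbers (lo’s callback also receives the element’s index):

doubled := lo.Map([]int{1, 2, 3}, func(n int, _ int) int {
	return n * 2
})
// doubled == []int{2, 4, 6}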

For many years Go resisted adding generics, with code generation – where you use a tool to generate copies of Go code specialised for each of the types you’d like to use it with – suggested as the official workaround.

But the choice doesn’t need to be between code generation and generics. At incident.io, we’ve found that mixing generics with code generation provides extremely useful developer flows, and it has let us solve some tricky pain points in our codebase.

This post shares some of the ways we’ve achieved that.

Partials

Of all the ways we’ve used generics and code generation, introducing a concept of partial structs has been most transformative to the way we write code. The ability to represent a subset of a whole struct type, and track which struct fields it sets, is useful in countless situations.

The best way to motivate partials is to explain the problems they solve, and I’ll start with something I found very confusing when first learning Go: zero values.

Let’s say we have a User struct:

package domain

type User struct {
	ID string
	OrganisationID string
	Name string
	Email string
	LoginCount int
}

If I initialise a User without specifying any attributes, each field takes the zero value for its type. If the field is a string it defaults to "", while ints default to 0:

var user domain.User
user.Name == ""             // true
user.LoginCount == 0        // true

In some cases this is very useful - we don’t have to worry about nils unless we explicitly decide to use a pointer. However, it means we can’t tell the difference between “I’ve set this field to ""” and “I forgot to initialise that field”.
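To make that concrete:

var forgotten domain.User         // never initialised
explicit := domain.User{Name: ""} // deliberately set to ""

forgotten.Name == explicit.Name   // true: the two cases are indistinguishable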

That can be quite scary: for example if you tell Gorm (the database ORM we use at incident.io) to save a struct, by default it writes every field in the struct to the database.
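For example, with Gorm’s Save (assuming user is an existing record we’ve loaded, and ctx and db are in scope), a call like this writes every column back, zero values included:

// Save writes all fields: ID, OrganisationID, Name, Email and
// LoginCount are all updated, even though we only changed the name.
user.Name = "Changed Name"
db.WithContext(ctx).Save(&user)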

That’s a recipe for race conditions: if one process updates the Name of a given user, and another updates the Email, one will overwrite the other. This is easy to fix by explicitly specifying which fields to update:

func SetName(ctx context.Context, db *gorm.DB, user *domain.User) error {
	return db.WithContext(ctx).
		Model(user).
		Select("Name"). // This says "only update the Name"
		Updates(user).
		Error
}

But while easy, it’s quite inconvenient: what if we’re updating different fields at different times? Or what if, depending on the code path, we want to conditionally set certain fields but not others?
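Something like this hand-rolled version works (a rough sketch; the payload with an optional Email pointer is hypothetical), but every caller has to repeat the same dance:

// Track the fields we want to write alongside the values themselves.
fields := []string{"Name"}
user := domain.User{Name: "Changed Name"}

if payload.Email != nil {
	fields = append(fields, "Email")
	user.Email = *payload.Email
}

err := db.WithContext(ctx).
	Model(&user).
	Select(fields...). // only update the fields we tracked
	Updates(user).
	Error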

As the sketch shows, that requires us to track which fields we set alongside what we’d like to set them to. Not impossible, but it’s a lot of repeated book-keeping, which runs the risk of introducing bugs. This is where Partial comes in:

// You can find this package for your own use at:
// https://github.com/incident-io/partial
package partial

type Partial[T any] struct {
	FieldNames []string
	Subject T
}

In the most basic terms, a partial is just a list of the struct fields that have been set, plus a copy of the object with those fields populated.

This is where we can benefit from generics, as we need just a single implementation of Partial for any struct type. Without them we’d have been forced to use interface{}, which would lose us type-safety and force unsafe reflection, hurting both developer experience and runtime safety.

But with this partial, we can now pass around an object that says “a User, but just the Name”:

// Partial that represents a user with just the Name struct field set.
userPartial := partial.Partial[domain.User]{
	FieldNames: []string{"Name"},
	Subject:    domain.User{Name: "Changed Name"},
}

That looks fairly verbose, and a bit nasty, but hold on: we’ll get to that later!

The important thing is that with this Partial, we can build a wrapper around Gorm that updates the database record, writing only the fields that have been set on the partial.

It allows us to perform safe, partial updates:

func Update[T any](ctx context.Context, db *gorm.DB, partial partial.Partial[T]) (*T, error) {
	var res T
	err := db.WithContext(ctx).
		Model(&res).
		Select(partial.FieldNames...). // limit the update to fields set in the partial
		Updates(partial.Subject).      // and take the new values from the partial
		Error
	if err != nil {
		return nil, err
	}

	return &res, nil
}

// It can be called like so:
updatedUser, err := Update(ctx, db, userPartial)

Not only is this update safe, but we can omit the explicit type parameter from the call (Update[domain.User](…)) because the partial argument lets the compiler infer T.

That might seem like a small thing, but most of our code now interacts with Partials. We use them to:

  • Build database query scopes, by dynamically building a partial that can be transformed into a SQL where condition
  • Create and tweak test fixtures, adding or modifying fields on a partial in subsequent BeforeEach blocks (see the sketch after this list)
  • Rely on our database default column values, now that our database wrapper can tell when we don’t supply a column value
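As an illustration of the fixture case, here’s roughly what that looks like in a Ginkgo suite, built with the raw Partial type and the Merge helper we’ll meet later (the fixture shape is illustrative, not our real test API):

var userPartial partial.Partial[domain.User]

BeforeEach(func() {
	// The outer block sets up a sensible default user.
	userPartial = partial.Partial[domain.User]{
		FieldNames: []string{"Name"},
		Subject:    domain.User{Name: "Tommy"},
	}
})

Context("when the user has an email", func() {
	BeforeEach(func() {
		// Nested contexts layer extra fields on top of the default.
		userPartial = userPartial.Merge(partial.Partial[domain.User]{
			FieldNames: []string{"Email"},
			Subject:    domain.User{Email: "user@example.com"},
		})
	})
})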

That all said, I confess if building partials looked like this, we probably wouldn’t use them. Explicitly managing the FieldNames list is ugly, and leaves us open to mistakes like typos in the field names we want to set.

Thankfully we can improve on this.

No more generated code?

As we mentioned before, we’re finding code generation to be an amazing partner to generic abstractions, and we’ve used this pairing to significantly improve the experience of using Partials.

We use (and open-source, in incident-io/partial) two code generation helpers around Partial types.

Builders

Constructing a Partial[domain.User] by hand isn’t just fiddly: it opens us up to several kinds of error we’d like to rule out.

Remembering how it looked before:

partial.Partial[domain.User]{
	FieldNames: []string{"Name"},
	Subject:    domain.User{Name: "Changed Name"},
}

This code could go wrong by specifying field names that don’t exist on the struct, such as typos like Naem, or fields that previously existed but have since been removed. It’s also easy to add a field to the Subject, such as Email: "user@example.com", and forget to add it to FieldNames.

We solve these issues by code generating partial builders, specific for the type you want to partially represent. By adding a codegen:builder comment above the struct and running the codegen binary, we’ll generate a ‘builder’ that can be used to construct partials piece-by-piece, in a type-safe (according to the fields on the base type) manner.
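Assuming the marker is just an ordinary Go comment placed directly above the type declaration, the annotated struct looks something like this:

package domain

// codegen:builder
type User struct {
	ID             string
	OrganisationID string
	Name           string
	Email          string
	LoginCount     int
}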

Using the builder looks like this:

// Create a user partial with just name:
partial := domain.UserBuilder(
	domain.UserBuilder.Name("Changed Name"),
)

// If we see that email has been provided, add email to the partial:
if payload.Email != nil {
	partial = partial.Add(
		domain.UserBuilder.Email(*payload.Email),
	)
}

This has a few advantages.

First, your editor can auto-complete all the available fields from domain.UserBuilder., making it easy to find whatever field you may be looking for.

We also get several benefits from strictly typing codegen’d builders. Taking domain.UserBuilder.Name as an example, this function has a type signature derived from the base type’s (User) struct field, and its implementation looks like this:

func (b UserBuilderFunc) Name(value string) func(*User) []string {
	return func(subject *User) []string {
		subject.Name = value

		return []string{
			"Name",
		}
	}
}

There’s an obvious benefit to this, in that Go will error if you pass a value that isn’t a string to the Name setter.

But there’s a more subtle benefit that comes from the interaction with Partial, and the helpers we define on it:

partial := domain.UserBuilder(
	domain.UserBuilder.Name("Changed Name"),
)

// ❌
// cannot use
//   domain.SlackUserBuilder.Email(*payload.Email) (
//     value of type func(*domain.SlackUser) []string
// ) as type func(*domain.User) []string in argument to partial.Add
if payload.Email != nil {
	partial = partial.Add(
		domain.SlackUserBuilder.Email(*payload.Email),
	)
}

Functions that receive partials fix the partial base type, which means you can’t accidentally use setters for one partial base type alongside an incompatible type, as the compiler will catch it for you.

That applies to several quality-of-life helpers we define on Partial, which lends Partials a level of flexibility and ease-of-use more often found in dynamic languages like TypeScript or Ruby:

type Partial
    func New[T any](subjectPtr *T) (model Partial[T], err error)
    func (m Partial[T]) Add(opts ...func(*T) []string) Partial[T]
    func (m Partial[T]) Apply(base T) *T
    func (m Partial[T]) Empty() bool
    func (m Partial[T]) Match(otherPtr *T) bool
    func (m Partial[T]) Merge(other Partial[T]) Partial[T]
    func (m *Partial[T]) SetApply(apply func(T) *T)
    func (m Partial[T]) Without(fieldNamesToRemove ...string) Partial[T]
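As a rough sketch of how a couple of these compose (existingUser is a hypothetical domain.User we already hold, and the comments describe what we’d expect given the signatures above):

base := domain.UserBuilder(domain.UserBuilder.Name("Changed Name"))
extra := domain.UserBuilder(domain.UserBuilder.Email("user@example.com"))

merged := base.Merge(extra)        // sets both Name and Email
trimmed := merged.Without("Email") // back to just Name

// Apply should overlay the partial’s fields onto a copy of existingUser:
// updated.Name == "Changed Name", every other field is left untouched.
updated := trimmed.Apply(existingUser)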

Matchers

Having solved building and working with partial types, we wanted to find something similarly type-safe and ergonomic for writing tests against those types.

At incident.io we use Gomega to write tests, which has the excellent gstruct library for doing partial-matches on structs.

Writing a test to confirm a user name matches a value would look like this:

Expect(user).To(MatchFields(IgnoreExtras, Fields{
  "Name": Equal("Tommy"),
}))

This works, but has several disadvantages:

  1. Strings are used to pick which struct fields to match against (“Name”), so your editor can’t auto-complete the field name based on the type you’re asserting against.
  2. Just as with manually constructing partials, the compiler can’t know when a string field name doesn’t exist on the type you’re matching, as there’s nothing in the type-system that says those strings are related to the base type (User).
  3. Because you express how the field should look in Gomega matchers (Equal), the matchers – by necessity – receive interface{} parameters. That allows matchers to support any type, be it string or int, but it means the compiler can’t shout if you ask Name to Equal(3), even though we know a user’s name is a string (see the snippet after this list).
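For instance, nothing stops an assertion like this compiling; the mismatch only surfaces when the test runs and fails:

Expect(user).To(MatchFields(IgnoreExtras, Fields{
  "Name": Equal(3), // compiles fine, even though Name is a string
}))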

That last one hurt us lots, especially as we use the null library to represent database-optional fields in our struct types. Changing a field from null.String to string became an exercise in finding failing tests, rather than letting the compiler help you find the test assertions that were no longer valid.

Again, code generation can help us, and we do just that: generating ‘matchers’ just as we have ‘builders’, designed to allow type-safe construction of gstruct matchers:

// Exact match against user fields:
Expect(user).To(domain.UserMatcher(
	domain.UserMatcher.OrganisationID(orgID),
	domain.UserMatcher.Name("Tommy"),
))

// Call Match() to access the same field methods, but supporting
// Gomega matchers, for when you want more complex matching:
Expect(user).To(domain.UserMatcher(
	domain.UserMatcher.Match().ID(Not(BeEmpty())),
))

// Make invalid types a compiler error, such as when we backfill
// emails and make it a required database field:
//
// ❌
// cannot use null.StringFrom("user@example.com") (value
// of type null.String) as type string in argument to 
// domain.UserMatcher.Email
Expect(user).To(domain.UserMatcher(
	domain.UserMatcher.Email(null.StringFrom("user@example.com")),
))

Matchers like this make writing tests much easier, and help build assertions that produce richer error messages: if any field fails to match, the test failure will print the rest of the object so you can see all the things that have gone wrong, instead of just the specific failure.

Wrapping up

All this could have been achieved with lots and lots of generated, non-generic code. However, complex generated code is incredibly hard to get right, and hard to understand and debug. Generics let us put the complex work into code that is easier to read and reason about, and then generate a bunch of simple boilerplate that complements it.

We think this strikes a nice balance: a great developer experience, without making the abstraction any harder to improve than it needs to be.

If you think this would be useful, give the library a spin. We’d love to hear about how you’re using this.

Isaac Seymour
Product Engineer
