Phantom ID Types

Brian Hicks, December 14, 2020

I previously wrote about tradeoffs of custom ID types in Elm, and promised to explore a few points in the space. This is one!

Recently I built a little game for my son to learn his letters and decided to see if I could find a nicer way to solve the ID-as-string problem. My goal was to find something in between the wild wild west of using Strings directly and the repetition of defining a separate ID type for every resource.

Anyway, I found something new to me. Maybe it'll be useful to you, too! The gist: you can make a reusable ID type by using phantom types (that is, types with a variable that appears in the definition but not in any of the constructors).

Let's define a module for our IDs:

module Id exposing (Id(..), decoder, sorter)

import Json.Decode as Decode exposing (Decoder)
import Sort exposing (Sorter)

type Id thing =
    Id String

decoder : Decoder (Id thing)
decoder = Decode.map Id Decode.string

sorter : Sorter (Id thing)
sorter = Sort.by (\(Id id) -> id) Sort.alphabetical

You use it in external definitions like this:

module Shelter exposing (Shelter, decoder)

import Cat exposing (Cat)
import Id exposing (Id)
import Json.Decode as Decode exposing (Decoder)

type alias Shelter =
    { name : String
    , adoptableCats : List (Id Cat)
    }

decoder : Decoder Shelter
decoder =
    Decode.map2 Shelter
        (Decode.field "name" Decode.string)
        (Decode.field "adoptableCats" (Decode.list Id.decoder))

Of course, this improvement still doesn't come for free. Id thing is still not comparable, so you need a workaround if you want to use it as a key in a Dict or a member of a Set (that's what sorter above is for).
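Here's a minimal sketch of what that workaround can look like. It assumes the Sort module comes from rtfeldman/elm-sorter-experiment, which also provides Sort.Dict. The Id type and sorter are inlined so the example stands alone, and the CatNames module, the Cat marker type, and the sample IDs and names are all made up for illustration:

```elm
module CatNames exposing (lookup, names)

import Sort exposing (Sorter)
import Sort.Dict


-- Inlined copy of the Id type from the Id module, so this compiles alone.


type Id thing
    = Id String


{-| Hypothetical marker type, used only to fill the phantom variable.
-}
type Cat
    = Cat


sorter : Sorter (Id thing)
sorter =
    Sort.by (\(Id id) -> id) Sort.alphabetical


{-| A dictionary keyed by cat IDs. The sorter stands in for `comparable`.
-}
names : Sort.Dict.Dict (Id Cat) String
names =
    Sort.Dict.empty sorter
        |> Sort.Dict.insert (Id "cat-1") "Mittens"
        |> Sort.Dict.insert (Id "cat-2") "Biscuit"


lookup : Id Cat -> Maybe String
lookup id =
    Sort.Dict.get id names
```

The cost is that the sorter has to be passed in wherever a dictionary is created, instead of getting ordering for free the way you would with a plain String key.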

You also lose some of the assurance that you're not constructing or matching on Id in places where you shouldn't be. To put it another way, you can make a bad ID pretty easily: just call Id "a hot dog is a salad". Because the type is not used in the constructor, that's a valid whatever... Id Cat, Id Dog, it's all the same to the constructor.
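To make that concrete, here's a small sketch. The PhantomDemo module, the Cat and Dog marker types, and the toString helper are hypothetical, and the Id type is inlined so the example compiles on its own:

```elm
module PhantomDemo exposing (Cat, Dog, catId, dogId, sameString)

-- Inlined copy of the Id type from the Id module, so this compiles alone.


type Id thing
    = Id String


{-| Hypothetical marker types; they only fill the phantom variable.
-}
type Cat
    = Cat


type Dog
    = Dog



-- The constructor ignores `thing`, so the same call produces either type.
-- The annotation is the only thing telling the two apart.


catId : Id Cat
catId =
    Id "a hot dog is a salad"


dogId : Id Dog
dogId =
    Id "a hot dog is a salad"


{-| Unwrap an ID of any kind to get at the underlying string.
-}
toString : Id thing -> String
toString (Id raw) =
    raw


{-| Both IDs wrap the same string, even though their types differ.
-}
sameString : Bool
sameString =
    toString catId == toString dogId



-- Once annotated, the two values have different types. Uncommenting the
-- definition below should be rejected by the compiler, because it
-- compares an Id Cat with an Id Dog:
--
-- mixedUp : Bool
-- mixedUp =
--     catId == dogId
```

So the phantom variable protects you after an ID has been given a type, but not at the moment it's constructed.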

You also can't embed the ID of a record in the record itself: if Cat is a type alias for a record with an Id Cat field, the alias refers to itself, and Elm doesn't allow recursive type aliases. I managed to get around needing this in my alphabet game (records don't need to refer to themselves, because whenever they're displayed there's another piece of data containing the ID). If you need it, you can do a little type trick:

module Cat exposing (Id, Cat, decoder)

import Id
import Json.Decode as Decode exposing (Decoder)

{-| A module-internal identifier. Has to be a custom type instead of an
alias so the compiler can detect if it's being used improperly.
-}
type CatId
    = CatId

type alias Id =
    Id.Id CatId

type alias Cat =
    { id : Id
    , name : String
    , purriness : Int
    }

decoder : Decoder Cat
decoder =
    Decode.map3 Cat
        (Decode.field "id" Id.decoder)
        (Decode.field "name" Decode.string)
        (Decode.field "purriness" Decode.int)

All the instances of CatId should be erased during compilation. That is, they shouldn't end up adding any weight to your compiled JavaScript!

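And there you have it! To sum up: with this technique, you get the normal stuff you get by using custom types as IDs, like the compiler catching you when you mix up the IDs of different resources, without having to define a separate ID type and decoder for every resource.

But with these tradeoffs: Id thing is still not comparable, anyone who can see the constructor can make an ID of any type, and a record needs a little type trick to hold its own ID.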

So would I do this again? I think I would avoid using this in a project where I didn't have high confidence that my fellow programmers knew the intent of the module. That means that I wouldn't want to publish a package using this, or contribute code to a high-traffic open source repo using this pattern. But on my work team in private code, or in small apps that I build for myself, I'll definitely be coming back to this!