Exported for tests only: Precise control over API visibility with custom warnings

This article describes a pattern, made possible by the custom warning categories introduced in GHC 9.8, that lets library authors label internal data constructors as unsafe for general use while still exporting them for test suites.

Making invalid states unrepresentable

One of the mantras of strongly-typed functional programming is "Make invalid states unrepresentable" by construction. This leads library authors to take a variety of precautions, chief among them avoiding the export of the "raw" constructor of a data type in favour of a "smart constructor" that checks pre-conditions before returning (or not) a value of that type.

For instance, a strictly positive type (to hold integers greater than 0) could be internally defined as such:

data Positive = MkPositive Word

The underlying type is Word, which can hold any value from 0 upwards. We still want to forbid the value 0, so we hide the data constructor and only expose a “smart constructor” to our API users.

The smart constructor will be a function like this:

mkPositive :: Word -> Maybe Positive
mkPositive number =
  case number of
    0 -> Nothing
    n -> Just (MkPositive n)
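
As a minimal sketch of how this looks from the caller's side, the single-file program below reuses the definitions above and adds a hypothetical accessor, getPositive, which is not part of the original example. In a real library the definitions would sit in their own module, with an export list that leaves out MkPositive.

```haskell
-- Single-file sketch. In a library, these definitions would live in a
-- module whose export list omits the raw constructor, for example:
--   module Positive (Positive, mkPositive, getPositive) where
module Main (main) where

data Positive = MkPositive Word
  deriving (Eq, Show)

-- Smart constructor: the only sanctioned way to build a Positive.
mkPositive :: Word -> Maybe Positive
mkPositive number =
  case number of
    0 -> Nothing
    n -> Just (MkPositive n)

-- Hypothetical accessor, added for illustration only.
getPositive :: Positive -> Word
getPositive (MkPositive n) = n

main :: IO ()
main = do
  -- A valid input yields a wrapped value.
  print (getPositive <$> mkPositive 3)
  -- Zero is rejected by the smart constructor.
  print (getPositive <$> mkPositive 0)
```

Running it prints Just 3 followed by Nothing: callers can never obtain a Positive that wraps 0 as long as MkPositive stays hidden.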

However, in more complex cases, the internals of data types are still needed for writing test cases and, in the case of property testing with QuickCheck, for writing instances of the Arbitrary type class.

Let's say we have a type, and while our smart constructors are tested elsewhere, we need to run some checks on the internals themselves.

We have defined this type in our library:

data MyType
  = Constructor1 {…}
  | Constructor2 {…}
  deriving stock (Eq, Ord, Show)

And in our test suite, we have to implement the instance of Arbitrary:

instance Arbitrary MyType where
  arbitrary =
    oneof
      [ Constructor1 <$> arbitrary
      , Constructor2 <$> arbitrary
      ]

Hence the fundamental tension between hiding unsafe ways of constructing a data type and needing its internal representation for testing.
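
To make the tension concrete, here is a sketch of an Arbitrary instance for the Positive type from earlier. It assumes the QuickCheck package is available; getPositive and prop_generatedIsPositive are hypothetical names introduced for illustration. Everything is kept in one file so that it can be run directly, whereas in practice the instance would live in the test suite.

```haskell
-- Single-file sketch; in practice the instance lives in the test suite
-- and therefore needs MkPositive to be exported by the library.
module Main (main) where

import Test.QuickCheck (Arbitrary (..), quickCheck, suchThat)

data Positive = MkPositive Word
  deriving (Eq, Show)

-- Hypothetical accessor, added for illustration only.
getPositive :: Positive -> Word
getPositive (MkPositive n) = n

-- The generator bypasses the smart constructor and uses the raw
-- constructor, so it has to maintain the invariant (n > 0) by itself.
instance Arbitrary Positive where
  arbitrary = MkPositive <$> (arbitrary `suchThat` (> 0))

-- Hypothetical property: every generated value respects the invariant.
prop_generatedIsPositive :: Positive -> Bool
prop_generatedIsPositive p = getPositive p > 0

main :: IO ()
main = quickCheck prop_generatedIsPositive
```

The instance only compiles outside the defining module if MkPositive is exported, which is exactly what the smart constructor was meant to prevent.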

There have been ways to inform the user of a library that their import is discouraged. Some of them are:

  1. A section of the exports dedicated to "internal use" entities
  2. DEPRECATED pragma (See createPool within the resource-pool package)
  3. Keeping the fully-exported data type in an “Internal” module, not meant for general consumption, while a "Public" module re-exports the type only.

However, the first one cannot be readily checked by tooling, and the second is not granular at all: by suppressing the deprecation warning in the module where we import the deprecated entity, we also mute every other deprecation warning for entities imported from other modules.

One method is granular but cannot benefit from automation, and the other benefits from compile-time checks but lacks granularity.

The third option brings two more issues to the table: The age-old debate on the adherence to Haskell's Package Versioning Policy (PVP) by internal modules, and a more recent problem: Our code editor might automatically import such an internal module if we use a declaration from it, when using the Haskell Language Server (HLS).

Since we have no way of raising or lowering the preferred origin of an entity for HLS, there is no way to deprioritise one module's re-exports in favour of another's.

Custom warning categories

Fortunately, GHC 9.8 implements Proposal 541 (WARNING pragmas with categories), so that warnings can be associated with user-defined categories.

The syntax is the following:

{-# WARNING in "x-my-custom-category" <entity-name> "User-facing message" #-}

The in keyword makes the warning belong to a user-defined category, whose name has to start with x-, as it is an extension point that GHC knows about.

(Check out the documentation: you'd be surprised by the number of things you can label with WARNING pragmas.)

Let us now write some code. This is our data type:

{-# LANGUAGE DerivingStrategies #-}
module MyLib
  ( SuperType(..)
  ) where

data SuperType = MkSuperType
  { a :: Int
  , b :: Bool
  }
  deriving stock (Eq, Ord, Show)

and this is the test module that imports the type:

module Main (main) where

import MyLib

main :: IO ()
main = do
  let st = MkSuperType 2 False
  print st

To classify its constructor (MkSuperType) as "exported for tests only", let's write this pragma next to the definition:

{-# WARNING in "x-unsafe-internals" MkSuperType
    "This record's constructor is exported for tests only" #-}

And while we will not see a warning when using the constructor in the module in which it was defined, using it in the test module will show us this:

  test/Main.hs:9:12: warning: [GHC-63394] [-Wx-unsafe-internals]
      In the use of data constructor ‘MkSuperType’ (imported from MyLib):
      "This record's constructor is exported for tests only"
    |
  9 |   let st = MkSuperType 2 False
    |            ^^^^^^^^^^^

As we can see, the compiler understands the category we have defined, and we can disable the corresponding warning in our test modules.

My type and constructor have the same name!

Notice that SuperType's constructor is named MkSuperType. This is fundamental, because according to the documentation of warning pragmas:

A capitalised name, such as T refers to either the type constructor T or the data constructor T, or both if both are in scope. If both are in scope, there is currently no way to specify one without the other […].

Giving a value the same name as a type is called "punning". We encounter the idea in other contexts too, such as record field puns (the NamedFieldPuns extension), where writing only the name of a field binds, or picks up, a variable of the same name in scope.
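
As a small illustration of punning on record fields, the sketch below assumes the NamedFieldPuns extension and reuses the SuperType record from earlier. The describe function is a hypothetical helper introduced for this example.

```haskell
{-# LANGUAGE NamedFieldPuns #-}
module Main (main) where

data SuperType = MkSuperType
  { a :: Int
  , b :: Bool
  }

-- Hypothetical helper: the pattern MkSuperType{a, b} binds local
-- variables named after the fields, without writing a = a, b = b.
describe :: SuperType -> String
describe MkSuperType{a, b} = show a ++ " " ++ show b

main :: IO ()
main = do
  -- Puns also work when constructing a record: the local a and b
  -- fill the fields of the same name.
  let a = 2
      b = False
  putStrLn (describe MkSuperType{a, b})
```

This prints 2 False. The punning discussed in the rest of this section is of a different kind: it is between a type and its data constructor, not between a field and a variable.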

First of all, it's okay. Haskell is not a fully dependently-typed language yet, so we can still give the same name to entities that live at the type and value levels, because they will not clash… usually.

But there is a trick. The ExplicitNamespaces extension provides the type and data keywords, allowing us to distinguish precisely between the type and its data constructor in the import list of a module, and in a WARNING pragma.

Let's get back to our code, where we instead define a record whose constructor has the same name as the data type:

data DataType = DataType
  { a :: Int
  , b :: Bool
  }
  deriving stock (Eq, Ord, Show)

{-# WARNING in "x-unsafe-internals" data DataType
    "This record's constructor is exported for tests only" #-}

Do note that the data keyword is now present in the WARNING pragma, to disambiguate the subject of the warning.

and here is the module header, which exports the data constructor along with the type:

module MyLib
  ( DataType (..)
  ) where

and using it in our test module will raise the same warning:

test/Main.hs:7:12: warning: [GHC-63394] [-Wx-unsafe-internals]
    In the use of data constructor ‘DataType’ (imported from MyLib):
    "This record's constructor is exported for tests only"
  |
7 |   let dt = DataType 3 True
  |            ^^^^^^^^

Then, we will of course suppress this warning by setting the -Wno-x-unsafe-internals option in the consumer module, like this:

{-# OPTIONS_GHC -Wno-x-unsafe-internals #-}
module Main (main) where

import MyLib
[…]

And here we go.

I want to thank the GHC team and the people who have contributed to the powerful and rich system of warning pragmas over the years, as they allow for a level of precision and granularity that is very helpful in the day-to-day practice of Haskell programming.