Advanced: Custom Types & Generic Instances

Keelung already provides different Values & Expressions, but as a developer you may find it inconvenient to translate your structure in mind to these primitive types every time. Luckily, Keelung, despite being a DSL, supports custom datatype definitions in Haskell, as long as they are built on top of Keelung primitive types (Field, Boolean, and UInt).

This might sounds a little crazy: you DON'T have to define the translations yourself! Just define a datatype and add some auxiliary lines, the Keelung library will derive the translations for you, thanks to the power of datatype-generic programming.

To achieve this, we make use of some properties held on Haskell datatypes, or more advancedly, GADTs, as well as Haskell typeclass system. There are mainly two typeclasses for custom datatypes in Keelung, Encode and Inputable, each provides different functionalities. In general, if a datatype has an Encode instance, it must has an Inputable instance, but not the other way around.

Because Keelung has already defined Encode and Inputable instances for primitive types, and generic instances for datatypes that are built on top of primitive types, GHC will check for you if the datatypes meet the conditions and generates the implementations automatically. See the examples below.

The Encode Typeclass

Types are encode-able into Keelung means they can be put in the Comp monad in the same way MerkleTree is in the signature of buildMerkleTree. We represent this property with the Encode typeclass.

E.g., we know MerkleTree, defined at line 11 in MTree.hs , is encode-able because its elements (all of which are Fields in this case) are encode-able. We can define the instances ourselves, but the whole point of generic datatypes, functions or instances is to save you from this suffering.

MTree.hs
{-# LANGUAGE DeriveGeneric #-}

module MTree where

import Keelung
import GHC.Generics hiding (UInt) -- hiding UInt to prevent conflicting with Keelung.UInt

data Tree a = Node a (Tree a) (Tree a) | Leaf a
  deriving Generic

type MerkleTree = Tree Field

instance (Encode a) => (Encode (Tree a))

buildMerkleTree :: Int -> Comp MerkleTree
buildMerkleTree = ...

In short, an Encode k instance makes datatype k possible to be used natively in Keelung by making Keelung programs of type Comp k acceptable to the compiler.

An instance of Encode k can be automatically generated, i.e. at line 13 where instance is used without a definition, if k satisfies these two properties:

  1. it is either Field, Boolean, or UInt, or a sum type or a product type of such types recursively (usually what is defined as a GADT is a sum or product type),

  2. there is a Generic instance for k, most of which should be derivable by importing GHC.Generics and adding deriving Generic after its definition if there are no foralls in its constructors.

Example of Encode

Let's look back at our Keelung program in the Merkle tree example that generates a Merkle tree:

module MerkleTree where

import Data.Foldable (foldlM)
import Hash.Poseidon
import Keelung

getMerkleProof :: Int -> Comp Field
getMerkleProof depth = do
  leaf     <- inputField Private
  siblings <- inputList2 Private depth 2
  indices  <- inputList Private depth :: Comp [Field]
  foldlM
    (\_digest (_i, p) -> hash p)
    leaf
    (zip indices siblings)

This program only generates a Field element that is the root of a Merkle tree. But what if we want a structure that actually represent the tree in question? MTree.hs shows what a datatype that represents Merkle trees and the functions defined around it may look like.

This was not possible before Keelung v0.12.0, but since v0.12.0 we can achieve this with GHC.Generics, as long as the datatypes we define are encode-able into Fields, Booleans, UInts, or their combinations.

Notice that to make buildMerkleTree possible, there are three lines are of special interest:

  • {-# LANGUAGE DeriveGeneric #-} and import GHC.Generics tells the Haskell compiler GHC that we want to use datatype-generic features, and

  • deriving Generic after a data declaration generates its datatype-generic representation for us.

  • instance (Encode a) => (Encode (Tree a)) looks like a Haskell instance definition without any implementations, which is exactly what it is. In plain text, this line means "If a is a type that can be encoded into Keelung, so is Tree a". Of course, we can declare this on simpler datatypes without parameters, e.g. instance Encode MerkleTree, which tells GHC to make MerkleTree encode-able into Keelung. But why make it ad-hoc when it can be as general as possible?

The Inputable Typeclass

Inputable is a type class added after Keelung v0.24.0, it is much more powerful, thus more limited compared to Encode. An Inputable a instance provides four functions, each having existing counterparts:

  • freshData :: Comp a generalizes freshVar, and

Generic instances of Inputable can be used in the same way as Encode.

Example of Inputable

As your Keelung programs grow, it's likely that you will encounter situations where you have multiple inputs structured in a certain way in mind, in that case instead of writing like this:

sameParent :: Comp Boolean
sameParent = do
  name1         <- inputField Public
  number1       <- inputUInt Public :: Comp (UInt 8)
  hasParent1    <- inputBool Public
  parent1number <- inputUInt Public :: Comp (UInt 8)
  name2         <- inputField Private
  number2       <- inputUInt Private :: Comp (UInt 8)
  hasParent2    <- inputBool Private
  parent2number <- inputUInt Private :: Comp (UInt 8)
  return (eq hasParent1 hasParent2 .&. eq parent1number parent2number)

You probably would want to write like this:

data Person = Person {
  name   :: Field,
  number :: UInt 8,
  parent :: (Boolean, UInt 8)
} deriving Generic

instance Inputable Person

sameParent :: Comp Boolean
sameParent = do p1 <- inputData Public  :: Comp Person
                p2 <- inputData Private :: Comp Person
                return (eqData (parent p1) (parent p2))

Limitations

It requires all conditions for Encode to make a datatype Inputable, excluding sum types, i.e. types that have multiple constructors. This is to ensure a type has fixed size such that inputs and equality make sense in the context of ZK circuits. Other than this, types that have Generic instances are supported in the same way as Encode does. You might want to use record syntax to represent structured data, like the Person datatype in the example above.

Ad-hoc Instantiation

If your datatype cannot have Generic instances derived by GHC, you can either

1. write their Generic instance yourself and generate Encode and Inputable instances generically, or

2. write Encode and Inputable separately.

Usually 2. will be the preferable option.

To know exactly what have been instantiated you may want look into our source code.

Last updated

Logo

Copyright © 2023 BTQ