To start with, let's flesh out your example with enough code to work with. LiteralToken is just made up to represent whatever you are replacing a variable with.
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE StandaloneDeriving #-}
type Identifier = String
data LiteralToken = LiteralToken
deriving Show
class Pack a where
pack :: a -> LiteralToken
instance Pack Int where
pack = const LiteralToken
data Expression a where
Var :: Identifier -> Expression a
Lit :: LiteralToken -> Expression a
Tuple :: Expression a -> Expression b -> Expression (a, b)
deriving instance Show (Expression a)
emap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b
emap f e = case e of
Tuple a b -> Tuple (f a) (f b)
otherwise -> e
postmap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b
postmap f = f . emap (postmap f)
bind :: Pack a => Identifier -> a -> Expression b -> Expression b
bind ident value = postmap (substitute ident value)
Now, we can implement substitute for an Identifier. To be used in emap or postmap, substitute needs to be able to operate on all parts of the Expression tree, regardless of whether it's the same type as the variable being substituted for. That's why the type variable for the Expressions is different than the type variable for the value (this is also what user2407038 explains).
substitute :: Pack a => Identifier -> a -> Expression b -> Expression b
substitute ident value e = case e of
Var id | id == ident -> Lit (pack value)
otherwise -> e
Now we can try some examples:
example1 :: Expression (Int, Bool)
example1 = Tuple (Var "x") (Var "y")
print . bind "x" (7 :: Int) $ example1
Tuple (Lit LiteralToken) (Var "y")
If we have the wrong type for the identifier, we don't get what we want, it's substituted anyway. That's because substitute had no way to check the type of the variable it was substituting for. In the following example, both Var "x"s are replaced with the literal that was made for an Int, even though the second variable was for a Bool.
example2 :: Expression (Int, Bool)
example2 = Tuple (Var "x") (Var "x")
print . bind "x" (7 :: Int) $ example2
Tuple (Lit LiteralToken) (Lit LiteralToken)
Type safety
If we add a type variable to LiteralToken so that it somehow depends on the type of what it represents
data LiteralToken a = LiteralToken
deriving Show
class Pack a where
pack :: a -> LiteralToken a
data Expression a where
Var :: Identifier -> Expression a
Lit :: LiteralToken a -> Expression a
Tuple :: Expression a -> Expression b -> Expression (a, b)
our previous substitute will have a compiler error.
Could not deduce (b ~ a)
...
In the first argument of `pack', namely `value'
In the first argument of `Lit', namely `(pack value)'
In the expression: Lit (pack value)
substitute needs some way to check that the type it is operating on is correct. Data.Typeable solves this problem. It doesn't seem unreasonable to require that any term represented by a variable have a runtime identifiable type, so it's reasonable to add this constraint to the Expression tree itself. Alternatively, we could add a type annotation to the expression tree that can provide a proof that any term is of a specific type. We'll follow this second route. This will require a few imports.
import Data.Typeable
import Data.Maybe
Here's the expression tree extended to include type annotations.
data Expression a where
Var :: Identifier -> Expression a
Lit :: LiteralToken a -> Expression a
Tuple :: Expression a -> Expression b -> Expression (a, b)
Typed :: Typeable a => Expression a -> Expression a
emap :: (forall a. Expression a -> Expression a) -> Expression b -> Expression b
emap f e = case e of
Tuple a b -> Tuple (f a) (f b)
Typed a -> Typed (f a)
otherwise -> e
bind and substitute can now only work on Typeable variables. Substituting a variable lacking a type annotation does nothing, the variable remains untouched.
bind :: (Pack a, Typeable a) => Identifier -> a -> Expression b -> Expression b
bind ident value = postmap (substitute ident value)
substitute :: (Pack a, Typeable a) => Identifier -> a -> Expression b -> Expression b
substitute ident value e = case e of
Typed (Var id) | id == ident -> fromMaybe e . fmap Typed . gcast . Lit . pack $ value
otherwise -> e
All of the type checking and casting work in substitute is done by gcast. Typeable a from the function's signature provides the proof that the a in the LiteralToken a built by Lit . pack $ value has a Typeable instance. The Typed constructor in the case provides the proof that the output expression's type b also has a Typeable instance. Note that the code would still work if fmap Typed were removed; it is simply preserving the type annotation.
The following two functions make it easy to add type annotations
typed :: Typeable a => Expression a -> Expression a
typed t@(Typed _) = t
typed e = Typed e
typedVar :: Typeable a => Identifier -> Expression a
typedVar = Typed . Var
Our two examples now do what we'd like them to do. Both examples only replace the integer variable, despite both variable names being the same in the second example. We use the smart typedVar constructor instead of Var to build all of our variables with a type annotation.
example1 :: Expression (Int, Bool)
example1 = Tuple (typedVar "x") (typedVar "y")
print . bind "x" (7 :: Int) $ example1
Tuple (Typed (Lit LiteralToken)) (Typed (Var "y"))
example2 :: Expression (Int, Bool)
example2 = Tuple (typedVar "x") (typedVar "x")
print . bind "x" (7 :: Int) $ example2
Tuple (Typed (Lit LiteralToken)) (Typed (Var "x"))
Type inference
Just for fun, we can implement type inference on expression trees. Unlike Haskell's, this really simple type inference only works from the leaves towards the root.
inferType :: Expression a -> Expression a
inferType e = case e of
Typed t@(Typed _) -> t
t@(Tuple (Typed _) (Typed _)) -> typed t
otherwise -> e
inferTypes = postmap inferType
print . inferTypes $ example1
Typed (Tuple (Typed (Var "x")) (Typed (Var "y")))