Why I am answering
I was curious to know what the practical usages of the Pair a monad are, especially considering that the only valid implementation of (>>=) actually throws away half of the data, so I asked this question.
It turns out that the answer I accepted is a very interesting answer to this question too (and was contained in Daniel Wagner's comment under the present question, but I hadn't noticed it), so I will elaborate it a bit here, because I've learned a lot from it.
The short answer: (a,a) and Bool -> a are the same thing
So the short answer is that the types Pair a, or more simply (a,a), and Bool -> a are isomorphic: feeding a (a,a) to fst/snd is equivalent to feeding a Bool -> a function with True/False (or False/True, it's just a choice). The two isomorphisms are actually shown in the (deleted) answer from Zhiltsoff Igor:
funToPair :: (Bool -> a) -> (a, a)
funToPair f = (f True, f False)
pairToFun :: (a, a) -> Bool -> a
pairToFun (x, y) True = x
pairToFun (x, y) False = y
As a consquence, whichever way Bool -> a is a monad, (a,a)/Pair a is a monad in the same way.
But (a,a)'s Monad throws information away... so does Bool -> a!
What bugged me at this point was that it's so apparent that the Monad instance for (a,a)/Pair a, thoroughly explained in Aadit M Shah's answer, throws away data, whereas the Monad instance for Bool -> a... doesn't...?
Fatally wrong! The Monad instance for r -> a (with generic r, not just Bool) throws away information all the time! How? Let's look at the well known implementation of >>=:
instance Monad ((->) r) where
f >>= k = \ r -> k (f r) r
Now remember that a function of type x -> y is equivalent to a tuple of type (y,y,...,y) with as many entries as the number of possible values of type x (not y).
What about k? k is a binary function, let's say of type x -> y -> z, so it can be fed with the cartesian product of the values that inhabit the type x and those that inhabit the type of y. If x and y are inhabited by m and n values respectively, then k is equivalent to a tuple of type (z,z,...,z) with as many entries as m*n, and that is the information that comes with k.
Now the question is whether (>>=) makes use of all that information. No, it doesn't! (>>=) is feeding k with only n possible inputs: the second argument r is any value of type x, whereas the first argument is the single value corresponding to r via f.
If we think of mathematical functions, we are saying that, for fixed unary function f: A → B and binary function k: B×A → C, f >>= k is the unary function that does t ∈ A → k(f(t),t) ∈ C, i.e. it's a restriction of k to the curve parametrized by the equations x = f(t) and y = t.
Going back to Bool -> a, the signature of (>>=) specilizes to
(>>=) :: (Bool -> a) -> (a -> Bool -> b) -> Bool -> b
f >>= k = \r -> k (f r) r
which we can rewrite as follows, just to make the obvious more apparent:
(>>=) f k r
| r == True = k (f True) True
| r == False = k (f False) False
-- remember this
What's obvious in the above? If we feed f >>= k with True, we'll only use f True, thus throwing away the other half of the data, f False. Similarly, if we call (f >>= k) False, we throw away whatever f True is. This way of throwing away half of the information contained in k mirrors exactly what is throw away via the _ placeholders for (a,a) aka Pair a (adaptation from Ismor's answer):
instance Monad Pair where
return = pure
(Pair a b) >>= k = let Pair a' _ = k a
Pair _ b' = k b
in Pair a' b'
Indeed, if we define fst' (Pair a _) = a and snd' (Pair _ b) = b (these are mirroring fst and snd for (,)), then (>>=) can be written simply as follows
(>>=) :: Pair a -> (a -> Pair b) -> Pair b
p >>= k = (fst' (k (fst' p)),
snd' (k (snd' p)))
Which is in my opinion strikingly similar to the code I marked with -- remember this, with fst' mirrors True and snd' mirrors False.
By the way, for the following, let's take note of the type of (>>=) in the case we were allowed to write it for (a,a):
(>>=) :: (a,a) -> (a -> (b,b)) -> (b,b)
What about (a,b) when a and b are the same type?
This was the last doubt I had: since (a,b) is a Monad in b provided a is a Monoid, in the special case that the type b is equal to the type a (and b has to be a Monoid), do we get the same instance as above?
No, the question above is ill-posed, the flaw being in the part "in the special case that the type b is equal to the type a". This hypothesis is simply not possible because for a type constructor to be a Monad, it has to have one and only one free type parameter:
(a,b) is made a Monad in the free parameter b, meaning that b will be allowed to vary according to the second argument to (>>=), whereas the type a will stay the same through the (>>=) operator;
(a,a) is made a Monad in the free parameter a, meaning that a will be allowed to vary according to the second argument to (>>=), whereas... nothing, that's it, all explained in the previous part of the answer.