I have the following type and two relevant functions that I intend to measure as a part of a large list fold:
Type and access functions:
data Aggregate a = Aggregate (Maybe a) (a -> Aggregate a)
get :: Aggregate a -> Maybe a
get (Aggregate get' _) = get'
put :: Aggregate a -> a -> Aggregate a
put (Aggregate _ put') = put'
First function:
updateFirst :: Maybe a -> a -> Aggregate a
updateFirst cur val = Aggregate new (updateFirst new)
  where new = mplus cur (Just val)
first :: Aggregate a
first = Aggregate Nothing (updateFirst Nothing)
Second function:
updateMinimum :: Ord a => Maybe a -> a -> Aggregate a
updateMinimum cur val = Aggregate new (updateMinimum new)
  where new = min <$> (mplus cur (Just val)) <*> Just val
minimum :: Ord a => Aggregate a
minimum = Aggregate Nothing (updateMinimum Nothing)
The functions are written in a way that the memory should be constant. Therefore, I have opted in to use the Strict language extension in GHC, which forces the thunks to be evaluated. The Weigh library was used to perform the allocation measurements:
test :: A.Aggregate Double -> Int -> Maybe Double
test agg len = A.get $ F.foldl' A.put agg (take len $ iterate (+0.3) 2.0)
testGroup :: String -> A.Aggregate Double -> Weigh ()
testGroup name agg = sequence_ $ map (\cnt -> func (str cnt) (test agg) cnt) counts
  where
    counts  = map (10^) [0 .. 6]
    str cnt = name ++ (show cnt)
main :: IO ()
main =
  mainWith
    (do setColumns [Case, Allocated, Max, GCs]
        testGroup "fst" A.first
        testGroup "min" A.minimum
    )
The Weigh output is as follows:
Case          Allocated          Max  GCs
fst1                304           96    0
fst10             2,248           96    0
fst100           21,688           96    0
fst1000         216,088           96    0
fst10000      2,160,088           96    2
fst100000    21,600,088           96   20
fst1000000  216,000,088           96  207
min1                552           96    0
min10             4,728           96    0
min100           46,488           96    0
min1000         464,088           96    0
min10000      4,967,768           96    4
min100000    49,709,656    6,537,712   44
min1000000  497,226,840  103,345,512  445
Why does GHC suddenly allocate way more memory in the inputs of size 10^5 and 10^6? My GHC version is 8.4.4. 
 
    