I'm observing the following execution times:
- ~9s when compiled with stack build
- ~5s when compiled with stack build --profile
I'd expect the non-profiled execution time to be below 1s, so the profiled time above looks plausible; it's the non-profiled time that is abnormally slow.
A few more details:
- The program reads a relational algebra-like DSL, applies a series of rule-based transformations, and outputs a translation to SQL. Parsing is done with megaparsec. I/O is String-based and relatively small (~150 KB), so I would exclude I/O as a source of the problem. Transformations involve recursive rewriting rules over an ADT. On a few occasions, I use ugly-memo to speed up such recursive rewrites.
- Using stack 2.9.1 with LTS 18.28, ghc 8.10.7
  (EDIT: upgrading to LTS 20.11, ghc 9.2.5, does not help)
- In the cabal file:
  ghc-options:        -O2 -Wall -fno-warn-unused-do-bind -fno-warn-missing-signatures -fno-warn-name-shadowing -fno-warn-orphans
  ghc-prof-options:   -O2 -fprof-auto "-with-rtsopts=-p -V0.0001 -hc -s"
- Note that none of the above is new, yet I have never observed this behaviour before.
- I've seen this related question, but compiling with -fno-state-hack does not help.
- Compiling with -O1 doesn't help (about the same as -O2), and -O0 is significantly slower, as expected.
- The profiling information shows no culprit. The problem only shows up with non-profiled execution.
I realise I'm not giving many details. In particular, I'm not including any code snippet, because it's a large code base and I have no idea which part of it could trigger this behaviour. My point is precisely that I don't know how to narrow it down.
So my question is not "where is the problem?", but rather: what could I do to get closer to the source of the issue, given that the obvious tool (profiling) turns out to be useless in this case?
 
    