inline functions might make it faster:
  As shown above, procedural integration
  might remove a bunch of unnecessary
  instructions, which might make things
  run faster.
inline functions might make it slower:
  Too much inlining might cause code
  bloat, which might cause "thrashing"
  on demand-paged virtual-memory
  systems. In other words, if the
  executable size is too big, the system
  might spend most of its time going out
  to disk to fetch the next chunk of
  code.
inline functions might make it larger:
  This is the notion of code bloat, as
  described above. For example, if a
  system has 100 inline functions each
  of which expands to 100 bytes of
  executable code and is called in 100
  places, that's an increase of roughly
  1MB (100 x 100 x 100 = 1,000,000
  bytes). Is that 1MB going to cause
  problems? Who knows, but that last 1MB
  could be what makes the system
  "thrash," and that could slow things
  down.
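
The arithmetic in the code-bloat example can be checked with a few lines of C++. This is only a back-of-the-envelope sketch: the helper name inlined_code_bytes is made up for illustration, and it deliberately ignores the call instructions that inlining removes and the single out-of-line copy it replaces, so treat the result as a rough upper bound.

```cpp
#include <cstddef>

// Hypothetical helper (not part of any real tool): estimates the bytes of
// code produced when every call site receives its own copy of the body.
constexpr std::size_t inlined_code_bytes(std::size_t functions,
                                         std::size_t bytes_per_body,
                                         std::size_t call_sites_per_function) {
    return functions * bytes_per_body * call_sites_per_function;
}

// 100 functions x 100 bytes x 100 call sites = 1,000,000 bytes (about 1MB).
static_assert(inlined_code_bytes(100, 100, 100) == 1000000,
              "matches the 1MB figure in the text");
```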
inline functions might make it
  smaller: For a function call, the
  compiler often generates more code to
  push/pop registers and parameters than
  it would by inline-expanding the
  function's body. This happens with
  very small functions, and it also
  happens with large functions when the
  optimizer is able to remove a lot of
  redundant code through procedural
  integration (that is, when the
  optimizer is able to make the large
  function small).
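
As a sketch of the "very small function" case, consider accessors whose bodies are a single load. The class Rect and the function area below are hypothetical names used only for illustration. Whether the inlined form is actually smaller depends on the compiler, the target and the optimization level, so this shows the shape of the code rather than a guaranteed outcome.

```cpp
// Hypothetical class used only for illustration.
class Rect {
public:
    constexpr Rect(int width, int height) : width_(width), height_(height) {}

    // Defined inside the class body, so these are implicitly inline. Each
    // body is a single load, which is typically less code than setting up
    // and tearing down an out-of-line call.
    constexpr int width() const { return width_; }
    constexpr int height() const { return height_; }

private:
    int width_;
    int height_;
};

// After inline expansion this can reduce to two loads and a multiply.
constexpr int area(const Rect& r) { return r.width() * r.height(); }

// constexpr is used here only so the result can be checked at compile time.
static_assert(area(Rect(3, 4)) == 12, "3 x 4 rectangle");
```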
inline functions might cause
  thrashing: Inlining might increase the
  size of the binary executable, and
  that might cause thrashing.
inline functions might prevent
  thrashing: The working set size
  (number of pages that need to be in
  memory at once) might go down even if
  the executable size goes up. When f()
  calls g(), the code is often on two
  distinct pages; when the compiler
  procedurally integrates the code of
  g() into f(), the code is often on the
  same page.
inline functions might increase the
  number of cache misses: Inlining might
  cause an inner loop to span across
  multiple lines of the memory cache,
  and that might cause thrashing of the
  memory cache.
inline functions might decrease the
  number of cache misses: Inlining
  usually improves locality of reference
  within the binary code, which might
  decrease the number of cache lines
  needed to store the code of an inner
  loop. This ultimately could cause a
  CPU-bound application to run faster.
inline functions might be irrelevant
  to speed: Most systems are not
  CPU-bound. Most systems are I/O-bound,
  database-bound or network-bound,
  meaning the bottleneck in the system's
  overall performance is the file
  system, the database or the network.
  Unless your "CPU meter" is pegged at
  100%, inline functions probably won't
  make your system faster. (Even in
  CPU-bound systems, inline will help
  only when used within the bottleneck
  itself, and the bottleneck is
  typically in only a small percentage
  of the code.)
There are no simple answers: You have
  to experiment and measure to see what
  is best.
  Do not settle for simplistic answers
  like, "Never use inline functions" or
  "Always use inline functions" or "Use
  inline functions if and only if the
  function is less than N lines of
  code." These one-size-fits-all rules
  may be easy to write down, but they
  will produce sub-optimal results.
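
One way to "see what is best" is to time the code in question and compare builds. The following is a minimal sketch, not a rigorous benchmark: square, sum_of_squares and best_time_ms are hypothetical names, and the workload is a stand-in for whatever bottleneck you actually care about.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Small helper that the compiler may or may not expand inline.
inline std::uint64_t square(std::uint64_t x) { return x * x; }

// Stand-in workload: sums i*i for i in [0, n).
std::uint64_t sum_of_squares(std::uint64_t n) {
    std::uint64_t total = 0;
    for (std::uint64_t i = 0; i < n; ++i) {
        total += square(i);
    }
    return total;
}

// Runs f() `repeats` times and returns the fastest run in milliseconds.
// Taking the minimum reduces noise from other activity on the machine.
template <typename F>
double best_time_ms(F f, int repeats) {
    assert(repeats > 0);
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        auto start = std::chrono::steady_clock::now();
        f();
        auto stop = std::chrono::steady_clock::now();
        double ms =
            std::chrono::duration<double, std::milli>(stop - start).count();
        if (best < 0.0 || ms < best) {
            best = ms;
        }
    }
    return best;
}
```

A caller would typically pass a lambda that stores the result in a volatile variable, so the optimizer cannot discard the work, and then compare the timings from a normal optimized build with one where inlining is disabled (for example, the -fno-inline option in GCC and Clang). Numbers from a toy loop like this say little about a real system, so measure the real bottleneck where possible.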