Do macros in C++ improve performance?

Question

I'm a beginner in C++ and I've just read that macros work by replacing text whenever needed. In this case, does this mean that it makes the .exe run faster? And how is this different than an inline function?

For example, if I have the following macro :

#define SQUARE(x) ((x) * (x))

and normal function :

int Square(const int& x)
{
    return x*x;
}

and inline function :

inline int Square(const int& x)
{
    return x*x;
}

What are the main differences between these three and especially between the inline function and the macro? Thank you.

Make it work first. Then, make it work fast. If correctness doesn't matter, then a function that does nothing is fastest. Generally, macros are more difficult to get right than a function. The macro vs. inline function micro-optimization is usually wasted effort unless profiling tells you otherwise. — jxh, Mar 25 '16 at 01:11
"Macros work by replacing text" and "Macros make your program faster" are two almost unrelated statements. — user253751, Mar 25 '16 at 01:12
Almost a dupe: http://stackoverflow.com/questions/1137575/inline-functions-vs-preprocessor-macros — vsoftco, Mar 25 '16 at 01:14

vsoftco · Accepted Answer · 2018-06-06T23:08:18.160

13

You should avoid using macros if possible. Inline functions are always the better choice, as they are type safe. An inline function should be as fast as a macro (if it is indeed inlined by the compiler; note that the inline keyword is not binding but just a hint to the compiler, which may ignore it if inlining is not possible).

PS: as a matter of style, avoid using const Type& for parameter types that are fundamental, like int or double. Simply use the type itself, in other words, use

int Square(int x)

since passing by value won't hurt performance (and passing by reference may even make it worse); see e.g. this question for more details.

edited Jun 06 '18 at 23:08

answered Mar 25 '16 at 00:54

vsoftco


Thank you for the quick answer and for the tips. I wrote it that way (const Type&) because in one of the courses I've read it said that copies "cost" in terms of performance. It didn't mention anything about fundamental types, but now I guess I'll use it only for the user defined types. Thanks again! – ggluta Mar 25 '16 at 01:18
@GeorgeGabriel You're welcome. The explanation in the course is correct, and you don't lose anything if you follow it. However, for fundamental types, passing by reference is as expensive as a copy, since internally a copy of the pointer to the data is passed around. And stylistically C++ programmers tend to pass fundamental types by value. – vsoftco Mar 25 '16 at 01:20
Ok, I understand. Thanks again. It's good to know and make a good habit of it, when it comes to writing code. – ggluta Mar 25 '16 at 01:29
In fairly rare cases it's acceptable to use macros to make sure code is inlined. It's definitely a heavy-handed approach, and you need to do performance testing to show that it's actually helping vs writing the code like a normal person. – Adam McKee Mar 25 '16 at 02:36
@vsoftco: The advice is right, the reason not. Fundamental types typically aren't passed using a pointer; they're passed in a CPU register. That means they can be directly operated upon, and it doesn't get faster than that. – MSalters Mar 25 '16 at 09:05

score 1 · Answer 2 · answered Mar 25 '16 at 00:58

Macros amount to blind replacement of pattern A with pattern B, and all of it happens before the compiler proper kicks in. Sometimes they come in handy, but in general they should be avoided: they let you do a lot of things, and later on, in the debugger, you have no idea what is going on.

Besides: your approach to performance is, to put it kindly, naive. First you learn the language (which is hard for modern C++, because there are a ton of important concepts and things one absolutely needs to know and understand). Then you practice, practice, practice. And then, when you really come to a point where your existing application has performance problems, you profile it to understand the real issue.

In other words: if you are interested in performance, you are asking the wrong question. You should worry much more about architecture (like: potential bottlenecks), configuration (in the sense of latency between different nodes in your system), and so on. Of course, you should apply common sense; and not write code that is obviously wasting memory or CPU cycles. But sometimes a piece of code that runs 50% slower ... might be 500% easier to read and maintain. And if execution time is then 500ms, and not 250ms; that might be totally OK (unless that specific part is called a thousand times per minute).

I am well aware that I do not master this language at all and I want to learn as much as possible. The purpose of this question was only to understand what the differences are, because I do not intend to actually improve any code that I'm writing now, as it is very simple. Thank you for your answer and for the tips. I realize that it will be a long trip in the C++ world before I actually consider improving the performance of my code. — ggluta, Mar 25 '16 at 01:11

score 1 · Answer 3 · answered Mar 25 '16 at 01:27


The difference between a macro and an inlined function is that a macro is dealt with before the compiler sees it.

On my compiler (clang++) without optimisation flags the square function won't be inlined. The code it generates looks like this

4009f0:       55                      push   %rbp
4009f1:       48 89 e5                mov    %rsp,%rbp
4009f4:       89 7d fc                mov    %edi,-0x4(%rbp)
4009f7:       8b 7d fc                mov    -0x4(%rbp),%edi
4009fa:       0f af 7d fc             imul   -0x4(%rbp),%edi
4009fe:       89 f8                   mov    %edi,%eax
400a00:       5d                      pop    %rbp
400a01:       c3                      retq

The imul is the assembly instruction doing the work; the rest is moving data around. Code that calls it looks like

  400969:       e8 82 00 00 00          callq  4009f0 <_Z6squarei>

If I add the -O3 flag to inline it, that imul shows up in the main function, at the point where the function is called in the C++ code

0000000000400a10 <main>:
400a10:       41 56                   push   %r14
400a12:       53                      push   %rbx
400a13:       50                      push   %rax
400a14:       48 8b 7e 08             mov    0x8(%rsi),%rdi
400a18:       31 f6                   xor    %esi,%esi
400a1a:       ba 0a 00 00 00          mov    $0xa,%edx
400a1f:       e8 9c fe ff ff          callq  4008c0 <strtol@plt>
400a24:       48 89 c3                mov    %rax,%rbx
400a27:       0f af db                imul   %ebx,%ebx

It's a reasonable thing to do to get a basic handle on assembly language for your machine and use gcc -S on your source, or objdump -D on your binary (as I did here) to see exactly what is going on.

Using the macro instead of the inlined function gets something very similar

0000000000400a10 <main>:
400a10:       41 56                   push   %r14
400a12:       53                      push   %rbx
400a13:       50                      push   %rax
400a14:       48 8b 7e 08             mov    0x8(%rsi),%rdi
400a18:       31 f6                   xor    %esi,%esi
400a1a:       ba 0a 00 00 00          mov    $0xa,%edx
400a1f:       e8 9c fe ff ff          callq  4008c0 <strtol@plt>
400a24:       48 89 c3                mov    %rax,%rbx
400a27:       0f af db                imul   %ebx,%ebx

Note one of the many dangers here with macros: what does this do ?

x = 5; std::cout << SQUARE(++x) << std::endl;

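36? Nope: with my compiler it printed 42. After expansion the line becomes

std::cout << ((++x) * (++x)) << std::endl;

which here evaluated as 6 * 7. Strictly speaking, modifying x twice without sequencing is undefined behaviour, so another compiler may print something else.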

Don't be put off by people telling you not to care about optimisation. Using C or C++ as your language is an optimisation in itself. Just try to work out if you're wasting time with it and be sensible.

answered Mar 25 '16 at 01:27

Hal


The real danger of `SQUARE(++x)` is not that it produces `42` instead of `36`. The danger is that it modifies a variable twice in a single statement, which gives undefined behaviour. Producing `42` is only one possible result. It might produce `42`, it might produce `36`, it might produce `49`, or it might reformat your hard drive. – Peter Mar 25 '16 at 01:37
People won't tell you not to worry about optimisation. They will tell you not to indulge in premature optimisation. That includes not indulging in optimisation before you even know there is an actual need to. – Peter Mar 25 '16 at 01:41
Yeah sure, whatever. Not very important here when the point is "it won't do what you intended". You are right, of course: gcc with -O3 gives 49, clang gives 42. Both give a nice warning with -Wall, so compiling with -Wall is probably the more useful thing to note. – Hal Mar 25 '16 at 01:45
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%." --Knuth. Don't pass up the 3%, know how to analyse and do it. People continually break out 7 words from the middle of the quote when a genuine question on understanding optimisation comes up. This question was one of them and deserves to be treated sensibly. – Hal Mar 25 '16 at 01:52
The problem is that you gave your comment about optimisation without context. You're also missing the point that a lot of questions about optimisation ask how to do it (or ask for a comparison of performance of X versus Y) before they do any analysis of the need. Anyway, you've clarified your attitude, so enjoy your down-vote. – Peter Mar 25 '16 at 02:00
That's possibly not as sensible as you imagine. Good luck! – Hal Mar 25 '16 at 02:38
Oh, you think I was attacking you do you? Wouldn't I comment on your post if i were? Not my style anyway. The context is a question on stack overflow, and this google search: [ "premature optimization is the root of all evil" site:stackoverflow.com ] I don't know what you think it is. I encourage sensible questions on understanding optimisation in C and C++. I wrote my answer before seeing yours to show how to go to assembly to answer it rather than actually provide the answer for the specific example. I think it's worthwhile. But thanks for the downvote! Should get a "first" badge for it! :) – Hal Mar 25 '16 at 02:58
You are both missing the point. I'm a noob, I posted a question just to clear my thoughts about this one. I appreciate each one of your answers as it helps me to better understand the concepts of C++ and that's all that it matters, so please don't argue. I very much thank you for all your answers, no matter the opinion because I'm sure they come from different experiences so let's respect each other's point of view. Thank you and have an awesome day. Cheers. – ggluta Mar 25 '16 at 07:07
Yeah that was my mistake. Don't respond to people who comment on your post who have also answered and feel "in competition" or something like that. Very silly stuff. Anyway, keep clarifying your thinking about performance, it's a good thing to do. Definitely use g++/clang++ -S flag on the source to see the code generated or use objdump -D on the binary. It's an under-emphasised and crucial part of learning programming. With this method I've demonstrated, you can answer such questions quickly for yourself. Good luck! – Hal Mar 29 '16 at 01:54

score 0 · Answer 4 · answered Mar 25 '16 at 01:12

Macros just perform text substitution to modify source code.

As such, macros don't inherently affect performance of code. The techniques you use to design and code obviously affect performance. So the only implication of macros on performance is based on what the macro does (i.e. what code you write the macro to emit).

The big danger of macros is that they do not respect scope. The substitution is unconditional: it ignores namespaces, classes, and function boundaries, and applies to every later occurrence of the name in the translation unit. There are a lot of subtleties in writing macros to make them behave as intended (avoiding unintended side effects in code, avoiding undefined behaviour, etc). This means code which uses macros is harder to understand, and harder to get right.

At best, with modern compilers, the performance gain you can get using macros, is the same as can be achieved with inline functions - at the expense of increasing chances of the code behaving incorrectly. You are therefore better off using inline functions - unlike macros they are typesafe and work consistently with other code.

Modern compilers might choose to not inline a function, even if you have specified it as inline. If that happens, you generally don't need to worry - modern compilers are able to do a better job than most modern programmers in deciding whether a function should be inlined.

score 0 · Answer 5 · answered Mar 25 '16 at 10:35

Using such a macro only makes sense if its argument is itself a #define'd constant, as the expanded expression is then a constant that the compiler can fold at compile time (the preprocessor itself only substitutes text). Even then, double-check that the result is the expected one.

When working on classic variables, the (inlined) function form should be preferred as:

It is type-safe;
It handles an expression used as an argument consistently. This covers not only the pre/post-increment case noted by Peter: when the argument is itself a computation-intensive expression, the macro form forces that argument to be evaluated twice (and the two evaluations will not necessarily yield the same value), versus only once for the function.

I have to admit that I used to write such macros for quick prototyping of apparently simple functions, but the time they cost me over the years finally changed my mind!
