Understanding a function return

Question

I am a novice programmer and have only briefly covered the anatomy of a function call (setting up the stack, etc.). I can write a function two different ways and I'm wondering which (if either) is more efficient. This is for a finite element program, so this function could be called several thousand times. It uses the linear algebra library Armadillo.

First way:

void Q4::stiffness(mat &stiff) 
{
    stiff.zeros(); // sets all elements of the matrix to zero
    // a bunch of linear algebra calculations
    // ...
    stiff *= h;
}

int main()
{
    mat elementStiffness(Q4__DOF, Q4__DOF);
    mat globalStiffness(totalDOF, totalDOF);
    for (int i = 0; i < reallyHugeNumber; i++)
    {
        elements[i].stiffness(elementStiffness, PSTRESS); // passed by reference, so no address-of operator
        assemble(&globalStiffness, &elementStiffness);
    }
    return 0;
}

Second way:

mat Q4::stiffness() 
{
    mat stiff(Q4__DOF, Q4__DOF); // initializes element stiffness matrix
    // a bunch of linear algebra calculations
    // ...
    return stiff *= h;
}

int main()
{
    mat elementStiffness(Q4__DOF, Q4__DOF);
    mat globalStiffness(totalDOF, totalDOF);
    for (int i = 0; i < reallyHugeNumber; i++)
    {
        elementStiffness = elements[i].stiffness(PSTRESS);
        assemble(&globalStiffness, &elementStiffness);
    }
    return 0;
}

I think what I'm asking is: with the second way, is mat stiff pushed to the stack and then copied into elementStiffness? I imagine that pushing the matrix to the stack and then copying it is much more expensive than passing a matrix by reference and setting its elements to zero.

http://stackoverflow.com/q/18594241/560648 Please do research before posting. — Lightness Races in Orbit, Jan 04 '14 at 03:16
Relevant [blog post](http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/) by Dave Abrahams — Praetorian, Jan 04 '14 at 03:48
Implement both ways then profile, otherwise you're pretty much guessing. — Retired Ninja, Jan 04 '14 at 05:55

Miguel · Accepted Answer · 2014-01-04T03:36:50.687

3

Passing a variable by reference and doing your calculations on that variable is a lot cheaper. When C++ returns a variable, it pretty much copies it twice.

First inside the function, and then it calls the copy constructor or assignment operator, depending on whether the value is being assigned to a new variable or to an existing variable, to initialize the variable. If you have a user-defined type with a long list of internal state variables, then this assignment is going to take a big chunk of the operation's processing time.

EDIT#1: I forgot about C++11 and std::move. Many compilers can optimize functions like this so that the return value is moved rather than copied: instead of copying an lvalue, the result is moved from an rvalue, which typically just hands over the pointer to the underlying memory.
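
To make the copy-versus-move distinction concrete, here is a minimal sketch with a hypothetical `Tracked` type (not part of Armadillo) that counts how often its copy and move operations run. The names `Tracked`, `make_tracked`, `OpCounts`, and `count_ops` are assumptions for illustration only. It exercises both cases described above: initializing a new variable from a return value and assigning a return value to an existing variable.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a matrix type; counts copies and moves.
struct Tracked {
    static int copies;
    static int moves;
    std::vector<double> data;

    explicit Tracked(std::size_t n) : data(n, 0.0) {}
    Tracked(const Tracked& o) : data(o.data) { ++copies; }
    Tracked(Tracked&& o) noexcept : data(std::move(o.data)) { ++moves; }
    Tracked& operator=(const Tracked& o) { data = o.data; ++copies; return *this; }
    Tracked& operator=(Tracked&& o) noexcept { data = std::move(o.data); ++moves; return *this; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Returns a named local by value. The compiler may elide the copy entirely;
// if it does not, C++11 treats the returned local as an rvalue and moves it.
Tracked make_tracked(std::size_t n) {
    Tracked t(n);
    t.data[0] = 1.0;
    return t;
}

struct OpCounts { int copies; int moves; };

// Resets the counters, runs both cases, and reports what was counted.
OpCounts count_ops() {
    Tracked::copies = 0;
    Tracked::moves = 0;

    Tracked fresh = make_tracked(64);   // initialization of a new variable
    Tracked existing(64);
    existing = make_tracked(64);        // assignment to an existing variable (move assignment)

    assert(fresh.data[0] == 1.0 && existing.data[0] == 1.0);
    return {Tracked::copies, Tracked::moves};
}
```

With a C++11 or later compiler, `count_ops()` should report zero copies. The number of moves depends on how much elision the compiler performs, so it is best read off an actual run rather than assumed.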

edited Jan 04 '14 at 03:36

answered Jan 04 '14 at 03:21

Miguel


3

This is only what logically happens. What actually happens is often much different. – Vaughn Cato Jan 04 '14 at 03:26
3

Not even logically, because copy constructor side effects aren't guaranteed to occur either. – orm Jan 04 '14 at 03:29
Copy elision existed prior to C++11 – Praetorian Jan 04 '14 at 03:45

score 0 · Answer 2 · answered Jan 04 '14 at 03:27


On the surface, I think the second way will be much more expensive as it both constructs a new mat and copies it to the stack on every call. Of course that depends a bit on how often the mat construction takes place in the first way.

That said, I think the best thing to do is set up an experiment and test to make sure (agreeing with the suggestion to research).
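
Such an experiment could be sketched roughly as follows. This is not Armadillo code: `Matrix` is a hypothetical `std::vector<double>` stand-in, the loop body is a placeholder for the real linear algebra, and `stiffness_out`, `stiffness_ret`, `Timing`, and `run_experiment` are names made up for the sketch. It times the out-parameter version against the return-by-value version and checks that both produce the same values.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using Matrix = std::vector<double>;   // hypothetical stand-in: n*n values, row-major

// First way: the caller owns the buffer and the callee overwrites it.
void stiffness_out(Matrix& k, double h) {
    for (std::size_t i = 0; i < k.size(); ++i) {
        k[i] = static_cast<double>(i) * h;   // placeholder for the real calculation
    }
}

// Second way: the callee builds a fresh matrix and returns it by value.
Matrix stiffness_ret(std::size_t n, double h) {
    Matrix k(n * n, 0.0);
    for (std::size_t i = 0; i < k.size(); ++i) {
        k[i] = static_cast<double>(i) * h;
    }
    return k;
}

struct Timing { double out_ms; double ret_ms; double out_sum; double ret_sum; };

Timing run_experiment(std::size_t n, int calls) {
    using Clock = std::chrono::steady_clock;
    Matrix k(n * n);
    double out_sum = 0.0;
    double ret_sum = 0.0;

    const auto t0 = Clock::now();
    for (int c = 0; c < calls; ++c) {
        stiffness_out(k, 0.5);
        out_sum += k.back();   // use the result so the work is not optimized away
    }
    const auto t1 = Clock::now();
    for (int c = 0; c < calls; ++c) {
        k = stiffness_ret(n, 0.5);
        ret_sum += k.back();
    }
    const auto t2 = Clock::now();

    assert(out_sum == ret_sum);   // both ways must compute the same thing
    const std::chrono::duration<double, std::milli> out_ms = t1 - t0;
    const std::chrono::duration<double, std::milli> ret_ms = t2 - t1;
    return {out_ms.count(), ret_ms.count(), out_sum, ret_sum};
}
```

Calling something like `run_experiment(8, 1000000)` in an optimized build and comparing `out_ms` with `ret_ms` gives a first impression. Any real conclusion should come from timing the actual Armadillo code, since allocation behavior for small matrices may differ from this stand-in.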

answered Jan 04 '14 at 03:27

Bryan Polyak


The copy basically never happens, due to RVO (and, in C++11, moves). – Lightness Races in Orbit Jan 04 '14 at 03:29
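
One caveat that may apply to the question's second version: `return stiff *= h;` returns the result of `operator*=`, which by convention is an lvalue reference. In that case the returned expression is neither a named local nor a temporary, so copy elision and the implicit move do not apply and the return value is copy-constructed. The sketch below shows this with a hypothetical `Counted` type whose `operator*=` returns `Counted&`; whether Armadillo's `mat` behaves the same way is an assumption that should be checked against the library.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical matrix-like type that counts copy constructions.
struct Counted {
    static int copies;
    std::vector<double> data;

    explicit Counted(std::size_t n) : data(n, 1.0) {}
    Counted(const Counted& o) : data(o.data) { ++copies; }
    Counted(Counted&& o) noexcept : data(std::move(o.data)) {}

    // Conventional compound assignment: scales in place, returns *this by reference.
    Counted& operator*=(double s) {
        for (double& x : data) x *= s;
        return *this;
    }
};
int Counted::copies = 0;

// Mirrors `return stiff *= h;`: the expression has type Counted& (an lvalue),
// so the return value is copy-constructed from it.
Counted scaled_in_return(double h) {
    Counted m(16);
    return m *= h;
}

// Scale first, then return the named local: eligible for elision or a move.
Counted scaled_then_return(double h) {
    Counted m(16);
    m *= h;
    return m;
}

// Runs one of the functions above and reports how many copies it caused.
int copies_made_by(Counted (*f)(double)) {
    Counted::copies = 0;
    Counted r = f(2.0);
    assert(r.data[0] == 2.0);
    return Counted::copies;
}
```

Here `copies_made_by(scaled_in_return)` reports one copy and `copies_made_by(scaled_then_return)` reports none, so splitting the scaling and the `return` into two statements is a cheap way to keep the return-by-value version eligible for RVO or a move.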
