I am quite new to OpenMP, but I have recently started playing with it, and now I want to use it to speed up some calculations. I have a function of 3 variables that I want to evaluate at points in an interval and then write the values to a file. The intervals for the 3 parameters are quite large, so there will be a lot of function evaluations, and a plain 3-nested for loop takes a long time.
The serial implementation is straightforward: a 3-nested loop where each index i, j, k takes the value of the corresponding parameter (integers from 1 to DIM), and the function is evaluated at the point (i, j, k).
For the OpenMP approach, I thought of using #pragma omp parallel for, hoping that it would reduce the program's runtime.
Here is the code I wrote for the serial implementation and the "parallel" one. Please keep in mind that DIM (spaceDIM in the code) is set to a small number here just for testing purposes.
#include <iostream>
#include <chrono>
#include <omp.h>
#include <cmath>
#include <fstream>
using namespace std;
ofstream outparallel("../output/outputParallel.dat");
ofstream outserial("../output/outputSerial.dat");
const int spaceDIM = 80;
double myFunction(double a, double b, double c)
{
    return a * log(b) + exp(a / b) + c;
}
void serialAlgorithmTripleLoop()
{
    auto sum = 0;
    auto timeStart = chrono::high_resolution_clock::now();
    for (int i = 1; i <= spaceDIM; ++i)
        for (int j = 1; j <= spaceDIM; ++j)
            for (int k = 1; k <= spaceDIM; ++k)
            {
                //sum += i * j * k;
                outserial << i << " " << j << " " << k << " " << myFunction((double)i, (double)j, (double)k) << endl;
            }
    auto timeStop = chrono::high_resolution_clock::now();
    auto execTime = chrono::duration_cast<chrono::seconds>(timeStop - timeStart).count();
    cout << "Serial execution time = " << execTime << " seconds";
    cout << endl;
    outserial << "Execution time = " << execTime << " seconds";
    outserial << endl;
}
void parallelAlgorithmTripleLoop()
{
    //start of the actual algorithm
    auto sum = 0;
    auto timeStart = chrono::high_resolution_clock::now();
#pragma omp parallel for
    for (int i = 1; i <= spaceDIM; ++i)
        for (int j = 1; j <= spaceDIM; ++j)
            for (int k = 1; k <= spaceDIM; ++k)
            {
                //   sum += i * j * k;
                outparallel << i << " " << j << " " << k << " " << myFunction((double)i, (double)j, (double)k) << endl;
            }
    auto timeStop = chrono::high_resolution_clock::now();
    auto execTime = chrono::duration_cast<chrono::seconds>(timeStop - timeStart).count();
    cout << "Parallel execution time = " << execTime << " seconds";
    cout << endl;
    outparallel << "Execution time = " << execTime << " seconds";
    outparallel << endl;
}
int main()
{
    cout << "FUNCTION OPTIMIZATION" << endl;
    serialAlgorithmTripleLoop();
    parallelAlgorithmTripleLoop();
    return 0;
}
The output is unexpected to me: the parallel version takes longer than the serial one. I also tried the "reduction", "ordered" and "collapse" clauses from the OpenMP standard, but none of them helped. I'm running this on a 4-core, 8-thread laptop.
FUNCTION OPTIMIZATION
Serial execution time = 4 seconds
Parallel execution time = 7 seconds
Q: How can I properly speed up the evaluation of the function?
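For reference, here is a minimal sketch of one direction that may be worth trying. It rests on assumptions that have not been measured here: that most of the time goes into writing to the file rather than into myFunction, that flushing with endl on every line adds to that cost, and that the threads contend for (and race on) the single shared ofstream. The sketch evaluates the function in parallel into a preallocated buffer and writes the file afterwards from one thread. The names evaluateGrid and writeGrid are hypothetical helpers introduced only for illustration; myFunction is the function from the code above.

```cpp
#include <cmath>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Same function as in the question.
double myFunction(double a, double b, double c)
{
    return a * std::log(b) + std::exp(a / b) + c;
}

// Hypothetical helper: evaluate the function on the grid 1..dim in each
// direction. Every iteration writes to its own slot of the buffer, so the
// threads share no mutable state and need no synchronization.
std::vector<double> evaluateGrid(int dim)
{
    std::vector<double> values(static_cast<std::size_t>(dim) * dim * dim);
#pragma omp parallel for collapse(3)
    for (int i = 1; i <= dim; ++i)
        for (int j = 1; j <= dim; ++j)
            for (int k = 1; k <= dim; ++k)
            {
                // Row-major index of the point (i, j, k).
                std::size_t idx =
                    (static_cast<std::size_t>(i - 1) * dim + (j - 1)) * dim + (k - 1);
                values[idx] = myFunction(i, j, k);
            }
    return values;
}

// Hypothetical helper: write the buffer from a single thread, using '\n'
// instead of endl so the stream is not flushed after every line.
void writeGrid(const std::vector<double>& values, int dim, const std::string& path)
{
    std::ofstream out(path);
    std::size_t idx = 0;
    for (int i = 1; i <= dim; ++i)
        for (int j = 1; j <= dim; ++j)
            for (int k = 1; k <= dim; ++k)
                out << i << ' ' << j << ' ' << k << ' ' << values[idx++] << '\n';
}
```

With these helpers, the parallel part would look like `auto values = evaluateGrid(spaceDIM);` followed by `writeGrid(values, spaceDIM, "../output/outputParallel.dat");`. The buffer holds DIM^3 doubles, so for a very large DIM it may need to be filled and written in slabs. Whether this actually beats the serial version depends on how expensive myFunction is relative to the file output, so it should be timed rather than assumed.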