I want to use std::async() to speed up a loop in my code. To do that, I wrapped the loop body in a function and want to make multiple async() calls so that several chunks of the loop run in parallel.
This snippet gives an idea of the function:
void foo(int start, int end, std::vector<int>& output) {
    // do something
    // write results to corresponding locations in output vector
}    
My problem is that when I create the tasks in a loop, async does not seem to call the function passed to it immediately. So, when I modify the variable passed to foo() and make the next call, a previous task that has not started yet may see the modified value as well.
Here is a brief example:
#include <future>
#include <vector>
int main() {
    int start = 0;
    int step = 10;
    auto output = std::vector<int>(100);                // an output vector of length 100
    auto ft_list = std::vector<std::future<void>>(10);  // want to create 10 jobs in parallel
    
    // create 10 jobs
    for (auto& ft : ft_list) {
        ft = std::async(std::launch::async, [&] { foo(start, start + step, output); });
        start += step;
    }
    // the block above does not behave as I intended
    // it sometimes creates multiple calls to foo() with the same value of start
    // I guess this is because async sometimes calls foo() after start has been modified
    
    for (auto& ft : ft_list) {
        ft.wait();
    }
    
    return 0;
}
I tried some dirty hacks to make sure the variable is modified only after the previous job has started (by passing a reference to a flag variable). However, I feel there should be a better way. What is the proper way to parallelize this loop using async()?
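For reference, here is a minimal sketch of one variant I am considering. It assumes the root cause is that the lambda captures start by reference, so it copies start and step into each lambda and captures only output by reference. The body of foo() and the helper run_chunks() are hypothetical stand-ins for illustration, not my real code.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for foo(): marks every slot in [start, end)
// with the start index of the chunk that wrote it.
void foo(int start, int end, std::vector<int>& output) {
    for (int i = start; i < end; ++i) {
        output[i] = start;
    }
}

// Hypothetical helper: launches one task per chunk and waits for all of them.
// Each lambda copies `start` and `step` when it is created, so later changes
// to the loop variable cannot affect a task that has not started yet.
std::vector<int> run_chunks(int jobs, int step) {
    std::vector<int> output(static_cast<std::size_t>(jobs) * step);
    std::vector<std::future<void>> ft_list;
    ft_list.reserve(jobs);

    int start = 0;
    for (int j = 0; j < jobs; ++j) {
        ft_list.push_back(std::async(
            std::launch::async,
            [start, step, &output] { foo(start, start + step, output); }));
        start += step;  // the task above already holds its own copy of start
    }

    for (auto& ft : ft_list) {
        ft.wait();
    }
    return output;
}
```

Calling run_chunks(10, 10) should give each task its own distinct range of the output vector. The tasks write to disjoint elements, so I assume no extra locking is needed for output itself.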
 
    