Creating a separate HCRYPTPROV per thread doesn't make much sense. In all current implementations the handle is a pointer to a heap-allocated memory block that mainly stores pointers to the CSP entry points, which are used to call the actual provider implementation (CPGenRandom in our case). The handle itself does not contain CSP state, unlike, for example, HCRYPTKEY, which holds the actual key state. So even if you create a separate HCRYPTPROV for every thread, this changes nothing.
The CSP may use some global variables or data internally during this call; whether it does is unknown, since these are implementation details. Of course, we can serialize the calls to CryptGenRandom in our own code. However, we cannot prevent some other DLL in our process from calling CryptGenRandom concurrently. So serializing all calls to CryptGenRandom is impossible as well.
As a result, I think CPGenRandom must be designed to be thread-safe, and in my tests with a well-known Microsoft CSP this is true. The function uses internal synchronization when it needs to access global data, and if multiple threads call CPGenRandom concurrently, every thread receives unique random data.
So my conclusion: CryptGenRandom is thread-safe, at least for all Microsoft CSPs.