I'm trying to understand if BlazingSQL is a competitor or complementary to dask.
I have some medium-sized data (10-50GB) saved as parquet files on Azure blob storage.
IIUC I can query, join, aggregate, groupby with BlazingSQL using SQL syntax, but I can also read the data into CuDF using dask_cudf and do all same operations using python/dataframe syntax.
So, it seems to me that they're direct competitors?
Is it correct that (one of) the benefits of using dask is that it can operate on partitions so can operate on datasets larger than GPU memory whereas BlazingSQL is limited to what can fit on the GPU?
Why would one choose to use BlazingSQL rather than dask?
Edit:
The docs talk about dask_cudf but the actual repo is archived saying that dask support is now in cudf itself. It would be good to know how to leverage dask to operate on larger-than-gpu-memory datasets with cudf