It is simple to get a StorageStreamDownloader using the azure.storage.blob package:
from azure.storage.blob import BlobServiceClient

blob_service_client = BlobServiceClient.from_connection_string("my azure connection string")
container_client = blob_service_client.get_container_client("my azure container name")
blob_client = container_client.get_blob_client("my azure file name")
storage_stream_downloader = blob_client.download_blob()  # a StorageStreamDownloader
and it is simple to process a file-like object with the csv package, or more specifically, I think, any iterator that returns strings (see also the generator sanity check after this example):
import csv
from io import StringIO
 
csv_string = """col1, col2
a,b
c,d"""
with StringIO(csv_string) as csv_file:
  for row in csv.reader(csv_file):
    print(row) # or rather whatever I actually want to do on a row by row basis, e.g. ascertain that the file contains a row that meets a certain condition
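As a sanity check on the "string-returning iterator" point: csv.reader seems perfectly happy with a plain generator of strings, so presumably what I need is some kind of generator over the download:

def lines():
    yield "col1,col2"
    yield "a,b"
    yield "c,d"

for row in csv.reader(lines()):
    print(row)  # ['col1', 'col2'], then ['a', 'b'], then ['c', 'd']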
What I'm struggling with is getting the streaming data from my StorageStreamDownloader into csv.reader() in such a way that I can process each line as it arrives rather than waiting for the whole file to download.
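My best guess so far leans on the chunks() method, something like the sketch below. This is untested guesswork on my part: I am assuming chunks() yields bytes and that the blob is UTF-8 encoded, and I suspect it silently breaks whenever a row straddles a chunk boundary:

import csv

def rows_from_blob(downloader):
    # downloader is the StorageStreamDownloader from the first snippet
    for chunk in downloader.chunks():  # assumption: yields bytes
        for line in chunk.decode("utf-8").splitlines():
            yield line  # presumably wrong for a line split across two chunks?

for row in csv.reader(rows_from_blob(storage_stream_downloader)):
    print(row)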
The Microsoft docs strike me as a little underwritten by their standards (the chunks() method has no annotation?), but I see there is a readinto() method for reading into a stream. I have tried reading into a BytesIO stream, roughly as below, but I cannot work out how to get the data out into csv.reader() without just writing the buffer to a new file and reading that file back. This all strikes me as something that should be doable, but I'm probably missing something obvious conceptually, perhaps to do with itertools or asyncio, or perhaps I'm just using the wrong csv tool for my needs.
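For completeness, my readinto() attempt looked roughly like this:

from io import BytesIO

buffer = BytesIO()
storage_stream_downloader.readinto(buffer)  # returns only once the whole blob is downloaded
buffer.seek(0)
for row in csv.reader(buffer):  # raises csv.Error: iterator should return strings, not bytes
    print(row)

which both defeats the streaming aim (the whole file is in memory before I can touch a row) and hands csv.reader bytes rather than str.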