In my Azure role code I download a 400-megabyte file that is split into 10-megabyte chunks and stored in Blob Storage. I use CloudBlob.DownloadToStream() for the download.
I tried two options. The first uses a FileStream: I create a "write" FileStream and download the chunks one by one into the same stream without rewinding, so I end up with the original file. The second creates a MemoryStream with an initial capacity slightly larger than the original file size (to avoid reallocations) and downloads the chunks into it, so I end up with a MemoryStream holding the original file data.
Here's some pseudocode:
    var writeStream = new StreamOfChoice( params ); // FileStream or MemoryStream
    foreach( var uri in urisToDownload ) {
        // each chunk is appended at the current stream position
        blobContainer.GetBlobReference( uri ).DownloadToStream( writeStream );
    }
Now the only difference is that it's a FileStream in one case and a MemoryStream in the other; all the rest is the same. It turns out that it takes about 20 seconds with a FileStream and about 30 seconds with a MemoryStream - yes, the FileStream turns out to be faster. According to the \Memory\Available Bytes performance counter, the virtual machine has about 1 gigabyte of memory available just before the MemoryStream is created, so it's not due to paging.
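A minimal sketch of how such a comparison could be timed in isolation, with the blob download replaced by an in-memory chunk buffer so that it runs without Azure. ChunkedCopyTimer and TimeChunkedWrite are hypothetical names introduced here for illustration, not part of the storage client library, and the numbers this produces say nothing about the download path itself:

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class ChunkedCopyTimer
{
    // Writes chunkCount chunks of chunkSize bytes into target and returns the elapsed time.
    // The zero-filled buffer stands in for one downloaded chunk; no Azure call is made.
    public static TimeSpan TimeChunkedWrite(Stream target, int chunkCount, int chunkSize)
    {
        var chunk = new byte[chunkSize];
        var watch = Stopwatch.StartNew();
        for (int i = 0; i < chunkCount; i++)
        {
            target.Write(chunk, 0, chunk.Length);
        }
        target.Flush();
        watch.Stop();
        return watch.Elapsed;
    }

    static void Main()
    {
        const int chunkCount = 40;               // 40 chunks of 10 MB = 400 MB
        const int chunkSize = 10 * 1024 * 1024;

        string path = Path.GetTempFileName();
        try
        {
            using (var file = new FileStream(path, FileMode.Create, FileAccess.Write))
            {
                Console.WriteLine("FileStream:   " + TimeChunkedWrite(file, chunkCount, chunkSize));
            }

            // Pre-size the MemoryStream slightly above the total to avoid reallocations.
            using (var memory = new MemoryStream(chunkCount * chunkSize + 1024))
            {
                Console.WriteLine("MemoryStream: " + TimeChunkedWrite(memory, chunkCount, chunkSize));
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If the gap between the two streams disappears in a test like this, the difference is more likely to come from how the download interacts with the target stream than from the raw write cost.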
Why would writing to a file be faster than to a MemoryStream?