Using HttpClient to Download a File with GetStreamAsync
I recently wanted to download a set of large files. Normally I would use a bash script for this, but some complications around the files and the directory structure made it more challenging for me in bash.
I decided to use C# instead. I was surprised that it was not so obvious how to do this. There are many methods on the HttpClient, but which should be used? So I wrote this post to remind my future self when I have forgotten how to do it.
I will start with the better approach. I ran tests downloading files of around 400MB: with the first approach, the memory used by the application doesn't exceed 11MB; with the second approach, memory usage climbs to over 400MB.
HttpClient.GetStreamAsync
With the first approach, I use the HttpClient.GetStreamAsync method to get a Stream of the file I'm downloading, and then copy that stream into a FileStream to write the file to disk.
var httpClient = new HttpClient();

string fileToDownload = "https://upload.wikimedia.org/wikipedia/commons/0/0e/Microsoft_.NET_logo.png";

// Use the last segment of the URL as the local file name.
string fileName = fileToDownload.Split('/').Last();

// The response body is read as the stream is consumed, not buffered up front.
using var downloadStream = await httpClient.GetStreamAsync(fileToDownload);
using var fileStream = new FileStream(fileName, FileMode.Create, FileAccess.Write);

// Copy the response to disk in chunks.
await downloadStream.CopyToAsync(fileStream);
await fileStream.FlushAsync();
fileStream.Close();
This is the best way to download a large file, as it doesn't load the entire file into memory. It streams the data from the source and writes it to disk as it arrives.
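Since the original goal was to download a set of files, the streaming approach above could be wrapped in a small helper. This is only a sketch: the Downloader class, its method names, and the targetDirectory parameter are hypothetical names introduced for illustration. It also assumes that taking the file name from Uri.LocalPath is acceptable, which differs from Split('/').Last() in that it ignores any query string.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class Downloader
{
    // Hypothetical helper: uses the last path segment of the URL as the file name.
    public static string GetFileName(Uri uri) => Path.GetFileName(uri.LocalPath);

    // Hypothetical helper: streams one file into targetDirectory and returns its path.
    public static async Task<string> DownloadToFileAsync(
        HttpClient httpClient, Uri uri, string targetDirectory)
    {
        // Does nothing if the directory already exists.
        Directory.CreateDirectory(targetDirectory);
        string path = Path.Combine(targetDirectory, GetFileName(uri));

        // Same streaming copy as above: the body is never held in memory as a whole.
        using Stream downloadStream = await httpClient.GetStreamAsync(uri);
        using var fileStream = new FileStream(path, FileMode.Create, FileAccess.Write);
        await downloadStream.CopyToAsync(fileStream);

        return path;
    }
}
```

With this in place, a loop over a list of URLs can call Downloader.DownloadToFileAsync for each one while reusing a single HttpClient instance, which is generally preferable to creating a new client per download.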
HttpClient.GetByteArrayAsync
Another approach is to read all the bytes of the file into memory and then write them to disk.
var httpClient = new HttpClient();

string fileToDownload = "https://upload.wikimedia.org/wikipedia/commons/0/0e/Microsoft_.NET_logo.png";

// Use the last segment of the URL as the local file name.
string fileName = fileToDownload.Split('/').Last();

// The entire response body is buffered into a byte array before anything is written.
byte[] fileBytes = await httpClient.GetByteArrayAsync(fileToDownload);
await File.WriteAllBytesAsync(fileName, fileBytes);
This approach is not as efficient as the previous one, as it loads the entire file into memory before writing it to disk.
There are other ways to do this. You could also use GetAsync and access the response and the content within it, but I think that is more complicated than it needs to be.
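For completeness, here is a rough sketch of what the GetAsync route might look like. The ResponseDownloader class and its method names are hypothetical, not something from the tests described earlier. One detail worth knowing: by default GetAsync buffers the whole response body, so the sketch passes HttpCompletionOption.ResponseHeadersRead to keep the download streaming. What you get in return for the extra code is access to the status code and headers before reading the body.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class ResponseDownloader
{
    // Hypothetical helper: sends the request and hands the response to SaveResponseAsync.
    public static async Task DownloadAsync(HttpClient httpClient, string url, string fileName)
    {
        // ResponseHeadersRead completes as soon as the headers arrive,
        // so the body is not buffered in memory before it is read.
        using HttpResponseMessage response =
            await httpClient.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);

        await SaveResponseAsync(response, fileName);
    }

    // Hypothetical helper: validates the response and streams its body to disk.
    public static async Task SaveResponseAsync(HttpResponseMessage response, string fileName)
    {
        // Throws HttpRequestException for non-success status codes.
        response.EnsureSuccessStatusCode();

        // Content-Length is optional; this is null when the server does not send it.
        long? expectedBytes = response.Content.Headers.ContentLength;
        Console.WriteLine($"Expected size: {expectedBytes?.ToString() ?? "unknown"} bytes");

        using Stream downloadStream = await response.Content.ReadAsStreamAsync();
        using var fileStream = new FileStream(fileName, FileMode.Create, FileAccess.Write);
        await downloadStream.CopyToAsync(fileStream);
    }
}
```

If all you need is the file on disk, GetStreamAsync does the same job in fewer lines, which is why I prefer it.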