- I think you should use
var
more, especially when it's clear what type the object has, because you're just creating it. - You shouldn't read the response as a single byte array. Instead, you should use
Stream
s where possiblypossible (it certainly is possible in your case), because they are more efficient. - Don't just ignore unknown exceptions. If there are some exceptions which you want to ignore (e.g. when a website returns 404), specify them explicitly.
- I think you should use
var
more, especially when it's clear what type the object has, because you're just creating it. - You shouldn't read the response as a single byte array. Instead, you should use
Stream
s where possibly (it certainly is possible in your case), because they are more efficient. - Don't just ignore unknown exceptions. If there are some exceptions which you want to ignore (e.g. when a website returns 404), specify them explicitly.
- I think you should use
var
more, especially when it's clear what type the object has, because you're just creating it. - You shouldn't read the response as a single byte array. Instead, you should use
Stream
s where possible (it certainly is possible in your case), because they are more efficient. - Don't just ignore unknown exceptions. If there are some exceptions which you want to ignore (e.g. when a website returns 404), specify them explicitly.
First, some smaller things (in terms of the size of code changed):
- I think you should use
var
more, especially when it's clear what type the object has, because you're just creating it. - You shouldn't read the response as a single byte array. Instead, you should use
Stream
s where possibly (it certainly is possible in your case), because they are more efficient. - Don't just ignore unknown exceptions. If there are some exceptions which you want to ignore (e.g. when a website returns 404), specify them explicitly.
The big thing I would change about your code is how you throttle your code and how you return the results. While throttling your code using Semaphore
will work fine, I think you should let some library for asynchronous handling of collections do that for you. Such libraries include Rx and TPL Dataflow. Doing that will also let you avoid using an event, because your method will return "asynchronous collection" instead (IObservable<T>
in Rx, ISourceBlock<T>
in TPL Dataflow).
With TPL Dataflow, your code could look something like:
public ISourceBlock<DownloadedFile> DownloadFiles(string[] fileIds, string securityCookieString, string securityCookieDomain)
{
var urls = CreateUrls(fileIds);
// we have to use TransformManyBlock here, because we want to be able to return 0 or 1 items
var block = new TransformManyBlock<string, DownloadedFile>(
async url =>
{
var httpClientHandler = new HttpClientHandler();
if (!string.IsNullOrEmpty(securityCookieString))
{
var securityCookie = new Cookie(FormsAuthentication.FormsCookieName, securityCookieString);
securityCookie.Domain = securityCookieDomain;
httpClientHandler.CookieContainer.Add(securityCookie);
}
return await DownloadFile(url, httpClientHandler);
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = Properties.Settings.Default.maxConcurrentDownloads });
foreach (var url in urls)
block.Post(url);
block.Complete();
return block;
}
private static async Task<DownloadedFile[]> DownloadFile(string url, HttpClientHandler clientHandler)
{
var client = new HttpClient(clientHandler);
var downloadedFile = new DownloadedFile();
try
{
HttpResponseMessage responseMessage = await client.GetAsync(url);
if (responseMessage.Content.Headers.ContentDisposition == null)
return new DownloadedFile[0];
downloadedFile.FileName = Path.Combine(
Properties.Settings.Default.workingDirectory, responseMessage.Content.Headers.ContentDisposition.FileName);
if (!Directory.Exists(Properties.Settings.Default.workingDirectory))
{
Directory.CreateDirectory(Properties.Settings.Default.workingDirectory);
}
using (var httpStream = await responseMessage.Content.ReadAsStreamAsync())
using (var filestream = new FileStream(
downloadedFile.FileName, FileMode.Create, FileAccess.Write, FileShare.None, bufferSize: 4096, useAsync: true))
{
await httpStream.CopyToAsync(filestream, 4096);
}
}
// TODO: improve
catch (Exception ex)
{
return new DownloadedFile[0];
}
return new[] { downloadedFile };
}