Return to Answer

typo

Source Link

edited Jan 29, 2014 at 18:26

svick

edited Jan 29, 2014 at 18:26

svick

24.5k
4
53
89

I think you should use var more, especially when it's clear what type the object has, because you're just creating it.
You shouldn't read the response as a single byte array. Instead, you should use Streams where possiblypossible (it certainly is possible in your case), because they are more efficient.
Don't just ignore unknown exceptions. If there are some exceptions which you want to ignore (e.g. when a website returns 404), specify them explicitly.

I think you should use var more, especially when it's clear what type the object has, because you're just creating it.
You shouldn't read the response as a single byte array. Instead, you should use Streams where possibly (it certainly is possible in your case), because they are more efficient.
Don't just ignore unknown exceptions. If there are some exceptions which you want to ignore (e.g. when a website returns 404), specify them explicitly.

I think you should use var more, especially when it's clear what type the object has, because you're just creating it.
You shouldn't read the response as a single byte array. Instead, you should use Streams where possible (it certainly is possible in your case), because they are more efficient.
Don't just ignore unknown exceptions. If there are some exceptions which you want to ignore (e.g. when a website returns 404), specify them explicitly.

Source Link

answered Nov 15, 2012 at 21:18

svick

answered Nov 15, 2012 at 21:18

svick

24.5k
4
53
89

First, some smaller things (in terms of the size of code changed):

I think you should use var more, especially when it's clear what type the object has, because you're just creating it.
You shouldn't read the response as a single byte array. Instead, you should use Streams where possibly (it certainly is possible in your case), because they are more efficient.
Don't just ignore unknown exceptions. If there are some exceptions which you want to ignore (e.g. when a website returns 404), specify them explicitly.

The big thing I would change about your code is how you throttle your code and how you return the results. While throttling your code using Semaphore will work fine, I think you should let some library for asynchronous handling of collections do that for you. Such libraries include Rx and TPL Dataflow. Doing that will also let you avoid using an event, because your method will return "asynchronous collection" instead (IObservable<T> in Rx, ISourceBlock<T> in TPL Dataflow).

With TPL Dataflow, your code could look something like:

public ISourceBlock<DownloadedFile> DownloadFiles(string[] fileIds, string securityCookieString, string securityCookieDomain)
{
 var urls = CreateUrls(fileIds);
 // we have to use TransformManyBlock here, because we want to be able to return 0 or 1 items
 var block = new TransformManyBlock<string, DownloadedFile>(
 async url =>
 {
 var httpClientHandler = new HttpClientHandler();
 if (!string.IsNullOrEmpty(securityCookieString))
 {
 var securityCookie = new Cookie(FormsAuthentication.FormsCookieName, securityCookieString);
 securityCookie.Domain = securityCookieDomain;
 httpClientHandler.CookieContainer.Add(securityCookie);
 }
 return await DownloadFile(url, httpClientHandler);
 }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = Properties.Settings.Default.maxConcurrentDownloads });
 foreach (var url in urls)
 block.Post(url);
 block.Complete();
 return block;
}
private static async Task<DownloadedFile[]> DownloadFile(string url, HttpClientHandler clientHandler)
{
 var client = new HttpClient(clientHandler);
 var downloadedFile = new DownloadedFile();
 try
 {
 HttpResponseMessage responseMessage = await client.GetAsync(url);
 if (responseMessage.Content.Headers.ContentDisposition == null)
 return new DownloadedFile[0];
 downloadedFile.FileName = Path.Combine(
 Properties.Settings.Default.workingDirectory, responseMessage.Content.Headers.ContentDisposition.FileName);
 if (!Directory.Exists(Properties.Settings.Default.workingDirectory))
 {
 Directory.CreateDirectory(Properties.Settings.Default.workingDirectory);
 }
 using (var httpStream = await responseMessage.Content.ReadAsStreamAsync())
 using (var filestream = new FileStream(
 downloadedFile.FileName, FileMode.Create, FileAccess.Write, FileShare.None, bufferSize: 4096, useAsync: true))
 {
 await httpStream.CopyToAsync(filestream, 4096);
 }
 }
 // TODO: improve
 catch (Exception ex)
 {
 return new DownloadedFile[0];
 }
 return new[] { downloadedFile };
}

lang-cs