
I just spent a long time banging my head against why this code was 'hanging' for some URLs:

let getImage (imageUrl: string) =
    async {
        try
            let req = WebRequest.Create(imageUrl) :?> HttpWebRequest
            req.UserAgent <- "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"
            req.Method <- "GET"
            req.AllowAutoRedirect <- true
            req.MaximumAutomaticRedirections <- 4
            req.Timeout <- 3000 // HAHAHA, nice try!
            let! response1 = req.AsyncGetResponse()
            let response = response1 :?> HttpWebResponse
            use stream = response.GetResponseStream()
            let ms = new MemoryStream()
            let bytesRead = ref 1
            let buffer = Array.create 0x1000 0uy
            while !bytesRead > 0 do
                bytesRead := stream.Read(buffer, 0, buffer.Length)
                ms.Write(buffer, 0, !bytesRead)
            return SuccessfulDownload(imageUrl, ms.ToArray())
        with
        | ex -> return FailedDownload(imageUrl, ex.Message)
    }

After managing to track down which of the 3000 URLs was hanging, I learned that AsyncGetResponse takes no notice of HttpWebRequest.Timeout. A bit of searching throws up suggestions of wrapping the async request in a thread with a timer. That's fine for C#, but if I'm running 3000 of these through Async.Parallel |> Async.RunSynchronously, what's the best way to handle this problem?

asked Apr 19, 2011 at 8:02
  • You should just do stream.CopyTo ms rather than all the manual copying with buffer and bytesRead. Commented Apr 19, 2011 at 17:40
  • @ildjarn, thanks for the info, I have to admit it was a straight copy-paste from here Commented Apr 20, 2011 at 11:11
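For reference, ildjarn's suggestion would collapse the whole buffer loop into a single call. This is only a sketch: Stream.CopyTo requires .NET 4.0 or later, and SuccessfulDownload/FailedDownload are the question's own union cases, assumed to be in scope:

```fsharp
open System.IO
open System.Net

// Same download as in the question, with Stream.CopyTo replacing the
// manual bytesRead/buffer loop (CopyTo does the read/write loop internally).
let getImage (imageUrl: string) =
    async {
        try
            let req = WebRequest.Create(imageUrl) :?> HttpWebRequest
            let! response = req.AsyncGetResponse()
            use stream = response.GetResponseStream()
            use ms = new MemoryStream()
            stream.CopyTo ms
            return SuccessfulDownload(imageUrl, ms.ToArray())
        with ex ->
            return FailedDownload(imageUrl, ex.Message)
    }
```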

1 Answer


I've only roughly tested this, but it should have the correct behavior:

type System.Net.WebRequest with
    member req.AsyncGetResponseWithTimeout () =
        let impl = async {
            let iar = req.BeginGetResponse (null, null)
            let! success = Async.AwaitIAsyncResult (iar, req.Timeout)
            return
                if success then req.EndGetResponse iar
                else
                    req.Abort ()
                    raise (System.Net.WebException "The operation has timed out") }
        Async.TryCancelled (impl, fun _ -> req.Abort ())

In your code, call req.AsyncGetResponseWithTimeout() instead of req.AsyncGetResponse().
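Concretely, the fix in the question's getImage is a one-line change (fragment shown in isolation as a sketch):

```fsharp
// before:
//   let! response1 = req.AsyncGetResponse()
// after — honours req.Timeout and aborts the request on expiry:
let! response1 = req.AsyncGetResponseWithTimeout()
```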

answered Apr 19, 2011 at 14:06

6 Comments

Hm, I spoke too soon. It works well for a few, but I'm getting a heck of a lot of timeouts now. Is it possible that WebRequest manages the number of concurrent connections internally, and now I'm timing out on queued requests? I'll keep digging...
@Benjol : Yep, as I recall, by default it's internally limited to two simultaneous connections. I seem to remember that being pretty trivial to work around though. I'll look through some old C# code of mine to try and remember how.
Spent a while wrangling this code. My current understanding is that all of these WebRequests are launched 'immediately', so to get it to work I have to set the timeout to the time required to download all of them. Haven't had time to investigate further.
@Benjol : That implies a problem with how you use getImage, not a problem with getImage itself (or with AsyncGetResponseWithTimeout).
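The comment thread touches two separate issues, which can be sketched together: the default per-host limit of two simultaneous connections, and the fact that Async.Parallel starts all 3000 workflows at once, so queued requests spend their timeout budget waiting for a connection slot. Both the limit value and the batching helper below are illustrative choices of mine, not from the answer, and Seq.chunkBySize comes from F# releases later than this 2011 thread:

```fsharp
open System.Net

// Raise the default limit of 2 simultaneous connections per host.
ServicePointManager.DefaultConnectionLimit <- 50

// Illustrative throttle: run the downloads in fixed-size batches so a
// request's timeout clock only starts once its batch is actually launched.
let parallelThrottled batchSize (jobs: seq<Async<'T>>) =
    jobs
    |> Seq.chunkBySize batchSize
    |> Seq.collect (Async.Parallel >> Async.RunSynchronously)
    |> Seq.toArray

// usage (hypothetical): urls |> Seq.map getImage |> parallelThrottled 20
```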
