lua-users home
lua-l archive

Access received headers from socket.http in storing sink



Hi
I wanted to show some progress when downloading big files with socket.http, ran into some problems, and made some changes.
The main issue I had was:
* How do I access the received headers from within the storing sink?
The main reason is that I need the "content-length" from the reply to calculate progress. Another reason could be the filename (which some servers send in a "Content-Disposition" header), which you need when the filename is not detectable from the URL.
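For the filename case, a minimal extraction from a Content-Disposition value could look like the sketch below. This is hypothetical helper code, not part of socket.http, and deliberately not a full RFC 6266 parser (quoted-pair escapes and `filename*` encodings are ignored):

```lua
-- Naive extraction of a filename from a Content-Disposition header value.
-- Handles the quoted and unquoted forms only; not a complete parser.
local function filename_from_disposition(value)
    if not value then return nil end
    return value:match('filename="([^"]+)"')
        or value:match('filename=([^;%s]+)')
end

print(filename_from_disposition('attachment; filename="big file.bin"'))
--> big file.bin
```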
My solution (*please comment if this is ok, or if there is a better way*):
I have added two lines to socket.http, in function trequest(reqt):
 headers = h:receiveheaders()
+ reqt.reply={}
+ reqt.reply.headers=headers
 -- at this point we should have a honest reply from the server
Now I can write a receiving function with a progress display. Since my request table is passed through all the http layers while remaining the same table, modifications to the request (as made above) can be used in the sink (as shown below).
--8<--snip--8<--snip--8<--snip--8<--snip--8<--snip--8<--snip--8<--snip--8<--snip
------------------------------------------------------------------------
--- get one url and save it into a file
-- @param url url to get
-- @param file file to save to
------------------------------------------------------------------------
function get_url_save_long_file(url, file)
    printf("Retrieving %s\n", url)
    local request = url
    if type(url) == "string" then
        request = { url = url }
    end
    local fd, err = io.open(file, "wb")
    if not fd then
        Error("open(%s) failed (%s)\n", file, err)
        return nil
    end
    local want        -- expected length from "content-length", if known
    local have = 0    -- bytes received so far
    local p1 = io.stdout:seek()
    local t0 = socket.gettime()
    ---------------------------------
    -- the receiving sink
    ---------------------------------
    local function sink_fd(chunk, src_err)
        if chunk == nil then
            -- no more data to process, we won't receive more chunks
            fd:close()
            if src_err then
                -- source reports an error; TBD what to do with the
                -- chunks received up to now
                printf("\n ==> Src_Error=%s\n", src_err)
                return nil, src_err
            else
                printf("\n ==> EOF %s\n", dots(have))
                return true -- or anything that evaluates to true
            end
        elseif chunk == "" then
            printf("\n ==> ''\n")
            -- this is assumed to be without effect on the sink, but may
            -- not be if something other than raw text is processed;
            -- do nothing and return true to keep filters happy
            return true -- or anything that evaluates to true
        else
            -- try to get the expected length
            if have == 0 then
                -- this is where I access the headers
                local h = request.reply and request.reply.headers
                want = h and tonumber(h["content-length"])
            end
            local size = #chunk
            local elapsed = socket.gettime() - t0
            have = have + size
            if p1 then
                io.stdout:seek("set", p1)
            end
            if want then
                local kbs = 0.001 * have / elapsed
                local total = elapsed * want / have
                local remain = total - elapsed
                local time_for_this = elapsed * size / have
                printf(" ==>%d %8s/%8s %6.2fkbs (%s/%s rem %s)%s \r",
                       size, dots(have), dots(want), kbs,
                       t2s(elapsed), t2s(total), t2s(remain), t2s(time_for_this))
            else
                printf(" ==> %8s (%s) \r", dots(have), t2s(elapsed))
            end
            -- chunk has data, store it
            fd:write(chunk)
            return true -- or anything that evaluates to true
        end
    end
    request.sink = sink_fd
    local ret, sts = http.request(request)
    printf("Retrieved \"%s\" = ret=%s,sts=%s\n", url, vis(ret), vis(sts))
    return sts
end
-->8--end-->8--end-->8--end-->8--end-->8--end-->8--end-->8--end-->8--end
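Incidentally, the whole approach relies on the fact that Lua passes tables by reference, not by value: a field attached inside a lower layer is visible through the caller's original table. A minimal self-contained sketch (deep_layer is a hypothetical stand-in for trequest):

```lua
-- Tables are passed by reference, not copied: a field attached in a
-- lower layer is visible through the caller's original table.
local function deep_layer(reqt)   -- hypothetical stand-in for trequest
    reqt.reply = { headers = { ["content-length"] = "1234" } }
end

local request = { url = "http://example.com/big.bin" }
deep_layer(request)
print(request.reply.headers["content-length"])  --> 1234
```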
This worked very well for me, until I met a redirecting website: my sink only got the headers from the redirect reply, not from the reply carrying the real data.
So I made another change (comments again are welcome).
This is the original:
function tredirect(reqt, location)
    local result, code, headers, status = trequest {
        -- the RFC says the redirect URL has to be absolute, but some
        -- servers do not respect that
        url = url.absolute(reqt.url, location),
        source = reqt.source,
        sink = reqt.sink,
        headers = reqt.headers,
        proxy = reqt.proxy,
        nredirects = (reqt.nredirects or 0) + 1,
        create = reqt.create
    }
    -- pass location header back as a hint we redirected
    headers = headers or {}
    headers.location = headers.location or location
    return result, code, headers, status
end
This is my version:
function tredirect(reqt, location)
    reqt.url = url.absolute(reqt.url, location)
    reqt.nredirects = (reqt.nredirects or 0) + 1
    local result, code, headers, status = trequest(reqt)
    -- pass location header back as a hint we redirected
    headers = headers or {}
    headers.location = headers.location or location
    return result, code, headers, status
end
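If there does turn out to be a reason to keep the original copying (for instance, not clobbering the caller's url and nredirects fields across redirects), a middle way might be to copy the request but share a single reply table across all hops. The sketch below demonstrates only that idea with hypothetical stand-ins (fake_trequest, fake_tredirect); it is not socket.http code:

```lua
-- Sketch: copy the request per "redirect", but share one reply table,
-- so the outermost caller still sees the final headers.
local function fake_trequest(reqt)      -- stand-in for trequest
    reqt.reply.headers = { ["content-length"] = "1234" }
end

local function fake_tredirect(reqt)
    reqt.reply = reqt.reply or {}       -- create once, then share
    fake_trequest {                     -- a *fresh* request table...
        url = reqt.url,
        sink = reqt.sink,
        nredirects = (reqt.nredirects or 0) + 1,
        reply = reqt.reply,             -- ...sharing the same reply table
    }
end

local request = { url = "http://example.com/a" }
fake_tredirect(request)
print(request.reply.headers["content-length"])  --> 1234
```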
Do you see the difference? Instead of creating a new request I update the current one, so get_url_save_long_file can get the information it needs. Is this ok? Or are there serious reasons to copy the request, as in the original version?
Looking for advice,
Regards JJvB
