I have wrote an application which asks for the user id, and returns a link to the first available chat message:
Output
Enter user id:
322395
Fetching data...
http://chat.stackoverflow.com/transcript/17?m=231895#231895
Source code
module Main where
import Network.HTTP (simpleHTTP, getRequest, getResponseBody)
import Text.HTML.TagSoup
import Data.List (isInfixOf, isPrefixOf)
type Link = String
type UserID = String
hostname = "http://chat.stackoverflow.com"
getDivText :: String -> [Tag String] -> String
getDivText m = innerText
. takeWhile (~/= "</div>")
. dropWhile (~/= m)
getURLContent :: Link -> IO String
getURLContent lnk = simpleHTTP (getRequest lnk) >>= getResponseBody
getMessages :: UserID -> IO Integer
getMessages uid = do
contents <- getURLContent $ concat [ hostname
, "/users/"
, uid ]
let count = getDivText "<div class=user-message-count-xxl>"
$ (parseTags contents)
return (read count)
getFirstMessagePage :: UserID -> IO String
getFirstMessagePage uid = searchLast "" . ceiling . (/100)
=<< (fmap fromIntegral $ getMessages uid) where
searchLast body pg = do
let lnk = concat [ hostname
, "/users/"
, uid
, "/?tab=recent&pagesize=100&page="
, show pg ]
body' <- getURLContent lnk
if "monologue" `isInfixOf` body'
then return body'
else searchLast body (pg - 1)
getFirstMessage :: UserID -> IO Link
getFirstMessage ui = do
page <- getFirstMessagePage ui
let tags = filter step $ parseTags page
lnk = fromAttrib "href" $ last tags
return $ concat [ hostname, lnk ]
where step t = isTagOpen t
&& "/transcript" `isPrefixOf` fromAttrib "href" t
main :: IO ()
main = do
putStrLn "Enter user id:"
uid <- getLine
putStrLn "Fetching data..."
lnk <- getFirstMessage uid
putStrLn lnk
I tried to keep the HTTP requests to a minimum.
If the user has removed messages, than there will be a difference between the pages available and the number of messages in the profile.
To solve this, the app recursively lowers the page number till it finds the last message.
-
1\$\begingroup\$ I think my favorite part about this is the example data used. That user is my favorite on SE. \$\endgroup\$Loktar– Loktar2015年08月22日 20:16:16 +00:00Commented Aug 22, 2015 at 20:16
1 Answer 1
body
is unused in searchLast
.
I don't like explicit recursion as it is used in searchLast
. MaybeT
has the correct behavior for what you want.
ala X
turns a function like foldMap
into a function like fold
by applying the newtype constructor X
to it and unwrapping the result of the resulting function so that it looks like the type wrapped in the newtype has all the instances defined on the newtype.
alaf X
works the same, except that it turns a function like foldMap
into a function like foldMap
, so that further ala
-functions can be applied.
Alt
is the newtype wrapper that gives Alternative
s like Maybe
a Monoid
instance, therefore ala Alt foldMap
would be asum
, and alaf Alt foldMap
is something like "asumMap
", being to asum
as foldMap
is to fold
.
(ala MaybeT . alaf Alt) foldMap
= ala MaybeT (alaf Alt foldMap)
= ala MaybeT asumMap
(except that asumMap
doesn't exist) takes [m (Maybe a)]
and combines it into m (Maybe a)
using MaybeT
's Alternative
instance.
getFirstMessagePage :: UserID -> IO String
getFirstMessagePage uid = (ala MaybeT . alaf Alt) foldMap
. map fetchMonologue
. reverse . enumFromTo 0 . ceiling . (/100)
=<< (fmap fromIntegral $ getMessages uid) where
fetchMonologue :: Int -> IO (Maybe String)
fetchMonologue pg = do
let lnk = concat [ hostname
, "/users/"
, uid
, "/?tab=recent&pagesize=100&page="
, show pg ]
body' <- getURLContent lnk
return $ mfilter (isInfixOf "monologue") $ Just body'