June 26, 2007
Reconstruct a Feed's History Using Google Reader
Google Reader is more than a feed reader: it's also a platform for feed caching and archiving. That means Google Reader stores all the posts from the subscribed feeds and they're available if you keep scrolling down in the interface.
A simple application for this feature is to retrieve the history of a feed for archiving purposes or to import it into a database. If you visit a blog or a news site, the feed will only contain the latest 10-20 posts, but Google Reader can show you more than that.
Just enter this URL in the address bar:
http://www.google.com/reader/atom/feed/FEED_URL?r=n&n=NUMBER_OF_ITEMS
and replace FEED_URL with the address of the feed and NUMBER_OF_ITEMS with the number of historical posts you want from the feed.
For example, http://www.google.com/reader/atom/feed/http://feeds.feedburner.com/GoogleOperatingSystem?r=n&n=100 should return the latest 100 posts from this blog as an Atom/XML file.
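If you want to build these URLs from a script rather than by hand, the substitution above can be sketched in Python. This is only a sketch: the function name reader_history_url is made up for illustration, and escaping characters such as ? and & in the feed address is an assumption, not something Google documents.

```python
from urllib.parse import quote, urlencode

READER_ATOM_BASE = "http://www.google.com/reader/atom/feed/"


def reader_history_url(feed_url, count):
    """Fill in FEED_URL and NUMBER_OF_ITEMS in the URL pattern from the post."""
    if count < 1:
        raise ValueError("count must be a positive integer")
    # Assumption: leave ':' and '/' readable, but escape '?', '&', '#' and
    # similar characters so they are not read as part of the outer URL.
    encoded_feed = quote(feed_url, safe=":/")
    query = urlencode({"r": "n", "n": count})
    return f"{READER_ATOM_BASE}{encoded_feed}?{query}"


# Reproduces the example URL from the post.
print(reader_history_url("http://feeds.feedburner.com/GoogleOperatingSystem", 100))
```

For a plain feed address like the one in the example, the result is identical to the URL typed by hand.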
24 comments:
Wooow, who would have thought that?
Hopefully, someone in the future could use this to read what his/her mother/father wrote about as a teenager. Better than pics and family tales, huh?
Honestly, I never used Google Reader before. The way you wrote this article makes me want to try this great application... :)
I had this question, which I asked at DigitalPoint a couple of days back.
But how do they cache new feeds?
I've been using Google Reader for a couple of months, but there is one thing that is missing... a SEARCH! It's amazing that a Google application doesn't have search included... And about this feature, it's great. I use it too.
It requires that you have an account on Google Reader. Has it always been like this?
To use Google Reader, you need an account.
How about using this programmatically? Has anyone done it?
S.,
Using Perl and LWP you can easily access this data. Note that you need to authenticate yourself before accessing Google Reader.
You can download up to a maximum of 5,000 entries per feed.
I was excited to find this article, because within a short period of time my blog's database was lost on both my hosting company's servers and my local computer. Anyway, the majority of the content is cached by Google Reader and I am hoping to find a way to retrieve the lost data and import it back into my WordPress blog.
I tried this method and see that it works fine in the example provided, but I cannot replicate the results with my blog. I suspect it has to do with the syntax of my RSS feed's URL. My site is cached within Google Reader using the standard WordPress URL "www.domainname.com/?feed=rss2". I didn't list my specific URL, because it's adult-oriented content, but you can find it by following the link from my name (NSFW). Again, I'm guessing, but I think that question mark is confusing the function. I am wondering if it should be escaped somehow or if anyone knows another method.
Thanks in advance!
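The guess about the question mark can be checked offline with Python's standard URL parser. The sketch below uses the placeholder domain from the comment with an http:// prefix added as an assumption. It only shows how a generic URL parser splits the address; whether Google Reader accepts the escaped form is not confirmed here.

```python
from urllib.parse import quote, urlsplit

BASE = "http://www.google.com/reader/atom/feed/"
# Placeholder address from the comment; the "http://" prefix is assumed.
feed_url = "http://www.domainname.com/?feed=rss2"

# Feed address pasted in unchanged: the first '?' ends the path, so the
# feed's own parameter is merged into the outer query string.
raw = urlsplit(f"{BASE}{feed_url}?r=n&n=100")
print(raw.path)
print(raw.query)

# Feed address with '?' and '=' percent-encoded: the whole feed address
# stays in the path and the outer query string is left intact.
escaped = urlsplit(f"{BASE}{quote(feed_url, safe=':/')}?r=n&n=100")
print(escaped.path)
print(escaped.query)
```

In the first case the parser sees feed=rss2?r=n&n=100 as the query, so the n parameter never arrives as intended. In the second the query is just r=n&n=100.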
Google Reader caches a feed only if there's at least one subscriber and starting with the moment when someone subscribes to that feed. For example, if the first Google Reader subscriber added the feed on October 21st 2007, the cache will include posts published starting with that date.
There is a 1,000-item limit for me.
How could we bypass this?
I want to import all my starred items.
Regards,
Antoine
It doesn't work for me either. I'm trying to recover old posts from a fotolog account and it doesn't really work. It only shows 100 items no matter what number I put after n=. Maybe it has something to do with the RSS feed's URL, which is http://www.fotolog.com/username/feed/main/rss20. Any idea?
Thanks a lot
Andrés
Probably nobody subscribed to the feed and Google didn't cache the posts.
Is there any way to delete this history so that past posts that have been deleted do not appear in the cache to new subscribers, and possibly current subscribers?
Same problem! Content owners have every right to control their content, including what appears in the feed. Google Reader should protect this right and let webmasters or content owners control their feed content, as with the normal search cache!
No, it's not possible to clear the cache.
I have the file but can't import it into WordPress. Any suggestions on getting this into a format I can import?
Has anyone tried to use feedparser to do this?
I don't know how I can log in from a Python script...
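For the parsing half of that question, here is a minimal sketch that reads a saved Atom file using only Python's standard library instead of feedparser. It does not cover logging in. The sample feed and the extract_entries helper are made up for illustration, and a real Reader export may contain extra elements that this ignores.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample standing in for an Atom file saved from the browser.
SAMPLE_ATOM = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry>
    <title>First post</title>
    <link rel="alternate" href="http://example.com/first"/>
    <published>2007-06-25T10:00:00Z</published>
  </entry>
  <entry>
    <title>Second post</title>
    <link rel="alternate" href="http://example.com/second"/>
    <published>2007-06-26T09:30:00Z</published>
  </entry>
</feed>
"""


def extract_entries(atom_xml):
    """Return title, link and publication date for each Atom entry."""
    root = ET.fromstring(atom_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        link = entry.find("atom:link[@rel='alternate']", ATOM_NS)
        entries.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM_NS),
            "link": link.get("href", "") if link is not None else "",
            "published": entry.findtext("atom:published", default="", namespaces=ATOM_NS),
        })
    return entries


for item in extract_entries(SAMPLE_ATOM):
    print(item["published"], item["title"], item["link"])
```

The resulting list of dictionaries is a convenient starting point for writing rows into a database or converting the posts into whatever import format a blog platform accepts.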
I was excited to hear this, for a moment. But after reading the comments, I was disappointed. I wanted to fetch historical feeds of many websites, and after reading Ion Alex Chitu's comment, I felt it's the same problem I had with NewsBlur (www.newsblur.com). Unless someone has already subscribed to a blog, you won't get the old feed entries of a website, and you can't guarantee that for any website.
Can we use Google Reader (as a platform) for commercial use?
This is brilliant, but has anyone found a solution to the 1000-post limit?
>>> http://www.google.com/reader/atom/feed/http://feeds.feedburner.com/GoogleOperatingSystem?r=n&n=100
Yes, it works, but I must log in to Google in order to use it.
Is there any way I can use it anonymously?