I wrote this script that downloads an RSS feed and converts it to HTML. One of my only concerns is my variable naming because I suck at naming things.
# -*- coding: utf-8 -*-
"""Simple RSS to HTML converter."""

__version__ = "0.0.2"
__author__ = "Ricky L Wilson"

from bs4 import BeautifulSoup
from feedparser import parse as parse_feed


TEMPLATE = u"""
<h2 class='title'>{title}</h2>
<a class='link' href='{link}'>{title}</a>
<span class='description'>{summary}</span>
"""


def entry_to_html(**kwargs):
    """Formats feedparser entry."""
    return TEMPLATE.format(**kwargs).encode('utf-8')


def convert_feed(url):
    """Main loop."""
    html_fragments = [entry_to_html(**entry) for entry in parse_feed(url).entries]
    return BeautifulSoup("\n".join(html_fragments), 'lxml').prettify()


def save_file(url, filename):
    """Saves data to disc."""
    with open(filename, 'w') as file_object:
        file_object.write(convert_feed(url).encode('utf-8'))


if __name__ == '__main__':
    save_file('http://stackoverflow.com/feeds', 'index.html')
    with open('index.html') as fobj:
        print fobj.read()
2 Answers
About the generated HTML
Each feed item should become an article element. This would also allow you to include the generated HTML on any page without having to adjust the heading levels. Unless you know that h2 would always be the correct heading element in your case, you might want to use h1 instead.
You could give the a element the bookmark link type.
Unless you remove markup from the feed item's description, you might want to use a div instead of a span, because a span can't contain many elements that might appear in the description.
You could specify a class on the article element and omit the classes on its child elements. That leaves less potential for conflicts with classes that might already be used in the document.
<article class='feed-item'>
<h1>{title}</h1>
<a href='{link}' rel='bookmark'>{title}</a>
<div>{summary}</div>
</article>
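As a rough sketch of how this markup could be plugged into the question's formatting function: the name ARTICLE_TEMPLATE and the sample dictionary are hypothetical, and escaping the title and link with html.escape is my own assumption, not something the suggestions above ask for. The sketch targets Python 3, while the script in the question is Python 2.

```python
from html import escape

# Hypothetical name; the markup is the article-based template suggested above.
ARTICLE_TEMPLATE = u"""
<article class='feed-item'>
<h1>{title}</h1>
<a href='{link}' rel='bookmark'>{title}</a>
<div>{summary}</div>
</article>
"""


def entry_to_html(entry):
    """Format one feed entry (any dict-like object) as an article element."""
    return ARTICLE_TEMPLATE.format(
        # Assumption: title and link are plain text, so they are escaped.
        title=escape(entry.get('title', u'')),
        link=escape(entry.get('link', u'')),
        # The summary is inserted as-is because feed descriptions
        # usually contain markup of their own.
        summary=entry.get('summary', u''),
    )


# Stand-in for a feedparser entry, so the sketch runs without a network call.
sample_entry = {
    'title': u'Tips & tricks',
    'link': u'http://example.com/?a=1&b=2',
    'summary': u'<p>Some text</p>',
}
article_html = entry_to_html(sample_entry)
```

Passing the entry as a single argument instead of unpacking it with ** is only a matter of taste here; it makes it easier to supply defaults for missing keys.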
An alternative solution would be to use the Mako template engine. You first need to install it:
pip install Mako
Then create an HTML template file, template.html, that you will render on the fly:
<html>
<body>
% for feed in feeds:
<div>
<h2 class='title'>${feed.title}</h2>
<a class='link' href='${feed.link}'>${feed.title}</a>
<span class='description'>${feed.summary}</span>
</div>
% endfor
</body>
</html>
Then, you need to render this HTML file with the context of your feeds:
from feedparser import parse as parse_feed
from mako.template import Template


def convert_feed(url, filename):
    """Convert feed to an HTML."""
    with open(filename, 'w') as file_object:
        html_content = Template(
            filename='template.html',
            output_encoding='utf-8',
        ).render(feeds=parse_feed(url).entries)
        file_object.write(html_content)


if __name__ == '__main__':
    convert_feed('http://stackoverflow.com/feeds', 'index.html')