How can I make this more readable?
public class RssFeed
{
string url = "http://www.pwop.com/feed.aspx?show=dotnetrocks&filetype=master";
public string responseBody;
private string DATE_FORMAT = "ddd, dd MMM yyyy HH:mm:ss \\E\\D\\T";
public Story[] Getstories()
{
getContent();
var stories = Parse(responseBody);
return GetMostRecent5(stories, new StoryComparer());
}
public void getContent()
{
HttpWebRequest req = WebRequest.Create(url) as HttpWebRequest;
WebResponse response = req.GetResponse();
StreamReader sr = new StreamReader(response.GetResponseStream());
responseBody = sr.ReadToEnd();
}
static Story[] Parse(string content)
{
var items = new List<string>();
int start = 0;
while (true)
{
var nextItemStart = content.IndexOf("<item>", start);
var nextItemEnd = content.IndexOf("</item>", nextItemStart);
if (nextItemStart < 0 || nextItemEnd < 0) break;
String nextItem = content.Substring(nextItemStart, nextItemEnd + 7 - nextItemStart);
items.Add(nextItem);
start = nextItemEnd;
}
var stories = new List<Story>();
for (byte i = 0; i < items.Count; i++)
{
stories.Add(new Story() {
title = Regex.Match(items[i], "(?<=<title>).*(?=</title>)").Value,
link = Regex.Match(items[i], "(?<=<link>).*(?=</link>)").Value,
date = Regex.Match(items[i], "(?<=<pubDate>).*(?=</pubdate>)").Value
});
}
return stories.ToArray();
}
private static T[] GetMostRecent5<T>(T[] stories, IComparer<T> comparer)
{
List<T> recentStories = stories.Take(5).ToList();
recentStories.Sort(comparer);
var recentStoriesArray = new T[5];
for (int i = 0; i <= 5; i++)
{
recentStoriesArray[i] = recentStories[i];
}
return recentStoriesArray;
}
public class Story
{
public string title;
public string link;
public string date;
public override bool Equals(object obj)
{
return this.Equals(obj as Story);
}
public bool Equals(Story story)
{
return this.link == story.link;
}
}
private class StoryComparer : IComparer<Story>
{
public int Compare(Story x, Story y)
{
if (x.Equals(y))
return 0;
//redundant else
else
return x.date.CompareTo(y.date);
}
}
}
4 Answers 4
Not that ugly. I've seen ugly code, and trust me, this is not ugly.
However, this is kind of a "God Class" that has more than a single responsibility, hence it violates SRP principle.
How to improve
Separate the class to:
ContentFetcher, which is responsible for fetching only
StoryParser, which parses the content and creates instances of your object model (Story[]
)
Both of the above should implement the Try..
pattern, so that the single method in the ContentFetcher
has the following signature:
bool TryFetch(out string content){ /* code */ }
The same goes for StoryParser
.
As for the fetcher itself
You can choose if the URL is either injected in the constructor, or is part of the method signature. Modern OOP coding is biased towards the constructor technique.
The code should implement the
using (...) { ... }
pattern for classes that implement theIDisposable
interface, and be wrapped in a try-catch what so ever.
As for the parser itself
since rss is a format on top of xml, I don't think it's a good idea to parse it using regular expressions. Let the framework do the parsing by using either
XmlDocument
orXDocument
. Or use an external library to parse the string for you (hopefully it's implemented by the above techniques, and not by usingRegex
es).a try-catch might be needed here, too
Putting it all together
So finally, the class you have should use the above two classes (a fetcher and a parser), and then, getting the stories from the feed should look like this:
public Story[] Getstories()
{
string content;
if (!this.fetcher.TryGetContent(out content)) return null;
Story[] stories;
if (!this.parser.TryParse(out stories)) return null;
return GetMostRecent5(stories, new StoryComparer());
}
As you can see, null
is a special value for failure. You can change that to TryGetStories
which returns a boolean
and and out
parameter if you want to avoid the special value of null
.
The actual instances of fetcher
and parser
can be created by the class itself (in the constructor) and/or be injected to the class by the constructor. Let's see the injection pattern:
public class RssFeed
{
private readonly ContentFetcher fetcher;
private readonly StoryParser parser;
public RssFeed(ContentFetcher fetcher, StoryParser parser)
{
if (null == fetcher) throw new ArgumentNullException("fetcher");
if (null == parser) throw new ArgumentNullException("parser");
this.fetcher = fetcher;
this.parser = parser;
}
}
As a final step of refactoring, you can extract the preparing code to its own method, moving the parsed stories to be part of the class' state:
private Story[] stories = null;
private bool Prepare()
{
if (this.stories != null) return true;
string content;
if (!this.fetcher.TryGetContent(out content)) return false;
Story[] temp;
if (!this.parser.TryParse(out temp)) return false;
this.stories = temp;
return true;
}
public Story[] Getstories()
{
if (!Prepare()) return null;
return GetMostRecent5(this.stories, new StoryComparer());
}
Good luck!
To add to what Ron Klein said (in no particular order):
- Use
WebClient
(or possiblyHttpClient
, if you're on .Net 4.5) instead ofWebRequest
andStreamReader
, it will make your code simpler. - Consider using
var
more in your code (although this is a question of style and not everybody would agree with that). - If you know a method returns more specific type than its return type claims (such as
WebRequest.Create()
with HTTP URL), use cast instead ofas
. This way, if something goes wrong, you will immediately get a helpfulInvalidCastException
, instead of a confusingNullReferenceException
later. - Use consistent naming conventions, ideally the ones recommended by Microsoft. (So, don't call one method
getContent
and anotherGetStories
). - Consider using interfaces like
IEnumerable<T>
andIList<T>
(andIReadOnlyList<T>
if you're on .Net 4.5) instead of arrays. Arrays cannot change their length, but the items in them can change. Most of the time, this isn't what you want. You either want to allow mutating both the length and the individual items (in that case, useIList<T>
), or don't allow mutating either (in that case useIEnumerable<T>
orIReadOnlyList<T>
). - A method like
GetMostRecent5()
is awfully specific, consider changing it to something likeGetMostRecent()
and passing the number as parameter. - Don't use fields the way you use
responseBody
. Instead use return value of the method (orout
parameter, as Ron suggested). - Don't use regular expressions to parse XML, use an XML library, such as LINQ to XML. (Ron already mentioned this, but I think this is so important it bears repeating.)
A few additional things:
- Story should probably implement IEquatable<Story>, since you already have the method implementation present on the class.
- Since you are overriding Equals on Story, you should also override GetHashCode. The compiler should actually be giving you warnings about this already.
- Consider changing Story to expose its values as properties rather than public fields. This gives you the flexibility to change their underlying behavior later without breaking anyone referencing the class.
- Re: the XML parsing - I would agree with the sentiment that you should avoid Regex for this. You may consider using Data Contracts.
- If the date order is something you always want to use for stories, you may consider moving the comparison code to the Story class and mark it as implementing IComparable<Story>. Once you do that, your list.Sort will no longer require the comparer parameter, as it will identify Story as IComparable and use its CompareTo method.
- Alternatively, you may simply use a LINQ IEnumerable<T>.OrderBy statement instead of List<T>.Sort.
I want to emphasize how important it is to override
GetHashCode()
is when you overrideEquals
. If you don't, all hashtable based collections/algorithms will sometimes fail in unexpected ways. This includesHashSet<T>
,Dictionary<K,V>
and LINQ functions likeDistinct
orGroupBy
.GetMostRecent5
can be simplified:You can use
ToArray
to create an array from a list instead of writing a loopprivate static T[] GetMostRecent5<T>(T[] stories, IComparer<T> comparer) { List<T> recentStories = stories.Take(5).ToList(); recentStories.Sort(comparer); var recentStoriesArray = recentStories.ToArray(); return recentStoriesArray; }
You can use
Array.Sort
so you don't need to copy from a list.private static T[] GetMostRecent5<T>(T[] stories, IComparer<T> comparer) { T[] recentStories = stories.Take(5).ToArray(); Array.Sort(recentStories, comparer); return recentStories; }
You can use LINQ for sorting, so you don't need a temporary array.
private static T[] GetMostRecent5b<T>(T[] stories, IComparer<T> comparer) { return stories.Take(5).OrderBy(story => story, comparer).ToArray(); }
You don't need a custom comparer. LINQ's
OrderBy
takes a projection as first parameter. You can simply project tostory.Date
.private static T[] GetMostRecent5b<T>(T[] stories) { return stories.Take(5).OrderBy(story => story.Date).ToArray(); }
At this point it's so short that it might not even deserve its own method anymore.
Since you call
Take(5)
before sorting you're relying on the items being sorted by theDate
in descending order. If that assumption is correct you don't actually need to sort again, you can simply reverse the order to ascending.stories.Take(5).Reverse().ToArray();
If that assumption is not always true you need to sort before calling
Take(5)
, else you won't get the most recent items.stories.OrderBy(story => story.Date).Take(5).Reverse().ToArray();