I've developed a simple application to grab golfer index scores from a website that has no API. The application works but is very slow, with 6 users that require updating takes 60 seconds. I've tried to make my web requests asyncronous to offset some of this lag but it only resulted in a 15% increase in performance.
Description of the code:
On my view I have anchor tag that when clicked hides all the elements in the DOM and loads a preloader, after the preloader is appended to the DOM a AJAX call is executed that calls the UpdatedHandicap method on a controller in my project. From there we await a static method GrabIndexValue. All the code works, it's just very slow.
Possible solution:
This website allows me to input multiple GHIN #s however the result set is in a table with strangely generated xpaths:
//*[@id="ctl00_bodyMP_gvLookupResults_ctl02_lblHI"] which returns index of: 10.5
//*[@id="ctl00_bodyMP_gvLookupResults_ctl03_lblHI"] which returns index of: 9
//*[@id="ctl00_bodyMP_gvLookupResults_ctl04_lblHI"] which returns index of: 13.5
I don't know how to dynamically grab those result sets and parse them properly. So I feel like I'm stuck doing 1 web request per index value.
Async controller method:
public async Task<ActionResult> UpdateHandicap()
{
//Fetch all the golfers
var results = db.Users.ToList();
//Iterate through the golfers and update their index value based off their GHIN #. We store this
//value in the database to make our handicap calculation
foreach (Users user in results)
{
user.Index = await Calculations.GrabIndexValue(user.GHID);
db.Entry(user).State = EntityState.Modified;
db.SaveChanges();
}
return RedirectToAction("Index", "Users");
}
Method to actually grab the data:
public static async Task<double?> GrabIndexValue(int ghin)
{
string url = $"http://xxxxxxxxx/Widgets/HandicapLookupResults.aspx?entry=1&ghinno="+ghin+"&css=default&dynamic=&small=0&mode=&tab=0";
HtmlWeb web = new HtmlWeb();
HtmlDocument htmlDoc = await Task.Run(()=>web.Load(url));
string index = null;
try
{
index = htmlDoc.DocumentNode.SelectNodes("//*[@id=\"ctl00_bodyMP_grdClubs\"]/tbody/tr/td[2]")[0].InnerText;
}
catch
{ }
if (index != null)
{
var indexResult = Convert.ToDouble(index);
return indexResult;
}
else
{
return null;
}
}
Any insight into this predicament is greatly appreciated
Thank you
2 Answers 2
public async Task<ActionResult> UpdateHandicap()
We add the Async
suffix to methods that are marked with the async
keyword. This is a naming convention that as you'll see in a moment Entity Framework follows too.
var results = db.Users.ToList();
You can use the Users
directly in the loop, you don't have to call ToList
first.
foreach (Users user in results) { user.Index = await Calculations.GrabIndexValue(user.GHID); db.Entry(user).State = EntityState.Modified; db.SaveChanges(); }
The async
calls are incomplete. SaveChanges
would still block so you also want to use the
await db.SaveChangesAsync();
but do you really need to call SaveChanges
in a loop? This could be bad for the performance. I think you should do it after the loop:
foreach (var user in db.Users)
{
user.Index = await Calculations.GrabIndexValue(user.GHID);
db.Entry(user).State = EntityState.Modified;
}
await db.SaveChangesAsync();
Don't Try if you aren't going to do something with what you catch, I didn't think you were able to write a catch block without a parameter anyway.
Then you are going to want to change that ugly if statement and make it much cleaner with a ternary Statement
This:
try { index = htmlDoc.DocumentNode.SelectNodes("//*[@id=\"ctl00_bodyMP_grdClubs\"]/tbody/tr/td[2]")[0].InnerText; } catch { } if (index != null) { var indexResult = Convert.ToDouble(index); return indexResult; } else { return null; }
Becomes this:
index = htmlDoc.DocumentNode.SelectNodes("//*[@id=\"ctl00_bodyMP_grdClubs\"]/tbody/tr/td[2]")[0].InnerText;
Return index != null ? Convert.ToDouble(index) : null;
You should also use SelectSingleNode
instead of SelectNodes
so that you return a single node instead of an array of nodes.
If you really don't want an exception in the method to stop the whole process then you need to put something into the Catch that will let you know that something happened, logging would be preferable, but you could also put in a negative number for the index of that Golfer and then you would know which Golfer the Exception happened on and you would be able to report on that information as well...(I assumed that these "index" numbers are all positive or null.)
you could also change your
public async Task<ActionResult> UpdateHandicap() { //Fetch all the golfers var results = db.Users.ToList(); //Iterate through the golfers and update their index value based off their GHIN #. We store this //value in the database to make our handicap calculation foreach (Users user in results) { user.Index = await Calculations.GrabIndexValue(user.GHID); db.Entry(user).State = EntityState.Modified; db.SaveChanges(); } return RedirectToAction("Index", "Users"); }
into a Linq statement(ish) and do what @t3chb0t said about saving the changes outside of the loop as well.
another thing that someone mentioned in chat was that you may not need to mess with the EntityState
because EF usually handles that fairly well on its own. so your code would look more like this
public async Task<ActionResult> UpdateHandicap()
{
var results = db.Users.ToList().ForEach(user => user.Index = await Calculations.GrabIndexValue(user.GHID));
db.SaveChanges();
return RedirectToAction("Index", "Users");
}
-
\$\begingroup\$ Hm, well if it fails to return a result I just want it to continue on and go to the next web request. I don't want the application to throw a null reference exception. When I caught the exception before it halted the application. What do you suggest? \$\endgroup\$jon.r– jon.r2017年01月12日 14:40:20 +00:00Commented Jan 12, 2017 at 14:40
-
\$\begingroup\$ I added a little something to address if you want to catch to keep the application going. \$\endgroup\$Malachi– Malachi2017年01月12日 15:06:45 +00:00Commented Jan 12, 2017 at 15:06
-
\$\begingroup\$ Thanks for the feedback, I will look to incorporate this. People who are extremely good at golf could have a negative index/handicap surprisingly enough, \$\endgroup\$jon.r– jon.r2017年01月12日 15:07:58 +00:00Commented Jan 12, 2017 at 15:07
-
\$\begingroup\$ you could make it outrageously negative then so that you know or give high indexes for certain errors \$\endgroup\$Malachi– Malachi2017年01月12日 15:11:22 +00:00Commented Jan 12, 2017 at 15:11
-
\$\begingroup\$ Thanks @Malachi . I never used async/await before. Do you think I implemented it correctly/well? \$\endgroup\$jon.r– jon.r2017年01月12日 15:12:23 +00:00Commented Jan 12, 2017 at 15:12
Explore related questions
See similar questions with these tags.