5
\$\begingroup\$

I wrote code for my needs (i.e. collecting YouTube links from a page) and it is simple, but now I just want to know what can I do just to make it a better and properly written one. For example, I think I should have try/catch blocks/exceptions in my methods, or get/set properties. What tests can I write to test the code? I don't know to be honest, but how would the code look if you had written it?

public class Parser
{
 public static void Main()
 {
 Console.WriteLine("Enter the url of the page:");
 var url = Console.ReadLine();
 var numOfPages = GetNumberOfPages(url);
 var numOfLinks = 1;
 for (int page = 0; page <= numOfPages; page++)
 {
 if (page != 1)
 {
 if (page == 0)
 {
 GetYouTubeLink(url).ForEach(x => Console.WriteLine($"{numOfLinks++}." +
 $"{x.Replace("embed/", "watch?v=").Replace("?feature=oembed", "")}"));
 }
 else
 {
 var newUrl = url + string.Format($"{page}/");
 GetYouTubeLink(newUrl).ForEach(x => Console.WriteLine($"{numOfLinks++}." +
 $"{x.Replace("embed/", "watch?v=").Replace("?feature=oembed", "")}"));
 newUrl = string.Empty;
 }
 }
 }
 }
 private static HtmlDocument GetWebPageHtmlFromUrl(string url)
 {
 var webGet = new HtmlWeb();
 return webGet.Load(url);
 }
 private static int GetNumberOfPages(string url)
 {
 var pages = GetWebPageHtmlFromUrl(url)
 .DocumentNode.SelectNodes("//div[@class='page-link']")
 .Descendants("a")
 .Last().InnerText;
 return int.Parse(pages);
 }
 private static List<string> GetYouTubeLink(string url)
 {
 var links = new List<string>();
 var iframeLinks = GetWebPageHtmlFromUrl(url).DocumentNode.SelectNodes("//iframe");
 if (iframeLinks != null)
 {
 foreach (var link in iframeLinks)
 {
 links.Add(link.Attributes["src"].Value);
 }
 }
 return links;
 }
}
Sᴀᴍ Onᴇᴌᴀ
29.5k16 gold badges45 silver badges201 bronze badges
asked Apr 26, 2018 at 0:12
\$\endgroup\$
1
  • \$\begingroup\$ CRs are so much easier when I can just paste the code in VS. Thank you! \$\endgroup\$ Commented Apr 30, 2018 at 22:14

1 Answer 1

1
\$\begingroup\$

Comments inline:

public class Parser
{
 public static void Main()
 {
 Console.WriteLine("Enter the url of the page:");
 var url = Console.ReadLine();
 var numOfPages = GetNumberOfPages(url);
 var numOfLinks = 1;
 for (int page = 0; page <= numOfPages; page++)
 {
 // if (page != 1) is redundant in the original code, did you mean numOfPages?
 var newUrl = url + (page > 0 ? string.Format($"{page}/") : "");
 GetYouTubeLinks(newUrl).ForEach(x => Console.WriteLine($"{numOfLinks++}." +
 x.Replace("embed/", "watch?v=").Replace("?feature=oembed", ""))); // remove string interpolation since it's just one value
 // newUrl = string.Empty; // Not needed
 }
 }
 private static HtmlDocument GetWebPageHtmlFromUrl(string url)
 {
 var webGet = new HtmlWeb();
 return webGet.Load(url);
 }
 private static int GetNumberOfPages(string url)
 {
 var pages = GetWebPageHtmlFromUrl(url)
 .DocumentNode.SelectNodes("//div[@class='page-link']") // This parsing may work on specific pages, but it won't work on all pages. This is concerning since the user is prompted to enter any URL?
 .Descendants("a")
 .Last().InnerText;
 return int.Parse(pages);
 }
 private static List<string> GetYouTubeLinks(string url) // Made plural
 {
 var iframeLinks = GetWebPageHtmlFromUrl(url).DocumentNode.SelectNodes("//iframe");
 if (iframeLinks == null) // reduce nesting
 {
 return new List<string>(0);
 }
 return iframeLinks.Select(link => link.Attributes["src"].Value).ToList(); // simplify allocation
 }
}
answered Apr 30, 2018 at 22:13
\$\endgroup\$
5
  • \$\begingroup\$ Actually I am checking if the page is not 1, because the first page actually is 0 and the second is 2. And the other thing is that I am getting the number of pages only from the first one, so I don't need to call this method on every page. \$\endgroup\$ Commented May 1, 2018 at 23:12
  • \$\begingroup\$ Why not just make the first page 1? I.e. for (int page = 1; page <= numOfPages; page++) I'm confused by your "getting the number of pages" once comment. I'm only calling GetNumberOfPages once, same as you. My comment about numOfPages wouldn't have required any recalculation, I was just trying to figure out what you were trying to accomplish. \$\endgroup\$ Commented May 2, 2018 at 15:17
  • \$\begingroup\$ Well, will try to explain it. So when I call GetNumberOfPages(), on every page on the bottom there is pagination and the number of pages, So I just call this method on the first page and I have the number of the pages. First page in the URL is like this: website.com/ , the second is website.com/2/ so I don't have page 1 as you suggested me to use if in the for loop (don't have website.com/1/) that's why I check if the loop isn't 1. \$\endgroup\$ Commented May 2, 2018 at 20:39
  • \$\begingroup\$ So you could just start at 1 instead of 0 and do var newUrl = url + (page > 1 ? string.Format($"{page}/") : ""); \$\endgroup\$ Commented May 2, 2018 at 23:29
  • \$\begingroup\$ Yep, it works and looks great to be honest, thank you for the help! Great tips and suggestions :) \$\endgroup\$ Commented May 3, 2018 at 1:20

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.