5
\$\begingroup\$

I have the ToHtml() Extension Method

public static class StringMethodExtensions
{
 private static string _paraBreak = "\r\n\r\n";
 private static string _link = "<a href=\"{0}\">{1}</a>";
 private static string _linkNoFollow = "<a href=\"{0}\" rel=\"nofollow\">{1}</a>";
 /// <summary>
 /// Returns a copy of this string converted to HTML markup.
 /// </summary>
 public static string ToHtml(this string s)
 {
 return ToHtml(s, false);
 }
 /// <summary>
 /// Returns a copy of this string converted to HTML markup.
 /// </summary>
 /// <param name="nofollow">If true, links are given "nofollow"
 /// attribute</param>
 public static string ToHtml(this string s, bool nofollow)
 {
 StringBuilder sb = new StringBuilder();
 int pos = 0;
 while (pos < s.Length)
 {
 // Extract next paragraph
 int start = pos;
 pos = s.IndexOf(_paraBreak, start);
 if (pos < 0)
 pos = s.Length;
 string para = s.Substring(start, pos - start).Trim();
 // Encode non-empty paragraph
 if (para.Length > 0)
 EncodeParagraph(para, sb, nofollow);
 // Skip over paragraph break
 pos += _paraBreak.Length;
 }
 // Return result
 return sb.ToString();
 }
 /// <summary>
 /// Encodes a single paragraph to HTML.
 /// </summary>
 /// <param name="s">Text to encode</param>
 /// <param name="sb">StringBuilder to write results</param>
 /// <param name="nofollow">If true, links are given "nofollow"
 /// attribute</param>
 private static void EncodeParagraph(string s, StringBuilder sb, bool nofollow)
 {
 // Start new paragraph
 sb.AppendLine("<p>");
 // HTML encode text
 s = HttpUtility.HtmlEncode(s);
 // Convert single newlines to <br>
 s = s.Replace(Environment.NewLine, "<br />\r\n");
 // Encode any hyperlinks
 EncodeLinks(s, sb, nofollow);
 // Close paragraph
 sb.AppendLine("\r\n</p>");
 }
 /// <summary>
 /// Encodes [[URL]] and [[Text][URL]] links to HTML.
 /// </summary>
 /// <param name="text">Text to encode</param>
 /// <param name="sb">StringBuilder to write results</param>
 /// <param name="nofollow">If true, links are given "nofollow"
 /// attribute</param>
 private static void EncodeLinks(string s, StringBuilder sb, bool nofollow)
 {
 // Parse and encode any hyperlinks
 int pos = 0;
 while (pos < s.Length)
 {
 // Look for next link
 int start = pos;
 pos = s.IndexOf("[[", pos);
 if (pos < 0)
 pos = s.Length;
 // Copy text before link
 sb.Append(s.Substring(start, pos - start));
 if (pos < s.Length)
 {
 string label, link;
 start = pos + 2;
 pos = s.IndexOf("]]", start);
 if (pos < 0)
 pos = s.Length;
 label = s.Substring(start, pos - start);
 int i = label.IndexOf("][");
 if (i >= 0)
 {
 link = label.Substring(i + 2);
 label = label.Substring(0, i);
 }
 else
 {
 link = label;
 }
 // Append link
 sb.Append(String.Format(nofollow ? _linkNoFollow : _link, link, label));
 // Skip over closing "]]"
 pos += 2;
 }
 }
 }
}

Do you see any improvement / issue?

200_success
146k22 gold badges190 silver badges479 bronze badges
asked Nov 24, 2014 at 21:16
\$\endgroup\$
4
  • 3
    \$\begingroup\$ I personally like the var keyword more than actual types :) \$\endgroup\$ Commented Nov 24, 2014 at 21:23
  • \$\begingroup\$ Are you trying to handle some standard markup language, or are you inventing your own syntax for specifying links? \$\endgroup\$ Commented Nov 24, 2014 at 21:23
  • \$\begingroup\$ I want to use some standard markup language. \$\endgroup\$ Commented Nov 29, 2014 at 9:03
  • \$\begingroup\$ Looks like this code originally came from here \$\endgroup\$ Commented Nov 17, 2020 at 15:27

2 Answers 2

2
\$\begingroup\$
public static string ToHtml(this string s, bool nofollow)
{
 s = HttpUtility.HtmlEncode(s);
 string[] paragraphs = s.Split(new string[]{"\r\n\r\n"}, StringSplitOptions.None);
 StringBuilder sb = new StringBuilder();
 foreach (string par in paragraphs) {
 sb.AppendLine("<p>");
 string p = par.Replace(Environment.NewLine, "<br />\r\n");
 if (nofollow) {
 p = Regex.Replace(p, @"\[\[(.+)\]\[(.+)\]\]", "<a href=\"2ドル\" rel=\"nofollow\">1ドル</a>");
 p = Regex.Replace(p, @"\[\[(.+)\]\]", "<a href=\"1ドル\" rel=\"nofollow\">1ドル</a>");
 } else {
 p = Regex.Replace(p, @"\[\[(.+)\]\[(.+)\]\]", "<a href=\"2ドル\">1ドル</a>");
 p = Regex.Replace(p, @"\[\[(.+)\]\]", "<a href=\"1ドル\">1ドル</a>");
 sb.AppendLine(p);
 }
 sb.AppendLine("</p>");
 }
 return sb.ToString();
}

In general, if you find yourself with a variable that you pass around through a bunch of static methods (like nofollow), it may be time to make those methods not static.

answered Nov 24, 2014 at 23:43
\$\endgroup\$
2
  • \$\begingroup\$ This code won't compile: you have an ' char before the string literal in the s.Split method call. And the p variable is read-only. And you forgot to append p to the sb. And replacements in the Regex.Replace should be 1ドル but not the \1円. \$\endgroup\$ Commented Nov 25, 2014 at 1:34
  • \$\begingroup\$ Thanks for catching all that. I've edited the post. I also didn't realize C# foreach iteration variables were read-only... seems silly. \$\endgroup\$ Commented Nov 26, 2014 at 8:53
1
\$\begingroup\$

Spacing

Your spacing is too tight, and you should always expand two-line if statements to use braces for clarity. One StyleCop guideline which works well for spacing is to have a space before and after statements using curly braces.

int start = pos;
pos = s.IndexOf(_paraBreak, start);
if (pos < 0)
 pos = s.Length;
string para = s.Substring(start, pos - start).Trim();

Looks better as:

int start = pos;
pos = s.IndexOf(_paraBreak, start);
if(pos < 0)
{
 pos = s,Length;
}
string para = s.SubString(start, pos - start).Trim();

var

Prefer var when declaring a local variable where it's obvious from the right-hand side of the declaration. This saves time when changing variable types, and also keeps the code shorter:

string para = s.SubString(start, pos - start).Trim();

Becomes:

var para = s.SubString(start, pos - start).Trim();

Naming

Variable names like para and s are unhelpful to a maintenance programmer, use longer variable names that describe the purpose of the variable.

Guard clauses

You should validate the parameters to your methods. Yes, even the this parameter, because somebody can do this:

string input = null;
input.ToHTML();

Which causes a NullReferenceException when it should be an ArgumentNullException or a returned empty string.

answered Dec 2, 2014 at 13:18
\$\endgroup\$
2
  • \$\begingroup\$ Obviously it's a matter of taste, but you really don't need braces AND a blank line of padding for a one statement if block. Six lines of code just for a single conditional call or assignment is ridiculous. The blank lines alone do just fine to separate the block, and then you can leave off the braces without visual issue. \$\endgroup\$ Commented Jul 26, 2019 at 20:38
  • \$\begingroup\$ The issue I have with leaving the braces off of single-line if-statements is that (for me at least) I find it harder to register the end of the if statement than if it was wrapped nicely in braces, and so it increases the risk that I try to add another line, thinking it would be part of the if condition, when it wouldn't. I do not personally consider lines of code (when they're not statements) to have a "cost" on modern screens, and giving code room to breathe is better than condensing it, in my opinion. \$\endgroup\$ Commented Jul 29, 2019 at 15:14

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.