I have the ToHtml()
Extension Method
public static class StringMethodExtensions
{
private static string _paraBreak = "\r\n\r\n";
private static string _link = "<a href=\"{0}\">{1}</a>";
private static string _linkNoFollow = "<a href=\"{0}\" rel=\"nofollow\">{1}</a>";
/// <summary>
/// Returns a copy of this string converted to HTML markup.
/// </summary>
public static string ToHtml(this string s)
{
return ToHtml(s, false);
}
/// <summary>
/// Returns a copy of this string converted to HTML markup.
/// </summary>
/// <param name="nofollow">If true, links are given "nofollow"
/// attribute</param>
public static string ToHtml(this string s, bool nofollow)
{
StringBuilder sb = new StringBuilder();
int pos = 0;
while (pos < s.Length)
{
// Extract next paragraph
int start = pos;
pos = s.IndexOf(_paraBreak, start);
if (pos < 0)
pos = s.Length;
string para = s.Substring(start, pos - start).Trim();
// Encode non-empty paragraph
if (para.Length > 0)
EncodeParagraph(para, sb, nofollow);
// Skip over paragraph break
pos += _paraBreak.Length;
}
// Return result
return sb.ToString();
}
/// <summary>
/// Encodes a single paragraph to HTML.
/// </summary>
/// <param name="s">Text to encode</param>
/// <param name="sb">StringBuilder to write results</param>
/// <param name="nofollow">If true, links are given "nofollow"
/// attribute</param>
private static void EncodeParagraph(string s, StringBuilder sb, bool nofollow)
{
// Start new paragraph
sb.AppendLine("<p>");
// HTML encode text
s = HttpUtility.HtmlEncode(s);
// Convert single newlines to <br>
s = s.Replace(Environment.NewLine, "<br />\r\n");
// Encode any hyperlinks
EncodeLinks(s, sb, nofollow);
// Close paragraph
sb.AppendLine("\r\n</p>");
}
/// <summary>
/// Encodes [[URL]] and [[Text][URL]] links to HTML.
/// </summary>
/// <param name="text">Text to encode</param>
/// <param name="sb">StringBuilder to write results</param>
/// <param name="nofollow">If true, links are given "nofollow"
/// attribute</param>
private static void EncodeLinks(string s, StringBuilder sb, bool nofollow)
{
// Parse and encode any hyperlinks
int pos = 0;
while (pos < s.Length)
{
// Look for next link
int start = pos;
pos = s.IndexOf("[[", pos);
if (pos < 0)
pos = s.Length;
// Copy text before link
sb.Append(s.Substring(start, pos - start));
if (pos < s.Length)
{
string label, link;
start = pos + 2;
pos = s.IndexOf("]]", start);
if (pos < 0)
pos = s.Length;
label = s.Substring(start, pos - start);
int i = label.IndexOf("][");
if (i >= 0)
{
link = label.Substring(i + 2);
label = label.Substring(0, i);
}
else
{
link = label;
}
// Append link
sb.Append(String.Format(nofollow ? _linkNoFollow : _link, link, label));
// Skip over closing "]]"
pos += 2;
}
}
}
}
Do you see any improvement / issue?
-
3\$\begingroup\$ I personally like the var keyword more than actual types :) \$\endgroup\$Knerd– Knerd2014年11月24日 21:23:28 +00:00Commented Nov 24, 2014 at 21:23
-
\$\begingroup\$ Are you trying to handle some standard markup language, or are you inventing your own syntax for specifying links? \$\endgroup\$200_success– 200_success2014年11月24日 21:23:37 +00:00Commented Nov 24, 2014 at 21:23
-
\$\begingroup\$ I want to use some standard markup language. \$\endgroup\$sDima– sDima2014年11月29日 09:03:43 +00:00Commented Nov 29, 2014 at 9:03
-
\$\begingroup\$ Looks like this code originally came from here \$\endgroup\$sonyisda1– sonyisda12020年11月17日 15:27:53 +00:00Commented Nov 17, 2020 at 15:27
2 Answers 2
public static string ToHtml(this string s, bool nofollow)
{
s = HttpUtility.HtmlEncode(s);
string[] paragraphs = s.Split(new string[]{"\r\n\r\n"}, StringSplitOptions.None);
StringBuilder sb = new StringBuilder();
foreach (string par in paragraphs) {
sb.AppendLine("<p>");
string p = par.Replace(Environment.NewLine, "<br />\r\n");
if (nofollow) {
p = Regex.Replace(p, @"\[\[(.+)\]\[(.+)\]\]", "<a href=\"2ドル\" rel=\"nofollow\">1ドル</a>");
p = Regex.Replace(p, @"\[\[(.+)\]\]", "<a href=\"1ドル\" rel=\"nofollow\">1ドル</a>");
} else {
p = Regex.Replace(p, @"\[\[(.+)\]\[(.+)\]\]", "<a href=\"2ドル\">1ドル</a>");
p = Regex.Replace(p, @"\[\[(.+)\]\]", "<a href=\"1ドル\">1ドル</a>");
sb.AppendLine(p);
}
sb.AppendLine("</p>");
}
return sb.ToString();
}
In general, if you find yourself with a variable that you pass around through a bunch of static methods (like nofollow
), it may be time to make those methods not static.
-
\$\begingroup\$ This code won't compile: you have an
'
char before the string literal in thes.Split
method call. And thep
variable is read-only. And you forgot to appendp
to thesb
. And replacements in theRegex.Replace
should be1ドル
but not the\1円
. \$\endgroup\$Dmitry– Dmitry2014年11月25日 01:34:03 +00:00Commented Nov 25, 2014 at 1:34 -
\$\begingroup\$ Thanks for catching all that. I've edited the post. I also didn't realize C# foreach iteration variables were read-only... seems silly. \$\endgroup\$raylu– raylu2014年11月26日 08:53:47 +00:00Commented Nov 26, 2014 at 8:53
Spacing
Your spacing is too tight, and you should always expand two-line if statements to use braces for clarity. One StyleCop guideline which works well for spacing is to have a space before and after statements using curly braces.
int start = pos;
pos = s.IndexOf(_paraBreak, start);
if (pos < 0)
pos = s.Length;
string para = s.Substring(start, pos - start).Trim();
Looks better as:
int start = pos;
pos = s.IndexOf(_paraBreak, start);
if(pos < 0)
{
pos = s,Length;
}
string para = s.SubString(start, pos - start).Trim();
var
Prefer var
when declaring a local variable where it's obvious from the right-hand side of the declaration. This saves time when changing variable types, and also keeps the code shorter:
string para = s.SubString(start, pos - start).Trim();
Becomes:
var para = s.SubString(start, pos - start).Trim();
Naming
Variable names like para
and s
are unhelpful to a maintenance programmer, use longer variable names that describe the purpose of the variable.
Guard clauses
You should validate the parameters to your methods. Yes, even the this parameter, because somebody can do this:
string input = null;
input.ToHTML();
Which causes a NullReferenceException
when it should be an ArgumentNullException
or a returned empty string.
-
\$\begingroup\$ Obviously it's a matter of taste, but you really don't need braces AND a blank line of padding for a one statement
if
block. Six lines of code just for a single conditional call or assignment is ridiculous. The blank lines alone do just fine to separate the block, and then you can leave off the braces without visual issue. \$\endgroup\$TheAtomicOption– TheAtomicOption2019年07月26日 20:38:22 +00:00Commented Jul 26, 2019 at 20:38 -
\$\begingroup\$ The issue I have with leaving the braces off of single-line if-statements is that (for me at least) I find it harder to register the end of the if statement than if it was wrapped nicely in braces, and so it increases the risk that I try to add another line, thinking it would be part of the if condition, when it wouldn't. I do not personally consider lines of code (when they're not statements) to have a "cost" on modern screens, and giving code room to breathe is better than condensing it, in my opinion. \$\endgroup\$Nick Udell– Nick Udell2019年07月29日 15:14:32 +00:00Commented Jul 29, 2019 at 15:14