Challenge
Write a program which takes the URL of a PPCG answer as input and outputs the length of the answer's first code block in bytes.
Code Blocks
There are three different ways of formatting code blocks:
Four spaces before the code (- represents a space here):
----Code goes here
Surrounding the code in <pre> tags:
<pre>Code goes here</pre>
or:
<pre>
Code goes here
</pre>
Surrounding the code in <pre><code> tags (in that order):
<pre><code>Code goes here</code></pre>
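As a rough illustration of the three forms above, here is a minimal JavaScript sketch that pulls the first <pre> block out of an HTML string and unwraps an optional <code> element. The function name firstPreBlock and the regex are assumptions made for illustration, not part of the challenge, and a regex is only a stand-in for a real HTML parser.

```javascript
// Hypothetical helper: returns the raw text between the first <pre> ... </pre>
// pair, unwrapping a <code> element that sits directly inside it.
// Returns null when no <pre> block is present.
function firstPreBlock(html) {
  const re = /<pre(?:\s[^>]*)?>(?:<code(?:\s[^>]*)?>)?([\s\S]*?)(?:<\/code>)?<\/pre>/i;
  const m = re.exec(html);
  return m ? m[1] : null;
}

// The three example forms from the question.
const inline = firstPreBlock("<pre>Code goes here</pre>");
const multiline = firstPreBlock("<pre>\nCode goes here\n</pre>");
const wrapped = firstPreBlock("<pre><code>Code goes here</code></pre>");

console.log(inline.length, multiline.length, wrapped.length);
```

Note that the multi-line form keeps its surrounding newlines in this sketch, so its raw length differs from the other two; whether those newlines should count is something a real solution has to decide.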
Rules
You should only use the first code block in the answer. Assume that the supplied answer will always have a code block.
You must find the length of the code block in bytes, not any other metric such as characters.
Solutions using the Stack Exchange Data Explorer or the Stack Exchange API are allowed. URL shorteners are disallowed.
Assume all programs use UTF-8 encoding.
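To make the bytes-versus-characters rule concrete, here is a small JavaScript sketch. The helper name utf8Length is made up for this example; TextEncoder is the standard API available in browsers and recent Node.js versions, and it always encodes as UTF-8.

```javascript
// Hypothetical helper: UTF-8 byte length of a string.
function utf8Length(s) {
  return new TextEncoder().encode(s).length;
}

// ASCII: one byte per character.
console.log("abc".length, utf8Length("abc"));       // 3 3
// U+00E9 (e with acute accent): one character, two bytes.
console.log("\u00e9".length, utf8Length("\u00e9")); // 1 2
// U+2373 (APL iota): one character, three bytes.
console.log("\u2373".length, utf8Length("\u2373")); // 1 3
```

This is why an answer written in a language with its own code page, such as APL, can report fewer bytes than its UTF-8 length.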
Examples
Input: https://codegolf.stackexchange.com/a/91904/30525
Output: 34
Input: https://codegolf.stackexchange.com/a/91693/30525
Output: 195
Input: https://codegolf.stackexchange.com/a/80691/30525
Output: 61
Note: Despite what the answer may say, due to APL using a different encoding, 33 bytes isn't its UTF-8 byte count
Winning
The shortest code in bytes wins.
C#, 270 bytes (previously 249)
u=>{var c=new System.Net.Http.HttpClient().GetStringAsync(u).Result;var i=c.IndexOf("de>",c.IndexOf('"'+u.Split('/')[4]))+3;return System.Text.Encoding.UTF8.GetByteCount(c.Substring(i,c.IndexOf("<",i)-i-1).Replace("&gt;",">").Replace("&lt;","<").Replace("&amp;","&"));};
I've made some assumptions based on observations of the HTML of pages. Hopefully they hold true for all answer posts.
+21 bytes to unencode &amp; to &. My answer failed to count its own bytes. ;P
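The entity decoding this fix is about can be sketched in JavaScript as follows. The function names are hypothetical, and only the three entities relevant here (&gt;, &lt;, &amp;) are handled. The point of the sketch is the order: decoding &amp; last avoids double-decoding text that literally contains an entity.

```javascript
// Hypothetical helper: decode &gt; and &lt; first, &amp; last.
function decodeEntities(s) {
  return s.replace(/&gt;/g, ">").replace(/&lt;/g, "<").replace(/&amp;/g, "&");
}

// Same replacements with &amp; first, shown only to illustrate the pitfall.
function decodeWrongOrder(s) {
  return s.replace(/&amp;/g, "&").replace(/&gt;/g, ">").replace(/&lt;/g, "<");
}

// HTML source for the literal text "a&lt;b" (a code block that itself mentions an entity).
const escaped = "a&amp;lt;b";

console.log(decodeEntities(escaped));   // a&lt;b
console.log(decodeWrongOrder(escaped)); // a<b
```

With the wrong order the decoded text is three bytes shorter than the real code, which would skew the byte count.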
Ungolfed:
/*Func<string, int> Lambda =*/ u =>
{
    // Download HTML of the page.
    // Although new System.Net.WebClient().DownloadString(u) looks shorter, it doesn't
    // preserve the UTF8 encoding.
    var c = new System.Net.Http.HttpClient().GetStringAsync(u).Result;
    // Using the answer id from the URL, find the index of the start of the code block.
    // Empirical observation has found that all code blocks are in <code> tags,
    // there are no other HTML tags that end with "de>",
    // and that '>' and '<' are encoded to '&gt;'/'&lt;' so no jerk can put "de>"
    // before the first code block.
    var i = c.IndexOf("de>", c.IndexOf('"' + u.Split('/')[4])) + 3;
    // Get the substring of the code block text, unencode '&gt;'/'&lt;'/'&amp;' and get the byte count.
    // Again, empirical observation shows the closing </code> tag is always on a new
    // line, so always remove 1 character when getting the substring.
    return System.Text.Encoding.UTF8.GetByteCount(c.Substring(i, c.IndexOf("<", i) - i - 1).Replace("&gt;", ">").Replace("&lt;", "<").Replace("&amp;", "&"));
};
Results:
Input: http://codegolf.stackexchange.com/a/91904/30525
Output: 34
Input: http://codegolf.stackexchange.com/a/91693/30525
Output: 195 (at the time of this answer the first post has 196, but I can't find the 196th
byte, even counting by hand, so assuming 195 is correct)
Input: http://codegolf.stackexchange.com/a/80691/30525
Output: 61
Input: http://codegolf.stackexchange.com/a/91995/58106 (this answer)
Output: 270
- I find 195 bytes too. – Jörg Hülsermann, Sep 2, 2016 at 1:17
- Thanks, I've edited the question and my answer with 195 bytes. – Beta Decay, Sep 2, 2016 at 5:23
jQuery JavaScript, 97 Bytes
console.log((new Blob([$("#answer-"+location.hash.substr(1)+" code").eq(0).text().trim()])).size)
old version 151 Bytes
s=$('#answer-'+location.hash.substr(1)+' code').eq(0).text().trim();c=0;for(i=s.length;i--;)d=s.charCodeAt(i),c=d<128?c+1:d<2048?c+2:c+3;console.log(c)
old version 163 Bytes
s=$('#answer-'+location.hash.substr(1)+' code').first().text().trim();c=0;for(i=0;i<s.length;i++){d=s.charCodeAt(i);c=(d<128)?c+1:(d<2048)?c+2:c+3;}console.log(c);
Input is location, and jQuery is assumed to be active on the site.
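The older versions above count bytes by hand from charCodeAt values. A minimal sketch of that idea follows, with hypothetical function names. One caveat worth knowing: charCodeAt returns UTF-16 code units, so a character outside the Basic Multilingual Plane is seen as two surrogates and counted as 3 + 3 bytes, while its UTF-8 encoding is 4 bytes. The second function iterates by code point instead.

```javascript
// Per-code-unit count, same thresholds as the older versions above.
function countByUnits(s) {
  let c = 0;
  for (let i = s.length; i--;) {
    const d = s.charCodeAt(i);
    c += d < 128 ? 1 : d < 2048 ? 2 : 3;
  }
  return c;
}

// Per-code-point count: for...of walks whole code points, so astral
// characters (U+10000 and above) are counted as 4 bytes.
function countByCodePoints(s) {
  let c = 0;
  for (const ch of s) {
    const d = ch.codePointAt(0);
    c += d < 0x80 ? 1 : d < 0x800 ? 2 : d < 0x10000 ? 3 : 4;
  }
  return c;
}

const bmp = "a\u00e9\u2373";   // 1 + 2 + 3 bytes in UTF-8
const astral = "\u{1F600}";    // one emoji, 4 bytes in UTF-8

console.log(countByUnits(bmp), countByCodePoints(bmp));       // 6 6
console.log(countByUnits(astral), countByCodePoints(astral)); // 6 4
```

The Blob-based 97-byte version sidesteps this, since the browser performs the UTF-8 encoding itself.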
- Move the c=0 into the for, to save a byte: for(c=i=0;i<s.length;i++) – Paul Schmitz, Sep 2, 2016 at 7:42
- And remove the ; before and after the console.log(c). – Paul Schmitz, Sep 2, 2016 at 8:19
- @PaulSchmitz Thank you. I prefer to use a while loop instead of your proposal. – Jörg Hülsermann, Sep 2, 2016 at 8:52
- s=$("#answer-"+location.hash.substr(1)+" code").eq(0).text().trim();c=0;for(i=s.length;i--;)d=s.charCodeAt(i),c=128>d?c+1:2048>d?c+2:c+3;console.log(c) is 5 bytes shorter: 4 for removing the () around the conditions, and 1 by replacing while with for. – Paul Schmitz, Sep 2, 2016 at 8:58
Java 7, 313 bytes (previously 314)
int c(String u)throws Exception{String x="utf8",s=new java.util.Scanner(new java.net.URL(u).openStream(),x).useDelimiter("\\A").next();int j=s.indexOf("de>",s.indexOf('"'+u.split("/")[4]))+3;return s.substring(j,s.indexOf("<",j)-1).replace("&gt;",">").replace("&lt;","<").replace("&amp;","&").getBytes(x).length;}
Shamelessly stolen from @milk's C# answer and ported to Java 7.
NOTE: This assumes all code is in <code> blocks. Currently it won't work with just <pre> tags (but who uses those anyway?.. ;p).
Ungolfed & test cases:
class M{
  static int c(String u) throws Exception{
    String x = "utf8",
           s = new java.util.Scanner(new java.net.URL(u).openStream(), x).useDelimiter("\\A").next();
    int j = s.indexOf("de>", s.indexOf('"'+u.split("/")[4])) + 3;
    return s.substring(j, s.indexOf("<", j) - 1).replace("&gt;", ">").replace("&lt;", "<").replace("&amp;", "&")
            .getBytes(x).length;
  }
  public static void main(String[] a) throws Exception{
    System.out.println(c("https://codegolf.stackexchange.com/a/91904/30525"));
    System.out.println(c("https://codegolf.stackexchange.com/a/91693/30525"));
    System.out.println(c("https://codegolf.stackexchange.com/a/80691/30525"));
    System.out.println(c("https://codegolf.stackexchange.com/a/91995/58106"));
  }
}
Output:
34
195
61
270
<pre> tags, and so are newlines in <code> tags, so your "Code goes here" examples have three different lengths. If that's intentional, it should be mentioned.