# [00:44] <Dashiva> Just to see if I got the terminology right... HTMLDivElement is an 'interface object', HTMLDivElement.prototype (also somediv.[[prototype]]) is an 'interface prototype object' and somediv is a 'host object implementing an interface'.
# [00:44] <heycam> Dashiva, i see the bug (the "go to step 23")
# [02:12] <takkaria> could anyone confirm my reading of the tokenisation part of the spec, that '<h a="¬">' should result in an 'a' attribute containing the not symbol?
# [02:13] <takkaria> oh, wait, I think I see my problem
# [09:21] <Dashiva> heycam: If you would jump in s14 there's no such property. That means that in s15, there can't be a property with putforwards either (since there's no property)
# [09:24] <Dashiva> And you jump in s15 if there is no property (or there is a property, and it's not putforwards)
# [09:24] <Dashiva> Essentially s15 is all of s14 + check putforwards
# [09:26] <heycam> true i could combine s14 and s15 to say "If O does not have a property with name P, or if it does but it does not blah blah [PutForwards] ..."
# [09:27] <heycam> i split them only to avoid a big conditional thing like that
# [09:27] <Dashiva> I just figured 'does not correspond to an attribute' included the case 'there is no such attribute'
# [09:28] <Dashiva> Also, it seems a bit strange to split on attribute exists/not in step 14, and then combine them into step 19, and then split again in step 21
# [09:28] <heycam> or you figured "If the property on O with name P does not correspond ..." could still evaluate to false if there is no such property
# [09:29] <Dashiva> Yeah. I see it might not be spec-level clarity.
# [09:31] <heycam> so with the splitting on 14 / combining on 19 / splitting on 21... could you elaborate?
# [09:32] <Dashiva> If the property does not exist, or if the property exists but is not putforward, in both cases it is necessary to run [[CanPut]].
# [09:41] <Dashiva> PutForwards implies a Readonly attribute, so we can't CanPut before it anyhows, I believe
# [09:42] <heycam> right (well, the [[CanPut]] could be done, but not the return based on it)
# [09:43] * heycam wonders if he should write up his algorithms as C code and then run gcc -S -O9 on them to try to minimise them :)
# [09:44] <Dashiva> Would unifying 14 and 15 be easier if the conditional was inverted? e.g. if there is a property with name P and that property has extattr putforwards, jump to steps equivalent to 16-18. And then the canput stuff following instead of in a jump.
# [09:47] <Dashiva> Oh well. As long as [[CanPut]] happens for both new and old properties, I suppose this is just bikeshedding.
# [09:48] <heycam> i should have a "you can implement these how you want as long as the observable effects are the same" in the spec somewhere (but don't yet)
# [09:49] <Dashiva> By the way, is the special casing of 2^32-1 for integer indexes explained somewhere?
# [11:31] <Hixie> teehee, on june 6th rb asked gregory to forward many of his comments on html5 to the xhtml2 group
# [11:31] <Hixie> but as far as i can tell, that didn't happen
# [11:34] <zcorpan> "When the defaultPlaybackRate or playbackRate attributes change value", setting to the value it already has means it hasn't changed right
# [11:40] <Hixie> i can specify it the other way if that's what UAs want
# [11:40] <Hixie> hey does anyone know what the status of the svgwg's work on the svg parsing thing is?
# [11:40] <Hixie> i don't recall ever hearing back from them
# [11:40] <Hixie> and it's been what, two months now?
# [11:42] <hsivonen> ouch. document.write writing a script that document.writes is more complex than I first thought :-(
# [11:43] <hsivonen> Hixie: do you know if Gecko and WebKit really call the parser re-entrantly or whether they interleave parser and script execution using a run queue?
# [11:43] <Hixie> no idea what their implementation does
# [11:43] <Hixie> the net result is basically what the spec says though
# [11:43] <Hixie> though the spec is mildly more complex to handle script async and script defer
# [11:44] <hsivonen> I had happily assumed a run queue model, but now that I try to write it down in the case where document.write document.writes, it starts to get ugly
# [11:44] <virtuelv> hsivonen: document write is evil
# [11:45] <virtuelv> I hacked up a bunch of testcases that indicate that no two browsers behave the same
# [11:45] <Hixie> the browsers are pretty close to each other, most of the differences are obvious bugs
# [11:45] <virtuelv> Hixie: I'll ask if I can release the TC's somewhere
# [11:45] <virtuelv> I'm not sure what are bugs or not
# [11:46] <virtuelv> document.write in setTimeout calls are so much fun
# [11:46] <hsivonen> Hixie: anyway, I have trouble wrapping my head around the spec assertion that the tree builder is re-entrant
# [11:46] <Hixie> they should just blow away the document
# [11:46] <hsivonen> Hixie: it seems to me that the parser doesn't need to be re-entrant
# [11:47] <virtuelv> Hixie: suffice it to say that browsers in general don't cancel the timeout
# [11:47] <Hixie> hsivonen: just like any thing that you can implement as a recursive function can be implemented with a hand-managed stack? or more so?
# [11:48] <Hixie> hsivonen: recursive algorithms are what the spec uses generally because they're easier to explain and understand, but they're not what i'd recommend implementing
# [11:48] <Hixie> hsivonen: though iirc the parser's re-entrancy is only ever one level deep
# [11:49] <hsivonen> Hixie: instead of recursing, a single-threaded browser engine should be able to put the parser state on the heap and spin through the event loop
# [11:50] <Hixie> well the parser has to be interruptible in a browser anyway
# [11:50] <Hixie> so it would just interrupt itself
# [11:51] <hsivonen> So I'm again in a situation where I'm trying to unroll the spec definition into something else but equivalent
# [11:51] <Hixie> that is an expected situation, yes
# [11:52] <Hixie> no spec will ever directly map to all implementations
# [11:55] <hsivonen> Hmm. I can't figure how to make three-level-deep document.write have the right execution order in a GWT context where the browser owns the script execution but the script owns the parser
# [11:55] <hsivonen> too bad. I can't use GWT to fully try stuff out then
# [11:59] <roc> annevk: somewhat abusive use of "20 lines" though
# [11:59] <annevk> Hixie, I don't think there was much progress, though they assigned some actions
# [11:59] <annevk> roc, "20 lines" is a dubious concept anyway :)
# [11:59] <hsivonen> roc: I seriously think the same tokenizer should handle both HTML and SVG islands
# [12:00] <Hixie> hsivonen: gecko has some thing where adding text to a script that hasn't executed will still execute
# [12:00] <Hixie> hsivonen: i'm not sure if we need to put that in html5 or if gecko can change, it's something i should probably look at
# [12:00] <roc> it seemed clear that a) the SVG group has a requirement that parsing of SVG fragments should impose XML-esque validation with strict error handling and that b) you have a requirement that parsing SVG fragments should not impose strict error handling
# [12:01] <Hixie> that appears to be the case, yes
# [12:01] <hsivonen> roc: in addition to requiring non-strict error handling for SVG fragments, I also want to require avoiding complexity in the parser
# [12:01] <roc> so it doesn't really matter what the proposed solution is, since those two requirements are irreconcilable
# [12:02] <hsivonen> roc: I'm OK with complexity imposed by the legacy Web
# [12:02] <hsivonen> roc: I'm not OK with introducing new complexity for XML purity
# [12:02] <hsivonen> we have to define what document.write does inside an SVG fragment
# [12:03] <hsivonen> if the same tokenizer is used, document.write inside SVG doesn't cause new complexity
# [12:03] <Hixie> roc: they're not completely irreconcilable, you could come up with some schemes where you fall from one to the other (and the remainder is treated as html, not svg)
# [12:03] <Hixie> roc: but it does seem unlikely that such a scheme would fit into the various constraints we have
# [12:03] <hsivonen> if we'd change to an XML parser mid-stream, we'd have to figure out what exactly document.write does and how it can be implemented sanely
# [12:03] <annevk> that's what they're looking at, but it's not really clear to me how that works with tokenizing etc.
# [12:03] <roc> that's what Doug wanted to do actually
# [12:04] <Hixie> i was just wondering if there was anything i needed to work on
# [12:05] <hsivonen> I'd much rather spend my time delivering an additional serializer to address the copy-paste requirements of the SVG WG than to spend my time writing two tokenizers that can be switched mid-stream
# [12:05] <Hixie> i'm not convinced such a scheme would work, but we'll find out in due course
# [12:05] <annevk> Hixie, I guess that's part of WF2, but it'd be nice if it was in a spec so it becomes easier to advocate Koreans for instance into using it
# [12:06] <roc> hsivonen: it's clear that having browsers reserialize as XML when copy/pasting is the right thing, but the SVG WG insists that copy-pasting in a text editor is a requirement
# [12:06] <hsivonen> I'm not convinced the complexity imposed by additional strictness is a good use of anyone's limited resources
# [12:06] <hsivonen> roc: I'd prefer to go with the right thing
# [12:10] <annevk> if we just implemented whatever the W3C came up with we'd never beat Flash
# [12:10] <Philip`> I think it wasn't clear whether they'd be unhappy with using the current SVG-in-HTML parsing rules and just defining conformance so that the content must be well-formed XML, so that conforming HTML5 SVG could always be copied into a real XML document, which seems to me kind of nicer than actual XML parsing since it wouldn't affect the parser at all
# [12:10] <annevk> Hixie, I can't really get anything better than the old Netscape one, I guess I should try harder
# [12:24] * othermaciej_ is now known as othermaciej
# [12:27] <hsivonen> I guess I should go read some browser source on how document.write is implemented
# [12:29] <othermaciej> document.write writes at the current insertion point
# [12:29] <othermaciej> which might be beyond the end of the current script element
# [12:30] <othermaciej> if it document.wrote something already
# [12:31] <hsivonen> othermaciej: in WebKit, does the script engine call into the parser on document.write or schedule stuff for running and return to the main loop?
# [12:31] <othermaciej> actually the insertion point starts right after the script, per-script, so when you document.write a script and then document.write something else, and that script document.writes, I think the content gets inserted before whatever was written after the second script
# [12:31] <othermaciej> it just inserts stuff into the queue of characters to process
# [12:31] <othermaciej> it does not return to the main loop at all
# [12:31] <othermaciej> script execution during parsing is synchronous and blocks the parser
# [12:32] <hsivonen> but when document.written data gets parsed, are there script engine stack frames on the call stack?
# [12:33] <hsivonen> so if document.write document.writes, will the script engine be entered re-entrantly?
# [12:33] <othermaciej> the script engine can be re-entered via the parser using innerHTML
# [12:33] <othermaciej> but not document.write, I don't think
# [12:33] <othermaciej> since nothing that document.write writes (to the current, parsing document) is processed until after the script completes execution
# [12:34] <hsivonen> othermaciej: that doesn't seem to be the case
# [13:15] <heycam> roc, sorry for the delay on your www-svg issues
# [13:15] <heycam> (note that a couple of them are on the agenda for today's telcon (which i won't be at))
# [13:16] <hsivonen> I'm getting curious about support for incremental rendering of document.written content
# [13:17] <annevk> <script> without async is not so good for incremental rendering
# [13:19] <heycam> as for html in svg, i have to say that i (personally) haven't had time to look into it. i believe the others did something on it at the F2F recently, and are writing up some text, but i haven't seen it.
# [13:23] <hsivonen> annevk: <script> without async *could* be good for incremental rendering, if a) layout runs on a different thread or b) the script engine stack is really on the heap, so that the C++ stack can unwind without breaking script state
# [13:24] <hsivonen> however, I just saw some Gecko code that makes me want to write a test case to see if incremental layout works from document.written content
# [13:25] <hsivonen> it seems safe to assume that Presto and Trident have concurrency models that differ significantly from Gecko and WebKit, which I gather are more alike
# [13:58] <Lachy> hey, I just realised that <iframe sandbox=""> could potentially create a false sense of security. If a website author uses it to embed user comments and hosts the comment file on the same domain as the parent page, it only protects the user as long as the comment page isn't viewed outside of the iframe.
# [14:00] <Lachy> so websites would still need to filter out as much as possible, and keep sandbox="" only as a last line of defence.
# [14:02] <Philip`> That would matter if you used <iframe sandbox src=...>, but you can't use that because it's unsafe in UAs that don't support sandboxing
# [14:02] <Lachy> at the moment, that's the only way to do it, since there isn't yet a way to embed the markup inline.
# [14:02] <Philip`> so you'd have to use <iframe sandbox doc="user's raw comment" src="some-file-with-user's-sanitised-comment.html"> and then it's alright anyway
# [14:05] <Lachy> I'm just not convinced sandbox is a good solution for sandboxing user contributed content. It could be good for embedding content from 3rd party sites though.
# [14:06] <Philip`> The sanitised comment might strip out lots of safe content and styles that aren't in the whitelist, so it's uglier for users than the original unsanitised (sandboxed) comment
# [14:07] <Philip`> so if their UA supports sandboxing then they can benefit from getting the unsanitised version
# [14:07] <Lachy> if your sanitiser code is that bad, it's time to rewrite it.
# [14:09] <Philip`> If it was possible to write a sanitiser that wasn't bad, why would anyone want client-side sandboxing?
# [14:09] <Lachy> I just don't think it's a good idea to ever send unsanitised code to browsers. Imagine if there was a browser bug that inadvertently allowed something it shouldn't, and there were sites that published fully unsanitised comments, then you can bet there would be exploits pretty quickly.
# [14:10] <Lachy> in case their sanitiser missed something by mistake, not because it could accidentally remove something it shouldn't.
# [14:12] <othermaciej> having both sanitization and client-side sandboxing is safer
# [14:12] <othermaciej> to get an exploit through, you'd need matching mistakes on both ends
# [14:13] <Lachy> othermaciej, right. But if the server didn't sanitise, then the browser is the only line of defence, and you'd better hope there's no bug in it.
# [14:14] <Lachy> because I'm concerned that sandbox="" could create a false sense of security, where silly developers don't bother sanitising the code, and then it really doesn't depend on there being matching mistakes on both ends.
# [14:15] <Lachy> it's just one really big mistake on the server side and whatever small mistake in the browser is enough.
# [14:16] <Lachy> and if even Philip` is suggesting sending unsanitised code to browsers, I'm sure there will be developers in the wild crazy enough to do so as well.
# [14:18] <othermaciej> one obviously shouldn't even remotely consider sending unsanitized code to browsers until sandbox="" is widely implemented, which won't be for a while
# [14:19] <othermaciej> but even then it's probably not a good idea
# [14:19] <Philip`> You could send unsanitised code in doc="" even if sandbox="" isn't implemented yet, as long as nobody implements doc="" before implementing sandbox=""
# [14:20] <othermaciej> I'm willing to entertain the notion that adding sandbox="" could in some cases hurt security through second-order effects like you describe
# [14:21] <othermaciej> but I think it is likely not the case, because the vast number of sites currently doing blacklist-based instead of whitelist-based sanitization may well be better off with sandbox="" and no server-side sanitization at all
# [14:27] <Lachy> why? Blacklist sanitation is better than nothing, even though it's clearly a flawed approach.
# [14:28] <virtuelv> and especially if it's about cleaning markup
# [14:28] <othermaciej> I would expect every blacklist-based filter to have holes, while browsers could plausibly implement sandbox="" close to correctly
# [14:29] <othermaciej> nonetheless, I think it is best to both filter on the server and use client-side restrictions
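
For readers unfamiliar with the whitelist/blacklist distinction in the exchange above, here is a minimal sketch of the whitelist side using Python's standard `html.parser`. It is a toy under stated assumptions, not a production sanitiser and not anything proposed in the log: the `ALLOWED_TAGS` set, the `WhitelistSanitiser` class and the `sanitise` helper are hypothetical names. Only listed tags are emitted, every attribute is dropped, and all text is re-escaped.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: anything not named here is never emitted.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p"}


class WhitelistSanitiser(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            # Attributes are dropped wholesale, so onclick= etc. cannot survive.
            self.parts.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        # Text is re-escaped so it can never be read back as markup.
        self.parts.append(escape(data))


def sanitise(markup):
    parser = WhitelistSanitiser()
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


cleaned = sanitise('<b onclick="x()">hi</b><img src=x onerror=y>')
print(cleaned)  # <b>hi</b>
```

A blacklist filter would instead have to enumerate every dangerous tag and attribute, which is why othermaciej expects such filters to have holes. Philip`'s objection also shows up here: safe markup that is missing from the whitelist gets stripped too.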
# [17:03] <annevk> If someone else wants to reply again to the sandboxing thread, be my guest. It seems obvious to me that his solution is flawed, but I can't really put it in words.
# [17:12] <Philip`> (and it's not possible to do on the client even with the not-yet-implemented canvas text support, since it needs to analyse the shape of the characters)
# [17:43] <csarven> Is Safari treating it properly if <object type="text/html" data="#foo"></object> includes the current document? Firefox2, Opera9.5 and IE7 don't.
# [18:28] <csarven> annevk If data value is a fragment identifier of the current page, is it expected for the UA to not request the current page separately?
# [18:30] <annevk> they're not handled specifically so I'd expect an additional request
# [18:30] <annevk> though given that three browsers disagree with WebKit it might be worth noting somewhere to see whether Opera/Firefox are planning on fixing this
# [22:03] <hsivonen> Hixie: does the spec expect the relationship of normal parsing and the first level of doc.write and the relationship of the first and second levels of doc.write to be different somehow?
# [22:05] * hsivonen wonders if document.write was considered a simple feature in the Netscape 2 design phase
# [22:06] <Philip`> I imagine they designed it to be simple to implement in their architecture, and as hard as possible for any competitors to implement in their different architectures :-)
# [22:08] <Hixie> hsivonen: step 3 of document.write()
# [22:08] <hsivonen> "If there is a script that will execute as soon as the parser resumes, then the method must now return without further processing of the input stream."?
# [22:09] <hsivonen> why does the behavior othermaciej showed earlier follow?
# [22:10] <hsivonen> in that case, the second level of document.write wrote a third level of script that executed similarly relative to the second level as the second level did relative to the first level
# [22:11] <Hixie> give me a concrete example and i'll try to describe how it runs
# [22:19] <gsnedders> jgraham_: As far as I can see, you deviate from the spec
# [22:20] <Hixie> step 4: run the tokeniser on the new input (just "a"), which just causes a text node saying "a" to be appended after the <script> in the DOM
# [22:21] <gsnedders> jgraham_: Meaning Hixie's algorithm is broken, and it's not an implementation bug on my behalf
# [22:21] * gsnedders hopes it is Hixie's fault so then he doesn't have to deal with it :P
# [22:21] <Hixie> step 2: just after the "a" (and before the insertion point) we insert |<script>document.write("b<script>document.write('c'); alert(4);<\/script>"); alert(3);</script>|.
# [22:26] <jgraham_> The relevant part of the spec is "Otherwise, if the element being entered has a rank equal to or greater than the heading of the current section, then create a new section and append it to the outline of the current outlinee element, so that this new section is the new last section of that outline. Let current section be that new section. Let the element being entered be the new heading for the current section."
# [22:26] <jgraham_> But I have no idea if that's the same as my implementation or not
# [22:27] <hsivonen> Hixie: so far, we haven't had "a script that will execute as soon as the parser resumes", right?
# [22:27] <Hixie> correct (in fact i think we won't because in this example you never have <script src="">)
# [22:27] <gsnedders> jgraham_: It's totally different
# [22:28] <gsnedders> jgraham_: the spec needs three lines
# [22:28] <Hixie> so i guess the spec doesn't (and shouldn't) limit it to two deep for inline scripts
# [22:29] * Hixie breathes a sigh of relief because this was getting complicated
# [22:29] <hsivonen> Hixie: the concept of "a script that will execute as soon as the parser resumes" could be more clearly named so that the reader wouldn't try to see if an inline script can be it
# [22:29] <Hixie> any suggestions for a better name?
# [22:43] <Philip`> gsnedders: things[x:y:z] means things[x], things[x+z], things[x+2*z], ... up to things[x+n*z] where n is the maximum number such that x+n*z < y, or something like that
# [22:43] <Philip`> gsnedders: and if x is omitted it means 0, and if y is omitted it means len(things)
# [22:44] <Philip`> gsnedders: oh but my explanation breaks down a bit
# [22:44] <Hixie> hsivonen: more like document.write("<script src=alert2><\/script>");alert(1);
# [22:44] <Philip`> gsnedders: but if you ignore all the details, then that means self[::-1] is just reversed(self)
# [22:44] <hsivonen> Hixie: that seems counter-intuitive
# [22:44] <Philip`> gsnedders: (but compatible with older Pythons that don't have 'reversed')
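
Philip`'s description of extended slicing can be checked directly in Python. The sketch below uses an arbitrary list and arbitrary bounds chosen for illustration. It also shows where the explanation "breaks down a bit": with a negative step the omitted bounds default to the two ends of the sequence walking backwards, not to 0 and `len(things)`.

```python
things = list(range(10))
x, y, z = 1, 8, 3

# Positive step: indices x, x+z, x+2*z, ... for as long as the index is below y.
explicit = [things[i] for i in range(x, y, z)]
sliced = things[x:y:z]
print(sliced)  # [1, 4, 7]

# With a positive step, an omitted start means 0 and an omitted stop means len(things).
evens = things[::2]
same_evens = things[0:len(things):2]

# With a negative step the omitted bounds flip, so [::-1] walks the whole
# sequence from the last element to the first, like reversed().
backwards = things[::-1]
via_reversed = list(reversed(things))
```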
# [22:45] <Hixie> hsivonen: specifically, <script>document.write("<script src=alert2><\/script>");document.write("this won't be tokenised until after the alert1 and alert2");alert(1);</script>
# [22:49] <Hixie> hsivonen: specifically in the spec this is implemented by having the "pause until the script has completed loading" be only ever executed in the very outer tree construction invocation
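
The ordering Hixie describes for the `<script src=alert2>` example can be sketched as a toy model. This is an assumption-laden illustration, not the spec's algorithm: `ToyDocument`, `untokenised`, `pending_script` and the `external_script` parameter are hypothetical, and an external script is represented by a Python callable passed alongside the written text. The model covers only the quoted step 3 and the rule that the pending script is run by the outermost parser invocation.

```python
class ToyDocument:
    def __init__(self):
        self.events = []            # observable order of effects
        self.untokenised = []       # written input the tokeniser has not seen yet
        self.pending_script = None  # runs as soon as the outer parser resumes

    def write(self, text, external_script=None):
        self.untokenised.append((text, external_script))
        if self.pending_script is not None:
            # Step 3: a script is waiting for the parser to resume, so
            # return without further processing of the input stream.
            return
        self._tokenise()

    def _tokenise(self):
        while self.untokenised and self.pending_script is None:
            text, script = self.untokenised.pop(0)
            if script is not None:
                self.pending_script = script   # cannot run re-entrantly
            else:
                self.events.append("tokenised: " + text)

    def run_inline_script(self, script):
        # Outermost invocation: the only place a pending script is run.
        script(self)
        while self.pending_script is not None:
            pending, self.pending_script = self.pending_script, None
            pending(self)
            self._tokenise()


def alert2(doc):
    doc.events.append("alert 2")


def inline(doc):
    doc.write("<script src=alert2>", external_script=alert2)
    doc.write("late text")
    doc.events.append("alert 1")


doc = ToyDocument()
doc.run_inline_script(inline)
print(doc.events)  # ['alert 1', 'alert 2', 'tokenised: late text']
```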
# [22:55] <hsivonen> Hixie: so regarding my email, I could substitute checking if the tree builder is invoked recursively with checking if there's a document.write on the call stack?
# [22:57] <Hixie> i believe that is currently equivalent, yes
# [22:57] <Hixie> be wary of changes that violate that assumption, in case we ever add other ways to be reentrant
# [23:03] <Hixie> well what you really want to make sure is that a document.write() doesn't write a script that does another document.write(), but there's no guarantee that the second will be run while the first is on the call stack
# [23:03] <Hixie> (or rather, you want to block that but only when it's in an infinite loop)
# [23:03] * Quits: jgraham_ (n=james@81-86-219-217.dsl.pipex.com) ("I get eaten by the worms")
# [23:04] <Hixie> another way is just to limit how much cpu/ram a script can use
# [23:04] <gsnedders> Infinite loops are _awesome_!
# [23:05] <gsnedders> I mean, how else are you going to use a modern CPU?
# [23:36] <Hixie> yeah that was my reasoning too, but the referrers were all over the place
# [23:36] <roc> Lachy: no way, it should be over 70
# [23:36] <Lachy> oh, 71. It just paused at 51 for some reason
# [23:36] <Philip`> You ought to just cache the results based on UA string - there's no point having a hundred million people run the same test with an identical browser, and it's an awful waste of energy
# [23:42] <othermaciej> Lachy: you should be able to get it then (though I am not sure how to find the download, probably the search feature will find it)
# [23:52] <othermaciej> Lachy: you can actually force quit the installer when it tells you to reboot, as long as you quit all WebKit apps
# [23:52] <othermaciej> Lachy: if you are willing to live on the edge a little
# [23:53] <Lachy> ah, too late. I already restarted.
# [23:53] <othermaciej> anyway, it has "Save as Web Application" which among other things supports HTML5 <link rel="icon" sizes=...> and <meta name="application-name" ...>