Re: pfps-04 from Peter F. Patel-Schneider on 2003-07-24 (www-rdf-comments@w3.org from July to September 2003)

From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
Date: Thu, 24 Jul 2003 11:31:16 -0400 (EDT)
To: bwm@hplb.hpl.hp.com
Cc: phayes@ai.uwf.edu, www-rdf-comments@w3.org
Message-Id: <20030724.113116.43884887.pfps@research.bell-labs.com>

From: Brian McBride <bwm@hplb.hpl.hp.com>
Subject: Re: pfps-04
Date: 24 Jul 2003 15:53:39 +0100
> On Wed, 2003-07-23 at 23:03, Peter F. Patel-Schneider wrote:
> 
> [...]
> 
> > Therefore for the RDF entailment rules to be complete, no XML Literal can
> > have a character string as its denotation.
> 
> Right. The denotation of an XML Literal is an octet sequence, as
> defined by the xml canonicalization spec, see the note in:
> 
> http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-concepts-20030117/#section-XMLLiteral

Unfortunately this does not answer the question. Octet sequence is
undefined in http://www.w3.org/TR/2002/REC-xml-exc-c14n-20020718/. At
least some places in this document appear to indicate that an octet
sequence is just a sequence of (Unicode?) characters. (See, for example,
the example in Section 2.2 of ``the Canonical XML version of elem2 from the
second case''.) Also, the phrase ``exclusive canonical XML refers to XML
that is in exclusive canonical form'' appears to indicate that exclusive
canonical XML is a subset of XML, again indicating that octets should
probably be a restricted form of (Unicode?) characters.

Following pointers leads to
http://www.w3.org/TR/2001/REC-xml-c14n-20010315, which describes the
canonical form of an XML document as a physical representation of the
document encoded in UTF-8, and which talks about octets encoding various
kinds of characters. This doesn't help matters too much.

So the question boils down to whether octets and Unicode characters are
disjoint.
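
To make the distinction concrete, here is a small sketch in Python. It is
only an illustration, not anything taken from the specifications: the
sample element and the variable names are made up, and it relies on the
fact that Python happens to keep character strings (str) and octet
sequences (bytes) as separate types. Whether the two are disjoint in the
RDF semantics is exactly what the documents leave open.

```python
# Illustration only: a character string versus its UTF-8 octet sequence.
# The sample element below is hypothetical, not drawn from any spec.
text = "<e>caf\u00e9</e>"        # a sequence of Unicode characters
octets = text.encode("utf-8")    # the UTF-8 octets that encode those characters

char_count = len(text)           # counts characters
octet_count = len(octets)        # counts octets; U+00E9 takes two octets in UTF-8

# In Python the two values have different types, so they are distinct objects
# of distinct kinds even though one encodes the other.
distinct_types = type(text) is not type(octets)

# Elements of the octet sequence are integers 0..255, not characters.
first_octet = octets[0]          # the integer code of "<"
first_char = text[0]             # the one-character string "<"

# Decoding recovers the original character string.
round_trip = octets.decode("utf-8") == text
```

In a setting that treats the two kinds of sequence as disjoint, as Python
does, no octet sequence is a character string, which is the property the
completeness argument needs.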
[...]
Peter F. Patel-Schneider
Bell Labs Research
Lucent Technologies

Received on Thursday, 24 July 2003 11:31:35 UTC