8.18

libxml2: Bindings for XML Validation

Philip McGrath <philip at philipmcgrath dot com>

(require libxml2)  package: libxml2

This package provides a Racket interface to functionality from the C library libxml2.

Racket already has many mature XML-related libraries implemented natively in Racket: libxml2 does not aim to replace them, nor to implement the entire libxml2 C API. Rather, the goal is to use libxml2 for functionality not currently available from the native Racket XML libraries, beginning with validation.

Note that libxml2 is in an early stage of development: before relying on this library, please see in particular the notes on Safety & Stability.

1 DTD Validation

The initial goal for libxml2 is to support XML validation, beginning with document type definitions.

procedure

(dtd? v) → boolean?

  v : any/c

procedure

(file->dtd pth) → dtd?

A DTD object, recognized by the predicate dtd?, is a Racket value encapsulating an XML document type definition, which is a formal specification of the structure of an XML document. A DTD object can be used with functions like dtd-validate-xml-string to validate an XML document against the encapsulated document type definition.

Currently, the only way to construct a DTD object is from a stand-alone DTD file using file->dtd. Additional mechanisms may be added in the future.

Examples:
> (define dtd-file
    (make-temporary-file))
> (display-lines-to-file
   '("<!ELEMENT example (good)>"
     "<!ELEMENT good (#PCDATA)>")
   #:exists 'truncate/replace
   dtd-file)
> (define example-dtd
    (file->dtd dtd-file))
> example-dtd

#<dtd>

> (delete-file dtd-file)
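
Because file->dtd is currently the only constructor, a program that holds its DTD as text has to go through a file. The following is a minimal sketch of one way to wrap that step. The name lines->dtd is hypothetical and not part of libxml2, and the sketch assumes, as the example above suggests, that the DTD object remains usable after its source file is deleted.

```racket
#lang racket/base
(require libxml2
         racket/file)

;; lines->dtd is a hypothetical helper, not part of libxml2.
;; It writes the declarations to a temporary file, loads that file
;; with file->dtd, and deletes the file whether or not loading succeeds.
(define (lines->dtd lines)
  (define tmp (make-temporary-file))
  (dynamic-wind
   void
   (λ ()
     (display-lines-to-file lines tmp #:exists 'truncate/replace)
     (file->dtd tmp))
   (λ () (delete-file tmp))))

;; Only touch the foreign library when it could be loaded.
(when (libxml2-available?)
  (define example-dtd
    (lines->dtd '("<!ELEMENT example (good)>"
                  "<!ELEMENT good (#PCDATA)>")))
  (dtd? example-dtd))
```

Using dynamic-wind keeps the temporary file from lingering if file->dtd raises an exception.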

procedure

(dtd-validate-xml-string dtd
                         doc
                         [error-buffer-file])
 → (or/c 'valid (and/c string? immutable?))

  dtd : dtd?
  doc : string?
  error-buffer-file : (or/c #f path-string?) = #f

Parses the string doc as XML and validates it according to the DTD object dtd. If doc is both well-formed and valid, dtd-validate-xml-string returns 'valid; otherwise, it returns an immutable string containing an error message.

Internally, dtd-validate-xml-string and related functions use a file as a buffer to collect any error messages from libxml2. If error-buffer-file is provided and is not #false, it will be used as the buffer: it will be created if it does not already exist, and any existing contents will likely be overwritten. If error-buffer-file is #false (the default), a temporary file will be used.

Examples:
> (dtd-validate-xml-string
   example-dtd
   "<example><good>This is a good doc.</good></example>")
'valid
> (define buffer-file
    (make-temporary-file))
> (dtd-validate-xml-string
   example-dtd
   (string-append "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
                  "<example><good>So is this.</good></example>")
   buffer-file)
'valid

> (define (show-string str)
    (let loop ([lst (regexp-split #rx"\n" str)])
      (match lst
        ['() (void)]
        [(cons str lst)
         #:when (<= (string-length str) 60)
         (displayln str)
         (loop lst)]
        [(cons (pregexp #px"^(.{,60})\\s+(.*)$" (list _ a b)) lst)
         (displayln a)
         (loop (cons (string-append "" b) lst))])))
> (show-string
   (dtd-validate-xml-string
    example-dtd
    "<ill-formed"
    buffer-file))
Entity: line 1: parser error : Couldn't find end of Start
Tag ill-formed line 1
> (show-string
   (dtd-validate-xml-string
    example-dtd
    "<example><bad>This is invalid.</bad></example>"))
element example: validity error : Element example content
does not follow the DTD, expecting (good), got (bad)
element bad: validity error : No declaration for element bad

> (delete-file buffer-file)
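
Since the validation procedures return either the symbol 'valid or an error string rather than raising an exception, callers usually need to branch on the result. The following is a minimal sketch of that dispatch. The name report-validation is hypothetical and not part of libxml2.

```racket
#lang racket/base
(require racket/match)

;; report-validation is a hypothetical helper, not part of libxml2.
;; It accepts the result of a validation call: either 'valid or a
;; string holding the error messages collected from libxml2.
;; It returns #t for a valid document; otherwise it prints the
;; message to the error port and returns #f.
(define (report-validation result)
  (match result
    ['valid #t]
    [(? string? msg)
     (eprintf "validation failed:\n~a\n" msg)
     #f]))
```

It could be applied directly to a validation call, for example (report-validation (dtd-validate-xml-string example-dtd "<example/>")).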

procedure

(dtd-validate-xexpr dtd
                    doc
                    [error-buffer-file])
 → (or/c 'valid (and/c string? immutable?))

  dtd : dtd?
  doc : xexpr/c
  error-buffer-file : (or/c #f path-string?) = #f

Like dtd-validate-xml-string, but validates the x-expression doc. Because doc is an x-expression, it will always be at least well-formed.

Examples:
> (dtd-validate-xexpr example-dtd
                      '(example (good)))
'valid
> (show-string
   (dtd-validate-xexpr example-dtd
                       '(example (bad))))
element example: validity error : Element example content
does not follow the DTD, expecting (good), got (bad)
element bad: validity error : No declaration for element bad
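
One plausible use of dtd-validate-xexpr is as a gate before serializing generated content. The following is a minimal sketch of that pattern, using xexpr->string from Racket's xml library. The name xexpr->valid-string is hypothetical and not part of libxml2.

```racket
#lang racket/base
(require libxml2
         (only-in xml xexpr->string))

;; xexpr->valid-string is a hypothetical helper, not part of libxml2.
;; It serializes xexpr only if it validates against dtd; otherwise it
;; raises an exn:fail:contract carrying the message from libxml2.
(define (xexpr->valid-string dtd xexpr)
  (define result (dtd-validate-xexpr dtd xexpr))
  (unless (eq? result 'valid)
    (raise-arguments-error 'xexpr->valid-string
                           "x-expression does not satisfy the DTD"
                           "libxml2 message" result))
  (xexpr->string xexpr))
```

Raising an exception here is a design choice for pipelines where an invalid document should never reach the output; code that wants to inspect the message can call dtd-validate-xexpr directly.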

procedure

(dtd-validate-xml-file dtd
                       doc
                       [error-buffer-file])
 → (or/c 'valid (and/c string? immutable?))

  dtd : dtd?
  error-buffer-file : (or/c #f path-string?) = #f

Like dtd-validate-xml-string, but validates the XML document in the file doc.
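
The following is a minimal sketch of file-based validation. It assumes the procedure described here is exported under the name dtd-validate-xml-file, following the naming pattern of its siblings, and it uses temporary files as stand-ins for a real DTD and document.

```racket
#lang racket/base
(require libxml2
         racket/file)

;; Temporary stand-ins for a real DTD file and XML document.
(define dtd-file (make-temporary-file))
(define doc-file (make-temporary-file))

(display-lines-to-file '("<!ELEMENT example (good)>"
                         "<!ELEMENT good (#PCDATA)>")
                       dtd-file
                       #:exists 'truncate/replace)
(display-to-file "<example><good>Stored on disk.</good></example>"
                 doc-file
                 #:exists 'truncate/replace)

;; Assumption: the procedure name dtd-validate-xml-file.
;; result is #f when the shared library could not be loaded.
(define result
  (and (libxml2-available?)
       (dtd-validate-xml-file (file->dtd dtd-file) doc-file)))

(delete-file dtd-file)
(delete-file doc-file)
result
```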

2 Checking Shared Library Availability

If the libxml2 shared library cannot be loaded, the Racket interface defers raising any exception until a client program attempts to use the foreign functionality. In other words, (require libxml2) should not cause an exception, even if attempting to load the shared library fails. (Currently, an immediate exception may be raised if the shared library is loaded, but does not provide the needed functionality.)

procedure

(libxml2-available?) → boolean?

Returns #true if and only if the libxml2 shared library was loaded successfully. When (libxml2-available?) returns #false, indicating that the shared library could not be loaded, most functions provided by libxml2 will raise an exception of the exn:fail:unsupported:libxml2 structure type.

Added in version 0.0.1 of package libxml2.

An exn:fail:unsupported:libxml2 exception is raised by functions from this library that depend on the libxml2 shared library when the foreign library could not be loaded. The who field identifies the origin of the exception, potentially in terms of the C API or other internal names.

See also libxml2-available?.

Added in version 0.0.1 of package libxml2.
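
A program that should degrade gracefully on systems without the shared library can either check up front or catch the exception. The following is a minimal sketch of both approaches. The helper names and the 'skipped result are hypothetical, and the sketch assumes the predicate exn:fail:unsupported:libxml2? is exported along with the structure type.

```racket
#lang racket/base
(require libxml2)

;; Hypothetical helper: check availability before calling into libxml2.
;; Returns 'skipped instead of validating when the library is missing.
(define (validate-if-possible dtd-path doc)
  (if (libxml2-available?)
      (dtd-validate-xml-string (file->dtd dtd-path) doc)
      'skipped))

;; Hypothetical helper: attempt the call and handle the exception.
;; Assumption: exn:fail:unsupported:libxml2? is provided by libxml2.
(define (validate-or-skip dtd-path doc)
  (with-handlers ([exn:fail:unsupported:libxml2?
                   (λ (e)
                     (eprintf "skipping validation: ~a\n" (exn-message e))
                     'skipped)])
    (dtd-validate-xml-string (file->dtd dtd-path) doc)))
```

The first form avoids the exception entirely; the second keeps the check close to the call that needs the library.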

3 Usage Notes

3.1 Platform Dependencies

All of this library’s functionality depends on having the libxml2 shared library available. It is included by default with Mac OS and is readily available on GNU/Linux via the system package manager. For Windows users, there are plans to distribute the necessary libraries through the Racket package manager, but this has not yet been implemented.

3.2 Safety & Stability

The goal for libxml2 is to provide a safe interface for Racket clients. However, this library is still in an early stage of development: there are likely subtle bugs, and, since libxml2 is implemented using unsafe functionality, these bugs could have bad consequences. More fundamentally, there may be bugs and security vulnerabilities in the underlying libxml2 shared library. Please give careful thought to these issues when deciding whether or how to use libxml2 in your programs.

In terms of stability, libxml2 is in an early stage of development: backwards-compatibility is not guaranteed. However, I have no intention of breaking things gratuitously. If you use libxml2 now, I encourage you to be in touch; I am happy to consult with users about potential changes.
