-rw-r--r-- ChangeLog 9
-rw-r--r-- doc/xmlreader.html 82
-rw-r--r-- python/tests/Makefile.am 3
-rwxr-xr-x python/tests/reader4.py 45
-rwxr-xr-x python/tests/reader6.py 118
-rw-r--r-- relaxng.c 43
-rw-r--r-- result/relaxng/tutor10_7_3.err 2
-rw-r--r-- result/relaxng/tutor10_8_3.err 2
-rw-r--r-- result/relaxng/tutor3_2_1.err 4
-rw-r--r-- result/relaxng/tutor3_5_2.err 6
-rw-r--r-- result/relaxng/tutor9_5_2.err 4
-rw-r--r-- result/relaxng/tutor9_5_3.err 2
-rw-r--r-- result/relaxng/tutor9_6_2.err 2
-rw-r--r-- result/relaxng/tutor9_6_3.err 2
14 files changed, 286 insertions, 38 deletions
diff --git a/ChangeLog b/ChangeLog
index ce1a6b6b..64ed432e 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+Thu Apr 17 14:51:57 CEST 2003 Daniel Veillard <daniel@veillard.com>
+
+ * relaxng.c: some cleanups
+ * doc/xmlreader.html: extended the document to cover RelaxNG and
+ tree operations
+ * python/tests/Makefile.am python/tests/reader[46].py: added some
+ xmlReader example/regression tests
+ * result/relaxng/tutor*.err: updated the output of a number of tests
+
Thu Apr 17 11:35:37 CEST 2003 Daniel Veillard <daniel@veillard.com>
* relaxng.c: valgrind pointed out an uninitialized variable error.
diff --git a/doc/xmlreader.html b/doc/xmlreader.html
index 7b4ab994..fd956466 100644
--- a/doc/xmlreader.html
+++ b/doc/xmlreader.html
@@ -13,6 +13,8 @@ H3 {font-family: Verdana,Arial,Helvetica}
A:link, A:visited, A:active { text-decoration: underline }-->
+
+
</style>
<title>Libxml2 XmlTextReader Interface tutorial</title>
</head>
@@ -42,6 +44,9 @@ examples using both C and the Python bindings:</p>
attributes</a></li>
<li><a href="#Validating">Validating a document</a></li>
<li><a href="#Entities">Entities substitution</a></li>
+ <li><a href="#L1142">Relax-NG Validation</a></li>
+ <li><a href="#Mixing">Mixing the reader and tree or XPath
+ operations</a></li>
</ul>
<p></p>
@@ -147,8 +152,7 @@ def streamFile(filename):
         ret = reader.Read()
     if ret != 0:
-        print "%s : failed to parse" % (filename)
-</pre>
+        print "%s : failed to parse" % (filename)</pre>
<p>The only things worth adding are that the <a
href="http://dotgnu.org/pnetlib-doc/System/Xml/XmlTextReader.html">xmlTextReader
@@ -390,9 +394,79 @@ the validation feature is just:</p>
<h2><a name="Entities">Entities substitution</a></h2>
-<p>@@TODO@@</p>
+<p>By default the xmlReader will report entities as such and not replace them
+with their content. This default behaviour can however be overridden using:</p>
+
+<p><code>reader.SetParserProp(libxml2.PARSER_SUBST_ENTITIES,1)</code></p>
+
+<h2><a name="L1142">Relax-NG Validation</a></h2>
+
+<p style="font-size: 10pt">Introduced in version 2.5.7</p>
+
+<p>Libxml2 can now validate the document being read through the xmlReader
+against Relax-NG schemas. While the Relax-NG validator can't always work in a
+streamable mode, only the subsets which cannot be reduced to regular expressions
+need to have their subtree expanded for validation. In practice it means
+that, unless the schema for the top-level element content is not expressible
+as a regexp, only chunks of the document need to be parsed while
+validating.</p>
+
+<p>The steps to do so are:</p>
+<ul>
+ <li>create a reader working on a document as usual</li>
+ <li>before any call to Read(), associate it with a Relax-NG schema, either
+ the preparsed schema or the URL of the schema to use</li>
+ <li>errors will be reported the usual way, and the validity status can be
+ obtained using the IsValid() interface of the reader, as for DTDs.</li>
+</ul>
-<p> </p>
+<p>Example, assuming the reader has already been created and that the schema
+string contains the Relax-NG schema:</p>
+
+<p><code>rngp = libxml2.relaxNGNewMemParserCtxt(schema, len(schema))<br>
+rngs = rngp.relaxNGParse()<br>
+reader.RelaxNGSetSchema(rngs)<br>
+ret = reader.Read()<br>
+while ret == 1:<br>
+ ret = reader.Read()<br>
+if ret != 0:<br>
+ print "Error parsing the document"<br>
+if reader.IsValid() != 1:<br>
+ print "Document failed to validate"</code><br>
+See <code>reader6.py</code> in the sources or documentation for a complete
+example.</p>
+
+<h2><a name="Mixing">Mixing the reader and tree or XPath operations</a></h2>
+
+<p style="font-size: 10pt">Introduced in version 2.5.7</p>
+
+<p>While the reader is a streaming interface, its underlying implementation
+is based on the DOM builder of libxml2. As a result it is relatively simple
+to mix operations based on both models under some constraints. To do so the
+reader has an Expand() operation which grows the subtree under the
+current node. It returns a pointer to a standard node which can be manipulated
+in the usual ways. The node will get all its ancestors and the full subtree
+available. Usual operations like XPath queries can be used on that reduced
+view of the document. Here is an example extracted from reader5.py in the
+sources which extracts and prints the bibliography for the "Dragon" compiler
+book from the XML 1.0 recommendation:</p>
+<pre>f = open('../../test/valid/REC-xml-19980210.xml')
+input = libxml2.inputBuffer(f)
+reader = input.newTextReader("REC")
+res=""
+while reader.Read():
+    while reader.Name() == 'bibl':
+        node = reader.Expand()            # expand the subtree
+        if node.xpathEval("@id = 'Aho'"): # use XPath on it
+            res = res + node.serialize()
+        if reader.Next() != 1:            # skip the subtree
+            break;</pre>
+
+<p>Note however that the node instance returned by the Expand() call is only
+valid until the next Read() operation. The Expand() operation does not
+affect the Read() ones; however, once processed, the full subtree is usually
+not useful anymore, and the Next() operation skips it completely and
+proceeds to the successor, or returns 0 if the document end is reached.</p>
<p><a href="mailto:veillard@redhat.com">Daniel Veillard</a></p>
diff --git a/python/tests/Makefile.am b/python/tests/Makefile.am
index 761046a5..0c16acf0 100644
--- a/python/tests/Makefile.am
+++ b/python/tests/Makefile.am
@@ -23,6 +23,9 @@ PYTESTS= \
reader.py \
reader2.py \
reader3.py \
+ reader4.py \
+ reader5.py \
+ reader6.py \
ctxterror.py\
readererr.py\
relaxng.py
diff --git a/python/tests/reader4.py b/python/tests/reader4.py
new file mode 100755
index 00000000..0269cb0c
--- /dev/null
+++ b/python/tests/reader4.py
@@ -0,0 +1,45 @@
+#!/usr/bin/python -u
+#
+# this tests the basic APIs of the XmlTextReader interface
+#
+import libxml2
+import StringIO
+import sys
+
+# Memory debug specific
+libxml2.debugMemory(1)
+
+def tst_reader(s):
+    f = StringIO.StringIO(s)
+    input = libxml2.inputBuffer(f)
+    reader = input.newTextReader("tst")
+    res = ""
+    while reader.Read():
+        res=res + "%s (%s) [%s] %d\n" % (reader.NodeType(),reader.Name(),
+                                      reader.Value(), reader.IsEmptyElement())
+        if reader.NodeType() == 1: # Element
+            while reader.MoveToNextAttribute():
+                res = res + "-- %s (%s) [%s]\n" % (reader.NodeType(),
+                                       reader.Name(),reader.Value())
+    return res
+
+expect="""1 (test) [None] 0
+1 (b) [None] 1
+1 (c) [None] 1
+15 (test) [None] 0
+"""
+
+res = tst_reader("""<test><b/><c/></test>""")
+
+if res != expect:
+    print "Did not get the expected error message:"
+    print res
+    sys.exit(1)
+
+# Memory debug specific
+libxml2.cleanupParser()
+if libxml2.debugMemory(1) == 0:
+    print "OK"
+else:
+    print "Memory leak %d bytes" % (libxml2.debugMemory(1))
+    libxml2.dumpMemory()
diff --git a/python/tests/reader6.py b/python/tests/reader6.py
new file mode 100755
index 00000000..fe22079f
--- /dev/null
+++ b/python/tests/reader6.py
@@ -0,0 +1,118 @@
+#!/usr/bin/python -u
+#
+# this tests the Relax-NG validation with the XmlTextReader interface
+#
+import sys
+import StringIO
+import libxml2
+
+schema="""<element name="foo" xmlns="http://relaxng.org/ns/structure/1.0"
+         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
+  <oneOrMore>
+    <element name="label">
+      <text/>
+    </element>
+    <optional>
+      <element name="opt">
+        <empty/>
+      </element>
+    </optional>
+    <element name="item">
+      <data type="byte"/>
+    </element>
+  </oneOrMore>
+</element>
+"""
+# Memory debug specific
+libxml2.debugMemory(1)
+
+#
+# Parse the Relax NG Schemas
+#
+rngp = libxml2.relaxNGNewMemParserCtxt(schema, len(schema))
+rngs = rngp.relaxNGParse()
+del rngp
+
+#
+# Parse and validate the correct document
+#
+docstr="""<foo>
+<label>some text</label>
+<item>100</item>
+</foo>"""
+
+f = StringIO.StringIO(docstr)
+input = libxml2.inputBuffer(f)
+reader = input.newTextReader("correct")
+reader.RelaxNGSetSchema(rngs)
+ret = reader.Read()
+while ret == 1:
+    ret = reader.Read()
+
+if ret != 0:
+    print "Error parsing the document"
+    sys.exit(1)
+
+if reader.IsValid() != 1:
+    print "Document failed to validate"
+    sys.exit(1)
+
+#
+# Parse and validate the incorrect document
+#
+docstr="""<foo>
+<label>some text</label>
+<item>1000</item>
+</foo>"""
+
+err=""
+expect="""RNG validity error: file error line 3 element text
+Type byte doesn't allow value '1000'
+RNG validity error: file error line 3 element text
+Error validating datatype byte
+RNG validity error: file error line 3 element text
+Element item failed to validate content
+"""
+
+def callback(ctx, str):
+    global err
+    err = err + "%s" % (str)
+libxml2.registerErrorHandler(callback, "")
+
+f = StringIO.StringIO(docstr)
+input = libxml2.inputBuffer(f)
+reader = input.newTextReader("error")
+reader.RelaxNGSetSchema(rngs)
+ret = reader.Read()
+while ret == 1:
+    ret = reader.Read()
+
+if ret != 0:
+    print "Error parsing the document"
+    sys.exit(1)
+
+if reader.IsValid() != 0:
+    print "Document failed to detect the validation error"
+    sys.exit(1)
+
+if err != expect:
+    print "Did not get the expected error message:"
+    print err
+    sys.exit(1)
+
+#
+# cleanup
+#
+del f
+del input
+del reader
+del rngs
+libxml2.relaxNGCleanupTypes()
+
+# Memory debug specific
+libxml2.cleanupParser()
+if libxml2.debugMemory(1) == 0:
+    print "OK"
+else:
+    print "Memory leak %d bytes" % (libxml2.debugMemory(1))
+    libxml2.dumpMemory()
diff --git a/relaxng.c b/relaxng.c
index c98e04e2..d453b93e 100644
--- a/relaxng.c
+++ b/relaxng.c
@@ -8,11 +8,9 @@
/**
* TODO:
- * - error reporting
- * - handle namespace declarations as attributes.
* - add support for DTD compatibility spec
* http://www.oasis-open.org/committees/relax-ng/compatibility-20011203.html
- * - report better mem allocations at runtime and abort immediately.
+ * - report better mem allocations pbms at runtime and abort immediately.
*/
#define IN_LIBXML
@@ -836,7 +834,6 @@ xmlRelaxNGFreeDefine(xmlRelaxNGDefinePtr define)
* @size: the default size for the container
*
* Allocate a new RelaxNG validation state container
- * TODO: keep a pool in the ctxt
*
* Returns the newly allocated structure or NULL in case or error
*/
@@ -1989,7 +1986,7 @@ xmlRelaxNGGetErrorString(xmlRelaxNGValidErr err, const xmlChar *arg1,
case XML_RELAXNG_ERR_EXTRADATA:
return(xmlCharStrdup("Extra data in the document"));
default:
- TODO
+ return(xmlCharStrdup("Unknown error !"));
}
if (msg[0] == 0) {
snprintf(msg, 1000, "Unknown error code %d", err);
@@ -2279,12 +2276,6 @@ xmlRelaxNGSchemaTypeCheck(void *data ATTRIBUTE_UNUSED,
xmlSchemaTypePtr typ;
int ret;
- /*
- * TODO: the type should be cached ab provided back, interface subject
- * to changes.
- * TODO: handle facets, may require an additional interface and keep
- * the value returned from the validation.
- */
if ((type == NULL) || (value == NULL))
return(-1);
typ = xmlSchemaGetPredefinedType(type,
@@ -2956,9 +2947,9 @@ xmlRelaxNGCompile(xmlRelaxNGParserCtxtPtr ctxt, xmlRelaxNGDefinePtr def) {
case XML_RELAXNG_LIST:
case XML_RELAXNG_PARAM:
case XML_RELAXNG_VALUE:
- TODO /* This should not happen and generate an internal error */
- printf("trying to compile %s\n", xmlRelaxNGDefName(def));
-
+ /* This should not happen and generate an internal error */
+ fprintf(stderr, "RNG internal error trying to compile %s\n",
+ xmlRelaxNGDefName(def));
break;
}
return(ret);
@@ -3302,7 +3293,6 @@ xmlRelaxNGParseValue(xmlRelaxNGParserCtxtPtr ctxt, xmlNodePtr node) {
}
}
}
- /* TODO check ahead of time that the value is okay per the type */
return(def);
}
@@ -4878,10 +4868,9 @@ xmlRelaxNGParseAttribute(xmlRelaxNGParserCtxtPtr ctxt, xmlNodePtr node) {
ctxt->nbErrors++;
break;
case XML_RELAXNG_NOOP:
- TODO
if (ctxt->error != NULL)
ctxt->error(ctxt->userData,
- "Internal error, noop found\n");
+ "RNG Internal error, noop found in attribute\n");
ctxt->nbErrors++;
break;
}
@@ -5199,16 +5188,27 @@ xmlRelaxNGParseElement(xmlRelaxNGParserCtxtPtr ctxt, xmlNodePtr node) {
ret->attrs = cur;
break;
case XML_RELAXNG_START:
+ if (ctxt->error != NULL)
+ ctxt->error(ctxt->userData,
+ "RNG Internal error, start found in element\n");
+ ctxt->nbErrors++;
+ break;
case XML_RELAXNG_PARAM:
+ if (ctxt->error != NULL)
+ ctxt->error(ctxt->userData,
+ "RNG Internal error, param found in element\n");
+ ctxt->nbErrors++;
+ break;
case XML_RELAXNG_EXCEPT:
- TODO
+ if (ctxt->error != NULL)
+ ctxt->error(ctxt->userData,
+ "RNG Internal error, except found in element\n");
ctxt->nbErrors++;
break;
case XML_RELAXNG_NOOP:
- TODO
if (ctxt->error != NULL)
ctxt->error(ctxt->userData,
- "Internal error, noop found\n");
+ "RNG Internal error, noop found in element\n");
ctxt->nbErrors++;
break;
}
@@ -5438,9 +5438,6 @@ xmlRelaxNGCheckReference(xmlRelaxNGDefinePtr ref,
name);
ctxt->nbErrors++;
}
- /*
- * TODO: make a closure and verify there is no loop !
- */
}
/**
diff --git a/result/relaxng/tutor10_7_3.err b/result/relaxng/tutor10_7_3.err
index ebbc9aa4..bc3d6acd 100644
--- a/result/relaxng/tutor10_7_3.err
+++ b/result/relaxng/tutor10_7_3.err
@@ -1,2 +1,2 @@
RNG validity error: file ./test/relaxng/tutor10_7_3.xml line 2 element card
-Element addressBook has extra content: card
+Element card failed to validate attributes
diff --git a/result/relaxng/tutor10_8_3.err b/result/relaxng/tutor10_8_3.err
index 34eb5e94..06229bf1 100644
--- a/result/relaxng/tutor10_8_3.err
+++ b/result/relaxng/tutor10_8_3.err
@@ -1,2 +1,2 @@
RNG validity error: file ./test/relaxng/tutor10_8_3.xml line 2 element card
-Element addressBook has extra content: card
+Element card failed to validate attributes
diff --git a/result/relaxng/tutor3_2_1.err b/result/relaxng/tutor3_2_1.err
index 83e9a57c..73577fcb 100644
--- a/result/relaxng/tutor3_2_1.err
+++ b/result/relaxng/tutor3_2_1.err
@@ -1,4 +1,2 @@
RNG validity error: file ./test/relaxng/tutor3_2_1.xml line 1 element email
-Expecting element name, got email
-RNG validity error: file ./test/relaxng/tutor3_2_1.xml line 1 element email
-Element card failed to validate content
+Did not expect element email there
diff --git a/result/relaxng/tutor3_5_2.err b/result/relaxng/tutor3_5_2.err
index ed09a330..80acb18f 100644
--- a/result/relaxng/tutor3_5_2.err
+++ b/result/relaxng/tutor3_5_2.err
@@ -1,2 +1,4 @@
-RNG validity error: file ./test/relaxng/tutor3_5_2.xml line 2 element card
-Element addressBook has extra content: card
+RNG validity error: file ./test/relaxng/tutor3_5_2.xml line 2 element email
+Expecting element name, got email
+RNG validity error: file ./test/relaxng/tutor3_5_2.xml line 2 element email
+Element card failed to validate content
diff --git a/result/relaxng/tutor9_5_2.err b/result/relaxng/tutor9_5_2.err
index 650ca981..ede3b450 100644
--- a/result/relaxng/tutor9_5_2.err
+++ b/result/relaxng/tutor9_5_2.err
@@ -1,2 +1,4 @@
RNG validity error: file ./test/relaxng/tutor9_5_2.xml line 2 element card
-Element addressBook has extra content: card
+Invalid sequence in interleave
+RNG validity error: file ./test/relaxng/tutor9_5_2.xml line 2 element card
+Element card failed to validate attributes
diff --git a/result/relaxng/tutor9_5_3.err b/result/relaxng/tutor9_5_3.err
index eee06c7c..4566bccb 100644
--- a/result/relaxng/tutor9_5_3.err
+++ b/result/relaxng/tutor9_5_3.err
@@ -1,2 +1,2 @@
RNG validity error: file ./test/relaxng/tutor9_5_3.xml line 2 element card
-Element addressBook has extra content: card
+Invalid attribute error for element card
diff --git a/result/relaxng/tutor9_6_2.err b/result/relaxng/tutor9_6_2.err
index 259cb073..1a10f1b6 100644
--- a/result/relaxng/tutor9_6_2.err
+++ b/result/relaxng/tutor9_6_2.err
@@ -1,2 +1,2 @@
RNG validity error: file ./test/relaxng/tutor9_6_2.xml line 2 element card
-Element addressBook has extra content: card
+Element card failed to validate attributes
diff --git a/result/relaxng/tutor9_6_3.err b/result/relaxng/tutor9_6_3.err
index 2157e524..e92c5f1a 100644
--- a/result/relaxng/tutor9_6_3.err
+++ b/result/relaxng/tutor9_6_3.err
@@ -1,2 +1,2 @@
RNG validity error: file ./test/relaxng/tutor9_6_3.xml line 2 element card
-Element addressBook has extra content: card
+Invalid attribute error for element card