diff options
| author | MDT 2002 John Fleck <jfleck@inkstain.net> | 2002-06-04 15:10:36 +0000 |
|---|---|---|
| committer | John Fleck <jfleck@src.gnome.org> | 2002-06-04 15:10:36 +0000 |
| commit | 598f6ebb94b00e6583f4b330dcad08ab3a0c5a36 (patch) | |
| tree | b53a2e9533f3373486db82e84b872d7a6d06aaf7 /doc/tutorial/xmltutorial.xml | |
| parent | 6f4e5e7697b6a0d034a476b2da677a7bbec165d4 (diff) | |
| download | android_external_libxml2-598f6ebb94b00e6583f4b330dcad08ab3a0c5a36.tar.gz android_external_libxml2-598f6ebb94b00e6583f4b330dcad08ab3a0c5a36.tar.bz2 android_external_libxml2-598f6ebb94b00e6583f4b330dcad08ab3a0c5a36.zip | |
apa.html apb.html apc.html apd.html ar01s02.html ar01s03.html ar01s04.html
Tue Jun 4 09:09:18 MDT 2002 John Fleck <jfleck@inkstain.net>
* added doc/tutorial, including:
apa.html
apb.html
apc.html
apd.html
ar01s02.html
ar01s03.html
ar01s04.html
ar01s05.html
ar01s06.html
includeaddattribute.c
includeaddkeyword.c
includekeyword.c
includestory.xml
index.html
xmltutorial.xml
libxml tutorial, including generated html
Diffstat (limited to 'doc/tutorial/xmltutorial.xml')
| -rw-r--r-- | doc/tutorial/xmltutorial.xml | 372 |
1 files changed, 372 insertions, 0 deletions
diff --git a/doc/tutorial/xmltutorial.xml b/doc/tutorial/xmltutorial.xml new file mode 100644 index 00000000..be9a1ba3 --- /dev/null +++ b/doc/tutorial/xmltutorial.xml @@ -0,0 +1,372 @@ +<?xml version="1.0"?> +<!DOCTYPE article PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN" + "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd" [ +<!ENTITY KEYWORD SYSTEM "includekeyword.c"> +<!ENTITY STORY SYSTEM "includestory.xml"> +<!ENTITY ADDKEYWORD SYSTEM "includeaddkeyword.c"> +<!ENTITY ADDATTRIBUTE SYSTEM "includeaddattribute.c"> +]> +<article> + <articleinfo> + <title>Libxml Tutorial</title> + <author> + <firstname>John</firstname> + <surname>Fleck</surname> + </author> + <copyright> + <year>2002</year> + <holder>John Fleck</holder> + </copyright> + <revhistory> + <revision> + <revnumber>1</revnumber> + <date>June 4,2002</date> + </revision> + </revhistory> + </articleinfo> + <abstract> + <para>Libxml is a freely licensed C language library for handling + <acronym>XML</acronym>, portable across a large number of platforms. This + tutorial provides examples of its basic functions.</para> + </abstract> + <sect1 id="introduction"> + <title>Introduction</title> + <para>Libxml is a C language library implementing functions for reading, + creating and manipulating <acronym>XML</acronym> data. This tutorial + provides example code and explanations of its basic functionality.</para> + <para>Libxml and more details about its use are available on <ulink + url="http://www.xmlsoft.org/">the project home page</ulink>. Included there is complete <ulink url="http://xmlsoft.org/html/libxml-lib.html"> + <acronym>API</acronym> documentation</ulink>. This tutorial is not meant + to substitute for that complete documentation, but to illustrate the + functions needed to use the library to perform basic operations. +<!-- + Links to + other resources can be found in <xref linkend="furtherresources" />. +--> +</para> + <para>The tutorial is based on a simple <acronym>XML</acronym> application I + use for articles I write. The format includes metadata and the body + of the article.</para> + <para>The example code in this tutorial demonstrates how to: + <itemizedlist> + <listitem> + <para>Parse the document.</para> + </listitem> + <listitem> + <para>Extract the text within a specified element.</para> + </listitem> + <listitem> + <para>Add an element and its content.</para> + </listitem> + <listitem> + <para>Extract the value of an attribute.</para> + </listitem> + <listitem> + <para>Add an attribute.</para> + </listitem> + </itemizedlist> + + </para> + <para>Full code for the examples is included in the appendices.</para> + + </sect1> + + <sect1 id="xmltutorialdatatypes"> + <title>Data Types</title> + <para><application>Libxml</application> declares a number of datatypes we + will encounter repeatedly, hiding the messy stuff so you do not have to deal + with it unless you have some specific need.</para> + <para> + <variablelist> + <varlistentry> + <term><ulink + url="http://xmlsoft.org/html/libxml-tree.html#XMLCHAR">xmlChar</ulink></term> + <listitem> + <para>A basic replacement for char, a byte in a UTF-8 encoded + string.</para> + </listitem> + </varlistentry> + <varlistentry> + <term> + <ulink url="http://xmlsoft.org/html/libxml-tree.html#XMLDOC">xmlDoc</ulink></term> + <listitem> + <para>A structure containing the tree created by a parsed doc. <ulink + url="http://xmlsoft.org/html/libxml-tree.html#XMLDOCPTR">xmlDocPtr</ulink> + is a pointer to the structure.</para> + </listitem> + </varlistentry> + <varlistentry> + <term><ulink + url="http://xmlsoft.org/html/libxml-tree.html#XMLNODEPTR">xmlNodePtr</ulink> + and <ulink url="http://xmlsoft.org/html/libxml-tree.html#XMLNODE">xmlNode</ulink></term> + <listitem> + <para>A structure containing a single node. <ulink + url="http://xmlsoft.org/html/libxml-tree.html#XMLNODEPTR">xmlNodePtr</ulink> + is a pointer to the structure, and is used in traversing the document tree.</para> + </listitem> + </varlistentry> + </variablelist> + </para> + + </sect1> + + <sect1 id="xmltutorialparsing"> + <title>Parsing the file</title> + <para>Parsing the file requires only the name of the file and a single + function call, plus error checking. Full code: <xref + linkend="keywordappendix" /></para> + <para> + <programlisting> + <co id="declaredoc" /> xmlDocPtr doc; + <co id="declarenode" /> xmlNodePtr cur; + + <co id="parsefile" /> doc = xmlParseFile(docname); + + <co id="checkparseerror" /> if (doc == NULL ) { + fprintf(stderr,"Document not parsed successfully. \n"); + return; + } + + <co id="getrootelement" /> cur = xmlDocGetRootElement(doc); + + <co id="checkemptyerror" /> if (cur == NULL) { + fprintf(stderr,"empty document\n"); + xmlFreeDoc(doc); + return; + } + + <co id="checkroottype" /> if (xmlStrcmp(cur->name, (const xmlChar *) "story")) { + fprintf(stderr,"document of the wrong type, root node != story"); + xmlFreeDoc(doc); + return; + } + + </programlisting> + <calloutlist> + <callout arearefs="declaredoc"> + <para>Declare the pointer that will point to your parsed document.</para> + </callout> + <callout arearefs="declarenode"> + <para>Declare a node pointer (you'll need this in order to + interact with individual nodes).</para> + </callout> + <callout arearefs="checkparseerror"> + <para>Check to see that the document was successfully parsed.</para> + </callout> + <callout arearefs="getrootelement"> + <para>Retrieve the document's root element.</para> + </callout> + <callout arearefs="checkemptyerror"> + <para>Check to make sure the document actually contains something.</para> + </callout> + <callout arearefs="checkroottype"> + <para>In our case, we need to make sure the document is the right + type. "story" is the root type of my documents.</para> + </callout> + </calloutlist> + </para> + </sect1> + + <sect1 id="xmltutorialgettext"> + <title>Retrieving Element Content</title> + <para>Retrieving the content of an element involves traversing the document + tree until you find what you are looking for. In this case, we are looking + for an element called "keyword" contained within element called "story". The + process to find the node we are interested in involves tediously walking the + tree. We assume you already have an xmlDocPtr called <varname>doc</varname> + and an xmlNodPtr called <varname>cur</varname>.</para> + + <para> + <programlisting> + <co id="getchildnode" /> cur = cur->xmlChildrenNode; + <co id="huntstoryinfo" /> while (cur != NULL) { + if ((!xmlStrcmp(cur->name, (const xmlChar *)"storyinfo"))){ + parseStory (doc, cur); + } + + cur = cur->next; + } + + </programlisting> + + <calloutlist> + <callout arearefs="getchildnode"> + <para>Get the first child node of <varname>cur</varname>. At this + point, <varname>cur</varname> points at the document root, which is + the element "story".</para> + </callout> + <callout arearefs="huntstoryinfo"> + <para>This loop iterates through the elements that are children of + "story", looking for one called "storyinfo". That + is the element that will contain the "keywords" we are + looking for. It uses the <application>libxml</application> string + comparison + function, <function><ulink + url="http://xmlsoft.org/html/libxml-parser.html#XMLSTRCMP">xmlStrcmp</ulink></function>. If there is a match, it calls the function <function>parseStory</function>.</para> + </callout> + </calloutlist> + </para> + + <para> + <programlisting> +void +parseStory (xmlDocPtr doc, xmlNodePtr cur) { + + <co id="anothergetchild" /> cur = cur->xmlChildrenNode; + <co id="findkeyword" /> while (cur != NULL) { + if ((!xmlStrcmp(cur->name, (const xmlChar *)"keyword"))) { + <co id="foundkeyword" /> printf("keyword: %s\n", xmlNodeListGetString(doc, cur->xmlChildrenNode, 1)); + } + cur = cur->next; + } + return; +} + </programlisting> + <calloutlist> + <callout arearefs="anothergetchild"> + <para>Again we get the first child node.</para> + </callout> + <callout arearefs="findkeyword"> + <para>Like the loop above, we then iterate through the nodes, looking + for one that matches the element we're interested in, in this case + "keyword".</para> + </callout> + <callout arearefs="foundkeyword"> + <para>When we find the "keyword" element, we need to print + its contents. Remember that in <acronym>XML</acronym>, the text + contained within an element is a child node of that element, so we + turn to <varname>cur->xmlChildrenNode</varname>. To retrieve it, we + use the function <function><ulink + url="http://xmlsoft.org/html/libxml-tree.html#XMLNODELISTGETSTRING">xmlNodeListGetString</ulink></function>, which also takes the <varname>doc</varname> pointer as an argument. In this case, we just print it out.</para> + </callout> + </calloutlist> + </para> + + </sect1> + +<sect1 id="xmltutorialwritingcontent"> + <title>Writing element content</title> + <para>Writing element content uses many of the same steps we used above + — parsing the document and walking the tree. We parse the document, + then traverse the tree to find the place we want to insert our element. For + this example, we want to again find the "storyinfo" element and + this time insert a keyword. Then we'll write the file to disk. Full code: + <xref linkend="addkeywordappendix" /></para> + + <para> + The main difference in this example is in + <function>parseStory</function>: + + <programlisting> +void +parseStory (xmlDocPtr doc, xmlNodePtr cur, char *keyword) { + + <co id="addkeyword" /> xmlNewTextChild (cur, NULL, "keyword", keyword); + return; +} + </programlisting> + <calloutlist> + <callout arearefs="addkeyword"> + <para>The <function><ulink + url="http://xmlsoft.org/html/libxml-tree.html#XMLNEWTEXTCHILD">xmlNewTextChild</ulink></function> + function adds a new child element at the + current node pointer's location in the + tree, specificied by <varname>cur</varname>.</para> + </callout> + </calloutlist> + </para> + + <para> + Once the node has been added, we would like to write the document to + file. Is you want the element to have a namespace, you can add it here as + well. In our case, the namespace is NULL. + <programlisting> + xmlSaveFormatFile (docname, doc, 1); + </programlisting> + The first parameter is the name of the file to be written. You'll notice + it is the same as the file we just read. In this case, we just write over + the old file. The second parameter is a pointer to the xmlDoc + structure. Setting the third parameter equal to one ensures indenting on output. + </para> + + </sect1> +<!-- + <sect1 id="xmltutorialattribute"> + <title>Retrieving Attributes</title> + <para>Retrieving</para> + </sect1> +--> + + + <sect1 id="xmltutorialwritingattribute"> + <title>Writing Attribute</title> + <para>Writing an attribute is similar to writing text to a new element. In + this case, we'll add a reference <acronym>URI</acronym> to our + document. Full code:<xref linkend="addattributeappendix" />.</para> + <para> + A <sgmltag>reference</sgmltag> is a child of the <sgmltag>story</sgmltag> + element, so finding the place to put our new element and attribute is + simple. As soon as we do the error-checking test in our + <function>parseDoc</function>, we are in the right spot to add our + element. But before we do that, we need to make a declaration using a + datatype we have not seen yet: + <programlisting> + xmlAttrPtr newattr; + </programlisting> + We also need an extra xmlNodePtr: + <programlisting> + xmlNodePtr newnode; + </programlisting> + </para> + <para> + The rest of <function>parseDoc</function> is the same as before until we + check to see if our root element is <sgmltag>story</sgmltag>. If it is, + then we know we are at the right spot to add our element: + + <programlisting> + <co id="addreferencenode" /> newnode = xmlNewTextChild (cur, NULL, "reference", NULL); + <co id="addattributenode" /> newattr = xmlNewProp (newnode, "uri", uri); + </programlisting> + <calloutlist> + <callout arearefs="addreferencenode"> + <para>First we add a new node at the location of the current node + pointer, <varname>cur.</varname> using the <ulink + url="http://xmlsoft.org/html/libxml-tree.html#XMLNEWTEXTCHILD">xmlNewTextChild</ulink> function.</para> + </callout> + </calloutlist> + </para> + + <para>Once the node is added, the file is written to disk just as in the + previous example in which we added an element with text content.</para> + + </sect1> + +<!-- + <appendix id="furtherresources"> + <title>Further Resources</title> + <para></para> + </appendix> +--> + <appendix id="sampledoc"> + <title>Sample Document</title> + <programlisting>&STORY;</programlisting> + </appendix> + <appendix id="keywordappendix"> + <title>Code for Keyword Example</title> + <para> + <programlisting>&KEYWORD;</programlisting> + </para> + </appendix> +<appendix id="addkeywordappendix"> + <title>Code for Add Keyword Example</title> + <para> + <programlisting>&ADDKEYWORD;</programlisting> + </para> + </appendix> +<appendix id="addattributeappendix"> + <title>Code for Add Attribute Example</title> + <para> + <programlisting>&ADDATTRIBUTE;</programlisting> + </para> + </appendix> +</article> |
