<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
"http://www.w3.org/TR/REC-html40/loose.dtd">
<!--
Note for SuSE package maintainers: this file was taken
verbatim from http://www.jclark.com/xml/expatfaq.html
and has since had obsolete information removed.
-->
<HTML>
<TITLE>expat FAQ</TITLE>
<BODY>
<H1>Frequently Asked Questions about Expat</H1>
<H4>Where can I get help in using expat?</H4>
<p>Try the xml-dev mailing list (subscribe by sending mail to <a
href="mailto:majordomo@xml.org?body=subscribe%20xml-dev">majordomo@xml.org</a>
with the message body <code>subscribe xml-dev</code>). Alternatively try
the mailing lists hosted by <A
href="http://expat.sourceforge.net">sourceforge.net</A>.</P>
<H4>Where is expat's API documented?</H4>
<p>In <code>xmlparse/xmlparse.h</code>. There's also an advanced,
low-level API you can use which is documented in
<code>xmltok/xmltok.h</code>.</p>
<p>There's also an excellent <a
href="http://www.xml.com/pub/1999/09/expat/index.html">article</a>
about expat on XML.com by Clark Cooper.</p>
<H4>Is there a simple example of using expat's API?</H4>
<p>See <code>sample/elements.c</code>.</p>
<H4>How can I get expat to deal with non-ASCII characters?</H4>
<P>By default, expat assumes that documents are encoded in UTF-8. In
UTF-8, ASCII characters are represented by a single byte as they would
be in ASCII, but non-ASCII characters are represented by a sequence of
two or more bytes, all with the high (8th) bit set. The encoding most
widely used for European languages is ISO 8859-1, which is not
compatible with UTF-8. To parse a document in this encoding, tell expat
so, either by passing <code>"iso-8859-1"</code> as the argument to
<code>XML_ParserCreate</code>, or by starting the document with
<code>&lt;?xml version="1.0" encoding="iso-8859-1"?&gt;</code>.</P>
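<p>To make the incompatibility concrete, here is a minimal sketch in plain
C (no expat calls) that converts ISO 8859-1 bytes to UTF-8. The function
name <code>latin1_to_utf8</code> is hypothetical and not part of expat. It
only illustrates the byte layout described above: ASCII bytes pass through
unchanged, while each byte from 0x80 upward becomes two bytes that both
have the high bit set.</p>

```c
#include <stddef.h>

/* Convert `len` bytes of ISO 8859-1 text to UTF-8.
 * `out` must have room for 2 * len + 1 bytes.
 * Returns the number of bytes written, not counting the final NUL. */
size_t latin1_to_utf8(const unsigned char *in, size_t len, unsigned char *out)
{
    size_t n = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII: a single byte, identical in both encodings. */
            out[n++] = c;
        } else {
            /* U+0080..U+00FF: lead byte 110xxxxx, continuation 10xxxxxx. */
            out[n++] = (unsigned char)(0xC0 | (c >> 6));
            out[n++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[n] = '\0';
    return n;
}
```

<p>For example, the ISO 8859-1 byte 0xE9 (e with acute accent) becomes the
two UTF-8 bytes 0xC3 0xA9. A parser that assumes UTF-8 would reject a lone
0xE9 byte as malformed input, which is why the encoding has to be
declared.</p>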
<H4>What encodings does expat support?</H4>
<P>expat has built-in support for the following encodings:</P>
<ul>
<li><code>utf-8</code></li>
<li><code>utf-16</code></li>
<li><code>iso-8859-1</code></li>
<li><code>us-ascii</code></li>
</ul>
<P>Additional encodings can be supported by using
<code>XML_SetUnknownEncodingHandler</code>.</P>
<H4>How can I get expat to validate my XML documents?</H4>
<p>You can't. expat is not a validating parser.</p>
<H4>How can I get expat to read my DTD?</H4>
<p>Compile with <code>-DXML_DTD</code> and call
<code>XML_SetParamEntityParsing</code>.</p>
<H4>How can I get expat to recover from errors?</H4>
<p>You can't. All well-formedness errors stop processing. Note that
the XML Recommendation does not permit conforming XML processors to
continue normal processing after a fatal error.</p>
<H4>How do I get at the characters between tags?</H4>
<p>Use <code>XML_SetCharacterDataHandler</code>.</p>
<H4>How can I minimize the size of expat?</H4>
<p>Compile with <code>-DXML_MIN_SIZE</code>. With Visual C++, use the
<code>Win32 MinSize</code> configuration: this creates an
<code>xmlparse.dll</code> that does not require
<code>xmltok.dll</code>.</p>
<ADDRESS>
<A HREF="mailto:jjc@jclark.com">James Clark</A>
</ADDRESS>
</BODY>
</HTML>