<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Encoding Conversion</title><meta name="generator" content="DocBook XSL Stylesheets V1.61.2"><link rel="home" href="index.html" title="Libxml Tutorial"><link rel="up" href="index.html" title="Libxml Tutorial"><link rel="previous" href="ar01s08.html" title="Retrieving Attributes"><link rel="next" href="apa.html" title="A.&nbsp;Compilation"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Encoding Conversion</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="ar01s08.html">Prev</a>&nbsp;</td><th width="60%" align="center">&nbsp;</th><td width="20%" align="right">&nbsp;<a accesskey="n" href="apa.html">Next</a></td></tr></table><hr></div><div class="sect1" lang="en"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="xmltutorialconvert"></a>Encoding Conversion</h2></div></div><div></div></div><p><a class="indexterm" name="id2587348"></a>
Data encoding compatibility problems are one of the most common
      difficulties encountered by programmers new to <span class="acronym">XML</span> in
      general and <span class="application">libxml</span> in particular. Thinking
      through the design of your application in light of this issue will help
      avoid difficulties later. Internally, <span class="application">libxml</span>
      stores and manipulates data in the UTF-8 format. Data used by your program
      in other formats, such as the commonly used ISO-8859-1 encoding, must be
      converted to UTF-8 before passing it to <span class="application">libxml</span>
      functions. If you want your program's output in an encoding other than
      UTF-8, you also must convert it.</p><p><span class="application">Libxml</span> uses
      <span class="application">iconv</span> if it is available to convert
    data. Without <span class="application">iconv</span>, only UTF-8, UTF-16 and
    ISO-8859-1 can be used as external formats. With
    <span class="application">iconv</span>, any format can be used provided
    <span class="application">iconv</span> is able to convert it to and from
    UTF-8. <span class="application">iconv</span> implementations typically
    support about 150 different character encodings and can convert between
    any pair of them. The exact number of supported encodings varies between
    implementations, but virtually every <span class="application">iconv</span>
    implementation supports the encodings in common use.</p><div class="warning" style="margin-left: 0.5in; margin-right: 0.5in;"><table border="0" summary="Warning"><tr><td rowspan="2" align="center" valign="top" width="25"><img alt="[Warning]" src="images/warning.png"></td><th align="left">Warning</th></tr><tr><td colspan="2" align="left" valign="top"><p>A common mistake is to use different formats for the internal data
	in different parts of one's code. The most common case is an application
	that assumes ISO-8859-1 to be the internal data format, combined with
	<span class="application">libxml</span>, which assumes UTF-8 to be the
	internal data format. The result is an application that treats internal
	data differently, depending on which code section is executing. One part
	of the code or the other will then inevitably misinterpret the data.
      </p></td></tr></table></div><p>This example constructs a simple document, then adds content provided
    at the command line to the document's root element and outputs the results
    to <tt class="filename">stdout</tt> in the proper encoding. For this example, we
    use ISO-8859-1 encoding. The encoding of the string input at the command
    line is converted from ISO-8859-1 to UTF-8. Full code: <a href="aph.html" title="H.&nbsp;Code for Encoding Conversion Example">Appendix&nbsp;H, <i>Code for Encoding Conversion Example</i></a></p><p>The conversion, encapsulated in the example code in the
      <tt class="function">convert</tt> function, uses
      <span class="application">libxml's</span>
    <tt class="function">xmlFindCharEncodingHandler</tt> function:
      </p><pre class="programlisting">
	<a name="handlerdatatype"></a><img src="images/callouts/1.png" alt="1" border="0">xmlCharEncodingHandlerPtr handler;
        <a name="calcsize"></a><img src="images/callouts/2.png" alt="2" border="0">size = (int)strlen(in)+1; 
        out_size = size*2-1; 
        out = malloc((size_t)out_size); 

&#8230;
	<a name="findhandlerfunction"></a><img src="images/callouts/3.png" alt="3" border="0">handler = xmlFindCharEncodingHandler(encoding);
&#8230;
	<a name="callconversionfunction"></a><img src="images/callouts/4.png" alt="4" border="0">handler-&gt;input(out, &amp;out_size, in, &amp;temp);
&#8230;	
	<a name="outputencoding"></a><img src="images/callouts/5.png" alt="5" border="0">xmlSaveFormatFileEnc("-", doc, encoding, 1);
      </pre><p>
      </p><div class="calloutlist"><table border="0" summary="Callout list"><tr><td width="5%" valign="top" align="left"><a href="#handlerdatatype"><img src="images/callouts/1.png" alt="1" border="0"></a> </td><td valign="top" align="left"><p><tt class="varname">handler</tt> is declared as a pointer to an
	    <tt class="function">xmlCharEncodingHandler</tt> structure, which holds the conversion functions for one encoding.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#calcsize"><img src="images/callouts/2.png" alt="2" border="0"></a> </td><td valign="top" align="left"><p>The handler's conversion function needs
	  to be given the sizes of the input and output buffers, which are
	    calculated here for strings <tt class="varname">in</tt> and
	  <tt class="varname">out</tt>. Because each ISO-8859-1 byte becomes at most two bytes in UTF-8, an output buffer of twice the input length plus one byte for the terminator is always large enough.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#findhandlerfunction"><img src="images/callouts/3.png" alt="3" border="0"></a> </td><td valign="top" align="left"><p><tt class="function">xmlFindCharEncodingHandler</tt> takes as its
	    argument the name of the data's initial encoding and searches
	    <span class="application">libxml's</span> set of conversion
	    handlers, returning a pointer to the matching handler or NULL if none is
	    found.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#callconversionfunction"><img src="images/callouts/4.png" alt="4" border="0"></a> </td><td valign="top" align="left"><p>The conversion function identified by <tt class="varname">handler</tt>
	  requires as its arguments pointers to the input and output strings,
	  along with the length of each. The lengths must be determined
	  separately by the application.</p></td></tr><tr><td width="5%" valign="top" align="left"><a href="#outputencoding"><img src="images/callouts/5.png" alt="5" border="0"></a> </td><td valign="top" align="left"><p>To output in a specified encoding rather than UTF-8, we use
	    <tt class="function">xmlSaveFormatFileEnc</tt>, specifying the
	    encoding.</p></td></tr></table></div><p>
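The buffer size computed in the listing can be checked with a small stand-alone sketch. It assumes ISO-8859-1 input, where every byte maps to at most two UTF-8 bytes, so <tt class="varname">size*2-1</tt> bytes (twice the string length plus one for the terminator) are always enough. The function <tt class="function">latin1_to_utf8</tt> is a hypothetical helper written for this illustration; it is not part of <span class="application">libxml</span> and is not the code from the appendix.</p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a libxml function): convert a NUL-terminated
 * ISO-8859-1 string to a newly allocated, NUL-terminated UTF-8 string.
 * Returns NULL if allocation fails; the caller frees the result.
 */
unsigned char *latin1_to_utf8(const unsigned char *in)
{
    size_t len, i, o = 0;
    unsigned char *out;

    assert(in != NULL);
    len = strlen((const char *)in);

    /* Worst case: every input byte expands to two bytes, plus the NUL.
     * With size = len + 1 this equals size*2-1, as in the tutorial. */
    out = malloc(len * 2 + 1);
    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            /* ASCII range: identical in both encodings. */
            out[o++] = c;
        } else {
            /* 0x80..0xFF: two-byte UTF-8 sequence 110000xx 10xxxxxx. */
            out[o++] = (unsigned char)(0xC0 | (c >> 6));
            out[o++] = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    out[o] = '\0';
    return out;
}
```

<p>For the ISO-8859-1 bytes 63 61 66 E9 (the word "caf&eacute;"), the helper should produce the UTF-8 bytes 63 61 66 C3 A9: the three ASCII bytes are copied unchanged and the accented character expands to two bytes. In a real program the handler returned by <tt class="function">xmlFindCharEncodingHandler</tt> does this work, and it covers encodings other than ISO-8859-1.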
    </p></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="ar01s08.html">Prev</a>&nbsp;</td><td width="20%" align="center"><a accesskey="u" href="index.html">Up</a></td><td width="40%" align="right">&nbsp;<a accesskey="n" href="apa.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Retrieving Attributes&nbsp;</td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top">&nbsp;A.&nbsp;Compilation</td></tr></table></div></body></html>

