From: Daniel Veillard (Daniel.Veillard@imag.fr)
Date: Thu Feb 22 2001 - 17:24:22 EST
On Thu, Feb 22, 2001 at 02:05:43PM -0800, James McCann wrote:
>
> I am using the SAX interface, and have indicated the encoding of the
> xml document, and the encoding member of the context is set correctly,
> but the SAX callbacks get text in the UTF-8 representation. It seems to me
> that this behavior is wrong; if the user has correctly indicated an
> encoding, the library should use that encoding when calling the
> callbacks. Are there any plans to fix this?
No, the internal representation is UTF-8. The encoding conversion is
done before the data is passed to the parser. I don't think it's
a bug, but a feature :-)
> If there are no objections I will submit patches to fix this when I get
> time (for now I have written some functions that do the transcoding
> and then call my callbacks, but I don't think this is a good solution).
> I realize that SAX is something of an ad-hoc thing so there may not be
> standards to adhere to (are there standards? I couldn't find any) but
> the current behavior does not seem correct to me.
This is the intended behavior. And I'm afraid you would have
serious trouble trying to change libxml to use only the original
encoding. Please check
http://xmlsoft.org/encoding.html
for more explanation on those issues.
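
Since the callbacks always receive UTF-8, the transcoding shim mentioned
above is the practical route. Here is a minimal sketch of one, assuming
the target encoding is ISO-8859-1 and that the text callback has the
usual (ctx, text, len) shape. All names in it (utf8_to_latin1,
transcode_ctx, characters_shim, count_latin1_bytes) are hypothetical and
not part of libxml; it is plain C with no library dependency, so treat
it as an illustration rather than the recommended implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user callback: receives ISO-8859-1 bytes, not UTF-8. */
typedef void (*latin1_chars_fn)(void *user_data,
                                const unsigned char *text, int len);

/*
 * Convert UTF-8 to ISO-8859-1. Returns the number of bytes written,
 * or -1 if the input is malformed, contains a code point above U+00FF,
 * or does not fit in the output buffer.
 */
static int utf8_to_latin1(const unsigned char *in, int inlen,
                          unsigned char *out, int outcap)
{
    int i = 0, o = 0;

    assert(in != NULL && out != NULL);
    while (i < inlen) {
        unsigned char c = in[i];

        if (o >= outcap)
            return -1;
        if (c < 0x80) {
            /* ASCII is identical in both encodings. */
            out[o++] = c;
            i += 1;
        } else if ((c == 0xC2 || c == 0xC3) && i + 1 < inlen &&
                   (in[i + 1] & 0xC0) == 0x80) {
            /* Two-byte sequences with lead 0xC2/0xC3 cover U+0080..U+00FF. */
            out[o++] = (unsigned char)(((c & 0x03) << 6) | (in[i + 1] & 0x3F));
            i += 2;
        } else {
            /* Not representable in ISO-8859-1, or malformed UTF-8. */
            return -1;
        }
    }
    return o;
}

/* Hypothetical per-parse state: where to forward the transcoded text. */
struct transcode_ctx {
    latin1_chars_fn user_cb;
    void *user_data;
};

/*
 * Shim to register as the text callback: it takes the UTF-8 data,
 * transcodes it, and only then calls the user's own callback.
 */
static void characters_shim(void *ctx, const unsigned char *ch, int len)
{
    struct transcode_ctx *t = ctx;
    /* ISO-8859-1 output is never longer than the UTF-8 input. */
    unsigned char *buf = malloc(len > 0 ? (size_t)len : 1);
    int n;

    if (buf == NULL)
        return;
    n = utf8_to_latin1(ch, len, buf, len);
    if (n >= 0)
        t->user_cb(t->user_data, buf, n);
    free(buf);
}

/* Example user callback: sums the ISO-8859-1 byte counts it receives. */
static void count_latin1_bytes(void *user_data,
                               const unsigned char *text, int len)
{
    (void)text;
    *(int *)user_data += len;
}
```

To use it, fill a transcode_ctx with your own callback and its data,
pass that struct as the context pointer, and install characters_shim in
place of your text callback. The sketch silently drops text that cannot
be represented in ISO-8859-1; a real application would want to report
that case instead.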
Daniel
-- 
Daniel Veillard | Red Hat Network http://redhat.com/products/network/
veillard@redhat.com | libxml Gnome XML toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
----
Message from the list xml@rpmfind.net
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail majordomo@rpmfind.net