Re: [xml] Encoding and Win32


From: Daniel Veillard (Daniel.Veillard@imag.fr)
Date: Sat Feb 03 2001 - 03:18:26 EST


On Fri, Feb 02, 2001 at 11:06:18AM -0800, Dave Madole wrote:
>
> Hi,
>
> I feel that my original comment could use a little clarification.
>
> First, I tried using "iconv", but it didn't really cut it for a couple of
> reasons, not the least of which was the fact that its use was completely opaque
> on my Red Hat linux box - man iconv, man -k iconv, man unicode, etc. produce
> nothing appropriate.

  Your system is probably misinstalled:
--------
man iconv:
ICONV(3) Linux Programmer's Manual ICONV(3)

NAME
       iconv - perform character set conversion

SYNOPSIS
       #include <iconv.h>

       size_t iconv (iconv_t cd,
                     const char* * inbuf, size_t * inbytesleft,
                     char* * outbuf, size_t * outbytesleft);
...
--------
  Running "info iconv" returns the same information.
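
  For readers who have not used the call documented in that man page, here
is a minimal sketch of a one-shot conversion. The helper name
convert_buffer and its return convention are hypothetical, chosen for
illustration; only iconv_open(), iconv() and iconv_close() come from
<iconv.h>. It assumes the whole input fits in one buffer and that both
encoding names are known to the local iconv.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of iconv or libxml): convert `inlen` bytes
 * of `in` from encoding `from` to encoding `to`. Writes at most
 * `outcap - 1` bytes plus a terminating NUL into `out`.
 * Returns the number of bytes written, or (size_t)-1 on any failure. */
size_t convert_buffer(const char *from, const char *to,
                      const char *in, size_t inlen,
                      char *out, size_t outcap)
{
    assert(from && to && in && out && outcap > 0);

    /* Note the argument order: destination encoding first. */
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)-1)
        return (size_t)-1;      /* encoding name unknown on this system */

    /* iconv() advances both pointers and decrements both counters.
     * Some older headers declare inbuf as const char **; the cast below
     * targets the char ** form. */
    char *inp = (char *)in;
    char *outp = out;
    size_t inleft = inlen;
    size_t outleft = outcap - 1;

    size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    if (rc != (size_t)-1)
        rc = iconv(cd, NULL, NULL, &outp, &outleft); /* flush shift state */
    iconv_close(cd);

    if (rc == (size_t)-1)
        return (size_t)-1;      /* invalid input or output buffer too small */

    *outp = '\0';
    return (size_t)(outp - out);
}
```

  Calling convert_buffer("ISO-8859-1", "UTF-8", in, inlen, out, sizeof out)
converts a Latin-1 buffer to UTF-8. Real code that streams data would
loop on E2BIG and EINVAL instead of failing outright.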

> I searched around for the data files, etc. to no avail.
> There are man pages on my solaris box, but they aren't really appropriate, as
> the character encodings actually SEEM to have different names (cross platform
> transparency is critical to my app).

  Simply compile GNU iconv on Solaris and you will get the same behaviour
on both platforms.

> Our friends at Red Hat could use a little prodding as far as the doc goes.
> Doing a search for "iconv" on the RedHat sites turns up nothing useful.

  Depends what you call useful ...

> May I suggest adding a link to the online Linux iconv documentation to the
> xmlsoft.org web site?

  It is in the FAQ:
   http://xmlsoft.org/FAQ.html#Compilatio
with pointers to both the official specification at The Open Group and
at least one portable implementation of the library.

> Secondly, the data from which I am building my document is coming from various
> sources, some in utf-8, some in "wrong endian" utf-16 (Oracle), some in "weird"
> and basically unpredictable multi-byte Asian character encodings. In my case
> it actually makes sense to convert to a "neutral" utf-16 intermediate
> encoding.

  libxml natively (without iconv) supports UTF-8 and both UTF-16 byte orders.
Converting the Asian encodings directly to UTF-8 does make sense...
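
  To illustrate what handling "both UTF-16 encodings" involves, here is a
rough sketch of a decoder that picks the byte order from the byte order
mark and emits UTF-8. This is not libxml's code: the function name
utf16_to_utf8 is hypothetical, and falling back to big-endian when no BOM
is present is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not libxml API): decode UTF-16 bytes in either byte
 * order into UTF-8. Returns the number of bytes written to `out`, or
 * (size_t)-1 on malformed input or insufficient output space. */
size_t utf16_to_utf8(const unsigned char *in, size_t inlen,
                     unsigned char *out, size_t outcap)
{
    assert(in && out);

    int big_endian = 1;   /* assumption: big-endian when there is no BOM */
    size_t i = 0, o = 0;

    /* A leading BOM (U+FEFF) reveals the byte order and is not copied. */
    if (inlen >= 2) {
        if (in[0] == 0xFE && in[1] == 0xFF) { big_endian = 1; i = 2; }
        else if (in[0] == 0xFF && in[1] == 0xFE) { big_endian = 0; i = 2; }
    }

    while (i + 1 < inlen) {
        uint32_t cp = big_endian ? ((uint32_t)in[i] << 8) | in[i + 1]
                                 : ((uint32_t)in[i + 1] << 8) | in[i];
        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {       /* high surrogate */
            if (i + 1 >= inlen)
                return (size_t)-1;                /* truncated pair */
            uint32_t lo = big_endian ? ((uint32_t)in[i] << 8) | in[i + 1]
                                     : ((uint32_t)in[i + 1] << 8) | in[i];
            if (lo < 0xDC00 || lo > 0xDFFF)
                return (size_t)-1;                /* unpaired high surrogate */
            i += 2;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;                    /* stray low surrogate */
        }

        /* Encode the code point as 1 to 4 UTF-8 bytes. */
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (outcap - o < need)
            return (size_t)-1;
        if (need == 1) {
            out[o++] = (unsigned char)cp;
        } else if (need == 2) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }

    return i == inlen ? o : (size_t)-1;           /* odd trailing byte */
}
```

  The same input in little-endian and big-endian form yields identical
UTF-8 output, which is why "wrong endian" UTF-16 needs no separate
conversion step once the BOM has been read.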

> Also I am dealing with a situation where I need to be told on the
> fly what destination encoding to use from a very large range of Asian character
> encodings - ICU makes this much easier because it accepts just about anything
> the user might enter as the name of the encoding. It also provides support for
> collating and sorting, etc.

  Okay, that's something I can't comment on ...

> Finally, I had meant to add that I certainly appreciate the work that went into
> putting iconv support into libxml, and understand that it IS the standard and
> that it should be used when possible and didn't mean to imply that it was a
> wrong choice in any way. No doubt in most cases it is more than adequate.
> ICU probably is more than most people would need and is a bit fat, but it works
> very well and is copiously documented.

  Okay,

Daniel

-- 
Daniel Veillard      | Red Hat Network http://redhat.com/products/network/
veillard@redhat.com  | libxml Gnome XML toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
----
Message from the list xml@rpmfind.net
Archived at : http://xmlsoft.org/messages/
to unsubscribe: echo "unsubscribe xml" | mail  majordomo@rpmfind.net



This archive was generated by hypermail 2b29 : Sat Feb 03 2001 - 04:47:47 EST