Date:      Sat, 16 Dec 2000 03:06:04 -0600
From:      "Michael C . Wu" <keichii@iteration.net>
To:        doc@freebsd.org, i18n@freebsd.org
Subject:   Docbook and CJK languages
Message-ID:  <20001216030604.B46336@peorth.iteration.net>


While working on some freebsd-taiwan DocBook documents, we discovered
that DocBook/SGML does not handle 2-byte characters correctly.

For example, I have this line of text (here "AA" and "BB" each stand
for one 2-byte character):

<PARA> AABBAABBAABBAABB </PARA>

When I compile this with plain text as the output format, the line
has to be wrapped.  The correct way to cut it into two lines would be:
AABBAABBAABB\n
AABB\n

However, sometimes the output comes out looking like:
AABBAABBAABBA\n
ABB\n
(Note that the AA character is split in half: its first byte ends the
first line and its second byte starts the second line.)

This leaves the whole document broken and unreadable, because every
byte after the split is decoded off by one.  The problem can also
occur several times in the same document.
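
To make the failure concrete, here is a minimal Python sketch of the
difference between wrapping by bytes and wrapping by characters.  It is
only an illustration, not the code the DocBook/SGML tools actually run.
The Big5 encoding, the 13-byte line width, and the names wrap_bytes and
wrap_chars are all assumptions chosen to match the example above.

```python
# Hypothetical illustration; not the actual DocBook/SGML toolchain code.
# Assumes Big5, where each of the characters used here takes two bytes.

def wrap_bytes(data, width):
    """Naive wrap: cut every `width` bytes, ignoring character boundaries."""
    return [data[i:i + width] for i in range(0, len(data), width)]

def wrap_chars(text, width, encoding="big5"):
    """Wrap so that no line exceeds `width` bytes and no character is split."""
    lines, current = [], b""
    for ch in text:
        encoded = ch.encode(encoding)
        # Start a new line if this whole character would not fit.
        if current and len(current) + len(encoded) > width:
            lines.append(current)
            current = b""
        current += encoded
    if current:
        lines.append(current)
    return lines

text = "中文" * 4           # stands in for AABBAABBAABBAABB
raw = text.encode("big5")   # 8 characters, 16 bytes

broken = wrap_bytes(raw, 13)   # first line ends in the middle of a character
intact = wrap_chars(text, 13)  # every line holds only whole characters

print([len(line) for line in broken])
print([len(line) for line in intact])
```

The byte-wise version yields lines of odd length, which is exactly the
split character described above, while the character-aware version
breaks one character earlier and keeps each line decodable.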

Is there any way to fix this?  Is there an SGML tag that I can
specify?  Or is this a feature that DocBook lacks?

-- 
+------------------------------------------------------------------+
| keichii@peorth.iteration.net         | keichii@bsdconspiracy.net |
| http://peorth.iteration.net/~keichii | Yes, BSD is a conspiracy. |
+------------------------------------------------------------------+


