Date:      Thu, 01 Feb 2018 19:51:15 -0800
From:      Bakul Shah <bakul@bitblocks.com>
To:        Farhan Khan <khanzf@gmail.com>
Cc:        freebsd-hackers@freebsd.org
Subject:   Re: Printing UTF-8 characters
Message-ID:  <20180202035130.C51F8156E80B@mail.bitblocks.com>
In-Reply-To: Your message of "Thu, 01 Feb 2018 10:42:36 -0500." <CAFd4kYB_eU00Z5nBzp-iNGuELN4cy_ADGABb-boq4Fvn-a0XMg@mail.gmail.com>
References:  <CAFd4kYD_Q9Y84LvCGELVodt%2B30KM_KzNzoLOzudZm9kaLqGPaQ@mail.gmail.com> <20180201072831.GA2239@c720-r314251> <CAFd4kYB_eU00Z5nBzp-iNGuELN4cy_ADGABb-boq4Fvn-a0XMg@mail.gmail.com>

On Thu, 01 Feb 2018 10:42:36 -0500 Farhan Khan <khanzf@gmail.com> wrote:
> Sorry, that was a poorly phrased question on my part. Let me try again.
> I am trying to make text align in columns in a terminal. My
> understanding is that characters above 0x7E are 3 bytes in length. A
> modern terminal will render that as either a single question-mark or
> the character itself, making terminal column alignment easy. But how
> would an older terminal display a 3-byte character? I am worried that
> would render as 3 question marks and throw off column alignment. If
> so, is there a proper way to perform alignment for both newer and
> older terminals?

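UTF-8 uses one to four bytes to encode a Unicode code point,
depending on the code point's value (and thus, loosely, on
the script), so not everything above 0x7E is 3 bytes long.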
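
To make the byte counts concrete, here is a minimal Python
sketch. The helper name utf8_len and the four sample
characters are my own picks for illustration, nothing more.

```python
def utf8_len(ch: str) -> int:
    """Return how many bytes one code point occupies in UTF-8."""
    return len(ch.encode("utf-8"))

# One sample per encoded length (illustrative choices):
# LATIN CAPITAL A, LATIN SMALL E WITH ACUTE,
# DEVANAGARI LETTER KA, GRINNING FACE.
samples = ["\u0041", "\u00e9", "\u0915", "\U0001F600"]

for ch in samples:
    print(f"U+{ord(ch):04X} -> {utf8_len(ch)} byte(s)")
```

Code points up to U+007F take one byte, up to U+07FF two,
up to U+FFFF three, and everything above that four.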

For what you want, you can use OpenOffice-like programs that
understand Unicode and can do complex text layout. Normal
terminal programs typically use monospace (fixed-width) fonts
and are simply not capable of what you want. The assumption
that one char means one rectangular cell on the screen is too
deeply woven into them. This just doesn't work, particularly
for Indic languages: you may have N Unicode code points, each
of which requires 3 bytes, that together map to a single
glyph.
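
As a rough illustration, take the Devanagari sequence
KA + VIRAMA + SSA + VOWEL SIGN I, which a shaping-aware
renderer typically draws as one conjunct cluster. The
sequence and the variable names below are my own example,
and how it actually looks depends on the font and renderer.

```python
import unicodedata

# KA, VIRAMA, SSA, VOWEL SIGN I (example sequence, chosen for illustration)
cluster = "\u0915\u094d\u0937\u093f"

n_bytes = len(cluster.encode("utf-8"))   # bytes on the wire
n_points = len(cluster)                  # Unicode code points
categories = [unicodedata.category(c) for c in cluster]

# Naive fix-up: skip combining marks (general categories starting with "M").
n_non_marks = sum(1 for cat in categories if not cat.startswith("M"))

print(n_bytes, n_points, categories, n_non_marks)
```

None of these counts (bytes, code points, or code points
minus combining marks) comes out as 1, so none of them tells
you how many screen cells the text will occupy.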


