Date:      Wed, 28 Dec 2011 20:36:07 +0100
From:      Ulrich Spörlein <uqs@spoerlein.net>
To:        Doug Barton <dougb@FreeBSD.org>
Cc:        svn-src-head@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org
Subject:   Re: svn commit: r228909 - head/games/fortune/datfiles
Message-ID:  <20111228193607.GF83814@acme.spoerlein.net>
In-Reply-To: <4EFB58F1.6020206@FreeBSD.org>
References:  <201112271021.pBRALvxB048644@svn.freebsd.org> <20111228155340.GE83814@acme.spoerlein.net> <4EFB58F1.6020206@FreeBSD.org>

On Wed, 2011-12-28 at 09:59:13 -0800, Doug Barton wrote:
> On 12/28/2011 07:53, Ulrich Spörlein wrote:
> > On Tue, 2011-12-27 at 10:21:57 +0000, Doug Barton wrote:
> >> Author: dougb
> >> Date: Tue Dec 27 10:21:57 2011
> >> New Revision: 228909
> >> URL: http://svn.freebsd.org/changeset/base/228909
> >>
> >> Log:
> >>   1. Remove a bunch of duplicates. Usually this means removing them from
> >>      fortunes, but occasionally remove them from the other 2 files when
> >>      they are not offensive, or not murphy'ish enough.
> >>   
> >>      Where the version in fortunes had better attribution and/or formatting,
> >>      copy it over.
> >>   
> >>   2. Fix a few typos
> >>   
> >>   3. Use the full name of François De La Rochefoucauld, fix one of his
> >>      quotes, and remove the duplicate of it.
> > 
> > Sigh,
> > 
> > except for a stupid Unicode version of an apostrophe (’ vs ')
> 
> That seems like an easy thing to fix?

Sure, somebody must have snuck that in while I was not watching ;]
However, the real solution would be some sort of pre-submit check, or
even breaking the build when a datfile is not 7-bit clean.
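
A minimal sketch of what such a check could look like, in Python purely
for illustration. The names find_non_ascii and check_files are made up
here, and nothing like this exists in the tree as far as this message
says; a pre-submit hook or a build target would only need to act on the
returned status.

```python
import sys


def find_non_ascii(data: bytes):
    """Return (line, column, byte) for every byte that is not 7-bit clean."""
    hits = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        for col, byte in enumerate(line, start=1):
            if byte >= 0x80:
                hits.append((lineno, col, byte))
    return hits


def check_files(paths):
    """Hypothetical driver: report offenders, return 1 if any file is dirty."""
    status = 0
    for path in paths:
        with open(path, "rb") as fh:
            for lineno, col, byte in find_non_ascii(fh.read()):
                print(f"{path}:{lineno}:{col}: non-ASCII byte 0x{byte:02x}",
                      file=sys.stderr)
                status = 1
    return status


# A Unicode apostrophe (U+2019) is three bytes in UTF-8, all of them >= 0x80.
sample = "It's fine\n%\nIt\u2019s not\n".encode("utf-8")
hits = find_non_ascii(sample)
```

A hook would call something like sys.exit(check_files(sys.argv[1:])), so
that any non-ASCII byte fails the commit or the build.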

The current state is that all datfiles were made ASCII clean some time
ago, except for gerrold.limerick, which has a Unicode (C) in a comment.
That doesn't affect the operation of fortune, so I left it in.

> > this file
> > was ASCII. And I made it so for a reason. We don't currently have a way
> > to iconv fortune(6)'s output to the users LC_CTYPE. ASCII is the common
> > denominator so that's what we have to choose to be bug free.
> 
> What breaks for non-ASCII text?

If your terminal is ISO8859-1 (aka latin1) or any other terminal that
does not grok UTF-8, you'll get garbage instead of François. Not a
biggie, but ugly anyhow.
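
To illustrate where the garbage comes from, here is a small Python
sketch that assumes the datfile stores the name as UTF-8 and the
terminal interprets each byte as ISO8859-1. It only mimics the
mismatch; it is not how fortune(6) or xterm are implemented.

```python
name = "François"

# What ends up in the datfile: the cedilla becomes two bytes, 0xC3 0xA7.
utf8_bytes = name.encode("utf-8")

# What a latin1 terminal shows: every byte is drawn as its own character.
as_latin1 = utf8_bytes.decode("iso8859-1")

print(as_latin1)  # FranÃ§ois
```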

> > My plan was to teach fortune to use bsdiconv once that is ready and in
> > the tree to convert from Unicode to the users' locale. But until that is
> > ready, we have to stick to ASCII.
> 
> I'm not opposed to doing that, but I want to make sure that a) it's for
> a good reason, and b) that we have some way to know what needs to be
> added back when it's safe.
> 
> Meanwhile, I did actually test this change and it worked for me, so I
> thought it was safe to proceed.

Your terminal understands UTF-8, so you don't see a difference between
ASCII and Unicode chars. Try setting LANG to, e.g., en_US.ISO8859-1 and
running xterm +u8 with it (just to make sure). Then, when displaying a
quote, you get:

% fortune -m Rochefoucauld
%% (fortunes)
Absence diminishes mediocre passions and increases
great ones, as the wind blows out candles and fans fires.
                -- François De La Rochefoucauld

(I hope this makes it through the way I see it). It all boils down to
the fact that fortune(6) is not locale aware and thus only ASCII chars
are safe to display (no, EBCDIC does not count).
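
For comparison, a rough Python sketch of what locale-aware output would
mean: keep the text as Unicode internally and encode it for whatever
charset the user's locale asks for, substituting where the charset has
no such character. The function name render_for_charset is hypothetical,
and this is not the bsdiconv interface.

```python
def render_for_charset(text: str, charset: str) -> bytes:
    """Encode text for the given charset; unrepresentable chars become '?'."""
    return text.encode(charset, errors="replace")


quote = "-- François De La Rochefoucauld"

utf8_out = render_for_charset(quote, "utf-8")        # two bytes for the cedilla
latin1_out = render_for_charset(quote, "iso8859-1")  # one byte, 0xE7
ascii_out = render_for_charset(quote, "ascii")       # falls back to '?'
```

With the charset taken from LC_CTYPE, each terminal would get bytes it
can actually display, which is what fortune(6) cannot do today.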

> > This is not a backout request, 
> 
> I've no objection to making a change. Apparently the De should be de
> anyway, so what do you suggest?

I cannot speak to that with any authority.

Uli

PS: I'd love for us to drop support for anything but Unicode, but then
again I'd also like to have a pony ...


