Date:      Tue, 3 Nov 2015 10:05:45 -0500
From:      Pedro Giffuni <pfg@FreeBSD.org>
To:        Baptiste Daroussin <bapt@FreeBSD.org>
Cc:        current@FreeBSD.org
Subject:   Re: [CFT] Unicode collation string and reworked locale definitions
Message-ID:  <BA8DB065-2D80-4136-9C93-C454444968E7@FreeBSD.org>
In-Reply-To: <20151103071758.GC31432@ivaldir.etoilebsd.net>
References:  <C3FA8B28-BC4B-4E6D-807D-679C09684128@FreeBSD.org> <20151103071758.GC31432@ivaldir.etoilebsd.net>

Hi Baptiste;

> On 3 Nov 2015, at 02:17, Baptiste Daroussin <bapt@FreeBSD.org> wrote:
>
> On Mon, Nov 02, 2015 at 06:59:15PM -0500, Pedro Giffuni wrote:
>> First of all, congratulations to Baptiste and Marino for succeeding where
>> I failed many moons ago. Also huge thanks to Nexenta and Garrett D’Amore
>> for relicensing localedef for us.
>>
>> Concerning regex;
>>
>> Gabor@ did a lot of work on libtre but according to him it was not up to the
>> task performancewise. We would also lose features if we move to libtre.
>>
>> I think our regex code actually has most of what is needed for multibyte
>> already. I have a patch that turns on the functionality but I haven’t found
>> any brave soul that will do the testing:
>>
>> https://people.freebsd.org/~pfg/patches/regex-multibyte.diff
>>
> I think this can be tested once the collation branch is merged.

Absolutely: support for collation is critical and badly needed even without
resolving the regex issues.

> Note that
> dragonfly and musl libc both use a patched version of libtre for the regex
> implementation.
>

I am aware. Also note that Gabor had some patches too, in order to make
it usable for bsdgrep:

https://wiki.freebsd.org/Regex

> From my non-scientific testing libtre was more reliable and performant than our
> current regex.

According to Gabor, the general performance was better until you take into
account multibyte support, where it was clearly inferior to GNU regex.

> Anyway it will be relatively "easy" to test using the AT&T
> testsuite the reliability and performance of both implementations: ours + your
> patch and patched libtre.
>


What worries me about libtre is that it lacks important functionality like word
delimiters. We even brought the sysv delimiters to be more compatible with
Solaris and GNU and we can’t back those out now:

https://svnweb.freebsd.org/base?view=revision&revision=268066
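
To make the concern concrete, here is a minimal sketch of what code relying on
word delimiters looks like. It assumes the delimiters in question are the
\< and \> forms, which are an extension rather than part of POSIX, so whether
regcomp() accepts them depends on the libc in use. The helper name
regex_matches is made up for illustration.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns 1 if `text` matches `pattern`,
 * 0 if it does not, and -1 if the pattern fails to compile.
 * Patterns using \< and \> only work where the regex library
 * supports those word delimiters (an assumption, not POSIX).
 */
int
regex_matches(const char *pattern, const char *text)
{
	regex_t re;
	int rc;

	/* REG_NOSUB: we only need a yes/no answer, no match offsets. */
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return (-1);

	rc = regexec(&re, text, 0, NULL, 0);
	regfree(&re);

	return (rc == 0);
}
```

With such delimiters available, a pattern like "\\<cat\\>" (as a C string) is
expected to match "a cat sat" but not "concatenate", while plain "cat" matches
both. Anything already written that way would break on a regex library
without them.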

Pedro.




