Date:      Wed,  1 Dec 2004 14:40:14 +0100
From:      Alexander Leidinger <Alexander@Leidinger.net>
To:        current@freebsd.org
Cc:        tode@bpanet.de
Subject:   Bug in our ru_RU.KOI8-R locale (with patch)?
Message-ID:  <1101908414.41adc9be50c73@netchild.homeip.net>

This message is in MIME format.

---MOQ1101908412fd6f338b5af8faca0ce8ba33784fec07
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

Hi,

I got a report that our ru_RU.KOI8-R locale seems to be broken. Attached
are a test program (test.pl, tested with perl 5.8.2) and some test input
(test.txt) which are supposed to show the problem. I can't read Cyrillic,
so I can't confirm whether the attached patch is the right fix.

If you run the test program you should see something like this (the text
may look strange because of the webmailer I use):
---snip---
Match small (RegEx with i flag): 0
Match small (RegEx without i flag): 8
Match for normal (RegEx with i flag): 17
Match for normal (RegEx without i flag): 9

Case - Check for 'яѓјъшэ'
lc() => яѓјъшэ
uc() => ЯгиЪШЭ
lcfirst() => яѓјъшэ
ucfirst() => Яѓјъшэ

Case - Check for 'Яѓјъшэ'
lc() => яѓјъшэ
uc() => ЯгиЪШЭ
lcfirst() => яѓјъшэ
ucfirst() => Яѓјъшэ
---snip---

I'm told the "Case - Check" parts are correct with the patch, but not
without it (lc() -> lower case the entire string; uc() -> upper case the
entire string; lcfirst() -> lower case the first character; ...). Can
someone please confirm this?
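
For anyone who wants to check the case handling without reading Cyrillic,
here is a minimal sketch in Python. It mirrors what lc(), uc(), lcfirst()
and ucfirst() do, applied to the two byte strings that test.pl contains
once the attachment is base64-decoded. The helper name describe() is made
up for illustration, and Python's "koi8_r" and "cp1251" codec tables are
only a stand-in for the charset definitions; this is not the FreeBSD
locale source and does not exercise mklocale.

```python
# Byte strings recovered by base64-decoding the attached test.pl.
PUSHKIN_SMALL = bytes.fromhex("eff3f8eae8ed")   # value of $pushkin_small
PUSHKIN_NORMAL = bytes.fromhex("cff3f8eae8ed")  # value of $pushkin_normal


def describe(raw, codec):
    """Decode raw bytes with the given codec and apply the four case helpers.

    Hypothetical helper: it imitates Perl's lc/uc/lcfirst/ucfirst using
    Unicode case rules after decoding, instead of a C locale's LC_CTYPE.
    """
    text = raw.decode(codec)
    return {
        "text": text,
        "lc": text.lower(),
        "uc": text.upper(),
        "lcfirst": text[:1].lower() + text[1:],
        "ucfirst": text[:1].upper() + text[1:],
        # One flag per character: U = upper case, l = lower case, ? = neither.
        "case": "".join(
            "U" if ch.isupper() else "l" if ch.islower() else "?"
            for ch in text
        ),
    }


if __name__ == "__main__":
    for codec in ("koi8_r", "cp1251"):
        for name, raw in (("small", PUSHKIN_SMALL), ("normal", PUSHKIN_NORMAL)):
            print(codec, name, describe(raw, codec))
```

Comparing the "case" flags for the two codecs shows which charset treats
the bytes of $pushkin_small as lower case. If that turns out to be cp1251
rather than koi8_r, it might mean the test strings were saved in a
different encoding, which would be worth ruling out before touching the
locale.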

If this is correct we've solved only a part of the problem. The other
part seems to be related to LC_COLLATE. "Match small" with the i flag
(case insensitive matching) shouldn't print 0 when "Match normal" with
the i flag doesn't print 0. Any ideas how to solve this?
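
To make the expected numbers concrete, here is a small Python sketch of
the four counters. It assumes the line counts stated at the end of
test.txt (8 lines with the "small" string, 9 with the "normal" one) and
uses Python's Unicode-aware re module in place of Perl's locale-aware
matching, so it only shows what a consistent case table should produce.
The function names are hypothetical. As far as perllocale describes it,
case-insensitive matching under "use locale" consults LC_CTYPE rather
than LC_COLLATE, so the collation hypothesis above may need checking.

```python
import re

# Byte strings recovered by base64-decoding the attached test.pl.
PUSHKIN_SMALL = bytes.fromhex("eff3f8eae8ed")   # value of $pushkin_small
PUSHKIN_NORMAL = bytes.fromhex("cff3f8eae8ed")  # value of $pushkin_normal

# Same number of matching lines as test.txt; the order is simplified and
# the ASCII "Test" lines are left out because they match neither pattern.
RAW_LINES = [PUSHKIN_SMALL] * 8 + [PUSHKIN_NORMAL] * 9


def count_matches(pattern, lines, ignore_case):
    """Count the lines that contain pattern, optionally ignoring case."""
    flags = re.IGNORECASE if ignore_case else 0
    regex = re.compile(re.escape(pattern), flags)
    return sum(1 for line in lines if regex.search(line))


def report(codec):
    """Return the four counters that test.pl prints, for one codec."""
    small = PUSHKIN_SMALL.decode(codec)
    normal = PUSHKIN_NORMAL.decode(codec)
    lines = [raw.decode(codec) for raw in RAW_LINES]
    return {
        "small_i": count_matches(small, lines, True),
        "small": count_matches(small, lines, False),
        "normal_i": count_matches(normal, lines, True),
        "normal": count_matches(normal, lines, False),
    }


if __name__ == "__main__":
    for codec in ("koi8_r", "cp1251"):
        print(codec, report(codec))
```

With a consistent upper/lower mapping both case-insensitive counters
should come out equal to the total number of matching lines, whichever
of the two codecs is used to interpret the bytes.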

If the patch isn't correct we still have a bug somewhere (please CC
perl@freebsd.org then). Why isn't perl able to do a case insensitive
match in the ru_RU.KOI8-R locale?

BTW: this affects 4.x (where the problem was noticed), 5.x and -current
(where I've tested the patch).

Bye,
Alexander.

-- 
http://www.Leidinger.net/     Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org/        netchild @ FreeBSD.org  : PGP ID = 72077137

---MOQ1101908412fd6f338b5af8faca0ce8ba33784fec07
Content-Type: application/octet-stream; name="test.pl"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="test.pl"

IyEvdXNyL2Jpbi9lbnYgcGVybAoKdXNlIGxvY2FsZTsKCm15ICRmaWxlCQk9ICd0ZXN0LnR4dCc7
Cm15ICRwdXNoa2luX3NtYWxsCT0gJ+/z+Oro7Sc7Cm15ICRwdXNoa2luX25vcm1hbAk9ICfP8/jq
6O0nOwoKbXkgJGRhdGEJCT0gTG9hZEZpbGUoJGZpbGUpOwoKbXkgJGNvdW50X25vcm1hbF9pCT0g
MDsKbXkgJGNvdW50X3NtYWxsX2kJPSAwOwpteSAkY291bnRfbm9ybWFsICAgICAgPSAwOwpteSAk
Y291bnRfc21hbGwgICAgICAgPSAwOwoKZm9yZWFjaCBteSAkbGluZSAoQHskZGF0YX0pIHsKCSRj
b3VudF9ub3JtYWxfaSsrIGlmICgkbGluZSA9fiBtLyRwdXNoa2luX25vcm1hbC9pc2cpOwoJJGNv
dW50X3NtYWxsX2krKyBpZiAoJGxpbmUgPX4gbS8kcHVzaGtpbl9zbWFsbC9pc2cpOwoJJGNvdW50
X25vcm1hbCsrIGlmICgkbGluZSA9fiBtLyRwdXNoa2luX25vcm1hbC9zZyk7CiAgICAgICAgJGNv
dW50X3NtYWxsKysgaWYgKCRsaW5lID1+IG0vJHB1c2hraW5fc21hbGwvc2cpOwp9CgpwcmludCAi
TWF0Y2ggc21hbGwgKFJlZ0V4IHdpdGggaSBmbGFnKTogJGNvdW50X3NtYWxsX2lcbiI7CnByaW50
ICJNYXRjaCBzbWFsbCAoUmVnRXggd2l0aG91dCBpIGZsYWcpOiAkY291bnRfc21hbGxcbiI7Cgpw
cmludCAiTWF0Y2ggZm9yIG5vcm1hbCAoUmVnRXggd2l0aCBpIGZsYWcpOiAkY291bnRfbm9ybWFs
X2lcbiI7CnByaW50ICJNYXRjaCBmb3Igbm9ybWFsIChSZWdFeCB3aXRob3V0IGkgZmxhZyk6ICRj
b3VudF9ub3JtYWxcblxuIjsKVGVzdENhc2UoJHB1c2hraW5fc21hbGwpOwpUZXN0Q2FzZSgkcHVz
aGtpbl9ub3JtYWwpOwoKZXhpdCgwKTsKCgpzdWIgVGVzdENhc2UgewoJbXkgJHN0cmluZwk9IHNo
aWZ0KEBfKTsKCXByaW50ICJDYXNlIC0gQ2hlY2sgZm9yIFwnJHN0cmluZ1wnXG4iOwoJcHJpbnQg
ImxjKCkgPT4gIi5sYygkc3RyaW5nKS4iXG4iOwoJcHJpbnQgInVjKCkgPT4gIi51Yygkc3RyaW5n
KS4iXG4iOwoJcHJpbnQgImxjZmlyc3QoKSA9PiAiLmxjZmlyc3QoJHN0cmluZykuIlxuIjsKCXBy
aW50ICJ1Y2ZpcnN0KCkgPT4gIi51Y2ZpcnN0KCRzdHJpbmcpLiJcbiI7CgkKCXByaW50ICJcbiI7
CgoJcmV0dXJuIDE7Cn0KCgpzdWIgTG9hZEZpbGUgewoJbXkgJGZpbGUJPSBzaGlmdChAXyk7Cglt
eSBAdmFsdWUJPSAoKTsKCW9wZW4oRklMRSwgIjwkZmlsZSIpOwoJQHZhbHVlCQk9IDxGSUxFPjsK
CWNsb3NlKEZJTEUpOwoJY2hvbXAoQHZhbHVlKTsKCXJldHVybiBcQHZhbHVlOwp9Cgo=

---MOQ1101908412fd6f338b5af8faca0ce8ba33784fec07
Content-Type: text/plain; name="test.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="test.txt"

7/P46ujtDQrP8/jq6O0NClRlc3QNClRlc3QNClRFU1QNCnRFU1QNCu/z+Oro7Q0Kz/P46ujtDQpU
ZXN0DQpUZXN0DQpURVNUDQp0RVNUDQrv8/jq6O0NCu/z+Oro7Q0K7/P46ujtDQrv8/jq6O0NCs/z
+Oro7Q0Kz/P46ujtDQrP8/jq6O0NCs/z+Oro7Q0Kz/P46ujtDQrv8/jq6O0NCs/z+Oro7Q0Kz/P4
6ujtDQrv8/jq6O0NCg0KQ09VTlQgbG93ZXIgOCB1cHBlciA5DQoNCg==

---MOQ1101908412fd6f338b5af8faca0ce8ba33784fec07
Content-Type: application/octet-stream; name="mklocale:ru_RU.KOI8-R.diff"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="mklocale:ru_RU.KOI8-R.diff"

LS0tIC91c3Ivc3JjL3NoYXJlL21rbG9jYWxlL3J1X1JVLktPSTgtUi5zcmMJRnJpIE5vdiAzMCAw
NjowNTo1MyAyMDAxCisrKyBydV9SVS5LT0k4LVIuc3JjCVdlZCBEZWMgIDEgMTM6Mzg6NTkgMjAw
NApAQCAtMTMsMjcgKzEzLDI3IEBACiBDT05UUk9MCQkweDAwIC0gMHgxZiAweDdmCiBESUdJVAkJ
JzAnIC0gJzknCiBHUkFQSAkJMHgyMSAtIDB4N2UgMHg4MCAtIDB4OTkJMHg5YiAtIDB4ZmYKLUxP
V0VSCQknYScgLSAneicgMHhhMyAweGMwIC0gMHhkZgorTE9XRVIJCSdhJyAtICd6JyAweGIzIDB4
ZTAgLSAweGZmCiBQVU5DVAkJMHgyMSAtIDB4MmYgMHgzYSAtIDB4NDAgMHg1YiAtIDB4NjAgMHg3
YiAtIDB4N2UKIFNQQUNFCQkweDA5IC0gMHgwZCAweDIwIDB4OWEKLVVQUEVSCQknQScgLSAnWicg
MHhiMyAweGUwIC0gMHhmZgorVVBQRVIJCSdBJyAtICdaJyAweGEzIDB4YzAgLSAweGRmCiBYRElH
SVQgICAgICAgICAgJzAnIC0gJzknICdhJyAtICdmJyAnQScgLSAnRicKIEJMQU5LCQknICcgJ1x0
JyAweDlhCiBQUklOVAkJMHgyMCAtIDB4N2UgMHg4MCAtIDB4ZmYKIAogTUFQTE9XRVIgICAgICAg
CTwnQScgLSAnWicgOiAnYSc+CiBNQVBMT1dFUiAgICAgICAJPCdhJyAtICd6JyA6ICdhJz4KLU1B
UExPV0VSCTwweGIzICAweGEzPgotTUFQTE9XRVIgICAgICAgIDwweGEzICAweGEzPgotTUFQTE9X
RVIJPDB4ZTAgLSAweGZmIDogMHhjMD4KLU1BUExPV0VSCTwweGMwIC0gMHhkZiA6IDB4YzA+CitN
QVBMT1dFUgk8MHhiMyAgMHhiMz4KK01BUExPV0VSICAgICAgICA8MHhhMyAgMHhiMz4KK01BUExP
V0VSCTwweGUwIC0gMHhmZiA6IDB4ZTA+CitNQVBMT1dFUgk8MHhjMCAtIDB4ZGYgOiAweGUwPgog
CiBNQVBVUFBFUiAgICAgICAJPCdBJyAtICdaJyA6ICdBJz4KIE1BUFVQUEVSICAgICAgIAk8J2En
IC0gJ3onIDogJ0EnPgotTUFQVVBQRVIgICAgICAgIDwweGIzICAweGIzPgotTUFQVVBQRVIJPDB4
YTMgIDB4YjM+Ci1NQVBVUFBFUgk8MHhlMCAtIDB4ZmYgOiAweGUwPgotTUFQVVBQRVIJPDB4YzAg
LSAweGRmIDogMHhlMD4KK01BUFVQUEVSICAgICAgICA8MHhiMyAgMHhhMz4KK01BUFVQUEVSCTww
eGEzICAweGEzPgorTUFQVVBQRVIJPDB4ZTAgLSAweGZmIDogMHhjMD4KK01BUFVQUEVSCTwweGMw
IC0gMHhkZiA6IDB4YzA+CiAKIFRPRElHSVQgICAgICAgCTwnMCcgLSAnOScgOiAwPgogVE9ESUdJ
VCAgICAgICAJPCdBJyAtICdGJyA6IDEwPgo=

---MOQ1101908412fd6f338b5af8faca0ce8ba33784fec07--



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1101908414.41adc9be50c73>