Date: Sat, 4 May 2013 16:03:15 +0400
From: Sergey Kandaurov <pluknet@freebsd.org>
To: Andrey Chernov <ache@freebsd.org>
Cc: svn-src-stable@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org, svn-src-stable-9@freebsd.org, Jilles Tjoelker <jilles@stack.nl>
Subject: Re: svn commit: r250215 - stable/9/lib/libc/locale
Message-ID: <CAE-mSO%2BJOTcfx1vDbiux8LpikZV0J1ti2HJ0ypCsotfeJ4qKzg@mail.gmail.com>
In-Reply-To: <5184ED7E.3040703@freebsd.org>
References: <201305031552.r43FqiPN024580@svn.freebsd.org> <5183E899.4000503@freebsd.org> <CAE-mSO%2BB_p_HCbKwSO-rJ%2BdforcPEfThmOxy%2BKi_1e9zPn3q_w@mail.gmail.com> <20130503195540.GA52657@stack.nl> <CAE-mSOLT6EdaYQheNka%2B%2BNPZRbUFM=kXv6i9k=uRiyQTy1JuuA@mail.gmail.com> <5184ED7E.3040703@freebsd.org>

On 4 May 2013 15:14, Andrey Chernov <ache@freebsd.org> wrote:
> On 04.05.2013 0:48, Sergey Kandaurov wrote:
>> On 3 May 2013 23:55, Jilles Tjoelker <jilles@stack.nl> wrote:
>>> Some sort of perfect hashing can also be an option, although it makes it
>>> harder to add new properties or adds a build dependency on gperf(1) that
>>> we would like to get rid of.
>> I hacked a bit on wctype. Speaking about speed, it shows about 1-3.5x
>> improvement over the previous fast version (before r250215).
>>
>> Time spent for 2097152 wctype() calls for each of wctype property
>>              current       previous      mine
>> alnum        0.090554676   0.035821210   0.033270579
>> alpha        0.172074310   0.052461036   0.044916572
>> blank        0.261109989   0.055735281   0.036682745
>> cntrl        0.357318986   0.069249831   0.038292782
>> digit        0.436381530   0.094194364   0.039249005
>> graph        0.540954812   0.085580099   0.043331460
>> lower        0.618306476   0.095665215   0.044070399
>> print        0.707443135   0.132559305   0.048216097
>> punct        0.788922052   0.142809109   0.062871432
>> space        0.888263108   0.150516644   0.054086142
>> upper        0.966903461   0.173593592   0.054027834
>> xdigit       0.406611275   0.201614227   0.060695939
>> ideogram     0.439763499   0.239640723   0.068566486
>> special      0.523128094   0.249156298   0.099278051
>> phonogram    0.564975870   0.260972651   0.135751471
>> rune         0.637392247   0.235195497   0.064093971
>>
>> Index: locale/wctype.c
>> ===================================================================
>> --- locale/wctype.c (revision 250217)
>> +++ locale/wctype.c (working copy)
>> @@ -74,6 +74,9 @@
>>          "special\0"    /* BSD extension */
>>          "phonogram\0"  /* BSD extension */
>>          "rune\0";      /* BSD extension */
>> +    static const size_t propnamlen[] = {
>> +        5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 8, 7, 9, 4, 0
>> +    };
>>      static const wctype_t propmasks[] = {
>>          _CTYPE_A|_CTYPE_D,
>>          _CTYPE_A,
>> @@ -92,16 +95,17 @@
>>          _CTYPE_Q,
>>          0xFFFFFF00L
>>      };
>> -    size_t len1, len2;
>> +    const size_t *len2;
>>      const char *p;
>>      const wctype_t *q;
>>
>> -    len1 = strlen(property);
>>      q = propmasks;
>> -    for (p = propnames; (len2 = strlen(p)) != 0; p += len2 + 1) {
>> -        if (len1 == len2 && memcmp(property, p, len1) == 0)
>> +    len2 = propnamlen;
>> +    for (p = propnames; *len2 != 0; ) {
>> +        if (property[0] == p[0] && strcmp(property, p) == 0)
>>              return (*q);
>> -        q++;
>> +        p += *len2 + 1;
>> +        q++; len2++;
>>      }
>>
>>      return (0UL);
>> [...]
>
> BTW, I don't run tests and look in asm code for sure, but it seems
> property[0] == p[0] is unneeded because almost every compiler tries to
> inline strcmp().

strcmp() does not appear to be inlined here; see below.
Apparently property[0] == p[0] is cheaper than strcmp() for negative checks.
Removing this condition brings the performance numbers back to the
"previous" column.

Looking into asm:

# property[0] == p[0]
  4d:   44 3a 75 00             cmp    0x0(%rbp),%r14b
  51:   75 dd                   jne    30 <wctype_l+0x30>
# strcmp()
  53:   48 89 ee                mov    %rbp,%rsi
  56:   4c 89 ff                mov    %r15,%rdi
  59:   e8 00 00 00 00          callq  5e <wctype_l+0x5e>
  5e:   85 c0                   test   %eax,%eax
  60:   75 ce                   jne    30 <wctype_l+0x30>

--
wbr,
pluknet
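
For anyone who wants to try the loop outside libc, here is a minimal standalone sketch of the lookup from the patch above: a table of precomputed name lengths plus the first-character check in front of strcmp(). The function name prop_lookup() and its return value (the 1-based position of the property, 0 if unknown) are hypothetical stand-ins; the real wctype() returns the corresponding entry of propmasks instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for wctype(): returns the 1-based position of
 * the property name in the table, or 0 if the name is unknown.
 */
static int
prop_lookup(const char *property)
{
	/* Names packed back to back, each terminated by '\0'. */
	static const char propnames[] =
	    "alnum\0" "alpha\0" "blank\0" "cntrl\0" "digit\0" "graph\0"
	    "lower\0" "print\0" "punct\0" "space\0" "upper\0" "xdigit\0"
	    "ideogram\0" "special\0" "phonogram\0" "rune\0";
	/* Precomputed strlen() of each name; a 0 entry ends the table. */
	static const size_t propnamlen[] = {
		5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 6, 8, 7, 9, 4, 0
	};
	const size_t *len;
	const char *p;
	int idx;

	assert(property != NULL);

	idx = 1;
	len = propnamlen;
	for (p = propnames; *len != 0; ) {
		/* Cheap first-byte test rejects most names before strcmp(). */
		if (property[0] == p[0] && strcmp(property, p) == 0)
			return (idx);
		p += *len + 1;	/* skip the name and its terminating NUL */
		len++;
		idx++;
	}

	return (0);
}
```

With this sketch, prop_lookup("alnum") yields 1 and prop_lookup("rune") yields 16, while an empty or unknown name falls through every entry and yields 0. Whether the first-byte test pays off depends on whether the compiler inlines strcmp(), which is exactly what the disassembly above is checking.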