Date: Fri, 16 Jul 2010 11:27:30 +0200 (CEST)
Message-Id: <201007160927.o6G9RU34020754@lurza.secnetix.de>
From: Oliver Fromme
To: freebsd-chat@FreeBSD.ORG, deeptech71@gmail.com
Subject: Re: is strlen()'s read-4-bytes-ahead a standard?

deeptech71@gmail.com wrote:
> Xin LI wrote:
> > On 2010/07/15 15:38, deeptech71@gmail.com wrote:
> > > Some C implementations use the read-4-bytes-ahead technique to speed
> > > up strlen().
> > > Does the C standard state anything about strlen() being
> > > allowed to read past the terminating zero?
> >
> > It's not 4-bytes-ahead, but read a whole (aligned) word at one time.
> >
> > I think C standard does not dictate in this detail.
>
> OK, can anyone confirm this?

When Xin LI states it, it doesn't need confirmation.  ;-)

You can look it up yourself: it's in section 7.21.6.3 (page 333) of
ISO/IEC 9899:1999, a.k.a. "C99".  It only states that "The strlen
function computes the length of the string" and "The strlen function
returns the number of characters that precede the terminating null
character".  Nothing more.

> > But why?
>
> Just wondering.

There's no reason not to read the string as aligned words.  Because
the words are aligned, and the page size is a multiple of the word
size, a word can never straddle a page boundary, so there's no risk of
accidentally touching the next VM page after the end of the string.

On the other hand, I don't think it is clear that doing this for
strlen() would be a performance win in every situation.

BTW, some languages (and also some string libraries for C) store the
length separately for every string, so you don't have to iterate
through the whole string to get its length.

Best regards
   Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Commercial register: Registergericht München, HRA 74606; management:
secnetix Verwaltungsgesellsch. mbH, commercial register: Registergericht
München, HRB 125758; managing directors: Maik Bachmann, Olaf Erb,
Ralf Gebhart

FreeBSD services, products and more:  http://www.secnetix.de/bsd

I suggested holding a "Python Object Oriented Programming Seminar",
but the acronym was unpopular.
        -- Joseph Strout
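
For illustration, here is a minimal sketch of the aligned-word approach
discussed in this message.  It is not the FreeBSD libc code; the name
word_strlen and the choice of uintptr_t as the "word" are assumptions
made for the example.  Note that loading the rest of the word after
the terminating null is not something ISO C itself guarantees to be
valid; the sketch relies on the platform property that an aligned word
lies entirely within one page.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical word-at-a-time strlen; illustration only. */
size_t word_strlen(const char *s)
{
    const char *p = s;

    /* Advance byte by byte until p sits on a word boundary. */
    while (((uintptr_t)p % sizeof(uintptr_t)) != 0) {
        if (*p == '\0')
            return (size_t)(p - s);
        p++;
    }

    /* 0x0101...01 and 0x8080...80 for whatever the word width is. */
    const uintptr_t ones  = (uintptr_t)-1 / 0xFF;
    const uintptr_t highs = ones << 7;

    for (;;) {
        uintptr_t w;

        /*
         * One aligned word load.  It may cover bytes after the
         * terminator, but never bytes on a different page.
         * (Assumption: the platform tolerates such in-page reads.)
         */
        memcpy(&w, p, sizeof w);

        /* Nonzero if and only if some byte of w is zero. */
        if (((w - ones) & ~w & highs) != 0)
            break;
        p += sizeof w;
    }

    /* The terminator is somewhere in this word; find it bytewise. */
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

Calling word_strlen("hello") is expected to give the same result as
strlen("hello").  Whether it is actually faster depends on the string
lengths and the machine, as noted above.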