From: "Anthony Bourov"
Delivered-To: freebsd-performance@freebsd.org
Date: Mon, 23 Mar 2009 15:44:05 -0700
Subject: nsdispatch performance issue for large group files (libc)

Regarding performance of:
lib/libc/net/nsdispatch.c
When used from:
lib/libc/gen/getgrent.c (called by initgroups())

I don't normally post here, but I wanted to get some feedback on a performance issue that I spotted. I run a large number of high-volume web hosting servers and noticed on some of them a severe decrease in Apache's performance when the /etc/group file is large (over 100,000 entries in a group file, as it is combined across servers).

I did a trace and found the following operation:

stat("/etc/nsswitch.conf", {st_mode=052, st_size=4503681233059861, ...}) = 0

repeating as many times as there are groups in the group file. I narrowed the problem down to where Apache calls "initgroups()" before forking each process (nothing wrong here). initgroups() goes through every entry in the group file using getgrent(), which in turn calls nsdispatch, which on every single call does a "stat" on "/etc/nsswitch.conf" to see if it has changed.

This issue impacts different servers differently. On most of the SCSI servers it delays Apache startup by maybe a minute; however, on a Dell SATA RAID the "stat" call was significantly slower and caused everything to come to a halt for several minutes every time Apache starts.

In my opinion this is a very significant performance issue when working with large servers. Most programs, including Apache, will call "initgroups()" every time they fork, and if the group file is large this means as many "stat" requests on the file system as there are entries in the group file for every single fork() that the server does.

For myself, I just made it never call "stat" on "/etc/nsswitch.conf" after the first time, since I know that file is never modified. However, a better solution would be to somehow let nsdispatch know that it is being run in batch mode and should not keep testing whether the file has changed. This would affect both "getgrent" and "getpwent".
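
One possible middle ground between calling stat on every lookup and never calling it again is to rate-limit the check, so that a burst of getgrent() calls costs at most one stat per interval. The following is only a minimal sketch of that idea in C, not the nsdispatch code and not the local change described above. The names conf_watch, conf_needs_reload and CONF_CHECK_INTERVAL are hypothetical, and the one-second interval is an assumed value.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical names throughout; this is not the libc nsdispatch code. */
#define CONF_CHECK_INTERVAL 1   /* assumed: minimum seconds between stat() calls */

struct conf_watch {
    const char *path;          /* file to watch, e.g. "/etc/nsswitch.conf" */
    bool primed;               /* false until the first check has been made */
    time_t last_check;         /* time of the most recent stat() attempt */
    time_t last_mtime;         /* modification time seen by the last good stat() */
    unsigned long stat_calls;  /* counter kept only to illustrate the saving */
};

/*
 * Returns true when the caller should (re)read the file: on the first
 * successful check, or when the modification time has changed.
 * The caller passes the current time so the logic is easy to test.
 */
bool
conf_needs_reload(struct conf_watch *w, time_t now)
{
    struct stat sb;
    bool first;

    assert(w != NULL && w->path != NULL);

    /* Inside the interval: trust the cached answer and skip the syscall. */
    if (w->primed && now - w->last_check < CONF_CHECK_INTERVAL)
        return false;

    first = !w->primed;
    w->primed = true;
    w->last_check = now;
    w->stat_calls++;

    /* Missing or unreadable file: keep whatever configuration is loaded. */
    if (stat(w->path, &sb) != 0)
        return false;

    if (first || sb.st_mtime != w->last_mtime) {
        w->last_mtime = sb.st_mtime;
        return true;
    }
    return false;
}
```

With a struct initialised as { .path = "/etc/nsswitch.conf" } and time(NULL) passed as the second argument, a loop over 100,000 group entries that finishes within the same second would trigger a single stat instead of 100,000. The trade-off is that an edit to the file may go unnoticed for up to one interval.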