From owner-freebsd-hackers Wed Jul 21 4:54: 6 1999
Message-ID: <3795B50B.EFBBEFE2@isocor.ie>
Date: Wed, 21 Jul 1999 12:54:51 +0100
From: Peter Edwards
Organization: ISOCOR
To: Peter Jeremy
Cc: will@iki.fi, hackers@FreeBSD.ORG
Subject: Re: speed of file(1)
References: <99Jul21.210648est.40326@border.alcanet.com.au>

A quick look at the source reveals:

A MAXMAGIS constant in file.h that assumes a limit of 1000 lines in
magic. (The real number is 4802.)

An array sized on MAXMAGIS, which is reallocated every ALLOC_INCR lines
of magic once MAXMAGIS is exceeded.

The patch updates MAXMAGIS to 5000 (to give a bit of room to grow) and
makes ALLOC_INCR a variable that starts out bigger and doubles every
time it is used, to attenuate the problem if there ever end up being
10000 entries in magic.

Results on a 90MHz Pentium:

new version:
time ./file ./file
./file: FreeBSD/i386 compact demand paged dynamically linked executable not stripped
        0.14 real         0.11 user         0.02 sys

old version:
./file: FreeBSD/i386 compact demand paged dynamically linked executable not stripped
        0.79 real         0.60 user         0.16 sys

--
Peter.

Peter Jeremy wrote:
>
> Ville-Pertti Keinonen wrote:
> >jeremyp@gsmx07.alcatel.com.au (Peter Jeremy) writes:
> >> I can't believe these figures.
>
> Based on the figures below, maybe I was overly hasty in this statement.
> The changes between 2.x and 3.x magic files have far more impact than
> I would have expected.
>
> >What are your results, then?
>
> All timings with everything cached (although the 386 only has 8MB
> which limits the cacheability).  For the 2.2.5 systems, I give timings
> with both the 2.2.5 magic and the 4.0 magic (which is the same as
> 3.2-RELEASE, in /tmp).
>
> i386SX-25 running 2.2.5 (roughly as posted earlier):
> % /usr/bin/time file src/Z/dhcp-2.0b1pl26.tar.gz
> src/Z/dhcp-2.0b1pl26.tar.gz: gzip compressed data, deflated, last modified: Thu Jan 1 10:00:00 1970, os: Unix
>         2.82 real         1.92 user         0.84 sys
> % /usr/bin/time file -m /tmp/magic src/Z/dhcp-2.0b1pl26.tar.gz
> src/Z/dhcp-2.0b1pl26.tar.gz: gzip compressed data, deflated, last modified: Thu Jan 1 10:00:00 1970, os: Unix
>         4.05 real         2.67 user         1.23 sys
>
> 486DX2-50 running 2.2.5:
> % /usr/bin/time file src/Z/dhcp-3.0-alpha-19990423.tar.gz
> src/Z/dhcp-3.0-alpha-19990423.tar.gz: gzip compressed data, deflated, last modified: Thu Jan 1 10:00:00 1970, os: Unix
>         1.43 real         0.96 user         0.38 sys
> % /usr/bin/time file -m /tmp/magic src/Z/dhcp-3.0-alpha-19990423.tar.gz
> src/Z/dhcp-3.0-alpha-19990423.tar.gz: gzip compressed data, deflated, last modified: Thu Jan 1 10:00:00 1970, os: Unix
>         2.15 real         1.62 user         0.44 sys
>
> PII-266 running 4.0-CURRENT:
> % /usr/bin/time file src/Z/dhcp-1.4.0p6.tar.gz
> src/Z/dhcp-1.4.0p6.tar.gz: gzip compressed data, deflated, last modified: Wed Mar 3 20:57:52 1999, os: Unix
>         0.13 real         0.09 user         0.03 sys
>
> When I profile file in a slow system (like a 386 or 486), there is an
> obvious performance bottleneck: The problem is the memcpy() invoked
> from fgets().  The only solution would seem to be to mmap() magic
> and parse it, rather than using fgets() to read it.
> This bottleneck will also be far more obvious on bandwidth-starved
> systems (like 386SX and 486DX2/4), whereas virtually the whole thing
> fits into the L2 cache on my P-II.
>
> Peter
>
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message

Content-Type: text/plain; charset=us-ascii; name="file.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="file.patch"

Common subdirectories: file/Magdir and file.new/Magdir
diff -c file/apprentice.c file.new/apprentice.c
*** file/apprentice.c   Wed Jan 28 07:36:21 1998
--- file.new/apprentice.c       Wed Jul 21 12:35:21 1999
***************
*** 50,55 ****
--- 50,56 ----
  static void eatsize   __P((char **));
  
  static int maxmagic = 0;
+ static int alloc_incr = 256;
  
  static int apprentice_1       __P((char *, int));
  
***************
*** 180,188 ****
        struct magic *m;
        char *t, *s;
  
- #define ALLOC_INCR    20
        if (nd+1 >= maxmagic){
!           maxmagic += ALLOC_INCR;
            if ((magic = (struct magic *) realloc(magic,
                                                  sizeof(struct magic) *
                                                  maxmagic)) == NULL) {
--- 181,188 ----
        struct magic *m;
        char *t, *s;
  
        if (nd+1 >= maxmagic){
!           maxmagic += alloc_incr;
            if ((magic = (struct magic *) realloc(magic,
                                                  sizeof(struct magic) *
                                                  maxmagic)) == NULL) {
***************
*** 192,198 ****
                        else
                                exit(1);
            }
!           memset(&magic[*ndx], 0, sizeof(struct magic) * ALLOC_INCR);
        }
        m = &magic[*ndx];
        m->flag = 0;
--- 192,199 ----
                        else
                                exit(1);
            }
!           memset(&magic[*ndx], 0, sizeof(struct magic) * alloc_incr);
!           alloc_incr *= 2;
        }
        m = &magic[*ndx];
        m->flag = 0;
diff -c file/file.h file.new/file.h
*** file/file.h Wed Jul 21 12:37:00 1999
--- file.new/file.h     Wed Jul 21 12:35:40 1999
***************
*** 35,41 ****
  #ifndef HOWMANY
  # define HOWMANY 8192         /* how much of the file to look at */
  #endif
! #define MAXMAGIS 1000         /* max entries in /etc/magic */
  #define MAXDESC       50              /* max leng of text description */
  #define MAXstring 32          /* max leng of "string" types */
  
--- 35,41 ----
  #ifndef HOWMANY
  # define HOWMANY 8192         /* how much of the file to look at */
  #endif
! #define MAXMAGIS 5000         /* max entries in /etc/magic */
  #define MAXDESC       50              /* max leng of text description */
  #define MAXstring 32          /* max leng of "string" types */
  