Date:      Wed, 21 Jul 1999 12:54:51 +0100
From:      Peter Edwards <peter.edwards@isocor.ie>
To:        Peter Jeremy <jeremyp@gsmx07.alcatel.com.au>
Cc:        will@iki.fi, hackers@FreeBSD.ORG
Subject:   Re: speed of file(1)
Message-ID:  <3795B50B.EFBBEFE2@isocor.ie>
References:  <99Jul21.210648est.40326@border.alcanet.com.au>

This is a multi-part message in MIME format.
--------------8B774876DAB92991344CB5AD
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

A quick look at the source reveals:

A MAXMAGIS constant in file.h that assumes a limit of 1000 lines in
magic. (The real number is 4802.)

An array sized on MAXMAGIS that is reallocated every ALLOC_INCR (20)
lines of magic once MAXMAGIS is exceeded.

The patch raises MAXMAGIS to 5000 (to give a bit of room to grow)
and replaces ALLOC_INCR with a variable, alloc_incr, that starts
larger (256) and doubles every time it is used, to soften the problem
if magic ever ends up with 10000 entries.
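
To put rough numbers on the two growth policies, here is a small
stand-alone sketch (not part of the patch). It replays the
"nd+1 >= maxmagic" test from apprentice.c and counts how often the
table would be grown. The function name count_grows and its
parameters are made up for illustration, and it assumes the table
starts out with MAXMAGIS entries already allocated.

```c
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: count how many times the
 * magic table would be grown while loading `entries` lines.
 *
 *   initial  - assumed starting size of the table (MAXMAGIS)
 *   incr     - number of entries added on each growth
 *   doubling - nonzero to double incr after each growth, as the
 *              patched code does with alloc_incr
 */
size_t count_grows(size_t initial, size_t incr, int doubling, size_t entries)
{
    size_t maxmagic = initial;
    size_t grows = 0;

    for (size_t nd = 0; nd < entries; nd++) {
        /* Same trigger condition as in apprentice.c. */
        if (nd + 1 >= maxmagic) {
            maxmagic += incr;   /* each growth is one realloc() call */
            grows++;
            if (doubling)
                incr *= 2;
        }
    }
    return grows;
}
```

Under those assumptions, count_grows(1000, 20, 0, 4802) gives 191
growths for the old code, and count_grows(5000, 256, 1, 4802) gives 0
for the patched code. With 10000 entries the patched policy would grow
5 times, where a fixed increment of 20 would grow 251 times. Each
growth is a realloc() that may have to copy the whole table, so the
count is only a proxy for the actual cost.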

Results on a 90MHz Pentium:

New version:

time ./file ./file
./file: FreeBSD/i386 compact demand paged dynamically linked executable
not stripped
        0.14 real         0.11 user         0.02 sys

Old version:

./file: FreeBSD/i386 compact demand paged dynamically linked executable
not stripped
        0.79 real         0.60 user         0.16 sys




--
Peter.



Peter Jeremy wrote:
> 
> Ville-Pertti Keinonen <will@iki.fi> wrote:
> >jeremyp@gsmx07.alcatel.com.au (Peter Jeremy) writes:
> >> I can't believe these figures.
> 
> Based on the figures below, maybe I was overly hasty in this statement.
> The changes between 2.x and 3.x magic files have far more impact than
> I would have expected.
> 
> >What are your results, then?
> 
> All timings with everything cached (although the 386 only has 8MB
> which limits the cacheability).  For the 2.2.5 systems, I give timings
> with both the 2.2.5 magic and the 4.0 magic (which is the same as
> 3.2-RELEASE, in /tmp).
> 
> i386SX-25 running 2.2.5 (roughly as posted earlier):
> % /usr/bin/time file src/Z/dhcp-2.0b1pl26.tar.gz
> src/Z/dhcp-2.0b1pl26.tar.gz: gzip compressed data, deflated, last modified: Thu Jan  1 10:00:00 1970, os: Unix
>         2.82 real         1.92 user         0.84 sys
> % /usr/bin/time file -m /tmp/magic src/Z/dhcp-2.0b1pl26.tar.gz
> src/Z/dhcp-2.0b1pl26.tar.gz: gzip compressed data, deflated, last modified: Thu Jan  1 10:00:00 1970, os: Unix
>         4.05 real         2.67 user         1.23 sys
> 
> 486DX2-50 running 2.2.5:
> % /usr/bin/time file src/Z/dhcp-3.0-alpha-19990423.tar.gz
> src/Z/dhcp-3.0-alpha-19990423.tar.gz: gzip compressed data, deflated, last modified: Thu Jan  1 10:00:00 1970, os: Unix
>         1.43 real         0.96 user         0.38 sys
> % /usr/bin/time file -m /tmp/magic src/Z/dhcp-3.0-alpha-19990423.tar.gz
> src/Z/dhcp-3.0-alpha-19990423.tar.gz: gzip compressed data, deflated, last modified: Thu Jan  1 10:00:00 1970, os: Unix
>         2.15 real         1.62 user         0.44 sys
> 
> PII-266 running 4.0-CURRENT:
> % /usr/bin/time file src/Z/dhcp-1.4.0p6.tar.gz
> src/Z/dhcp-1.4.0p6.tar.gz: gzip compressed data, deflated, last modified: Wed Mar  3 20:57:52 1999, os: Unix
>         0.13 real         0.09 user         0.03 sys
> 
> When I profile file in a slow system (like a 386 or 486), there is an
> obvious performance bottleneck:  The problem is the memcpy() invoked
> from fgets().  The only solution would seem to be to mmap() magic
> and parse it, rather than using fgets() to read it.  This bottleneck
> will also be far more obvious on bandwidth-starved systems (like
> 386SX and 486DX2/4), whereas virtually the whole thing fits into the
> L2 cache on my P-II.
> 
> Peter
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-hackers" in the body of the message
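
For reference, the mmap() idea in the quoted message could look
roughly like the sketch below. It is not taken from file(1): the
function name for_each_line_mmap and the line_fn callback type are
invented for illustration, and nothing in this thread measures whether
it actually beats fgets(). It maps the file read-only and walks it
line by line with memchr(), without copying each line into a buffer.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical callback: receives a line that is NOT NUL-terminated. */
typedef void (*line_fn)(const char *line, size_t len, void *ctx);

/*
 * Map `path` read-only and call `fn` (if non-NULL) once per line.
 * Returns the number of lines seen, or -1 on error.
 */
long for_each_line_mmap(const char *path, line_fn fn, void *ctx)
{
    struct stat st;
    long nlines = 0;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* mmap() rejects a zero length */
        close(fd);
        return 0;
    }

    size_t size = (size_t)st.st_size;
    char *base = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping outlives the descriptor */
    if (base == MAP_FAILED)
        return -1;

    const char *p = base;
    const char *end = base + size;
    while (p < end) {
        const char *nl = memchr(p, '\n', (size_t)(end - p));
        size_t len = nl ? (size_t)(nl - p) : (size_t)(end - p);

        if (fn)
            fn(p, len, ctx);
        nlines++;
        p += len + (nl ? 1 : 0);    /* step over the newline, if any */
    }

    munmap(base, size);
    return nlines;
}
```

Because the lines handed to the callback are not NUL-terminated, a
parser written for fgets() buffers would need adapting before it could
be driven this way.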
--------------8B774876DAB92991344CB5AD
Content-Type: text/plain; charset=us-ascii;
 name="file.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
 filename="file.patch"

Common subdirectories: file/Magdir and file.new/Magdir
diff -c file/apprentice.c file.new/apprentice.c
*** file/apprentice.c	Wed Jan 28 07:36:21 1998
--- file.new/apprentice.c	Wed Jul 21 12:35:21 1999
***************
*** 50,55 ****
--- 50,56 ----
  static void eatsize	__P((char **));
  
  static int maxmagic = 0;
+ static int alloc_incr = 256;
  
  static int apprentice_1	__P((char *, int));
  
***************
*** 180,188 ****
  	struct magic *m;
  	char *t, *s;
  
- #define ALLOC_INCR	20
  	if (nd+1 >= maxmagic){
! 	    maxmagic += ALLOC_INCR;
  	    if ((magic = (struct magic *) realloc(magic, 
  						  sizeof(struct magic) * 
  						  maxmagic)) == NULL) {
--- 181,188 ----
  	struct magic *m;
  	char *t, *s;
  
  	if (nd+1 >= maxmagic){
! 	    maxmagic += alloc_incr;
  	    if ((magic = (struct magic *) realloc(magic, 
  						  sizeof(struct magic) * 
  						  maxmagic)) == NULL) {
***************
*** 192,198 ****
  		else
  			exit(1);
  	    }
! 	    memset(&magic[*ndx], 0, sizeof(struct magic) * ALLOC_INCR);
  	}
  	m = &magic[*ndx];
  	m->flag = 0;
--- 192,199 ----
  		else
  			exit(1);
  	    }
! 	    memset(&magic[*ndx], 0, sizeof(struct magic) * alloc_incr);
! 	    alloc_incr *= 2;
  	}
  	m = &magic[*ndx];
  	m->flag = 0;
diff -c file/file.h file.new/file.h
*** file/file.h	Wed Jul 21 12:37:00 1999
--- file.new/file.h	Wed Jul 21 12:35:40 1999
***************
*** 35,41 ****
  #ifndef HOWMANY
  # define HOWMANY 8192		/* how much of the file to look at */
  #endif
! #define MAXMAGIS 1000		/* max entries in /etc/magic */
  #define MAXDESC	50		/* max leng of text description */
  #define MAXstring 32		/* max leng of "string" types */
  
--- 35,41 ----
  #ifndef HOWMANY
  # define HOWMANY 8192		/* how much of the file to look at */
  #endif
! #define MAXMAGIS 5000		/* max entries in /etc/magic */
  #define MAXDESC	50		/* max leng of text description */
  #define MAXstring 32		/* max leng of "string" types */
  

--------------8B774876DAB92991344CB5AD--






