From owner-freebsd-hackers  Mon Jul 21 06:50:27 1997
Return-Path: <owner-freebsd-hackers>
Received: (from root@localhost)
          by hub.freebsd.org (8.8.5/8.8.5) id GAA14091
          for hackers-outgoing; Mon, 21 Jul 1997 06:50:27 -0700 (PDT)
Received: from ocean.campus.luth.se (ocean.campus.luth.se [130.240.194.116])
          by hub.freebsd.org (8.8.5/8.8.5) with ESMTP id GAA14082
          for <freebsd-hackers@FreeBSD.ORG>; Mon, 21 Jul 1997 06:50:24 -0700 (PDT)
Received: (from karpen@localhost)
	by ocean.campus.luth.se (8.8.5/8.8.5) id PAA09544;
	Mon, 21 Jul 1997 15:52:29 +0200 (CEST)
From: Mikael Karpberg <karpen@ocean.campus.luth.se>
Message-Id: <199707211352.PAA09544@ocean.campus.luth.se>
Subject: Re: utmp/wtmp interface
In-Reply-To: <199707211056.UAA22881@genesis.atrad.adelaide.edu.au> from Michael Smith at "Jul 21, 97 08:26:41 pm"
To: msmith@atrad.adelaide.edu.au (Michael Smith)
Date: Mon, 21 Jul 1997 15:52:29 +0200 (CEST)
Cc: davidn@labs.usn.blaze.net.au, msmith@atrad.adelaide.edu.au,
        freebsd-hackers@FreeBSD.ORG
X-Mailer: ELM [version 2.4ME+ PL31H (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-hackers@FreeBSD.ORG
X-Loop: FreeBSD.org
Precedence: bulk


Ok... I just have to comment on this...

According to Michael Smith:
[... snip ...]
> > Could you explain that further? What is a "self-describing format"?
> 
> A file format which contains meta-data describing the layout of the
> file.  Eg. at the head of the file, start with a record containing
> the length of records in the file, followed by field, length tuples 
> describing the type and length of each of the fields in a record.
> 
> For utmp, you might do :
> 
> 0x0000	0x00000100	# 256-byte records
> 0x0004  0x00000001	# username
> 0x0008	0x00000020	# 32 bytes
> 0x000c	0x00000002	# login time
> 0x0010	0x00000004	# 4 bytes
> 0x0014	0x00000003	# source host IP
> 0x0018	0x00000004	# 4 bytes
> 0x001c	0x00000004	# source hostname
> 0x0020	0x00000080	# 128 bytes
> 
> Parsing this is very straightforward, as is using it to obtain the
> fields you want.  You can add new fields to the file, change the size
> of fields, etc. without ever breaking binary compatibility again.
> Because records in the file are fixed in size, you can treat it as a
> random-access file too.

This sounds like an excellent idea to me. The library code for this would
always be 100% backwards compatible if written correctly with the correct
API towards the library user. One thing though:

If you're going to label the fields (1, 2, 3, etc) then why not do it in a
logical fashion instead of some annoying enumeration? You have four bytes,
so use them! :-)

0x0000	0x00000100		# 256-byte records
0x0004  'U', 'S', 'E', 'R'	# USERname
0x0008	0x00000020		# 32 bytes
0x000c	'I', 'T', 'I', 'M'	# logIn TIMe
0x0010	0x00000004		# 4 bytes
0x0014	'H', '_', 'I', 'P'	# source Host IP
0x0018	0x00000004		# 4 bytes
0x001c	'H', 'N', 'A', 'M'	# source HostNAMe
0x0020	0x00000080		# 128 bytes

Then the function can take a string argument, of which only the first four
characters are used, which makes code more readable.

ptr = get_an_entry("USER");
ptr = get_an_entry("H_IP");
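
To make the idea concrete, here is a rough sketch of how such a lookup
might work. It assumes the header has already been read into an array of
(tag, length) pairs and that fields sit back to back in a record in the
order they are described; neither is settled above. The names field_desc
and find_field are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory form of one (tag, length) pair from the header. */
struct field_desc {
    char     tag[4];   /* four-character field name, e.g. "USER" */
    uint32_t length;   /* field length in bytes */
};

/*
 * Find the field named by the first four characters of `name`.
 * On success, store the field's offset within a record and its length,
 * and return 0. Return -1 if the name is shorter than four characters
 * or no such field exists.
 */
static int
find_field(const struct field_desc *fields, size_t nfields,
           const char *name, size_t *offset, uint32_t *length)
{
    size_t off = 0;
    size_t i;

    if (strlen(name) < 4)
        return -1;
    for (i = 0; i < nfields; i++) {
        if (memcmp(fields[i].tag, name, 4) == 0) {
            *offset = off;
            *length = fields[i].length;
            return 0;
        }
        off += fields[i].length;   /* assumed: fields are packed in order */
    }
    return -1;
}
```

With the layout above, looking up "H_IP" would give offset 36 and length 4,
and a program that only knows "USER" keeps working when new fields are
added after it.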

Now... this all sounds great to me, but I might have missed something.
Any reactions?

[... snip ...]
> > These calls are not intended to be reentrant. There are similar 'problems'
> > with all of the get{pw,gr}*() routines, for example. Calling any of these
> > functions from separate threads is something you just don't do, and for
> > the life of me I can't think of a single reason why you'd ever need to
> > in lastlog's case. :)
> 
> Call it a matter of principle.  8)  I realise that the alternative is a
> little more work.

Actually, libc should NOT be reentrant, should it? Only libc_r should be,
and it will be linked in when you give -thread to the compiler/linker. Or is
this not in libc? :-)

If you want the functions reentrant, just do it like this?

  ptr = internal_thread_specific_data[my_thread_id][this_function_id];
  if (ptr == NULL) {
    ptr = malloc(BUFFERSIZE_FOR_THIS_FUNCTION);
    internal_thread_specific_data[my_thread_id][this_function_id] = ptr;
  }
  ... go ahead and use the buffer ...

That way you can return a static buffer per thread, so it's threadsafe.
The buffer returned will be overwritten by each call, so it is only
allocated once. Therefore it's no big speed loss. If it's just run once, you
really don't need it to be any quicker than that, and if it's run a lot, you
just pay the penalty for a malloc once.
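
For what it's worth, the same per-thread buffer idea could be written with
POSIX thread-specific data instead of a hand-rolled table. This is only a
sketch, assuming a pthreads implementation is available; BUFFER_SIZE and
get_thread_buffer are made-up names, not anything that exists in libc_r.

```c
#include <pthread.h>
#include <stdlib.h>

#define BUFFER_SIZE 256   /* hypothetical size of this function's result */

static pthread_key_t  buffer_key;
static pthread_once_t buffer_key_once = PTHREAD_ONCE_INIT;
static int            buffer_key_ok;

/* Create the key once; free() is run on each thread's buffer at exit. */
static void
make_buffer_key(void)
{
    buffer_key_ok = (pthread_key_create(&buffer_key, free) == 0);
}

/*
 * Return the calling thread's private buffer, allocating it on first use.
 * Returns NULL if the key or the buffer could not be set up.
 */
static char *
get_thread_buffer(void)
{
    char *buf;

    (void)pthread_once(&buffer_key_once, make_buffer_key);
    if (!buffer_key_ok)
        return NULL;

    buf = pthread_getspecific(buffer_key);
    if (buf == NULL) {
        buf = malloc(BUFFER_SIZE);
        if (buf == NULL)
            return NULL;
        if (pthread_setspecific(buffer_key, buf) != 0) {
            free(buf);
            return NULL;
        }
    }
    return buf;
}
```

Each thread pays for one malloc on its first call and gets the same buffer
back on every later call, and the destructor passed to pthread_key_create
releases it when the thread exits.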

   /Mikael