From owner-freebsd-arch Sun Mar 31 14: 7:35 2002 Delivered-To: freebsd-arch@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id 74BAA37B417 for ; Sun, 31 Mar 2002 14:07:22 -0800 (PST) Received: by flood.ping.uio.no (Postfix, from userid 2602) id 4B85B5346; Mon, 1 Apr 2002 00:07:19 +0200 (CEST) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. To: arch@freebsd.org Subject: mutex profiling From: Dag-Erling Smorgrav Date: 01 Apr 2002 00:07:18 +0200 Message-ID: Lines: 87 User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG --=-=-= The attached patch (derived from patches by Eivind) adds code to record statistics about MTX_DEF locks. It's currently i386-only as it uses the TSC to measure the amount of time each mutex is held. Once compiled in and enabled, statistics can be retrieved using sysctl(8). The code records four numbers for each mutex (longest time held in one go, total time held, number of acquisitions, average time held) and returns a top-16 list for each. Each mutex is identified by the place where it was first acquired, so multiple instances of the same type of mutex (e.g. proc lock, filedesc lock) are counted as one. 
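As a rough illustration of the bookkeeping described above, here is a minimal userland sketch in C; it is not the patch itself. The names `site_stats` and `site_record` are hypothetical, and the timestamps are plain integers standing in for TSC readings. It only shows how one hold interval could be folded into the four per-site numbers.

```c
#include <stdint.h>

/* Hypothetical model of the record kept for one acquisition site. */
struct site_stats {
    uint64_t max;    /* longest single hold, in cycles */
    uint64_t total;  /* sum of all hold times, in cycles */
    uint64_t count;  /* number of recorded acquisitions */
    uint64_t avg;    /* total / count, refreshed on every update */
};

/*
 * Fold one hold interval into the record.  An interval whose end is not
 * after its start is skipped, so the counters never see a zero or
 * negative duration.
 */
static void
site_record(struct site_stats *s, uint64_t acquired, uint64_t released)
{
    uint64_t held;

    if (released <= acquired)
        return;
    held = released - acquired;
    if (held > s->max)
        s->max = held;
    s->total += held;
    s->count += 1;
    s->avg = s->total / s->count;
}
```

Feeding a zeroed record the intervals (100, 150) and (200, 500) leaves max at 300, total at 350, count at 2 and average at 175; a third interval of (900, 800) would be ignored.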
Here's a sample listing (with some annotations):

des@des ~% sysctl kern.mtx.trace
kern.mtx.trace.enable: 1
kern.mtx.trace.max:     20880774 kern/kern_fork.c:467
    12632714 dev/sound/pcm/channel.c:677
    10653798 i386/i386/machdep.c:1715
     2222570 kern/kern_descrip.c:748
     1101574 pci/if_xl.c:1259
      812677 vm/uma_core.c:232
      723969 dev/random/yarrow.c:265
      523384 vm/uma_core.c:1179
      500278 kern/kern_lock.c:227
      427118 kern/vfs_subr.c:936
      357190 kern/vfs_syscalls.c:1692
      141822 vm/uma_core.c:985
       79442 kern/vfs_lookup.c:149
       65260 dev/sound/pcm/sound.c:134
       52277 vm/uma_core.c:1301
       36987 kern/kern_fork.c:535
kern.mtx.trace.total:   5472784414 i386/i386/machdep.c:1715     # Giant
  1155767274 kern/kern_descrip.c:748      # filedesc lock
   400100257 kern/kern_fork.c:467         # proc lock
   352664142 kern/kern_lock.c:227
   229755648 kern/sys_generic.c:800       # select(2) lock
   226619967 kern/vfs_syscalls.c:1692     # file lock
   116591756 vm/uma_core.c:1179           # zone lock
   113001343 kern/vfs_subr.c:1857         # vnode lock
    68279100 kern/kern_descrip.c:1108
    28700835 pci/if_xl.c:1259
    25726912 kern/vfs_lookup.c:149
    25351235 kern/vfs_subr.c:1805
    24323335 dev/sound/pcm/channel.c:677
    21193642 kern/kern_sx.c:147
    20097794 kern/vfs_subr.c:1788
    18994392 kern/subr_mbuf.c:452
kern.mtx.trace.count:      1780724 kern/kern_fork.c:467         # proc lock
     1127992 i386/i386/machdep.c:1715     # Giant
      753901 kern/kern_lock.c:227
      592171 kern/sys_generic.c:800       # select(2) lock
      438879 vm/uma_core.c:1179           # zone lock
      419580 kern/kern_descrip.c:748      # filedesc lock
      407064 kern/vfs_syscalls.c:1692
      258866 kern/vfs_subr.c:1857
      104235 kern/kern_sx.c:147
       86810 kern/vfs_subr.c:1788
       61475 kern/vfs_subr.c:1805
       52572 kern/subr_mbuf.c:452
       36992 kern/vfs_vnops.c:762
       36415 sys/buf.h:278
       28884 kern/kern_resource.c:900
       28421 kern/kern_descrip.c:1108
kern.mtx.trace.average:      3040416 dev/sound/pcm/channel.c:677
      126265 vm/uma_core.c:985            # zone lock
       38844 dev/random/yarrow.c:265
       11743 pci/if_xl.c:1259
        8335 dev/sound/pcm/sound.c:134
        6292 kern/kern_fork.c:535
        6069 kern/subr_disklabel.c:95
        4852 i386/i386/machdep.c:1715
        4554 kern/vfs_subr.c:936
        3575 vm/uma_core.c:232
        3337 vm/uma_core.c:1761
        2787 vm/swap_pager.c:306
        2754 kern/kern_descrip.c:748
        2552 net/bpf.c:254
        2402 kern/kern_descrip.c:1108
        2321 kern/sys_pipe.c:1302

DES
--
Dag-Erling Smorgrav - des@ofug.org

--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=mutex_trace.diff

Index: sys/_mutex.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/_mutex.h,v
retrieving revision 1.4
diff -u -r1.4 _mutex.h
--- sys/_mutex.h 18 Dec 2001 00:27:18 -0000 1.4
+++ sys/_mutex.h 23 Mar 2002 17:41:05 -0000
@@ -41,6 +41,10 @@
     volatile u_int mtx_recurse; /* number of recursive holds */
     TAILQ_HEAD(, thread) mtx_blocked; /* threads blocked on this lock */
     LIST_ENTRY(mtx) mtx_contested; /* list of all contested locks */
+#ifdef TRACE_MUTEX_TIME
+    u_int64_t tsc;
+    char name[64];
+#endif
 };

 #endif /* !_SYS_MUTEX_TYPES_H_ */
Index: kern/kern_mutex.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_mutex.c,v
retrieving revision 1.83
diff -u -r1.83 kern_mutex.c
--- kern/kern_mutex.c 27 Mar 2002 09:23:38 -0000 1.83
+++ kern/kern_mutex.c 31 Mar 2002 21:42:56 -0000
@@ -37,17 +37,18 @@
 #include "opt_ddb.h"
 #include
+#include
 #include
 #include
+#include
 #include
 #include
 #include
 #include
 #include
+#include
 #include
-#include
 #include
-#include
 #include
 #include
@@ -196,6 +197,121 @@
     }
 }

+#ifdef TRACE_MUTEX_TIME
+#define TRACE_HASH_SIZE 1009
+SYSCTL_NODE(_kern, OID_AUTO, mtx, CTLFLAG_RD, NULL, "mutex manipulation");
+SYSCTL_NODE(_kern_mtx, OID_AUTO, trace, CTLFLAG_RD, NULL, "mutex tracing");
+static int kern_mtx_trace_enable = 0;
+SYSCTL_INT(_kern_mtx_trace, OID_AUTO, enable, CTLFLAG_RW,
+    &kern_mtx_trace_enable, 0, "Enable tracing of mutex holdtime");
+static int kern_mtx_trace_numtraced = 0;
+#ifdef DIAGNOSTICS
+SYSCTL_INT(_kern_mtx_trace, OID_AUTO, numtraced, CTLFLAG_RD,
+    &kern_mtx_trace_numtraced,
+    0, "Number of traced mutices");
+static int kern_mtx_trace_hashsize = TRACE_HASH_SIZE;
+SYSCTL_INT(_kern_mtx_trace, OID_AUTO, hashsize, CTLFLAG_RD,
+    &kern_mtx_trace_hashsize, 0, "Trace hash size");
+static int kern_mtx_trace_collisions = 0;
+SYSCTL_INT(_kern_mtx_trace, OID_AUTO, collisions, CTLFLAG_RD,
+    &kern_mtx_trace_collisions, 0, "Number of hash collisions");
+#endif
+
+static struct mutex_trace {
+    char name[64];
+#define TRACE_MAX 0
+#define TRACE_TOT 1
+#define TRACE_CNT 2
+#define TRACE_AVG 3
+    u_int64_t tsc[4];
+    struct mutex_trace *next;
+} *mutex_traces[TRACE_HASH_SIZE];
+
+static struct mtx trace_mtx;
+
+static void
+mutex_trace_init(void *arg __unused)
+{
+    mtx_init(&trace_mtx, "mutex trace lock", MTX_SPIN|MTX_QUIET);
+}
+SYSINIT(mtxtraceinit, SI_SUB_LOCK, SI_ORDER_ANY, mutex_trace_init, NULL);
+
+/*
+ * tracebuf is the pool of trace records we have;
+ * we cannot (reliably) handle more than this number
+ * of places in the code where mutexes are acquired.
+ *
+ * We use a static allocation to avoid interacting
+ * with the rest of the system.
+ */
+#define NUM_TRACE_BUFFERS 1000
+static struct mutex_trace tracebuf[NUM_TRACE_BUFFERS];
+static int first_free_mutex_trace = 0;
+
+static const char *unknown = "(unknown)";
+
+/* Number of locks to show in the sysctl. */
+#define NUM_LOCKS_TO_DUMP 16
+
+/*
+ * Output statistics of mutex holders that keep the mutex the longest.
+ */
+static int dump_mtx_trace(SYSCTL_HANDLER_ARGS);
+SYSCTL_PROC(_kern_mtx_trace, OID_AUTO, max, CTLTYPE_STRING|CTLFLAG_RD,
+    NULL, TRACE_MAX, dump_mtx_trace, "A", "");
+SYSCTL_PROC(_kern_mtx_trace, OID_AUTO, total, CTLTYPE_STRING|CTLFLAG_RD,
+    NULL, TRACE_TOT, dump_mtx_trace, "A", "");
+SYSCTL_PROC(_kern_mtx_trace, OID_AUTO, count, CTLTYPE_STRING|CTLFLAG_RD,
+    NULL, TRACE_CNT, dump_mtx_trace, "A", "");
+SYSCTL_PROC(_kern_mtx_trace, OID_AUTO, average, CTLTYPE_STRING|CTLFLAG_RD,
+    NULL, TRACE_AVG, dump_mtx_trace, "A", "");
+static int
+dump_mtx_trace(SYSCTL_HANDLER_ARGS)
+{
+    struct sbuf *sb;
+    struct mutex_trace *mtp;
+    int traces[NUM_LOCKS_TO_DUMP];
+    int error, i, j, k, n;
+
+    if (kern_mtx_trace_numtraced == 0)
+        return SYSCTL_OUT(req, "No locking recorded",
+            sizeof("No locking recorded"));
+    n = arg2;
+
+    /*
+     * Find the NUM_LOCKS_TO_DUMP largest
+     */
+    mtx_lock_spin(&trace_mtx);
+    for (i = 0; i < NUM_LOCKS_TO_DUMP; ++i)
+        traces[i] = -1;
+    for (k = 0; k < first_free_mutex_trace; ++k) {
+        for (i = 0; i < NUM_LOCKS_TO_DUMP; ++i) {
+            if (traces[i] == -1) {
+                traces[i] = k;
+                break;
+            }
+            if (tracebuf[k].tsc[n] < tracebuf[traces[i]].tsc[n])
+                continue;
+            for (j = NUM_LOCKS_TO_DUMP - 1; j > i; --j)
+                traces[j] = traces[j - 1];
+            traces[i] = k;
+            break;
+        }
+    }
+
+    /* Now dump the garbage */
+    sb = sbuf_new(NULL, NULL, 256, SBUF_AUTOEXTEND);
+    for (i = 0; i < NUM_LOCKS_TO_DUMP && traces[i] != -1; ++i) {
+        mtp = &tracebuf[traces[i]];
+        sbuf_printf(sb, "%12llu %s\n", mtp->tsc[n], mtp->name);
+    }
+    sbuf_finish(sb);
+    mtx_unlock_spin(&trace_mtx);
+    error = SYSCTL_OUT(req, sbuf_data(sb), sbuf_len(sb) + 1);
+    sbuf_delete(sb);
+    return (error);
+}
+#endif /* TRACE_MUTEX_TIME */
+
 /*
  * Function versions of the inlined __mtx_* macros. These are used by
  * modules and can also be called from assembly language if needed.
@@ -209,6 +325,18 @@
     LOCK_LOG_LOCK("LOCK", &m->mtx_object, opts, m->mtx_recurse, file,
         line);
     WITNESS_LOCK(&m->mtx_object, opts | LOP_EXCLUSIVE, file, line);
+#ifdef TRACE_MUTEX_TIME
+    m->tsc = kern_mtx_trace_enable ? rdtsc() : 0;
+    if (m->name[0] == '\0') {
+        const char *p = file;
+
+        while (p && strncmp(p, "../", 3) == 0)
+            p += 3;
+        if (p == NULL || *p == '\0')
+            p = unknown;
+        snprintf(m->name, sizeof m->name, "%s:%d", p, line);
+    }
+#endif
 }

 void
@@ -217,6 +345,53 @@

     MPASS(curthread != NULL);
     mtx_assert(m, MA_OWNED);
+#ifdef TRACE_MUTEX_TIME
+    if (m->tsc != 0 && m->name[0] != '\0') {
+        struct mutex_trace *mtp;
+        u_int64_t tsc, mtsc;
+        volatile u_int hash;
+        char *p;
+
+        tsc = rdtsc();
+        mtsc = m->tsc;
+        m->tsc = 0;
+        if (tsc <= mtsc)
+            goto out;
+        hash = 0;
+        for (p = m->name; *p != '\0'; ++p)
+            hash = ((hash << 1) | *p) % TRACE_HASH_SIZE;
+        mtx_lock_spin(&trace_mtx);
+        for (mtp = mutex_traces[hash]; mtp != NULL; mtp = mtp->next)
+            if (strcmp(mtp->name, m->name) == 0)
+                break;
+        if (mtp == NULL) {
+            /* Just exit if we cannot get a trace buffer */
+            if (first_free_mutex_trace >= NUM_TRACE_BUFFERS)
+                goto unlock;
+            mtp = &tracebuf[first_free_mutex_trace++];
+            strcpy(mtp->name, m->name);
+#ifdef DIAGNOSTICS
+            if (mutex_traces[hash] != NULL)
+                ++kern_mtx_trace_collisions;
+#endif
+            mtp->next = mutex_traces[hash];
+            mutex_traces[hash] = mtp;
+            ++kern_mtx_trace_numtraced;
+        }
+        /*
+         * Record if the mutex has been held longer now than ever
+         * before
+         */
+        if ((tsc - mtsc) > mtp->tsc[TRACE_MAX])
+            mtp->tsc[TRACE_MAX] = tsc - mtsc;
+        mtp->tsc[TRACE_TOT] += tsc - mtsc;
+        mtp->tsc[TRACE_CNT] += 1;
+        mtp->tsc[TRACE_AVG] = mtp->tsc[TRACE_TOT] / mtp->tsc[TRACE_CNT];
+unlock:
+        mtx_unlock_spin(&trace_mtx);
+    }
+out:
+#endif
     WITNESS_UNLOCK(&m->mtx_object, opts | LOP_EXCLUSIVE, file, line);
     LOCK_LOG_LOCK("UNLOCK", &m->mtx_object, opts, m->mtx_recurse, file,
         line);
Index: conf/options.i386
===================================================================
RCS file:
/home/ncvs/src/sys/conf/options.i386,v retrieving revision 1.169 diff -u -r1.169 options.i386 --- conf/options.i386 27 Feb 2002 09:51:31 -0000 1.169 +++ conf/options.i386 23 Mar 2002 18:04:45 -0000 @@ -188,6 +188,9 @@ # SMB/CIFS filesystem SMBFS +# Mutex profiling +TRACE_MUTEX_TIME opt_global.h + # ------------------------------- # EOF # ------------------------------- --=-=-=-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sun Mar 31 14:13:18 2002 Delivered-To: freebsd-arch@freebsd.org Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by hub.freebsd.org (Postfix) with ESMTP id 682E837B41B for ; Sun, 31 Mar 2002 14:13:15 -0800 (PST) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.2/8.12.2) with ESMTP id g2VMCoe7029647; Mon, 1 Apr 2002 00:12:50 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: Dag-Erling Smorgrav Cc: arch@FreeBSD.ORG Subject: Re: mutex profiling In-Reply-To: Your message of "01 Apr 2002 00:07:18 +0200." Date: Mon, 01 Apr 2002 00:12:50 +0200 Message-ID: <29646.1017612770@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG With the footnote that the TSC's are not synchronized on SMP systems, this looks like a nice initial tool to get some kind of picture of the locking situation in the kernel. Poul-Henning In message , Dag-Erling Smorgrav writes: >--=-=-= > >The attached patch (derived from patches by Eivind) adds code to >record statistics about MTX_DEF locks. It's currently i386-only as it >uses the TSC to measure the amount of time each mutex is held. 
Once -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sun Mar 31 14:14:49 2002 Delivered-To: freebsd-arch@freebsd.org Received: from beastie.mckusick.com (beastie.mckusick.com [209.31.233.184]) by hub.freebsd.org (Postfix) with ESMTP id 566B737B41D for ; Sun, 31 Mar 2002 14:14:46 -0800 (PST) Received: from beastie.mckusick.com (localhost [127.0.0.1]) by beastie.mckusick.com (8.11.6/8.11.6) with ESMTP id g2VMEWD07500; Sun, 31 Mar 2002 14:14:32 -0800 (PST) (envelope-from mckusick@beastie.mckusick.com) Message-Id: <200203312214.g2VMEWD07500@beastie.mckusick.com> To: Garance A Drosihn Subject: Re: UFS snapshots in current Cc: Alfred Perlstein , arch@FreeBSD.ORG In-Reply-To: Your message of "Sun, 31 Mar 2002 01:50:10 EST." Date: Sun, 31 Mar 2002 14:14:31 -0800 From: Kirk McKusick Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Date: Sun, 31 Mar 2002 01:50:10 -0500 To: Alfred Perlstein From: Garance A Drosihn Subject: Re: UFS snapshots in current Cc: Kirk McKusick , arch@FreeBSD.ORG At 8:50 PM -0800 3/30/02, Alfred Perlstein wrote: >Looks like you hit one of the snapshot deadlock conditions, >Dr McKusick recently introduced a fix for one of the >deadlocks so this may not happen again... In the past week? The "old" current system which I had been running should have been from March 26th, when I rebuilt it to see if it would fix my vmware problems (which it did...). I don't see any commits wrt softupdates which are that recent. 
--
Garance Alistair Drosehn            =   gad@eclipse.acs.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu

All of the above is basically correct. Your use of a snapshot over a long period of time and in particular over a CVS update, build world, and install world should not cause trouble (other than perhaps a large amount of disk space being used). There is no problem with rebooting, provided that the filesystem with the snapshot is cleanly unmounted. Snapshots can be made to survive panics by setting `sysctl -w debug.dopersistence=1' but at a non-trivial performance cost.

The problem that you encountered was most likely a deadlock in the snapshot code. I fixed one deadlock last January, but am aware of at least one more that is still there. I have a fairly good idea on how to fix it, but have not yet had the time to work on that fix.

On your final question about making a pax archive, if you make an archive of the real filesystem, the snapshot will show up on the archive as a file the size of the filesystem partition. If you mount the snapshot and then make an archive of that filesystem, then the snapshot(s) in the archive will show up as zero length files.

Kirk McKusick To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sun Mar 31 17:28:11 2002 Delivered-To: freebsd-arch@freebsd.org Received: from wantadilla.lemis.com (wantadilla.lemis.com [192.109.197.80]) by hub.freebsd.org (Postfix) with ESMTP id 9C45537B416 for ; Sun, 31 Mar 2002 17:28:06 -0800 (PST) Received: by wantadilla.lemis.com (Postfix, from userid 1004) id 7E4E67830D; Mon, 1 Apr 2002 10:58:04 +0930 (CST) Date: Mon, 1 Apr 2002 10:58:04 +0930 From: Greg 'groggy' Lehey To: Dag-Erling Smorgrav Cc: arch@freebsd.org Subject: Re: mutex profiling Message-ID: <20020401105804.B26813@wantadilla.lemis.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.3.23i Organization: The FreeBSD Project Phone: +61-8-8388-8286 Fax: +61-8-8388-8725 Mobile: +61-418-838-708 WWW-Home-Page: http://www.FreeBSD.org/ X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF 13 24 52 F8 6D A4 95 EF Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Monday, 1 April 2002 at 0:07:18 +0200, Dag-Erling Smorgrav wrote: > The attached patch (derived from patches by Eivind) adds code to > record statistics about MTX_DEF locks. It's currently i386-only as it > uses the TSC to measure the amount of time each mutex is held. Once > compiled in and enabled, statistics can be retrieved using sysctl(8). > The code records four numbers for each mutex (longest time held in one > go, total time held, number of acquisitions, average time held) and > returns a top-16 list for each. Each mutex is identified by the place > where it was first acquired, so multiple instances of the same type of > mutex (e.g. proc lock, filedesc lock) are counted as one. 
Here's a > sample listing (with some annotations): Excellent! Of course, it could be better :-) It would be nice to get a list by lock of each of the four parameters, something like: max total count average i386/i386/machdep.c:1715 10653798 5472784414 1127992 4852 (repeat for each lock) One of the things that I can't recall anybody looking at has been whether to spin or block on each kind of lock. This information would help make that decision. What units are the times in? If the average time spent in Giant is 4852 µs, I'd say it's definitely a waste of time to spin on it at all. If it's 4852 ns, it's probably the correct thing to do. Greg -- See complete headers for address and phone numbers To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sun Mar 31 18:12:16 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail.rpi.edu (mail.rpi.edu [128.113.22.40]) by hub.freebsd.org (Postfix) with ESMTP id E00B937B41F for ; Sun, 31 Mar 2002 18:12:11 -0800 (PST) Received: from [128.113.24.47] (gilead.acs.rpi.edu [128.113.24.47]) by mail.rpi.edu (8.12.1/8.12.1) with ESMTP id g312C8CL493732; Sun, 31 Mar 2002 21:12:09 -0500 Mime-Version: 1.0 X-Sender: drosih@mail.rpi.edu Message-Id: In-Reply-To: <200203312214.g2VMEWD07500@beastie.mckusick.com> References: <200203312214.g2VMEWD07500@beastie.mckusick.com> Date: Sun, 31 Mar 2002 21:12:13 -0500 To: Kirk McKusick From: Garance A Drosihn Subject: Re: UFS snapshots in current Cc: arch@FreeBSD.ORG Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-Scanned-By: MIMEDefang 2.3 (www dot roaringpenguin dot com slash mimedefang) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG At 2:14 PM -0800 3/31/02, Kirk McKusick wrote: >All of the above is basically correct. 
Your use of a snapshot >over a long period of time and in particular over a CVS update, >build world, and install world should not cause trouble (other >than perhaps a large amount of disk space being used). There >is no problem with rebooting, provided that the filesystem with >the snapshot is cleanly unmounted. I really think this snapshot capability is great. Combined with the huge disks we can buy these days, I think snapshots will be useful in many ways that we're not even thinking of yet. >I fixed one deadlock last January, but am aware of at least >one more that is still there. I have a fairly good idea on how >to fix it, but have not yet had the time to work on that fix. Okay, thanks. >On your final question about making a pax archive, if you make >an archive of the real filesystem, the snapshot will show up >on the archive as a file the size of the filesystem partition. >If you mount the snapshot and then make an archive of that >filesystem, then the snapshot(s) in the archive will show up >as zero length files. Hmm. Is there any way for a regular user-land process to tell if a given file is a snapshot? Something in the stat() info, or some other way to tell? I have no urgent need for it, but it seems like it would be useful. 
-- Garance Alistair Drosehn = gad@eclipse.acs.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sun Mar 31 18:38:42 2002 Delivered-To: freebsd-arch@freebsd.org Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by hub.freebsd.org (Postfix) with ESMTP id 3083837B405 for ; Sun, 31 Mar 2002 18:38:40 -0800 (PST) Received: by elvis.mu.org (Postfix, from userid 1192) id 0BB1AAE027; Sun, 31 Mar 2002 18:38:40 -0800 (PST) Date: Sun, 31 Mar 2002 18:38:39 -0800 From: Alfred Perlstein To: Garance A Drosihn Cc: Kirk McKusick , arch@FreeBSD.ORG Subject: Re: UFS snapshots in current Message-ID: <20020401023839.GR93885@elvis.mu.org> References: <200203312214.g2VMEWD07500@beastie.mckusick.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.27i Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG * Garance A Drosihn [020331 18:12] wrote: > > Hmm. Is there any way for a regular user-land process to tell > if a given file is a snapshot? Something in the stat() info, > or some other way to tell? I have no urgent need for it, but > it seems like it would be useful. This probably isn't exactly what you're looking for, but you could check the file's ctime against the ctime of the snapshot file. -- -Alfred Perlstein [alfred@freebsd.org] 'Instead of asking why a piece of software is using "1970s technology," start asking why software is ignoring 30 years of accumulated wisdom.' 
Tax deductible donations for FreeBSD: http://www.freebsdfoundation.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sun Mar 31 18:48:59 2002 Delivered-To: freebsd-arch@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id D5F8C37B417 for ; Sun, 31 Mar 2002 18:48:45 -0800 (PST) Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.11.6/8.11.6) with SMTP id g312mQw00952; Sun, 31 Mar 2002 21:48:27 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Sun, 31 Mar 2002 21:48:26 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: Garance A Drosihn Cc: Kirk McKusick , arch@FreeBSD.ORG Subject: Re: UFS snapshots in current In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sun, 31 Mar 2002, Garance A Drosihn wrote: > Hmm. Is there any way for a regular user-land process to tell if a > given file is a snapshot? Something in the stat() info, or some other > way to tell? I have no urgent need for it, but it seems like it would > be useful. Look for the SF_SNAPSHOT flag. I don't recall if this is exported via the flags field via stat(), but it may well be. 
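A minimal sketch of such a check from userland, assuming the flag really is visible in the `st_flags` field as suggested above. The helper names `flags_mark_snapshot` and `path_is_snapshot` are hypothetical. The fallback value for `SF_SNAPSHOT` is assumed to match the one in FreeBSD's `<sys/stat.h>` and means nothing on other systems. The stat(2) path is compiled only on FreeBSD, since not every platform's `struct stat` has an `st_flags` field.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Fallback for headers that do not define the flag.  The value is an
 * assumption taken from FreeBSD's <sys/stat.h>.
 */
#ifndef SF_SNAPSHOT
#define SF_SNAPSHOT 0x00200000
#endif

/* Hypothetical helper: 1 if the flags word carries the snapshot bit. */
static int
flags_mark_snapshot(unsigned long flags)
{
    return ((flags & SF_SNAPSHOT) != 0);
}

/*
 * Hypothetical helper: 1 if path is a snapshot file, 0 if it is not,
 * -1 if stat(2) fails or the platform has no st_flags to inspect.
 */
static int
path_is_snapshot(const char *path)
{
#if defined(__FreeBSD__)
    struct stat sb;

    if (stat(path, &sb) == -1)
        return (-1);
    return (flags_mark_snapshot(sb.st_flags));
#else
    (void)path;
    return (-1);
#endif
}
```

Keeping the bit test separate from the stat(2) call means the flag logic can be exercised on any system, while only the FreeBSD branch touches the filesystem.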
Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services

From owner-freebsd-arch Sun Mar 31 20:15:18 2002
Delivered-To: freebsd-arch@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 821) id 7878337B400; Sun, 31 Mar 2002 20:15:13 -0800 (PST)
Date: Sun, 31 Mar 2002 20:15:13 -0800
From: John De Boskey
To: Arch List
Subject: /bin/ls -T option logic/doc
Message-ID: <20020331201513.A55590@FreeBSD.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

Hi,

The -T option tells ls to display complete time information. However, this option only has an effect when the -l (ell) option is also specified.

Two alternatives:

- Update the manpage to reflect the dependency, as is already done with other options.
- Update the -T option processing to automatically imply the -l option.

I believe the manpage update is probably the safest, but I don't know if it is the most correct. I can't seem to find any standards doc on this issue related to -T. Pointers?

Thanks!
John

ps: Also noted in the ls.1 diff is a spelling correction which can be done regardless of -T.

Index: ls.1
===================================================================
RCS file: /home/ncvs/src/bin/ls/ls.1,v
retrieving revision 1.62
diff -u -r1.62 ls.1
--- ls.1 9 Jan 2002 13:29:39 -0000 1.62
+++ ls.1 1 Apr 2002 03:51:35 -0000
@@ -137,8 +137,12 @@
 .It Fl R
 Recursively list subdirectories encountered.
 .It Fl T
-Display complete time information for the file, including
-month, day, hour, minute, second, and year.
+When used with the +.Fl l +(lowercase letter +.Dq ell ) +option, display complete time information for the file, including +month, day, hour, minute, second, and year. .It Fl W Display whiteouts when scanning directories. .It Fl Z @@ -167,7 +171,7 @@ .Pq Fl l format output. .It Fl h -When used wih the +When used with the .Fl l option, use unit suffixes: Byte, Kilobyte, Megabyte, Gigabyte, Terabyte and Petabyte in order to reduce the number of digits to three or less Index: ls.c =================================================================== RCS file: /home/ncvs/src/bin/ls/ls.c,v retrieving revision 1.56 diff -u -r1.56 ls.c --- ls.c 19 Feb 2002 00:05:50 -0000 1.56 +++ ls.c 1 Apr 2002 03:53:50 -0000 @@ -263,6 +263,8 @@ break; case 'T': f_sectime = 1; + f_longform = 1; + f_singlecol = 0; break; case 't': f_timesort = 1; To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sun Mar 31 20:47:23 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail.rpi.edu (mail.rpi.edu [128.113.22.40]) by hub.freebsd.org (Postfix) with ESMTP id 9C23237B487; Sun, 31 Mar 2002 20:47:03 -0800 (PST) Received: from [128.113.24.47] (gilead.acs.rpi.edu [128.113.24.47]) by mail.rpi.edu (8.12.1/8.12.1) with ESMTP id g314kcCL471458; Sun, 31 Mar 2002 23:46:39 -0500 Mime-Version: 1.0 X-Sender: drosih@mail.rpi.edu Message-Id: In-Reply-To: References: Date: Sun, 31 Mar 2002 23:46:44 -0500 To: Robert Watson From: Garance A Drosihn Subject: Re: UFS snapshots in current Cc: Kirk McKusick , arch@FreeBSD.ORG Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-Scanned-By: MIMEDefang 2.3 (www dot roaringpenguin dot com slash mimedefang) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG At 9:48 PM -0500 3/31/02, Robert Watson wrote: >On Sun, 31 Mar 2002, Garance A Drosihn 
wrote: > > > Hmm. Is there any way for a regular user-land process to > > tell if a given file is a snapshot? Something in the stat() > > info, or some other way to tell? > >Look for the SF_SNAPSHOT flag. I don't recall if this is >exported via the flags field via stat(), but it may well be. It looks like it is. In fact, it looks like we could just change fflagstostr() to check for it, and 'ls -lo' would show 'snap' in the field of interesting flags. This might be a good idea, since a snapshot file is (I assume) truly read-only. (I assume it's like schg, except that you can't even use chflags to make it writable). -- Garance Alistair Drosehn = gad@eclipse.acs.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sun Mar 31 22: 8: 4 2002 Delivered-To: freebsd-arch@freebsd.org Received: from beastie.mckusick.com (beastie.mckusick.com [209.31.233.184]) by hub.freebsd.org (Postfix) with ESMTP id D414C37B400 for ; Sun, 31 Mar 2002 22:07:59 -0800 (PST) Received: from beastie.mckusick.com (localhost [127.0.0.1]) by beastie.mckusick.com (8.11.6/8.11.6) with ESMTP id g31678D07951; Sun, 31 Mar 2002 22:07:08 -0800 (PST) (envelope-from mckusick@beastie.mckusick.com) Message-Id: <200204010607.g31678D07951@beastie.mckusick.com> To: Garance A Drosihn Subject: Re: UFS snapshots in current Cc: arch@FreeBSD.ORG In-Reply-To: Your message of "Sun, 31 Mar 2002 21:12:13 EST." 
Date: Sun, 31 Mar 2002 22:07:08 -0800 From: Kirk McKusick Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Date: Sun, 31 Mar 2002 21:12:13 -0500 To: Kirk McKusick From: Garance A Drosihn Subject: Re: UFS snapshots in current Cc: arch@FreeBSD.ORG At 2:14 PM -0800 3/31/02, Kirk McKusick wrote: >On your final question about making a pax archive, if you make >an archive of the real filesystem, the snapshot will show up >on the archive as a file the size of the filesystem partition. >If you mount the snapshot and then make an archive of that >filesystem, then the snapshot(s) in the archive will show up >as zero length files. Hmm. Is there any way for a regular user-land process to tell if a given file is a snapshot? Something in the stat() info, or some other way to tell? I have no urgent need for it, but it seems like it would be useful. -- Garance Alistair Drosehn = gad@eclipse.acs.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu You can determine if a file is a snapshot by doing a stat and checking for the SF_SNAPSHOT bit being set in the st_flags field. Kirk McKusick To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 3:38: 7 2002 Delivered-To: freebsd-arch@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id B8C3437B41A for ; Mon, 1 Apr 2002 03:38:04 -0800 (PST) Received: by flood.ping.uio.no (Postfix, from userid 2602) id 5843D5346; Mon, 1 Apr 2002 13:38:01 +0200 (CEST) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. 
To: Poul-Henning Kamp Cc: arch@FreeBSD.ORG Subject: Re: mutex profiling References: <29646.1017612770@critter.freebsd.dk> From: Dag-Erling Smorgrav Date: 01 Apr 2002 13:38:01 +0200 In-Reply-To: <29646.1017612770@critter.freebsd.dk> Message-ID: Lines: 10 User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Poul-Henning Kamp writes: > With the footnote that the TSC's are not synchronized on SMP > systems [...] We tried using {,get}nanouptime() instead, but got nothing but zeroes... DES -- Dag-Erling Smorgrav - des@ofug.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 3:41:25 2002 Delivered-To: freebsd-arch@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id 8FB1837B41F; Mon, 1 Apr 2002 03:41:22 -0800 (PST) Received: by flood.ping.uio.no (Postfix, from userid 2602) id 81E795346; Mon, 1 Apr 2002 13:41:21 +0200 (CEST) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. 
To: Greg 'groggy' Lehey Cc: arch@freebsd.org Subject: Re: mutex profiling References: <20020401105804.B26813@wantadilla.lemis.com> From: Dag-Erling Smorgrav Date: 01 Apr 2002 13:41:21 +0200 In-Reply-To: <20020401105804.B26813@wantadilla.lemis.com> Message-ID: Lines: 19 User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Greg 'groggy' Lehey writes: > It would be nice to get a list by lock of each of the four parameters, > something like: > > max total count average > i386/i386/machdep.c:1715 10653798 5472784414 1127992 4852 > (repeat for each lock) Good idea, I'll add that. > One of the things that I can't recall anybody looking at has been > whether to spin or block on each kind of lock. This information would > help make that decision. What units are the times in? Clock cycles. DES -- Dag-Erling Smorgrav - des@ofug.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 4:29: 2 2002 Delivered-To: freebsd-arch@freebsd.org Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by hub.freebsd.org (Postfix) with ESMTP id 2125437B420 for ; Mon, 1 Apr 2002 04:28:57 -0800 (PST) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.2/8.12.2) with ESMTP id g31CSV4F004906; Mon, 1 Apr 2002 14:28:32 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: Dag-Erling Smorgrav Cc: arch@FreeBSD.ORG Subject: Re: mutex profiling In-Reply-To: Your message of "01 Apr 2002 13:38:01 +0200." 
Date: Mon, 01 Apr 2002 14:28:31 +0200 Message-ID: <4905.1017664111@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message , Dag-Erling Smorgrav writes: >Poul-Henning Kamp writes: >> With the footnote that the TSC's are not synchronized on SMP >> systems [...] > >We tried using {,get}nanouptime() instead, but got nothing but >zeroes... nanouptime() should not get you zeros, but it would be slower than TSC. getnanouptime would hopefully give you all zeros. I didn't mean to imply that the TSC was wrong as such, but merely wanted to point to the fact that a mutex locked on one CPU and unlocked on another will (likely) screw up your numbers big time. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 8:11: 3 2002 Delivered-To: freebsd-arch@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id BEBEB37B417; Mon, 1 Apr 2002 08:10:40 -0800 (PST) Received: by flood.ping.uio.no (Postfix, from userid 2602) id 4BBA95348; Mon, 1 Apr 2002 18:10:39 +0200 (CEST) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. 
To: Greg 'groggy' Lehey
Cc: arch@freebsd.org
Subject: Re: mutex profiling
References: <20020401105804.B26813@wantadilla.lemis.com>
From: Dag-Erling Smorgrav
Date: 01 Apr 2002 18:10:38 +0200
In-Reply-To: <20020401105804.B26813@wantadilla.lemis.com>
Message-ID:
Lines: 97
User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

Greg 'groggy' Lehey writes:
> It would be nice to get a list by lock of each of the four parameters,
> something like:
>
>                               max      total   count average
> i386/i386/machdep.c:1715 10653798 5472784414 1127992    4852
> (repeat for each lock)

How's this?

des@des ~% sysctl -n debug.mutex.prof.all
mutex                                    max       total   count
kern/kern_fork.c:467                 6124730   162187907  986568
i386/i386/machdep.c:1715            10024515  2426296920  620041
kern/vfs_syscalls.c:1692               12998   118467144  281628
kern/kern_descrip.c:748                87515   196871456  238729
kern/vfs_vnops.c:762                   12361     8096945   24015
vm/uma_core.c:1179                    441720    98833757  392171
kern/kern_sx.c:147                      8604    19467888   98071
kern/kern_lock.c:227                   13992   228025483  520510
fs/pseudofs/pseudofs_vncache.c:211       862      109653     208
kern/kern_prot.c:1706                   9726    13211102   43701
kern/vfs_subr.c:1788                   10696    16134536   80109
kern/imgact_elf.c:518                  11827     1793896    3801
vm/vm_object.c:599                      1078      561152    2956
kern/kern_sx.c:102                     12378    15707316   29498
kern/kern_resource.c:900                7723     3815878   18642
kern/kern_proc.c:356                    7798      605512     365
kern/kern_proc.c:972                     727      145459     633
kern/vfs_subr.c:1857                   12330    88298631  231204
sys/buf.h:278                           8950    10131632   34047
kern/tty.c:1053                         2071       75200     198
kern/sys_pipe.c:450                     6780     1395976    1410
kern/vfs_subr.c:1805                    9496    22600436   54091
kern/vfs_subr.c:782                     6202     3523460   17028
kern/vfs_subr.c:2356                     761       55381     269
vm/uma_core.c:1701                     11202     1714480    1340
kern/vfs_lookup.c:149                  20364    13832882   21443
ufs/ufs/ufs_dirhash.c:356               7669     1787405    1333
ufs/ufs/ufs_ihash.c:110                 9396     8938672   17295
ufs/ffs/ffs_vfsops.c:1172              10552      642025    4200
kern/kern_lock.c:507                     822      704605    4247
kern/vfs_subr.c:936                   401354     3583794    2351
ufs/ufs/ufs_dirhash.c:158               1568      381173     701
vm/uma_core.c:1301                     39491     1624803    1042
kern/kern_prot.c:1757                   7278     1157954    4514
vm/swap_pager.c:306                    12060     3043719    1512
kern/vfs_subr.c:1739                    6767      463008    1251
kern/sys_pipe.c:229                     7250     5170251   10120
kern/sys_pipe.c:259                     5853     4140334   26902
vm/uma_core.c:1678                      2403       82426      79
kern/subr_eventhandler.c:78             1605       49847      46
kern/kern_descrip.c:1108               21367     7456128    4962
pci/if_xl.c:1259                      945556     3979999      84
kern/subr_mbuf.c:452                    6244     3756251   13490
kern/subr_mbuf.c:577                    2837       11077      59
net/bpf.c:1268                           695         854       2
net/bpf.c:627                           6557       97926      59
net/bpf.c:254                           7413      147862      46
net/if_var.h:294                        3298       36665     140
kern/sys_generic.c:800                 20769    31401570   87233
kern/vfs_subr.c:419                     1758       94701      84
vm/uma_core.c:1761                     17668      671222     167
ufs/ufs/ufs_vnops.c:298                 5371      537428     786
ufs/ufs/ufs_ihash.c:114                 6323      250572     253
dev/random/yarrow.c:265               255551     4751740      63
kern/kern_prot.c:1910                   3961       21217      49
kern/sys_pipe.c:1179                    5596     1660837    1315
kern/tty_tty.c:89                       2645        5455       6
kern/kern_exit.c:285                    3916       29487      16
kern/sys_pipe.c:843                     5238      168056     120
kern/sys_pipe.c:1129                    3111       22629      19
kern/kern_resource.c:864                2124       95751      77
kern/sys_pipe.c:1302                    6326       13639       5
kern/kern_proc.c:372                    9044       42850      29
kern/kern_descrip.c:1577                3511        3511       1
kern/kern_fork.c:562                    2172       27768      39
kern/init_main.c:495                    2527       82565     108
vm/uma_core.c:1887                      3835       42494      48
vm/uma_core.c:232                     727662      876974     154
vm/uma_core.c:1716                       679        3261       5
vm/uma_core.c:985                     113894      534906       5
vm/device_pager.c:156                   3001        5445      19
vm/swap_pager.c:1694                    3306       56747     138
kern/kern_descrip.c:168                13772       47793      25
ufs/ufs/ufs_dirhash.c:495               4150      132187      97
kern/kern_fork.c:535                   27527       59469      12
dev/sound/pcm/channel.c:677         12177241    23561042       8
dev/sound/pcm/sound.c:134              65432      173901      21
kern/kern_descrip.c:264                12207     1311316    9670

(I didn't include average since it can be computed from total and count)

DES
--
Dag-Erling Smorgrav -
des@ofug.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 8:23:22 2002 Delivered-To: freebsd-arch@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id 19D8037B429; Mon, 1 Apr 2002 08:23:11 -0800 (PST) Received: from localhost (arr@localhost) by fledge.watson.org (8.11.6/8.11.6) with SMTP id g31GN2d11126; Mon, 1 Apr 2002 11:23:02 -0500 (EST) (envelope-from arr@FreeBSD.org) X-Authentication-Warning: fledge.watson.org: arr owned process doing -bs Date: Mon, 1 Apr 2002 11:23:01 -0500 (EST) From: "Andrew R. Reiter" X-Sender: arr@fledge.watson.org To: Dag-Erling Smorgrav Cc: "Greg 'groggy' Lehey" , arch@FreeBSD.org Subject: Re: mutex profiling In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 1 Apr 2002, Dag-Erling Smorgrav wrote: :Greg 'groggy' Lehey writes: :> It would be nice to get a list by lock of each of the four parameters, :> something like: :> :> max total count average :> i386/i386/machdep.c:1715 10653798 5472784414 1127992 4852 :> (repeat for each lock) : :How's this? Can we perhaps have the ability to dump the lock char * description? Or are you doing this way b/c you can get the file and line #'s? Looks good, tho. 
Cheers, Andrew : :des@des ~% sysctl -n debug.mutex.prof.all :mutex max total count :[...] : :(I didn't include average since it can be computed from total and :count) : :DES :-- :Dag-Erling Smorgrav - des@ofug.org : :To Unsubscribe: send mail to majordomo@FreeBSD.org :with "unsubscribe freebsd-arch" in the body of the message : -- Andrew R.
Reiter arr@watson.org arr@FreeBSD.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 8:26:45 2002 Delivered-To: freebsd-arch@freebsd.org Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by hub.freebsd.org (Postfix) with ESMTP id 61ACC37B41D; Mon, 1 Apr 2002 08:26:41 -0800 (PST) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.2/8.12.2) with ESMTP id g31GQF4F014352; Mon, 1 Apr 2002 18:26:16 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: Dag-Erling Smorgrav Cc: "Greg 'groggy' Lehey" , arch@FreeBSD.ORG Subject: Re: mutex profiling In-Reply-To: Your message of "01 Apr 2002 18:10:38 +0200." Date: Mon, 01 Apr 2002 18:26:15 +0200 Message-ID: <14351.1017678375@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message , Dag-Erling Smorgrav writes: >How's this? > >des@des ~% sysctl -n debug.mutex.prof.all >mutex max total count >kern/kern_fork.c:467 6124730 162187907 986568 >i386/i386/machdep.c:1715 10024515 2426296920 620041 >kern/vfs_syscalls.c:1692 12998 118467144 281628 >(I didn't include average since it can be computed from total and >count) I would include the average so that people can do sysctl -n debug.mutex.prof.all | sort +4rn -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 8:31:39 2002 Delivered-To: freebsd-arch@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id 11EAA37B405; Mon, 1 Apr 2002 08:31:33 -0800 (PST) Received: by flood.ping.uio.no (Postfix, from userid 2602) id 9C3885348; Mon, 1 Apr 2002 18:31:31 +0200 (CEST) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. To: "Andrew R. Reiter" Cc: "Greg 'groggy' Lehey" , arch@FreeBSD.org Subject: Re: mutex profiling References: From: Dag-Erling Smorgrav Date: 01 Apr 2002 18:31:31 +0200 In-Reply-To: Message-ID: Lines: 12 User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG "Andrew R. Reiter" writes: > Can we perhaps have the ability to dump the lock char * description? Or > are you doing this way b/c you can get the file and line #'s? Looks good, > tho. I can get both; I'm doing it this way because Eivind did it this way and it didn't occur to me to change it. Is one preferable to the other?
DES -- Dag-Erling Smorgrav - des@ofug.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 8:32:24 2002 Delivered-To: freebsd-arch@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id D970A37B419; Mon, 1 Apr 2002 08:32:21 -0800 (PST) Received: by flood.ping.uio.no (Postfix, from userid 2602) id 421555346; Mon, 1 Apr 2002 18:32:20 +0200 (CEST) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. To: Poul-Henning Kamp Cc: "Greg 'groggy' Lehey" , arch@FreeBSD.ORG Subject: Re: mutex profiling References: <14351.1017678375@critter.freebsd.dk> From: Dag-Erling Smorgrav Date: 01 Apr 2002 18:32:19 +0200 In-Reply-To: <14351.1017678375@critter.freebsd.dk> Message-ID: Lines: 13 User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Poul-Henning Kamp writes: > > I would include the average so that people can do > > sysctl -n debug.mutex.prof.all | sort +4rn > > True; then we wouldn't need the four other sysctls at all... 
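As a rough illustration of why a single table with an average column would be enough (a sketch only, not code from the patch): once each row carries max, total and count, the average is one division and any of the four orderings is a sort over the same rows. The struct and function names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userland view of one row of debug.mutex.prof.all. */
struct mprof_row {
    const char *site;   /* file:line where the mutex was first acquired */
    uint64_t max;       /* longest single hold, in clock cycles */
    uint64_t total;     /* total cycles held */
    uint64_t count;     /* number of acquisitions */
};

/* Average hold time in cycles; 0 if the mutex was never acquired. */
static uint64_t
mprof_average(const struct mprof_row *r)
{
    return (r->count == 0 ? 0 : r->total / r->count);
}

/* qsort(3) comparator: largest average first. */
static int
mprof_cmp_average(const void *a, const void *b)
{
    uint64_t x = mprof_average(a);
    uint64_t y = mprof_average(b);

    return ((x < y) - (x > y));
}

/* Sort rows in place, highest average hold time first. */
static void
mprof_sort_by_average(struct mprof_row *rows, size_t n)
{
    assert(rows != NULL || n == 0);
    qsort(rows, n, sizeof(rows[0]), mprof_cmp_average);
}
```

Ordering by max, total or count would only need a different comparator over the same rows.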
DES -- Dag-Erling Smorgrav - des@ofug.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 8:36: 7 2002 Delivered-To: freebsd-arch@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id 39F3B37B41A; Mon, 1 Apr 2002 08:36:03 -0800 (PST) Received: from localhost (arr@localhost) by fledge.watson.org (8.11.6/8.11.6) with SMTP id g31GZtU11290; Mon, 1 Apr 2002 11:35:55 -0500 (EST) (envelope-from arr@FreeBSD.org) X-Authentication-Warning: fledge.watson.org: arr owned process doing -bs Date: Mon, 1 Apr 2002 11:35:54 -0500 (EST) From: "Andrew R. Reiter" X-Sender: arr@fledge.watson.org To: Dag-Erling Smorgrav Cc: "Andrew R. Reiter" , "Greg 'groggy' Lehey" , arch@FreeBSD.org Subject: Re: mutex profiling In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 1 Apr 2002, Dag-Erling Smorgrav wrote: :"Andrew R. Reiter" writes: :> Can we perhaps have the ability to dump the lock char * description? Or :> are you doing this way b/c you can get the file and line #'s? Looks good, :> tho. : :I can get both; I'm doing it this way because Eivind did it this way :and it didn't occur to me to change it. Is one preferable to the :other? : Well, I guess that the file and line number are of more use than the lock name, especially for what we want to use this information for. If others find the lock name more helpful, then perhaps we should think about it further; otherwise, I think what you and Eivind have worked up should be useful. Thanks. Cheers, Andrew -- Andrew R.
Reiter arr@watson.org arr@FreeBSD.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 8:42:45 2002 Delivered-To: freebsd-arch@freebsd.org Received: from phoenix.dmnshq.net (phoenix.dmnshq.net [194.19.34.94]) by hub.freebsd.org (Postfix) with SMTP id E047037B419; Mon, 1 Apr 2002 08:42:32 -0800 (PST) Received: (from eivind@localhost) by phoenix.dmnshq.net (8.11.6/8.11.6) id g31Gfxd16475; Mon, 1 Apr 2002 18:41:59 +0200 (CEST) (envelope-from eivind) Date: Mon, 1 Apr 2002 18:41:58 +0200 From: Eivind Eklund To: Dag-Erling Smorgrav Cc: "Andrew R. Reiter" , "Greg 'groggy' Lehey" , arch@FreeBSD.ORG Subject: Re: mutex profiling Message-ID: <20020401184158.A15491@phoenix.dmnshq.net> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: ; from des@ofug.org on Mon, Apr 01, 2002 at 06:31:31PM +0200 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Mon, Apr 01, 2002 at 06:31:31PM +0200, Dag-Erling Smorgrav wrote: > "Andrew R. Reiter" writes: > > Can we perhaps have the ability to dump the lock char * description? Or > > are you doing this way b/c you can get the file and line #'s? Looks good, > > tho. > > I can get both; I'm doing it this way because Eivind did it this way > and it didn't occur to me to change it. Is one preferable to the > other? The use of filename/line combinations was done to be able to find what actual lock acquisitions result in introduction of large amounts of latency. The basic reason I wrote this patch was to be able to find what parts of our code result in latency, to focus effort there. Measuring the lock types themselves (which is what the lock description would give you) gives a much less granular set of information.
This accumulation can (non-trivially) be done separately, but if you do the accumulation, it is not possible to recover the information about where the latency is introduced. Eivind. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 8:47:58 2002 Delivered-To: freebsd-arch@freebsd.org Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by hub.freebsd.org (Postfix) with ESMTP id 6A88937B419; Mon, 1 Apr 2002 08:47:49 -0800 (PST) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.2/8.12.2) with ESMTP id g31GlO4F018326; Mon, 1 Apr 2002 18:47:24 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: Eivind Eklund Cc: Dag-Erling Smorgrav , "Andrew R. Reiter" , "Greg 'groggy' Lehey" , arch@FreeBSD.ORG Subject: Re: mutex profiling In-Reply-To: Your message of "Mon, 01 Apr 2002 18:41:58 +0200." <20020401184158.A15491@phoenix.dmnshq.net> Date: Mon, 01 Apr 2002 18:47:24 +0200 Message-ID: <18325.1017679644@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <20020401184158.A15491@phoenix.dmnshq.net>, Eivind Eklund writes: >On Mon, Apr 01, 2002 at 06:31:31PM +0200, Dag-Erling Smorgrav wrote: >> "Andrew R. Reiter" writes: >> > Can we perhaps have the ability to dump the lock char * description? Or >> > are you doing this way b/c you can get the file and line #'s? Looks good, >> > tho. >> >> I can get both; I'm doing it this way because Eivind did it this way >> and it didn't occur to me to change it. Is one preferable to the >> other? > >The use of filename/line combinations was done to be able to find what actual >lock acquisitions result in introduction of large amounts of latency.
The >basic reason I wrote this patch was to be able to find what parts of our code >result in latency, to focus effort there. > >Measuring the lock types themselves (which is what the lock description would >give you) gives a much less granular set of information. This accumulation can >(non-trivially) be done separately, but if you do the accumulation, it is not >possible to recover the information about where the latency is introduced. I think you misunderstood; I think the request was to get the lock name in addition to the file/line info. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 9: 3:45 2002 Delivered-To: freebsd-arch@freebsd.org Received: from k6.locore.ca (k6.locore.ca [198.96.117.170]) by hub.freebsd.org (Postfix) with ESMTP id 739E637B41E for ; Mon, 1 Apr 2002 09:03:42 -0800 (PST) Received: (from jake@localhost) by k6.locore.ca (8.11.6/8.11.6) id g31H53704564; Mon, 1 Apr 2002 12:05:03 -0500 (EST) (envelope-from jake) Date: Mon, 1 Apr 2002 12:05:02 -0500 From: Jake Burkholder To: Dag-Erling Smorgrav Cc: arch@FreeBSD.ORG Subject: Re: mutex profiling Message-ID: <20020401120502.G207@locore.ca> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from des@ofug.org on Mon, Apr 01, 2002 at 12:07:18AM +0200 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Apparently, On Mon, Apr 01, 2002 at 12:07:18AM +0200, Dag-Erling Smorgrav said words to the effect of; > The attached patch (derived from patches by Eivind) adds code to > record statistics about
MTX_DEF locks. It's currently i386-only as it > uses the TSC to measure the amount of time each mutex is held. Once You can use the get_cyclecount() function as an MI way to read the TSC or whatever cheap cycle counter an architecture provides. I think everything we run on but i386 and i486 has one. Jake To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 9: 4: 5 2002 Delivered-To: freebsd-arch@freebsd.org Received: from phoenix.dmnshq.net (phoenix.dmnshq.net [194.19.34.94]) by hub.freebsd.org (Postfix) with SMTP id 62B0637B405; Mon, 1 Apr 2002 09:04:01 -0800 (PST) Received: (from eivind@localhost) by phoenix.dmnshq.net (8.11.6/8.11.6) id g31H3l717301; Mon, 1 Apr 2002 19:03:47 +0200 (CEST) (envelope-from eivind) Date: Mon, 1 Apr 2002 19:03:47 +0200 From: Eivind Eklund To: Poul-Henning Kamp Cc: Dag-Erling Smorgrav , "Andrew R. Reiter" , "Greg 'groggy' Lehey" , arch@FreeBSD.ORG Subject: Re: mutex profiling Message-ID: <20020401190347.B17023@phoenix.dmnshq.net> References: <20020401184158.A15491@phoenix.dmnshq.net> <18325.1017679644@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <18325.1017679644@critter.freebsd.dk>; from phk@critter.freebsd.dk on Mon, Apr 01, 2002 at 06:47:24PM +0200 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Mon, Apr 01, 2002 at 06:47:24PM +0200, Poul-Henning Kamp wrote: > In message <20020401184158.A15491@phoenix.dmnshq.net>, Eivind Eklund writes: > >The use of filename/line combinations was done to be able to find what actual > >lock acquisitions result in introduction of large amounts of latency.
The > >basic reason I wrote this patch was to be able to find what parts of our code > >result in latency, to focus effort there. > > > >Measuring the lock types themselves (which is what the lock description would > >give you) gives a much less granular set of information. This accumulation can > >(non-trivially) be done separately, but if you do the accumulation, it is not > >possible to recover the information about where the latency is introduced. > > I think you misunderstood, I think the request was to get the lock > name in addition to the file/line info. My intent was just to give the background for the choice, as DES is the one presently working on the patch. Apart from that, it would probably be useful to be able to acquire statistics for locks based on either of those keys. The filename/line combination gives single lock acquisitions and shows where we get latency from them, and thus which code paths would give us the largest latency reduction by being shortened (or broken by a lock release/reacquire, the way the Linux low-latency patches do it). I suspect statistics based on lock types (descriptions) will give us ideas of what introduces overall system latency and for which types of locks we should be doing more work to optimize overall throughput. Eivind.
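The relationship between the two keys can be sketched as follows (illustrative only, not part of the patch). It assumes each per-site record also carries the lock's description; `struct site_stat`, `struct name_stat` and `fold_by_name` are names invented for this example. Folding sites into names is a simple accumulation, but the result no longer says which acquisition site contributed what.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record kept per acquisition site (file:line). */
struct site_stat {
    const char *site;   /* where the mutex was acquired */
    const char *name;   /* lock description, e.g. "proc lock" */
    uint64_t total;     /* cycles held via this site */
    uint64_t count;     /* acquisitions via this site */
};

/* Hypothetical record kept per lock type (description). */
struct name_stat {
    const char *name;
    uint64_t total;
    uint64_t count;
};

/*
 * Fold per-site records into per-name totals.  Returns the number of
 * distinct names written to out[], which must have room for nsites
 * entries.  The site strings are dropped, so the per-site breakdown
 * cannot be recovered from the result.
 */
static size_t
fold_by_name(const struct site_stat *sites, size_t nsites,
    struct name_stat *out)
{
    size_t i, j, nnames = 0;

    assert(nsites == 0 || (sites != NULL && out != NULL));
    for (i = 0; i < nsites; i++) {
        /* Look for an existing entry with the same description. */
        for (j = 0; j < nnames; j++)
            if (strcmp(out[j].name, sites[i].name) == 0)
                break;
        if (j == nnames) {
            /* First time this description is seen. */
            out[j].name = sites[i].name;
            out[j].total = 0;
            out[j].count = 0;
            nnames++;
        }
        out[j].total += sites[i].total;
        out[j].count += sites[i].count;
    }
    return (nnames);
}
```

Keeping the per-site table as the primary data and deriving the per-name view from it preserves both levels of detail.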
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Apr 1 12:31:21 2002 Delivered-To: freebsd-arch@freebsd.org Received: from green.bikeshed.org (freefall.FreeBSD.org [216.136.204.21]) by hub.freebsd.org (Postfix) with ESMTP id 1107037B405; Mon, 1 Apr 2002 12:31:16 -0800 (PST) Received: from localhost (green@localhost) by green.bikeshed.org (8.11.6/8.11.6) with ESMTP id g31KVEm31548; Mon, 1 Apr 2002 15:31:15 -0500 (EST) (envelope-from green@green.bikeshed.org) Message-Id: <200204012031.g31KVEm31548@green.bikeshed.org> X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Poul-Henning Kamp Cc: Eivind Eklund , Dag-Erling Smorgrav , "Andrew R. Reiter" , "Greg 'groggy' Lehey" , arch@FreeBSD.ORG Subject: Re: mutex profiling In-Reply-To: Your message of "Mon, 01 Apr 2002 18:47:24 +0200." <18325.1017679644@critter.freebsd.dk> From: "Brian F. Feldman" Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 01 Apr 2002 15:31:14 -0500 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Poul-Henning Kamp wrote: > I think you misunderstood, I think the request was to get the lock > name in addition to the file/line info. The potential use of this would be if two different locks were being used by a macro since both would expand with the same line number from the C source file being substituted into each. -- Brian Fundakowski Feldman \'[ FreeBSD ]''''''''''\ <> green@FreeBSD.org <> bfeldman@tislabs.com \ The Power to Serve! \ Opinions expressed are my own. 
\,,,,,,,,,,,,,,,,,,,,,,\ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Apr 2 19:42:25 2002 Delivered-To: freebsd-arch@freebsd.org Received: from kayak.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226]) by hub.freebsd.org (Postfix) with ESMTP id 9AB4237B417 for ; Tue, 2 Apr 2002 19:42:08 -0800 (PST) Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by kayak.xcllnt.net (8.11.6/8.11.4) with ESMTP id g333g8b30423 for ; Tue, 2 Apr 2002 19:42:08 -0800 (PST) (envelope-from marcel@kayak.pn.xcllnt.net) Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) by dhcp01.pn.xcllnt.net (8.12.2/8.12.2) with ESMTP id g333g9Q9000976 for ; Tue, 2 Apr 2002 19:42:09 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net) Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.2/8.12.2/Submit) id g333g8Ul000975 for arch@FreeBSD.org; Tue, 2 Apr 2002 19:42:08 -0800 (PST) Date: Tue, 2 Apr 2002 19:42:08 -0800 From: Marcel Moolenaar To: arch@FreeBSD.org Subject: Please review: endian invariant kernel dump headers Message-ID: <20020403034208.GA929@dhcp01.pn.xcllnt.net> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="d6Gm4EdcadzBjdND" Content-Disposition: inline User-Agent: Mutt/1.3.27i Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG --d6Gm4EdcadzBjdND Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Gang, Please review the attached patch. The change achieves the following: 1. Dump the kernel header in dump byte order. This is the same as network byte order. Parity calculation is endianness invariant and should not use the macros. 2. The kernel dump header had a size of 520 bytes on 64-bit architectures due to alignment of the uint64_t following the uint32_t. Reordering solves that. 3. 
No version bump is required, because all existing headers are in little-endian and thus will not have the same version as the big-endian dumps. I did not add support in savecore to read version 0x01000000 headers :-) If there are no blocking objections I like to commit this quickly due to point 2. Thanks, -- Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net --d6Gm4EdcadzBjdND Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="dump.diff" Index: sys/i386/i386/i386dump.c =================================================================== RCS file: /home/ncvs/src/sys/i386/i386/i386dump.c,v retrieving revision 1.1 diff -u -r1.1 i386dump.c --- sys/i386/i386/i386dump.c 31 Mar 2002 22:36:44 -0000 1.1 +++ sys/i386/i386/i386dump.c 3 Apr 2002 03:23:18 -0000 @@ -67,11 +67,11 @@ /* Fill in the kernel dump header */ strcpy(kdh.magic, KERNELDUMPMAGIC); strcpy(kdh.architecture, "i386"); - kdh.version = KERNELDUMPVERSION; - kdh.architectureversion = KERNELDUMP_I386_VERSION; - kdh.dumplength = Maxmem * (off_t)PAGE_SIZE; - kdh.blocksize = di->blocksize; - kdh.dumptime = time_second; + kdh.version = htod32(KERNELDUMPVERSION); + kdh.architectureversion = htod32(KERNELDUMP_I386_VERSION); + kdh.dumplength = htod64(Maxmem * (off_t)PAGE_SIZE); + kdh.dumptime = htod64(time_second); + kdh.blocksize = htod32(di->blocksize); strncpy(kdh.hostname, hostname, sizeof kdh.hostname); strncpy(kdh.versionstring, version, sizeof kdh.versionstring); if (panicstr != NULL) Index: sys/ia64/ia64/ia64dump.c =================================================================== RCS file: /home/ncvs/src/sys/ia64/ia64/ia64dump.c,v retrieving revision 1.1 diff -u -r1.1 ia64dump.c --- sys/ia64/ia64/ia64dump.c 2 Apr 2002 10:51:32 -0000 1.1 +++ sys/ia64/ia64/ia64dump.c 3 Apr 2002 03:21:07 -0000 @@ -56,20 +56,19 @@ { if (sizeof(*kdh) != DEV_BSIZE) { - printf( - "Compiled struct kerneldumpheader is %d, not %d bytes\n", - sizeof(*kdh), DEV_BSIZE); + printf("Compiled struct 
kerneldumpheader is %d, " + "not %d bytes\n", sizeof(*kdh), DEV_BSIZE); return; } bzero(kdh, sizeof(*kdh)); strncpy(kdh->magic, KERNELDUMPMAGIC, sizeof(kdh->magic)); strncpy(kdh->architecture, MACHINE_ARCH, sizeof(kdh->architecture)); - kdh->version = KERNELDUMPVERSION; - kdh->architectureversion = archver; - kdh->dumplength = dumplen; - kdh->blocksize = blksz; - kdh->dumptime = time_second; + kdh->version = htod32(KERNELDUMPVERSION); + kdh->architectureversion = htod32(archver); + kdh->dumplength = htod64(dumplen); + kdh->dumptime = htod64(time_second); + kdh->blocksize = htod32(blksz); strncpy(kdh->hostname, hostname, sizeof(kdh->hostname)); strncpy(kdh->versionstring, version, sizeof(kdh->versionstring)); if (panicstr != NULL) @@ -215,7 +214,11 @@ ehdr.e_ident[EI_MAG2] = ELFMAG2; ehdr.e_ident[EI_MAG3] = ELFMAG3; ehdr.e_ident[EI_CLASS] = ELFCLASS64; +#if BYTE_ORDER == LITTLE_ENDIAN ehdr.e_ident[EI_DATA] = ELFDATA2LSB; +#else + ehdr.e_ident[EI_DATA] = ELFDATA2MSB; +#endif ehdr.e_ident[EI_VERSION] = EV_CURRENT; ehdr.e_ident[EI_OSABI] = ELFOSABI_STANDALONE; /* XXX big picture? */ ehdr.e_type = ET_CORE; Index: sys/sys/kerneldump.h =================================================================== RCS file: /home/ncvs/src/sys/sys/kerneldump.h,v retrieving revision 1.2 diff -u -r1.2 kerneldump.h --- sys/sys/kerneldump.h 2 Apr 2002 10:53:59 -0000 1.2 +++ sys/sys/kerneldump.h 3 Apr 2002 03:15:20 -0000 @@ -38,6 +38,25 @@ #ifndef _SYS_KERNELDUMP_H #define _SYS_KERNELDUMP_H +#include + +#if BYTE_ORDER == LITTLE_ENDIAN +#define dtoh32(x) __bswap32(x) +#define dtoh64(x) __bswap64(x) +#define htod32(x) __bswap32(x) +#define htod64(x) __bswap64(x) +#else +#define dtoh32(x) x +#define dtoh64(x) x +#define htod32(x) x +#define htod64(x) x +#endif + +/* + * All uintX_t fields are in dump byte order, which is the same as + * network byte order. Use the macros defined above to read or + * write the fields. 
+ */ struct kerneldumpheader { char magic[20]; # define KERNELDUMPMAGIC "FreeBSD Kernel Dump" @@ -48,14 +67,17 @@ # define KERNELDUMP_I386_VERSION 1 # define KERNELDUMP_IA64_VERSION 1 uint64_t dumplength; /* excl headers */ - uint32_t blocksize; uint64_t dumptime; + uint32_t blocksize; char hostname[64]; char versionstring[192]; char panicstring[192]; uint32_t parity; }; +/* + * Parity calculation is endian insensitive + */ static __inline u_int32_t kerneldump_parity(struct kerneldumpheader *kdhp) { Index: sbin/savecore/savecore.c =================================================================== RCS file: /home/ncvs/src/sbin/savecore/savecore.c,v retrieving revision 1.52 diff -u -r1.52 savecore.c --- sbin/savecore/savecore.c 1 Apr 2002 18:23:58 -0000 1.52 +++ sbin/savecore/savecore.c 3 Apr 2002 03:15:46 -0000 @@ -48,24 +48,28 @@ #include static void -printheader(FILE *f, const struct kerneldumpheader *h, const char *devname, const char *md5) +printheader(FILE *f, const struct kerneldumpheader *h, const char *devname, + const char *md5) { + uint64_t dumplen; time_t t; fprintf(f, "Good dump found on device %s\n", devname); fprintf(f, " Architecture: %s\n", h->architecture); - fprintf(f, " Architecture version: %d\n", h->architectureversion); - fprintf(f, " Dump length: %lldB (%lld MB)\n", - (long long)h->dumplength, (long long)h->dumplength / (1024 * 1024)); - fprintf(f, " Blocksize: %d\n", h->blocksize); - t = h->dumptime; + fprintf(f, " Architecture version: %d\n", + dtoh32(h->architectureversion)); + dumplen = dtoh64(h->dumplength); + fprintf(f, " Dump length: %lldB (%lld MB)\n", (long long)dumplen, + (long long)(dumplen >> 20)); + fprintf(f, " Blocksize: %d\n", dtoh32(h->blocksize)); + t = dtoh64(h->dumptime); fprintf(f, " Dumptime: %s", ctime(&t)); fprintf(f, " Hostname: %s\n", h->hostname); fprintf(f, " Versionstring: %s", h->versionstring); fprintf(f, " Panicstring: %s\n", h->panicstring); fprintf(f, " MD5: %s\n", md5); } - + static void DoFile(const char 
*devname) @@ -109,12 +113,13 @@ warnx("Magic mismatch on last dump header on %s\n", devname); return; } - if (kdhl.version != KERNELDUMPVERSION) { + if (dtoh32(kdhl.version) != KERNELDUMPVERSION) { warnx("Unknown version (%d) in last dump header on %s\n", - kdhl.version, devname); + dtoh32(kdhl.version), devname); return; } - firsthd = lasthd - kdhl.dumplength - sizeof kdhf; + dumpsize = dtoh64(kdhl.dumplength); + firsthd = lasthd - dumpsize - sizeof kdhf; lseek(fd, firsthd, SEEK_SET); error = read(fd, &kdhf, sizeof kdhf); if (error != sizeof kdhf) { @@ -146,7 +151,6 @@ info = fdopen(fdinfo, "w"); printheader(stdout, &kdhl, devname, md5); printheader(info, &kdhl, devname, md5); - dumpsize = kdhl.dumplength; printf("Saving dump to file...\n"); while (dumpsize > 0) { wl = sizeof(buf); --d6Gm4EdcadzBjdND-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Apr 2 21:21:41 2002 Delivered-To: freebsd-arch@freebsd.org Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by hub.freebsd.org (Postfix) with ESMTP id BF9AC37B405 for ; Tue, 2 Apr 2002 21:21:35 -0800 (PST) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.2/8.12.2) with ESMTP id g335L94F027396; Wed, 3 Apr 2002 07:21:09 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: Marcel Moolenaar Cc: arch@FreeBSD.ORG Subject: Re: Please review: endian invariant kernel dump headers In-Reply-To: Your message of "Tue, 02 Apr 2002 19:42:08 -0800." 
 <20020403034208.GA929@dhcp01.pn.xcllnt.net>
Date: Wed, 03 Apr 2002 07:21:09 +0200
Message-ID: <27395.1017811269@critter.freebsd.dk>
From: Poul-Henning Kamp
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

In message <20020403034208.GA929@dhcp01.pn.xcllnt.net>, Marcel Moolenaar writes:
>Please review the attached patch. The change achieves the following:
>
>1. Dump the kernel header in dump byte order. This is the same
>   as network byte order. Parity calculation is endianness
>   invariant and should not use the macros.

good.

>2. The kernel dump header had a size of 520 bytes on 64-bit
>   architectures due to alignment of the uint64_t following
>   the uint32_t. Reordering solves that.

My fault, I thought I had that right.

>3. No version bump is required, because all existing headers are
>   in little-endian and thus will not have the same version as
>   the big-endian dumps. I did not add support in savecore to
>   read version 0x01000000 headers :-)

Cool.

Suggest you add:

	CTASSERT(sizeof(struct kerneldumpheader) == 512);

while at it.

>If there are no blocking objections I like to commit this quickly
>due to point 2.

By all means.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
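
The size check suggested above can be sketched in userland C. This is only a sketch under stated assumptions, not the committed code: C11 static_assert stands in for the kernel's CTASSERT, the struct names kdh_old and kdh_new are made up for illustration, and the 12-byte architecture field is an assumption inferred from the 512-byte total (the patch context does not show its size).

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>	/* offsetof */
#include <stdint.h>

/*
 * Old field order. Where uint64_t needs 8-byte alignment, 4 bytes of
 * padding follow blocksize and 4 more follow parity (520 bytes total).
 */
struct kdh_old {
	char		magic[20];
	char		architecture[12];	/* assumed size */
	uint32_t	version;
	uint32_t	architectureversion;
	uint64_t	dumplength;
	uint32_t	blocksize;
	uint64_t	dumptime;
	char		hostname[64];
	char		versionstring[192];
	char		panicstring[192];
	uint32_t	parity;
};

/* New field order from the patch: both uint64_t fields are adjacent. */
struct kdh_new {
	char		magic[20];
	char		architecture[12];	/* assumed size */
	uint32_t	version;
	uint32_t	architectureversion;
	uint64_t	dumplength;
	uint64_t	dumptime;
	uint32_t	blocksize;
	char		hostname[64];
	char		versionstring[192];
	char		panicstring[192];
	uint32_t	parity;
};

/* Fail the build if the header ever stops being exactly one 512-byte block. */
static_assert(sizeof(struct kdh_new) == 512,
    "kernel dump header must be 512 bytes");
static_assert(offsetof(struct kdh_new, parity) == 508,
    "parity must be the last 4 bytes of the header");
```

The old layout is deliberately left without an assertion, because its size depends on the target's alignment rules; that dependence is exactly what the reordering removes.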
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Apr 2 22:15:40 2002 Delivered-To: freebsd-arch@freebsd.org Received: from espresso.q9media.com (espresso.q9media.com [216.254.138.122]) by hub.freebsd.org (Postfix) with ESMTP id 13D1037B4B9 for ; Tue, 2 Apr 2002 22:15:22 -0800 (PST) Received: (from mike@localhost) by espresso.q9media.com (8.11.6/8.11.6) id g3368o976778; Wed, 3 Apr 2002 01:08:50 -0500 (EST) (envelope-from mike) Date: Wed, 3 Apr 2002 01:08:50 -0500 From: Mike Barcroft To: Marcel Moolenaar Cc: arch@FreeBSD.org Subject: Re: Please review: endian invariant kernel dump headers Message-ID: <20020403010850.C19806@espresso.q9media.com> References: <20020403034208.GA929@dhcp01.pn.xcllnt.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20020403034208.GA929@dhcp01.pn.xcllnt.net>; from marcel@xcllnt.net on Tue, Apr 02, 2002 at 07:42:08PM -0800 Organization: The FreeBSD Project Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Marcel Moolenaar writes: > Please review the attached patch. The change achieves the following: [...] 
> Index: sys/sys/kerneldump.h
> ===================================================================
> RCS file: /home/ncvs/src/sys/sys/kerneldump.h,v
> retrieving revision 1.2
> diff -u -r1.2 kerneldump.h
> --- sys/sys/kerneldump.h 2 Apr 2002 10:53:59 -0000 1.2
> +++ sys/sys/kerneldump.h 3 Apr 2002 03:15:20 -0000
> @@ -38,6 +38,25 @@
> #ifndef _SYS_KERNELDUMP_H
> #define _SYS_KERNELDUMP_H
>
> +#include
> +
> +#if BYTE_ORDER == LITTLE_ENDIAN
> +#define dtoh32(x) __bswap32(x)
> +#define dtoh64(x) __bswap64(x)
> +#define htod32(x) __bswap32(x)
> +#define htod64(x) __bswap64(x)
> +#else
> +#define dtoh32(x) x
> +#define dtoh64(x) x
> +#define htod32(x) x
> +#define htod64(x) x
> +#endif

Adding extra parens around `x' in the !LITTLE_ENDIAN case would prevent
biting future developers who might expect this to be evaluated
like a function. The rest looks okay.

Best regards,
Mike Barcroft

From owner-freebsd-arch Wed Apr 3 14:39:52 2002
Delivered-To: freebsd-arch@freebsd.org
Received: from dragon.nuxi.com (trang.nuxi.com [66.92.13.169]) by hub.freebsd.org (Postfix) with ESMTP id 02A2737B400 for ; Wed, 3 Apr 2002 14:39:49 -0800 (PST)
Received: from dragon.nuxi.com (obrien@localhost [127.0.0.1]) by dragon.nuxi.com (8.12.2/8.12.2) with ESMTP id g33MdlYm092161; Wed, 3 Apr 2002 14:39:47 -0800 (PST) (envelope-from obrien@dragon.nuxi.com)
Received: (from obrien@localhost) by dragon.nuxi.com (8.12.2/8.12.2/Submit) id g33McVqN092155; Wed, 3 Apr 2002 14:38:31 -0800 (PST)
Date: Wed, 3 Apr 2002 14:38:31 -0800
From: "David O'Brien"
To: Marcel Moolenaar
Cc: arch@FreeBSD.org
Subject: Re: Please review: endian invariant kernel dump headers
Message-ID: <20020403143831.A92066@dragon.nuxi.com>
Reply-To: obrien@FreeBSD.org
References: <20020403034208.GA929@dhcp01.pn.xcllnt.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent:
 Mutt/1.2.5i
In-Reply-To: <20020403034208.GA929@dhcp01.pn.xcllnt.net>; from marcel@xcllnt.net on Tue, Apr 02, 2002 at 07:42:08PM -0800
X-Operating-System: FreeBSD 5.0-CURRENT
Organization: The NUXI BSD group
X-Pgp-Rsa-Fingerprint: B7 4D 3E E9 11 39 5F A3 90 76 5D 69 58 D9 98 7A
X-Pgp-Rsa-Keyid: 1024/34F9F9D5
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On Tue, Apr 02, 2002 at 07:42:08PM -0800, Marcel Moolenaar wrote:
> +#include
> +
> +#if BYTE_ORDER == LITTLE_ENDIAN
> +#define dtoh32(x) __bswap32(x)
> +#define dtoh64(x) __bswap64(x)
> +#define htod32(x) __bswap32(x)
> +#define htod64(x) __bswap64(x)
> +#else
> +#define dtoh32(x) x
> +#define dtoh64(x) x
> +#define htod32(x) x
> +#define htod64(x) x
> +#endif

Why can't you just use __bswap64 directly, or some other wrappers?
I am seeing this part in more and more source files, with the macros
taking on several spellings. Let's please standardize this.
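
The two review comments on these macros (missing parentheses, and too many ad hoc spellings) can be illustrated with a small userland sketch. Everything named here is hypothetical: DTOH32_BARE and DTOH32_PAREN only mimic the identity branch of the patch, and dump_put32()/dump_get32() are made-up helpers showing one way to read and write dump byte order (big-endian) without any #if on the host byte order. None of this is an existing FreeBSD interface.

```c
#include <stdint.h>

/* Identity macro without parentheses, as in the big-endian branch. */
#define DTOH32_BARE(x)	x
/* The same macro with the parentheses asked for in the review. */
#define DTOH32_PAREN(x)	(x)

/* Expands to (a + b * 2): the multiplication binds to b alone. */
static inline uint32_t
bare_times_two(uint32_t a, uint32_t b)
{
	return (DTOH32_BARE(a + b) * 2);
}

/* Expands to ((a + b) * 2), which is what a function call would give. */
static inline uint32_t
paren_times_two(uint32_t a, uint32_t b)
{
	return (DTOH32_PAREN(a + b) * 2);
}

/* Hypothetical helper: store v at p in big-endian order on any host. */
static inline void
dump_put32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v >> 24);
	p[1] = (unsigned char)(v >> 16);
	p[2] = (unsigned char)(v >> 8);
	p[3] = (unsigned char)v;
}

/* Hypothetical helper: load a big-endian 32-bit value from p. */
static inline uint32_t
dump_get32(const unsigned char *p)
{
	return (((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	    ((uint32_t)p[2] << 8) | (uint32_t)p[3]);
}
```

Real functions (inline or not) avoid the precedence problem altogether, at the cost of working on byte buffers rather than on struct fields in place.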
From owner-freebsd-arch Wed Apr 3 23:40:20 2002
Delivered-To: freebsd-arch@freebsd.org
Received: from dire.bris.ac.uk (dire.bris.ac.uk [137.222.10.60]) by hub.freebsd.org (Postfix) with ESMTP id B603F37B41C for ; Wed, 3 Apr 2002 23:40:07 -0800 (PST)
Received: from mail.ilrt.bris.ac.uk by dire.bris.ac.uk with SMTP-PRIV with ESMTP; Thu, 4 Apr 2002 08:40:03 +0100
Received: from cmjg (helo=localhost) by mail.ilrt.bris.ac.uk with local-esmtp (Exim 3.16 #1) id 16t1pn-0002zQ-00; Thu, 04 Apr 2002 08:39:03 +0100
Date: Thu, 4 Apr 2002 08:39:03 +0100 (BST)
From: Jan Grant
X-X-Sender: cmjg@mail.ilrt.bris.ac.uk
To: freebsd-arch@freebsd.org
Subject: Reference-counting API
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

Lost the initial message, but as I recall the "consensus" was:

- reference managing calls for refcounts that need a lock
- do it by hand for structures that are protected by a mutex

Might I suggest* that an API for the second set of calls would still be
useful? They might be a bunch of trivial macros, but explicit
refcounting "calls" are easier to find, document the use better (it's a
refcount, but we hold a mutex on it already), and are easier to
instrument if required.

Cheers,
jan

* no, I don't have the patch

--
jan grant, ILRT, University of Bristol. http://www.ilrt.bris.ac.uk/
Tel +44(0)117 9287088 Fax +44 (0)117 9287112 RFC822 jan.grant@bris.ac.uk
Usenet: The separation of content AND presentation - simultaneously.
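
As a rough illustration of the "trivial macros" idea, here is a userland sketch that uses a POSIX mutex in place of a kernel mutex. All names (struct obj, OBJ_REF_LOCKED, OBJ_UNREF_LOCKED, obj_hold, obj_release) are invented for this example and are not an existing API; the assert() calls merely stand in for the kind of instrumentation the message mentions.

```c
#include <assert.h>
#include <pthread.h>

struct obj {
	pthread_mutex_t	lock;	/* protects refs */
	unsigned int	refs;
};

/* Take a reference. The caller must already hold (o)->lock. */
#define OBJ_REF_LOCKED(o) do {						\
	assert((o)->refs > 0);	/* cannot revive a dead object */	\
	(o)->refs++;							\
} while (0)

/*
 * Drop a reference. The caller must already hold (o)->lock.
 * Evaluates to nonzero when the last reference went away.
 */
#define OBJ_UNREF_LOCKED(o)						\
	(assert((o)->refs > 0), --(o)->refs == 0)

/* Convenience wrapper for callers that do not hold the lock yet. */
static void
obj_hold(struct obj *o)
{
	pthread_mutex_lock(&o->lock);
	OBJ_REF_LOCKED(o);
	pthread_mutex_unlock(&o->lock);
}

/* Returns nonzero if the caller dropped the last reference. */
static int
obj_release(struct obj *o)
{
	int last;

	pthread_mutex_lock(&o->lock);
	last = OBJ_UNREF_LOCKED(o);
	pthread_mutex_unlock(&o->lock);
	return (last);
}
```

Keeping the increments behind named macros means a single grep finds every reference change, and the assertions can later be swapped for tracing without touching the call sites.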
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Apr 4 9:18:19 2002 Delivered-To: freebsd-arch@freebsd.org Received: from Awfulhak.org (gw.Awfulhak.org [217.204.245.18]) by hub.freebsd.org (Postfix) with ESMTP id ADDC537B41D; Thu, 4 Apr 2002 09:18:06 -0800 (PST) Received: from hak.lan.Awfulhak.org (root@hak.lan.Awfulhak.org [fec0::1:12]) by Awfulhak.org (8.11.6/8.11.6) with ESMTP id g34HHp906928; Thu, 4 Apr 2002 18:17:51 +0100 (BST) (envelope-from brian@freebsd-services.com) Received: from hak.lan.Awfulhak.org (brian@localhost [127.0.0.1]) by hak.lan.Awfulhak.org (8.12.2/8.12.2) with ESMTP id g34HHkq7037326; Thu, 4 Apr 2002 18:17:46 +0100 (BST) (envelope-from brian@freebsd-services.com) Message-Id: <200204041717.g34HHkq7037326@hak.lan.Awfulhak.org> X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Doug Ambrisko Cc: "M. Warner Losh" , j@uriah.heep.sax.de, alan@clegg.com, luigi@FreeBSD.org, nsayer@FreeBSD.org, ryand-bsd@zenspider.com, Brian Somers , freebsd-arch@FreeBSD.org, freebsd-net@FreeBSD.org Subject: Re: Your change to in.c to limit duplicate networks is causing trouble In-Reply-To: Message from Doug Ambrisko of "Mon, 25 Mar 2002 11:34:50 PST." <200203251934.g2PJYoY68469@ambrisko.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 04 Apr 2002 18:17:46 +0100 From: Brian Somers Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG I've crossposted to -net and -arch as this could probably do with a review from a larger audience.... > Brian Somers writes: > | > In message: <20020325172024.B60771@uriah.heep.sax.de> > | > Joerg Wunsch writes: > | > : As Alan Clegg wrote: > | > : > | > : > Is there any motion to pull this back? > | > : > | > : There was only consensus to special-case the BOOTP case. 
> | > : > | > : As i understand it, the change itself was more than desirable for > | > : PPP connections (so no surprise it was Brian who committed it). > | > > | > dhclient is still broken, however. The 0.0.0.0 should be the special > | > case, not bootp. > | > | Yes, I agree. > | > | The question is.... should interface address assignments with > | destinations of 0.0.0.0 have host routes created in the first place ? > | > | I'd tend to think not. > | > | Doing this will make things consistent, but maybe at the expense of > | breaking something else - under ``usual'' circumstances. I'm > | thinking along the lines of some program that may configure a > | destination address of 0.0.0.0 and then expect to be able to do stuff > | with the routing table - such as adding a route via 0.0.0.0 or calling > | sendto() or connect() with 0.0.0.0 as the destination. > | > | I'm guessing that dhclient will continue to work without a host route > | as it writes raw IP packets, and I haven't heard of any problems with > | running multiple dhclients using the old in.c code where second and > | subsequent SIOCAIADDRs with a 0.0.0.0 destination had no host route. > | I haven't tested it yet though. > | > | If nobody objects, I'll tweak things so that destinations of 0.0.0.0 > | don't add host routes and see if it breaks anything I know of. I'll > | post patches to -arch and cc -net when I get something working. > > Sounds reasonable. I can test it when you have something since I'm hitting > this on a few machines around here. > > Doug A. The attached patches seem to make things work for BOOTP with multiple interfaces and for ppp expecting failures for duplicate destination address assignments. The code now avoids adding a host route if the interface address is 0.0.0.0, and always treats a failure to add a host route as fatal (previously, it masked EEXIST for some reason - I guessed because it was trying to handle address re-assignment, but that works ok with this patch). 
If people could get some time to review it, it'd be appreciated. Cheers. -- Brian http://www.freebsd-services.com/ Don't _EVER_ lose your sense of humour ! Index: netinet/in.c =================================================================== RCS file: /home/ncvs/src/sys/netinet/in.c,v retrieving revision 1.63 diff -u -r1.63 in.c --- netinet/in.c 1 Apr 2002 21:31:06 -0000 1.63 +++ netinet/in.c 4 Apr 2002 16:52:59 -0000 @@ -661,7 +661,7 @@ { register u_long i = ntohl(sin->sin_addr.s_addr); struct sockaddr_in oldaddr; - int s = splimp(), flags = RTF_UP, error; + int s = splimp(), flags = RTF_UP, error = 0; oldaddr = ia->ia_addr; ia->ia_addr = *sin; @@ -723,17 +723,21 @@ return (0); flags |= RTF_HOST; } - if ((error = rtinit(&(ia->ia_ifa), (int)RTM_ADD, flags)) == 0) - ia->ia_flags |= IFA_ROUTE; - if (error != 0 && ia->ia_dstaddr.sin_family == AF_INET) { - ia->ia_addr = oldaddr; - return (error); + /* + * Don't add routing table entries for interface address entries + * of 0.0.0.0. This makes it possible to assign several such address + * pairs with consistent results (no host route) and is required by + * BOOTP. 
+ */ + if (ia->ia_addr.sin_addr.s_addr != INADDR_ANY) { + if ((error = rtinit(&ia->ia_ifa, (int)RTM_ADD, flags)) != 0) { + ia->ia_addr = oldaddr; + return (error); + } + ia->ia_flags |= IFA_ROUTE; } - /* XXX check if the subnet route points to the same interface */ - if (error == EEXIST) - error = 0; /* * If the interface supports multicast, join the "all hosts" To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Apr 4 12: 0:53 2002 Delivered-To: freebsd-arch@freebsd.org Received: from sax.sax.de (sax.sax.de [193.175.26.33]) by hub.freebsd.org (Postfix) with ESMTP id 04C4737B41C; Thu, 4 Apr 2002 12:00:28 -0800 (PST) Received: (from uucp@localhost) by sax.sax.de (8.9.3/8.9.3) with UUCP id WAA22498; Thu, 4 Apr 2002 22:00:04 +0200 (CEST) Received: (from j@localhost) by uriah.heep.sax.de (8.11.6/8.11.6) id g34Jwin83318; Thu, 4 Apr 2002 21:58:44 +0200 (MET DST) (envelope-from j) Date: Thu, 4 Apr 2002 21:58:44 +0200 From: Joerg Wunsch To: Brian Somers Cc: Doug Ambrisko , "M. 
Warner Losh" , alan@clegg.com, luigi@FreeBSD.org, nsayer@FreeBSD.org, ryand-bsd@zenspider.com, freebsd-arch@FreeBSD.org, freebsd-net@FreeBSD.org Subject: Re: Your change to in.c to limit duplicate networks is causing trouble Message-ID: <20020404215844.A83154@uriah.heep.sax.de> Reply-To: Joerg Wunsch References: <200204041717.g34HHkq7037326@hak.lan.Awfulhak.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200204041717.g34HHkq7037326@hak.lan.Awfulhak.org>; from brian@freebsd-services.com on Thu, Apr 04, 2002 at 06:17:46PM +0100 X-Phone: +49-351-2012 669 X-PGP-Fingerprint: DC 47 E6 E4 FF A6 E9 8F 93 21 E0 7D F9 12 D6 4E Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG As Brian Somers wrote: > The code now avoids adding a host route if the interface address is > 0.0.0.0, and always treats a failure to add a host route as fatal > (previously, it masked EEXIST for some reason - I guessed because it > was trying to handle address re-assignment, but that works ok with > this patch). I think that will be fatal for the sppp case with dynamic IP address negotiation. We use 0.0.0.0 as the local IP address for the unnegotiated PPP link then, with the idea that it's still possible to route through the interface anyway. For dial-on-demand PPP links (like on ISDN), the routed packets will then trigger the dialout event. In the course of the PPP negotiations, an actual local IP address will be negotiated and assigned, but we first need some packets to pass through the PPP layer in order to trigger this. Perhaps it would still be possible to use per-interface routes even after your change (-iface isp0 etc.), but currently, a number of documents describe that it's possible to use local address 0.0.0.0 and still get normal routing behaviour for those links. 
-- cheers, J"org .-.-. --... ...-- -.. . DL8DTL http://www.sax.de/~joerg/ NIC: JW11-RIPE Never trust an operating system you don't have sources for. ;-) To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Apr 4 13:11: 8 2002 Delivered-To: freebsd-arch@freebsd.org Received: from shell4.bayarea.net (shell4.bayarea.net [209.128.82.1]) by hub.freebsd.org (Postfix) with ESMTP id B67A837B41E; Thu, 4 Apr 2002 13:10:59 -0800 (PST) Received: from shell4.bayarea.net (localhost [127.0.0.1]) by shell4.bayarea.net (8.9.3/8.9.3) with ESMTP id NAA05062; Thu, 4 Apr 2002 13:10:44 -0800 (envelope-from stephenm@shell4.bayarea.net) Message-Id: <200204042110.NAA05062@shell4.bayarea.net> To: Doug Ambrisko Cc: "M. Warner Losh" , j@uriah.heep.sax.de, alan@clegg.com, luigi@FreeBSD.ORG, nsayer@FreeBSD.ORG, ryand-bsd@zenspider.com, Brian Somers , freebsd-arch@FreeBSD.ORG, freebsd-net@FreeBSD.ORG Subject: Re: Your change to in.c to limit duplicate networks is causing trouble Date: Thu, 04 Apr 2002 13:10:44 -0800 From: stephen macmanus Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > The code now avoids adding a host route if the interface address is > 0.0.0.0, and always treats a failure to add a host route as fatal > (previously, it masked EEXIST for some reason - I guessed because it > was trying to handle address re-assignment, but that works ok with > this patch). One effect of the masked EEXIST is to suppress the spurious error which occurs when adding an alias IP address (SIOCAIFADDR) on the same logical subnet as an existing IP address. Users have no way of knowing that it's actually safe to simply ignore the error in that situation, so the masking should probably be preserved. 
Stephen ------------------ Stephen Macmanus #include stephenm@bayarea.net - - - if ((error = rtinit(&(ia->ia_ifa), (int)RTM_ADD, flags)) == 0) - - - ia->ia_flags |= IFA_ROUTE; - - - if (error != 0 && ia->ia_dstaddr.sin_family == AF_INET) { - - - ia->ia_addr = oldaddr; - - - return (error); + /* + * Don't add routing table entries for interface address entries + * of 0.0.0.0. This makes it possible to assign several such address + * pairs with consistent results (no host route) and is required by + * BOOTP. + */ + if (ia->ia_addr.sin_addr.s_addr != INADDR_ANY) { + if ((error = rtinit(&ia->ia_ifa, (int)RTM_ADD, flags)) != 0) { + ia->ia_addr = oldaddr; + return (error); + } + ia->ia_flags |= IFA_ROUTE; } - - - /* XXX check if the subnet route points to the same interface */ - - - if (error == EEXIST) - - - error = 0; /* * If the interface supports multicast, join the "all hosts" To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Apr 4 15:20:37 2002 Delivered-To: freebsd-arch@freebsd.org Received: from Awfulhak.org (gw.Awfulhak.org [217.204.245.18]) by hub.freebsd.org (Postfix) with ESMTP id 2FA4E37B405; Thu, 4 Apr 2002 15:20:21 -0800 (PST) Received: from hak.lan.Awfulhak.org (root@hak.lan.Awfulhak.org [IPv6:fec0::1:12]) by Awfulhak.org (8.12.2/8.11.6) with ESMTP id g34NK9Cu001874; Fri, 5 Apr 2002 00:20:09 +0100 (BST) (envelope-from brian@freebsd-services.com) Received: from hak.lan.Awfulhak.org (brian@localhost [127.0.0.1]) by hak.lan.Awfulhak.org (8.12.2/8.12.2) with ESMTP id g34NK6q7041410; Fri, 5 Apr 2002 00:20:06 +0100 (BST) (envelope-from brian@freebsd-services.com) Message-Id: <200204042320.g34NK6q7041410@hak.lan.Awfulhak.org> X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Joerg Wunsch Cc: Brian Somers , Doug Ambrisko , "M. 
Warner Losh" , alan@clegg.com, luigi@FreeBSD.org, nsayer@FreeBSD.org, ryand-bsd@zenspider.com, freebsd-arch@FreeBSD.org, freebsd-net@FreeBSD.org Subject: Re: Your change to in.c to limit duplicate networks is causing trouble In-Reply-To: Message from Joerg Wunsch of "Thu, 04 Apr 2002 21:58:44 +0200." <20020404215844.A83154@uriah.heep.sax.de> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 05 Apr 2002 00:20:06 +0100 From: Brian Somers Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > As Brian Somers wrote: > > > The code now avoids adding a host route if the interface address is > > 0.0.0.0, and always treats a failure to add a host route as fatal > > (previously, it masked EEXIST for some reason - I guessed because it > > was trying to handle address re-assignment, but that works ok with > > this patch). > > I think that will be fatal for the sppp case with dynamic IP > address negotiation. We use 0.0.0.0 as the local IP address > for the unnegotiated PPP link then, with the idea that it's > still possible to route through the interface anyway. For > dial-on-demand PPP links (like on ISDN), the routed packets > will then trigger the dialout event. In the course of the > PPP negotiations, an actual local IP address will be negotiated > and assigned, but we first need some packets to pass through the > PPP layer in order to trigger this. > > Perhaps it would still be possible to use per-interface routes > even after your change (-iface isp0 etc.), but currently, a number > of documents describe that it's possible to use local address > 0.0.0.0 and still get normal routing behaviour for those links. Hmm, valid point :( So the code will have to become something like the attached ? 
This is quite grotty, but I can't think of any clean way other than somehow telling SIOCAIFADDR and SIOCSIFADDR not to add the host route in the first place. > -- > cheers, J"org .-.-. --... ...-- -.. . DL8DTL > > http://www.sax.de/~joerg/ NIC: JW11-RIPE > Never trust an operating system you don't have sources for. ;-) -- Brian http://www.freebsd-services.com/ Don't _EVER_ lose your sense of humour ! Index: in.c =================================================================== RCS file: /home/ncvs/src/sys/netinet/in.c,v retrieving revision 1.63 diff -u -r1.63 in.c --- in.c 1 Apr 2002 21:31:06 -0000 1.63 +++ in.c 4 Apr 2002 23:18:36 -0000 @@ -661,7 +661,7 @@ { register u_long i = ntohl(sin->sin_addr.s_addr); struct sockaddr_in oldaddr; - int s = splimp(), flags = RTF_UP, error; + int s = splimp(), flags = RTF_UP, error = 0; oldaddr = ia->ia_addr; ia->ia_addr = *sin; @@ -723,17 +723,25 @@ return (0); flags |= RTF_HOST; } - if ((error = rtinit(&(ia->ia_ifa), (int)RTM_ADD, flags)) == 0) - ia->ia_flags |= IFA_ROUTE; - if (error != 0 && ia->ia_dstaddr.sin_family == AF_INET) { - ia->ia_addr = oldaddr; - return (error); + /*- + * Don't add host routes for interface addresses of + * 0.0.0.0 --> 0.255.255.255 netmask 255.0.0.0. This makes it + * possible to assign several such address pairs with consistent + * results (no host route) and is required by BOOTP. + * + * XXX: This is ugly ! There should be a way for the caller to + * say that they don't want a host route. 
+ */ + if (ia->ia_addr.sin_addr.s_addr != INADDR_ANY || + ia->ia_netmask != IN_CLASSA_NET || + ia->ia_dstaddr.sin_addr.s_addr != htonl(IN_CLASSA_HOST)) { + if ((error = rtinit(&ia->ia_ifa, (int)RTM_ADD, flags)) != 0) { + ia->ia_addr = oldaddr; + return (error); + } + ia->ia_flags |= IFA_ROUTE; } - - /* XXX check if the subnet route points to the same interface */ - if (error == EEXIST) - error = 0; /* * If the interface supports multicast, join the "all hosts" To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Apr 4 16:20:32 2002 Delivered-To: freebsd-arch@freebsd.org Received: from Awfulhak.org (gw.Awfulhak.org [217.204.245.18]) by hub.freebsd.org (Postfix) with ESMTP id D1AFD37B43F; Thu, 4 Apr 2002 16:19:38 -0800 (PST) Received: from hak.lan.Awfulhak.org (root@hak.lan.Awfulhak.org [IPv6:fec0::1:12]) by Awfulhak.org (8.12.2/8.11.6) with ESMTP id g350JSCu002060; Fri, 5 Apr 2002 01:19:28 +0100 (BST) (envelope-from brian@freebsd-services.com) Received: from hak.lan.Awfulhak.org (brian@localhost [127.0.0.1]) by hak.lan.Awfulhak.org (8.12.2/8.12.2) with ESMTP id g350JPq7042133; Fri, 5 Apr 2002 01:19:25 +0100 (BST) (envelope-from brian@freebsd-services.com) Message-Id: <200204050019.g350JPq7042133@hak.lan.Awfulhak.org> X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: stephen macmanus Cc: Doug Ambrisko , "M. Warner Losh" , j@uriah.heep.sax.de, alan@clegg.com, luigi@FreeBSD.ORG, nsayer@FreeBSD.ORG, ryand-bsd@zenspider.com, Brian Somers , freebsd-arch@FreeBSD.ORG, freebsd-net@FreeBSD.ORG Subject: Re: Your change to in.c to limit duplicate networks is causing trouble In-Reply-To: Message from stephen macmanus of "Thu, 04 Apr 2002 13:10:44 -0800." 
 <200204042110.NAA05062@shell4.bayarea.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 05 Apr 2002 01:19:25 +0100
From: Brian Somers
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

> > The code now avoids adding a host route if the interface address is
> > 0.0.0.0, and always treats a failure to add a host route as fatal
> > (previously, it masked EEXIST for some reason - I guessed because it
> > was trying to handle address re-assignment, but that works ok with
> > this patch).
> >
> One effect of the masked EEXIST is to suppress the spurious error
> which occurs when adding an alias IP address (SIOCAIFADDR) on the
> same logical subnet as an existing IP address. Users have no way
> of knowing that it's actually safe to simply ignore the error in
> that situation, so the masking should probably be preserved.

Hmm, thanks for the pointer.

I think this now works - where it didn't before (although see
the new patch posted in response to Joerg's mention of the sppp
problem).

The lack of the EEXIST hack in my patch means that this will work as
before:

    ifconfig dc0 inet 172.16.0.5 netmask 0xffffff00
    ifconfig dc0 inet 172.16.0.11 netmask 0xfffffff8

Where connections to 172.16.0.1-172.16.0.7 and 172.16.0.16-172.16.0.255
come from 172.16.0.5 and connections to 172.16.0.8-172.16.0.15 come from
172.16.0.11.

After the above however,

    ifconfig dc0 inet 172.16.0.14 netmask 0xfffffff8

will (correctly) fail in the patched code. It fails because the
gateway/netmask combination produces a duplicate key in the routing
table, returning an error from rtinit(). Previously, this failure
was masked by the EEXIST hack, allowing the interface address update
without a corresponding host route.

I believe the old behaviour becomes obviously wrong when someone then
deletes the 172.16.0.11 interface address, blowing away the
associated host route and leaving no routing table entry to talk to
the 172.16.0.14 address.

So I don't think the old way was really safe after all :-/

> Stephen
> ------------------
> Stephen Macmanus #include
> stephenm@bayarea.net
>
> - - - if ((error = rtinit(&(ia->ia_ifa), (int)RTM_ADD, flags)) == 0)
> - - - ia->ia_flags |= IFA_ROUTE;
>
> - - - if (error != 0 && ia->ia_dstaddr.sin_family == AF_INET) {
> - - - ia->ia_addr = oldaddr;
> - - - return (error);
> + /*
> + * Don't add routing table entries for interface address entries
> + * of 0.0.0.0. This makes it possible to assign several such address
> + * pairs with consistent results (no host route) and is required by
> + * BOOTP.
> + */
> + if (ia->ia_addr.sin_addr.s_addr != INADDR_ANY) {
> + if ((error = rtinit(&ia->ia_ifa, (int)RTM_ADD, flags)) != 0) {
> + ia->ia_addr = oldaddr;
> + return (error);
> + }
> + ia->ia_flags |= IFA_ROUTE;
> }
>
> - - - /* XXX check if the subnet route points to the same interface */
> - - - if (error == EEXIST)
> - - - error = 0;
>
> /*
> * If the interface supports multicast, join the "all hosts"

--
Brian http://www.freebsd-services.com/
Don't _EVER_ lose your sense of humour !
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Thu Apr 4 17:37:20 2002
Delivered-To: freebsd-arch@freebsd.org
Received: from shell4.bayarea.net (shell4.bayarea.net [209.128.82.1])
	by hub.freebsd.org (Postfix) with ESMTP
	id C491337B41C; Thu, 4 Apr 2002 17:36:58 -0800 (PST)
Received: (from stephenm@localhost)
	by shell4.bayarea.net (8.9.3/8.9.3) id RAA02010;
	Thu, 4 Apr 2002 17:36:51 -0800 (envelope-from stephenm)
Date: Thu, 4 Apr 2002 17:36:51 -0800
From: stephen macmanus
Message-Id: <200204050136.RAA02010@shell4.bayarea.net>
To: brian@freebsd-services.com, stephenm@bayarea.net
Subject: Re: Your change to in.c to limit duplicate networks is causing trouble
Cc: alan@clegg.com, ambrisko@ambrisko.com, freebsd-arch@FreeBSD.ORG,
	freebsd-net@FreeBSD.ORG, imp@village.org, j@uriah.heep.sax.de,
	luigi@FreeBSD.ORG, nsayer@FreeBSD.ORG, ryand-bsd@zenspider.com
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

> > > The code now avoids adding a host route if the interface address is
> > > 0.0.0.0, and always treats a failure to add a host route as fatal
> > > (previously, it masked EEXIST for some reason - I guessed because it
> > > was trying to handle address re-assignment, but that works ok with
> > > this patch).
> >
> >
> > One effect of the masked EEXIST is to suppress the spurious error
> > which occurs when adding an alias IP address (SIOCAIFADDR) on the
> > same logical subnet as an existing IP address. Users have no way
> > of knowing that it's actually safe to simply ignore the error in
> > that situation, so the masking should probably be preserved.
>
> Hmm, thanks for the pointer.
>
> I think this now works - where it didn't before (although see
> the new patch posted in response to Joerg's mention of the sppp
> problem).
>
> The lack of the EEXIST hack in my patch means that this will work as
> before:
>
>   ifconfig dc0 inet 172.16.0.5 netmask 0xffffff00
>   ifconfig dc0 inet 172.16.0.11 netmask 0xfffffff8
>
> Where connections to 172.16.0.1-172.16.0.7 and 172.16.0.16-172.16.0.255
> come from 172.16.0.5 and connections to 172.16.0.8-172.16.0.15 come from
> 172.16.0.11.
>
> After the above however,
>
>   ifconfig dc0 inet 172.16.0.14 netmask 0xfffffff8
>
> will (correctly) fail in the patched code. It fails because the
> gateway/netmask combination produces a duplicate key in the routing
> table, returning an error from rtinit(). Previously, this failure
> was masked by the EEXIST hack, allowing the interface address update
> without a corresponding host route.

All true. However, this change just redefines the desired behavior
in this situation. The current EEXIST hack prevents a "meaningless"
error message (in the sense that it is still possible to use the
172.16.0.14 address due to the existence of the earlier route).

This patch just restores the original behavior from earlier BSD
versions, which reported an error for the reasons you mention.

I guess it's just a judgement call as to which one is more desirable.

> I believe the old behaviour becomes obviously wrong when someone then
> deletes the 172.16.0.11 interface address, blowing away the
> associated host route and leaving no routing table entry to talk to
> the 172.16.0.14 address.
>
> So I don't think the old way was really safe after all :-/

Definitely true. An ideal solution would involve some type of
reference count for the route entry to maintain connectivity
without attempting to add a duplicate route, which would always
cause an error.

It would be even easier if users didn't set up redundant addresses
like this one, which serve no purpose! ;-) The people who do it,
however, are also the most likely to think the resulting error
indicates an actual problem with the new address assignment.
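
The reference-count idea mentioned above could look roughly like the following user-space sketch. It is not taken from any patch in this thread: `rc_table`, `rc_find`, `rc_route_ref` and `rc_route_unref` are hypothetical names, routes are keyed only by (address & mask, mask), and the sketch ignores which interface address a real route entry would point at once its original owner is removed.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define RC_MAXROUTES 8

/* Hypothetical route entry shared by every address with the same key. */
struct rc_route {
	uint32_t net;	/* addr & mask, host byte order */
	uint32_t mask;
	int refs;	/* number of interface addresses using it */
};

struct rc_table {
	struct rc_route r[RC_MAXROUTES];
	int n;
};

/* Look up a route by its (net, mask) key. */
static struct rc_route *
rc_find(struct rc_table *t, uint32_t net, uint32_t mask)
{
	int i;

	for (i = 0; i < t->n; i++)
		if (t->r[i].net == net && t->r[i].mask == mask)
			return (&t->r[i]);
	return (NULL);
}

/* Take a reference on the route for addr/mask, creating it if needed. */
static int
rc_route_ref(struct rc_table *t, uint32_t addr, uint32_t mask)
{
	uint32_t net = addr & mask;
	struct rc_route *rt = rc_find(t, net, mask);

	if (rt != NULL) {
		rt->refs++;	/* duplicate key: share, do not fail */
		return (0);
	}
	if (t->n == RC_MAXROUTES)
		return (ENOMEM);
	t->r[t->n++] = (struct rc_route){ net, mask, 1 };
	return (0);
}

/* Drop a reference; the route goes away only with its last user. */
static int
rc_route_unref(struct rc_table *t, uint32_t addr, uint32_t mask)
{
	struct rc_route *rt = rc_find(t, addr & mask, mask);

	if (rt == NULL)
		return (ESRCH);
	if (--rt->refs == 0)
		*rt = t->r[--t->n];	/* move last entry into the hole */
	return (0);
}
```

In this model, adding 172.16.0.11 and then 172.16.0.14 with netmask 0xfffffff8 yields one route with two references, and deleting 172.16.0.11 leaves the route in place until 172.16.0.14 is removed as well.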

Stephen

------------------
Stephen Macmanus #include
stephenm@bayarea.net

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Thu Apr 4 19: 9:22 2002
Delivered-To: freebsd-arch@freebsd.org
Received: from Awfulhak.org (gw.Awfulhak.org [217.204.245.18])
	by hub.freebsd.org (Postfix) with ESMTP
	id 5EC9B37B41D; Thu, 4 Apr 2002 19:09:11 -0800 (PST)
Received: from hak.lan.Awfulhak.org (root@hak.lan.Awfulhak.org [IPv6:fec0::1:12])
	by Awfulhak.org (8.12.2/8.11.6) with ESMTP id g3538xCu002757;
	Fri, 5 Apr 2002 04:08:59 +0100 (BST)
	(envelope-from brian@freebsd-services.com)
Received: from hak.lan.Awfulhak.org (brian@localhost [127.0.0.1])
	by hak.lan.Awfulhak.org (8.12.2/8.12.2) with ESMTP id g3538sq7044360;
	Fri, 5 Apr 2002 04:08:54 +0100 (BST)
	(envelope-from brian@freebsd-services.com)
Message-Id: <200204050308.g3538sq7044360@hak.lan.Awfulhak.org>
X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4
To: stephen macmanus
Cc: brian@freebsd-services.com, alan@clegg.com, ambrisko@ambrisko.com,
	freebsd-arch@FreeBSD.ORG, freebsd-net@FreeBSD.ORG, imp@village.org,
	j@uriah.heep.sax.de, luigi@FreeBSD.ORG, nsayer@FreeBSD.ORG,
	ryand-bsd@zenspider.com, brian@freebsd-services.com
Subject: Re: Your change to in.c to limit duplicate networks is causing trouble
In-Reply-To: Message from stephen macmanus
	of "Thu, 04 Apr 2002 17:36:51 -0800."
	<200204050136.RAA02010@shell4.bayarea.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 05 Apr 2002 04:08:54 +0100
From: Brian Somers
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

> > > > The code now avoids adding a host route if the interface address is
> > > > 0.0.0.0, and always treats a failure to add a host route as fatal
> > > > (previously, it masked EEXIST for some reason - I guessed because it
> > > > was trying to handle address re-assignment, but that works ok with
> > > > this patch).
> > >
> > >
> > > One effect of the masked EEXIST is to suppress the spurious error
> > > which occurs when adding an alias IP address (SIOCAIFADDR) on the
> > > same logical subnet as an existing IP address. Users have no way
> > > of knowing that it's actually safe to simply ignore the error in
> > > that situation, so the masking should probably be preserved.
> >
> > Hmm, thanks for the pointer.
> >
> > I think this now works - where it didn't before (although see
> > the new patch posted in response to Joerg's mention of the sppp
> > problem).
> >
> > The lack of the EEXIST hack in my patch means that this will work as
> > before:
> >
> >   ifconfig dc0 inet 172.16.0.5 netmask 0xffffff00
> >   ifconfig dc0 inet 172.16.0.11 netmask 0xfffffff8
> >
> > Where connections to 172.16.0.1-172.16.0.7 and 172.16.0.16-172.16.0.255
> > come from 172.16.0.5 and connections to 172.16.0.8-172.16.0.15 come from
> > 172.16.0.11.
> >
> > After the above however,
> >
> >   ifconfig dc0 inet 172.16.0.14 netmask 0xfffffff8
> >
> > will (correctly) fail in the patched code. It fails because the
> > gateway/netmask combination produces a duplicate key in the routing
> > table, returning an error from rtinit(). Previously, this failure
> > was masked by the EEXIST hack, allowing the interface address update
> > without a corresponding host route.
>
> All true. However, this change just redefines the desired behavior
> in this situation. The current EEXIST hack prevents a "meaningless"
> error message (in the sense that it is still possible to use the
> 172.16.0.14 address due to the existence of the earlier route).
>
> This patch just restores the original behavior from earlier BSD
> versions, which reported an error for the reasons you mention.
>
> I guess it's just a judgement call as to which one is more desirable.
>
> > I believe the old behaviour becomes obviously wrong when someone then
> > deletes the 172.16.0.11 interface address, blowing away the
> > associated host route and leaving no routing table entry to talk to
> > the 172.16.0.14 address.
> >
> > So I don't think the old way was really safe after all :-/
>
> Definitely true. An ideal solution would involve some type of
> reference count for the route entry to maintain connectivity
> without attempting to add a duplicate route, which would always
> cause an error.
>
> It would be even easier if users didn't set up redundant addresses
> like this one, which serve no purpose! ;-) The people who do it,
> however, are also the most likely to think the resulting error
> indicates an actual problem with the new address assignment.

Well, it does serve a purpose - it allows the machine to accept TCP
connections on the .14 address (although UDP requests get nicely
confused) and to bind to the .14 address before connect().

The resulting error *does* indicate that there's a problem with the
new address assignment; adding that .14 address with a conflicting
netmask should be considered wrong (and is treated as an error when
the EEXIST hack is removed). If they want to add another address to
the 172.16.0.8/29 subnet, they must use a netmask of 0xffffffff to get
the desired result.
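
The difference between the two netmasks can be seen in a toy routing table keyed by (address & mask, mask). This is an illustration only, not rtinit(): `toy_table` and `toy_route_add` are hypothetical names, addresses are in host byte order, and the only behaviour modelled is that a duplicate key is refused with EEXIST.

```c
#include <errno.h>
#include <stdint.h>

#define TOY_MAXROUTES 8

/* Hypothetical route key: masked address plus the mask itself. */
struct toy_route {
	uint32_t net;
	uint32_t mask;
};

struct toy_table {
	struct toy_route r[TOY_MAXROUTES];
	int n;
};

/*
 * Add a route for addr/mask. Returns 0 on success, EEXIST if an
 * entry with the same (addr & mask, mask) key is already present,
 * or ENOMEM if the toy table is full.
 */
static int
toy_route_add(struct toy_table *t, uint32_t addr, uint32_t mask)
{
	uint32_t net = addr & mask;
	int i;

	for (i = 0; i < t->n; i++)
		if (t->r[i].net == net && t->r[i].mask == mask)
			return (EEXIST);	/* duplicate key */
	if (t->n == TOY_MAXROUTES)
		return (ENOMEM);
	t->r[t->n].net = net;
	t->r[t->n].mask = mask;
	t->n++;
	return (0);
}
```

Adding 172.16.0.5/0xffffff00 and 172.16.0.11/0xfffffff8 succeeds. Adding 172.16.0.14/0xfffffff8 then returns EEXIST, because 172.16.0.11 and 172.16.0.14 both mask to 172.16.0.8, while 172.16.0.14/0xffffffff has its own key and is accepted.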
The EEXIST hack is just permitting people to confuse themselves.

> Stephen
>
>
> ------------------
> Stephen Macmanus #include
> stephenm@bayarea.net

--
Brian http://www.freebsd-services.com/
Don't _EVER_ lose your sense of humour !

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message