From owner-freebsd-smp Mon May 20 17:15:21 2002
Delivered-To: freebsd-smp@freebsd.org
Received: from mta6.snfc21.pbi.net (mta6.snfc21.pbi.net [206.13.28.240])
    by hub.freebsd.org (Postfix) with ESMTP id 7B62037B409
    for ; Mon, 20 May 2002 17:15:16 -0700 (PDT)
Received: from FreeBSD.org ([63.193.112.125])
    by mta6.snfc21.pbi.net (iPlanet Messaging Server 5.1 (built May 7 2001))
    with ESMTP id <0GWF00DL2Q1GXZ@mta6.snfc21.pbi.net> for smp@freebsd.org;
    Mon, 20 May 2002 17:15:16 -0700 (PDT)
Date: Mon, 20 May 2002 17:15:42 -0700
From: Jeffrey Hsu
Subject: socket locks
To: smp@freebsd.org
Message-id: <0GWF00DL3Q1GXZ@mta6.snfc21.pbi.net>
MIME-version: 1.0
X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7BIT
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

> tanimura 2002/05/19 22:41:09 PDT
> Modified files:
> (too many files)
> Log:
> Lock down a socket, milestone 1.
>
> Reviewed by: alfred

This patch mechanically adds a lot of locks around uses of socket fields.
For example, in tcp_output.c:

@@ -889,8 +897,11 @@ send:
 #ifdef IPSEC
         ipsec_setsocket(m, so);
 #endif /*IPSEC*/
+        SOCK_LOCK(so);
+        soopts = (so->so_options & SO_DONTROUTE);
+        SOCK_UNLOCK(so);
         error = ip_output(m, tp->t_inpcb->inp_options, &tp->t_inpcb->inp_route,
-            (so->so_options & SO_DONTROUTE), 0);
+            soopts, 0);
     }

Locking and immediately unlocking accesses to a socket field doesn't
accomplish anything. We want to lock the socket at the beginning of an
operation and release it at the end. This leads to way fewer locks inserted
into the networking code.

Furthermore, these mechanical bottom-up socket locks don't respect
lock ordering with respect to the top-down inpcb locks. Fortunately,
cvs update only resulted in a few conflicts with the inpcb locking code
and I can just turn off the socket lock macros in my tree.
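
The difference between the two locking styles can be sketched in user-space
C. This is only an illustration under stated assumptions, not FreeBSD kernel
code: struct sock, OPT_DONTROUTE, clear_dontroute() and both send_*()
functions are hypothetical names, a pthread mutex stands in for
SOCK_LOCK/SOCK_UNLOCK, and the "second thread" is simulated by a callback
that runs at the first point where the lock is free.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define OPT_DONTROUTE 0x10 /* hypothetical flag standing in for SO_DONTROUTE */

/* Hypothetical stand-in for struct socket: one mutex guards the options. */
struct sock {
    pthread_mutex_t lock;
    int options;
};

/* Simulated concurrent writer; it runs wherever the lock is not held. */
typedef void (*writer_fn)(struct sock *);

static void sock_init(struct sock *so, int options)
{
    int rc = pthread_mutex_init(&so->lock, NULL);
    assert(rc == 0);
    (void)rc;
    so->options = options;
}

/* A correctly locked writer that clears the flag. */
static void clear_dontroute(struct sock *so)
{
    pthread_mutex_lock(&so->lock);
    so->options &= ~OPT_DONTROUTE;
    pthread_mutex_unlock(&so->lock);
}

/*
 * Style used in the patch: lock only around the read.
 * Returns true if the value acted on still matched the socket at time of use.
 */
static bool send_lock_per_read(struct sock *so, writer_fn writer)
{
    int soopts;
    bool consistent;

    pthread_mutex_lock(&so->lock);
    soopts = so->options & OPT_DONTROUTE;
    pthread_mutex_unlock(&so->lock);

    /* Gap: the lock is free, so a writer can change the options here. */
    if (writer != NULL)
        writer(so);

    /* The "output" step now acts on soopts, which may be stale. */
    pthread_mutex_lock(&so->lock);
    consistent = (soopts == (so->options & OPT_DONTROUTE));
    pthread_mutex_unlock(&so->lock);
    return consistent;
}

/* Style argued for above: hold the lock across the whole operation. */
static bool send_lock_per_operation(struct sock *so, writer_fn writer)
{
    int soopts;
    bool consistent;

    pthread_mutex_lock(&so->lock);
    soopts = so->options & OPT_DONTROUTE;
    /* The "output" step runs with the lock held; a writer would block. */
    consistent = (soopts == (so->options & OPT_DONTROUTE));
    pthread_mutex_unlock(&so->lock);

    /* The writer only gets in after the operation has completed. */
    if (writer != NULL)
        writer(so);
    return consistent;
}
```

With a socket initialised with OPT_DONTROUTE set and clear_dontroute() as the
writer, send_lock_per_read() ends up acting on a flag that is no longer set,
while send_lock_per_operation() completes on a consistent view and the writer
takes effect afterwards. The sketch says nothing about lock ordering against
the inpcb locks, which is a separate concern.
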

We should coordinate better on working to lock up the networking code so
we're not working at cross-purposes.

    Jeffrey

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message

From owner-freebsd-smp Tue May 21 7:42:16 2002
Delivered-To: freebsd-smp@freebsd.org
Received: from mail.speakeasy.net (mail17.speakeasy.net [216.254.0.217])
    by hub.freebsd.org (Postfix) with ESMTP id A631037B414
    for ; Tue, 21 May 2002 07:39:34 -0700 (PDT)
Received: (qmail 28411 invoked from network); 21 May 2002 14:39:34 -0000
Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63])
    (envelope-sender )
    by mail17.speakeasy.net (qmail-ldap-1.03) with DES-CBC3-SHA encrypted SMTP
    for ; 21 May 2002 14:39:34 -0000
Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1])
    by server.baldwin.cx (8.11.6/8.11.6) with ESMTP id g4LEdXF95047;
    Tue, 21 May 2002 10:39:33 -0400 (EDT)
    (envelope-from jhb@FreeBSD.org)
Message-ID:
X-Mailer: XFMail 1.5.2 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <0GWF00DL3Q1GXZ@mta6.snfc21.pbi.net>
Date: Tue, 21 May 2002 10:39:13 -0400 (EDT)
From: John Baldwin
To: Jeffrey Hsu
Subject: RE: socket locks
Cc: smp@freebsd.org
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On 21-May-2002 Jeffrey Hsu wrote:
> > tanimura 2002/05/19 22:41:09 PDT
> > Modified files:
> > (too many files)
> > Log:
> > Lock down a socket, milestone 1.
> >
> > Reviewed by: alfred
>
> This patch mechanically adds a lot of locks around uses of socket fields.
> For example, in tcp_output.c:
>
> @@ -889,8 +897,11 @@ send:
>  #ifdef IPSEC
>          ipsec_setsocket(m, so);
>  #endif /*IPSEC*/
> +        SOCK_LOCK(so);
> +        soopts = (so->so_options & SO_DONTROUTE);
> +        SOCK_UNLOCK(so);
>          error = ip_output(m, tp->t_inpcb->inp_options, &tp->t_inpcb->inp_route,
> -            (so->so_options & SO_DONTROUTE), 0);
> +            soopts, 0);
>      }
>
> Locking and immediately unlocking accesses to a socket field doesn't
> accomplish anything. We want to lock the socket at the beginning of an
> operation and release it at the end. This leads to way fewer locks inserted
> into the networking code.

Agreed, the value you read with the lock held above becomes invalid/stale
as soon as you drop the lock, so the lock really isn't doing much good. It
is, however, obfuscating the code and will probably have to be backed out
when the code is more fully locked. Changes like this don't make
reading/writing so_options safe. It is not enough just to lock the reads
and writes; you need to lock the actions taken on those values as well, so
that multiple actions taken on an object all operate on a consistent
snapshot of that object.

> Furthermore, these mechanical bottom-up socket locks don't respect
> lock ordering with respect to the top-down inpcb locks. Fortunately,
> cvs update only resulted in a few conflicts with the inpcb locking code
> and I can just turn off the socket lock macros in my tree.
>
> We should coordinate better on working to lock up the networking code so
> we're not working at cross-purposes.

Can the various folks working on the network stack possibly "sit down" and
write up a description of the overall locking strategy for the stack
including the required lock orders, etc.? That would be a big help in
coordinating things better.

--

John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"
- http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message

From owner-freebsd-smp Tue May 21 7:44: 7 2002
Delivered-To: freebsd-smp@freebsd.org
Received: from mailhost.firstcallgroup.co.uk (dilbert.firstcallgroup.co.uk [194.200.93.142])
    by hub.freebsd.org (Postfix) with ESMTP id A941C37B431;
    Tue, 21 May 2002 07:41:36 -0700 (PDT)
Received: from pfrench by mailhost.firstcallgroup.co.uk with local (Exim 3.34 #1)
    id 17AApO-000J93-00; Tue, 21 May 2002 15:41:30 +0100
To: freebsd-smp@FreeBSD.ORG, stable@FreeBSD.ORG
Subject: Re: 4.6-RC system hangs (fxp0, smp, sym)
In-Reply-To: <15594.22920.415872.835007@moe.cs.duke.edu>
Message-Id:
From: Pete French
Date: Tue, 21 May 2002 15:41:30 +0100
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

> dmesg output and mptable output would go a long ways towards somebody
> being able to help you.

For my system these are attached at the end of this email.

> I seem to remember that you said the machines used to work just fine.
> It would be helpful for you to bracket the breakage to a smaller
> window of time using CVS & doing a binary search on the source trees,
> looking for the date the breakage occurred.

Somewhat harder - I was a bit blase about that update as I had tried it
on several other systems first, so I can tell you that for me it started
on April 8th, but I cannot remember when I updated before that, so I
cannot give you a date when it was actually working :-(

Maybe one of the others with the problem could narrow it down a little
further?

-pcf.

[note that I have since updated to RC-1 to see if this makes any difference
to the stability, and I am no longer running SMP]

--------------------------------------------------------------------

dmesg:

Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.6-RC #0: Fri May 17 15:57:38 BST 2002
    pfrench@tixlink1.firstcallgroup.co.uk:/usr/obj/usr/src/sys/TIXLINK1
Timecounter "i8254" frequency 1193182 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (548.55-MHz 686-class CPU)
  Origin = "GenuineIntel" Id = 0x673 Stepping = 3
  Features=0x383fbff
real memory = 268435456 (262144K bytes)
avail memory = 256303104 (250296K bytes)
Preloaded elf kernel "kernel" at 0xc04ca000.
netsmb_dev: loaded
Pentium Pro MTRR support enabled
md0: Malloc disk
npx0: on motherboard
npx0: INT 16 interface
pcib0: on motherboard
pci0: on pcib0
pci0: at 11.0
pcib1: at device 13.0 on pci0
pci1: on pcib1
tl0: port 0x3800-0x380f mem 0xc6ffddf0-0xc6ffddff irq 11 at device 7.0 on pci1
tl0: Ethernet address: 00:50:8b:8b:de:ab
miibus0: on tl0
nsphy0: on miibus0
nsphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlphy0: on miibus0
tlphy0: 10base2/BNC, 10base5/AUI
sym0: <875> port 0x3000-0x30ff mem 0xc6fff000-0xc6ffffff,0xc6ffdf00-0xc6ffdfff irq 15 at device 9.0 on pci1
sym0: No NVRAM, ID 7, Fast-20, SE, parity checking
sym1: <875> port 0x3400-0x34ff mem 0xc6ffe000-0xc6ffefff,0xc6ffde00-0xc6ffdeff irq 9 at device 9.1 on pci1
sym1: No NVRAM, ID 7, Fast-20, SE, parity checking
pci0: (vendor=0x0e11, dev=0xa0f0) at 14.0
tl1: port 0x2000-0x200f mem 0xc6efeef0-0xc6efeeff irq 5 at device 15.0 on pci0
tl1: Ethernet address: 00:08:c7:84:c7:4e
miibus1: on tl1
nsphy1: on miibus1
nsphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
tlphy1: on miibus1
tlphy1: 10base2/BNC, 10base5/AUI
isab0: at device 20.0 on pci0
isa0: on isab0
atapci0: port 0xf100-0xf10f at device 20.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: irq 0 at device 20.2 on pci0
uhci0: Could not map ports
device_probe_and_attach: uhci0 attach returned 6
chip1: at device 20.3 on pci0
eisa0: on motherboard
mainboard0: on eisa0 slot 0
orm0: