From owner-freebsd-smp Tue May 14 3:17:53 2002
Delivered-To: freebsd-smp@freebsd.org
Received: from rina.r.dl.itc.u-tokyo.ac.jp (cvsup2.r.dl.itc.u-tokyo.ac.jp [133.11.199.247]) by hub.freebsd.org (Postfix) with ESMTP id 372B337B403; Tue, 14 May 2002 03:17:44 -0700 (PDT)
Received: from rina.r.dl.itc.u-tokyo.ac.jp (localhost [127.0.0.1]) by rina.r.dl.itc.u-tokyo.ac.jp (8.12.3+3.5Wbeta/3.7W-rina.r-Nankai-Koya) with ESMTP id g4EAHf3i037251 ; Tue, 14 May 2002 19:17:42 +0900 (JST)
Message-Id: <200205141017.g4EAHf3i037251@rina.r.dl.itc.u-tokyo.ac.jp>
Date: Tue, 14 May 2002 19:17:40 +0900
From: Seigo Tanimura
To: Alfred Perlstein
Cc: Seigo Tanimura, current@FreeBSD.org, smp@FreeBSD.org
Subject: Re: The updated socket patch and axing sotryfree() (Re: Locking down a socket, milestone 1)
In-Reply-To: <20020508161656.GV36741@elvis.mu.org>
References: <200204241110.g3OB8u8t006194@bunko> <200205081159.g48Bx63i045654@rina.r.dl.itc.u-tokyo.ac.jp> <20020508161656.GV36741@elvis.mu.org>
User-Agent: Wanderlust/2.8.1 (Something) SEMI/1.14.3 (Ushinoya) FLIM/1.14.3 (Unebigoryòmae) APEL/10.3 MULE XEmacs/21.1 (patch 14) (Cuyahoga Valley) (i386--freebsd)
Organization: Digital Library Research Division, Information Technology Centre, The University of Tokyo
MIME-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On Wed, 8 May 2002 09:16:56 -0700, Alfred Perlstein said:

bright> * Seigo Tanimura [020508 04:59] wrote:
>>
>> I would like to commit this patch in one or two weeks to start working
>> on a possible race between a user process and a netisr kthread,
>> prevented by only the Giant lock at the moment.
>>
>> When a user process calls sofree() for a listening socket, it attempts
>> to free the sockets in the connection queues by soabort().  If the
>> connection of an aborting socket gets dropped by a remote host (eg by
>> TCP RST), a netisr kthread also attempts to free the socket.  Since
>> the reference count of a socket in a connection queue is zero, this
>> would result in doubly freeing a socket.
>>
>> To solve that problem, I would like to axe sotryfree().  The PCB of a
>> socket and a connection queue should hold a reference to the
>> socket.  This should make the reference count of a live socket always
>> be >= 1, and ensure that there is only one referrer to a socket to be
>> freed.
>>
>> Comments?

bright> I'm not sure exactly how this solves the problem, there may need to
bright> be a global socket mutex, perhaps putting this sort of operation under
bright> that may do what you want.

Yes, at least, we should introduce a global lock to protect the
relation between sockets and PCBs.

bright> Off the top of my head...

bright> I think one way of doing it is storing the hashlist that the socket
bright> belongs to (inpcb_hash) inside the sockets.  That way after a lookup
bright> you will have the lock on the parent structure, a user process will
bright> have to follow the same locking paradigm, basically look at the head
bright> socket, lock the hashlist, then manipulate the incomplete queue.

bright> Basically, protect this sort of operation via the hashlist because
bright> you pretty much need to anyway. :)

In order to solve the deallocation race with a hashlist lock, we
would *always* have to obtain a socket or a PCB by looking it up in a
hashlist.  This is quite problematic because:

1. the lock order between a socket and a PCB gets tangled,

2. a hashlist introduces the overhead of calculating a hash index
   value, and

3. a hashlist lock cannot be per-socket or per-PCB, resulting in
   contention under a huge number of socket operations or incoming
   packets.

In order to keep our lock strategy readable and comprehensible, we
should keep the lock order as simple as the following:

1. a lock to protect the relation between/among objects (eg the
   proctree lock or the allproc lock), and

2. a lock dedicated to a single object (eg a proc lock).

A reference count gives us a flexible way to keep the lock order
clean.  Once we have grabbed a reference to an object, we can unlock
it completely and start over by locking any lock that protects a
relation.

For instance, in the interrupt handler of a protocol stack, you lock a
hashlist to look up the PCB appropriate to an incoming packet.  You
then lock the PCB to do some work.  If you have to modify the socket
corresponding to the PCB, hold a reference to the PCB and unlock it.
Now you can lock the relation between sockets and PCBs to grab the
socket safely.

This strategy should be applicable to a socket operation initiated by
a user process as well.  We will not have to worry about the lock
order between a socket and a PCB.

Another advantage of a reference count is its cost.  Provided that we
hold an appropriate lock, we can simply follow a pointer to obtain an
object.  This is much cheaper than calculating a hash index.  We can
also reduce the contention over a lock because the lock of a
reference count is per-socket or per-PCB.

bright> Other than that, have you looked at what BSD/os does and what Linux
bright> does?  Do they get it wrong or have any particular drawbacks?

BSD/OS seems to ensure the existence of a PCB by locking the hashlist
of PCBs.  I am worried that they hold the hashlist lock for quite a
long time (about half of udp_input() holds a read lock, and almost all
of tcp_input() holds a *write* lock).
-- 
Seigo Tanimura

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message

From owner-freebsd-smp Tue May 14 10:21:40 2002
Delivered-To: freebsd-smp@freebsd.org
Received: from hpux27.dc.engr.scu.edu (hpux27.dc.engr.scu.edu [129.210.16.27]) by hub.freebsd.org (Postfix) with ESMTP id 76FC637B404 for ; Tue, 14 May 2002 10:21:38 -0700 (PDT)
Received: from localhost (dclark@localhost) by hpux27.dc.engr.scu.edu (8.10.2/8.10.2) with ESMTP id g4EHLWB04732 for ; Tue, 14 May 2002 10:21:32 -0700 (PDT)
Date: Tue, 14 May 2002 10:21:32 -0700 (PDT)
From: "Dorr H. Clark"
To: freebsd-smp@FreeBSD.ORG
Subject: hyperthreading? (was Re: question)
In-Reply-To: <20020511012747.GC90188@elvis.mu.org>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On Fri, 10 May 2002, Alfred Perlstein wrote:

> * David [020510 18:08] wrote:
> >
> > Hi just wanna know if newest FreeBSD supports
> > Dual P4-Xeon?
>
> Yes, even "hyperthreading" is supported.

Is this really true?  I thought it was only getting
as far as putting out some misleading boot messages.

-dhc

From owner-freebsd-smp Tue May 14 15:28:44 2002
Delivered-To: freebsd-smp@freebsd.org
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by hub.freebsd.org (Postfix) with ESMTP id 934E737B405 for ; Tue, 14 May 2002 15:28:40 -0700 (PDT)
Received: by elvis.mu.org (Postfix, from userid 1192) id 68869AE1C6; Tue, 14 May 2002 15:28:40 -0700 (PDT)
Date: Tue, 14 May 2002 15:28:40 -0700
From: Alfred Perlstein
To: "Dorr H. Clark"
Cc: freebsd-smp@FreeBSD.ORG
Subject: Re: hyperthreading? (was Re: question)
Message-ID: <20020514222840.GB1585@elvis.mu.org>
References: <20020511012747.GC90188@elvis.mu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.3.27i
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

* Dorr H. Clark [020514 10:21] wrote:
>
> On Fri, 10 May 2002, Alfred Perlstein wrote:
>
> > * David [020510 18:08] wrote:
> > >
> > > Hi just wanna know if newest FreeBSD supports
> > > Dual P4-Xeon?
> >
> > Yes, even "hyperthreading" is supported.
>
> Is this really true?  I thought it was only getting
> as far as putting out some misleading boot messages.

It sure looked like it was working on the machine I was playing with.

Could you be more specific?

-- 
-Alfred Perlstein [alfred@freebsd.org]
'Instead of asking why a piece of software is using "1970s technology,"
 start asking why software is ignoring 30 years of accumulated wisdom.'
Tax deductible donations for FreeBSD: http://www.freebsdfoundation.org/

From owner-freebsd-smp Wed May 15 14:22:59 2002
Delivered-To: freebsd-smp@freebsd.org
Received: from ns2.speedlink.net (mail.speedlink.net [66.163.96.20]) by hub.freebsd.org (Postfix) with ESMTP id C3DE637B407 for ; Wed, 15 May 2002 14:22:56 -0700 (PDT)
Received: from speedlink.net (speedproxy.speedlink.net [66.163.96.23]) by ns2.speedlink.net (Switch-2.0.1/Switch-2.0.1) with ESMTP id g4FLMmQ29261 for ; Wed, 15 May 2002 17:22:48 -0400 (EDT)
Message-ID: <3CE2D242.C487BA41@speedlink.net>
Date: Wed, 15 May 2002 17:25:23 -0400
From: Tom Zamberlan
X-Mailer: Mozilla 4.77 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: freebsd-smp@FreeBSD.org
Subject: ??
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

??

From owner-freebsd-smp Wed May 15 15:25:48 2002
Delivered-To: freebsd-smp@freebsd.org
Received: from sabre.velocet.net (sabre.velocet.net [216.138.209.205]) by hub.freebsd.org (Postfix) with ESMTP id 7275037B404; Wed, 15 May 2002 15:25:43 -0700 (PDT)
Received: from trooper.velocet.ca (trooper.velocet.net [216.138.242.2]) by sabre.velocet.net (Postfix) with ESMTP id DDBDE138287; Wed, 15 May 2002 18:25:38 -0400 (EDT)
Received: by trooper.velocet.ca (Postfix, from userid 101) id 6842124B2; Wed, 15 May 2002 18:06:30 -0400 (EDT)
From: David Gilbert
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <15586.56294.70836.700134@trooper.velocet.net>
Date: Wed, 15 May 2002 18:06:30 -0400
To: freebsd-hackers@freebsd.org, freebsd-smp@freebsd.org
Subject: SMP kernel freezes with nice processes.
X-Mailer: VM 7.00 under 21.1 (patch 14) "Cuyahoga Valley" XEmacs Lucid
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

I run dnetc with an argument to run two (one for each processor).  If
I realtime nice (not nasty) the processes, the computer freezes for a
few seconds every minute or two.  If I have them only regular nice'd,
this does not happen.

I can make a login on the machine available if this helps.

Any ideas?

Dave.

-- 
============================================================================
|David Gilbert, Velocet Communications.       | Two things can only be     |
|Mail:       dgilbert@velocet.net             |  equal if and only if they |
|http://daveg.ca                              |   are precisely opposite.  |
=========================================================GLO================

From owner-freebsd-smp Wed May 15 16:27:49 2002
Delivered-To: freebsd-smp@freebsd.org
Received: from boreas.isi.edu (boreas.isi.edu [128.9.160.161]) by hub.freebsd.org (Postfix) with ESMTP id C5CE237B41E; Wed, 15 May 2002 16:27:21 -0700 (PDT)
Received: from isi.edu (agfxnoy0h1odd4vc@hbo.isi.edu [128.9.160.75]) by boreas.isi.edu (8.11.6/8.11.2) with ESMTP id g4FNRJF15936; Wed, 15 May 2002 16:27:19 -0700 (PDT)
Message-ID: <3CE2EED5.2010003@isi.edu>
Date: Wed, 15 May 2002 16:27:17 -0700
From: Lars Eggert
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:0.9.9) Gecko/20020404
X-Accept-Language: en-us, de-de
MIME-Version: 1.0
To: David Gilbert
Cc: freebsd-hackers@freebsd.org, freebsd-smp@freebsd.org
Subject: Re: SMP kernel freezes with nice processes.
References: <15586.56294.70836.700134@trooper.velocet.net>
Content-Type: multipart/signed; protocol="application/x-pkcs7-signature"; micalg=sha1; boundary="------------ms090203000900010207080306"
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

This is a cryptographically signed message in MIME format.

--------------ms090203000900010207080306
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit

David Gilbert wrote:
> I run dnetc with an argument to run two (one for each processor).  If
> I realtime nice (not nasty) the processes, the computer freezes for a
> few seconds every minute or two.  If I have them only regular nice'd,
> this does not happen.

"realtime nice" = idprio? If so, probably priority inversion.
Not much you can do about that without looking at the dnetc source and
finding out which resources it holds locked.

Lars
-- 
Lars Eggert                     USC Information Sciences Institute

--------------ms090203000900010207080306--

From owner-freebsd-smp Thu May 16 4:40:34 2002
Delivered-To: freebsd-smp@freebsd.org
Received: from klima.physik.uni-mainz.de (klima.Physik.Uni-Mainz.DE [134.93.180.162]) by hub.freebsd.org (Postfix) with ESMTP id 4624037B40C for ; Thu, 16 May 2002 04:40:06 -0700 (PDT)
Received: from klima.Physik.Uni-Mainz.DE (administrator@klima.Physik.Uni-Mainz.DE [134.93.180.162]) by klima.physik.uni-mainz.de (8.12.3/8.12.2) with ESMTP id g4GBe4b0017940 for ; Thu, 16 May 2002 13:40:05 +0200 (CEST) (envelope-from administrator@klima.physik.uni-mainz.de)
Date: Thu, 16 May 2002 13:40:04 +0200 (CEST)
From: Administrator IPA
To: freebsd-smp@freebsd.org
Subject: RE: SMP problems with Fujitsu Siemens Primergy P200 (fwd)
Message-ID: <20020516133347.W17595-100000@klima.physik.uni-mainz.de>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

Dear Sirs.

On older SMP mainboards I experienced similar behaviour. The only
solution was to disable a second or third NIC (in one machine I have
two Intel NICs, fxp0 and fxp1). The second thing I did was to move
several cards between PCI slots. Our problem occurred when the
firmware of our AMI Enterprise 1600 RAID controller was updated.
It looked like an IRQ routing problem; I think this is a frequently
reported phenomenon with high-end server boards and FreeBSD. After
trying the additional NIC in each of the 33MHz PCI slots, disabling all
ACPI support (in the BIOS and in the kernel) and moving the RAID
controller to its appropriate 66MHz PCI slot, the obscure freezing of
the system went away. I must add that we never re-enabled the built-in
NIC, so the problem may come back if this NIC is activated again ...

--
Best regards
O. Hartmann

ohartman@klima.physik.uni-mainz.de
----------------------------------------------------------------
IT-Administration des Institut fuer Physik der Atmosphaere (IPA)
----------------------------------------------------------------
Johannes Gutenberg Universitaet Mainz
Becherweg 21
55099 Mainz

Tel: +496131/3924662 (machine room)
Tel: +496131/3924144
FAX: +496131/3923532

---------- Forwarded message ----------
Date: Mon, 13 May 2002 13:52:48 +0200
From: Simon L. Nielsen
To: freebsd-stable@FreeBSD.ORG
Subject: SMP problems with Fujitsu Siemens Primergy P200

Hello

I'm having some problems using SMP on a Fujitsu Siemens Primergy P200
with FreeBSD 4.5-RELEASE-p4. It's a dual Intel Pentium 3 1266MHz with
1GB ECC RAM. It has a Mylex AcceleRAID 352 controller connected to 4
SCSI disks, and a normal ATAPI CD-ROM drive. It also has an on-board
Symbios SCSI controller but this is not being used at present.

When booting an SMP kernel it simply hangs after initializing the
second CPU. The computer is not completely frozen since I can break to
the kernel debugger, but it never gets any further. However, I don't
really know enough about the internals of the kernel to do anything
useful in the kernel debugger... The computer works fine with a
single-CPU kernel.

I have also tried to boot another computer with the exact same
hardware, also without success, so it would not appear to be "normal"
defective hardware.

I have tried booting a 4.6-PRERELEASE (from today) and 5.0DP1 without
success. The 4.6 fails in the same way as 4.5, but it does detect the
CD-ROM drive after it has written "AP CPU #1 Launched" (I have not
included the output since it is almost the same and rather large). The
5.0DP1 just does a kernel panic after detecting the Mylex controller
but I don't think that is really related to the main problem.

I have included output from a verbose boot and output from mptable.

Any hints on what might be wrong or suggestions how to fix the problem
would be very much appreciated.

Verbose boot message from 4.5-RELEASE-p4 :

SMAP type=01 base=00000000 00000000 len=00000000 0009e400
SMAP type=02 base=00000000 0009e400 len=00000000 00001c00
SMAP type=02 base=00000000 000ca000 len=00000000 00002000
SMAP type=02 base=00000000 000e0000 len=00000000 00020000
SMAP type=01 base=00000000 00100000 len=00000000 3fdf0000
SMAP type=03 base=00000000 3fef0000 len=00000000 0000f000
SMAP type=04 base=00000000 3feff000 len=00000000 00001000
SMAP type=01 base=00000000 3ff00000 len=00000000 00100000
SMAP type=02 base=00000000 fec00000 len=00000000 00010000
SMAP type=02 base=00000000 fee00000 len=00000000 00001000
SMAP type=02 base=00000000 ffc00000 len=00000000 00400000
Copyright (c) 1992-2002 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD 4.5-RELEASE-p4 #2: Fri May 10 23:01:17 GST 2002
    root@hd1.test.siemens.com:/usr/src/sys/compile/DEM-HARVESTER2
Calibrating clock(s) ...
TSC clock: 1260490417 Hz, i8254 clock: 1193080 Hz
CLK_USE_I8254_CALIBRATION not specified - using default frequency
Timecounter "i8254" frequency 1193182 Hz
CLK_USE_TSC_CALIBRATION not specified - using old calibration method
CPU: Pentium III/Pentium III Xeon/Celeron (1260.61-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x6b1  Stepping = 1
  Features=0x383fbff
real memory = 1073741824 (1048576K bytes)
Physical memory chunk(s):
0x00001000 - 0x0009dfff, 643072 bytes (157 pages)
0x004db000 - 0x3feeffff, 1067536384 bytes (260629 pages)
0x3ff00000 - 0x3fff7fff, 1015808 bytes (248 pages)
avail memory = 1040273408 (1015892K bytes)
Programming 16 pins in IOAPIC #0
Programming 16 pins in IOAPIC #1
SMP: CPU0 apic_initialize():
     lint0: 0x00000700 lint1: 0x00010400 TPR: 0x00000010 SVR: 0x000001ff
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id: 3, version: 0x00040011, at 0xfee00000
 cpu1 (AP): apic id: 0, version: 0x00040011, at 0xfee00000
 io0 (APIC): apic id: 1, version: 0x000f0011, at 0xfec00000
 io1 (APIC): apic id: 2, version: 0x000f0011, at 0xfec01000
bios32: Found BIOS32 Service Directory header at 0xc00f6ba0
bios32: Entry = 0xfd890 (c00fd890) Rev = 0 Len = 1
pcibios: PCI BIOS entry at 0x11a
pnpbios: Found PnP BIOS data at 0xc00f6c30
pnpbios: Entry = f0000:9dd0 Rev = 1.0
Other BIOS signatures found:
ACPI: 000f6c00
Preloaded elf kernel "kernel.old" at 0xc04b1000.
Pentium Pro MTRR support enabled
md0: Malloc disk
Creating DISK md0
Math emulator present
SMP: CPU0 bsp_apic_configure():
     lint0: 0x00010700 lint1: 0x00000400 TPR: 0x00000010 SVR: 0x000001ff
pci_open(1): mode 1 addr port (0x0cf8) is 0x80000270
pci_open(1a): mode1res=0x80000000 (0x80000000)
pci_cfgcheck: device 0 [class=060000] [hdr=80] is there (id=00081166)
Using $PIR table, 15 entries at 0xc00fded0
pcib-: pcib1 exists, using next available unit number
pcib-: pcib2 exists, using next available unit number
npx0: on motherboard
npx0: INT 16 interface
pcib1: on motherboard
IOAPIC #1 intpin 13 -> irq 5
Freeing (NOT implemented) redirected PCI irq 10.
found-> vendor=0x1000, dev=0x0021, revid=0x01
    class=01-00-00, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    intpin=a, irq=5
pci1: on pcib1
sym0: <1010-66> irq 5 at device 10.0 on pci1
sym0: failed to allocate MMIO resources
device_probe_and_attach: sym0 attach returned 6
pcib0: on motherboard
found-> vendor=0x1166, dev=0x0008, revid=0x23
    class=06-00-00, hdrtype=0x00, mfdev=1
    subordinatebus=0 secondarybus=0
found-> vendor=0x1166, dev=0x0008, revid=0x01
    class=06-00-00, hdrtype=0x00, mfdev=1
    subordinatebus=0 secondarybus=0
found-> vendor=0x1166, dev=0x0006, revid=0x01
    class=06-00-00, hdrtype=0x00, mfdev=1
    subordinatebus=0 secondarybus=0
found-> vendor=0x1166, dev=0x0006, revid=0x01
    class=06-00-00, hdrtype=0x00, mfdev=1
    subordinatebus=0 secondarybus=0
found-> vendor=0x1002, dev=0x4752, revid=0x27
    class=03-00-00, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    map[10]: type 1, range 32, base f5000000, size 24
    map[14]: type 1, range 32, base 00001000, size 8
    map[18]: type 1, range 32, base f4020000, size 12
IOAPIC #1 intpin 14 -> irq 9
Freeing (NOT implemented) redirected PCI irq 11.
found-> vendor=0x8086, dev=0x1229, revid=0x09
    class=02-00-00, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    intpin=a, irq=9
    map[10]: type 1, range 32, base f4021000, size 12
    map[14]: type 1, range 32, base 00001400, size 6
    map[18]: type 1, range 32, base f4000000, size 17
found-> vendor=0x1166, dev=0x0200, revid=0x51
    class=06-01-00, hdrtype=0x00, mfdev=1
    subordinatebus=0 secondarybus=0
found-> vendor=0x1166, dev=0x0211, revid=0x00
    class=01-01-8a, hdrtype=0x00, mfdev=1
    subordinatebus=0 secondarybus=0
    map[20]: type 1, range 32, base 00001800, size 4
IOAPIC #1 intpin 12 -> irq 10
Freeing (NOT implemented) redirected PCI irq 9.
found-> vendor=0x1166, dev=0x0220, revid=0x04
    class=0c-03-10, hdrtype=0x00, mfdev=1
    subordinatebus=0 secondarybus=0
    intpin=a, irq=10
    map[10]: type 1, range 32, base f4022000, size 12
pci0: on pcib0
pci0: (vendor=0x1002, dev=0x4752) at 4.0
fxp0: port 0x1400-0x143f mem 0xf4000000-0xf401ffff,0xf4021000-0xf4021fff irq 9 at device 10.0 on pci0
fxp0: using memory space register mapping
fxp0: Ethernet address 00:30:05:29:07:61
fxp0: PCI IDs: 8086 1229 110a 004b 0009
fxp0: Dynamic Standby mode is disabled
inphy0: on miibus0
inphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
bpf: fxp0 attached
isab0: at device 15.0 on pci0
isa0: on isab0
atapci0: port 0x1800-0x180f at device 15.1 on pci0
ata0: iobase=0x01f0 altiobase=0x03f6 bmaddr=0x1800
ata0: mask=03 status0=20 status1=30
ata0: mask=03 ostat0=20 ostat2=30
ata0-master: ATAPI probe a=20 b=20
ata0-slave: ATAPI probe a=30 b=30
ata0: mask=03 status0=20 status1=30
ata0-master: ATA probe a=25 b=25
ata0-slave: ATA probe a=25 b=25
ata0: devices=00
ata0: at 0x1f0 irq 14 on atapci0
ata1: iobase=0x0170 altiobase=0x0376 bmaddr=0x1808
ata1: mask=03 status0=50 status1=00
ata1: mask=03 ostat0=50 ostat2=00
ata1-master: ATAPI probe a=14 b=eb
ata1-slave: ATAPI probe a=14 b=eb
ata1: mask=03 status0=00 status1=00
ata1: devices=0c
ata1: at 0x170 irq 15 on atapci0
ohci0: mem 0xf4022000-0xf4022fff irq 10 at device 15.2 on pci0
ohci0: (New OHCI DeviceId=0x02201166)
usb0: OHCI version 1.0, legacy support
usb0: on ohci0
usb0: USB revision 1.0
uhub0: (unknown) OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 4 ports with 4 removable, self powered
pcib2: on motherboard
found-> vendor=0x1011, dev=0x0026, revid=0x02
    class=06-04-00, hdrtype=0x01, mfdev=0
    subordinatebus=3 secondarybus=3
IOAPIC #1 intpin 9 -> irq 11
found-> vendor=0x1069, dev=0x0050, revid=0x02
    class=01-04-00, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    intpin=a, irq=11
    map[10]: type 1, range 32, base f8000000, size 26
pci2: on pcib2
pcib4: at device 10.0 on pci2
IOAPIC #1 intpin 11 -> irq 16
Freeing (NOT implemented) redirected PCI irq 5.
found-> vendor=0x9004, dev=0x6915, revid=0x03
    class=02-00-00, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    intpin=a, irq=16
    map[10]: type 1, range 32, base f6400000, size 19
    map[14]: type 1, range 32, base 00002000, size 8
IOAPIC #1 intpin 8 -> irq 17
Freeing (NOT implemented) redirected PCI irq 9.
found-> vendor=0x9004, dev=0x6915, revid=0x03
    class=02-00-00, hdrtype=0x00, mfdev=0
    subordinatebus=0 secondarybus=0
    intpin=a, irq=17
    map[10]: type 1, range 32, base f6480000, size 19
    map[14]: type 1, range 32, base 00002400, size 8
pci3: on pcib4
sf0: port 0x2000-0x20ff mem 0xf6400000-0xf647ffff irq 16 at device 4.0 on pci3
sf0: Ethernet address: 00:00:d1:9d:b7:a4
miibus1: on sf0
ukphy0: on miibus1
ukphy0: OUI 0x0005be, model 0x0003, rev. 1
ukphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
bpf: sf0 attached
sf1: port 0x2400-0x24ff mem 0xf6480000-0xf64fffff irq 17 at device 5.0 on pci3
sf1: Ethernet address: 00:00:d1:9d:b7:a5
miibus2: on sf1
ukphy1: on miibus2
ukphy1: OUI 0x0005be, model 0x0003, rev. 1
ukphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
bpf: sf1 attached
mly0: mem 0xf8000000-0xfbffffff irq 11 at device 12.0 on pci2
mly0: AcceleRAID 352 , 2 channels, firmware 6.00-17-00 (20020218), 64MB RAM
mly0: Siemens AcceleRAID 352 (1e), 33MHz 64-bit PCI
mly0: 64MB 66MHz 64-bit SDRAM+ECC, cache 56MB
mly0: CPU: i960RN @ 100MHZ
mly0: 4MB 66MHz 64-bit private SDRAM+ECC
mly0: battery backup not installed
mly0: maximum data transfer 2048 blocks, maximum sg entries/command 257
mly0: logical devices present/critical/offline 1/0/0
mly0: physical devices present 7
mly0: physical disks present/offline 4/0
mly0: 2 physical channels, 2 virtual channels of 2 possible
mly0: 512 parallel commands supported
mly0: 1MB flash ROM, 0 of 100000 maximum cycles
pci-: pci3 exists, using next available unit number
pcib3: on motherboard
pci4: on pcib3
ex_isa_identify()
ata-: ata0 exists, using next available unit number
ata-: ata1 exists, using next available unit number
Trying Read_Port at 203
Trying Read_Port at 243
Trying Read_Port at 283
Trying Read_Port at 2c3
Trying Read_Port at 303
Trying Read_Port at 343
Trying Read_Port at 383
Trying Read_Port at 3c3
isa_probe_children: disabling PnP devices
isa_probe_children: probing non-PnP devices
orm0: