From owner-freebsd-stable@FreeBSD.ORG Sun May 26 11:38:50 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 650EEEB5; Sun, 26 May 2013 11:38:50 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pd0-f172.google.com (mail-pd0-f172.google.com [209.85.192.172]) by mx1.freebsd.org (Postfix) with ESMTP id 30F672E5; Sun, 26 May 2013 11:38:49 +0000 (UTC) Received: by mail-pd0-f172.google.com with SMTP id 10so5668511pdi.31 for ; Sun, 26 May 2013 04:38:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=i4uAv6OGVo1xDTtQsKVv1UF/kIwj5jqGCVcZJ3JPIBU=; b=M0bqYMsdqc2D+6+v+OjD2EB0zKxyNOZRAdNed15+2ivUsXvuxjxKPzcOsy/DtnHaJA /lrYWgF3cWQD20sB9gMvR6jypOJIaTaeEu5v5htM92JgCa4NvgcJRaElCZQI9okCQeH1 9BXEX4zNveN2jTAbwrioQ8QI9EkVQ1wLGPOxuOorZ51eQTNpMNGiwBR6G26q0Pdbgvgr xL/Mq4VgKNKq4yplkugiTBLD4JO9+b9aQBHjnipyVy3dWMTsotLXo9Va0YveXO2FcuOj oWzEGISSIg73A84ubtKLu4otjbI4ePBkYD7z6G66mMOGjKBsEdTk8UoMpCVpI/nrmR16 Qo0A== X-Received: by 10.66.139.198 with SMTP id ra6mr25570334pab.140.1369568328482; Sun, 26 May 2013 04:38:48 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. 
[114.111.62.249]) by mx.google.com with ESMTPSA id zt1sm7139544pbb.15.2013.05.26.04.38.44 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Sun, 26 May 2013 04:38:47 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Sun, 26 May 2013 20:38:41 +0900 From: YongHyeon PYUN Date: Sun, 26 May 2013 20:38:41 +0900 To: Hiroki Sato Subject: Re: Apparent fxp regression in FreeBSD 8.4-RC3 Message-ID: <20130526113841.GA1511@michelle.cdnetworks.com> References: <20130524044919.GA41292@icarus.home.lan> <20130524054720.GA1496@michelle.cdnetworks.com> <20130524.162926.395058052118975996.hrs@allbsd.org> <20130524.163646.628115045676432731.hrs@allbsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130524.163646.628115045676432731.hrs@allbsd.org> User-Agent: Mutt/1.4.2.3i Cc: jdc@koitsu.org, gjb@freebsd.org, freebsd-stable@freebsd.org, re@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 May 2013 11:38:50 -0000 On Fri, May 24, 2013 at 04:36:46PM +0900, Hiroki Sato wrote: > Hiroki Sato wrote > in <20130524.162926.395058052118975996.hrs@allbsd.org>: > > hr> YongHyeon PYUN wrote > hr> in <20130524054720.GA1496@michelle.cdnetworks.com>: > hr> > hr> A workaround is specifying the following line in rc.conf: > hr> > hr> ifconfig_fxp0="DHCP media 100baseTX mediaopt full-duplex" > > Hmm, I guess this can happen on other NICs when the link negotiation > causes a link-state flap. Is it true? Probably not. AFAIK fxp(4) is the only controller that requires two full resets to support flow control. Multicast programming for fxp(4) also requires full controller reset so trying to renew its existing lease for fxp(4) looks wrong to me. 
> > -- Hiroki From owner-freebsd-stable@FreeBSD.ORG Mon May 27 00:09:44 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2680052A; Mon, 27 May 2013 00:09:44 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pb0-x232.google.com (mail-pb0-x232.google.com [IPv6:2607:f8b0:400e:c01::232]) by mx1.freebsd.org (Postfix) with ESMTP id E8994373; Mon, 27 May 2013 00:09:43 +0000 (UTC) Received: by mail-pb0-f50.google.com with SMTP id wy17so6139459pbc.9 for ; Sun, 26 May 2013 17:09:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=CVCs7nkSYd1vKjecbi3gyaE0WPGuon9HqpnP4/1l0Gg=; b=YbnRGwcC0vkt3/4eK+670HLUDQagDpHUN+H+wLO7dNCVZ/JX6G/Yk+JB5klhQyw6qA BXR6ECdd5gTdo3cF2cowuyN+nO4wytr44iMepoyTqn9hPE2SRm8etJJ6kU6WBsJ3275T J/KrnW6Z/8/13ZtlWbopEIjL14zALnYUd1zUHUKJ+9m7sYrs3R7e9dAva4Z+Lv4EVk0W 7y1E35PFAIhCobUYWYPLuya6OMUYmJ2nubAMNFBBtH9iDxkrCxAhqH9gKgc0V/5SquZq T7YddUc0B/LYJGox13D8ty0PEd3nh29kVV9qpm28KLPquCBKMpP84hm+LooIGnzl+rKO JUXA== X-Received: by 10.68.113.194 with SMTP id ja2mr26520691pbb.65.1369613382352; Sun, 26 May 2013 17:09:42 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. 
[114.111.62.249]) by mx.google.com with ESMTPSA id tb7sm26072566pbc.14.2013.05.26.17.09.38 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Sun, 26 May 2013 17:09:41 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Mon, 27 May 2013 09:09:34 +0900 From: YongHyeon PYUN Date: Mon, 27 May 2013 09:09:34 +0900 To: Charles Sprickman Subject: Re: Apparent fxp regression in FreeBSD 8.4-RC3 Message-ID: <20130527000934.GA3227@michelle.cdnetworks.com> References: <20130524010943.GA37252@icarus.home.lan> <20130524012117.GE1672@glenbarber.us> <20130524030351.GA39091@icarus.home.lan> <20130524031303.GC28865@glenbarber.us> <20130524033806.GA39720@icarus.home.lan> <20130524034244.GD28865@glenbarber.us> <20130524044035.GA40957@icarus.home.lan> <20130524044919.GA41292@icarus.home.lan> <20130524054720.GA1496@michelle.cdnetworks.com> <91E1EEEC-CD78-4E9B-B71A-A8B4F5417D81@bway.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <91E1EEEC-CD78-4E9B-B71A-A8B4F5417D81@bway.net> User-Agent: Mutt/1.4.2.3i Cc: Jeremy Chadwick , Glen Barber , freebsd-stable@freebsd.org, FreeBSD Release Engineering Team X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 May 2013 00:09:44 -0000 On Fri, May 24, 2013 at 03:32:29AM -0400, Charles Sprickman wrote: > > On May 24, 2013, at 1:47 AM, YongHyeon PYUN wrote: > > > On Thu, May 23, 2013 at 09:49:19PM -0700, Jeremy Chadwick wrote: > >> On Thu, May 23, 2013 at 09:40:35PM -0700, Jeremy Chadwick wrote: > >>> On Thu, May 23, 2013 at 11:42:44PM -0400, Glen Barber wrote: > >>>> On Thu, May 23, 2013 at 08:38:06PM -0700, Jeremy Chadwick wrote: > >>>>> If someone wants me to test DHCP via fxp(4) on the above system (I can > >>>>> do so with both NICs), just let me know; it should only 
take me half an > >>>>> hour or so. > >>>>> > >>>>> I'll politely wait for someone to say "please do so" else won't bother. > >>>>> > >>>> > >>>> For the sake of completeness... > >>>> > >>>> "Please do so." :) > >>> > >>> Issue reproduced 100% reliably, even within sysinstall. > >>> > >>> {snip} > >> > >> Forgot to add: > >> > >> This issue ONLY happens when using DHCP. > >> > >> Statically assigning the IP address works fine; fxp0 goes down once, > >> up once, then stays up indefinitely. > > > > I asked Mike to try backing out dhclient(8) change(r247336) but it > > seems he missed that. Jeremy, could you try that? > > I have a system up and running and showing the problem (that was > non-trival, just for the record - one machine blew the PSU after > POST, the other refused to boot off an IDE drive, and then required > two CD-ROM drives before I found a functional one, and it took a > good half-hour to find what's apparently the last piece of writable > CD-R media I own). > > I am not awesome with svn, but I'll see if I can manually undo > r247336 and give it a spin. Download http://svnweb.freebsd.org/base/stable/8/sbin/dhclient/dhclient.c?r1=231278&r2=247336&view=patch And apply the patch with -R. 
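
The back-out step described above (save the r231278:r247336 diff and apply it with -R) could be sketched roughly as follows. This is only an illustration under stated assumptions: `reverse_patch` is a hypothetical helper, not something from the thread, and the /tmp file name, the /usr/src source location and the rebuild commands in the comments are assumptions about a typical source checkout.

```shell
#!/bin/sh
# Sketch only: reverse-apply (back out) a previously saved diff.
# reverse_patch is a hypothetical helper, not part of FreeBSD.
#   $1 = directory holding the file the diff touches
#   $2 = path to the saved diff
reverse_patch() {
    srcdir=$1
    patchfile=$2
    # -R makes patch(1) treat the diff as reversed, so a change that the
    # diff added is taken back out of the working file.
    ( cd "$srcdir" && patch -R < "$patchfile" )
}

# Assumed usage on the affected machine (paths are assumptions):
#   fetch -o /tmp/r247336.diff \
#     'http://svnweb.freebsd.org/base/stable/8/sbin/dhclient/dhclient.c?r1=231278&r2=247336&view=patch'
#   reverse_patch /usr/src/sbin/dhclient /tmp/r247336.diff
#   cd /usr/src/sbin/dhclient && make && make install
```

Wrapping the step in a function keeps the `cd` inside a subshell, so the caller's working directory is unchanged whether or not the patch applies cleanly.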
From owner-freebsd-stable@FreeBSD.ORG Mon May 27 04:39:38 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id ABFA581F; Mon, 27 May 2013 04:39:38 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pa0-f52.google.com (mail-pa0-f52.google.com [209.85.220.52]) by mx1.freebsd.org (Postfix) with ESMTP id 74B879E; Mon, 27 May 2013 04:39:38 +0000 (UTC) Received: by mail-pa0-f52.google.com with SMTP id bg2so6423912pad.11 for ; Sun, 26 May 2013 21:39:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=AWG0/dAyAQ2F88Dy7RukTSczJvH1C6VsbkRHqatZs1E=; b=r7X7cHsIdkgdrVqs0JKT8apy/3AYAJYoLyUogtMPGyc8uCHHr0MLots0RHWS0IeRrj jmHXfC8TF5n+/w7ADiCC2kI6gB69lNi4TcLNouLU0ggquqmHgy9qNcTbZVD8tSt0mbgP c215Wh5oogJVZbtww+xPkBkVK3gx79U++x6kZLHwrD8+NV1t/OVrBnZtjL1pc6UMHjLZ Ha1FIPPOZ5oIktn3t6zNzMF8POBYIXduATQ03PcimzrskK5L79qbIwz5bXN35KHVpA9O UsggN3k33BQp94ZIZzGhmDak2OP11JpALAxVWfxJNVqBVzgky5oq+JzNn4dahtCVqJun uIJw== X-Received: by 10.68.12.98 with SMTP id x2mr27985359pbb.92.1369629571647; Sun, 26 May 2013 21:39:31 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. 
[114.111.62.249]) by mx.google.com with ESMTPSA id 3sm26878379pbj.46.2013.05.26.21.39.27 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Sun, 26 May 2013 21:39:30 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Mon, 27 May 2013 13:39:23 +0900 From: YongHyeon PYUN Date: Mon, 27 May 2013 13:39:23 +0900 To: Hiroki Sato Subject: Re: Apparent fxp regression in FreeBSD 8.4-RC3 Message-ID: <20130527043923.GA1480@michelle.cdnetworks.com> References: <20130524044919.GA41292@icarus.home.lan> <20130524054720.GA1496@michelle.cdnetworks.com> <20130524.162926.395058052118975996.hrs@allbsd.org> <20130524.163646.628115045676432731.hrs@allbsd.org> <20130526113841.GA1511@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130526113841.GA1511@michelle.cdnetworks.com> User-Agent: Mutt/1.4.2.3i Cc: jdc@koitsu.org, gjb@freebsd.org, freebsd-stable@freebsd.org, re@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 May 2013 04:39:38 -0000 On Sun, May 26, 2013 at 08:38:41PM +0900, YongHyeon PYUN wrote: > On Fri, May 24, 2013 at 04:36:46PM +0900, Hiroki Sato wrote: > > Hiroki Sato wrote > > in <20130524.162926.395058052118975996.hrs@allbsd.org>: > > > > hr> YongHyeon PYUN wrote > > hr> in <20130524054720.GA1496@michelle.cdnetworks.com>: > > hr> > > hr> A workaround is specifying the following line in rc.conf: > > hr> > > hr> ifconfig_fxp0="DHCP media 100baseTX mediaopt full-duplex" > > > > Hmm, I guess this can happen on other NICs when the link negotiation > > causes a link-state flap. Is it true? > > Probably not. AFAIK fxp(4) is the only controller that requires two > full resets to support flow control. 
Multicast programming for > fxp(4) also requires full controller reset so trying to renew its > existing lease for fxp(4) looks wrong to me. > After reading code again, I think the dhclient change may affect all controllers that don't have protection against multiple initialization of upper stack. if_init() of driver is called whenever an IP address is assigned to an interface. The stack could be changed to call if_init() only when IFF_DRV_RUNNING flag is not set but that would break old drivers which may require full controller reset for multicast filter reprogramming. I also guess there may be several drivers that do not implement reinitialization protection in arm/mips. It seems fxp(4)'s simple protection against unnecessary controller initialization does not work well due to the limitation of controller. We may be able to improve fxp(4) case but other old/buggy drivers should be fixed too. From owner-freebsd-stable@FreeBSD.ORG Mon May 27 06:43:30 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 19808E65 for ; Mon, 27 May 2013 06:43:30 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pa0-f45.google.com (mail-pa0-f45.google.com [209.85.220.45]) by mx1.freebsd.org (Postfix) with ESMTP id E9E486C1 for ; Mon, 27 May 2013 06:43:29 +0000 (UTC) Received: by mail-pa0-f45.google.com with SMTP id tj12so1546786pac.18 for ; Sun, 26 May 2013 23:43:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=eSZTC+I/n6tgsSV2jsUWnma7+7Y8Uo6fj4zYEHzvq14=; b=dFL+k+edTyuHtdccMyPy6m9xJoYhA9rtMonmU1VlnFUwf/p9GC7N0YO1NnfS79ejzO z/DNohe/uBe6oPm+Th3mxK2iGX/FV69kv0YF+7aAJ/MmWTOF/RBxLiG365ilPW8UjOBL hlBzLYBs4U1LOcByQoSD9WXHKULqSggoMB0NfMYXGGdOELnD4l2akRtrsjhSe6PPwRth 
rMOj4JTu98HR7OjAOUzJboBZ1kB8kX7dfZ5C2GPRCLMZ3uTXmc6UI4zdbjP9gSwPGTwh 4u5i8vIp5iE0ETwYII4D1Yir0HfsnHnblv+RpKsKPHgdoFx8TTJMEg+Cj4IUCK0ObQLc HVPg== X-Received: by 10.68.222.8 with SMTP id qi8mr28060207pbc.7.1369637008389; Sun, 26 May 2013 23:43:28 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPSA id cd2sm27340262pbd.35.2013.05.26.23.43.24 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Sun, 26 May 2013 23:43:26 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Mon, 27 May 2013 15:43:20 +0900 From: YongHyeon PYUN Date: Mon, 27 May 2013 15:43:20 +0900 To: Daniel Braniss Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP Message-ID: <20130527064320.GB1480@michelle.cdnetworks.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 May 2013 06:43:30 -0000 On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. > is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. > To check, I upgraded another identical host, and the same problem appears. What is the last known working revision? > There > is not correlation with time, since they happend at totaly different times. > I rebooted both hosts at almost the same time. 
> one host : > uptime: 5:24PM up 6:15, 0 users, load averages: 0.00, 0.00, 0.00 > May 24 12:53:52 sf-04 kernel: bge1: link state changed to DOWN > May 24 12:53:55 sf-04 kernel: bge1: link state changed to UP > May 24 15:34:25 sf-04 kernel: bge1: link state changed to DOWN > May 24 15:34:28 sf-04 kernel: bge1: link state changed to UP > > and > uptime: 5:24PM up 6:14, 0 users, load averages: 0.00, 0.00, 0.00 > > May 24 16:30:44 sf-10 kernel: bge1: link state changed to DOWN > May 24 16:30:44 sf-10 kernel: bge1: link state changed to UP > > this is not serious, the ilo (ssh) connection is ok, but it's anoying, we have > more > than 10 of this hosts, and if I upgrade all of them, the logs will fill up > with this :-) > > any ideas? > > cheers, > danny From owner-freebsd-stable@FreeBSD.ORG Mon May 27 07:59:39 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 141A9F78 for ; Mon, 27 May 2013 07:59:39 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id C18A7A6E for ; Mon, 27 May 2013 07:59:38 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1UgsL2-000DBa-El; Mon, 27 May 2013 10:59:28 +0300 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: pyunyh@gmail.com Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP In-reply-to: Your message of Mon, 27 May 2013 15:43:20 +0900. 
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Mon, 27 May 2013 10:59:28 +0300 From: Daniel Braniss Message-ID: Cc: freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 May 2013 07:59:39 -0000 > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. > bge0: mem 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6 bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz miibus2: on bge0 brgphy0: PHY 1 on miibus2 brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge0: Ethernet address: 00:1b:24:5d:5b:bd bge1: mem 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6 bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz miibus3: on bge1 brgphy1: PHY 1 on miibus3 brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow bge1: Ethernet address: 00:1b:24:5d:5b:be sf-10> ifconfig bge1 bge1: flags=8802 metric 0 mtu 1500 options=8009b ether 00:1b:24:5d:5b:be nd6 options=21 media: Ethernet autoselect (100baseTX ) status: active > > is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. > > To check, I upgraded another identical host, and the same problem appears. > > What is the last known working revision? I have no idea, but I have older versions, and ill start from the oldets (9.1-prerelease), but it will take time, since it takes hours till it happens. 
> > > There > > is not correlation with time, since they happend at totaly different times. > > I rebooted both hosts at almost the same time. > > one host : > > uptime: 5:24PM up 6:15, 0 users, load averages: 0.00, 0.00, 0.00 > > May 24 12:53:52 sf-04 kernel: bge1: link state changed to DOWN > > May 24 12:53:55 sf-04 kernel: bge1: link state changed to UP > > May 24 15:34:25 sf-04 kernel: bge1: link state changed to DOWN > > May 24 15:34:28 sf-04 kernel: bge1: link state changed to UP > > > > and > > uptime: 5:24PM up 6:14, 0 users, load averages: 0.00, 0.00, 0.00 > > > > May 24 16:30:44 sf-10 kernel: bge1: link state changed to DOWN > > May 24 16:30:44 sf-10 kernel: bge1: link state changed to UP > > > > this is not serious, the ilo (ssh) connection is ok, but it's anoying, we have > > more > > than 10 of this hosts, and if I upgrade all of them, the logs will fill up > > with this :-) > > > > any ideas? > > > > cheers, > > danny From owner-freebsd-stable@FreeBSD.ORG Mon May 27 13:41:43 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 208ED174; Mon, 27 May 2013 13:41:43 +0000 (UTC) (envelope-from Devin.Teske@fisglobal.com) Received: from mx1.fisglobal.com (mx1.fisglobal.com [199.200.24.190]) by mx1.freebsd.org (Postfix) with ESMTP id D9871313; Mon, 27 May 2013 13:41:42 +0000 (UTC) Received: from smtp.fisglobal.com ([10.132.206.16]) by ltcfislmsgpa07.fnfis.com (8.14.5/8.14.5) with ESMTP id r4RDf8F7024280 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT); Mon, 27 May 2013 08:41:08 -0500 Received: from LTCFISWMSGMB21.FNFIS.com ([10.132.99.23]) by LTCFISWMSGHT05.FNFIS.com ([10.132.206.16]) with mapi id 14.02.0309.002; Mon, 27 May 2013 08:41:08 -0500 From: "Teske, Devin" To: Daniel Braniss Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP Thread-Topic: SunFire X2200 ilo's bge1 DOWN/UP Thread-Index: 
AQHOWrAcuYvyNcIfrk+Q9Bpx3iIp5ZkZXh+A Date: Mon, 27 May 2013 13:41:08 +0000 Message-ID: <13CA24D6AB415D428143D44749F57D7201F62C26@ltcfiswmsgmb21> References: In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.132.253.126] MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.10.8626, 1.0.431, 0.0.0000 definitions=2013-05-27_03:2013-05-27,2013-05-27,1970-01-01 signatures=0 Content-Type: text/plain; charset="Windows-1252" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: "" , Devin Teske , FreeBSD-STABLE Mailing List X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: Devin Teske List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 May 2013 13:41:43 -0000

On May 27, 2013, at 12:59 AM, Daniel Braniss wrote:
On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote:
hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200,

If you're truly running stable/9, and it's up-to-date, you should already have SVN revisions 248858 and 250650, both of which have significant impact for (a) the SunFire X2200 (r248858) and (b) the DOWN/UP problem (r250650).

Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output.
bge0: mem 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6
bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
miibus2: on bge0
brgphy0: PHY 1 on miibus2
brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge0: Ethernet address: 00:1b:24:5d:5b:bd
bge1: mem 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6
bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz
miibus3: on bge1
brgphy1: PHY 1 on miibus3
brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow
bge1: Ethernet address: 00:1b:24:5d:5b:be

sf-10> ifconfig bge1
bge1: flags=8802 metric 0 mtu 1500
options=8009b
ether 00:1b:24:5d:5b:be
nd6 options=21
media: Ethernet autoselect (100baseTX )
status: active

Saw similar things happening over here with a different Broadcom chipset, and the above revisions helped significantly (URLs below):

http://svnweb.freebsd.org/base?view=revision&revision=248858
http://svnweb.freebsd.org/base?view=revision&revision=250650

is toggling bge1 DOWN/UP every few hours, this port is being used by the ILO. To check, I upgraded another identical host, and the same problem appears.

What is the last known working revision?

I have no idea, but I have older versions, and I'll start from the oldest (9.1-prerelease), but it will take time, since it takes hours till it happens.

There are ways you can speed up the replication time. I tend to flood a server with TCP, though I've heard of it happening under UDP flood too.
Here's a nice way to flood a server with TCP (assuming you have SSH access to the system via keys):

sh -c 'while :;do dd if=/dev/urandom of=/dev/stdout bs=1m count=1024 | ssh HOST2KILL /sbin/md5; done'

Run that about 16 times in separate screen sessions from various other hosts on your network, taking care to replace "HOST2KILL" with the hostname or IP of the box with the SunFire X2200.

Let that run for a while, and then when you think you've had a reset (if you weren't standing there watching for one)...

grep 'bge.*DOWN' /var/log/messages

On a system that has booted and stayed up-and-running, there shouldn't be any messages like this:

bge0: link state changed to DOWN

When you actually get this message (if your experience is like ours), you'll be down for 90 seconds while the NIC resets.

However, since you say you have some older 9.1 releases... I'd start by first trying to bring the replication time of the problem down by using TCP and/or UDP floods. That way you'll be able to test for resolution of the problem as you progress up to stable/9 (where the problem should be fixed by the aforementioned SVN revisions -- specific to your hardware).

There is no correlation with time, since they happened at totally different times. I rebooted both hosts at almost the same time.
one host :
uptime: 5:24PM up 6:15, 0 users, load averages: 0.00, 0.00, 0.00
May 24 12:53:52 sf-04 kernel: bge1: link state changed to DOWN
May 24 12:53:55 sf-04 kernel: bge1: link state changed to UP
May 24 15:34:25 sf-04 kernel: bge1: link state changed to DOWN
May 24 15:34:28 sf-04 kernel: bge1: link state changed to UP

and
uptime: 5:24PM up 6:14, 0 users, load averages: 0.00, 0.00, 0.00
May 24 16:30:44 sf-10 kernel: bge1: link state changed to DOWN
May 24 16:30:44 sf-10 kernel: bge1: link state changed to UP

this is not serious, the ilo (ssh) connection is ok, but it's annoying, we have more than 10 of these hosts, and if I upgrade all of them, the logs will fill up with this :-)

any ideas?

Well, you say the connection is OK... so it doesn't sound like a full reset as it was in our case (we have a different chipset). But I agree that a log full of those would be annoying.

Try getting up to stable/9 in its current state (note: stable/8 also has all the aforementioned revisions).

-- Devin
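
The log check suggested in this message (grep for link-state DOWN lines in /var/log/messages) can be extended into a per-host tally, which may help when comparing several machines or kernel revisions. A minimal sketch follows. `count_flaps` is a hypothetical helper, and it assumes the standard syslog layout seen in the excerpts above, where the fourth whitespace-separated field is the hostname.

```shell
#!/bin/sh
# Sketch only: count "link state changed to DOWN" events per host.
# count_flaps is a hypothetical helper, not an existing tool.
#   $1 = interface name (e.g. bge1)
#   $2 = log file to scan (e.g. /var/log/messages)
count_flaps() {
    ifname=$1
    logfile=$2
    awk -v pat="kernel: ${ifname}: link state changed to DOWN" '
        index($0, pat) { n[$4]++ }           # $4 is the hostname field
        END { for (h in n) print h, n[h] }   # one "host count" line each
    ' "$logfile" | sort
}

# Assumed usage:
#   count_flaps bge1 /var/log/messages
```

A fixed-string match via `index()` is used instead of a regular expression, so the interface name is compared literally.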
From owner-freebsd-stable@FreeBSD.ORG Mon May 27 13:56:33 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 677EF93E; Mon, 27 May 2013 13:56:33 +0000 (UTC) (envelope-from marck@rinet.ru) Received: from woozle.rinet.ru (woozle.rinet.ru [195.54.192.68]) by mx1.freebsd.org (Postfix) with ESMTP id EBCA55E0; Mon, 27 May 2013 13:56:32 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by woozle.rinet.ru (8.14.5/8.14.5) with ESMTP id r4RDuO9F051974; Mon, 27 May 2013 17:56:24 +0400 (MSK) (envelope-from marck@rinet.ru) Date: Mon, 27 May 2013 17:56:24 +0400 (MSK) From: Dmitry Morozovsky To: Pete French Subject: Re: Proposed MFC to hastctl: compact 'status' and introduce 'list' command In-Reply-To: <20130524111945.GB12310@gmail.com> Message-ID: References: <20130524111945.GB12310@gmail.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) X-NCC-RegID: ru.rinet X-OpenPGP-Key-ID: 6B691B03 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.4.3 (woozle.rinet.ru [0.0.0.0]); Mon, 27 May 2013 17:56:24 +0400 (MSK) Cc: trociny@freebsd.org, freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 May 2013 13:56:33 -0000 On Fri, 24 May 2013, Mikolaj Golub wrote: > > > > http://svnweb.freebsd.org/changeset/base/248291 > > > ... > > > > The reason I'm asking is that it could lead to changes in hast-related scripts > > > > which one use in production. > > > > > > > > > Any chance we could do this is 2 stages - first being to add 'list' to give us a chnace > > > ti change scripts over, then make the chnages to 'status'. 
I have scripts > > > which try and parse the outut from 'status' which will need changing, > > > and I sspect I am not the only one... > > > > I see no problem with this, as it is one-lite patch (modulo usage/manual page > > changes); it would be direct commit to -stable, but as it is temporary, I see > > no problem there too. > > > > Mikolaj, your opinion? > > It looks like a very good idea. Done for stable/9 and stable/8 as r251025 and r251026 I hope 6 weeks planned before cleanup will be enough for you and other current `hastctl list' consumers. -- Sincerely, D.Marck [DM5020, MCK-RIPE, DM3-RIPN] [ FreeBSD committer: marck@FreeBSD.org ] ------------------------------------------------------------------------ *** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru *** ------------------------------------------------------------------------ From owner-freebsd-stable@FreeBSD.ORG Mon May 27 17:02:31 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 38959158; Mon, 27 May 2013 17:02:31 +0000 (UTC) (envelope-from mikes@siralan.org) Received: from mail.suso.org (mail.suso.org [66.244.94.5]) by mx1.freebsd.org (Postfix) with ESMTP id 15CBFEAC; Mon, 27 May 2013 17:02:30 +0000 (UTC) Received: from c-98-223-197-163.hsd1.in.comcast.net (c-98-223-197-163.hsd1.in.comcast.net [98.223.197.163]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.suso.org (Postfix) with ESMTP id 75B8513814A; Mon, 27 May 2013 17:02:19 +0000 (GMT) Date: Mon, 27 May 2013 13:02:14 -0400 (EDT) From: "Michael L. 
Squires" X-X-Sender: mikes@familysquires.net To: YongHyeon PYUN Subject: Re: Apparent fxp regression in FreeBSD 8.4-RC3 In-Reply-To: <20130527043923.GA1480@michelle.cdnetworks.com> Message-ID: References: <20130524044919.GA41292@icarus.home.lan> <20130524054720.GA1496@michelle.cdnetworks.com> <20130524.162926.395058052118975996.hrs@allbsd.org> <20130524.163646.628115045676432731.hrs@allbsd.org> <20130526113841.GA1511@michelle.cdnetworks.com> <20130527043923.GA1480@michelle.cdnetworks.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: jdc@koitsu.org, gjb@freebsd.org, freebsd-stable@freebsd.org, re@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 May 2013 17:02:31 -0000 On Mon, 27 May 2013, YongHyeon PYUN wrote: > On Sun, May 26, 2013 at 08:38:41PM +0900, YongHyeon PYUN wrote: >> On Fri, May 24, 2013 at 04:36:46PM +0900, Hiroki Sato wrote: >>> Hiroki Sato wrote >>> in <20130524.162926.395058052118975996.hrs@allbsd.org>: >>> >>> hr> YongHyeon PYUN wrote >>> hr> in <20130524054720.GA1496@michelle.cdnetworks.com>: >>> hr> >>> hr> A workaround is specifying the following line in rc.conf: >>> hr> >>> hr> ifconfig_fxp0="DHCP media 100baseTX mediaopt full-duplex" >>> Sorry I've been offline, two trips last week. I've installed 8.4-RELEASE on the NAT box with the fxp interface: FreeBSD familysquires.net 8.4-RELEASE FreeBSD 8.4-RELEASE #54: Sun May 26 22:56:19 EDT 2013 root@familysquires.net:/usr/obj/usr/src/sys/NEWGATE i386 and am using the workaround given above which has stopped the fxp interface cycling on/off. I'll have access to the other box on Wednesday and will try the other test. 
Mike Squires mikes@siralan.org From owner-freebsd-stable@FreeBSD.ORG Tue May 28 02:33:16 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2B930ADF; Tue, 28 May 2013 02:33:16 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pd0-f179.google.com (mail-pd0-f179.google.com [209.85.192.179]) by mx1.freebsd.org (Postfix) with ESMTP id E6349EFA; Tue, 28 May 2013 02:33:15 +0000 (UTC) Received: by mail-pd0-f179.google.com with SMTP id q11so7092211pdj.10 for ; Mon, 27 May 2013 19:33:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=rgjbAnF0k32dGXoJz+bxrOfeKokwL2I7qgArBrN2b+E=; b=ohGNvJzF+/Zqewv6W2CVj763Lmwi/RzjPs+YMZoRzbE2nf9oBYC7jg3ligURBCpYGp LtWy3CZc6od0tDjCx1EtNVv5m0nZ7W4IJWP9ZC/oAgidnJtGagZxsHlBx9aw1RmwOFks g4Q8HtLCTIkjSBZNNn0tjFjKB7wO7tZLSODxi3PsMrKUxjNVbnJcGTBTh8UTl1fU+NhO +6hXVSaowL3uxWuEm6E88IqNbD9ZFQnzPkajuj8ITjvP4frNFGOuFzl2QqE4GHnm4X+5 ojuF2qIGh7GLXYT60sTlgUN3P0bCMus3f+E4C/UJzffYrUgVuMZWf+PC7P2orsBoef/A B2Rg== X-Received: by 10.66.144.170 with SMTP id sn10mr32121998pab.42.1369708388643; Mon, 27 May 2013 19:33:08 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPSA id l16sm2240953pag.22.2013.05.27.19.33.04 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Mon, 27 May 2013 19:33:07 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Tue, 28 May 2013 11:33:00 +0900 From: YongHyeon PYUN Date: Tue, 28 May 2013 11:33:00 +0900 To: "Michael L. 
Squires" Subject: Re: Apparent fxp regression in FreeBSD 8.4-RC3 Message-ID: <20130528023300.GA3077@michelle.cdnetworks.com> References: <20130524044919.GA41292@icarus.home.lan> <20130524054720.GA1496@michelle.cdnetworks.com> <20130524.162926.395058052118975996.hrs@allbsd.org> <20130524.163646.628115045676432731.hrs@allbsd.org> <20130526113841.GA1511@michelle.cdnetworks.com> <20130527043923.GA1480@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="EeQfGwPcQSOJBaQU" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: jdc@koitsu.org, gjb@freebsd.org, freebsd-stable@freebsd.org, re@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 02:33:16 -0000 --EeQfGwPcQSOJBaQU Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Mon, May 27, 2013 at 01:02:14PM -0400, Michael L. Squires wrote: > > On Mon, 27 May 2013, YongHyeon PYUN wrote: > > >On Sun, May 26, 2013 at 08:38:41PM +0900, YongHyeon PYUN wrote: > >>On Fri, May 24, 2013 at 04:36:46PM +0900, Hiroki Sato wrote: > >>>Hiroki Sato wrote > >>> in <20130524.162926.395058052118975996.hrs@allbsd.org>: > >>> > >>>hr> YongHyeon PYUN wrote > >>>hr> in <20130524054720.GA1496@michelle.cdnetworks.com>: > >>>hr> > >>>hr> A workaround is specifying the following line in rc.conf: > >>>hr> > >>>hr> ifconfig_fxp0="DHCP media 100baseTX mediaopt full-duplex" > >>> > > Sorry I've been offline, two trips last week. > > I've installed 8.4-RELEASE on the NAT box with the fxp interface: > > FreeBSD familysquires.net 8.4-RELEASE FreeBSD 8.4-RELEASE #54: Sun May 26 > 22:56:19 EDT 2013 root@familysquires.net:/usr/obj/usr/src/sys/NEWGATE > i386 > > and am using the workaround given above which has stopped the fxp interface > cycling on/off. 
> > I'll have access to the other box on Wednesday and will try the other test. Here is patch I'm testing and it seems to work with dhclient on CURRENT. Mike, could you try attached patch? > > Mike Squires > mikes@siralan.org --EeQfGwPcQSOJBaQU Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="fxp.init.diff" Index: sys/dev/fxp/if_fxp.c =================================================================== --- sys/dev/fxp/if_fxp.c (revision 251021) +++ sys/dev/fxp/if_fxp.c (working copy) @@ -1075,7 +1075,8 @@ fxp_suspend(device_t dev) pmstat |= PCIM_PSTAT_PME | PCIM_PSTAT_PMEENABLE; sc->flags |= FXP_FLAG_WOL; /* Reconfigure hardware to accept magic frames. */ - fxp_init_body(sc, 1); + ifp->if_drv_flags &= ~IFF_DRV_RUNNING; + fxp_init_body(sc, 0); } pci_write_config(sc->dev, pmc + PCIR_POWER_STATUS, pmstat, 2); } @@ -2141,8 +2142,10 @@ fxp_tick(void *xsc) */ if (sc->rx_idle_secs > FXP_MAX_RX_IDLE) { sc->rx_idle_secs = 0; - if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) + if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) { + ifp->if_drv_flags &= ~IFF_DRV_RUNNING; fxp_init_body(sc, 1); + } return; } /* @@ -2240,6 +2243,7 @@ fxp_watchdog(struct fxp_softc *sc) device_printf(sc->dev, "device timeout\n"); sc->ifp->if_oerrors++; + sc->ifp->if_drv_flags &= ~IFF_DRV_RUNNING; fxp_init_body(sc, 1); } @@ -2274,6 +2278,10 @@ fxp_init_body(struct fxp_softc *sc, int setmedia) int i, prm; FXP_LOCK_ASSERT(sc, MA_OWNED); + + if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) + return; + /* * Cancel any pending I/O */ @@ -2813,6 +2821,7 @@ fxp_miibus_statchg(device_t dev) */ if (sc->revision == FXP_REV_82557) return; + ifp->if_drv_flags &= ~IFF_DRV_RUNNING; fxp_init_body(sc, 0); } @@ -2836,9 +2845,10 @@ fxp_ioctl(struct ifnet *ifp, u_long command, caddr if (ifp->if_flags & IFF_UP) { if (((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) && ((ifp->if_flags ^ sc->if_flags) & - (IFF_PROMISC | IFF_ALLMULTI | IFF_LINK0)) != 0) + (IFF_PROMISC | IFF_ALLMULTI | 
IFF_LINK0)) != 0) { + ifp->if_drv_flags &= ~IFF_DRV_RUNNING; fxp_init_body(sc, 0); - else if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) + } else if ((ifp->if_drv_flags & IFF_DRV_RUNNING) == 0) fxp_init_body(sc, 1); } else { if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) @@ -2851,8 +2861,10 @@ fxp_ioctl(struct ifnet *ifp, u_long command, caddr case SIOCADDMULTI: case SIOCDELMULTI: FXP_LOCK(sc); - if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) + if ((ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) { + ifp->if_drv_flags &= ~IFF_DRV_RUNNING; fxp_init_body(sc, 0); + } FXP_UNLOCK(sc); break; @@ -2942,8 +2954,10 @@ fxp_ioctl(struct ifnet *ifp, u_long command, caddr ~(IFCAP_VLAN_HWTSO | IFCAP_VLAN_HWCSUM); reinit++; } - if (reinit > 0 && ifp->if_flags & IFF_UP) + if (reinit > 0 && (ifp->if_drv_flags & IFF_DRV_RUNNING) != 0) { + ifp->if_drv_flags &= ~IFF_DRV_RUNNING; fxp_init_body(sc, 0); + } FXP_UNLOCK(sc); VLAN_CAPABILITIES(ifp); break; --EeQfGwPcQSOJBaQU-- From owner-freebsd-stable@FreeBSD.ORG Tue May 28 05:30:02 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1DED1EC6 for ; Tue, 28 May 2013 05:30:02 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pd0-f177.google.com (mail-pd0-f177.google.com [209.85.192.177]) by mx1.freebsd.org (Postfix) with ESMTP id EFFD1912 for ; Tue, 28 May 2013 05:30:01 +0000 (UTC) Received: by mail-pd0-f177.google.com with SMTP id u11so7255030pdi.36 for ; Mon, 27 May 2013 22:30:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=0p8notnfv424bcFecqUH0kN6tm/CDzA5e1PsdmARZB0=; b=rzuXOCzu3wT3oQm6eQCJjb7FPwgnd+aM7MkCoIJKpclez73CAu7tk5DleFZl8taMq2 UupysHWlAfEvd1RzTdYWgTXvtw80JMco9zX/8hLuyTo2y/FJ+t2g1frSjd0Iwgm3MmD5 
cgDklVUcUgwA5g8Gxdgg8clb2V6FvNY4Vg4oBW94IUhZ0E3T5GORcnPaML7vhYhj8DZQ WS/wg9LPrJjadunOExaPMTX56yfLT0bzGh9QYt3PAfL/SpP53c2AwAAH1GWIbVR6kRSK oG0LZhj3qFadUCnBLBopddMfuLMBhYe5JnR2fGVrpA7Y1D2x9++BKGCsl4Xe+wvx3iMt 3Tnw== X-Received: by 10.66.253.74 with SMTP id zy10mr32547072pac.123.1369719001246; Mon, 27 May 2013 22:30:01 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPSA id zt1sm14410116pbb.15.2013.05.27.22.29.58 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Mon, 27 May 2013 22:30:00 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Tue, 28 May 2013 14:29:53 +0900 From: YongHyeon PYUN Date: Tue, 28 May 2013 14:29:53 +0900 To: Daniel Braniss Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP Message-ID: <20130528052953.GA1457@michelle.cdnetworks.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 05:30:02 -0000 On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote: > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. 
> > > > bge0: mem > 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6 > bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > miibus2: on bge0 > brgphy0: PHY 1 on miibus2 > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > bge0: Ethernet address: 00:1b:24:5d:5b:bd > bge1: mem > 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6 > bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > miibus3: on bge1 > brgphy1: PHY 1 on miibus3 > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > bge1: Ethernet address: 00:1b:24:5d:5b:be > > sf-10> ifconfig bge1 > bge1: flags=8802 metric 0 mtu 1500 > options=8009b TE> > ether 00:1b:24:5d:5b:be > nd6 options=21 > media: Ethernet autoselect (100baseTX ) > status: active > Because bge1 is not UP, I wonder how you get link UP/DOWN events. Do you have some network script run by cron? > > > is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. > > > To check, I upgraded another identical host, and the same problem appears. > > > > What is the last known working revision? > > I have no idea, but I have older versions, and ill start from the oldets > (9.1-prerelease), but > it will take time, since it takes hours till it happens. > ok. 
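The flags=8802 value in the ifconfig output quoted above can be decoded by hand, which shows why bge1 counts as "not UP". What follows is only a minimal sketch, not part of the original exchange: it assumes the if_flags bit values from FreeBSD's <net/if.h>, lists only a few of them, and the function name decode_if_flags is made up for illustration.

```shell
# Decode the hex value that ifconfig prints as flags=XXXX into flag names.
# Assumption: bit values follow FreeBSD's <net/if.h>; only a subset is listed.
decode_if_flags() {
    value=$((0x$1))
    names=""
    # Each entry is <hex bit>:<name as shown by ifconfig>.
    for pair in 1:UP 2:BROADCAST 40:RUNNING 100:PROMISC 800:SIMPLEX 8000:MULTICAST; do
        bit=$((0x${pair%%:*}))
        if [ $((value & bit)) -ne 0 ]; then
            # Append the name, comma-separated after the first match.
            names="${names:+$names,}${pair#*:}"
        fi
    done
    printf '%s\n' "$names"
}
```

Calling decode_if_flags 8802 prints BROADCAST,SIMPLEX,MULTICAST: neither the 0x1 (UP) bit nor the 0x40 (RUNNING) bit is set.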
From owner-freebsd-stable@FreeBSD.ORG Tue May 28 06:28:11 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 79CFC54C for ; Tue, 28 May 2013 06:28:11 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id 337EBB69 for ; Tue, 28 May 2013 06:28:10 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1UhDO4-000Dr7-PJ; Tue, 28 May 2013 09:28:00 +0300 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: pyunyh@gmail.com Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP In-reply-to: <20130528052953.GA1457@michelle.cdnetworks.com> References: <20130528052953.GA1457@michelle.cdnetworks.com> Comments: In-reply-to YongHyeon PYUN message dated "Tue, 28 May 2013 14:29:53 +0900." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 28 May 2013 09:28:00 +0300 From: Daniel Braniss Message-ID: Cc: freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 06:28:11 -0000 > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote: > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, > > > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. 
> > > > > > > bge0: mem > > 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6 > > bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > miibus2: on bge0 > > brgphy0: PHY 1 on miibus2 > > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > bge0: Ethernet address: 00:1b:24:5d:5b:bd > > bge1: mem > > 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6 > > bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > miibus3: on bge1 > > brgphy1: PHY 1 on miibus3 > > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > bge1: Ethernet address: 00:1b:24:5d:5b:be > > > > sf-10> ifconfig bge1 > > bge1: flags=8802 metric 0 mtu 1500 > > options=8009b > TE> > > ether 00:1b:24:5d:5b:be > > nd6 options=21 > > media: Ethernet autoselect (100baseTX ) > > status: active > > > > Because bge1 is not UP, I wonder how you get link UP/DOWN events. > Do you have some network script run by cron? no scripts. this port is shared with the ILO/IPMI, and back in March you fixed a problem that it was hanging soon after it was initialized by the driver, (r248226 - but I'm not sure if it was ever MFC'ed). Initially I thought it could be caused by connections to it from other hosts (either via the web, or ssh) so I killed them, but it didn't help. without that patch the connection fails, and I don't see any DOWN/UP. > > > > > is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. > > > > To check, I upgraded another identical host, and the same problem appears. > > > > > > What is the last known working revision? > > > > I have no idea, but I have older versions, and ill start from the oldets > > (9.1-prerelease), but > > it will take time, since it takes hours till it happens. > > > > ok.
From owner-freebsd-stable@FreeBSD.ORG Tue May 28 06:42:03 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 169697C0; Tue, 28 May 2013 06:42:03 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id C2BC3CD0; Tue, 28 May 2013 06:42:02 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1UhDba-000ELW-6j; Tue, 28 May 2013 09:41:58 +0300 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: Devin Teske Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP In-reply-to: <13CA24D6AB415D428143D44749F57D7201F62C26@ltcfiswmsgmb21> References: <13CA24D6AB415D428143D44749F57D7201F62C26@ltcfiswmsgmb21> Comments: In-reply-to "Teske, Devin" message dated "Mon, 27 May 2013 13:41:08 -0000." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 28 May 2013 09:41:58 +0300 From: Daniel Braniss Message-ID: Cc: "" , FreeBSD-STABLE Mailing List X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 06:42:03 -0000 ... > There are ways you can speed up the replication time. I tend to flood a server with > TCP while I've heard of it happening under UDP flood too. > > Here's a nice way to flood a server with TCP (assuming you have SSH access to the > system via keys): > > sh -c 'while :;do dd if=/dev/urandom of=/dev/stdout bs=1m count=1024 | ssh HOST2KILL /sbin/md5; done' > > Run that about 16 times in separate screen sessions from various other hosts on your network, > taking care to replace "HOST2KILL" with the hostname or IP of the box with the SunFire X2200.
> > Let that run for a while, and then when you think you've had a reset (if you weren't standing > there watching for one)... > > grep 'bge.*DOWN' /var/log/messages > > On a system that has booted and stayed up-and-running, there shouldn't be any messages like this: > > bge0: link state changed to DOWN > > When you actually get this message (if your experience is like ours), you'll be down for 90 seconds > while the NIC resets. > > However, since you say you have some older 9.1 releases... I'd start by first trying to bring the > replication time of the problem down by using TCP and/or UDP floods. That way you'll be able to > test for resolution of the problem as you progress up to stable/9 (where the problem should be fixed > by the aforementioned SVN revisions -- specific to your hardware). ... > any ideas? > > > Well, you say the connection is OK... so it doesn't sound like a full reset as it > was in our case (we have a different chipset). > > But I agree that a log full of those would be annoying. > > Try getting up to stable/9 in its current state (note: stable/8 also has all the > aforementioned revisions too). > -- > Devin Hi Devin, the kernel is pretty new, actually last Friday's, and the svn says it's r250960. the bge1 port is not UP, it's shared with the onboard BMC/ILO/IPMI thingy. connecting to it via ssh gets me into its ILO manager: ... Sun(TM) Embedded Lights Out Manager Copyright 2004-2006 Sun Microsystems, Inc. All rights reserved. Version 3.23 ... and so typing start AgentInfo/console I can get to the 'serial' console.
cheers, and thanks, danny From owner-freebsd-stable@FreeBSD.ORG Tue May 28 06:48:59 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2790396E for ; Tue, 28 May 2013 06:48:59 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pd0-f170.google.com (mail-pd0-f170.google.com [209.85.192.170]) by mx1.freebsd.org (Postfix) with ESMTP id 042FAD3E for ; Tue, 28 May 2013 06:48:58 +0000 (UTC) Received: by mail-pd0-f170.google.com with SMTP id x10so7189742pdj.1 for ; Mon, 27 May 2013 23:48:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=biT1ei77+2BvIWOgSxfmWJkiYHc13MkzCeDRWEnoWgQ=; b=tHelfT1LdiHQ7n08kSp4ea/Vfwa2WNnEwaKFzH6dJQxRodF2etggOsv3Mk2zfIWbQ/ tVm5bynUNnkuZhOlWZZHVfHUcSPu0s3yoTM4gVrDHlKt9m0RAPXhRnb3grNbgvwrqH+G UxCydp6+HKnUMs827rXMKdiKUtsgY5Dkz9y5a9f0RX85tzEVa+i34nqlKMxn5Aym6Ymu tJ+KyQdeCkF0OCPaW0CqCw/VxqFfeqE+RPLXWlGhh9A55ieQAF0h9wIk1VoUcJ8AsjCB oIOhyF5l8kaJCog4/93h+UieIolUU2Jl7mAI7hFMPCIxf24pJtvQ2JBjpY9PQ0XYA6sN WF7A== X-Received: by 10.68.244.5 with SMTP id xc5mr32885364pbc.66.1369723737794; Mon, 27 May 2013 23:48:57 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. 
[114.111.62.249]) by mx.google.com with ESMTPSA id vb8sm31707105pbc.11.2013.05.27.23.48.54 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Mon, 27 May 2013 23:48:56 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Tue, 28 May 2013 15:48:50 +0900 From: YongHyeon PYUN Date: Tue, 28 May 2013 15:48:50 +0900 To: Daniel Braniss Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP Message-ID: <20130528064850.GB1457@michelle.cdnetworks.com> References: <20130528052953.GA1457@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 06:48:59 -0000 On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote: > > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote: > > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: > > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, > > > > > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. 
> > > > > > > > > > bge0: mem > > > 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6 > > > bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > miibus2: on bge0 > > > brgphy0: PHY 1 on miibus2 > > > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > bge0: Ethernet address: 00:1b:24:5d:5b:bd > > > bge1: mem > > > 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6 > > > bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > miibus3: on bge1 > > > brgphy1: PHY 1 on miibus3 > > > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > bge1: Ethernet address: 00:1b:24:5d:5b:be > > > > > > sf-10> ifconfig bge1 > > > bge1: flags=8802 metric 0 mtu 1500 > > > options=8009b > > TE> > > > ether 00:1b:24:5d:5b:be > > > nd6 options=21 > > > media: Ethernet autoselect (100baseTX ) > > > status: active > > > > > > > Because bge1 is not UP, I wonder how you get link UP/DOWN events. > > Do you have some network script run by cron? > > no scripts. > this port is shared with the ILO/IPMI, and back in March you fixed a problem > that it was hanging soon after it was initialized by the driver, > (r248226 - but I'm not sure if it was ever MFC'ed). It was MFCed. > Initialy I thought it could be caused by connections to it from other > hosts (either via the web, or ssh) so I killed them, but it didn't help. > without that patch the connection fails, and I don't see any DOWN/UP. Could you check how many number of interrupts you get from bge1? Ideally you shouldn't get any interrupts for bge1. > > > > > > > > is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. > > > > > To check, I upgraded another identical host, and the same problem appears. 
> > > > > > > > What is the last known working revision? > > > > > > I have no idea, but I have older versions, and ill start from the oldets > > > (9.1-prerelease), but > > > it will take time, since it takes hours till it happens. > > > > > > > ok. > > From owner-freebsd-stable@FreeBSD.ORG Tue May 28 06:49:34 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 6C64CA73 for ; Tue, 28 May 2013 06:49:34 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta05.emeryville.ca.mail.comcast.net (qmta05.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:43:76:96:30:48]) by mx1.freebsd.org (Postfix) with ESMTP id 40764D54 for ; Tue, 28 May 2013 06:49:34 +0000 (UTC) Received: from omta23.emeryville.ca.mail.comcast.net ([76.96.30.90]) by qmta05.emeryville.ca.mail.comcast.net with comcast id hJpZ1l0011wfjNsA5JpZbL; Tue, 28 May 2013 06:49:33 +0000 Received: from jdc.koitsu.org ([67.180.84.87]) by omta23.emeryville.ca.mail.comcast.net with comcast id hJpX1l00b1t3BNj8jJpYhh; Tue, 28 May 2013 06:49:32 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id A6BAE73A33; Mon, 27 May 2013 23:49:31 -0700 (PDT) Date: Mon, 27 May 2013 23:49:31 -0700 From: Jeremy Chadwick To: Daniel Braniss Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP Message-ID: <20130528064931.GA61056@icarus.home.lan> References: <20130528052953.GA1457@michelle.cdnetworks.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1369723773; bh=8keAuZvA/uH/7Te0moGM/DkD9B0W+y5KKcxT71jAizM=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=C7hd2ksc9gfL+JSbtO+3yjsiCM+lVdB+gaJwycrqV9pPOmPnFi4swfMsHODdMle7S 
xZXQWazne0hi6Xv8Papxvjdt6lMnS/Cu3V5y1kU6R31K4JkohWxMWkzKHuHyRKrOck L+Q0A7Yg3XHbd8ELvj4TxqagIE6S+caODZRqDoNqa3Mk6Y/HvRn8L+6HsVtYlxdlJK E3PfxX952Favfqk1jSzivl5wUxtCltOXLyG1MmR98NbwnTt+hohqxD8YHVOzGhcOX9 ezq384LkT+oWRLHrrqb65U2WrnXzXYstW3HvJA38SzPhy8fE4sclY8X++xEbH6yN9a cQPnzpSWnTlKA== Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 06:49:34 -0000 On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote: > > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote: > > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: > > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, > > > > > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. > > > > > > > > > > bge0: mem > > > 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6 > > > bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > miibus2: on bge0 > > > brgphy0: PHY 1 on miibus2 > > > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > bge0: Ethernet address: 00:1b:24:5d:5b:bd > > > bge1: mem > > > 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6 > > > bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > miibus3: on bge1 > > > brgphy1: PHY 1 on miibus3 > > > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > bge1: Ethernet address: 00:1b:24:5d:5b:be > > > > > > sf-10> ifconfig bge1 > > > bge1: flags=8802 metric 0 mtu 1500 > > > options=8009b > > TE> > > > ether 00:1b:24:5d:5b:be > > > nd6 options=21 > > > media: 
Ethernet autoselect (100baseTX ) > > > status: active > > > > > > > Because bge1 is not UP, I wonder how you get link UP/DOWN events. > > Do you have some network script run by cron? > > no scripts. > this port is shared with the ILO/IPMI, and back in March you fixed a problem > that it was hanging soon after it was initialized by the driver, > (r248226 - but I'm not sure if it was ever MFC'ed). > Initialy I thought it could be caused by connections to it from other > hosts (either via the web, or ssh) so I killed them, but it didn't help. > without that patch the connection fails, and I don't see any DOWN/UP. Two things: 1. r248226 in head was MFC'd to stable/9 as r248858. Validation: http://svnweb.freebsd.org/base/stable/9/sys/dev/bge/if_bge.c?view=log So the answer: whether or not you have that MFC in stable/9 depends on what SVN rev your kernel is. 2. Is there some way to verify that the ASF/iLO/IPMI bits (i.e. the IPMI firmware itself) are not shutting down bge1's PHY intentionally? Unless the IPMI module chooses to log something useful (e.g. "I'm doing this"), I'm not sure how you'd figure that out. Other question: is there any correlation between the amount of time that goes by between events with, say, ARP/MAC address expiry in "arp -a"? I mention this because I know some of the ASF methods have historically shown two MAC addresses on the same physif, and I can see how this might confuse some stacks. That "piggybacking" crap never should have been invented. All it has done is cause problems for every OS I know of (including Windows) since its inception, and is also exactly why today almost all vendors I've seen provide a dedicated NIC and RJ45 port for the iLO/IPMI interface. It's an admission that the "piggybacking" method doesn't work. And may it rot in hell for all I care, while simultaneously feeling very sorry for those who have to suffer/deal with it.
This is just another reason why I've always been very picky about what hardware I'd buy for server deployments. Vendors never actually disclose this crap until you've shelled out money for the hardware, by which point it's too late and you're suffering. Really great model -- for the pocketbook. :/ -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Tue May 28 06:55:28 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id DCC83BC9 for ; Tue, 28 May 2013 06:55:28 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id 66752DA7 for ; Tue, 28 May 2013 06:55:28 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1UhDoa-000ElU-2U; Tue, 28 May 2013 09:55:24 +0300 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: pyunyh@gmail.com Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP In-reply-to: <20130528064850.GB1457@michelle.cdnetworks.com> References: <20130528052953.GA1457@michelle.cdnetworks.com> <20130528064850.GB1457@michelle.cdnetworks.com> Comments: In-reply-to YongHyeon PYUN message dated "Tue, 28 May 2013 15:48:50 +0900." 
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 28 May 2013 09:55:24 +0300 From: Daniel Braniss Message-ID: Cc: freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 06:55:28 -0000 > On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote: > > > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote: > > > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: > > > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, > > > > > > > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. > > > > > > > > > > > > > bge0: mem > > > > 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6 > > > > bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > > miibus2: on bge0 > > > > brgphy0: PHY 1 on miibus2 > > > > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > > bge0: Ethernet address: 00:1b:24:5d:5b:bd > > > > bge1: mem > > > > 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6 > > > > bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > > miibus3: on bge1 > > > > brgphy1: PHY 1 on miibus3 > > > > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > > bge1: Ethernet address: 00:1b:24:5d:5b:be > > > > > > > > sf-10> ifconfig bge1 > > > > bge1: flags=8802 metric 0 mtu 1500 > > > > options=8009b > > > TE> > > > > ether 00:1b:24:5d:5b:be > > > > nd6 options=21 > > > > media: Ethernet autoselect (100baseTX ) > > > > status: active > > > > > > > > > > Because bge1 is not UP, I 
wonder how you get link UP/DOWN events. > > > Do you have some network script run by cron? > > > > no scripts. > > this port is shared with the ILO/IPMI, and back in March you fixed a problem > > that it was hanging soon after it was initialized by the driver, > > (r248226 - but I'm not sure if it was ever MFC'ed). > > It was MFCed. > > > Initialy I thought it could be caused by connections to it from other > > hosts (either via the web, or ssh) so I killed them, but it didn't help. > > without that patch the connection fails, and I don't see any DOWN/UP. > > Could you check how many number of interrupts you get from bge1? > Ideally you shouldn't get any interrupts for bge1. it's not even mentioned :-) sf-04> vmstat -i interrupt total rate irq3: uart1 964 0 irq4: uart0 6 0 irq14: ata0 227354 0 irq17: bge0 1021981 2 irq21: ohci0 28 0 irq22: ehci0 2 0 irq23: atapci1 293228 0 cpu0:timer 383244076 1124 cpu1:timer 2225144 6 cpu2:timer 2056087 6 cpu3:timer 2093943 6 Total 391162813 1147 > > > > > > > > > > > > is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. > > > > > > To check, I upgraded another identical host, and the same problem appears. > > > > > > > > > > What is the last known working revision? > > > > > > > > I have no idea, but I have older versions, and ill start from the oldets > > > > (9.1-prerelease), but > > > > it will take time, since it takes hours till it happens. > > > > > > > > > > ok. 
> > > > From owner-freebsd-stable@FreeBSD.ORG Tue May 28 06:57:10 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C8101CDB for ; Tue, 28 May 2013 06:57:10 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta08.emeryville.ca.mail.comcast.net (qmta08.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:43:76:96:30:80]) by mx1.freebsd.org (Postfix) with ESMTP id 9C1FFDC5 for ; Tue, 28 May 2013 06:57:10 +0000 (UTC) Received: from omta23.emeryville.ca.mail.comcast.net ([76.96.30.90]) by qmta08.emeryville.ca.mail.comcast.net with comcast id hJue1l0021wfjNsA8Jx7gG; Tue, 28 May 2013 06:57:07 +0000 Received: from jdc.koitsu.org ([67.180.84.87]) by omta23.emeryville.ca.mail.comcast.net with comcast id hJx61l00M1t3BNj8jJx6Bs; Tue, 28 May 2013 06:57:07 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 9452C73A39; Mon, 27 May 2013 23:57:06 -0700 (PDT) Date: Mon, 27 May 2013 23:57:06 -0700 From: Jeremy Chadwick To: Daniel Braniss Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP Message-ID: <20130528065706.GA61514@icarus.home.lan> References: <20130528052953.GA1457@michelle.cdnetworks.com> <20130528064931.GA61056@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130528064931.GA61056@icarus.home.lan> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1369724227; bh=OhAeIF1TT4El/x96ngYP2zAsyTHvpQ9E2cWFl9cutcc=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=seYp/EdOBZezTv6HYjZVzc0zyDh4vf6fw4Y8Bu6sQquKP82eByl4SvpMxyRvhVYXi SLFpCHAd3OqX1zRKgAYYvh6Jw8U1BSBAuUk4eqAdOSAy5813vix637JBJdVjCt3nuf ZmhFMXZQ7pHX9othJZwTdeaNGiXxZuzgP2vX/Uz/7Son7DL7nOYSiOelbZilOCyCbF Ub5P87f3ZbG+TG8F60ODUG0weexBZETGQp8HgsBxnYX+yjlM8k6vvTfui6XpDt/IQB 
ibazsaSAj+qqdDQdQsYA1QgdRen8FtAwRyteYtZNn68CC/7FJ7jC75gCAURqHnDjhE 49FmnvLJgRpvw== Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 06:57:10 -0000 On Mon, May 27, 2013 at 11:49:31PM -0700, Jeremy Chadwick wrote: > Other question: is there any correlation between the amount of time that > goes by between events with, say, ARP/MAC address expiry in "arp -a"? I > mention this because I know some of the ASF methods have historically > shown two MAC addresses on the same physif, and I can see how this might > confuse some stacks. Never mind -- I thought about this more, and it's irrelevant. -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Tue May 28 07:57:34 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 205FC7C7 for ; Tue, 28 May 2013 07:57:34 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id CC6AAF3 for ; Tue, 28 May 2013 07:57:33 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1UhEmY-000HB3-8q; Tue, 28 May 2013 10:57:22 +0300 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: Jeremy Chadwick Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP In-reply-to: <20130528064931.GA61056@icarus.home.lan> References: <20130528052953.GA1457@michelle.cdnetworks.com> <20130528064931.GA61056@icarus.home.lan> Comments: In-reply-to Jeremy Chadwick message dated "Mon, 27 May 2013 
23:49:31 -0700." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 28 May 2013 10:57:22 +0300 From: Daniel Braniss Message-ID: Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 07:57:34 -0000 -------- [...] > 1. r248226 in head was MFC'd to stable/9 as r248858. Validation: > > http://svnweb.freebsd.org/base/stable/9/sys/dev/bge/if_bge.c?view=log > > So the answer: whether or not you have that MFC in stable/9 depends on > what SVN rev your kernel is. I do a svnsync then I convert to mercurial so from the svn logs I see that the highest rev number is 250960. [...] > > That "piggybacking" crap never should have been invented. All it has > done is cause problems for every OS I know of (including Windows) since > its inception, and is also exactly why today almost all vendors I've > seen provide a dedicated NIC and RJ45 port for the iLO/IPMI interface. > It's admission the "piggybacking" method doesn't work. And may it rot > in hell for all I care, while simultaneously feeling very sorry for > those who have to suffer/deal with it. > > This is just another reason why I've always been very picky about what > hardware I'd buy for server deployments. Vendors never actually > disclose this crap until you've shelled out money for the hardware, by > which point it's too late and you're suffering. Really great model -- > for the pocketbook. :/ > I couldn't agree more! [...] in the case of the SunFire X2200, it has 4 bge ports, the 2nd, bge1, is only used by the ilo, it's not enabled (UP'ed), it doesn't have an interrupt assigned, it's, as far as I can tell, just annoying to have the DOWN/UP messages - unless something more sinister is lurking.
thanks, danny From owner-freebsd-stable@FreeBSD.ORG Tue May 28 11:03:12 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 55A97ACA for ; Tue, 28 May 2013 11:03:12 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta10.emeryville.ca.mail.comcast.net (qmta10.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:43:76:96:30:17]) by mx1.freebsd.org (Postfix) with ESMTP id 33A75E04 for ; Tue, 28 May 2013 11:03:12 +0000 (UTC) Received: from omta24.emeryville.ca.mail.comcast.net ([76.96.30.92]) by qmta10.emeryville.ca.mail.comcast.net with comcast id hNxq1l0031zF43QAAP3BDc; Tue, 28 May 2013 11:03:11 +0000 Received: from jdc.koitsu.org ([67.180.84.87]) by omta24.emeryville.ca.mail.comcast.net with comcast id hP3A1l0071t3BNj8kP3AVp; Tue, 28 May 2013 11:03:10 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 0862073A33; Tue, 28 May 2013 04:03:10 -0700 (PDT) Date: Tue, 28 May 2013 04:03:10 -0700 From: Jeremy Chadwick To: Daniel Braniss Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP Message-ID: <20130528110309.GA66043@icarus.home.lan> References: <20130528052953.GA1457@michelle.cdnetworks.com> <20130528064931.GA61056@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1369738991; bh=G/WixxyskWx1SXLkfu+etMYrtzDQtKMYZ2/zWJ8F5BU=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=ke/GFsqcRjkGSQiGMCW8GjyFm2CSJIE7qV5G+Y9Fwv73fn6SNQ1IMq4M+4P8dtlts yq0bUKa0SQGKpU71MyrzGf897rEUc9TVgg388BzEFQVik+U2gWhe0XecGQxzn05qjK GiHGdzLRTe5zXQJb9sbD/YUe0SJT/KciLMQB/rKtH7bloZPeWyjNMaM+BjUSRFO6AA wfkgxYE1bJR9+LQWUH61Mgy4ClDEwdkMH91KhfDYb4qEcAJypnrOdRgfsabXUM0Fx1 OnnEe4hJ7/b186Tpw/EGtoXjVWFoG60M+VhZVwbQC6arE97CEbdsrErqRq0Xb1/X6n 
Z/s7Mpp8yMMNg== Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 11:03:12 -0000 On Tue, May 28, 2013 at 10:57:22AM +0300, Daniel Braniss wrote: > -------- > [...] > > 1. r248226 in head was MFC'd to stable/9 as r248858. Validation: > > > > http://svnweb.freebsd.org/base/stable/9/sys/dev/bge/if_bge.c?view=log > > > > So the answer: whether or not you have that MFC in stable/9 depends on > > what SVN rev your kernel is. > > I do a svnsync then I convert to mercurial so from the svn logs I see that > the highest rev number is 250960. > > [...] > > > > That "piggybacking" crap never should have been invented. All it has > > done is cause problems for every OS I know of (including Windows) since > > its inception, and is also exactly why today almost all vendors I've > > seen provide a dedicated NIC and RJ45 port for the iLO/IPMI interface. > > It's admission the "piggybacking" method doesn't work. And may it rot > > in hell for all I care, while simultaneously feeling very sorry for > > those who have to suffer/deal with it. > > > > This is just another reason why I've always been very picky about what > > hardware I'd buy for server deployments. Vendors never actually > > disclose this crap until you've shelled out money for the hardware, by > > which point it's too late and you're suffering. Really great model -- > > for the pocketbook. :/ > > > > I couldn't agree more! > > [...] > > in the case of the SunFire X2200, it has 4 bge ports, the > 2nd, bge1, is only used by the ilo, it's not enabled (UP'ed), > it doesn't have an interrupt assigned, it's, as far as I can tell, > just annoying to have the DOWN/UP messages - unless something more sinister > is lurking.
Does output from "ps -auxwwwwH | grep kernel/bge" show anything for bge1?

What about "vmstat -i -a" (you might be surprised about the -a flag and what shows up compared to just using -i). Gut feeling says it will show up there. (See vmstat(8) for what -a does)

Possibly interrupt generation isn't what's "triggering" the bge(4) device to see link going up/down; maybe this is done via some memory mapped I/O, which would explain why "vmstat -i" shows nothing for bge1 (no interrupts ever generated). That doesn't change the fact that the driver still is being told via some means that link is going up/down.

Just a general FYI (probably not relevant here too much, but I often have to point it out for younger SAs (not saying anyone here is one, but the list is archived...)): there is a very distinct difference between a link being physically up/down vs. administratively up/down. With *IX ifconfig, the social assumption is that there's a 1:1 correlation between those (especially with Ethernet devices), when in reality it depends on the device driver and all subsystems in between.

I remember quite clearly on some OSes (can't remember if BSD or Linux or Solaris) where "ifconfig xxx down" on certain devices would still result in packets being passed across xxx. This used to shock me when I was younger, but nowadays doesn't because I have a better understanding of why.

ifconfig is just a generic tool that interfaces with a lot of things and tries to do too much, in my opinion. On BSD we tend to cram as much crap into ifconfig as humanly possible, while on other OSes separate per-device tools/utilities have been developed to segregate the intended behaviours/desires.

--
| Jeremy Chadwick jdc@koitsu.org |
| UNIX Systems Administrator http://jdc.koitsu.org/ |
| Mountain View, CA, US |
| Making life hard for others since 1977.
PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Tue May 28 15:03:33 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 54F10753 for ; Tue, 28 May 2013 15:03:33 +0000 (UTC) (envelope-from mailnull@mips.inka.de) Received: from mail-in-07.arcor-online.net (mail-in-07.arcor-online.net [151.189.21.47]) by mx1.freebsd.org (Postfix) with ESMTP id 111D1CC for ; Tue, 28 May 2013 15:03:32 +0000 (UTC) Received: from mail-in-01-z2.arcor-online.net (mail-in-01-z2.arcor-online.net [151.189.8.13]) by mx.arcor.de (Postfix) with ESMTP id BB5A9107F66 for ; Tue, 28 May 2013 17:03:31 +0200 (CEST) Received: from mail-in-03.arcor-online.net (mail-in-03.arcor-online.net [151.189.21.43]) by mail-in-01-z2.arcor-online.net (Postfix) with ESMTP id B60877DA92B for ; Tue, 28 May 2013 17:03:31 +0200 (CEST) X-Greylist: Passed host: 94.218.179.254 X-DKIM: Sendmail DKIM Filter v2.8.2 mail-in-03.arcor-online.net 96171D83FC Received: from lorvorc.mips.inka.de (dslb-094-218-179-254.pools.arcor-ip.net [94.218.179.254]) by mail-in-03.arcor-online.net (Postfix) with ESMTPS id 96171D83FC for ; Tue, 28 May 2013 17:03:31 +0200 (CEST) Received: from lorvorc.mips.inka.de (localhost [127.0.0.1]) by lorvorc.mips.inka.de (8.14.7/8.14.7) with ESMTP id r4SF3VDl038996 for ; Tue, 28 May 2013 17:03:31 +0200 (CEST) (envelope-from mailnull@lorvorc.mips.inka.de) Received: (from mailnull@localhost) by lorvorc.mips.inka.de (8.14.7/8.14.7/Submit) id r4SF3VAx038995 for freebsd-stable@freebsd.org; Tue, 28 May 2013 17:03:31 +0200 (CEST) (envelope-from mailnull) From: naddy@mips.inka.de (Christian Weisgerber) Subject: Re: OpenSSH in -STABLE Date: Tue, 28 May 2013 15:03:31 +0000 (UTC) Message-ID: References: <20130522014240.275A5A6E38@smtp.hushmail.com> Originator: naddy@mips.inka.de (Christian Weisgerber) To: freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org 
X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 15:03:33 -0000 wrote: > Hi. Are there any plans to get OpenSSH 6.2 in 9-STABLE? I'd like to > check out the new AES-GCM stuff without going to -CURRENT on this > system. If there are no plans, is there a possibility? Thanks The OpenSSL version in 9-STABLE doesn't have GCM support. -- Christian "naddy" Weisgerber naddy@mips.inka.de From owner-freebsd-stable@FreeBSD.ORG Tue May 28 19:34:17 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 6F58E2FD; Tue, 28 May 2013 19:34:17 +0000 (UTC) (envelope-from mikes@siralan.org) Received: from mail.suso.org (mail.suso.org [66.244.94.5]) by mx1.freebsd.org (Postfix) with ESMTP id 387D8147; Tue, 28 May 2013 19:34:16 +0000 (UTC) Received: from c-98-223-197-163.hsd1.in.comcast.net (c-98-223-197-163.hsd1.in.comcast.net [98.223.197.163]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.suso.org (Postfix) with ESMTP id CD4F513813F; Tue, 28 May 2013 19:34:05 +0000 (GMT) Date: Tue, 28 May 2013 15:34:00 -0400 (EDT) From: "Michael L. 
Squires" X-X-Sender: mikes@familysquires.net To: YongHyeon PYUN Subject: Re: Apparent fxp regression in FreeBSD 8.4-RC3 In-Reply-To: <20130528023300.GA3077@michelle.cdnetworks.com> Message-ID: References: <20130524044919.GA41292@icarus.home.lan> <20130524054720.GA1496@michelle.cdnetworks.com> <20130524.162926.395058052118975996.hrs@allbsd.org> <20130524.163646.628115045676432731.hrs@allbsd.org> <20130526113841.GA1511@michelle.cdnetworks.com> <20130527043923.GA1480@michelle.cdnetworks.com> <20130528023300.GA3077@michelle.cdnetworks.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: jdc@koitsu.org, gjb@freebsd.org, freebsd-stable@freebsd.org, re@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 19:34:17 -0000 Short answer: it didn't work. On Tue, 28 May 2013, YongHyeon PYUN wrote: > On Mon, May 27, 2013 at 01:02:14PM -0400, Michael L. Squires wrote: >> >> On Mon, 27 May 2013, YongHyeon PYUN wrote: >> >>> On Sun, May 26, 2013 at 08:38:41PM +0900, YongHyeon PYUN wrote: >>>> On Fri, May 24, 2013 at 04:36:46PM +0900, Hiroki Sato wrote: >>>>> Hiroki Sato wrote >>>>> in <20130524.162926.395058052118975996.hrs@allbsd.org>: >>>>> >>>>> hr> YongHyeon PYUN wrote >>>>> hr> in <20130524054720.GA1496@michelle.cdnetworks.com>: >>>>> hr> >>>>> hr> A workaround is specifying the following line in rc.conf: >>>>> hr> >>>>> hr> ifconfig_fxp0="DHCP media 100baseTX mediaopt full-duplex" >>>>> >> >> Sorry I've been offline, two trips last week. 
>>
>> I've installed 8.4-RELEASE on the NAT box with the fxp interface:
>>
>> FreeBSD familysquires.net 8.4-RELEASE FreeBSD 8.4-RELEASE #54: Sun May 26
>> 22:56:19 EDT 2013 root@familysquires.net:/usr/obj/usr/src/sys/NEWGATE
>> i386
>>
>> and am using the workaround given above which has stopped the fxp interface
>> cycling on/off.
>>
>> I'll have access to the other box on Wednesday and will try the other test.
>
> Here is patch I'm testing and it seems to work with dhclient on
> CURRENT.
> Mike, could you try attached patch?

Patch did not solve the problem on the home NAT box. I'll try it on the second 1U box at work tomorrow.

I applied the patch (see below) and recompiled/reinstalled world.

root@familysquires:/usr/src/sys/dev/fxp # uname -a
FreeBSD familysquires.net 8.4-RELEASE FreeBSD 8.4-RELEASE #54: Sun May 26
22:56:19 EDT 2013 root@familysquires.net:/usr/obj/usr/src/sys/NEWGATE
i386

drwxr-xr-x  236 root   3584 May 28 10:28 ../
-rw-r--r--    1 root  95366 May 28 10:28 if_fxp.c
-rw-r--r--    1 root  94968 Mar 28 09:04 if_fxp.c.orig
-rw-r--r--    1 root  15638 Mar 28 09:04 if_fxpreg.h
-rw-r--r--    1 root   8717 Mar 28 09:04 if_fxpvar.h
-rw-r--r--    1 root  23009 Mar 28 09:04 rcvbundl.h

One immediate difference in behavior is that without the modified rc.conf the box was unable to use ntp to the outside world; it eventually sync'd on my internal ntp server. With the modified rc.conf the box immediately sync'd to an ntp server in the outside world.
Result in messages was:

May 28 13:39:24 familysquires kernel: fxp0: link state changed to DOWN
May 28 13:39:24 familysquires dhclient: New Subnet Mask (fxp0): 255.255.240.0
May 28 13:39:24 familysquires dhclient: New Broadcast Address (fxp0): 255.255.255.255
May 28 13:39:24 familysquires dhclient: New Routers (fxp0): xx.xxx.xxx.1
May 28 13:39:26 familysquires kernel: fxp0: link state changed to UP
May 28 13:39:26 familysquires dhclient: New IP Address (fxp0): xx.xxx.xxx.163
May 28 13:39:26 familysquires kernel: fxp0: link state changed to DOWN
May 28 13:39:26 familysquires dhclient: New Subnet Mask (fxp0): 255.255.240.0
May 28 13:39:26 familysquires dhclient: New Broadcast Address (fxp0): 255.255.255.255
May 28 13:39:26 familysquires dhclient: New Routers (fxp0): xx.xxx.xxx.1
May 28 13:39:28 familysquires kernel: fxp0: link state changed to UP
May 28 13:39:31 familysquires dhclient: New IP Address (fxp0): xx.xxx.xxx.163
May 28 13:39:31 familysquires kernel: fxp0: link state changed to DOWN
May 28 13:39:31 familysquires dhclient: New Subnet Mask (fxp0): 255.255.240.0
May 28 13:39:31 familysquires dhclient: New Broadcast Address (fxp0): 255.255.255.255
May 28 13:39:31 familysquires dhclient: New Routers (fxp0): xx.xxx.xxx.1
May 28 13:39:33 familysquires kernel: fxp0: link state changed to UP
May 28 13:39:36 familysquires dhclient: New IP Address (fxp0): xx.xxx.xxx.163
May 28 13:39:36 familysquires kernel: fxp0: link state changed to DOWN
May 28 13:39:36 familysquires dhclient: New Subnet Mask (fxp0): 255.255.240.0
May 28 13:39:36 familysquires dhclient: New Broadcast Address (fxp0): 255.255.255.255
May 28 13:39:36 familysquires dhclient: New Routers (fxp0): xx.xxx.xxx.1
May 28 13:39:38 familysquires kernel: fxp0: link state changed to UP
May 28 13:39:40 familysquires reboot: rebooted by root

Mike Squires
mikes@siralan.org
UN*X at home since 1986

From owner-freebsd-stable@FreeBSD.ORG Tue May 28 23:48:09 2013 Return-Path: Delivered-To: freebsd-stable@FreeBSD.org
Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 85F65485; Tue, 28 May 2013 23:48:09 +0000 (UTC) (envelope-from hrs@FreeBSD.org) Received: from mail.allbsd.org (gatekeeper.allbsd.org [IPv6:2001:2f0:104:e001::32]) by mx1.freebsd.org (Postfix) with ESMTP id 0036F3CB; Tue, 28 May 2013 23:48:08 +0000 (UTC) Received: from alph.d.allbsd.org (p2175-ipbf701funabasi.chiba.ocn.ne.jp [122.25.209.175]) (authenticated bits=128) by mail.allbsd.org (8.14.5/8.14.5) with ESMTP id r4SNlnsU084004 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 29 May 2013 08:47:59 +0900 (JST) (envelope-from hrs@FreeBSD.org) Received: from localhost (localhost [127.0.0.1]) (authenticated bits=0) by alph.d.allbsd.org (8.14.5/8.14.5) with ESMTP id r4SNlkHs022910; Wed, 29 May 2013 08:47:47 +0900 (JST) (envelope-from hrs@FreeBSD.org) Date: Wed, 29 May 2013 08:47:14 +0900 (JST) Message-Id: <20130529.084714.2036194399784240097.hrs@allbsd.org> To: pyunyh@gmail.com Subject: Re: Apparent fxp regression in FreeBSD 8.4-RC3 From: Hiroki Sato In-Reply-To: <20130528023300.GA3077@michelle.cdnetworks.com> References: <20130527043923.GA1480@michelle.cdnetworks.com> <20130528023300.GA3077@michelle.cdnetworks.com> X-PGPkey-fingerprint: BDB3 443F A5DD B3D0 A530 FFD7 4F2C D3D8 2793 CF2D X-Mailer: Mew version 6.5 on Emacs 24.3 / Mule 6.0 (HANACHIRUSATO) Mime-Version: 1.0 Content-Type: Multipart/Signed; protocol="application/pgp-signature"; micalg=pgp-sha1; boundary="--Security_Multipart(Wed_May_29_08_47_14_2013_729)--" Content-Transfer-Encoding: 7bit X-Virus-Scanned: clamav-milter 0.97.4 at gatekeeper.allbsd.org X-Virus-Status: Clean X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (mail.allbsd.org [133.31.130.32]); Wed, 29 May 2013 08:48:00 +0900 (JST) X-Spam-Status: No, score=-94.5 required=13.0 tests=CONTENT_TYPE_PRESENT, ONLY1HOPDIRECT,RCVD_IN_PBL,SAMEHELOBY2HOP,USER_IN_WHITELIST autolearn=no version=3.3.2 
X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on gatekeeper.allbsd.org Cc: jdc@koitsu.org, gjb@FreeBSD.org, re@FreeBSD.org, freebsd-stable@FreeBSD.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 May 2013 23:48:09 -0000 ----Security_Multipart(Wed_May_29_08_47_14_2013_729)-- Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit YongHyeon PYUN wrote in <20130528023300.GA3077@michelle.cdnetworks.com>: py> > I'll have access to the other box on Wednesday and will try the other test. py> py> Here is patch I'm testing and it seems to work with dhclient on py> CURRENT. py> Mike, could you try attached patch? On my box it worked without problem. Link status change of fxp0 was down->up only in the patched driver. -- Hiroki ----Security_Multipart(Wed_May_29_08_47_14_2013_729)-- Content-Type: application/pgp-signature Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.13 (FreeBSD) iEYEABECAAYFAlGlQgIACgkQTyzT2CeTzy0ITwCeJsKsOJouEB4UVqwpumE80gQj 7gQAn3oCf3eEk0HNGsYi764xaKhcOOed =3lrH -----END PGP SIGNATURE----- ----Security_Multipart(Wed_May_29_08_47_14_2013_729)---- From owner-freebsd-stable@FreeBSD.ORG Wed May 29 06:31:38 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 34A0FFB6; Wed, 29 May 2013 06:31:38 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pd0-f181.google.com (mail-pd0-f181.google.com [209.85.192.181]) by mx1.freebsd.org (Postfix) with ESMTP id F3B01DB3; Wed, 29 May 2013 06:31:37 +0000 (UTC) Received: by mail-pd0-f181.google.com with SMTP id bv13so6696809pdb.12 for ; Tue, 28 May 2013 23:31:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=/6DQq+NXILi2EgTu3/AEvbnoP4Vid0pgph1WS7Wg6Vg=; b=y2GdA3Oc4ul+9HwboDph9V7Kf6y3tZhOgsqpt1gOpjKUZwqviW45jsIJVlG80tcwlG YuLBTkgV47uMUvlCBiBRJAO3k/1hKERhDfJ1DKUiCbt+8Vj29DooYJPZ6Mcom3Kydm7M 1/WgoLDg8fAgsF0bPtNE929rdLGj0o3z15xZry3JxzMtSBizIGZ9tZ321IXhdwxAMq7H ldx6Vm/j8KYTO9Ky5CSBSdc9PcWpu/0a3v4O5lrwi5f7ZK6YNJATHxdCtI0222/ktYc8 rT+wHzbKQXye4JaTa5oGY7rtag6+HHHQh5u+UmpcjcbdRc/1kJVE4+AuZ60ow4HSH9Qk f/JA== X-Received: by 10.66.82.69 with SMTP id g5mr1850951pay.179.1369809097374; Tue, 28 May 2013 23:31:37 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPSA id dr6sm38548953pac.11.2013.05.28.23.31.32 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Tue, 28 May 2013 23:31:35 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Wed, 29 May 2013 15:31:27 +0900 From: YongHyeon PYUN Date: Wed, 29 May 2013 15:31:27 +0900 To: Hiroki Sato Subject: Re: Apparent fxp regression in FreeBSD 8.4-RC3 Message-ID: <20130529063127.GA3042@michelle.cdnetworks.com> References: <20130527043923.GA1480@michelle.cdnetworks.com> <20130528023300.GA3077@michelle.cdnetworks.com> <20130529.084714.2036194399784240097.hrs@allbsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130529.084714.2036194399784240097.hrs@allbsd.org> User-Agent: Mutt/1.4.2.3i Cc: jdc@koitsu.org, gjb@freebsd.org, re@freebsd.org, freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 06:31:38 -0000 On Wed, May 29, 2013 at 08:47:14AM +0900, Hiroki Sato wrote: > YongHyeon PYUN wrote > in 
<20130528023300.GA3077@michelle.cdnetworks.com>: > > py> > I'll have access to the other box on Wednesday and will try the other test. > py> > py> Here is patch I'm testing and it seems to work with dhclient on > py> CURRENT. > py> Mike, could you try attached patch? > > On my box it worked without problem. Link status change of fxp0 was > down->up only in the patched driver. Thanks for testing! > > -- Hiroki From owner-freebsd-stable@FreeBSD.ORG Wed May 29 06:42:33 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 849B73BA; Wed, 29 May 2013 06:42:33 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pb0-x22e.google.com (mail-pb0-x22e.google.com [IPv6:2607:f8b0:400e:c01::22e]) by mx1.freebsd.org (Postfix) with ESMTP id 4D287E68; Wed, 29 May 2013 06:42:33 +0000 (UTC) Received: by mail-pb0-f46.google.com with SMTP id rq2so8831897pbb.19 for ; Tue, 28 May 2013 23:42:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=PrSZ2wHrFr986/SAHgUWEFJ0nwM6nSs9j47hCGOBrxA=; b=HBz2RDvDHh95hOdS0L7LJCsbPOz6cUBQmJ7h9tWE3KLHFJpovGGKvHcXAwh8oprp10 2vTYAtyZa3AGSRxGx9DdTfQQD67B+WWomBj5qjEZ88TeZ71zR5HK4WBvh4WJH1Pl/qoW zbL1MWYciu3FXqCTcRLR4G6991ThCUdNzFjpQwHuyLQlX14lgnfZ1FojslvRwQ4Cv90+ hxTeJNWHzC2dSrgFF+TGYCW70umMVrH1czncLeL5BHxfefl6D6log3Yid8ocHVLMJ2wN cbL0KiHADsvjtFZsehl+iDeTVhHyW4ANTQHt7hyLrbJtDUMbA0NKdWmJHo6bFPJWi68y i4lA== X-Received: by 10.68.190.104 with SMTP id gp8mr1435053pbc.120.1369809753070; Tue, 28 May 2013 23:42:33 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. 
[114.111.62.249]) by mx.google.com with ESMTPSA id w8sm36114902pbo.9.2013.05.28.23.42.28 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Tue, 28 May 2013 23:42:31 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Wed, 29 May 2013 15:42:24 +0900 From: YongHyeon PYUN Date: Wed, 29 May 2013 15:42:24 +0900 To: "Michael L. Squires" Subject: Re: Apparent fxp regression in FreeBSD 8.4-RC3 Message-ID: <20130529064224.GB3042@michelle.cdnetworks.com> References: <20130524044919.GA41292@icarus.home.lan> <20130524054720.GA1496@michelle.cdnetworks.com> <20130524.162926.395058052118975996.hrs@allbsd.org> <20130524.163646.628115045676432731.hrs@allbsd.org> <20130526113841.GA1511@michelle.cdnetworks.com> <20130527043923.GA1480@michelle.cdnetworks.com> <20130528023300.GA3077@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: jdc@koitsu.org, gjb@freebsd.org, freebsd-stable@freebsd.org, re@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 06:42:33 -0000 On Tue, May 28, 2013 at 03:34:00PM -0400, Michael L. Squires wrote: > Short answer: it didn't work. [...] > Patch did not solve the problem on the home NAT box. I'll try it on the Hmm, I can't reproduce it on my box. I double checked every possible controller initialization sequences in driver but couldn't find a clue. Let you know if I manage to narrow down the issue. > second 1U box at work tomorrow. > > I applied the patch (see below) and recompiled/reinstalled world. > Rebuilding kernel should be enough. 
> root@familysquires:/usr/src/sys/dev/fxp # uname -a > FreeBSD familysquires.net 8.4-RELEASE FreeBSD 8.4-RELEASE #54: Sun May 26 > 22:56:19 EDT 2013 root@familysquires.net:/usr/obj/usr/src/sys/NEWGATE > i386 > > drwxr-xr-x 236 root 3584 May 28 10:28 ../ > -rw-r--r-- 1 root 95366 May 28 10:28 if_fxp.c > -rw-r--r-- 1 root 94968 Mar 28 09:04 if_fxp.c.orig > -rw-r--r-- 1 root 15638 Mar 28 09:04 if_fxpreg.h > -rw-r--r-- 1 root 8717 Mar 28 09:04 if_fxpvar.h > -rw-r--r-- 1 root 23009 Mar 28 09:04 rcvbundl.h > > One immediate difference in behavior is that without the modified rc.conf > the box was unable to use ntp to the outside world; it eventually sync'd on > my internal ntp server. With the modified rc.conf the box immediately > sync'd to an ntp server in the outside world. > There is a side-effect of the rc.conf workaround. Parallel detection may or may not work and generally can result in duplex mismatch. From owner-freebsd-stable@FreeBSD.ORG Wed May 29 06:47:28 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 061E052F for ; Wed, 29 May 2013 06:47:28 +0000 (UTC) (envelope-from kamikaze@bsdforen.de) Received: from mail.server1.bsdforen.de (bsdforen.de [82.193.243.81]) by mx1.freebsd.org (Postfix) with ESMTP id C1036EB0 for ; Wed, 29 May 2013 06:47:27 +0000 (UTC) Received: from mobileKamikaze.norad (HSI-KBW-134-3-231-194.hsi14.kabel-badenwuerttemberg.de [134.3.231.194]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.server1.bsdforen.de (Postfix) with ESMTPSA id 5B1B2861C5 for ; Wed, 29 May 2013 08:41:39 +0200 (CEST) Message-ID: <51A5A322.1020503@bsdforen.de> Date: Wed, 29 May 2013 08:41:38 +0200 From: Dominic Fandrey User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130518 Thunderbird/17.0.6 MIME-Version: 1.0 To: freebsd-stable@freebsd.org Subject: System doesn't 
dump X-Enigmail-Version: 1.5.1 Content-Type: text/plain; charset=ascii Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 06:47:28 -0000 I have a number of actions that reliably panic the system, such as performing shutdown -p (yes I'm booting into an inconsistent file system every time). Both with my notebook and my workstation. However I cannot get the system to dump. dumpdir=/var/crash and I've tried ada0s2b, /dev/ada0s2b, label/5swap, /dev/label/5swap and AUTO for dumpdev to no avail. The swap partition is 16g, the machines have 8g RAM and there's plenty of hard disk space available for /var/crash. I'm looking for that secret, undocumented trigger, that makes the system dump if a panic occurs. Once upon a time dumping just worked if the swap partition was large enough. I miss those olden days. -- A: Because it fouls the order in which people normally read text. Q: Why is top-posting such a bad thing? A: Top-posting. Q: What is the most annoying thing on usenet and in e-mail? 
From owner-freebsd-stable@FreeBSD.ORG Wed May 29 07:11:43 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id CE721A59 for ; Wed, 29 May 2013 07:11:43 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta11.emeryville.ca.mail.comcast.net (qmta11.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:44:76:96:27:211]) by mx1.freebsd.org (Postfix) with ESMTP id 9CDCFFB6 for ; Wed, 29 May 2013 07:11:43 +0000 (UTC) Received: from omta18.emeryville.ca.mail.comcast.net ([76.96.30.74]) by qmta11.emeryville.ca.mail.comcast.net with comcast id hjBi1l0031bwxycABjBiKi; Wed, 29 May 2013 07:11:42 +0000 Received: from jdc.koitsu.org ([67.180.84.87]) by omta18.emeryville.ca.mail.comcast.net with comcast id hjBh1l00A1t3BNj8ejBhAv; Wed, 29 May 2013 07:11:42 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 379C473A33; Wed, 29 May 2013 00:11:41 -0700 (PDT) Date: Wed, 29 May 2013 00:11:41 -0700 From: Jeremy Chadwick To: Dominic Fandrey Subject: Re: System doesn't dump Message-ID: <20130529071141.GA90903@icarus.home.lan> References: <51A5A322.1020503@bsdforen.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <51A5A322.1020503@bsdforen.de> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1369811502; bh=/nolIBy/nD6opeFhI3PPKC9OCJeJzdIt/hYreTEmsJI=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; b=KxOC0B+dNdL4cmKpUT/y8Qk+nnd52wJ/OunPOmS6r3ONeSLi2jFjcZJvq85vEWHKc yh2wnB9+pxI2fMzUknGkoCQ3nmYhW0EDCFppoJmecqOyhfPMyJqcEI0fT/4y/Kgo/5 njw6e4A1/fvFnVwF6GiPFJL4dOyTq+qAstRgNz5SDD9wpoG3AT4ZnQmQISnP10uugM rZXTw9agKsN4fKRb3j5BarHkLV91naElu9HCbW5EoaZ5dupk8EcWVs++vXuW5zocT1 pcXyWSX2VOX9y6X93ezS91NnFECEzrAUH7Z/wk2qLl7zqsTd5aMGNz5qgo3fPN1IKg XYu8xEpB2df1Q== Cc: freebsd-stable@freebsd.org X-BeenThere: 
freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 07:11:43 -0000 On Wed, May 29, 2013 at 08:41:38AM +0200, Dominic Fandrey wrote: > I have a number of actions that reliably panic the system, such as > performing shutdown -p (yes I'm booting into an inconsistent file > system every time). Both with my notebook and my workstation. > > However I cannot get the system to dump. > > dumpdir=/var/crash > and I've tried ada0s2b, /dev/ada0s2b, label/5swap, /dev/label/5swap and AUTO > for dumpdev to no avail. > > The swap partition is 16g, the machines have 8g RAM and there's plenty > of hard disk space available for /var/crash. > > I'm looking for that secret, undocumented trigger, that makes the > system dump if a panic occurs. Once upon a time dumping just worked > if the swap partition was large enough. I miss those olden days. Foremost: the fact you did not disclose your FreeBSD version (and SVN rev if you have it) nor architecture is disappointing. It matters more than you think. Please disclose it. Onward ho... If you have VGA console access, try dropping to db> and issuing the command "call doadump" (possibly preceded by "panic"). If you have serial console access, there are ways to drop to ddb but it depends on your kernel config (look for BREAK_TO_DEBUGGER and ALT_BREAK_TO_DEBUGGER in /sys/conf/NOTES). "Break" with serial, by the way, means a serial-level break signal (often why I prefer ALT_BREAK_TO_DEBUGGER). After doing "call doadump" you should definitely see the kernel dumping memory to swap (it gives a progress indicator of sorts). Google for the phrase "call doadump" and look at some of the results to get an idea of what the output normally is during that phase, for comparison. If you don't see such, I'm sure many of the kernel folks here can help figure out why. 
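As a rough sketch of the kernel side of the above, assuming a custom kernel is being built: the option names below are the ones referred to in /sys/conf/NOTES, while the comments are shorthand rather than the official descriptions. Whether a given kernel already carries some of these depends on the branch and config in use, so it is worth checking before adding anything.

```conf
# Kernel config fragment (sketch only, not a complete config)
options 	KDB			# kernel debugger framework
options 	DDB			# ddb(4) backend, provides the db> prompt
options 	BREAK_TO_DEBUGGER	# serial BREAK enters the debugger
options 	ALT_BREAK_TO_DEBUGGER	# alternate key sequence on the serial console enters the debugger
```

With such a kernel, "call doadump" is typed at the db> prompt exactly as described above.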
See sysctl debug.ddb.scripting.scripts for what should get automatically done on a panic. This may or may not be affected by ddb_enable="yes" in rc.conf (which mandates DDB being enabled in your kernel) -- I can't remember though, so someone else may want to comment. If your issue is that the kernel actually *does* dump memory to swap but that on boot-up savecore(8) doesn't recover the memory dump and populate relevant files in /var/crash: that's a separate issue that has been discussed for probably 10 years or longer with (to my knowledge) no definitive explanation. Theories presented (going off of memory here) were that something ended up writing over parts of the "panic metadata" on the swap disk/slice/etc. and thus savecore(8) finds nothing. This is why rc scripts/etc. have to make sure to look for the swap "panic metadata" and run savecore(8) **before** issuing dumpon(8). My opinion, others' may vary: Stick with using dumpdev="auto" in rc.conf, assuming you have a /etc/fstab entry of "swap" somewhere. Swap should ideally be a partition or slice, not something abstracted out by other layers (see above paragraph for why I advocate that, but my additional opinion is that when it comes to getting a kernel dump and system configurations, KISS principle applies heavily. If your system is crashing, the last thing you want to deal with is why you can't get a kernel dump -- you could spend more time doing that than you do getting the panic info + debugging the actual crash), but again, this is my own opinion and there are legitimate other opinions as well -- I just follow what I do because I know it works. Likewise I always get wary of people's setups when I start seeing labels mentioned. *waves cane* Screw all this newfandangled stuff. :-) -- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed May 29 08:10:18 2013 Return-Path: Delivered-To: freebsd-stable@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id CE551ADF for ; Wed, 29 May 2013 08:10:18 +0000 (UTC) (envelope-from olli@grabthar.secnetix.de) Received: from grabthar.secnetix.de (grabthar.secnetix.de [212.17.241.225]) by mx1.freebsd.org (Postfix) with ESMTP id 37AEC319 for ; Wed, 29 May 2013 08:10:17 +0000 (UTC) Received: from grabthar.secnetix.de (localhost [127.0.0.1]) by grabthar.secnetix.de (8.14.5/8.14.5) with ESMTP id r4T89EO9024070; Wed, 29 May 2013 10:09:14 +0200 (CEST) (envelope-from oliver.fromme@secnetix.de) Received: (from olli@localhost) by grabthar.secnetix.de (8.14.5/8.14.5/Submit) id r4T89EvT024069; Wed, 29 May 2013 10:09:14 +0200 (CEST) (envelope-from olli) Date: Wed, 29 May 2013 10:09:14 +0200 (CEST) Message-Id: <201305290809.r4T89EvT024069@grabthar.secnetix.de> From: Oliver Fromme To: freebsd-stable@FreeBSD.ORG Subject: 9.1-stable: ATI IXP600 AHCI: CAM timeout X-Newsgroups: list.freebsd-stable User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (FreeBSD/9.1-PRERELEASE-20120811 (i386)) MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 08:10:18 -0000 Hi, Yesterday I downloaded the latest 9.1 snapshot (May 15th) from ftp.freebsd.org and installed it on a machine that was previously running Linux. It works fine, except that I get many of the following when there is heavy disk I/O, e.g. 
when building world or ports: ahcich0: Timeout on slot 23 port 0 ahcich0: is 00000000 cs f07fffff ss ffffffff rs ffffffff tfd c0 serr 00000000 cmd 0004bc17 (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 c9 e0 40 04 00 00 00 00 00 (ada0:ahcich0:0:0:0): CAM status: Command timeout (ada0:ahcich0:0:0:0): Retrying command It happens for *both* ahcich0/ada0 and ahcich1/ada1 equally often (it's a gmirror), sometimes even at exactly the same time so the messages for ada0 and ada1 are interleaved in the dmesg output. The worst thing is that the whole system seems to freeze completely for about 10 seconds each time it happens. Other than that, I haven't seen any ill effects, i.e. no processes dying and no panics (so far). But the system is quite unusable because of the freezes. I'm pretty sure the hardware has no defects. The machine was running Linux fine until recently. Are there any known issues with FreeBSD + ATI IXP600? The kernel is the default GENERIC from the snapshot, the only additional modules loaded are geom_mirror and linux.ko. The dmesg messages related to disks are copied below, and the full dmesg can be found here: http://www.secnetix.de/olli/tmp/dmesg.nox.txt Best regards Oliver FreeBSD 9.1-STABLE #0: Mon May 13 05:10:23 UTC 2013 root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 .. ahci0: port 0xb000-0xb007,0xa000-0xa003,0x9000-0x9007,0x8000-0x8003,0x7000-0x700f mem 0xfe7ff800-0xfe7ffbff irq 22 at device 18.0 on pci0 ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier supported ahcich0: at channel 0 on ahci0 ahcich1: at channel 1 on ahci0 ahcich2: at channel 2 on ahci0 ahcich3: at channel 3 on ahci0 .. .. (aprobe0:ahcich0:0:15:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00 (aprobe0:ahcich0:0:15:0): CAM status: Command timeout (aprobe0:ahcich0:0:15:0): Error 5, Retries exhausted (aprobe1:ahcich1:0:15:0): NOP. 
ACB: 00 00 00 00 00 00 00 00 00 00 00 00 (aprobe1:ahcich1:0:15:0): CAM status: Command timeout (aprobe1:ahcich1:0:15:0): Error 5, Retries exhausted ada0 at ahcich0 bus 0 scbus0 target 0 lun 0 ada0: ATA-8 SATA 2.x device ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada0: Command Queueing enabled ada0: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C) ada0: Previously was known as ad4 ada1 at ahcich1 bus 0 scbus1 target 0 lun 0 ada1: ATA-8 SATA 2.x device ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada1: Command Queueing enabled ada1: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C) ada1: Previously was known as ad6 .. GEOM_MIRROR: Device mirror/gm0 launched (2/2). .. Trying to mount root from ufs:/dev/mirror/gm0s1a [rw]... .. ahcich0: Timeout on slot 23 port 0 ahcich0: is 00000000 cs f07fffff ss ffffffff rs ffffffff tfd c0 serr 00000000 cmd 0004bc17 (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 c9 e0 40 04 00 00 00 00 00 (ada0:ahcich0:0:0:0): CAM status: Command timeout (ada0:ahcich0:0:0:0): Retrying command ahcich1: Timeout on slot 12 port 0 ahcich1: is 00000000 cs ffff8fff ss ffffffff rs ffffffff tfd 40 serr 00000000 cmd 0004ee17 (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 80 85 e3 40 04 00 00 00 00 00 (ada1:ahcich1:0:0:0): CAM status: Command timeout (ada1:ahcich1:0:0:0): Retrying command ahcich1: Timeout on slot 2 port 0 ahcich1: is 00000000 cs 00000000 ss 0000001c rs 0000001c tfd 40 serr 00000000 cmd 0004e417 ahcich0: Timeout on slot 12 port 0 (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 e0 04 e3 40 04 00 00 00 00 00 (ada1:ahcich1:0:0:0): CAM status: Command timeout ahcich0: is 00000000 cs 00000000 ss 00007000 rs 00007000 tfd 40 serr 00000000 cmd 0004ee17 (ada1:ahcich1:0:0:0): Retrying command (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. 
ACB: 61 08 e0 04 e3 40 04 00 00 00 00 00 (ada0:ahcich0:0:0:0): CAM status: Command timeout (ada0:ahcich0:0:0:0): Retrying command pid 40615 (try), uid 0: exited on signal 10 (core dumped) ahcich1: Timeout on slot 7 port 0 ahcich1: is 00000000 cs fffff07f ss ffffffff rs ffffffff tfd c0 serr 00000000 cmd 0004ac17 ahcich0: (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 7d 92 40 02 00 00 00 00 00 Timeout on slot 19 port 0 (ada1:ahcich1:0:0:0): CAM status: Command timeout ahcich0: is 00000000 cs ff07ffff ss ffffffff rs ffffffff tfd c0 serr 00000000 cmd 0004b817 (ada1:ahcich1:0:0:0): Retrying command (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 7d 92 40 02 00 00 00 00 00 (ada0:ahcich0:0:0:0): CAM status: Command timeout (ada0:ahcich0:0:0:0): Retrying command ahcich1: Timeout on slot 12 port 0 ahcich1: is 00000000 cs 00000000 ss 0000f000 rs 0000f000 tfd 40 serr 00000000 cmd 0004ef17 ahcich0: Timeout on slot 24 port 0 (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 d8 78 e4 40 04 00 00 00 00 00 (ada1:ahcich1:0:0:0): CAM status: Command timeout ahcich0: is 00000000 cs 00000000 ss 0f000000 rs 0f000000 tfd 40 serr 00000000 cmd 0004fb17 (ada1:ahcich1:0:0:0): Retrying command (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 d8 78 e4 40 04 00 00 00 00 00 (ada0:ahcich0:0:0:0): CAM status: Command timeout (ada0:ahcich0:0:0:0): Retrying command ahcich1: Timeout on slot 1 port 0 ahcich1: is 00000000 cs 00000000 ss 0000003e rs 0000003e tfd 40 serr 00000000 cmd 0004e517 ahcich0: Timeout on slot 13 port 0 (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 30 e0 e4 40 04 00 00 00 00 00 (ada1:ahcich1:0:0:0): CAM status: Command timeout ahcich0: is 00000000 cs 00000000 ss 0003e000 rs 0003e000 tfd 40 serr 00000000 cmd 0004f117 (ada1:ahcich1:0:0:0): Retrying command (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 30 e0 e4 40 04 00 00 00 00 00 (ada0:ahcich0:0:0:0): CAM status: Command timeout (ada0:ahcich0:0:0:0): Retrying command .. .. 
-- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing Handelsregister: Amtsgericht Muenchen, HRA 74606, Geschäftsfuehrung: secnetix Verwaltungsgesellsch. mbH, Handelsreg.: Amtsgericht München, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart FreeBSD-Dienstleistungen/-Produkte + mehr: http://www.secnetix.de/bsd "The most important decision in [programming] language design concerns what is to be left out." -- Niklaus Wirth From owner-freebsd-stable@FreeBSD.ORG Wed May 29 08:55:53 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 784B4739 for ; Wed, 29 May 2013 08:55:53 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pb0-x234.google.com (mail-pb0-x234.google.com [IPv6:2607:f8b0:400e:c01::234]) by mx1.freebsd.org (Postfix) with ESMTP id 5079F75F for ; Wed, 29 May 2013 08:55:53 +0000 (UTC) Received: by mail-pb0-f52.google.com with SMTP id um15so8940329pbc.11 for ; Wed, 29 May 2013 01:55:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=c5oSk141zXkueaOSbrDs+5drhknO9giLpBWlVBdRQww=; b=YW53RP5w/VCwwMz7h2zOfsBvvaBkYCvOInHiu36SWH5C4Bjw022hIjoJ+a06sRNTE1 b352P8pX+20LaXH6RLAXdBLaTiJPR7TPCx191zwqjZMz+qIIcj8vrrX4YWyEnED1hakh +UE9jdR1ADJaqem2IaM9i21s8j6yhtmWynheCRLrTnapxOcixac/WI3ST3aNvK5Vo0el 9kENUvUG9A6kzYWt06/B1oDabnm2gmNNPvGPHrxPpX0lyC7uL3QM1CtqjMqXTGfdzTJx 61pG3iw8bgNZv1tpuQFLU9z5Wgk6BePrBhs5yRXShHoGDrDD2qqTJni93UoIwKvzSZG8 rPNQ== X-Received: by 10.68.42.134 with SMTP id o6mr1814028pbl.149.1369817752755; Wed, 29 May 2013 01:55:52 -0700 (PDT) Received: from pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. 
[114.111.62.249]) by mx.google.com with ESMTPSA id ue8sm39033220pac.14.2013.05.29.01.55.49 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Wed, 29 May 2013 01:55:51 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Wed, 29 May 2013 17:55:44 +0900 From: YongHyeon PYUN Date: Wed, 29 May 2013 17:55:44 +0900 To: Daniel Braniss Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP Message-ID: <20130529085544.GC3042@michelle.cdnetworks.com> References: <20130528052953.GA1457@michelle.cdnetworks.com> <20130528064850.GB1457@michelle.cdnetworks.com> Mime-Version: 1.0 Content-Type: multipart/mixed; boundary="/04w6evG8XlLl3ft" Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 08:55:53 -0000 --/04w6evG8XlLl3ft Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Tue, May 28, 2013 at 09:55:24AM +0300, Daniel Braniss wrote: > > On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote: > > > > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote: > > > > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: > > > > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, > > > > > > > > > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. 
> > > > > > > > > > > > > > > > bge0: mem > > > > > 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6 > > > > > bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > > > miibus2: on bge0 > > > > > brgphy0: PHY 1 on miibus2 > > > > > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > > > bge0: Ethernet address: 00:1b:24:5d:5b:bd > > > > > bge1: mem > > > > > 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6 > > > > > bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > > > miibus3: on bge1 > > > > > brgphy1: PHY 1 on miibus3 > > > > > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > > > bge1: Ethernet address: 00:1b:24:5d:5b:be > > > > > > > > > > sf-10> ifconfig bge1 > > > > > bge1: flags=8802 metric 0 mtu 1500 > > > > > options=8009b > > > > TE> > > > > > ether 00:1b:24:5d:5b:be > > > > > nd6 options=21 > > > > > media: Ethernet autoselect (100baseTX ) > > > > > status: active > > > > > > > > > > > > > Because bge1 is not UP, I wonder how you get link UP/DOWN events. > > > > Do you have some network script run by cron? > > > > > > no scripts. > > > this port is shared with the ILO/IPMI, and back in March you fixed a problem > > > that it was hanging soon after it was initialized by the driver, > > > (r248226 - but I'm not sure if it was ever MFC'ed). > > > > It was MFCed. > > > > > Initialy I thought it could be caused by connections to it from other > > > hosts (either via the web, or ssh) so I killed them, but it didn't help. > > > without that patch the connection fails, and I don't see any DOWN/UP. > > > > Could you check how many number of interrupts you get from bge1? > > Ideally you shouldn't get any interrupts for bge1. 
> > it's not even mentioned :-) > sf-04> vmstat -i > interrupt total rate > irq3: uart1 964 0 > irq4: uart0 6 0 > irq14: ata0 227354 0 > irq17: bge0 1021981 2 > irq21: ohci0 28 0 > irq22: ehci0 2 0 > irq23: atapci1 293228 0 > cpu0:timer 383244076 1124 > cpu1:timer 2225144 6 > cpu2:timer 2056087 6 > cpu3:timer 2093943 6 > Total 391162813 1147 > Then the only way link UP/DOWN event could be generated for DOWN interface would be invocation of media status query (i.e. ifconfig -a) triggered by an external application. Most drivers I touched check IFF_UP flag before poking media status register. However I'm not sure you're seeing this issue because you do not use any network script run by cron. Anyway, try attached patch and let me know whether it makes any difference. > > > > > > > > > > > > > > > > is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. > > > > > > > To check, I upgraded another identical host, and the same problem appears. > > > > > > > > > > > > What is the last known working revision? > > > > > > > > > > I have no idea, but I have older versions, and ill start from the oldets > > > > > (9.1-prerelease), but > > > > > it will take time, since it takes hours till it happens. > > > > > > > > > > > > > ok. 
> > > > > > > > --/04w6evG8XlLl3ft Content-Type: text/x-diff; charset=us-ascii Content-Disposition: attachment; filename="bge.media_sts.diff" Index: sys/dev/bge/if_bge.c =================================================================== --- sys/dev/bge/if_bge.c (revision 251021) +++ sys/dev/bge/if_bge.c (working copy) @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar BGE_LOCK(sc); + if ((ifp->if_flags & IFF_UP) == 0) { + BGE_UNLOCK(sc); + return; + } if (sc->bge_flags & BGE_FLAG_TBI) { ifmr->ifm_status = IFM_AVALID; ifmr->ifm_active = IFM_ETHER; --/04w6evG8XlLl3ft-- From owner-freebsd-stable@FreeBSD.ORG Wed May 29 09:32:45 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E718FCB3 for ; Wed, 29 May 2013 09:32:45 +0000 (UTC) (envelope-from pascal.braun@continum.net) Received: from mailsrv1.continum.net (mr1.continum.net [80.72.129.121]) by mx1.freebsd.org (Postfix) with ESMTP id 7480A8FD for ; Wed, 29 May 2013 09:32:45 +0000 (UTC) Received: from zimbra.continum.net ([80.72.133.238]) by mr1.continum.net with esmtp (Exim 4.67) (envelope-from ) id 1UhcIe-0000Ee-5Y for freebsd-stable@freebsd.org; Wed, 29 May 2013 11:04:04 +0200 Received: from localhost (localhost [127.0.0.1]) by zimbra.continum.net (Postfix) with ESMTP id 1E68B1CE012 for ; Wed, 29 May 2013 11:04:01 +0200 (CEST) X-Virus-Scanned: amavisd-new at zimbra.continum.net Received: from zimbra.continum.net ([127.0.0.1]) by localhost (zimbra.continum.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id eOiyZHpGwrPI for ; Wed, 29 May 2013 11:04:00 +0200 (CEST) Received: from zimbra.continum.net (zimbra.continum.net [80.72.133.238]) by zimbra.continum.net (Postfix) with ESMTP id C108519E02A for ; Wed, 29 May 2013 11:04:00 +0200 (CEST) Date: Wed, 29 May 2013 11:04:00 +0200 (CEST) From: "Pascal Braun, Continum" To: freebsd-stable@freebsd.org Message-ID: 
<803931797.154623.1369818240686.JavaMail.root@continum.net> In-Reply-To: <631084870.154593.1369817992337.JavaMail.root@continum.net> Subject: ZFS crashing while zfs recv in progress MIME-Version: 1.0 X-Originating-IP: [80.72.130.250] X-Mailer: Zimbra 7.2.0_GA_2669 (ZimbraWebClient - GC23 (Linux)/7.2.0_GA_2669) Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 09:32:46 -0000 Hi, I'm trying to send a zfs pool from an old freebsd 9.0 installation to a new machine with freebsd 9.1. The pool is quite heavy (about 16TB, lots of snapshots) and the receiving side keeps crashing on me. The command used to transfer (run on the old 9.0 installation): zfs send -R tank@snapshot | ssh 10.10.xx.xx zfs recv -F -d -v tank After a few hours the system stops all writing and I can't start any new processes. Processes still running like 'zpool iostat' are still working, or at least it is still reporting something. To me it looks like the filesystem just disappeared. Unfortunately I'm running root on zfs so I don't have any logs about this. The only message I sometimes find on the console are about not being able to write to swap, which is also on zfs. Do you have any ideas? I don't even know where to start. 
regards, Pascal From owner-freebsd-stable@FreeBSD.ORG Wed May 29 09:37:24 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 95EC1E1D for ; Wed, 29 May 2013 09:37:24 +0000 (UTC) (envelope-from prvs=1861a99eb2=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 3A826946 for ; Wed, 29 May 2013 09:37:23 +0000 (UTC) Received: from r2d2 ([82.69.141.170]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50004060836.msg for ; Wed, 29 May 2013 10:37:16 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 29 May 2013 10:37:16 +0100 (not processed: message from valid local sender) X-MDDKIM-Result: neutral (mail1.multiplay.co.uk) X-MDRemoteIP: 82.69.141.170 X-Return-Path: prvs=1861a99eb2=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk X-MDaemon-Deliver-To: freebsd-stable@freebsd.org Message-ID: From: "Steven Hartland" To: "Pascal Braun, Continum" , References: <803931797.154623.1369818240686.JavaMail.root@continum.net> Subject: Re: ZFS crashing while zfs recv in progress Date: Wed, 29 May 2013 10:37:07 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 09:37:24 -0000 Silly question, but I assume your new 9.1 system isn't running from tank at the time? 
----- Original Message ----- From: "Pascal Braun, Continum" > > I'm trying to send a zfs pool from an old freebsd 9.0 installation to a new machine with freebsd 9.1. The pool is quite heavy > (about 16TB, lots of snapshots) and the receiving side keeps crashing on me. The command used to transfer (run on the old 9.0 > installation): > zfs send -R tank@snapshot | ssh 10.10.xx.xx zfs recv -F -d -v tank > > > After a few hours the system stops all writing and I can't start any new processes. Processes still running like 'zpool iostat' > are still working, or at least it is still reporting something. To me it looks like the filesystem just disappeared. > Unfortunately I'm running root on zfs so I don't have any logs about this. > The only message I sometimes find on the console are about not being able to write to swap, which is also on zfs. > > > Do you have any ideas? I don't even know where to start. > > > regards, Pascal > > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > ================================================ This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk. 
From owner-freebsd-stable@FreeBSD.ORG Wed May 29 09:59:32 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4B64F1D8 for ; Wed, 29 May 2013 09:59:32 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smarthost1.greenhost.nl (smarthost1.greenhost.nl [195.190.28.78]) by mx1.freebsd.org (Postfix) with ESMTP id 0FDF7A2D for ; Wed, 29 May 2013 09:59:31 +0000 (UTC) Received: from smtp.greenhost.nl ([213.108.104.138]) by smarthost1.greenhost.nl with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.69) (envelope-from ) id 1UhdAH-0003r3-D3; Wed, 29 May 2013 11:59:29 +0200 Received: from [81.21.138.17] (helo=ronaldradial.versatec.local) by smtp.greenhost.nl with esmtpsa (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.72) (envelope-from ) id 1UhdAE-0008P8-OL; Wed, 29 May 2013 11:59:26 +0200 Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-stable@freebsd.org, "Pascal Braun, Continum" Subject: Re: ZFS crashing while zfs recv in progress References: <803931797.154623.1369818240686.JavaMail.root@continum.net> Date: Wed, 29 May 2013 11:59:26 +0200 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: "Ronald Klop" Message-ID: In-Reply-To: <803931797.154623.1369818240686.JavaMail.root@continum.net> User-Agent: Opera Mail/12.15 (Win32) X-Virus-Scanned: by clamav at smarthost1.samage.net X-Spam-Level: - X-Spam-Score: -1.9 X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00 autolearn=disabled version=3.3.1 X-Scan-Signature: 5a1627636b35b65657045ef62631cd80 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 09:59:32 -0000 On Wed, 29 May 2013 11:04:00 +0200, Pascal Braun, Continum wrote: > > Hi, > > > I'm 
trying to send a zfs pool from an old freebsd 9.0 installation to a > new machine with freebsd 9.1. The pool is quite heavy (about 16TB, lots > of snapshots) and the receiving side keeps crashing on me. The command > used to transfer (run on the old 9.0 installation): > zfs send -R tank@snapshot | ssh 10.10.xx.xx zfs recv -F -d -v tank > > > After a few hours the system stops all writing and I can't start any new > processes. Processes still running like 'zpool iostat' are still > working, or at least it is still reporting something. To me it looks > like the filesystem just disappeared. Unfortunately I'm running root on > zfs so I don't have any logs about this. > The only message I sometimes find on the console are about not being > able to write to swap, which is also on zfs. > > > Do you have any ideas? I don't even know where to start. > Please send more information about the new server. Sometimes there are bugs found in drivers with large disks, etc. Or firmware of hardware. The contents of /var/run/dmesg.boot is interesting to a lot of people. As is the output of zpool status. As you are having trouble with swap on zfs. Is it possible to put that on a separate disk for the test? Ronald. 
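A minimal sketch of the separate-disk swap suggested above, assuming the new server has a spare disk with a partition set aside for swap. The device name /dev/ada1p1 is hypothetical; substitute whatever the spare disk is called on that machine. Any existing ZFS-backed swap would also have to be disabled for the test, and how to do that depends on how it was set up.

```conf
# /etc/fstab on the receiving machine (sketch; /dev/ada1p1 is a hypothetical device)
# Device	Mountpoint	FStype	Options	Dump	Pass#
/dev/ada1p1	none		swap	sw	0	0
```

If the hang no longer reports swap write failures with this in place, that at least narrows the problem down to the pool itself rather than swap living on it.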
From owner-freebsd-stable@FreeBSD.ORG Wed May 29 13:02:03 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 866EB895 for ; Wed, 29 May 2013 13:02:03 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id 0FBE8AEF for ; Wed, 29 May 2013 13:02:02 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1Uhg0p-000N8o-7v; Wed, 29 May 2013 16:01:55 +0300 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: pyunyh@gmail.com Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP In-reply-to: <20130529085544.GC3042@michelle.cdnetworks.com> References: <20130528052953.GA1457@michelle.cdnetworks.com> <20130528064850.GB1457@michelle.cdnetworks.com> <20130529085544.GC3042@michelle.cdnetworks.com> Comments: In-reply-to YongHyeon PYUN message dated "Wed, 29 May 2013 17:55:44 +0900." 
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Wed, 29 May 2013 16:01:55 +0300 From: Daniel Braniss Message-ID: Cc: freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 13:02:03 -0000 > > --/04w6evG8XlLl3ft > Content-Type: text/plain; charset=us-ascii > Content-Disposition: inline > > On Tue, May 28, 2013 at 09:55:24AM +0300, Daniel Braniss wrote: > > > On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote: > > > > > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote: > > > > > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: > > > > > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, > > > > > > > > > > > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. 
> > > > > > > > > > > > > > > > > > > bge0: mem > > > > > > 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6 > > > > > > bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > > > > miibus2: on bge0 > > > > > > brgphy0: PHY 1 on miibus2 > > > > > > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > > > > bge0: Ethernet address: 00:1b:24:5d:5b:bd > > > > > > bge1: mem > > > > > > 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6 > > > > > > bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > > > > miibus3: on bge1 > > > > > > brgphy1: PHY 1 on miibus3 > > > > > > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > > > > bge1: Ethernet address: 00:1b:24:5d:5b:be > > > > > > > > > > > > sf-10> ifconfig bge1 > > > > > > bge1: flags=8802 metric 0 mtu 1500 > > > > > > options=8009b > > > > > TE> > > > > > > ether 00:1b:24:5d:5b:be > > > > > > nd6 options=21 > > > > > > media: Ethernet autoselect (100baseTX ) > > > > > > status: active > > > > > > > > > > > > > > > > Because bge1 is not UP, I wonder how you get link UP/DOWN events. > > > > > Do you have some network script run by cron? > > > > > > > > no scripts. > > > > this port is shared with the ILO/IPMI, and back in March you fixed a problem > > > > that it was hanging soon after it was initialized by the driver, > > > > (r248226 - but I'm not sure if it was ever MFC'ed). > > > > > > It was MFCed. > > > > > > > Initialy I thought it could be caused by connections to it from other > > > > hosts (either via the web, or ssh) so I killed them, but it didn't help. > > > > without that patch the connection fails, and I don't see any DOWN/UP. > > > > > > Could you check how many number of interrupts you get from bge1? 
> > > Ideally you shouldn't get any interrupts for bge1. > > > > it's not even mentioned :-) > > sf-04> vmstat -i > > interrupt total rate > > irq3: uart1 964 0 > > irq4: uart0 6 0 > > irq14: ata0 227354 0 > > irq17: bge0 1021981 2 > > irq21: ohci0 28 0 > > irq22: ehci0 2 0 > > irq23: atapci1 293228 0 > > cpu0:timer 383244076 1124 > > cpu1:timer 2225144 6 > > cpu2:timer 2056087 6 > > cpu3:timer 2093943 6 > > Total 391162813 1147 > > > > Then the only way link UP/DOWN event could be generated for DOWN > interface would be invocation of media status query > (i.e. ifconfig -a) triggered by an external application. Most > drivers I touched check IFF_UP flag before poking media status > register. However I'm not sure you're seeing this issue because you > do not use any network script run by cron. > Anyway, try attached patch and let me know whether it makes any > difference. > > > > > > > > > > > > > > > > > > > > > is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. > > > > > > > > To check, I upgraded another identical host, and the same problem appears. > > > > > > > > > > > > > > What is the last known working revision? > > > > > > > > > > > > I have no idea, but I have older versions, and ill start from the oldets > > > > > > (9.1-prerelease), but > > > > > > it will take time, since it takes hours till it happens. > > > > > > > > > > > > > > > > ok. 
> > > > > > > > > > > > > > --/04w6evG8XlLl3ft > Content-Type: text/x-diff; charset=us-ascii > Content-Disposition: attachment; filename="bge.media_sts.diff" > > Index: sys/dev/bge/if_bge.c > =================================================================== > --- sys/dev/bge/if_bge.c (revision 251021) > +++ sys/dev/bge/if_bge.c (working copy) > @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar > > BGE_LOCK(sc); > > + if ((ifp->if_flags & IFF_UP) == 0) { > + BGE_UNLOCK(sc); > + return; > + } > if (sc->bge_flags & BGE_FLAG_TBI) { > ifmr->ifm_status = IFM_AVALID; > ifmr->ifm_active = IFM_ETHER; > > --/04w6evG8XlLl3ft-- done, will let you know in 24hs. thanks, danny From owner-freebsd-stable@FreeBSD.ORG Wed May 29 13:05:15 2013 Return-Path: Delivered-To: freebsd-stable@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 65F8BA59 for ; Wed, 29 May 2013 13:05:15 +0000 (UTC) (envelope-from olli@grabthar.secnetix.de) Received: from grabthar.secnetix.de (grabthar.secnetix.de [212.17.241.225]) by mx1.freebsd.org (Postfix) with ESMTP id E9B0FB5A for ; Wed, 29 May 2013 13:05:14 +0000 (UTC) Received: from grabthar.secnetix.de (localhost [127.0.0.1]) by grabthar.secnetix.de (8.14.5/8.14.5) with ESMTP id r4TD5DKK037955; Wed, 29 May 2013 15:05:13 +0200 (CEST) (envelope-from oliver.fromme@secnetix.de) Received: (from olli@localhost) by grabthar.secnetix.de (8.14.5/8.14.5/Submit) id r4TD5DAP037954; Wed, 29 May 2013 15:05:13 +0200 (CEST) (envelope-from olli) Date: Wed, 29 May 2013 15:05:13 +0200 (CEST) Message-Id: <201305291305.r4TD5DAP037954@grabthar.secnetix.de> From: Oliver Fromme To: freebsd-stable@FreeBSD.ORG Subject: Re: 9.1-stable: ATI IXP600 AHCI: CAM timeout In-Reply-To: <201305290809.r4T89EvT024069@grabthar.secnetix.de> X-Newsgroups: list.freebsd-stable User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (FreeBSD/9.1-PRERELEASE-20120811 (i386)) MIME-Version: 1.0 Content-Type: 
text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 13:05:15 -0000 Now I have some more information ... The problem disappears when I disable NCQ, i.e. set the number of tags to 1 with camcontrol. Using binary search I found out that the problem also disappears with 2 tags, but with 3 tags I get the same amount of errors as with the default of 32 tags. Interestingly, the problem also disappears when I reduce the SATA level from II to I (i.e. from 3 to 1.5 Gbit/s), even if the NCQ tags are left at the default of 32. Now the question is: Is it better to reduce the NCQ tags from 32 to 2, or to reduce the SATA bandwidth from 3 Gbps to 1.5 Gbps? What is more likely to impact performance on a mixed server with shell users, apache, sendmail, DNS and a few other things? Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing Handelsregister: Amtsgericht Muenchen, HRA 74606, Geschäftsfuehrung: secnetix Verwaltungsgesellsch. mbH, Handelsreg.: Amtsgericht München, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart FreeBSD-Dienstleistungen/-Produkte + mehr: http://www.secnetix.de/bsd In my experience the term "transparent proxy" is an oxymoron (like jumbo shrimp). "Transparent" proxies seem to vary from the distortions of a funhouse mirror to barely translucent. I really, really dislike them when trying to figure out the corrective lenses needed with each of them. -- R.
Kevin Oberman, Network Engineer From owner-freebsd-stable@FreeBSD.ORG Wed May 29 13:11:51 2013 Return-Path: Delivered-To: freebsd-stable@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 58C06DBB for ; Wed, 29 May 2013 13:11:51 +0000 (UTC) (envelope-from prvs=1861a99eb2=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id F2086CF4 for ; Wed, 29 May 2013 13:11:50 +0000 (UTC) Received: from r2d2 ([82.69.141.170]) by mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) (MDaemon PRO v10.0.4) with ESMTP id md50004063752.msg for ; Wed, 29 May 2013 14:11:49 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Wed, 29 May 2013 14:11:49 +0100 (not processed: message from valid local sender) X-MDDKIM-Result: neutral (mail1.multiplay.co.uk) X-MDRemoteIP: 82.69.141.170 X-Return-Path: prvs=1861a99eb2=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk X-MDaemon-Deliver-To: freebsd-stable@FreeBSD.ORG Message-ID: From: "Steven Hartland" To: "Oliver Fromme" , References: <201305291305.r4TD5DAP037954@grabthar.secnetix.de> Subject: Re: 9.1-stable: ATI IXP600 AHCI: CAM timeout Date: Wed, 29 May 2013 14:11:39 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 8bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.6157 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 13:11:51 -0000 Have you checked your sata cables and psu outputs? Both of these could be the underlying cause of poor signalling. 
From owner-freebsd-stable@FreeBSD.ORG Wed May 29 13:30:05 2013 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 802FC4A5; Wed, 29 May 2013 13:30:05 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost1.sentex.ca (smarthost1-6.sentex.ca [IPv6:2607:f3e0:0:1::12]) by mx1.freebsd.org (Postfix) with ESMTP id 4B887E1A; Wed, 29 May 2013 13:30:05 +0000 (UTC) Received: from [192.168.43.26] (pyroxene.sentex.ca [199.212.134.18]) by smarthost1.sentex.ca (8.14.5/8.14.5) with ESMTP id r4TDU4dn091188; Wed, 29 May 2013 09:30:04 -0400 (EDT) (envelope-from mike@sentex.net) Message-ID: <51A60301.3030506@sentex.net> Date: Wed, 29 May 2013 09:30:41 -0400 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20120428 Thunderbird/12.0.1 MIME-Version: 1.0 To: =?ISO-8859-1?Q?Dag-Erling_Sm=F8rgrav?= , stable@freebsd.org Subject: Re: svn commit: r247485 - in stable/9: crypto/openssh crypto/openssh/openbsd-compat secure/lib/libssh secure/usr.sbin/sshd References: <201302281843.r1SIhoaq004371@svn.freebsd.org> <5130D8E0.3020605@sentex.net> In-Reply-To: <5130D8E0.3020605@sentex.net> X-Enigmail-Version: 1.4.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.72 on 64.7.153.18 Cc: svn-src-stable-9@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May
2013 13:30:05 -0000 On 3/1/2013 11:35 AM, Mike Tancsa wrote: > On 2/28/2013 1:43 PM, Dag-Erling Smørgrav wrote: >> Author: des >> Date: Thu Feb 28 18:43:50 2013 >> New Revision: 247485 >> URL: http://svnweb.freebsd.org/changeset/base/247485 >> >> Log: >> Pull in OpenSSH 6.1 from head. > > Hi, > I updated a box to RELENG_9 with this change, and I can no longer ssh > from my secure crt client. I have a stock sshd_config, but when I connect, > > cipher_init: EVP_CipherInit: set key failed for aes128-cbc [preauth] For the archives, this is fixed in http://lists.freebsd.org/pipermail/svn-src-head/2013-May/047921.html It can be worked around by setting UsePrivilegeSeparation yes in /etc/ssh/sshd_config The default behind the scenes was changed to UsePrivilegeSeparation sandbox See the above thread for more information. ---Mike > > Starting the daemon in debug mode below. If I change the first cipher > offered to blowfish, it works. > > /usr/sbin/sshd -ddd > debug2: load_server_config: filename /etc/ssh/sshd_config > debug2: load_server_config: done config len = 219 > debug2: parse_server_config: config /etc/ssh/sshd_config len 219 > debug3: /etc/ssh/sshd_config:54 setting AuthorizedKeysFile > .ssh/authorized_keys > debug3: /etc/ssh/sshd_config:122 setting Subsystem sftp > /usr/libexec/sftp-server > debug1: HPN Buffer Size: 131072 > debug1: sshd version OpenSSH_6.1p1_hpn13v11 FreeBSD-20120901 > debug3: Incorrect RSA1 identifier > debug1: read PEM private key done: type RSA > debug1: private host key: #0 type 1 RSA > debug3: Incorrect RSA1 identifier > debug1: read PEM private key done: type DSA > debug1: private host key: #1 type 2 DSA > debug3: Incorrect RSA1 identifier > debug1: read PEM private key done: type ECDSA > debug1: private host key: #2 type 3 ECDSA > debug1: rexec_argv[0]='/usr/sbin/sshd' > debug1: rexec_argv[1]='-ddd' > debug2: fd 4 setting O_NONBLOCK > debug3: ssh_sock_set_v6only: set socket 4 IPV6_V6ONLY > debug1: Bind to port 22 on ::. 
> debug1: Server TCP RWIN socket size: 131072 > debug1: HPN Buffer Size: 131072 > Server listening on :: port 22. > debug2: fd 5 setting O_NONBLOCK > debug1: Bind to port 22 on 0.0.0.0. > debug1: Server TCP RWIN socket size: 131072 > debug1: HPN Buffer Size: 131072 > Server listening on 0.0.0.0 port 22. > debug1: fd 6 clearing O_NONBLOCK > debug1: Server will not fork when running in debugging mode. > debug3: send_rexec_state: entering fd = 9 config len 219 > debug3: ssh_msg_send: type 0 > debug3: send_rexec_state: done > debug1: rexec start in 6 out 6 newsock 6 pipe -1 sock 9 > debug1: inetd sockets after dupping: 4, 4 > debug1: res_init() > Connection from 2607:f3e0:0:4:f025:8813:7603:7e4a port 52567 > debug1: HPN Disabled: 0, HPN Buffer Size: 131072 > debug1: Client protocol version 2.0; client software version > SecureCRT_6.6.1 (x64 build 289) SecureCRT > debug1: no match: SecureCRT_6.6.1 (x64 build 289) SecureCRT > debug1: Enabling compatibility mode for protocol 2.0 > debug1: Local version string SSH-2.0-OpenSSH_6.1_hpn13v11 FreeBSD-20120901 > debug2: fd 4 setting O_NONBLOCK > debug3: ssh_sandbox_init: preparing rlimit sandbox > debug2: Network child is on pid 2667 > debug3: preauth child monitor started > debug3: privsep user:group 22:22 [preauth] > debug1: permanently_set_uid: 22/22 [preauth] > debug1: list_hostkey_types: ssh-rsa,ssh-dss,ecdsa-sha2-nistp256 [preauth] > debug1: SSH2_MSG_KEXINIT sent [preauth] > debug1: SSH2_MSG_KEXINIT received [preauth] > debug2: kex_parse_kexinit: > ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1 > [preauth] > debug2: kex_parse_kexinit: ssh-rsa,ssh-dss,ecdsa-sha2-nistp256 [preauth] > debug2: kex_parse_kexinit: > aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se > [preauth] > debug2: 
kex_parse_kexinit: > aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se > [preauth] > debug2: kex_parse_kexinit: > hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96 > [preauth] > debug2: kex_parse_kexinit: > hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96 > [preauth] > debug2: kex_parse_kexinit: none,zlib@openssh.com [preauth] > debug2: kex_parse_kexinit: none,zlib@openssh.com [preauth] > debug2: kex_parse_kexinit: [preauth] > debug2: kex_parse_kexinit: [preauth] > debug2: kex_parse_kexinit: first_kex_follows 0 [preauth] > debug2: kex_parse_kexinit: reserved 0 [preauth] > debug2: kex_parse_kexinit: > diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1 > [preauth] > debug2: kex_parse_kexinit: > ssh-dss,ssh-rsa,x509v3-sign-rsa,x509v3-sign-dss [preauth] > debug2: kex_parse_kexinit: > aes128-cbc,aes256-cbc,aes256-ctr,aes192-ctr,aes128-ctr,aes192-cbc,twofish-cbc,blowfish-cbc,3des-cbc,arcfour > [preauth] > debug2: kex_parse_kexinit: > aes128-cbc,aes256-cbc,aes256-ctr,aes192-ctr,aes128-ctr,aes192-cbc,twofish-cbc,blowfish-cbc,3des-cbc,arcfour > [preauth] > debug2: kex_parse_kexinit: > hmac-sha1,hmac-sha1-96,hmac-md5,hmac-md5-96,umac-64@openssh.com [preauth] > debug2: kex_parse_kexinit: > hmac-sha1,hmac-sha1-96,hmac-md5,hmac-md5-96,umac-64@openssh.com [preauth] > debug2: kex_parse_kexinit: none [preauth] > debug2: kex_parse_kexinit: none [preauth] > debug2: kex_parse_kexinit: [preauth] > debug2: kex_parse_kexinit: [preauth] > debug2: kex_parse_kexinit: first_kex_follows 0 [preauth] > debug2: kex_parse_kexinit: reserved 0 [preauth] > debug2: mac_setup: found hmac-sha1 [preauth] > debug1: kex: client->server aes128-cbc hmac-sha1 none [preauth] > debug2: mac_setup: 
found hmac-sha1 [preauth] > debug1: kex: server->client aes128-cbc hmac-sha1 none [preauth] > debug1: SSH2_MSG_KEX_DH_GEX_REQUEST received [preauth] > debug3: mm_request_send entering: type 0 [preauth] > debug3: mm_choose_dh: waiting for MONITOR_ANS_MODULI [preauth] > debug3: mm_request_receive_expect entering: type 1 [preauth] > debug3: mm_request_receive entering [preauth] > debug3: mm_request_receive entering > debug3: monitor_read: checking request 0 > debug3: mm_answer_moduli: got parameters: 1024 2048 2048 > debug3: mm_request_send entering: type 1 > debug2: monitor_read: 0 used once, disabling now > debug3: mm_choose_dh: remaining 0 [preauth] > debug1: SSH2_MSG_KEX_DH_GEX_GROUP sent [preauth] > debug2: dh_gen_key: priv key bits set: 181/320 [preauth] > debug2: bits set: 1008/2048 [preauth] > debug1: expecting SSH2_MSG_KEX_DH_GEX_INIT [preauth] > debug2: bits set: 1009/2048 [preauth] > debug3: mm_key_sign entering [preauth] > debug3: mm_request_send entering: type 4 [preauth] > debug3: mm_key_sign: waiting for MONITOR_ANS_SIGN [preauth] > debug3: mm_request_receive_expect entering: type 5 [preauth] > debug3: mm_request_receive entering [preauth] > debug3: mm_request_receive entering > debug3: monitor_read: checking request 4 > debug3: mm_answer_sign > debug3: mm_answer_sign: signature 0x803017180(55) > debug3: mm_request_send entering: type 5 > debug2: monitor_read: 4 used once, disabling now > debug1: SSH2_MSG_KEX_DH_GEX_REPLY sent [preauth] > debug2: kex_derive_keys [preauth] > debug2: set_newkeys: mode 1 [preauth] > cipher_init: EVP_CipherInit: set key failed for aes128-cbc [preauth] > debug1: do_cleanup [preauth] > debug3: PAM: sshpam_thread_cleanup entering [preauth] > debug3: mm_request_receive entering > debug1: do_cleanup > debug3: PAM: sshpam_thread_cleanup entering > debug1: Killing privsep child 2667 > > -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 
www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-stable@FreeBSD.ORG Wed May 29 14:21:37 2013 Return-Path: Delivered-To: freebsd-stable@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2D931F2A for ; Wed, 29 May 2013 14:21:37 +0000 (UTC) (envelope-from olli@grabthar.secnetix.de) Received: from grabthar.secnetix.de (grabthar.secnetix.de [212.17.241.225]) by mx1.freebsd.org (Postfix) with ESMTP id B3A9E794 for ; Wed, 29 May 2013 14:21:36 +0000 (UTC) Received: from grabthar.secnetix.de (localhost [127.0.0.1]) by grabthar.secnetix.de (8.14.5/8.14.5) with ESMTP id r4TELYVL042537; Wed, 29 May 2013 16:21:34 +0200 (CEST) (envelope-from oliver.fromme@secnetix.de) Received: (from olli@localhost) by grabthar.secnetix.de (8.14.5/8.14.5/Submit) id r4TELY8p042536; Wed, 29 May 2013 16:21:34 +0200 (CEST) (envelope-from olli) Date: Wed, 29 May 2013 16:21:34 +0200 (CEST) Message-Id: <201305291421.r4TELY8p042536@grabthar.secnetix.de> From: Oliver Fromme To: freebsd-stable@FreeBSD.ORG, killing@multiplay.co.uk Subject: Re: 9.1-stable: ATI IXP600 AHCI: CAM timeout In-Reply-To: X-Newsgroups: list.freebsd-stable User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (FreeBSD/9.1-PRERELEASE-20120811 (i386)) MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 14:21:37 -0000 Steven Hartland wrote: > Have you checked your sata cables and psu outputs? > > Both of these could be the underlying cause of poor signalling. I can't easily check that because it is a cheap rented server in a remote location. 
But I don't believe it is bad cabling or PSU anyway, or otherwise the problem would occur intermittently all the time if the load on the disks is sufficiently high. But it only occurs at tags=3 and above. At tags=2 it does not occur at all, no matter how hard I hammer on the disks. At the moment I'm inclined to believe that it is either a bug in the HDD firmware or in the controller. The disks aren't exactly new, they're 400 GB Samsung ones that are several years old. I think it's not uncommon to have bugs in the NCQ implementation in such disks. The only thing that puzzles me is the fact that the problem also disappears completely when I reduce the SATA rev from II to I, even at tags=32. Best regards Oliver -- Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing Handelsregister: Amtsgericht Muenchen, HRA 74606, Geschäftsfuehrung: secnetix Verwaltungsgesellsch. mbH, Handelsreg.: Amtsgericht München, HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart FreeBSD-Dienstleistungen/-Produkte + mehr: http://www.secnetix.de/bsd "People still program in C. People keep writing shell scripts. *Most* people don't realize the shortcomings of the tools they are using because they a) don't reflect on their workflows and they are b) too lazy to check out alternatives to realize there is help." 
-- Simon 'corecode' Schubert From owner-freebsd-stable@FreeBSD.ORG Wed May 29 14:26:25 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 73499235; Wed, 29 May 2013 14:26:25 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost1.sentex.ca (smarthost1-6.sentex.ca [IPv6:2607:f3e0:0:1::12]) by mx1.freebsd.org (Postfix) with ESMTP id 320637F1; Wed, 29 May 2013 14:26:25 +0000 (UTC) Received: from [192.168.43.26] (pyroxene.sentex.ca [199.212.134.18]) by smarthost1.sentex.ca (8.14.5/8.14.5) with ESMTP id r4TEQOE1001319; Wed, 29 May 2013 10:26:25 -0400 (EDT) (envelope-from mike@sentex.net) Message-ID: <51A61035.9050900@sentex.net> Date: Wed, 29 May 2013 10:27:01 -0400 From: Mike Tancsa Organization: Sentex Communications User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0) Gecko/20120428 Thunderbird/12.0.1 MIME-Version: 1.0 To: Scott Long Subject: vfs.read_min (was Re: svn commit: r250906 - stable/9/sys/kern) References: <201305220844.r4M8iLWJ005148@svn.freebsd.org> In-Reply-To: <201305220844.r4M8iLWJ005148@svn.freebsd.org> X-Enigmail-Version: 1.4.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Scanned-By: MIMEDefang 2.72 on 64.7.153.18 Cc: FreeBSD-STABLE Mailing List X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 14:26:25 -0000 Hi Scott, This seems to significantly help for reading through large files (argus / netflow files in my case) on my zfs server. Doing some quick tests and setting the size to 2 makes a difference from the default. In your case, did you find some optimal settings ? Are there some tradeoffs / caveats to setting this value to a non default value ? 
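For readers trying the vfs.read_min knob asked about here, a minimal C sketch of the size adjustment that the r250906 change quoted in this message makes in cluster_read() may help. It is not the kernel code itself: the function name adjust_totread is hypothetical, and the sketch assumes only the three values visible in the quoted diff (read_min, size and totread).

```c
/*
 * Stand-alone model (hypothetical helper, not part of the kernel) of the
 * "Adjust totread if needed" step from the quoted diff.
 *
 * read_min: value of the vfs.read_min sysctl, in blocks (default 1)
 * size:     block size in bytes, as passed to cluster_read()
 * totread:  number of bytes the caller asked for
 *
 * Returns the byte count the cluster code would go on to treat as the
 * requested read size.
 */
long
adjust_totread(int read_min, long size, long totread)
{
	long minread = (long)read_min * size;

	/* Only ever grow the request; larger reads are left alone. */
	if (minread > totread)
		totread = minread;
	return (totread);
}
```

Assuming a 16384-byte block and a 4096-byte request, read_min=2 raises the request to 32768 bytes, while a 65536-byte request is left unchanged. Since the sysctl is declared CTLFLAG_RW in the diff, it can be changed at run time with sysctl vfs.read_min=2.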
---Mike On 5/22/2013 4:44 AM, Scott Long wrote: > Author: scottl > Date: Wed May 22 08:44:21 2013 > New Revision: 250906 > URL: http://svnweb.freebsd.org/changeset/base/250906 > > Log: > MFC r250327 > > Add a sysctl vfs.read_min to complement the existing vfs.read_max. It > defaults to 1, meaning that it's off. > > When read-ahead is enabled on a file, the vfs cluster code deliberately > breaks a read into 2 I/O transactions; one to satisfy the actual read, > and one to perform read-ahead. This makes sense in low-latency > circumstances, but often produces unbalanced i/o transactions that > penalize disks. By setting vfs.read_min, we can tell the algorithm to > fetch a larger transaction than what we asked for, achieving the same > effect as the read-ahead but without the doubled, unbalanced transaction > and the slightly lower latency. This significantly helps our workloads > with video streaming. > > Submitted by: emax > Reviewed by: kib > Obtained from: Netflix > > Modified: > stable/9/sys/kern/vfs_cluster.c > Directory Properties: > stable/9/sys/ (props changed) > > Modified: stable/9/sys/kern/vfs_cluster.c > ============================================================================== > --- stable/9/sys/kern/vfs_cluster.c Wed May 22 07:52:41 2013 (r250905) > +++ stable/9/sys/kern/vfs_cluster.c Wed May 22 08:44:21 2013 (r250906) > @@ -75,6 +75,10 @@ static int read_max = 64; > SYSCTL_INT(_vfs, OID_AUTO, read_max, CTLFLAG_RW, &read_max, 0, > "Cluster read-ahead max block count"); > > +static int read_min = 1; > +SYSCTL_INT(_vfs, OID_AUTO, read_min, CTLFLAG_RW, &read_min, 0, > + "Cluster read min block count"); > + > /* Page expended to mark partially backed buffers */ > extern vm_page_t bogus_page; > > @@ -169,6 +173,7 @@ cluster_read(vp, filesize, lblkno, size, > } else { > off_t firstread = bp->b_offset; > int nblks; > + long minread; > > KASSERT(bp->b_offset != NOOFFSET, > ("cluster_read: no buffer offset")); > @@ -176,6 +181,13 @@ cluster_read(vp, filesize,
lblkno, size, > ncontig = 0; > > /* > + * Adjust totread if needed > + */ > + minread = read_min * size; > + if (minread > totread) > + totread = minread; > + > + /* > * Compute the total number of blocks that we should read > * synchronously. > */ > _______________________________________________ > svn-src-stable-9@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/svn-src-stable-9 > To unsubscribe, send any mail to "svn-src-stable-9-unsubscribe@freebsd.org" > > -- ------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/ From owner-freebsd-stable@FreeBSD.ORG Wed May 29 15:07:35 2013 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 15BC7469; Wed, 29 May 2013 15:07:35 +0000 (UTC) (envelope-from des@des.no) Received: from smtp.des.no (smtp.des.no [194.63.250.102]) by mx1.freebsd.org (Postfix) with ESMTP id CE912B37; Wed, 29 May 2013 15:07:34 +0000 (UTC) Received: from nine.des.no (smtp.des.no [194.63.250.102]) by smtp-int.des.no (Postfix) with ESMTP id 3242E7324; Wed, 29 May 2013 15:07:34 +0000 (UTC) Received: by nine.des.no (Postfix, from userid 1001) id BAE234AA42; Wed, 29 May 2013 17:07:35 +0200 (CEST) From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= To: Mike Tancsa Subject: Re: svn commit: r247485 - in stable/9: crypto/openssh crypto/openssh/openbsd-compat secure/lib/libssh secure/usr.sbin/sshd References: <201302281843.r1SIhoaq004371@svn.freebsd.org> <5130D8E0.3020605@sentex.net> <51A60301.3030506@sentex.net> Date: Wed, 29 May 2013 17:07:35 +0200 In-Reply-To: <51A60301.3030506@sentex.net> (Mike Tancsa's message of "Wed, 29 May 2013 09:30:41 -0400") Message-ID: <8638t54yfc.fsf@nine.des.no> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; 
charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: stable@freebsd.org, svn-src-stable-9@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 15:07:35 -0000 Mike Tancsa writes: > For the archives, this is fixed in > http://lists.freebsd.org/pipermail/svn-src-head/2013-May/047921.html Fixed in head, but not stable/9 yet. I'll pull 6.2p2 from head to stable/9 later this week. DES -- Dag-Erling Smørgrav - des@des.no From owner-freebsd-stable@FreeBSD.ORG Wed May 29 15:16:21 2013 Return-Path: Delivered-To: freebsd-stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 312EDA6A for ; Wed, 29 May 2013 15:16:21 +0000 (UTC) (envelope-from ian@FreeBSD.org) Received: from mho-01-ewr.mailhop.org (mho-03-ewr.mailhop.org [204.13.248.66]) by mx1.freebsd.org (Postfix) with ESMTP id 0C89ADAB for ; Wed, 29 May 2013 15:16:20 +0000 (UTC) Received: from c-24-8-230-52.hsd1.co.comcast.net ([24.8.230.52] helo=damnhippie.dyndns.org) by mho-01-ewr.mailhop.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.72) (envelope-from ) id 1Uhi6u-000Gk7-65; Wed, 29 May 2013 15:16:20 +0000 Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240]) by damnhippie.dyndns.org (8.14.3/8.14.3) with ESMTP id r4TFGHDA007633; Wed, 29 May 2013 09:16:17 -0600 (MDT) (envelope-from ian@FreeBSD.org) X-Mail-Handler: Dyn Standard SMTP by Dyn X-Originating-IP: 24.8.230.52 X-Report-Abuse-To: abuse@dyndns.com (see http://www.dyndns.com/services/sendlabs/outbound_abuse.html for abuse reporting information) X-MHO-User: U2FsdGVkX18NkzI6uh0bYWEts4wr9GuL Subject: Re: 9.1-stable: ATI IXP600 AHCI: CAM timeout From: Ian Lepore To: Oliver Fromme In-Reply-To: <201305291421.r4TELY8p042536@grabthar.secnetix.de>
References: <201305291421.r4TELY8p042536@grabthar.secnetix.de> Content-Type: text/plain; charset="us-ascii" Date: Wed, 29 May 2013 09:16:17 -0600 Message-ID: <1369840577.1258.45.camel@revolution.hippie.lan> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit Cc: killing@multiplay.co.uk, freebsd-stable@FreeBSD.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 15:16:21 -0000 On Wed, 2013-05-29 at 16:21 +0200, Oliver Fromme wrote: > Steven Hartland wrote: > > Have you checked your sata cables and psu outputs? > > > > Both of these could be the underlying cause of poor signalling. > > I can't easily check that because it is a cheap rented > server in a remote location. > > But I don't believe it is bad cabling or PSU anyway, or > otherwise the problem would occur intermittently all the > time if the load on the disks is sufficiently high. > But it only occurs at tags=3 and above. At tags=2 it does > not occur at all, no matter how hard I hammer on the disks. > > At the moment I'm inclined to believe that it is either > a bug in the HDD firmware or in the controller. The disks > aren't exactly new, they're 400 GB Samsung ones that are > several years old. I think it's not uncommon to have bugs > in the NCQ implementation in such disks. > > The only thing that puzzles me is the fact that the problem > also disappears completely when I reduce the SATA rev from > II to I, even at tags=32. > It seems to me that you dismiss signaling problems too quickly. Consider the possibilities... A bad cable leads to intermittent errors at higher speeds. When NCQ is disabled or limited the software handles these errors pretty much transparently.
When NCQ is not limitted and there are many outstanding requests, suddenly the error handling in the software breaks down somehow and a minor recoverable problem becomes an in-your-face error. I'm not saying any of the foregoing is true, just that you should consider the possibility that you're dealing with multiple problems which are only loosely coupled, but together can seem like a single more serious problem. You don't know enough yet to casually dismiss anything. -- Ian From owner-freebsd-stable@FreeBSD.ORG Wed May 29 18:44:13 2013 Return-Path: Delivered-To: freebsd-stable@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id DBEBC430; Wed, 29 May 2013 18:44:13 +0000 (UTC) (envelope-from olli@grabthar.secnetix.de) Received: from grabthar.secnetix.de (grabthar.secnetix.de [212.17.241.225]) by mx1.freebsd.org (Postfix) with ESMTP id 5BCA4FF8; Wed, 29 May 2013 18:44:09 +0000 (UTC) Received: from grabthar.secnetix.de (localhost [127.0.0.1]) by grabthar.secnetix.de (8.14.5/8.14.5) with ESMTP id r4TIi8lP055054; Wed, 29 May 2013 20:44:08 +0200 (CEST) (envelope-from oliver.fromme@secnetix.de) Received: (from olli@localhost) by grabthar.secnetix.de (8.14.5/8.14.5/Submit) id r4TIi8N2055053; Wed, 29 May 2013 20:44:08 +0200 (CEST) (envelope-from olli) Date: Wed, 29 May 2013 20:44:08 +0200 (CEST) Message-Id: <201305291844.r4TIi8N2055053@grabthar.secnetix.de> From: Oliver Fromme To: freebsd-stable@FreeBSD.ORG, ian@FreeBSD.ORG Subject: Re: 9.1-stable: ATI IXP600 AHCI: CAM timeout In-Reply-To: <1369840577.1258.45.camel@revolution.hippie.lan> X-Newsgroups: list.freebsd-stable User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (FreeBSD/9.1-PRERELEASE-20120811 (i386)) MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 18:44:13 -0000

Ian Lepore wrote:
> On Wed, 2013-05-29 at 16:21 +0200, Oliver Fromme wrote:
> > Steven Hartland wrote:
> > > Have you checked your sata cables and psu outputs?
> > >
> > > Both of these could be the underlying cause of poor signalling.
> >
> > I can't easily check that because it is a cheap rented
> > server in a remote location.
> >
> > But I don't believe it is bad cabling or PSU anyway, or
> > otherwise the problem would occur intermittently all the
> > time if the load on the disks is sufficiently high.
> > But it only occurs at tags=3 and above. At tags=2 it does
> > not occur at all, no matter how hard I hammer on the disks.
> >
> > At the moment I'm inclined to believe that it is either
> > a bug in the HDD firmware or in the controller. The disks
> > aren't exactly new, they're 400 GB Samsung ones that are
> > several years old. I think it's not uncommon to have bugs
> > in the NCQ implementation in such disks.
> >
> > The only thing that puzzles me is the fact that the problem
> > also disappears completely when I reduce the SATA rev from
> > II to I, even at tags=32.
>
> It seems to me that you dismiss signaling problems too quickly.
> Consider the possibilities... A bad cable leads to intermittent errors
> at higher speeds. When NCQ is disabled or limited the software handles
> these errors pretty much transparently. When NCQ is not limited and
> there are many outstanding requests, suddenly the error handling in the
> software breaks down somehow and a minor recoverable problem becomes an
> in-your-face error.
>
> I'm not saying any of the foregoing is true, just that you should
> consider the possibility that you're dealing with multiple problems
> which are only loosely coupled, but together can seem like a single more
> serious problem. You don't know enough yet to casually dismiss
> anything.

Well ... I also can't dismiss the possibility that there is a mouse
in the machine that is pulling the SATA cables twice every minute. :-)

But seriously ... I don't see how bad cabling could cause errors at
tags=3 and no errors at all at tags=2. It shouldn't make a difference
for the cables if there are two or three tags used.

And by the way, it doesn't make a difference at all whether I use
tags=3 or tags=32; the rate of errors is the same in both cases
(about two per minute during buildworld).

I have googled a bit; the Samsung HD401LJ and HD403LJ don't seem to
be innocent ... There are lots of pages mentioning problems with NCQ
and SATA I vs. II.

Best regards
Oliver

--
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing
Handelsregister: Amtsgericht Muenchen, HRA 74606, Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsreg.: Amtsgericht München,
HRB 125758, Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen/-Produkte + mehr: http://www.secnetix.de/bsd

"A misleading benchmark test can accomplish in minutes what years
of good engineering can never do."
-- Dilbert (2009-03-02) From owner-freebsd-stable@FreeBSD.ORG Wed May 29 19:53:28 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 75DF2CB5 for ; Wed, 29 May 2013 19:53:28 +0000 (UTC) (envelope-from mcdouga9@egr.msu.edu) Received: from mail.egr.msu.edu (hill.egr.msu.edu [35.9.37.162]) by mx1.freebsd.org (Postfix) with ESMTP id 5017C3D5 for ; Wed, 29 May 2013 19:53:27 +0000 (UTC) Received: from hill (localhost [127.0.0.1]) by mail.egr.msu.edu (Postfix) with ESMTP id 5BE5D273DE for ; Wed, 29 May 2013 15:47:08 -0400 (EDT) X-Virus-Scanned: amavisd-new at egr.msu.edu Received: from mail.egr.msu.edu ([127.0.0.1]) by hill (hill.egr.msu.edu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id YCG0VzpAb010 for ; Wed, 29 May 2013 15:47:08 -0400 (EDT) Received: from EGR authenticated sender Message-ID: <51A65B3C.4060205@egr.msu.edu> Date: Wed, 29 May 2013 15:47:08 -0400 From: Adam McDougall User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130516 Thunderbird/17.0.6 MIME-Version: 1.0 To: freebsd-stable@freebsd.org Subject: Re: 9.1-stable: ATI IXP600 AHCI: CAM timeout References: <201305291421.r4TELY8p042536@grabthar.secnetix.de> In-Reply-To: <201305291421.r4TELY8p042536@grabthar.secnetix.de> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 19:53:28 -0000 On 05/29/13 10:21, Oliver Fromme wrote: > Steven Hartland wrote: > > Have you checked your sata cables and psu outputs? > > > > Both of these could be the underlying cause of poor signalling. > > I can't easily check that because it is a cheap rented > server in a remote location. 
> > But I don't believe it is bad cabling or PSU anyway, or > otherwise the problem would occur intermittently all the > time if the load on the disks is sufficiently high. > But it only occurs at tags=3 and above. At tags=2 it does > not occur at all, no matter how hard I hammer on the disks. > > At the moment I'm inclined to believe that it is either > a bug in the HDD firmware or in the controller. The disks > aren't exactly new, they're 400 GB Samsung ones that are > several years old. I think it's not uncommon to have bugs > in the NCQ implementation in such disks. > > The only thing that puzzles me is the fact that the problem > also disappears completely when I reduce the SATA rev from > II to I, even at tags=32. > > Best regards > Oliver > > Jeremy Chadwick knows of some hardware faults with IXP600/700, there may be more information on the freebsd-fs mailing list archives or if you can discuss with him: http://docs.freebsd.org/cgi/mid.cgi?20130414194440.GB38338 That email mentions port multipliers but the problems may extend beyond. 
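For reference, the SATA revision limit Oliver describes above can be
set per channel with a loader tunable. This is only a minimal sketch:
it assumes the two disks sit on ahcich0 and ahcich1 (as in his dmesg)
and that the hint.ahcich.X.sata_rev tunable from ahci(4) is what was
used; the thread itself does not say how he made the change.

```sh
# /boot/loader.conf
# Assumption: cap both AHCI channels at SATA revision 1 (1.5 Gbps).
# Values 1, 2 and 3 correspond to 1.5, 3 and 6 Gbps; 0 means no limit.
hint.ahcich.0.sata_rev="1"
hint.ahcich.1.sata_rev="1"
```

The NCQ tag count is a separate, runtime setting. Something along the
lines of "camcontrol tags ada0 -N 2" should limit a disk to two tags,
but again the exact method used in this thread is not stated.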
From owner-freebsd-stable@FreeBSD.ORG Wed May 29 22:41:38 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E62D6172 for ; Wed, 29 May 2013 22:41:38 +0000 (UTC) (envelope-from hugo@barafranca.com) Received: from mail.barafranca.com (mail.barafranca.com [67.213.67.47]) by mx1.freebsd.org (Postfix) with ESMTP id C3EBDD3F for ; Wed, 29 May 2013 22:41:38 +0000 (UTC) Received: from localhost (unknown [172.16.100.24]) by mail.barafranca.com (Postfix) with ESMTP id 7E27690 for ; Wed, 29 May 2013 22:41:38 +0000 (UTC) X-Virus-Scanned: amavisd-new at barafranca.com Received: from mail.barafranca.com ([172.16.100.24]) by localhost (mail.barafranca.com [172.16.100.24]) (amavisd-new, port 10024) with ESMTP id 8yLBxSrLICKJ for ; Wed, 29 May 2013 22:40:57 +0000 (UTC) Received: from [192.168.1.1] (a89-152-175-134.cpe.netcabo.pt [89.152.175.134]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.barafranca.com (Postfix) with ESMTPSA id D341284 for ; Wed, 29 May 2013 22:40:53 +0000 (UTC) Message-ID: <51A683E1.6070306@barafranca.com> Date: Wed, 29 May 2013 23:40:33 +0100 From: Hugo Silva User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7 MIME-Version: 1.0 To: freebsd-stable@freebsd.org Subject: Constant AE_AML_NO_OPERAND errors on Toshiba Z930 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 22:41:39 -0000 Hello! 
I'm seeing a lot of the following on this laptop: ACPI Error: Method execution failed [\\_SB_.BAT1._HID] (Node 0xfffffe0002a13600), AE_AML_NO_OPERAND (20110527/uteval-113) ACPI Error: No object attached to node 0xfffffe0002a13600 (20110527/exresnte-139) Running 9.1-STABLE/amd64 as of yesterday. Any idea how to improve the situation? Thanks! From owner-freebsd-stable@FreeBSD.ORG Wed May 29 22:54:15 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 53C0F39D for ; Wed, 29 May 2013 22:54:15 +0000 (UTC) (envelope-from jdc@koitsu.org) Received: from qmta13.emeryville.ca.mail.comcast.net (qmta13.emeryville.ca.mail.comcast.net [IPv6:2001:558:fe2d:44:76:96:27:243]) by mx1.freebsd.org (Postfix) with ESMTP id 26B70DAC for ; Wed, 29 May 2013 22:54:15 +0000 (UTC) Received: from omta07.emeryville.ca.mail.comcast.net ([76.96.30.59]) by qmta13.emeryville.ca.mail.comcast.net with comcast id hx001l0041GXsucADyuEL1; Wed, 29 May 2013 22:54:14 +0000 Received: from jdc.koitsu.org ([67.180.84.87]) by omta07.emeryville.ca.mail.comcast.net with comcast id hyuC1l00d1t3BNj8UyuDKB; Wed, 29 May 2013 22:54:13 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 9877373A33; Wed, 29 May 2013 15:54:12 -0700 (PDT) Date: Wed, 29 May 2013 15:54:12 -0700 From: Jeremy Chadwick To: Oliver Fromme Subject: Re: 9.1-stable: ATI IXP600 AHCI: CAM timeout Message-ID: <20130529225412.GA8102@icarus.home.lan> References: <201305290809.r4T89EvT024069@grabthar.secnetix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201305290809.r4T89EvT024069@grabthar.secnetix.de> User-Agent: Mutt/1.5.21 (2010-09-15) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; s=q20121106; t=1369868054; bh=Eg3tIPDldo90aDRtiWrlnqS/oHu4qsoZ5SBfpadbEws=; h=Received:Received:Received:Date:From:To:Subject:Message-ID: MIME-Version:Content-Type; 
b=mblpZ8eYdJoAS7clQTUPWPnG9yC6LaYn5MmTAQMWnEH9TLSDY3gzwN2WyNwLvCzE/ 1O2iNF7G+YfwHM2aPj14WHOxR+fi4vRvk2eyCWDEFOv7WqNZOX5ESWFLWDHxrHG0W/ oNMRgdHG1JJihFoEhMsbdhvwZwsQOT1Ihpm9gNNJJCRk0l+HL2to9+Y0bY1xJPnrh+ f5QCCfHpF86D2p4ZA3wHtVIE29kjTc3YTwS2ZKvsIbRp4eLMiyTHblTvju/SY0SGss f3hkvXbBovC4ms8JtQxRd0wZNsjelSXXwxHSGnmpw6sK2C//YTy/FxACPYZi5q7OTm 4Wex+xHoy16qA== Cc: freebsd-stable@FreeBSD.ORG X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 May 2013 22:54:15 -0000 On Wed, May 29, 2013 at 10:09:14AM +0200, Oliver Fromme wrote: > Hi, > > Yesterday I have downloaded the latest 9.1 snapshot (May 15th) > from ftp.freebsd.org and installed it on a machine that was > previously running Linux. It works fine, except that I get > many the following when there is heavy disk I/O, e.g. when > building world or ports: > > ahcich0: Timeout on slot 23 port 0 > ahcich0: is 00000000 cs f07fffff ss ffffffff rs ffffffff tfd c0 serr 00000000 cmd 0004bc17 > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 c9 e0 40 04 00 00 00 00 00 > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Retrying command The messages above indicate two things: 1) The AHCI driver is reporting an internal timeout when trying to speak to the underlying device (disk) attached to whatever maps to ahcich0; this is an "AHCI-level timeout", and the 2nd line shows all of the AHCI-level status conditions at that time, 2) CAM reports what it was trying to do when that happened, specifically issue WRITE_FPDMA_QUEUED (an NCQ-based write to ada0), which timed out after 30 seconds (kern.cam.ada.default_timeout). > It happens for *both* ahcich0/ada0 and ahcich1/ada1 equally > often (it's a gmirror), sometimes even at exactly the same > time so the messages for ada0 and ada1 are interleaved in > the dmesg output. 
Both surprising and not surprising (to me anyway), on numerous levels.

> The worst thing is that the whole system seems to freeze
> completely for about 10 seconds each time it happens.
> Other than that, I haven't seen any ill effects, i.e. no
> processes dying and no panics (so far). But the system is
> quite unusable because of the freezes.

There isn't much you can do about this. I get the impression from your
statement this is the first time you've ever encountered an I/O timeout
in your life? :-) This is just how it works -- pretty much the entire
I/O subsystem (for the device(s) involved) "stalls" until a response to
the CDB is received. It's like this on all OSes, all systems; it's how
I/O works. The AHCI driver may have different timeout settings; I
haven't looked.

The same CDB gets resubmitted to the controller 5 times
(kern.cam.ada.retry_count will say 4, but it starts at 0 if I remember
right), in hopes that the I/O transaction will eventually go through.

Repeated device timeouts with no successful responses will eventually
cause CAM or AHCI (I forget which driver/subsystem) to drop the disk.
In your case, this could mean ada0 and ada1 eventually getting dropped,
which would induce a panic since you're using them for your root
filesystem.

(I wonder if there are readers of this thread who are starting to see
why I use a single disk for my main OS drive...)

> I'm pretty sure the hardware has no defects. The machine
> was running Linux fine until recently.
>
> Are there any known issues with FreeBSD + ATI IXP600?

This is opening a can of worms, which I've discussed in the past.
Please see my posts to freebsd-fs and/or freebsd-stable archives
(another person in this thread mentioned it as well).

Fact: there is still not enough low-level, hard evidence at this time
to determine if the problem is with the AHCI driver, the AMD/ATI IXP600
controller, or Samsung disks. The situations I have dealt with in the
past were always inconclusive.
There have been reports of problems with non-Samsung disks as well, but the report count there is extremely low in comparison. Fact: You will find complaints on Linux lists about both the controller and the drives as well (in combo). Take that to mean whatever you wish. Use Google and search for "SB600 HD403LJ Linux" or "SB600 Samsung Linux" and see for yourself. Fact: Samsung's SpinPoint series has had a troubling past of firmware bugs. Things have gotten better on their newer-ish drives, but the "slightly older" ones, to me, seemed more like a learning experience for engineers. I am not picking on Samsung exclusively; all drive vendors have had problems historically, there is no such thing as a "reliable" drive vendor in this day and age. You go with whatever works for you/whatever your experiences justify. All that said: There is some code in sys/dev/ahci/ahci.c that indicates "one-off" behaviour for the SB600/IXP600, pertaining to NCQ. However this was committed a long time ago (r196777 and r196796). I look at this code and I can think of one problem with it, but answers to my below questions will provide what I need. > The kernel is the default GENERIC from the snapshot, the > only additional modules loaded are geom_mirror and linux.ko. > The dmesg messages related to disks are copied below, and > the full dmesg can be found here: > http://www.secnetix.de/olli/tmp/dmesg.nox.txt > > Best regards > Oliver > > FreeBSD 9.1-STABLE #0: Mon May 13 05:10:23 UTC 2013 > root@snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64 > .. > ahci0: port 0xb000-0xb007,0xa000-0xa003,0x9000-0x9007,0x8000-0x8003,0x7000-0x700f mem 0xfe7ff800-0xfe7ffbff irq 22 at device 18.0 on pci0 > ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier supported > ahcich0: at channel 0 on ahci0 > ahcich1: at channel 1 on ahci0 > ahcich2: at channel 2 on ahci0 > ahcich3: at channel 3 on ahci0 > .. > .. > (aprobe0:ahcich0:0:15:0): NOP. 
ACB: 00 00 00 00 00 00 00 00 00 00 00 00 > (aprobe0:ahcich0:0:15:0): CAM status: Command timeout > (aprobe0:ahcich0:0:15:0): Error 5, Retries exhausted > (aprobe1:ahcich1:0:15:0): NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00 > (aprobe1:ahcich1:0:15:0): CAM status: Command timeout > (aprobe1:ahcich1:0:15:0): Error 5, Retries exhausted > ada0 at ahcich0 bus 0 scbus0 target 0 lun 0 > ada0: ATA-8 SATA 2.x device > ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) > ada0: Command Queueing enabled > ada0: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C) > ada0: Previously was known as ad4 > ada1 at ahcich1 bus 0 scbus1 target 0 lun 0 > ada1: ATA-8 SATA 2.x device > ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) > ada1: Command Queueing enabled > ada1: 381554MB (781422768 512 byte sectors: 16H 63S/T 16383C) > ada1: Previously was known as ad6 > .. > GEOM_MIRROR: Device mirror/gm0 launched (2/2). > .. > Trying to mount root from ufs:/dev/mirror/gm0s1a [rw]... > .. > ahcich0: Timeout on slot 23 port 0 > ahcich0: is 00000000 cs f07fffff ss ffffffff rs ffffffff tfd c0 serr 00000000 cmd 0004bc17 > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 c9 e0 40 04 00 00 00 00 00 > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Retrying command > ahcich1: Timeout on slot 12 port 0 > ahcich1: is 00000000 cs ffff8fff ss ffffffff rs ffffffff tfd 40 serr 00000000 cmd 0004ee17 > (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 80 85 e3 40 04 00 00 00 00 00 > (ada1:ahcich1:0:0:0): CAM status: Command timeout > (ada1:ahcich1:0:0:0): Retrying command > ahcich1: Timeout on slot 2 port 0 > ahcich1: is 00000000 cs 00000000 ss 0000001c rs 0000001c tfd 40 serr 00000000 cmd 0004e417 > ahcich0: Timeout on slot 12 port 0 > (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. 
ACB: 61 08 e0 04 e3 40 04 00 00 00 00 00 > (ada1:ahcich1:0:0:0): CAM status: Command timeout > ahcich0: is 00000000 cs 00000000 ss 00007000 rs 00007000 tfd 40 serr 00000000 cmd 0004ee17 > (ada1:ahcich1:0:0:0): Retrying command > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 e0 04 e3 40 04 00 00 00 00 00 > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Retrying command > pid 40615 (try), uid 0: exited on signal 10 (core dumped) > ahcich1: Timeout on slot 7 port 0 > ahcich1: is 00000000 cs fffff07f ss ffffffff rs ffffffff tfd c0 serr 00000000 cmd 0004ac17 > ahcich0: (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 7d 92 40 02 00 00 00 00 00 > Timeout on slot 19 port 0 > (ada1:ahcich1:0:0:0): CAM status: Command timeout > ahcich0: is 00000000 cs ff07ffff ss ffffffff rs ffffffff tfd c0 serr 00000000 cmd 0004b817 > (ada1:ahcich1:0:0:0): Retrying command > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 40 00 7d 92 40 02 00 00 00 00 00 > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Retrying command > ahcich1: Timeout on slot 12 port 0 > ahcich1: is 00000000 cs 00000000 ss 0000f000 rs 0000f000 tfd 40 serr 00000000 cmd 0004ef17 > ahcich0: Timeout on slot 24 port 0 > (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 d8 78 e4 40 04 00 00 00 00 00 > (ada1:ahcich1:0:0:0): CAM status: Command timeout > ahcich0: is 00000000 cs 00000000 ss 0f000000 rs 0f000000 tfd 40 serr 00000000 cmd 0004fb17 > (ada1:ahcich1:0:0:0): Retrying command > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 20 d8 78 e4 40 04 00 00 00 00 00 > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Retrying command > ahcich1: Timeout on slot 1 port 0 > ahcich1: is 00000000 cs 00000000 ss 0000003e rs 0000003e tfd 40 serr 00000000 cmd 0004e517 > ahcich0: Timeout on slot 13 port 0 > (ada1:ahcich1:0:0:0): WRITE_FPDMA_QUEUED. 
ACB: 61 08 30 e0 e4 40 04 00 00 00 00 00 > (ada1:ahcich1:0:0:0): CAM status: Command timeout > ahcich0: is 00000000 cs 00000000 ss 0003e000 rs 0003e000 tfd 40 serr 00000000 cmd 0004f117 > (ada1:ahcich1:0:0:0): Retrying command > (ada0:ahcich0:0:0:0): WRITE_FPDMA_QUEUED. ACB: 61 08 30 e0 e4 40 04 00 00 00 00 00 > (ada0:ahcich0:0:0:0): CAM status: Command timeout > (ada0:ahcich0:0:0:0): Retrying command > .. > .. It's worth pointing out that all of the events you provided are writes. In my experience, historically, that has usually been the case. If a drive firmware screws around when handling an NCQ write, taking too long to do something (think firmware bug), this can happen. If that's the case, the fact it happens on 2 disks of the same type thus wouldn't surprise me. I've mentioned in the past that I know of a few situations where this can happen, particularly with 4KByte sector drives, depending on how the user set up the system. In this case, the Samsung HD403LJ is supposedly a 512-byte sector drive, but the drive probably complies with an older ATA specification and thus only provides the logical sector size in ATA IDENTIFY output, thus the system must assume physical=logical (camcontrol and smartmontools will both say something to the effect of "512 bytes logical/physical"). I would appreciate the following: 1. smartctl -x {ada0,ada1} output using a recent version of smartmontools (6.1 if possible please), 2. camcontrol identify {ada0,ada1} -v output (note the -v), 3. If you are running smartd(8) or not, 4. pciconf -lvbc output. Anecdotal story: A lot of people forget the infamous nVidia nForce 4 vs. Maxtor NCQ issue that circulated "PC enthusiast" sites during the mid-2000s. Neither company wanted to own up to the problem, blaming each other instead. 
There was never any official statement made as to where the problem
was, only that nVidia updated their nForce 4 controller drivers with
some sort of workaround (details were not disclosed), and Maxtor also
quietly added a document to their website stating that you could get a
firmware from Technical Support that would address the problem as well.
I had a combination of the two at the time, which is why I remember it.
Still to this day nobody knows who was really responsible. I won't get
into the whole political/societal aspects of why vendors always blame
one another rather than solve real problems.

There is no way at this time (in real-time or via loader.conf) to
disable NCQ within the AHCI driver. It is possible to add an entry to
the AHCI quirks table for your controller that sets AHCI_Q_NONCQ, if
you want to try that. I can give you a patch for that, but I need to
see the output for the four items requested above first -- it may not
be necessary to try, depending on the results.

I have probably left out key/important information within this mail,
which is an indicator of how tired I have grown of seeing it come up.
:-(

-- | Jeremy Chadwick jdc@koitsu.org | | UNIX Systems Administrator http://jdc.koitsu.org/ | | Mountain View, CA, US | | Making life hard for others since 1977.
PGP 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Thu May 30 06:44:42 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 14BB2DEF for ; Thu, 30 May 2013 06:44:42 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id 811FC257 for ; Thu, 30 May 2013 06:44:41 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1UhwbD-000Ea3-SL; Thu, 30 May 2013 09:44:35 +0300 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: pyunyh@gmail.com Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP In-reply-to: <20130529085544.GC3042@michelle.cdnetworks.com> References: <20130528052953.GA1457@michelle.cdnetworks.com> <20130528064850.GB1457@michelle.cdnetworks.com> <20130529085544.GC3042@michelle.cdnetworks.com> Comments: In-reply-to YongHyeon PYUN message dated "Wed, 29 May 2013 17:55:44 +0900." 
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 30 May 2013 09:44:35 +0300 From: Daniel Braniss Message-ID: Cc: freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 May 2013 06:44:42 -0000 > > --/04w6evG8XlLl3ft > Content-Type: text/plain; charset=us-ascii > Content-Disposition: inline > > On Tue, May 28, 2013 at 09:55:24AM +0300, Daniel Braniss wrote: > > > On Tue, May 28, 2013 at 09:28:00AM +0300, Daniel Braniss wrote: > > > > > On Mon, May 27, 2013 at 10:59:28AM +0300, Daniel Braniss wrote: > > > > > > > On Fri, May 24, 2013 at 05:31:13PM +0300, Daniel Braniss wrote: > > > > > > > > hi, after upgrading to 9.1-stable, this particular hardware - SunFire X2200, > > > > > > > > > > > > > > Show me dmesg(bge(4) and brgphy(4) only) and 'ifconfig bge1' output. 
> > > > > > > > > > > > > > > > > > > bge0: mem > > > > > > 0xfdff0000-0xfdffffff,0xfdfe0000-0xfdfeffff irq 17 at device 4.0 on pci6 > > > > > > bge0: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > > > > miibus2: on bge0 > > > > > > brgphy0: PHY 1 on miibus2 > > > > > > brgphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > > > > bge0: Ethernet address: 00:1b:24:5d:5b:bd > > > > > > bge1: mem > > > > > > 0xfdfc0000-0xfdfcffff,0xfdfb0000-0xfdfbffff irq 18 at device 4.1 on pci6 > > > > > > bge1: CHIP ID 0x00009003; ASIC REV 0x09; CHIP REV 0x90; PCI-X 133 MHz > > > > > > miibus3: on bge1 > > > > > > brgphy1: PHY 1 on miibus3 > > > > > > brgphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, > > > > > > 1000baseT-master, 1000baseT-FDX, 1000baseT-FDX-master, auto, auto-flow > > > > > > bge1: Ethernet address: 00:1b:24:5d:5b:be > > > > > > > > > > > > sf-10> ifconfig bge1 > > > > > > bge1: flags=8802 metric 0 mtu 1500 > > > > > > options=8009b > > > > > TE> > > > > > > ether 00:1b:24:5d:5b:be > > > > > > nd6 options=21 > > > > > > media: Ethernet autoselect (100baseTX ) > > > > > > status: active > > > > > > > > > > > > > > > > Because bge1 is not UP, I wonder how you get link UP/DOWN events. > > > > > Do you have some network script run by cron? > > > > > > > > no scripts. > > > > this port is shared with the ILO/IPMI, and back in March you fixed a problem > > > > that it was hanging soon after it was initialized by the driver, > > > > (r248226 - but I'm not sure if it was ever MFC'ed). > > > > > > It was MFCed. > > > > > > > Initialy I thought it could be caused by connections to it from other > > > > hosts (either via the web, or ssh) so I killed them, but it didn't help. > > > > without that patch the connection fails, and I don't see any DOWN/UP. > > > > > > Could you check how many number of interrupts you get from bge1? 
> > > Ideally you shouldn't get any interrupts for bge1. > > > > it's not even mentioned :-) > > sf-04> vmstat -i > > interrupt total rate > > irq3: uart1 964 0 > > irq4: uart0 6 0 > > irq14: ata0 227354 0 > > irq17: bge0 1021981 2 > > irq21: ohci0 28 0 > > irq22: ehci0 2 0 > > irq23: atapci1 293228 0 > > cpu0:timer 383244076 1124 > > cpu1:timer 2225144 6 > > cpu2:timer 2056087 6 > > cpu3:timer 2093943 6 > > Total 391162813 1147 > > > > Then the only way link UP/DOWN event could be generated for DOWN > interface would be invocation of media status query > (i.e. ifconfig -a) triggered by an external application. Most > drivers I touched check IFF_UP flag before poking media status > register. However I'm not sure you're seeing this issue because you > do not use any network script run by cron. > Anyway, try attached patch and let me know whether it makes any > difference. > > > > > > > > > > > > > > > > > > > > > is toggeling bge1 DOWN/UP every few hours, this port is being used by the ILO. > > > > > > > > To check, I upgraded another identical host, and the same problem appears. > > > > > > > > > > > > > > What is the last known working revision? > > > > > > > > > > > > I have no idea, but I have older versions, and ill start from the oldets > > > > > > (9.1-prerelease), but > > > > > > it will take time, since it takes hours till it happens. > > > > > > > > > > > > > > > > ok. 
> > > > > > > > > > > > > > --/04w6evG8XlLl3ft > Content-Type: text/x-diff; charset=us-ascii > Content-Disposition: attachment; filename="bge.media_sts.diff" > > Index: sys/dev/bge/if_bge.c > =================================================================== > --- sys/dev/bge/if_bge.c (revision 251021) > +++ sys/dev/bge/if_bge.c (working copy) > @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar > > BGE_LOCK(sc); > > + if ((ifp->if_flags & IFF_UP) == 0) { > + BGE_UNLOCK(sc); > + return; > + } > if (sc->bge_flags & BGE_FLAG_TBI) { > ifmr->ifm_status = IFM_AVALID; > ifmr->ifm_active = IFM_ETHER; > > --/04w6evG8XlLl3ft-- after 18hs, the logs are empty! it seems the patch fixes the problem. now maybe it's time to hunt for who is randomly calling for bge_ifmedia_sts ... thanks, danny From owner-freebsd-stable@FreeBSD.ORG Thu May 30 13:11:14 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 09515E9 for ; Thu, 30 May 2013 13:11:14 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id DBBE1F48 for ; Thu, 30 May 2013 13:11:13 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 42358B949; Thu, 30 May 2013 09:11:13 -0400 (EDT) From: John Baldwin To: freebsd-stable@freebsd.org Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP Date: Thu, 30 May 2013 08:59:20 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <20130529085544.GC3042@michelle.cdnetworks.com> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305300859.20928.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 
(bigwig.baldwin.cx); Thu, 30 May 2013 09:11:13 -0400 (EDT) Cc: pyunyh@gmail.com X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 May 2013 13:11:14 -0000 On Thursday, May 30, 2013 2:44:35 am Daniel Braniss wrote: > > --/04w6evG8XlLl3ft > > Content-Type: text/x-diff; charset=us-ascii > > Content-Disposition: attachment; filename="bge.media_sts.diff" > > > > Index: sys/dev/bge/if_bge.c > > =================================================================== > > --- sys/dev/bge/if_bge.c (revision 251021) > > +++ sys/dev/bge/if_bge.c (working copy) > > @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar > > > > BGE_LOCK(sc); > > > > + if ((ifp->if_flags & IFF_UP) == 0) { > > + BGE_UNLOCK(sc); > > + return; > > + } > > if (sc->bge_flags & BGE_FLAG_TBI) { > > ifmr->ifm_status = IFM_AVALID; > > ifmr->ifm_active = IFM_ETHER; > > > > --/04w6evG8XlLl3ft-- > after 18hs, the logs are empty! > it seems the patch fixes the problem. > > now maybe it's time to hunt for who is randomly calling for bge_ifmedia_sts > ... It could be any number of daemons that query interface state such as an SNMP server, ladvd, etc. 
If you wanted help you could modify the patch so that it does something like this: if (/* test for IFF_UP */) { BGE_UNLOCK(sc); if_printf(ifp, "state queried on down interface by pid %d (%s)", curthread->td_proc->p_pid, curthread->td_proc->p_comm); return; } -- John Baldwin From owner-freebsd-stable@FreeBSD.ORG Thu May 30 13:11:17 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 0771BEC for ; Thu, 30 May 2013 13:11:17 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id DB45BF49 for ; Thu, 30 May 2013 13:11:16 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 3D238B98A; Thu, 30 May 2013 09:11:16 -0400 (EDT) From: John Baldwin To: freebsd-stable@freebsd.org Subject: Re: System doesn't dump Date: Thu, 30 May 2013 09:02:29 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <51A5A322.1020503@bsdforen.de> In-Reply-To: <51A5A322.1020503@bsdforen.de> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305300902.29766.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Thu, 30 May 2013 09:11:16 -0400 (EDT) Cc: Dominic Fandrey X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 May 2013 13:11:17 -0000 On Wednesday, May 29, 2013 2:41:38 am Dominic Fandrey wrote: > I have a number of actions that reliably panic the system, such as > performing shutdown -p (yes I'm booting into an inconsistent file > system every time). 
Both with my notebook and my workstation. > > However I cannot get the system to dump. > > dumpdir=/var/crash > and I've tried ada0s2b, /dev/ada0s2b, label/5swap, /dev/label/5swap and AUTO > for dumpdev to no avail. > > The swap partition is 16g, the machines have 8g RAM and there's plenty > of hard disk space available for /var/crash. > > I'm looking for that secret, undocumented trigger, that makes the > system dump if a panic occurs. Once upon a time dumping just worked > if the swap partition was large enough. I miss those olden days. Does /dev/dumpdev exist and point to your swap partition after booting? -- John Baldwin From owner-freebsd-stable@FreeBSD.ORG Thu May 30 15:21:02 2013 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 6040A347; Thu, 30 May 2013 15:21:02 +0000 (UTC) (envelope-from baptiste.daroussin@gmail.com) Received: from mail-we0-x234.google.com (mail-we0-x234.google.com [IPv6:2a00:1450:400c:c03::234]) by mx1.freebsd.org (Postfix) with ESMTP id 35B15CFE; Thu, 30 May 2013 15:20:58 +0000 (UTC) Received: by mail-we0-f180.google.com with SMTP id t60so367370wes.11 for ; Thu, 30 May 2013 08:20:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:date:from:to:cc:subject:message-id:mime-version:content-type :content-disposition:user-agent; bh=QzdFLneOejklLTB/G9d08zoj6wX8gn7X1PbtyyU1Vg4=; b=RI5Cz6erLdFt4q3qjqMdFHj3GoFzkH/ANphiLNDr3b2byQglpAKgs88S7CUye31OQ4 j/no+9cW/LL5sMb7LS6qIbpu8VSzLAvk6srLfXGt4DOPjkdwqO52Pjgpqh7/JGq6K+3+ jBuuzJUzqvy7lnNv0GhYnno0zswFkwosDAm8FMcsoL+J3Dx54tJLdTJ2woK7MmOwX5ez nGpxOPcW+CKdDDPNnyEl1P0FXbY1dQrv8qvGd1ebw9+5QYqVe6qIvvnYY0FY+oUMKbsy CISTasRBd6Qpr7JpkmfjE89aZghTrv5dGRyCwSWGNu1tBRQFK92mBbzasobTsTcFmVC2 M6Dg== X-Received: by 10.180.211.197 with SMTP id ne5mr5067873wic.54.1369927257350; Thu, 30 May 2013 08:20:57 -0700 (PDT) Received: from ithaqua.etoilebsd.net 
(ithaqua.etoilebsd.net. [37.59.37.188]) by mx.google.com with ESMTPSA id cw8sm38630691wib.7.2013.05.30.08.20.55 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 30 May 2013 08:20:56 -0700 (PDT) Sender: Baptiste Daroussin Date: Thu, 30 May 2013 17:20:54 +0200 From: Baptiste Daroussin To: pkg@Freebsd.org Subject: [HEADSUP] New pkg-devel 1.1.0 beta1 Message-ID: <20130530152053.GA19621@ithaqua.etoilebsd.net> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="J/dobhs11T7y2rNN" Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Cc: ports@FreeBSD.org, stable@FreeBSD.org, current@FreeBSD.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 May 2013 15:21:02 -0000 --J/dobhs11T7y2rNN Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

Hi,

The pkg development team is proud to announce the new 1.1.0 beta1 release of pkg.

Here is the list of new features that happened in pkg 1.1:
- new simpler and more reliable solver
- shared libraries are now always tracked
- ssh:// is supported as a protocol to distribute packages (needs pkg 1.1+ on the server hosting the packages)
- multirepository is no longer considered experimental and works by default.
- incremental update of the catalog (only if the repository was created by pkg 1.1+)
- simplification of the public API
- stabilisation of the public API (we will now try to keep it stable, and if changes are needed there will be a deprecation period before removal of any old functions)
- new experimental pkg convert (can convert from and to the legacy pkg database)
  pkg2ng now uses pkg convert (using pkg2ng is still recommended)
- new pkg lock/unlock to prevent any manipulation of a given package (no upgrade, delete, etc.)
- improved UI (you can now see the progress of an upgrade and what is left to be done)
- new pkg annotation to allow one to add annotations (free form key/value) to a package
- pkg audit is now able to directly parse the vuxml native format and not only the compact version
- pkg -vv now shows all available options and their current settings
- pkg -vvv now shows a description of all the available options
- pkg info now automatically considers the query as globbing if * is in the requested pattern
- new hook plugin interface (allows users to create hooks that get called at any time during an upgrade/installation/deletion of a package)
- new cmd plugin interface (allows users to create new subcommands available for pkg)
- pkg register can now register a port installation in the legacy database format
- repository can be defined in simple yaml files

Internal:
- massive usage of hash tables (uthash), which simplifies a lot of the code and improves performance
- lots of optimisation in plist and manifest parsing
- lots of optimisation in loading packages (mmap used when possible)
- lots of cleanup in memory usage
- regression test framework is now ready (using atf); regression tests are slowly being added and populated.
To use this new version:

Ports users (or in building factories: poudriere/tinderbox):
  Add WITH_PKGNG=devel to your make.conf
  pkg set -o ports-mgmt/pkg:ports-mgmt/pkg-devel

Binary package users, if the remote repository is providing pkg 1.1:
  pkg set -o ports-mgmt/pkg:ports-mgmt/pkg-devel
  pkg upgrade

Note that pkg 1.1 can use a repository created for pkg 1.0 and vice versa.

Huge thanks to all the people that have contributed to the pkg development:
- may that be by code
- documentation
- bug report
- feedback
- ideas

List of people who contributed code:
Baptiste Daroussin, Matthew Seaman, Bryan Drewery, Vsevolod Stakhov, Marin Atanasov Nikolov, Alexandre Perrin, Romain Tartière, Julien Laffaye, Glen Barber, John Marino, Alex Kozlov, Roman Naumann, Sofian Brabez, Alberto Villa, Will Andrews, Eitan Adler, Dan McGregor, namor, niamtokik, Arthur Gautier, Garrett Cooper, Andrew Turner, Jeremy Chadwick, Hajimu UMEMOTO, Mark Lokowich, Eygene Ryabinkin, Pietro Cerutti, Rolf Grossmann, Ed Schouten, Dimitry Andric, David Forsythe, Stefan Grundmann, Craig Rodrigues, Antoine Brodin, Andrey Zonov, Joel Dahl

Stats between 1.0 and 1.1:
287 files changed, 63418 insertions(+), 18763 deletions(-)
1198 commits

regards,
Bapt

--J/dobhs11T7y2rNN
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iEYEARECAAYFAlGnblUACgkQ8kTtMUmk6Ey2JQCfWLWmwg/ldAnKn1VVkVGVqFO4
eP8An1FLMPau7L/fchvl+CgxLl2avZUK
=KD94
-----END PGP SIGNATURE-----

--J/dobhs11T7y2rNN--

From owner-freebsd-stable@FreeBSD.ORG Thu May 30 15:27:32 2013 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B1389D3A for ; Thu, 30 May 2013 15:27:32 +0000 (UTC) (envelope-from bdrewery@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:1900:2254:206c::16:87]) by mx1.freebsd.org (Postfix) with ESMTP id 82568E0B for ; Thu, 30 May 2013 15:27:32
+0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.7/8.14.7) with ESMTP id r4UFRWjc089299 for ; Thu, 30 May 2013 15:27:32 GMT (envelope-from bdrewery@freefall.freebsd.org) Received: (from bdrewery@localhost) by freefall.freebsd.org (8.14.7/8.14.7/Submit) id r4UFRWRf089292 for stable@FreeBSD.org; Thu, 30 May 2013 15:27:32 GMT (envelope-from bdrewery) Received: (qmail 11036 invoked from network); 30 May 2013 10:27:30 -0500 Received: from unknown (HELO ?173.160.118.90?) (freebsd@shatow.net@173.160.118.90) by sweb.xzibition.com with ESMTPA; 30 May 2013 10:27:30 -0500 Message-ID: <51A76FE7.1050900@FreeBSD.org> Date: Thu, 30 May 2013 10:27:35 -0500 From: Bryan Drewery Organization: FreeBSD User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130509 Thunderbird/17.0.6 MIME-Version: 1.0 Subject: Re: [HEADSUP] New pkg-devel 1.1.0 beta1 References: <20130530152053.GA19621@ithaqua.etoilebsd.net> In-Reply-To: <20130530152053.GA19621@ithaqua.etoilebsd.net> X-Enigmail-Version: 1.5.1 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="----enig2AXANWMMHAUFNJHBSDNOF" Cc: pkg@Freebsd.org, ports@FreeBSD.org, stable@FreeBSD.org, current@FreeBSD.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 May 2013 15:27:32 -0000

This is an OpenPGP/MIME signed message (RFC 4880 and 3156)
------enig2AXANWMMHAUFNJHBSDNOF
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

On 5/30/2013 10:20 AM, Baptiste Daroussin wrote:
> Ports users (or in building factories: poudriere/tinderbox):
> Add WITH_PKGNG=devel to your make.conf
> pkg set -o ports-mgmt/pkg:ports-mgmt/pkg-devel

FYI this will not currently work with portupgrade. I plan to address it soon.
--=20 Regards, Bryan Drewery ------enig2AXANWMMHAUFNJHBSDNOF Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (MingW32) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQIcBAEBAgAGBQJRp2/nAAoJEG54KsA8mwz5jTIP/0KDzruOJpYqrGSuOVQF3U8n Bx7dFopj2lBjOyn/CAfpfMLwl0N49wzDb7R2FJrI3h1u03p6/mbYEeEpnXwKqKVL cEnBvIuoy3fJg5BBvZqF0tvosQTwq1b6sOS/bnPjdQEtxNyB9zEeBntCva49lGVH sEih9onOOPZU27fKuhWKihvRt4zDH6PpBUORaGgNT/f4iWaA33MPT+mySQcvHvrM BiQJ/rzy8mKKUa+4Hrwjg/zYLbLtMBzb2Ol363rEBXPbpSktIuEEje26MUbS/0+7 nfjjgGW5zq3RdYRaYLIsbphZtiHsboQT4jYaFFwwiCMab7+JwzvHRJvUF30AhI7F QRdtvGOylzHH6thpKRKTwIZzKQrp6V3zAN7s0rCQLosbnAjLWhqy30qxusYvf9Fv VmkreqQawmLgYIPIEsd2TBsdR61D6JE4R4gmF2inFqfvbrEQiPLTu7UiZxkQ+XFo zCZSsUCcXplZ6QWhI5PkDi5SLz/W2xfm0s2E188RJHMbz8RCuNDaDs9tIMK6A0Sh nrNguu7TBoAbPIZSi5VeYElWXWJkkzYJNLxqKIfyerP69kzNMOI/uEbDtNpnZLWN R/mwQ/DeRfYewm5VFZicd/mR4qhpNVDaNvgivHugb7PRRaSQzMWFVH1icvCy1IPK VJLpoCtmSjV03MZhoqQ3 =phJs -----END PGP SIGNATURE----- ------enig2AXANWMMHAUFNJHBSDNOF-- From owner-freebsd-stable@FreeBSD.ORG Fri May 31 01:51:09 2013 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 3036CAAF; Fri, 31 May 2013 01:51:09 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-stable.sentex.ca (freebsd-stable.sentex.ca [IPv6:2607:f3e0:0:3::6502:9b]) by mx1.freebsd.org (Postfix) with ESMTP id ED38AE75; Fri, 31 May 2013 01:51:08 +0000 (UTC) Received: from freebsd-stable.sentex.ca (localhost [127.0.0.1]) by freebsd-stable.sentex.ca (8.14.5/8.14.5) with ESMTP id r4V1p8rX070065; Fri, 31 May 2013 01:51:08 GMT (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-stable.sentex.ca (8.14.5/8.14.5/Submit) id r4V1p7qZ070048; Fri, 31 May 2013 
01:51:08 GMT (envelope-from tinderbox@freebsd.org) Date: Fri, 31 May 2013 01:51:08 GMT Message-Id: <201305310151.r4V1p7qZ070048@freebsd-stable.sentex.ca> X-Authentication-Warning: freebsd-stable.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Subject: [releng_9 tinderbox] failure on i386/i386 Precedence: bulk X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 May 2013 01:51:09 -0000 TB --- 2013-05-30 22:50:22 - tinderbox 2.10 running on freebsd-stable.sentex.ca TB --- 2013-05-30 22:50:22 - FreeBSD freebsd-stable.sentex.ca 8.3-STABLE FreeBSD 8.3-STABLE #0: Tue Oct 16 17:37:58 UTC 2012 mdtancsa@freebsd-stable.sentex.ca:/usr/obj/usr/src/sys/server amd64 TB --- 2013-05-30 22:50:22 - starting RELENG_9 tinderbox run for i386/i386 TB --- 2013-05-30 22:50:22 - cleaning the object tree TB --- 2013-05-30 22:50:22 - /usr/local/bin/svn stat /src TB --- 2013-05-30 22:50:28 - At svn revision 251168 TB --- 2013-05-30 22:50:29 - building world TB --- 2013-05-30 22:50:29 - CROSS_BUILD_TESTING=YES TB --- 2013-05-30 22:50:29 - MAKEOBJDIRPREFIX=/obj TB --- 2013-05-30 22:50:29 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2013-05-30 22:50:29 - SRCCONF=/dev/null TB --- 2013-05-30 22:50:29 - TARGET=i386 TB --- 2013-05-30 22:50:29 - TARGET_ARCH=i386 TB --- 2013-05-30 22:50:29 - TZ=UTC TB --- 2013-05-30 22:50:29 - __MAKE_CONF=/dev/null TB --- 2013-05-30 22:50:29 - cd /src TB --- 2013-05-30 22:50:29 - /usr/bin/make -B buildworld >>> World build started on Thu May 30 22:50:29 UTC 2013 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools 
>>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Fri May 31 01:44:34 UTC 2013 TB --- 2013-05-31 01:44:34 - generating LINT kernel config TB --- 2013-05-31 01:44:34 - cd /src/sys/i386/conf TB --- 2013-05-31 01:44:34 - /usr/bin/make -B LINT TB --- 2013-05-31 01:44:34 - cd /src/sys/i386/conf TB --- 2013-05-31 01:44:34 - /usr/sbin/config -m LINT TB --- 2013-05-31 01:44:34 - building LINT kernel TB --- 2013-05-31 01:44:34 - CROSS_BUILD_TESTING=YES TB --- 2013-05-31 01:44:34 - MAKEOBJDIRPREFIX=/obj TB --- 2013-05-31 01:44:34 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2013-05-31 01:44:34 - SRCCONF=/dev/null TB --- 2013-05-31 01:44:34 - TARGET=i386 TB --- 2013-05-31 01:44:34 - TARGET_ARCH=i386 TB --- 2013-05-31 01:44:34 - TZ=UTC TB --- 2013-05-31 01:44:34 - __MAKE_CONF=/dev/null TB --- 2013-05-31 01:44:34 - cd /src TB --- 2013-05-31 01:44:34 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Fri May 31 01:44:34 UTC 2013 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. 
-I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-sse -msoft-float -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/aha/aha_isa.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-sse -msoft-float -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/aha/aha_mca.c In file included from /src/sys/dev/aha/aha_mca.c:49: /src/sys/dev/aha/ahareg.h:300: error: field 'timer' has incomplete type /src/sys/dev/aha/aha_mca.c: In function 'aha_mca_attach': /src/sys/dev/aha/aha_mca.c:194: error: 'aha' undeclared (first use in this function) /src/sys/dev/aha/aha_mca.c:194: error: (Each undeclared identifier is reported only once /src/sys/dev/aha/aha_mca.c:194: error: for each function it appears in.) *** Error code 1 Stop in /obj/i386.i386/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. 
TB --- 2013-05-31 01:51:07 - WARNING: /usr/bin/make returned exit code 1 TB --- 2013-05-31 01:51:07 - ERROR: failed to build LINT kernel TB --- 2013-05-31 01:51:07 - 8352.52 user 914.26 system 10844.89 real http://tinderbox.freebsd.org/tinderbox-freebsd9-build-RELENG_9-i386-i386.full From owner-freebsd-stable@FreeBSD.ORG Fri May 31 05:25:00 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id F2E66C6E; Fri, 31 May 2013 05:24:59 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id A9862151; Fri, 31 May 2013 05:24:59 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1UiHpX-000AaB-38; Fri, 31 May 2013 08:24:47 +0300 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: John Baldwin Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP In-reply-to: <201305300859.20928.jhb@freebsd.org> References: <20130529085544.GC3042@michelle.cdnetworks.com> <201305300859.20928.jhb@freebsd.org> Comments: In-reply-to John Baldwin message dated "Thu, 30 May 2013 08:59:20 -0400." 
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 31 May 2013 08:24:47 +0300 From: Daniel Braniss Message-ID: Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 May 2013 05:25:00 -0000 > On Thursday, May 30, 2013 2:44:35 am Daniel Braniss wrote: > > > --/04w6evG8XlLl3ft > > > Content-Type: text/x-diff; charset=us-ascii > > > Content-Disposition: attachment; filename="bge.media_sts.diff" > > > > > > Index: sys/dev/bge/if_bge.c > > > =================================================================== > > > --- sys/dev/bge/if_bge.c (revision 251021) > > > +++ sys/dev/bge/if_bge.c (working copy) > > > @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar > > > > > > BGE_LOCK(sc); > > > > > > + if ((ifp->if_flags & IFF_UP) == 0) { > > > + BGE_UNLOCK(sc); > > > + return; > > > + } > > > if (sc->bge_flags & BGE_FLAG_TBI) { > > > ifmr->ifm_status = IFM_AVALID; > > > ifmr->ifm_active = IFM_ETHER; > > > > > > --/04w6evG8XlLl3ft-- > > after 18hs, the logs are empty! > > it seems the patch fixes the problem. > > > > now maybe it's time to hunt for who is randomly calling for bge_ifmedia_sts > > ... > > It could be any number of daemons that query interface state such as an > SNMP server, ladvd, etc. 
> > If you wanted help you could modify the patch so that it does something like > this: > #include > if (/* test for IFF_UP */) { > BGE_UNLOCK(sc); > if_printf(ifp, "state queried on down interface by pid %d (%s)", ------------------------------------------------------------------------------| add a \n > curthread->td_proc->p_pid, curthread->td_proc->p_comm); > return; > } > > -- > John Baldwin snmpd call this several times a second, (difficult to measeure since sysolog just says last message repeated 22 times in any case, the DOWN/UP appears once every few hours, oh well. I have now stopped the snmpd daemon, maybe there is someone else ... thanks, danny From owner-freebsd-stable@FreeBSD.ORG Fri May 31 05:47:22 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id AE4D9F1D; Fri, 31 May 2013 05:47:22 +0000 (UTC) (envelope-from pyunyh@gmail.com) Received: from mail-pd0-f170.google.com (mail-pd0-f170.google.com [209.85.192.170]) by mx1.freebsd.org (Postfix) with ESMTP id 85770243; Fri, 31 May 2013 05:47:22 +0000 (UTC) Received: by mail-pd0-f170.google.com with SMTP id x10so1626170pdj.15 for ; Thu, 30 May 2013 22:47:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:date:to:cc:subject:message-id:reply-to:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; bh=iwYVUiLEJ0SXsSM92c/G964+7MNw4TquCbetruuX/Zw=; b=dR5MpckIjs5IGG6htKLAW3j4M19ByT0vT/QnvkrFERjkVNaQooOjbxKu+bqu/2/hoX r3yOTGh1MLwEYHvW0Sl4xJWXYaXmTHjIzG3ZKQ/ilVbNWDcRO688TSj7DUWtjORYsYVG 2RvYXucrHk/5xuS84gB9MsVPbv2RfyyqNz1AzlSjV1FgTUH5ArI0aruGRATvWRuBkphL 2qf0XV971GGBTHEheaeQb5lLOrwZ9UZARDtXiY5BbMbWEGASvIwQL4rN0ssY54r6rfq5 vLn6CYoUWWXVjxMSa1mg5I97pSrK918b/kZ3m6ebi/Kr0W3ty6dJQJJuZOHd+wIjktSq ps7A== X-Received: by 10.68.32.161 with SMTP id k1mr11284892pbi.221.1369979241846; Thu, 30 May 2013 22:47:21 -0700 (PDT) Received: from 
pyunyh@gmail.com (lpe4.p59-icn.cdngp.net. [114.111.62.249]) by mx.google.com with ESMTPSA id pb5sm44722099pbc.29.2013.05.30.22.47.18 for (version=TLSv1 cipher=RC4-SHA bits=128/128); Thu, 30 May 2013 22:47:20 -0700 (PDT) Received: by pyunyh@gmail.com (sSMTP sendmail emulation); Fri, 31 May 2013 14:47:13 +0900 From: YongHyeon PYUN Date: Fri, 31 May 2013 14:47:13 +0900 To: Daniel Braniss Subject: Re: SunFire X2200 ilo's bge1 DOWN/UP Message-ID: <20130531054713.GB1478@michelle.cdnetworks.com> References: <20130529085544.GC3042@michelle.cdnetworks.com> <201305300859.20928.jhb@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: freebsd-stable@freebsd.org, John Baldwin X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: pyunyh@gmail.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 May 2013 05:47:22 -0000 On Fri, May 31, 2013 at 08:24:47AM +0300, Daniel Braniss wrote: > > On Thursday, May 30, 2013 2:44:35 am Daniel Braniss wrote: > > > > --/04w6evG8XlLl3ft > > > > Content-Type: text/x-diff; charset=us-ascii > > > > Content-Disposition: attachment; filename="bge.media_sts.diff" > > > > > > > > Index: sys/dev/bge/if_bge.c > > > > =================================================================== > > > > --- sys/dev/bge/if_bge.c (revision 251021) > > > > +++ sys/dev/bge/if_bge.c (working copy) > > > > @@ -5583,6 +5583,10 @@ bge_ifmedia_sts(struct ifnet *ifp, struct ifmediar > > > > > > > > BGE_LOCK(sc); > > > > > > > > + if ((ifp->if_flags & IFF_UP) == 0) { > > > > + BGE_UNLOCK(sc); > > > > + return; > > > > + } > > > > if (sc->bge_flags & BGE_FLAG_TBI) { > > > > ifmr->ifm_status = IFM_AVALID; > > > > ifmr->ifm_active = IFM_ETHER; > > > > > > > > --/04w6evG8XlLl3ft-- > > > after 18hs, the logs are empty! 
> > > it seems the patch fixes the problem. > > > > > > now maybe it's time to hunt for who is randomly calling for bge_ifmedia_sts > > > ... > > > > It could be any number of daemons that query interface state such as an > > SNMP server, ladvd, etc. > > > > If you wanted help you could modify the patch so that it does something like > > this: > > > #include > > if (/* test for IFF_UP */) { > > BGE_UNLOCK(sc); > > if_printf(ifp, "state queried on down interface by pid %d (%s)", > ------------------------------------------------------------------------------| > add a \n > > curthread->td_proc->p_pid, curthread->td_proc->p_comm); > > return; > > } > > > > -- > > John Baldwin > snmpd call this several times a second, (difficult to measeure since sysolog > just says > last message repeated 22 times > in any case, the DOWN/UP appears once every few hours, oh well. > I have now stopped the snmpd daemon, maybe there is someone else ... I have no idea why snmpd wants to know media status for interfaces that are put into down state. The media status resolved after bringing up the interface may be different one that was seen before. The patch also makes dhclient think driver got a valid link regardless of link establishment. I guess that wouldn't be issue though. I'll commit the patch after some more testing. Thanks for reporting and testing! 
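For readers following the thread, the guard discussed here can be tried out in userland. What follows is only a sketch under stated assumptions, not the committed driver change: it mocks the check the patch adds to bge_ifmedia_sts() (return early when IFF_UP is clear) together with the per-caller report John suggested, including the trailing newline Daniel asked for. MOCK_IFF_UP, struct mock_ifnet, mock_media_status() and both counters are hypothetical stand-ins; the real driver works on struct ifnet under BGE_LOCK/BGE_UNLOCK, prints through if_printf(), and reports curthread's pid and command name rather than getpid().

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

#define MOCK_IFF_UP 0x1		/* hypothetical stand-in for the kernel's IFF_UP */

/* Hypothetical stand-in for the parts of struct ifnet and the softc used here. */
struct mock_ifnet {
	const char *name;	/* interface name, e.g. "bge1" */
	int flags;		/* MOCK_IFF_UP is set when the interface is up */
	unsigned phy_polls;	/* times the pretend PHY status was read */
	unsigned down_queries;	/* times a query arrived while the interface was down */
};

/*
 * Mock media status handler.  Returns 1 if the pretend PHY was polled and
 * 0 if the query was short-circuited because the interface is down.
 */
int
mock_media_status(struct mock_ifnet *ifp)
{
	assert(ifp != NULL);

	if ((ifp->flags & MOCK_IFF_UP) == 0) {
		ifp->down_queries++;
		/* Trailing \n so every report lands on its own log line. */
		fprintf(stderr,
		    "%s: state queried on down interface by pid %d\n",
		    ifp->name, (int)getpid());
		return (0);
	}
	ifp->phy_polls++;	/* the real driver would read PHY registers here */
	return (1);
}
```

Calling mock_media_status() on a struct whose flags are 0 bumps down_queries and leaves phy_polls untouched; once MOCK_IFF_UP is set, only phy_polls grows. The down_queries counter is not part of the patch in this thread. It is one possible way to count frequent callers directly instead of relying on syslog, which collapses repeats into "last message repeated N times".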
> > thanks, > danny > > From owner-freebsd-stable@FreeBSD.ORG Fri May 31 10:12:05 2013 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id C372B11A; Fri, 31 May 2013 10:12:05 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-stable.sentex.ca (freebsd-stable.sentex.ca [IPv6:2607:f3e0:0:3::6502:9b]) by mx1.freebsd.org (Postfix) with ESMTP id 8D73F1FA; Fri, 31 May 2013 10:12:05 +0000 (UTC) Received: from freebsd-stable.sentex.ca (localhost [127.0.0.1]) by freebsd-stable.sentex.ca (8.14.5/8.14.5) with ESMTP id r4VAC5VZ060995; Fri, 31 May 2013 10:12:05 GMT (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-stable.sentex.ca (8.14.5/8.14.5/Submit) id r4VAC5dv060990; Fri, 31 May 2013 10:12:05 GMT (envelope-from tinderbox@freebsd.org) Date: Fri, 31 May 2013 10:12:05 GMT Message-Id: <201305311012.r4VAC5dv060990@freebsd-stable.sentex.ca> X-Authentication-Warning: freebsd-stable.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Subject: [releng_9 tinderbox] failure on i386/i386 Precedence: bulk X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 May 2013 10:12:05 -0000 TB --- 2013-05-31 07:10:25 - tinderbox 2.10 running on freebsd-stable.sentex.ca TB --- 2013-05-31 07:10:25 - FreeBSD freebsd-stable.sentex.ca 8.3-STABLE FreeBSD 8.3-STABLE #0: Tue Oct 16 17:37:58 UTC 2012 mdtancsa@freebsd-stable.sentex.ca:/usr/obj/usr/src/sys/server amd64 TB --- 2013-05-31 07:10:25 - starting RELENG_9 tinderbox run for i386/i386 TB --- 2013-05-31 07:10:25 - cleaning the object tree TB --- 2013-05-31 07:10:45 - /usr/local/bin/svn stat /src TB --- 2013-05-31 07:10:50 - At svn revision 251176 
TB --- 2013-05-31 07:10:51 - building world TB --- 2013-05-31 07:10:51 - CROSS_BUILD_TESTING=YES TB --- 2013-05-31 07:10:51 - MAKEOBJDIRPREFIX=/obj TB --- 2013-05-31 07:10:51 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2013-05-31 07:10:51 - SRCCONF=/dev/null TB --- 2013-05-31 07:10:51 - TARGET=i386 TB --- 2013-05-31 07:10:51 - TARGET_ARCH=i386 TB --- 2013-05-31 07:10:51 - TZ=UTC TB --- 2013-05-31 07:10:51 - __MAKE_CONF=/dev/null TB --- 2013-05-31 07:10:51 - cd /src TB --- 2013-05-31 07:10:51 - /usr/bin/make -B buildworld >>> World build started on Fri May 31 07:10:51 UTC 2013 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything >>> World build completed on Fri May 31 10:05:30 UTC 2013 TB --- 2013-05-31 10:05:30 - generating LINT kernel config TB --- 2013-05-31 10:05:30 - cd /src/sys/i386/conf TB --- 2013-05-31 10:05:30 - /usr/bin/make -B LINT TB --- 2013-05-31 10:05:30 - cd /src/sys/i386/conf TB --- 2013-05-31 10:05:30 - /usr/sbin/config -m LINT TB --- 2013-05-31 10:05:30 - building LINT kernel TB --- 2013-05-31 10:05:30 - CROSS_BUILD_TESTING=YES TB --- 2013-05-31 10:05:30 - MAKEOBJDIRPREFIX=/obj TB --- 2013-05-31 10:05:30 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2013-05-31 10:05:30 - SRCCONF=/dev/null TB --- 2013-05-31 10:05:30 - TARGET=i386 TB --- 2013-05-31 10:05:30 - TARGET_ARCH=i386 TB --- 2013-05-31 10:05:30 - TZ=UTC TB --- 2013-05-31 10:05:30 - __MAKE_CONF=/dev/null TB --- 2013-05-31 10:05:30 - cd /src TB --- 2013-05-31 10:05:30 - /usr/bin/make -B buildkernel KERNCONF=LINT >>> Kernel build for LINT started on Fri May 31 10:05:30 UTC 2013 >>> stage 1: configuring the kernel >>> stage 2.1: cleaning up the object tree >>> 
stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3.1: making dependencies >>> stage 3.2: building everything [...] cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-sse -msoft-float -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/aha/aha_isa.c cc -c -O2 -pipe -fno-strict-aliasing -std=c99 -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -Wmissing-include-dirs -fdiagnostics-show-option -nostdinc -I. -I/src/sys -I/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -DGPROF -falign-functions=16 -DGPROF4 -DGUPROF -fno-builtin -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-sse -msoft-float -ffreestanding -fstack-protector -Werror -pg -mprofiler-epilogue /src/sys/dev/aha/aha_mca.c In file included from /src/sys/dev/aha/aha_mca.c:49: /src/sys/dev/aha/ahareg.h:300: error: field 'timer' has incomplete type /src/sys/dev/aha/aha_mca.c: In function 'aha_mca_attach': /src/sys/dev/aha/aha_mca.c:194: error: 'aha' undeclared (first use in this function) /src/sys/dev/aha/aha_mca.c:194: error: (Each undeclared identifier is reported only once /src/sys/dev/aha/aha_mca.c:194: error: for each function it appears in.) 
*** Error code 1 Stop in /obj/i386.i386/src/sys/LINT. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. TB --- 2013-05-31 10:12:04 - WARNING: /usr/bin/make returned exit code 1 TB --- 2013-05-31 10:12:04 - ERROR: failed to build LINT kernel TB --- 2013-05-31 10:12:04 - 8357.76 user 914.91 system 10899.77 real http://tinderbox.freebsd.org/tinderbox-freebsd9-build-RELENG_9-i386-i386.full From owner-freebsd-stable@FreeBSD.ORG Fri May 31 12:41:40 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 845FBB6F for ; Fri, 31 May 2013 12:41:40 +0000 (UTC) (envelope-from Andre.Albsmeier@siemens.com) Received: from david.siemens.de (david.siemens.de [192.35.17.14]) by mx1.freebsd.org (Postfix) with ESMTP id 1D7AEDB8 for ; Fri, 31 May 2013 12:41:39 +0000 (UTC) Received: from mail2.siemens.de (localhost [127.0.0.1]) by david.siemens.de (8.13.6/8.13.6) with ESMTP id r4VCQB34000975 for ; Fri, 31 May 2013 14:26:11 +0200 Received: from curry.mchp.siemens.de (curry.mchp.siemens.de [139.25.40.130]) by mail2.siemens.de (8.13.6/8.13.6) with ESMTP id r4VCQBlI031684 for ; Fri, 31 May 2013 14:26:11 +0200 Received: (from localhost) by curry.mchp.siemens.de (8.14.7/8.14.7) id r4VCQBgI029892; Date: Fri, 31 May 2013 14:26:11 +0200 From: Andre Albsmeier To: freebsd-stable@freebsd.org Subject: FreeBSD-9.1: machine reboots during snapshot creation, LORs found Message-ID: <20130531122611.GA6607@bali> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Echelon: X-Advice: Drop that crappy M$-Outlook, I'm tired of your viruses! 
User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 May 2013 12:41:40 -0000

Each day at 5:15 we generate snapshots on various machines. This worked perfectly under 7-STABLE for years, but since we started to use 9.1-STABLE the machine reboots in about 10% of all cases.

After rebooting we find a new snapshot file which is a bit smaller than the good ones and has different permissions. It does not pass fsck. In this example it is the one whose name begins with s3:

-r--r----- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04
-r-------- 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03
-r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44
-r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03
-r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03

After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel I see the following LORs (mksnap_ffs starts exactly at 5:15):

May 29 05:15:00 palveli kernel: lock order reversal:
May 29 05:15:00 palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240
May 29 05:15:00 palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414
May 29 05:15:04 palveli kernel: lock order reversal:
May 29 05:15:04 palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976
May 29 05:15:04 palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626

Unfortunately no corefiles are being generated ;-(.

I have checked and even rebuilt the (UFS1) fs in question from scratch.
I have also seen this happen on a UFS2 on another machine and on a third one when running "dump -L" on a root fs.

Any hints on how to proceed?

-Andre

From owner-freebsd-stable@FreeBSD.ORG Fri May 31 13:44:10 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 4B807F42 for ; Fri, 31 May 2013 13:44:10 +0000 (UTC) (envelope-from h.schmalzbauer@omnilan.de) Received: from host.omnilan.net (s1.omnilan.net [62.245.232.135]) by mx1.freebsd.org (Postfix) with ESMTP id BBE562D2 for ; Fri, 31 May 2013 13:44:09 +0000 (UTC) Received: from titan.inop.wdn.omnilan.net (titan.inop.wdn.omnilan.net [172.21.3.1]) (authenticated bits=0) by host.omnilan.net (8.13.8/8.13.8) with ESMTP id r4VDh5ki020919 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 31 May 2013 15:43:05 +0200 (CEST) (envelope-from h.schmalzbauer@omnilan.de) Message-ID: <51A8A8E4.5000004@omnilan.de> Date: Fri, 31 May 2013 15:43:00 +0200 From: Harald Schmalzbauer Organization: OmniLAN User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; de-DE; rv:1.9.2.8) Gecko/20100906 Lightning/1.0b2 Thunderbird/3.1.2 MIME-Version: 1.0 To: FreeBSD Stable Subject: pf loosing (v6) TCP states much too early, "no-route" not working with IPv6 X-Enigmail-Version: 1.1.2 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig5374106B3D496F7FBAB429F8" X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 May 2013 13:44:10 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enig5374106B3D496F7FBAB429F8 Content-Type: text/plain; charset=ISO-8859-15 Content-Transfer-Encoding: quoted-printable

Hello,

my default pf config blocks everything and allows
specific connections. One of them is "in from x to self port ssh", which expands to "port ssh keep state flags S/SA" by default.

After ssh login, I see the corresponding entry in the state table:

all tcp 2001:db8:f0bb:1::1[22] <- 2001:db8:f0bb:1::3:1[42730] ESTABLISHED:ESTABLISHED

pfctl -s info claims:

TIMEOUTS:
...
tcp.established 86400s
...

After a couple of hours of inactivity, the ssh session silently stalls. Here's what I have in the log:

rule 3/0(match): block in on rl1: 2001:db8:f0bb:1::3:1.42730 > 2001:db8:f0bb:1::1.22: Flags [P.], ack 1444009640, win 65535, length 48

The rule evaluation by itself is correct: it's not a TCP SYN, so it gets blocked. But this packet should not be evaluated against the ruleset at all, at least not before 86400s of idle connection. In my case, it was after ~3 hours, and the port numbers are exactly the same as in the state table entry from some hours before. So the state table entry seems to have been lost!

My question: Is such a problem known? Did I miss anything else? The system runs 8.1-STABLE/x86.

Another issue was that "no-route" doesn't work for IPv6 connections. I had to replace it with "any".

Thanks in advance for any hints,

-Harry

P.S.: It's an embedded box where upgrading is overdue, but not that easy...
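
[Editorial sketch, not part of the original message.] The reasoning above compares the observed idle time (~3 hours) with the configured tcp.established timeout (86400s). A minimal Python sketch of that check follows, assuming the timeout listing has the "name NNNs" shape quoted in the message; the helper names parse_timeouts and should_have_expired are hypothetical and not part of pf or pfctl.

```python
import re
from typing import Dict


def parse_timeouts(text: str) -> Dict[str, int]:
    """Collect 'name NNNs' lines (e.g. 'tcp.established 86400s') into a dict.

    Assumption: the listing looks like the excerpt quoted in the message;
    lines that do not match (such as 'TIMEOUTS:' or '...') are ignored.
    """
    timeouts = {}
    for line in text.splitlines():
        match = re.match(r"\s*([a-z0-9_.]+)\s+(\d+)s\s*$", line)
        if match:
            timeouts[match.group(1)] = int(match.group(2))
    return timeouts


def should_have_expired(idle_seconds: int, timeout_seconds: int) -> bool:
    """True if a state idle for idle_seconds would be past its timeout."""
    return idle_seconds >= timeout_seconds


listing = "TIMEOUTS:\n...\ntcp.established 86400s\n...\n"
timeouts = parse_timeouts(listing)
idle = 3 * 3600  # roughly three hours of inactivity, as reported
print(should_have_expired(idle, timeouts["tcp.established"]))
```

With the values from the message the check is false, which matches the report: a fixed 86400s timeout alone would not explain a state disappearing after about three hours.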
--------------enig5374106B3D496F7FBAB429F8 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.18 (FreeBSD) iEYEARECAAYFAlGoqOkACgkQLDqVQ9VXb8hKigCdH2JVV4Rh/TyTwDWzHU0Vxk94 B2IAn3BsdCATvh9E6aWRWdscANM1UFia =mWSN -----END PGP SIGNATURE----- --------------enig5374106B3D496F7FBAB429F8-- From owner-freebsd-stable@FreeBSD.ORG Fri May 31 14:58:36 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1A47E2AF for ; Fri, 31 May 2013 14:58:36 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigwig.baldwin.cx [IPv6:2001:470:1f11:75::1]) by mx1.freebsd.org (Postfix) with ESMTP id EC738911 for ; Fri, 31 May 2013 14:58:35 +0000 (UTC) Received: from jhbbsd.localnet (unknown [209.249.190.124]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 6940CB917; Fri, 31 May 2013 10:58:35 -0400 (EDT) From: John Baldwin To: freebsd-stable@freebsd.org Subject: Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found Date: Fri, 31 May 2013 10:51:03 -0400 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p25; KDE/4.5.5; amd64; ; ) References: <20130531122611.GA6607@bali> In-Reply-To: <20130531122611.GA6607@bali> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201305311051.03157.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 31 May 2013 10:58:35 -0400 (EDT) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 May 2013 14:58:36 -0000 On Friday, May 31, 
2013 8:26:11 am Andre Albsmeier wrote: > Each day at 5:15 we are generating snapshots on various machines. > This used to work perfectly under 7-STABLE for years but since > we started to use 9.1-STABLE the machine reboots in about 10% > of all cases. > > After rebooting we find a new snapshot file which is a bit > smaller than the good ones and with different permissions > It does not succeed a fsck. In this example it is the one > whose name is beginning with s3: > > -r--r----- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 > -r-------- 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 > -r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 > -r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 > -r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 > > After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel > I see the following LORs (mksnap_ffs starts exactly at 5:15): > > May 29 05:15:00 palveli kernel: lock order reversal: > May 29 05:15:00 palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 > May 29 05:15:00 palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 > May 29 05:15:04 palveli kernel: lock order reversal: > May 29 05:15:04 palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 > May 29 05:15:04 palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 > > Unfortunatley no corefiles are being generated ;-(. > > I have checked and even rebuilt the (UFS1) fs in question > from scratch. I have also seen this happen on an UFS2 on > another machine and on a third one when running "dump -L" > on a root fs. > > Any hints of how to proceed? Would it be possible to setup a serial console that is logged on this machine to see if it is panic'ing but failing to write out a crashdump? 
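
[Editorial sketch, not part of the original message.] One way to keep such a console log on a second machine is to timestamp every line that arrives on its serial port, so a panic message can later be matched against the 5:15 snapshot run. The Python sketch below assumes the crashing machine already sends its console to a serial line and that the receiving port is already configured (speed, raw mode); the function names and the device path in the usage note are illustrative assumptions.

```python
import time
from typing import Callable, Iterable, Iterator


def stamp_lines(lines: Iterable[str],
                clock: Callable[[], float] = time.time) -> Iterator[str]:
    """Prefix each console line with a UTC timestamp."""
    for line in lines:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(clock()))
        yield f"{stamp} {line.rstrip()}"


def log_console(device: str, logfile: str) -> None:
    """Append timestamped lines read from `device` to `logfile` until EOF."""
    with open(device, "r", errors="replace") as src, open(logfile, "a") as dst:
        for stamped in stamp_lines(src):
            dst.write(stamped + "\n")
            dst.flush()  # keep the log current in case the logger itself is interrupted
```

Usage would be something like log_console("/dev/cuau0", "console.log") on the logging host, where /dev/cuau0 is only an example device name.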
-- John Baldwin From owner-freebsd-stable@FreeBSD.ORG Fri May 31 17:27:59 2013 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 88AC1EF7; Fri, 31 May 2013 17:27:59 +0000 (UTC) (envelope-from Andre.Albsmeier@siemens.com) Received: from thoth.sbs.de (thoth.sbs.de [192.35.17.2]) by mx1.freebsd.org (Postfix) with ESMTP id 22B14155; Fri, 31 May 2013 17:27:58 +0000 (UTC) Received: from mail1.siemens.de (localhost [127.0.0.1]) by thoth.sbs.de (8.13.6/8.13.6) with ESMTP id r4VHPNrI006224; Fri, 31 May 2013 19:25:23 +0200 Received: from curry.mchp.siemens.de (curry.mchp.siemens.de [139.25.40.130]) by mail1.siemens.de (8.13.6/8.13.6) with ESMTP id r4VHPNLh004654; Fri, 31 May 2013 19:25:23 +0200 Received: (from localhost) by curry.mchp.siemens.de (8.14.7/8.14.7) id r4VHPNTH030764; Date: Fri, 31 May 2013 19:25:23 +0200 From: Andre Albsmeier To: John Baldwin Subject: Re: FreeBSD-9.1: machine reboots during snapshot creation, LORs found Message-ID: <20130531172523.GA9188@bali> References: <20130531122611.GA6607@bali> <201305311051.03157.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201305311051.03157.jhb@freebsd.org> X-Echelon: X-Advice: Drop that crappy M$-Outlook, I'm tired of your viruses! User-Agent: Mutt/1.5.21 (2010-09-15) Cc: "freebsd-stable@freebsd.org" X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 31 May 2013 17:27:59 -0000 On Fri, 31-May-2013 at 16:51:03 +0200, John Baldwin wrote: > On Friday, May 31, 2013 8:26:11 am Andre Albsmeier wrote: > > Each day at 5:15 we are generating snapshots on various machines. 
> > This used to work perfectly under 7-STABLE for years but since > > we started to use 9.1-STABLE the machine reboots in about 10% > > of all cases. > > > > After rebooting we find a new snapshot file which is a bit > > smaller than the good ones and with different permissions > > It does not succeed a fsck. In this example it is the one > > whose name is beginning with s3: > > > > -r--r----- 1 root operator snapshot 72802894528 29 May 05:15 s2-2013.05.28-03.15.04 > > -r-------- 1 root operator snapshot 72802893824 29 May 05:15 s3-2013.05.29-03.15.03 > > -r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s4-2013.05.23-06.38.44 > > -r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s5-2013.05.24-03.15.03 > > -r--r----- 1 root operator snapshot 72802894528 28 May 14:22 s6-2013.05.25-03.15.03 > > > > After enabling DIAGNOSTIC, WITNESS and INVARIANTS in the kernel > > I see the following LORs (mksnap_ffs starts exactly at 5:15): > > > > May 29 05:15:00 palveli kernel: lock order reversal: > > May 29 05:15:00 palveli kernel: 1st 0xc2371da8 ufs (ufs) @ /src/src-9/sys/kern/vfs_mount.c:1240 > > May 29 05:15:00 palveli kernel: 2nd 0xc2371ec4 devfs (devfs) @ /src/src-9/sys/ufs/ffs/ffs_vfsops.c:1414 > > May 29 05:15:04 palveli kernel: lock order reversal: > > May 29 05:15:04 palveli kernel: 1st 0xc228471c snaplk (snaplk) @ /src/src-9/sys/ufs/ufs/ufs_vnops.c:976 > > May 29 05:15:04 palveli kernel: 2nd 0xc22f25e4 ufs (ufs) @ /src/src-9/sys/ufs/ffs/ffs_snapshot.c:1626 > > > > Unfortunatley no corefiles are being generated ;-(. > > > > I have checked and even rebuilt the (UFS1) fs in question > > from scratch. I have also seen this happen on an UFS2 on > > another machine and on a third one when running "dump -L" > > on a root fs. > > > > Any hints of how to proceed? > > Would it be possible to setup a serial console that is logged on this machine > to see if it is panic'ing but failing to write out a crashdump? I'll try to arrange that. 
It'll take a bit since this box is 200 km away... Maybe I'll find another one nearby to reproduce it... -Andre -- This email has been checked as virus-free. It may still be full of nonsense however.