From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 03:21:20 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 152F91065672 for ; Sun, 26 Sep 2010 03:21:20 +0000 (UTC) (envelope-from adrian.chadd@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id C9E7B8FC15 for ; Sun, 26 Sep 2010 03:21:19 +0000 (UTC) Received: by iwn34 with SMTP id 34so4605528iwn.13 for ; Sat, 25 Sep 2010 20:21:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=6caLjY8lk3S4dfBN16KQ8rSyAfwe9I5+sEH9xizp/JQ=; b=K3aaBpMh+8hhG4yd9hzG6zYZF4U3y73FcdBeuyUO3MOS/Ozseo0JI31VvETVlPVsaS ehHDiXe6yOi13syDtkRTPxmxQnDeSnO/SfKunuhkxfhsjzYEhIqayFueNDSnVifWHEj7 Kyk5r4Fvn99RLL4sk7uHAnzMkjvVyNR1TtWBI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=NKI3UuCo/maXN1NTpVwhHihc5TpMIq8LZGbya/Et47COCzuv/NpJ5Q+MsTpPooTy6r UQ4tR583XJMoCOuj38KDghRMhVx1bpYem7hqAlTCDeW3BUbA/jsjrJZBV+SQC3RQmthJ /5rgWReDmPf6bXPyEzNZSDojqgsyCROCgoUzE= MIME-Version: 1.0 Received: by 10.231.19.3 with SMTP id y3mr6502280iba.156.1285471277459; Sat, 25 Sep 2010 20:21:17 -0700 (PDT) Sender: adrian.chadd@gmail.com Received: by 10.231.156.206 with HTTP; Sat, 25 Sep 2010 20:21:17 -0700 (PDT) In-Reply-To: <15c5073c1f543e6b6f404233a341b0e97f1d8692@mail.qip.ru> References: <15043432234b76fe1b2ad7382bfd0e7901412ad1@mail.qip.ru> <15c5073c1f543e6b6f404233a341b0e97f1d8692@mail.qip.ru> Date: Sun, 26 Sep 2010 11:21:17 +0800 X-Google-Sender-Auth: Wi-qscJM4vbA006pdWnDITu5XQI Message-ID: 
From: Adrian Chadd To: Makaruk Roman Valerevich Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable@freebsd.org Subject: Re: Atheros AR2427 in FreeBSD 8.1 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 03:21:20 -0000 2010/9/18 Makaruk Roman Valerevich : > Oh! Thank you! I already lost hope that I can properly run its Wi-Fi card. > Regards, Roman. Don't lose hope. :) It's going to take me a bit of time (I have a lot going on at the moment) but I will do it. :) Adrian From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 09:17:20 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F27031065672; Sun, 26 Sep 2010 09:17:19 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by mx1.freebsd.org (Postfix) with ESMTP id AB0978FC19; Sun, 26 Sep 2010 09:17:19 +0000 (UTC) Received: from freebsd-current.sentex.ca (localhost [127.0.0.1]) by freebsd-current.sentex.ca (8.14.4/8.14.3) with ESMTP id o8Q9HIu7066308; Sun, 26 Sep 2010 05:17:18 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-current.sentex.ca (8.14.4/8.14.3/Submit) id o8Q9HIgP066304; Sun, 26 Sep 2010 09:17:18 GMT (envelope-from tinderbox@freebsd.org) Date: Sun, 26 Sep 2010 09:17:18 GMT Message-Id: <201009260917.o8Q9HIgP066304@freebsd-current.sentex.ca> X-Authentication-Warning: freebsd-current.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [releng_8 tinderbox] failure on mips/mips X-BeenThere: 
freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 09:17:20 -0000 TB --- 2010-09-26 04:31:40 - tinderbox 2.6 running on freebsd-current.sentex.ca TB --- 2010-09-26 04:31:40 - starting RELENG_8 tinderbox run for mips/mips TB --- 2010-09-26 04:31:40 - cleaning the object tree TB --- 2010-09-26 04:33:01 - cvsupping the source tree TB --- 2010-09-26 04:33:01 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup5.freebsd.org /tinderbox/RELENG_8/mips/mips/supfile TB --- 2010-09-26 04:37:40 - building world TB --- 2010-09-26 04:37:40 - MAKEOBJDIRPREFIX=/obj TB --- 2010-09-26 04:37:40 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2010-09-26 04:37:40 - TARGET=mips TB --- 2010-09-26 04:37:40 - TARGET_ARCH=mips TB --- 2010-09-26 04:37:40 - TZ=UTC TB --- 2010-09-26 04:37:40 - __MAKE_CONF=/dev/null TB --- 2010-09-26 04:37:40 - cd /src TB --- 2010-09-26 04:37:40 - /usr/bin/make -B buildworld >>> World build started on Sun Sep 26 04:37:41 UTC 2010 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything [...] 
/obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 *** Error code 1 Stop in /src/usr.bin/tftp. *** Error code 1 Stop in /src/usr.bin. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. 
TB --- 2010-09-26 09:17:18 - WARNING: /usr/bin/make returned exit code 1 TB --- 2010-09-26 09:17:18 - ERROR: failed to build world TB --- 2010-09-26 09:17:18 - 2059.02 user 9184.98 system 17138.43 real http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8-mips-mips.full From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 09:31:46 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 518FC106566C for ; Sun, 26 Sep 2010 09:31:46 +0000 (UTC) (envelope-from demelier.david@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id D6F138FC0A for ; Sun, 26 Sep 2010 09:31:45 +0000 (UTC) Received: by bwz15 with SMTP id 15so3675722bwz.13 for ; Sun, 26 Sep 2010 02:31:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=kus1Kl+joKRgBd3bR8Qf0ntTi9zQ+raQsQaicknI9tI=; b=QZz8/JU43iLhXqciDBk97GD8nDZezppA4lZUpuKBC4JSoryqzG3mE85m6Loip2twVD t1tZafzDeLuV9VZr4AYKz/QPvombO8xyfsMEGRSaNxKMYN0n1snwHB3WbWdZBiDkFr6e fkI7BsXzEbWWOPpFWHgmIyb5akwFS8qKGAKDc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=xf4EKp50jAuWyMKx4c2zGA5mN/pdScLz19NQv2sxzKgIQfZH54L7ttCGbTSGF3mB8G IkFUC6L8sKXhYErO5qcewiqiFowr6NZD0Il83HAJq9/RXF69x/yEw3EnaetsO5/DdhkN FSiAPb115n0iaDHoV2MBCVKo4/2zW+empZGXc= MIME-Version: 1.0 Received: by 10.204.75.81 with SMTP id x17mr4040573bkj.72.1285493501465; Sun, 26 Sep 2010 02:31:41 -0700 (PDT) Received: by 10.204.97.208 with HTTP; Sun, 26 Sep 2010 02:31:41 -0700 (PDT) Date: Sun, 26 Sep 2010 11:31:41 +0200 Message-ID: From: David DEMELIER To: freebsd-stable Content-Type: text/plain; charset=UTF-8 Subject: Failure to burn with ahci mode. 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 09:31:46 -0000 Hi, I posted this PR : http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/146464 I can't burn with the optical drive, I can read it but not burn. When I try to use burncd it says acd0: FAILURE - MODE_SELECT_BIG timed out It's a Master: acd0 SATA revision 1.x Slave: no device present I would love to use ahci instead of ide since it's really faster. I don't know how I can help you more, FreeBSD Melon.malikania.fr 8.1-STABLE FreeBSD 8.1-STABLE #3: Sun Sep 26 10:15:10 CEST 2010 root@Melon.malikania.fr:/usr/obj/usr/src/sys/Melon amd64 Kind regards, -- Demelier David From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 09:48:47 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 761EC1065696 for ; Sun, 26 Sep 2010 09:48:47 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id B8A618FC1E for ; Sun, 26 Sep 2010 09:48:46 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id MAA14279; Sun, 26 Sep 2010 12:48:43 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1Oznqc-000HXv-Os; Sun, 26 Sep 2010 12:48:42 +0300 Message-ID: <4C9F16F9.50707@icyb.net.ua> Date: Sun, 26 Sep 2010 12:48:41 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: David DEMELIER References: In-Reply-To: 
X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-stable Subject: Re: Failure to burn with ahci mode. X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 09:48:47 -0000 on 26/09/2010 12:31 David DEMELIER said the following: > Hi, > > I posted this PR : http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/146464 > > I can't burn with the optical drive, I can read it but not burn. When > I try to use burncd it says > > acd0: FAILURE - MODE_SELECT_BIG timed out > > It's a > Master: acd0 SATA revision 1.x > Slave: no device present > > I would love to use ahci instead of ide since it's really faster. I > don't know how I can help you more, > > FreeBSD Melon.malikania.fr 8.1-STABLE FreeBSD 8.1-STABLE #3: Sun Sep > 26 10:15:10 CEST 2010 > root@Melon.malikania.fr:/usr/obj/usr/src/sys/Melon amd64 Try using ahci(4) driver. 
-- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 09:53:03 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AA0D11065670 for ; Sun, 26 Sep 2010 09:53:03 +0000 (UTC) (envelope-from demelier.david@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 379AE8FC15 for ; Sun, 26 Sep 2010 09:53:02 +0000 (UTC) Received: by bwz15 with SMTP id 15so3680783bwz.13 for ; Sun, 26 Sep 2010 02:53:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=mVRQg5xG7A8H9YAv+hYWYP0KhS6ogxJ/pwVVmV1RiOE=; b=gUD7U/n8KOhqOPo1SedpmuNvdxJMzpMtOKOro96npdkpgYL9sXQtxQkD3Q5gkJKiB5 Ba/rVfCx+1qQa1ZJ1kWnMVglExj05YJ6twWgPKZxltRpdOmSfoXc2D1R6fQTkDypnHhf dl+XqLJPtH6+i+D+TQxN0Pd+3geE2WxN6IpUQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=aDRcBGNrA789cJl1dsSW1rI/iWpXiCiaj9wXk7VnWlEgjHy+6UJIMsLept7zwWkNDv sJlJqACyVazDUrIUEm6kQf56vEXzJwzc6o0kXL8ZQJdDTlS3Mmq2/D7Az7D13o6DwdJC BnRRk1PQ4vYCKW47LWxsWjUGipkMo5L3xA98M= MIME-Version: 1.0 Received: by 10.204.39.203 with SMTP id h11mr4143976bke.8.1285494781461; Sun, 26 Sep 2010 02:53:01 -0700 (PDT) Received: by 10.204.97.208 with HTTP; Sun, 26 Sep 2010 02:53:01 -0700 (PDT) In-Reply-To: <4C9F16F9.50707@icyb.net.ua> References: <4C9F16F9.50707@icyb.net.ua> Date: Sun, 26 Sep 2010 11:53:01 +0200 Message-ID: From: David DEMELIER To: Andriy Gapon Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable Subject: Re: Failure to burn with ahci mode. 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 09:53:03 -0000 2010/9/26 Andriy Gapon : > on 26/09/2010 12:31 David DEMELIER said the following: >> Hi, >> >> I posted this PR : http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/146464 >> >> I can't burn with the optical drive, I can read it but not burn. When >> I try to use burncd it says >> >> acd0: FAILURE - MODE_SELECT_BIG timed out >> >> It's a >> Master: acd0 SATA revision 1.x >> Slave: no device present >> >> I would love to use ahci instead of ide since it's really faster. I >> don't know how I can help you more, >> >> FreeBSD Melon.malikania.fr 8.1-STABLE FreeBSD 8.1-STABLE #3: Sun Sep >> 26 10:15:10 CEST 2010 >> root@Melon.malikania.fr:/usr/obj/usr/src/sys/Melon amd64 > > Try using ahci(4) driver. > > -- > Andriy Gapon > That's what I'm doing. 
-- Demelier David From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 10:13:59 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4D508106564A for ; Sun, 26 Sep 2010 10:13:59 +0000 (UTC) (envelope-from kouk@noc.uoa.gr) Received: from msa.uoa.gr (msa.uoa.gr [195.134.100.72]) by mx1.freebsd.org (Postfix) with ESMTP id B89048FC0C for ; Sun, 26 Sep 2010 10:13:58 +0000 (UTC) Received: by MSA with id DD9E017BF5EF30723E89FF2D4ED3A21E009F7592 Received: from minipanic (cust-129-132.on4.ontelecoms.gr [92.118.129.132]) (authenticated bits=0) by msa.uoa.gr (8.14.1/8.14.1) with ESMTP id o8Q990gl021330 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT) for ; Sun, 26 Sep 2010 12:09:06 +0300 (EEST) Date: Sun, 26 Sep 2010 12:13:20 +0300 From: Konstantinos Koukopoulos To: freebsd-stable@freebsd.org Message-ID: <20100926091317.GA16268@minipanic> Mail-Followup-To: freebsd-stable@freebsd.org References: <4C9E3BB9.2040800@DataIX.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii; format=flowed Content-Disposition: inline In-Reply-To: X-Accept-Language: en, el, el_GR X-GPG-ID: 0xA22D3103 X-GPG-Fingerprint: 61C1 6B96 1BB9 1B05 4753 F822 230F 2353 A22D 3103 User-Agent: Mutt/1.5.20 (2009-06-14) Sender: kouk@noc.uoa.gr X-UoAMSAId: DD9E017BF5EF30723E89FF2D4ED3A21E009F7592 Subject: Re: Web feeds for UPDATING files X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 10:13:59 -0000 On Sat 25.Sep.10 22:25, David DEMELIER wrote: > >It's so great that it should be merge to the official FreeBSD web page! > I totally agree. This has saved me some grief and I'm sure it will others as well. Thanks Alexander! 
From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 10:40:27 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 00BA9106564A for ; Sun, 26 Sep 2010 10:40:27 +0000 (UTC) (envelope-from erikt@midgard.homeip.net) Received: from ch-smtp03.sth.basefarm.net (ch-smtp03.sth.basefarm.net [80.76.149.214]) by mx1.freebsd.org (Postfix) with ESMTP id 7BC628FC12 for ; Sun, 26 Sep 2010 10:40:26 +0000 (UTC) Received: from c83-255-61-120.bredband.comhem.se ([83.255.61.120]:27977 helo=falcon.midgard.homeip.net) by ch-smtp03.sth.basefarm.net with esmtp (Exim 4.68) (envelope-from ) id 1Ozocx-00053f-CQ for freebsd-stable@freebsd.org; Sun, 26 Sep 2010 12:38:41 +0200 Received: (qmail 36047 invoked from network); 26 Sep 2010 12:38:36 +0200 Received: from owl.midgard.homeip.net (10.1.5.7) by falcon.midgard.homeip.net with ESMTP; 26 Sep 2010 12:38:36 +0200 Received: (qmail 33654 invoked by uid 1001); 26 Sep 2010 12:38:36 +0200 Date: Sun, 26 Sep 2010 12:38:36 +0200 From: Erik Trulsson To: David DEMELIER Message-ID: <20100926103836.GA33415@owl.midgard.homeip.net> References: <4C9F16F9.50707@icyb.net.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable In-Reply-To: User-Agent: Mutt/1.5.20 (2009-06-14) X-Originating-IP: 83.255.61.120 X-Scan-Result: No virus found in message 1Ozocx-00053f-CQ. X-Scan-Signature: ch-smtp03.sth.basefarm.net 1Ozocx-00053f-CQ 96b2e58f67b67dd3650cf647d41b6a2f Cc: freebsd-stable , Andriy Gapon Subject: Re: Failure to burn with ahci mode. 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 10:40:27 -0000 On Sun, Sep 26, 2010 at 11:53:01AM +0200, David DEMELIER wrote: > 2010/9/26 Andriy Gapon : > > on 26/09/2010 12:31 David DEMELIER said the following: > >> Hi, > >> > >> I posted this PR : http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/146464 > >> > >> I can't burn with the optical drive, I can read it but not burn. When > >> I try to use burncd it says > >> > >> acd0: FAILURE - MODE_SELECT_BIG timed out > >> > >> It's a > >> Master: acd0 SATA revision 1.x > >> Slave: no device present > >> > >> I would love to use ahci instead of ide since it's really faster. I > >> don't know how I can help you more, > >> > >> FreeBSD Melon.malikania.fr 8.1-STABLE FreeBSD 8.1-STABLE #3: Sun Sep > >> 26 10:15:10 CEST 2010 > >> root@Melon.malikania.fr:/usr/obj/usr/src/sys/Melon amd64 > > > > Try using ahci(4) driver. > > > > -- > > Andriy Gapon > > > > That's what I'm doing. If you were using the ahci(4) driver the optical drive ought to show up as cd0 rather than acd0. Anyway, I think burncd will only work with the old ata(4) driver. The new ahci(4) and ada(4) drivers use the cam(4) system and look pretty much like SCSI drives to the rest of the system. Try the sysutils/cdrecord port for burning. (Should work with all SCSI and ATAPI optical drives, regardless of which driver is used.) 
-- Erik Trulsson ertr1013@student.uu.se From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 10:46:20 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 19DA51065670 for ; Sun, 26 Sep 2010 10:46:20 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 616888FC08 for ; Sun, 26 Sep 2010 10:46:19 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA14804; Sun, 26 Sep 2010 13:46:17 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1OzokL-000Hbv-3M; Sun, 26 Sep 2010 13:46:17 +0300 Message-ID: <4C9F2478.2010105@icyb.net.ua> Date: Sun, 26 Sep 2010 13:46:16 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: David DEMELIER References: <4C9F16F9.50707@icyb.net.ua> In-Reply-To: X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-stable Subject: Re: Failure to burn with ahci mode. X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 10:46:20 -0000 on 26/09/2010 12:53 David DEMELIER said the following: > 2010/9/26 Andriy Gapon : >> on 26/09/2010 12:31 David DEMELIER said the following: >>> Hi, >>> >>> I posted this PR : http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/146464 >>> >>> I can't burn with the optical drive, I can read it but not burn. 
When >>> I try to use burncd it says >>> >>> acd0: FAILURE - MODE_SELECT_BIG timed out >>> >>> It's a >>> Master: acd0 SATA revision 1.x >>> Slave: no device present >>> >>> I would love to use ahci instead of ide since it's really faster. I >>> don't know how I can help you more, >>> >>> FreeBSD Melon.malikania.fr 8.1-STABLE FreeBSD 8.1-STABLE #3: Sun Sep >>> 26 10:15:10 CEST 2010 >>> root@Melon.malikania.fr:/usr/obj/usr/src/sys/Melon amd64 >> >> Try using ahci(4) driver. >> >> -- >> Andriy Gapon >> > > That's what I'm doing. Highly unlikely. Proof? -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 17:12:19 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3FD2510656C2 for ; Sun, 26 Sep 2010 17:12:19 +0000 (UTC) (envelope-from demelier.david@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id BF8388FC1C for ; Sun, 26 Sep 2010 17:12:18 +0000 (UTC) Received: by bwz15 with SMTP id 15so3831417bwz.13 for ; Sun, 26 Sep 2010 10:12:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=83hdgW6yQr+ji4VQCPqQ1kcipq+QiiZ5JeFCPgOPvNo=; b=ePcy1s1VNprZajxroymIDcxtOICLklNAjKbO88xvSPTEyThWu70jCRT5JZtcKPtXD/ VJ/cCCkLIEk7FBOIABZCuuvngHHFrv9sqoc/+QAZyqEd+7rcebXhPza08cu4KyJrtXKl 0fN0LLDGgb+ND/y76JxtCXbtvg+UA7aYBXypA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=mLY4/Gm4pGO2/wQ9fpZak3IuLto8MKopvSDNS/0hbS5jg3An3ldyO+LwUu7/JASxKb 0MFHyiUXjXt70ZHUZ5yhSkZPXYzk+wSn1YWc7ZmiTwY2e0myfuvPREz5FB17Fr9b0eB/ JSPytuLU0r5wbK7i1Q4gEBAFZK52xAPx2cVAI= 
MIME-Version: 1.0 Received: by 10.204.133.146 with SMTP id f18mr4352832bkt.97.1285521137235; Sun, 26 Sep 2010 10:12:17 -0700 (PDT) Received: by 10.204.97.208 with HTTP; Sun, 26 Sep 2010 10:12:17 -0700 (PDT) In-Reply-To: <20100926103836.GA33415@owl.midgard.homeip.net> References: <4C9F16F9.50707@icyb.net.ua> <20100926103836.GA33415@owl.midgard.homeip.net> Date: Sun, 26 Sep 2010 19:12:17 +0200 Message-ID: From: David DEMELIER To: Erik Trulsson Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable , Andriy Gapon Subject: Re: Failure to burn with ahci mode. X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 17:12:19 -0000 2010/9/26 Erik Trulsson : > On Sun, Sep 26, 2010 at 11:53:01AM +0200, David DEMELIER wrote: >> 2010/9/26 Andriy Gapon : >> > on 26/09/2010 12:31 David DEMELIER said the following: >> >> Hi, >> >> >> >> I posted this PR : http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/146464 >> >> >> >> I can't burn with the optical drive, I can read it but not burn. When >> >> I try to use burncd it says >> >> >> >> acd0: FAILURE - MODE_SELECT_BIG timed out >> >> >> >> It's a >> >> Master: acd0 SATA revision 1.x >> >> Slave: no device present >> >> >> >> I would love to use ahci instead of ide since it's really faster. I >> >> don't know how I can help you more, >> >> >> >> FreeBSD Melon.malikania.fr 8.1-STABLE FreeBSD 8.1-STABLE #3: Sun Sep >> >> 26 10:15:10 CEST 2010 >> >> root@Melon.malikania.fr:/usr/obj/usr/src/sys/Melon amd64 >> > >> > Try using ahci(4) driver. >> > >> > -- >> > Andriy Gapon >> > >> >> That's what I'm doing. > > If you were using the ahci(4) driver the optical drive ought to show up > as cd0 rather than acd0. 
> > Anyway, I think burncd will only work with the old ata(4) driver. The > new ahci(4) and ada(4) drivers use the cam(4) system and looks pretty > much like SCSI drives to the rest of the system. > > Try the sysutils/cdrecord port for burning. (Should work with all SCSI > and ATAPI optical drives, regardless of which driver is used.) > > > > > -- > > Erik Trulsson > ertr1013@student.uu.se > No it's still acd0, do I need atapicam as device in my kernel configuration? Thanks for your answers. -- Demelier David From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 17:19:39 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AA03D1065672 for ; Sun, 26 Sep 2010 17:19:39 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id F069C8FC17 for ; Sun, 26 Sep 2010 17:19:38 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id UAA20209; Sun, 26 Sep 2010 20:19:36 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1Ozusx-000I2c-Se; Sun, 26 Sep 2010 20:19:35 +0300 Message-ID: <4C9F80A7.30002@icyb.net.ua> Date: Sun, 26 Sep 2010 20:19:35 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: David DEMELIER References: <4C9F16F9.50707@icyb.net.ua> <20100926103836.GA33415@owl.midgard.homeip.net> In-Reply-To: X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-stable Subject: Re: Failure to burn with ahci mode. 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 17:19:39 -0000 on 26/09/2010 20:12 David DEMELIER said the following: > No it's still acd0, do I need atapicam as device in my kernel configuration? No, you don't. You have to try to use the ahci(4) driver. -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 17:25:29 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2E02D1065672 for ; Sun, 26 Sep 2010 17:25:29 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta08.emeryville.ca.mail.comcast.net (qmta08.emeryville.ca.mail.comcast.net [76.96.30.80]) by mx1.freebsd.org (Postfix) with ESMTP id 0F8FE8FC16 for ; Sun, 26 Sep 2010 17:25:28 +0000 (UTC) Received: from omta14.emeryville.ca.mail.comcast.net ([76.96.30.60]) by qmta08.emeryville.ca.mail.comcast.net with comcast id BVGR1f0041HpZEsA8VRUPG; Sun, 26 Sep 2010 17:25:28 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta14.emeryville.ca.mail.comcast.net with comcast id BVRT1f0043LrwQ28aVRTWZ; Sun, 26 Sep 2010 17:25:28 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 0F7D49B418; Sun, 26 Sep 2010 10:25:27 -0700 (PDT) Date: Sun, 26 Sep 2010 10:25:27 -0700 From: Jeremy Chadwick To: David DEMELIER Message-ID: <20100926172527.GA38405@icarus.home.lan> References: <4C9F16F9.50707@icyb.net.ua> <20100926103836.GA33415@owl.midgard.homeip.net> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable , Andriy Gapon Subject: Re: Failure to burn with ahci mode. 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 17:25:29 -0000 On Sun, Sep 26, 2010 at 07:12:17PM +0200, David DEMELIER wrote: > 2010/9/26 Erik Trulsson : > > On Sun, Sep 26, 2010 at 11:53:01AM +0200, David DEMELIER wrote: > >> 2010/9/26 Andriy Gapon : > >> > on 26/09/2010 12:31 David DEMELIER said the following: > >> >> Hi, > >> >> > >> >> I posted this PR : http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/146464 > >> >> > >> >> I can't burn with the optical drive, I can read it but not burn. When > >> >> I try to use burncd it says > >> >> > >> >> acd0: FAILURE - MODE_SELECT_BIG timed out > >> >> > >> >> It's a > >> >> Master: acd0 SATA revision 1.x > >> >> Slave: no device present > >> >> > >> >> I would love to use ahci instead of ide since it's really faster. I > >> >> don't know how I can help you more, > >> >> > >> >> FreeBSD Melon.malikania.fr 8.1-STABLE FreeBSD 8.1-STABLE #3: Sun Sep > >> >> 26 10:15:10 CEST 2010 > >> >> root@Melon.malikania.fr:/usr/obj/usr/src/sys/Melon amd64 > >> > > >> > Try using ahci(4) driver. > >> > > >> > -- > >> > Andriy Gapon > >> > > >> > >> That's what I'm doing. > > > > If you were using the ahci(4) driver the optical drive ought to show up > > as cd0 rather than acd0. > > > > Anyway, I think burncd will only work with the old ata(4) driver. The > > new ahci(4) and ada(4) drivers use the cam(4) system and looks pretty > > much like SCSI drives to the rest of the system. > > > > Try the sysutils/cdrecord port for burning. (Should work with all SCSI > > and ATAPI optical drives, regardless of which driver is used.) > > > > > > > > > > -- > > > > Erik Trulsson > > ertr1013@student.uu.se > > > > No it's still acd0, do I need atapicam as device in my kernel configuration? > > Thanks for your answers. 
There's a difference between ataahci.ko ("device ataahci") and ahci.ko, in case you're confusing the two. Can you please provide your loader.conf and kernel configuration file? Thanks. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 17:37:17 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0AB85106566C for ; Sun, 26 Sep 2010 17:37:17 +0000 (UTC) (envelope-from demelier.david@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 865C18FC13 for ; Sun, 26 Sep 2010 17:37:16 +0000 (UTC) Received: by bwz15 with SMTP id 15so3841003bwz.13 for ; Sun, 26 Sep 2010 10:37:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=Hp3bqvxcaCg1gHR+8k1/t2XnAcJ87mlUZoIqL5SxrEk=; b=Mvcd8ZkBwqpoVVbUh/J1F5zVb2KKbJoqNF+SXwrgRJd3zjNIKVyrGLaoFHXQ3ehkqR o33CXXafOAIbLx0y2CvdeN7QAem4x/FL2RiTSPgAzFGOjFtQE4Tui5N/mDWT4Zkp6one iZjhdQMLWZqYMFtC5Wb9Tab0ml2uXZem9ZE50= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=d1rygxT1kXHh7d5P2ppnTS0lGqaUThuFGaMz4wFvblIEviIcSHZe7sSnKkMuMi3xtP FGyqplN6LHdxVfYOo1VirG9bFu7ob8zG1ZfFvnKrw8v3BTvslQj3ujfHo1syV1V/G6jQ oa1SBw3B7sCIsCmKtqbNJKY1HQmT0awIqeMm0= MIME-Version: 1.0 Received: by 10.204.11.13 with SMTP id r13mr4386608bkr.96.1285522635147; Sun, 26 Sep 2010 10:37:15 -0700 (PDT) Received: by 10.204.97.208 with HTTP; Sun, 26 Sep 2010 10:37:15 -0700 
(PDT) In-Reply-To: <20100926172527.GA38405@icarus.home.lan> References: <4C9F16F9.50707@icyb.net.ua> <20100926103836.GA33415@owl.midgard.homeip.net> <20100926172527.GA38405@icarus.home.lan> Date: Sun, 26 Sep 2010 19:37:15 +0200 Message-ID: From: David DEMELIER To: Jeremy Chadwick Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable , Andriy Gapon Subject: Re: Failure to burn with ahci mode. X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 17:37:17 -0000 2010/9/26 Jeremy Chadwick : > On Sun, Sep 26, 2010 at 07:12:17PM +0200, David DEMELIER wrote: >> 2010/9/26 Erik Trulsson : >> > On Sun, Sep 26, 2010 at 11:53:01AM +0200, David DEMELIER wrote: >> >> 2010/9/26 Andriy Gapon : >> >> > on 26/09/2010 12:31 David DEMELIER said the following: >> >> >> Hi, >> >> >> >> >> >> I posted this PR : http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/146464 >> >> >> >> >> >> I can't burn with the optical drive, I can read it but not burn. When >> >> >> I try to use burncd it says >> >> >> >> >> >> acd0: FAILURE - MODE_SELECT_BIG timed out >> >> >> >> >> >> It's a >> >> >> Master: acd0 SATA revision 1.x >> >> >> Slave: no device present >> >> >> >> >> >> I would love to use ahci instead of ide since it's really faster. I >> >> >> don't know how I can help you more, >> >> >> >> >> >> FreeBSD Melon.malikania.fr 8.1-STABLE FreeBSD 8.1-STABLE #3: Sun Sep >> >> >> 26 10:15:10 CEST 2010 >> >> >> root@Melon.malikania.fr:/usr/obj/usr/src/sys/Melon amd64 >> >> > >> >> > Try using ahci(4) driver. >> >> > >> >> > -- >> >> > Andriy Gapon >> >> > >> >> >> >> That's what I'm doing. >> > >> > If you were using the ahci(4) driver the optical drive ought to show up >> > as cd0 rather than acd0. 
>> > >> > Anyway, I think burncd will only work with the old ata(4) driver. The >> > new ahci(4) and ada(4) drivers use the cam(4) system and look pretty >> > much like SCSI drives to the rest of the system. >> > >> > Try the sysutils/cdrecord port for burning. (Should work with all SCSI >> > and ATAPI optical drives, regardless of which driver is used.) >> > >> > >> > >> > >> > -- >> > >> > Erik Trulsson >> > ertr1013@student.uu.se >> > >> >> No it's still acd0, do I need atapicam as device in my kernel configuration? >> >> Thanks for your answers. > > There's a difference between ataahci.ko ("device ataahci") and ahci.ko, > in case you're confusing the two. > > Can you please provide your loader.conf and kernel configuration file? > Thanks. > > -- > | Jeremy Chadwick jdc@parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, USA | > | Making life hard for others since 1977. PGP: 4BD6C0CB | > > http://files.malikania.fr/Melon and http://files.malikania.fr/loader.conf markand@Melon ~ $ sudo camcontrol devlist at scbus0 target 0 lun 0 (ada0) at scbus1 target 0 lun 0 (cd0) markand@Melon ~ $ sudo cdrecord downloads/FreeBSD-8.1-RELEASE-amd64-livefs.iso cdrecord: No write mode specified. cdrecord: Assuming -sao mode. cdrecord: If your drive does not accept -sao, try -tao. cdrecord: Future versions of cdrecord may have different drive dependent defaults. Cdrecord-ProDVD-ProBD-Clone 3.00 (amd64-unknown-freebsd8.1) Copyright (C) 1995-2010 Jörg Schilling cdrecord: Error 0. Cannot open or use SCSI driver. cdrecord: For possible targets try 'cdrecord -scanbus'. 
cdrecord: For possible transport specifiers try 'cdrecord dev=help'. markand@Melon ~ $ sudo cdrecord -scanbus Cdrecord-ProDVD-ProBD-Clone 3.00 (amd64-unknown-freebsd8.1) Copyright (C) 1995-2010 Jörg Schilling cdrecord: Error 0. Cannot open or use SCSI driver. cdrecord: For possible targets try 'cdrecord -scanbus'. cdrecord: For possible transport specifiers try 'cdrecord dev=help'. Cheers -- Demelier David From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 17:47:39 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B7746106564A for ; Sun, 26 Sep 2010 17:47:39 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta10.westchester.pa.mail.comcast.net (qmta10.westchester.pa.mail.comcast.net [76.96.62.17]) by mx1.freebsd.org (Postfix) with ESMTP id 5EF3F8FC14 for ; Sun, 26 Sep 2010 17:47:38 +0000 (UTC) Received: from omta02.westchester.pa.mail.comcast.net ([76.96.62.19]) by qmta10.westchester.pa.mail.comcast.net with comcast id BViK1f0010QuhwU5AVnf1K; Sun, 26 Sep 2010 17:47:39 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta02.westchester.pa.mail.comcast.net with comcast id BVne1f0033LrwQ23NVneX6; Sun, 26 Sep 2010 17:47:39 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id E6FC89B418; Sun, 26 Sep 2010 10:47:36 -0700 (PDT) Date: Sun, 26 Sep 2010 10:47:36 -0700 From: Jeremy Chadwick To: David DEMELIER Message-ID: <20100926174736.GA38750@icarus.home.lan> References: <4C9F16F9.50707@icyb.net.ua> <20100926103836.GA33415@owl.midgard.homeip.net> <20100926172527.GA38405@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable , Andriy Gapon Subject: Re: Failure to burn with ahci mode. 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 17:47:39 -0000 On Sun, Sep 26, 2010 at 07:37:15PM +0200, David DEMELIER wrote: > 2010/9/26 Jeremy Chadwick : > > On Sun, Sep 26, 2010 at 07:12:17PM +0200, David DEMELIER wrote: > >> 2010/9/26 Erik Trulsson : > >> > On Sun, Sep 26, 2010 at 11:53:01AM +0200, David DEMELIER wrote: > >> >> 2010/9/26 Andriy Gapon : > >> >> > on 26/09/2010 12:31 David DEMELIER said the following: > >> >> >> Hi, > >> >> >> > >> >> >> I posted this PR : http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/146464 > >> >> >> > >> >> >> I can't burn with the optical drive, I can read it but not burn. When > >> >> >> I try to use burncd it says > >> >> >> > >> >> >> acd0: FAILURE - MODE_SELECT_BIG timed out > >> >> >> > >> >> >> It's a > >> >> >> Master: acd0 SATA revision 1.x > >> >> >> Slave: no device present > [...] > >> > If you were using the ahci(4) driver the optical drive ought to show up > >> > as cd0 rather than acd0. > [...] > markand@Melon ~ $ sudo camcontrol devlist > at scbus0 target 0 lun 0 (ada0) > at scbus1 target 0 lun 0 (cd0) Your initial post consists of indication that you're using classic ata(4) (you can see it as a result of acd0 being in use, not cd0, and the output you provided came from atacontrol), followed by statements that you absolutely are using ahci(4) (which prompted Andriy's comments), followed by what appears to be you actually using ahci(4). A mistake was made and/or fixed/addressed between comments, or something is being falsified. Can you shed some light on this? The only reason I mention it is that it makes debugging/troubleshooting basically impossible when there's a complete mismatch in information. It makes it difficult to trust problem reports when something keeps changing. Let us know. Thanks! 
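For reference, the device name alone is a reasonable first hint of which stack an optical drive is attached to: acdN comes from the classic ata(4) ATAPI driver, cdN from the CAM cd(4) driver. A minimal sh sketch of that rule follows. The function name is made up for illustration, and it only looks at the name, so it is no substitute for checking dmesg or camcontrol devlist.

```shell
#!/bin/sh
# classify_optical NAME
# Hypothetical helper: guess the driver stack from an optical device name.
#   acdN -> classic ata(4) ATAPI CD driver
#   cdN  -> CAM cd(4) driver (e.g. behind ahci(4))
classify_optical() {
    case "$1" in
        acd[0-9]*) echo "ata" ;;
        cd[0-9]*)  echo "cam" ;;
        *)         echo "unknown" ;;
    esac
}

classify_optical acd0   # prints "ata"
classify_optical cd0    # prints "cam"
```

Anything that is neither acdN nor cdN, such as ada0, is reported as "unknown" rather than guessed at.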
-- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 17:53:01 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B3B38106564A for ; Sun, 26 Sep 2010 17:53:01 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 003678FC08 for ; Sun, 26 Sep 2010 17:53:00 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id UAA20650; Sun, 26 Sep 2010 20:52:55 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1OzvPC-000I5n-R2; Sun, 26 Sep 2010 20:52:54 +0300 Message-ID: <4C9F8876.1040007@icyb.net.ua> Date: Sun, 26 Sep 2010 20:52:54 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: David DEMELIER References: <4C9F16F9.50707@icyb.net.ua> <20100926103836.GA33415@owl.midgard.homeip.net> <20100926172527.GA38405@icarus.home.lan> In-Reply-To: X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Cc: freebsd-stable , Jeremy Chadwick Subject: Re: Failure to burn with ahci mode. 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 17:53:01 -0000 on 26/09/2010 20:37 David DEMELIER said the following: > markand@Melon ~ $ sudo camcontrol devlist > at scbus0 target 0 lun 0 (ada0) > at scbus1 target 0 lun 0 (cd0) > > markand@Melon ~ $ sudo cdrecord downloads/FreeBSD-8.1-RELEASE-amd64-livefs.iso > cdrecord: No write mode specified. > cdrecord: Assuming -sao mode. > cdrecord: If your drive does not accept -sao, try -tao. > cdrecord: Future versions of cdrecord may have different drive > dependent defaults. > Cdrecord-ProDVD-ProBD-Clone 3.00 (amd64-unknown-freebsd8.1) Copyright > (C) 1995-2010 Jörg Schilling > cdrecord: Error 0. Cannot open or use SCSI driver. > cdrecord: For possible targets try 'cdrecord -scanbus'. > cdrecord: For possible transport specifiers try 'cdrecord dev=help'. > > markand@Melon ~ $ sudo cdrecord -scanbus > Cdrecord-ProDVD-ProBD-Clone 3.00 (amd64-unknown-freebsd8.1) Copyright > (C) 1995-2010 Jörg Schilling > cdrecord: Error 0. Cannot open or use SCSI driver. > cdrecord: For possible targets try 'cdrecord -scanbus'. > cdrecord: For possible transport specifiers try 'cdrecord dev=help'. Doesn't look like you have xpt and pass devices in your kernel, which are (almost) mandatory when using CAM-based drivers. Not sure if ahci(4) mentions this. 
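A rough way to check a custom kernel configuration file for the CAM pieces, sketched in sh. It assumes the usual "device NAME" syntax and that xpt comes in with "device scbus", as in GENERIC. The function name and the sample file are made up for illustration, and anything loaded as a module at boot would not show up in this check.

```shell
#!/bin/sh
# missing_cam_devices FILE
# Hypothetical helper: print each CAM-related device that has no
# uncommented "device NAME" line in the given kernel config file.
missing_cam_devices() {
    for dev in scbus cd pass; do
        grep -Eq "^[[:space:]]*device[[:space:]]+${dev}([[:space:]]|#|\$)" "$1" ||
            echo "$dev"
    done
}

# Example with a made-up config in which pass(4) is commented out.
conf=$(mktemp)
printf 'device\t\tscbus\ndevice\t\tcd\n#device\t\tpass\n' > "$conf"
missing_cam_devices "$conf"    # prints "pass"
rm -f "$conf"
```

An empty result only means the three lines are present in the file that was checked, not that the running kernel was built from it.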
-- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 18:20:30 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D41C2106564A for ; Sun, 26 Sep 2010 18:20:30 +0000 (UTC) (envelope-from demelier.david@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 5B98A8FC12 for ; Sun, 26 Sep 2010 18:20:30 +0000 (UTC) Received: by bwz15 with SMTP id 15so3857711bwz.13 for ; Sun, 26 Sep 2010 11:20:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=yOvKuj6XwDuHd6SGmOC5QZU8Q6AmmR+svjJ4gWD9urY=; b=JDdMPdKqhng/iKhDzAgL4wEs7//lPhqvaVZbmxZKluJzXCKKHanYdHb7pQaaXDnza0 w5glsLqe/aeNI1HCuoz2uY05oJ+w8aVmPceLjvlT+tPezUTI1ujI1H4ZxcL/WnMrZKTA wb7V0tVLSYEd7bQc1kveHhwPMz2DjwVquMK4U= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=hpZnfVJRcmZNasKL0QVNXacSFaWrBWuO/c7s0WV28NyZ7K/tO/ECqAOaIlgC7RR1CR 8fWo/+eB5nuZRiNKiN6z2GC20f21oD9m8jKFvMGQ303e2NkWxrqUPFm6A/mBbmZFcQBz 1Yt7Ttd02oTi35drmwY0t9EYnMjpS4yqAEP1U= MIME-Version: 1.0 Received: by 10.204.11.13 with SMTP id r13mr4422063bkr.96.1285525229176; Sun, 26 Sep 2010 11:20:29 -0700 (PDT) Received: by 10.204.97.208 with HTTP; Sun, 26 Sep 2010 11:20:29 -0700 (PDT) In-Reply-To: <4C9F8876.1040007@icyb.net.ua> References: <4C9F16F9.50707@icyb.net.ua> <20100926103836.GA33415@owl.midgard.homeip.net> <20100926172527.GA38405@icarus.home.lan> <4C9F8876.1040007@icyb.net.ua> Date: Sun, 26 Sep 2010 20:20:29 +0200 Message-ID: From: David DEMELIER To: Andriy Gapon Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 
quoted-printable Cc: freebsd-stable , Jeremy Chadwick Subject: Re: Failure to burn with ahci mode. X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 18:20:30 -0000 2010/9/26 Andriy Gapon : > on 26/09/2010 20:37 David DEMELIER said the following: >> markand@Melon ~ $ sudo camcontrol devlist >> at scbus0 target 0 lun 0 (ada0) >> at scbus1 target 0 lun 0 (cd0) >> >> markand@Melon ~ $ sudo cdrecord downloads/FreeBSD-8.1-RELEASE-amd64-livefs.iso >> cdrecord: No write mode specified. >> cdrecord: Assuming -sao mode. >> cdrecord: If your drive does not accept -sao, try -tao. >> cdrecord: Future versions of cdrecord may have different drive >> dependent defaults. >> Cdrecord-ProDVD-ProBD-Clone 3.00 (amd64-unknown-freebsd8.1) Copyright >> (C) 1995-2010 Jörg Schilling >> cdrecord: Error 0. Cannot open or use SCSI driver. >> cdrecord: For possible targets try 'cdrecord -scanbus'. >> cdrecord: For possible transport specifiers try 'cdrecord dev=help'. >> >> markand@Melon ~ $ sudo cdrecord -scanbus >> Cdrecord-ProDVD-ProBD-Clone 3.00 (amd64-unknown-freebsd8.1) Copyright >> (C) 1995-2010 Jörg Schilling >> cdrecord: Error 0. Cannot open or use SCSI driver. >> cdrecord: For possible targets try 'cdrecord -scanbus'. >> cdrecord: For possible transport specifiers try 'cdrecord dev=help'. > > Doesn't look like you have xpt and pass devices in your kernel, which are > (almost) mandatory when using CAM-based drivers. > Not sure if ahci(4) mentions this. > > -- > Andriy Gapon > Everything works now… Sorry for the noise ! We can close the PR too. 
-- Demelier David From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 18:30:52 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2F8AC106564A for ; Sun, 26 Sep 2010 18:30:52 +0000 (UTC) (envelope-from demelier.david@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id AA06A8FC0A for ; Sun, 26 Sep 2010 18:30:51 +0000 (UTC) Received: by bwz15 with SMTP id 15so3861610bwz.13 for ; Sun, 26 Sep 2010 11:30:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=YUsjDhdI99etoAlsnLShSl9EEXbjVAK469qVRuKq7Kg=; b=kjnf/+ZvTyzmiYK8xqciCnS8U1/yTBdxs8RUuNzIdOPQ07hepEvrH5Cw2TaV3VATqy ccRIHkmIzRa7f4AuadyT5hDs3jce9tKnf4b+WsQ9W31nNyBPR/fXCIW1NrYiKM6o95hT N6T3xeV/VDG1iW8wyY42XtmrOM14oydpcOl7g= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=ebE7B3Gu6FIKPsdAwZJ/V5sAYXKasUIVpYFJHhwtTSJHkevnktD35mmHg+AXAB15Zb gs7zofAV6ILjpoZQUQAx01lWu3cyssTn/hLOi3yPO30IRVNk1j6OIf3cOKV7AanVq+MR D8pwzKSjF9dwgCGVak9dv4daalDkDZLmL/u3I= MIME-Version: 1.0 Received: by 10.204.98.74 with SMTP id p10mr4468152bkn.84.1285525850382; Sun, 26 Sep 2010 11:30:50 -0700 (PDT) Received: by 10.204.97.208 with HTTP; Sun, 26 Sep 2010 11:30:50 -0700 (PDT) In-Reply-To: References: <4C9F16F9.50707@icyb.net.ua> <20100926103836.GA33415@owl.midgard.homeip.net> <20100926172527.GA38405@icarus.home.lan> <4C9F8876.1040007@icyb.net.ua> Date: Sun, 26 Sep 2010 20:30:50 +0200 Message-ID: From: David DEMELIER To: Andriy Gapon Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: 
freebsd-stable , Jeremy Chadwick Subject: Re: Failure to burn with ahci mode. X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 18:30:52 -0000 2010/9/26 David DEMELIER : > 2010/9/26 Andriy Gapon : >> on 26/09/2010 20:37 David DEMELIER said the following: >>> markand@Melon ~ $ sudo camcontrol devlist >>> at scbus0 target 0 lun 0 (ada0) >>> at scbus1 target 0 lun 0 (cd0) >>> >>> markand@Melon ~ $ sudo cdrecord downloads/FreeBSD-8.1-RELEASE-amd64-livefs.iso >>> cdrecord: No write mode specified. >>> cdrecord: Assuming -sao mode. >>> cdrecord: If your drive does not accept -sao, try -tao. >>> cdrecord: Future versions of cdrecord may have different drive >>> dependent defaults. >>> Cdrecord-ProDVD-ProBD-Clone 3.00 (amd64-unknown-freebsd8.1) Copyright >>> (C) 1995-2010 Jörg Schilling >>> cdrecord: Error 0. Cannot open or use SCSI driver. >>> cdrecord: For possible targets try 'cdrecord -scanbus'. >>> cdrecord: For possible transport specifiers try 'cdrecord dev=help'. >>> >>> markand@Melon ~ $ sudo cdrecord -scanbus >>> Cdrecord-ProDVD-ProBD-Clone 3.00 (amd64-unknown-freebsd8.1) Copyright >>> (C) 1995-2010 Jörg Schilling >>> cdrecord: Error 0. Cannot open or use SCSI driver. >>> cdrecord: For possible targets try 'cdrecord -scanbus'. >>> cdrecord: For possible transport specifiers try 'cdrecord dev=help'. >> >> Doesn't look like you have xpt and pass devices in your kernel, which are >> (almost) mandatory when using CAM-based drivers. >> Not sure if ahci(4) mentions this. >> >> -- >> Andriy Gapon >> > > Everything works now… Sorry for the noise ! > > We can close the PR too. 
> > -- > Demelier David > or not, markand@Melon ~ $ sudo cdrecord downloads/FreeBSD-8.1-RELEASE-i386-livefs.iso cdrecord: No write mode specified. cdrecord: Assuming -sao mode. cdrecord: If your drive does not accept -sao, try -tao. cdrecord: Future versions of cdrecord may have different drive dependent defaults. Cdrecord-ProDVD-ProBD-Clone 3.00 (amd64-unknown-freebsd8.1) Copyright (C) 1995-2010 Jörg Schilling Using libscg version 'schily-0.9'. No target specified, trying to find one... Using dev=1,0,0. Device type : Removable CD-ROM Version : 0 Response Format: 2 Capabilities : Vendor_info : 'hp ' Identifikation : 'DVDRAM GT20L ' Revision : 'DC05' Device seems to be: Generic mmc2 DVD-R/DVD-RW/DVD-RAM. Using generic SCSI-3/mmc CD-R/CD-RW driver (mmc_cdr). Driver flags : MMC-3 SWABAUDIO BURNFREE Supported modes: TAO PACKET SAO SAO/R96P SAO/R96R RAW/R16 RAW/R96P RAW/R96R LAYER_JUMP Starting to write CD/DVD/BD at speed 24 in real SAO mode for single session. Last chance to quit, starting real write 0 seconds. Operation starts. cdrecord: WARNING: Drive returns wrong startsec (0) using -150 cdrecord: Input/output error. write_g1: scsi sendcmd: retryable error CDB: 2A 00 FF FF FF 6A 00 00 20 00 status: 0x2 (CHECK CONDITION) Sense Bytes: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Sense Key: 0xFFFFFFFF [], Segment 0 Sense Code: 0x00 Qual 0x00 (no additional sense information) Fru 0x0 Sense flags: Blk 0 (not valid) cmd finished after 11.961s timeout 200s write track pad data: error after 0 bytes BFree: 597 K BSize: 597 K cdrecord: Input/output error. 
write_g1: scsi sendcmd: retryable error CDB: 2A 00 00 00 00 00 00 00 20 00 status: 0x2 (CHECK CONDITION) Sense Bytes: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 Sense Key: 0xFFFFFFFF [], Segment 0 Sense Code: 0x00 Qual 0x00 (no additional sense information) Fru 0x0 Sense flags: Blk 0 (not valid) cmd finished after 0.001s timeout 200s write track data: error after 0 bytes cdrecord: A write error occured. cdrecord: Please properly read the error message above. I will roll back to the old working ata driver. -- Demelier David From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 21:57:35 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 25889106564A for ; Sun, 26 Sep 2010 21:57:35 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2-6.sentex.ca [IPv6:2607:f3e0:80:80::2]) by mx1.freebsd.org (Postfix) with ESMTP id A19DD8FC15 for ; Sun, 26 Sep 2010 21:57:34 +0000 (UTC) Received: from lava.sentex.ca (pyroxene.sentex.ca [199.212.134.18]) by smarthost2.sentex.ca (8.14.4/8.14.4) with ESMTP id o8QLvSMu027258 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sun, 26 Sep 2010 17:57:28 -0400 (EDT) (envelope-from mike@sentex.net) Received: from mdt-xp.sentex.net (simeon.sentex.ca [192.168.43.27]) by lava.sentex.ca (8.14.4/8.14.4) with ESMTP id o8QLvR0L012171; Sun, 26 Sep 2010 17:57:27 -0400 (EDT) (envelope-from mike@sentex.net) Message-Id: <201009262157.o8QLvR0L012171@lava.sentex.ca> X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9 Date: Sun, 26 Sep 2010 17:57:20 -0400 To: Jack Vogel From: Mike Tancsa In-Reply-To: References: <201006102031.o5AKVCH2016467@lava.sentex.ca> <201007021739.o62HdMOU092319@lava.sentex.ca> <20100702193654.GD10862@michelle.cdnetworks.com> <201008162107.o7GL76pA080191@lava.sentex.ca> 
<20100817185208.GA6482@michelle.cdnetworks.com> <201008171955.o7HJt67T087902@lava.sentex.ca> <20100817200020.GE6482@michelle.cdnetworks.com> <201009141759.o8EHxcZ0013539@lava.sentex.ca> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed X-Scanned-By: MIMEDefang 2.67 on 205.211.164.50 Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org Subject: Re: RELENG_7 em problems (and RELENG_8) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 21:57:35 -0000 At 06:36 PM 9/24/2010, Jack Vogel wrote: >There is a new revision of the em driver coming next week, it's going through some >stress pounding over the weekend, if no issues show up I'll put it into HEAD. > >Yongari's changes in TX context handling which affects checksum and tso >are added. I've also decided that multiple queues in 82574 just are a source >of problems without a lot of benefit, so it still uses MSIX but with >only 3 vectors, >meaning it separates TX and RX but has a single queue. Thanks, looking forward to trying it out! With respect to the multiple queues, I thought the driver already used just the one on RELENG_8 ? If not, is there a way to force the existing driver to use just the one queue ? 
On the box that has the NIC locking up, it shows em1@pci0:9:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086 rev=0x00 hdr=0x00 vendor = 'Intel Corporation' device = 'Intel 82574L Gigabit Ethernet Controller (82574L)' class = network subclass = ethernet cap 01[c8] = powerspec 2 supports D0 D3 current D0 cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) and vmstat -i shows irq256: em0 5129063 353 irq257: em1 531251 36 in a wedged state, stats look like dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.0.5 dev.em.1.%driver: em dev.em.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PEX4.HART dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 subdevice=0x34ec class=0x020000 dev.em.1.%parent: pci9 dev.em.1.nvm: -1 dev.em.1.rx_int_delay: 0 dev.em.1.tx_int_delay: 66 dev.em.1.rx_abs_int_delay: 66 dev.em.1.tx_abs_int_delay: 66 dev.em.1.rx_processing_limit: 100 dev.em.1.link_irq: 0 dev.em.1.mbuf_alloc_fail: 0 dev.em.1.cluster_alloc_fail: 0 dev.em.1.dropped: 0 dev.em.1.tx_dma_fail: 0 dev.em.1.fc_high_water: 18432 dev.em.1.fc_low_water: 16932 dev.em.1.mac_stats.excess_coll: 0 dev.em.1.mac_stats.symbol_errors: 0 dev.em.1.mac_stats.sequence_errors: 0 dev.em.1.mac_stats.defer_count: 0 dev.em.1.mac_stats.missed_packets: 41522 dev.em.1.mac_stats.recv_no_buff: 19 dev.em.1.mac_stats.recv_errs: 0 dev.em.1.mac_stats.crc_errs: 0 dev.em.1.mac_stats.alignment_errs: 0 dev.em.1.mac_stats.coll_ext_errs: 0 dev.em.1.mac_stats.rx_overruns: 41398 dev.em.1.mac_stats.watchdog_timeouts: 0 dev.em.1.mac_stats.xon_recvd: 0 dev.em.1.mac_stats.xon_txd: 0 dev.em.1.mac_stats.xoff_recvd: 0 dev.em.1.mac_stats.xoff_txd: 0 dev.em.1.mac_stats.total_pkts_recvd: 95229129 dev.em.1.mac_stats.good_pkts_recvd: 95187607 dev.em.1.mac_stats.bcast_pkts_recvd: 79244 dev.em.1.mac_stats.mcast_pkts_recvd: 0 dev.em.1.mac_stats.rx_frames_64: 93680 dev.em.1.mac_stats.rx_frames_65_127: 1516349 
dev.em.1.mac_stats.rx_frames_128_255: 4464941 dev.em.1.mac_stats.rx_frames_256_511: 4024 dev.em.1.mac_stats.rx_frames_512_1023: 2096067 dev.em.1.mac_stats.rx_frames_1024_1522: 87012546 dev.em.1.mac_stats.good_octets_recvd: 0 dev.em.1.mac_stats.good_octest_txd: 0 dev.em.1.mac_stats.total_pkts_txd: 66775098 dev.em.1.mac_stats.good_pkts_txd: 66775098 dev.em.1.mac_stats.bcast_pkts_txd: 509 dev.em.1.mac_stats.mcast_pkts_txd: 7 dev.em.1.mac_stats.tx_frames_64: 48038472 dev.em.1.mac_stats.tx_frames_65_127: 13402833 dev.em.1.mac_stats.tx_frames_128_255: 5324413 dev.em.1.mac_stats.tx_frames_256_511: 957 dev.em.1.mac_stats.tx_frames_512_1023: 319 dev.em.1.mac_stats.tx_frames_1024_1522: 8104 dev.em.1.mac_stats.tso_txd: 1069 dev.em.1.mac_stats.tso_ctx_fail: 0 dev.em.1.interrupts.asserts: 0 dev.em.1.interrupts.rx_pkt_timer: 0 dev.em.1.interrupts.rx_abs_timer: 0 dev.em.1.interrupts.tx_pkt_timer: 0 dev.em.1.interrupts.tx_abs_timer: 0 dev.em.1.interrupts.tx_queue_empty: 0 dev.em.1.interrupts.tx_queue_min_thresh: 0 dev.em.1.interrupts.rx_desc_min_thresh: 0 dev.em.1.interrupts.rx_overrun: 0 dev.em.1.host.breaker_tx_pkt: 0 dev.em.1.host.host_tx_pkt_discard: 0 dev.em.1.host.rx_pkt: 0 dev.em.1.host.breaker_rx_pkts: 0 dev.em.1.host.breaker_rx_pkt_drop: 0 dev.em.1.host.tx_good_pkt: 0 dev.em.1.host.breaker_tx_pkt_drop: 0 dev.em.1.host.rx_good_bytes: 0 dev.em.1.host.tx_good_bytes: 0 dev.em.1.host.length_errors: 0 dev.em.1.host.serdes_violation_pkt: 0 dev.em.1.host.header_redir_missed: 0 ifconfig down/up just panics or locks up the box when its in this state. I also have IPMI enabled on this nic, but it shows the same issue with it disabled. 
---Mike -------------------------------------------------------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet since 1994 www.sentex.net Cambridge, Ontario Canada www.sentex.net/mike From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 22:19:12 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 02465106564A for ; Sun, 26 Sep 2010 22:19:12 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 69DD88FC12 for ; Sun, 26 Sep 2010 22:19:11 +0000 (UTC) Received: by wwc33 with SMTP id 33so5438866wwc.31 for ; Sun, 26 Sep 2010 15:19:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=Miore+x1PodUlnqGcdMoq2tC2GZVcssmHMkrtN7Z5QI=; b=u8iXjP474mCqi3KFKhBoqxASBOa0KOz/p+DtZ6iNfpYj3qaw7k+lBgBg9Ed2ODhc7p dU4JOqrFvTySMUpmN1rX9XwMYT92xDmALT9RY7lAtqQrKHEcAsDFUvZrISgv0BD7S/2L K3J5tLRVSP+T+p1VKMM8x1wNjPFWi+2T4/LqA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=lqYpCm0xudYd1hNcPfMNedsuI9c1H1nXFl0hIl7pIv7fqF4DZgcevcv0t67dQ/4lHo eeBR8r0VaptW60YFzEQp/OHN5vCAKehJRT+q6eTYme1zgP+w4y0dpIkU4s3CiCXOYtL9 BaWRzeNq7t2w6YGBfnPDaklb3Ly39PMpQ1Ih0= MIME-Version: 1.0 Received: by 10.216.2.141 with SMTP id 13mr12049498wef.84.1285539550196; Sun, 26 Sep 2010 15:19:10 -0700 (PDT) Received: by 10.216.48.20 with HTTP; Sun, 26 Sep 2010 15:19:10 -0700 (PDT) In-Reply-To: <201009262157.o8QLvR0L012171@lava.sentex.ca> References: <201006102031.o5AKVCH2016467@lava.sentex.ca> <201007021739.o62HdMOU092319@lava.sentex.ca> 
<20100702193654.GD10862@michelle.cdnetworks.com> <201008162107.o7GL76pA080191@lava.sentex.ca> <20100817185208.GA6482@michelle.cdnetworks.com> <201008171955.o7HJt67T087902@lava.sentex.ca> <20100817200020.GE6482@michelle.cdnetworks.com> <201009141759.o8EHxcZ0013539@lava.sentex.ca> <201009262157.o8QLvR0L012171@lava.sentex.ca> Date: Sun, 26 Sep 2010 15:19:10 -0700 Message-ID: From: Jack Vogel To: Mike Tancsa Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org Subject: Re: RELENG_7 em problems (and RELENG_8) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 22:19:12 -0000 The NIC has 5 MSIX vectors and can have 2 queues. I have been trying to release code with both queues active, but it's been unstable; I finally concluded it's not worth the aggravation :) Your em1 is using MSI, not MSIX, and thus can't have multiple queues. I'm not sure what's broken from what you show here. I will try to get the new driver out shortly for you to try. Jack On Sun, Sep 26, 2010 at 2:57 PM, Mike Tancsa wrote: > At 06:36 PM 9/24/2010, Jack Vogel wrote: > >> There is a new revision of the em driver coming next week, it's going through >> some >> stress pounding over the weekend, if no issues show up I'll put it into >> HEAD. >> >> Yongari's changes in TX context handling which affects checksum and tso >> are added. I've also decided that multiple queues in 82574 just are a >> source >> of problems without a lot of benefit, so it still uses MSIX but with only >> 3 vectors, >> meaning it separates TX and RX but has a single queue. >> > > Thanks, looking forward to trying it out! With respect to the multiple > queues, I thought the driver already used just the one on RELENG_8 ? 
If > not, is there a way to force the existing driver to use just the one queue ? > > On the box that has the NIC locking up, it shows > > em1@pci0:9:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086 rev=0x00 > hdr=0x00 > > vendor = 'Intel Corporation' > device = 'Intel 82574L Gigabit Ethernet Controller (82574L)' > class = network > subclass = ethernet > cap 01[c8] = powerspec 2 supports D0 D3 current D0 > cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message > cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) > > and > > vmstat -i shows > > irq256: em0 5129063 353 > irq257: em1 531251 36 > > in a wedged state, stats look like > > dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.0.5 > dev.em.1.%driver: em > dev.em.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PEX4.HART > dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 > subdevice=0x34ec class=0x020000 > dev.em.1.%parent: pci9 > dev.em.1.nvm: -1 > dev.em.1.rx_int_delay: 0 > dev.em.1.tx_int_delay: 66 > dev.em.1.rx_abs_int_delay: 66 > dev.em.1.tx_abs_int_delay: 66 > dev.em.1.rx_processing_limit: 100 > dev.em.1.link_irq: 0 > dev.em.1.mbuf_alloc_fail: 0 > dev.em.1.cluster_alloc_fail: 0 > dev.em.1.dropped: 0 > dev.em.1.tx_dma_fail: 0 > dev.em.1.fc_high_water: 18432 > dev.em.1.fc_low_water: 16932 > dev.em.1.mac_stats.excess_coll: 0 > dev.em.1.mac_stats.symbol_errors: 0 > dev.em.1.mac_stats.sequence_errors: 0 > dev.em.1.mac_stats.defer_count: 0 > dev.em.1.mac_stats.missed_packets: 41522 > dev.em.1.mac_stats.recv_no_buff: 19 > dev.em.1.mac_stats.recv_errs: 0 > dev.em.1.mac_stats.crc_errs: 0 > dev.em.1.mac_stats.alignment_errs: 0 > dev.em.1.mac_stats.coll_ext_errs: 0 > dev.em.1.mac_stats.rx_overruns: 41398 > dev.em.1.mac_stats.watchdog_timeouts: 0 > dev.em.1.mac_stats.xon_recvd: 0 > dev.em.1.mac_stats.xon_txd: 0 > dev.em.1.mac_stats.xoff_recvd: 0 > dev.em.1.mac_stats.xoff_txd: 0 > dev.em.1.mac_stats.total_pkts_recvd: 95229129 > dev.em.1.mac_stats.good_pkts_recvd: 
95187607 > dev.em.1.mac_stats.bcast_pkts_recvd: 79244 > dev.em.1.mac_stats.mcast_pkts_recvd: 0 > dev.em.1.mac_stats.rx_frames_64: 93680 > dev.em.1.mac_stats.rx_frames_65_127: 1516349 > dev.em.1.mac_stats.rx_frames_128_255: 4464941 > dev.em.1.mac_stats.rx_frames_256_511: 4024 > dev.em.1.mac_stats.rx_frames_512_1023: 2096067 > dev.em.1.mac_stats.rx_frames_1024_1522: 87012546 > dev.em.1.mac_stats.good_octets_recvd: 0 > dev.em.1.mac_stats.good_octest_txd: 0 > dev.em.1.mac_stats.total_pkts_txd: 66775098 > dev.em.1.mac_stats.good_pkts_txd: 66775098 > dev.em.1.mac_stats.bcast_pkts_txd: 509 > dev.em.1.mac_stats.mcast_pkts_txd: 7 > dev.em.1.mac_stats.tx_frames_64: 48038472 > dev.em.1.mac_stats.tx_frames_65_127: 13402833 > dev.em.1.mac_stats.tx_frames_128_255: 5324413 > dev.em.1.mac_stats.tx_frames_256_511: 957 > dev.em.1.mac_stats.tx_frames_512_1023: 319 > dev.em.1.mac_stats.tx_frames_1024_1522: 8104 > dev.em.1.mac_stats.tso_txd: 1069 > dev.em.1.mac_stats.tso_ctx_fail: 0 > dev.em.1.interrupts.asserts: 0 > dev.em.1.interrupts.rx_pkt_timer: 0 > dev.em.1.interrupts.rx_abs_timer: 0 > dev.em.1.interrupts.tx_pkt_timer: 0 > dev.em.1.interrupts.tx_abs_timer: 0 > dev.em.1.interrupts.tx_queue_empty: 0 > dev.em.1.interrupts.tx_queue_min_thresh: 0 > dev.em.1.interrupts.rx_desc_min_thresh: 0 > dev.em.1.interrupts.rx_overrun: 0 > dev.em.1.host.breaker_tx_pkt: 0 > dev.em.1.host.host_tx_pkt_discard: 0 > dev.em.1.host.rx_pkt: 0 > dev.em.1.host.breaker_rx_pkts: 0 > dev.em.1.host.breaker_rx_pkt_drop: 0 > dev.em.1.host.tx_good_pkt: 0 > dev.em.1.host.breaker_tx_pkt_drop: 0 > dev.em.1.host.rx_good_bytes: 0 > dev.em.1.host.tx_good_bytes: 0 > dev.em.1.host.length_errors: 0 > dev.em.1.host.serdes_violation_pkt: 0 > dev.em.1.host.header_redir_missed: 0 > > ifconfig down/up just panics or locks up the box when its in this state. I > also have IPMI enabled on this nic, but it shows the same issue with it > disabled. 
> > ---Mike > > > > -------------------------------------------------------------------- > Mike Tancsa, tel +1 519 651 3400 > Sentex Communications, mike@sentex.net > Providing Internet since 1994 www.sentex.net > Cambridge, Ontario Canada www.sentex.net/mike > > From owner-freebsd-stable@FreeBSD.ORG Sun Sep 26 23:43:52 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E8456106566B for ; Sun, 26 Sep 2010 23:43:52 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2-6.sentex.ca [IPv6:2607:f3e0:80:80::2]) by mx1.freebsd.org (Postfix) with ESMTP id 89CE78FC15 for ; Sun, 26 Sep 2010 23:43:52 +0000 (UTC) Received: from lava.sentex.ca (pyroxene.sentex.ca [199.212.134.18]) by smarthost2.sentex.ca (8.14.4/8.14.4) with ESMTP id o8QNhiB6029892 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sun, 26 Sep 2010 19:43:44 -0400 (EDT) (envelope-from mike@sentex.net) Received: from mdt-xp.sentex.net (simeon.sentex.ca [192.168.43.27]) by lava.sentex.ca (8.14.4/8.14.4) with ESMTP id o8QNhgDG012676; Sun, 26 Sep 2010 19:43:42 -0400 (EDT) (envelope-from mike@sentex.net) Message-Id: <201009262343.o8QNhgDG012676@lava.sentex.ca> X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9 Date: Sun, 26 Sep 2010 19:43:03 -0400 To: Jack Vogel From: Mike Tancsa In-Reply-To: References: <201006102031.o5AKVCH2016467@lava.sentex.ca> <201007021739.o62HdMOU092319@lava.sentex.ca> <20100702193654.GD10862@michelle.cdnetworks.com> <201008162107.o7GL76pA080191@lava.sentex.ca> <20100817185208.GA6482@michelle.cdnetworks.com> <201008171955.o7HJt67T087902@lava.sentex.ca> <20100817200020.GE6482@michelle.cdnetworks.com> <201009141759.o8EHxcZ0013539@lava.sentex.ca> <201009262157.o8QLvR0L012171@lava.sentex.ca> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed X-Scanned-By: MIMEDefang 2.67 on 205.211.164.50 
Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org Subject: Re: RELENG_7 em problems (and RELENG_8) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 26 Sep 2010 23:43:53 -0000 At 06:19 PM 9/26/2010, Jack Vogel wrote: >Your em1 is using MSI not MSIX and thus can't have multiple queues. I'm >not sure what's broken from what you show here. I will try to get the new >driver out shortly for you to try. With this particular NIC, it will wedge under high load. I tried 2 different motherboards and chipsets, with the same behaviour. ---Mike >Jack > > >On Sun, Sep 26, 2010 at 2:57 PM, Mike Tancsa ><mike@sentex.net> wrote: >At 06:36 PM 9/24/2010, Jack Vogel wrote: >There is a new revision of the em driver coming next week, it's going thru some >stress pounding over the weekend, if no issues show up I'll put it into HEAD. > >Yongari's changes in TX context handling which affect checksum and tso >are added. I've also decided that multiple queues in 82574 just are a source >of problems without a lot of benefit, so it still uses MSIX but with >only 3 vectors, >meaning it separates TX and RX but has a single queue. > > >Thanks, looking forward to trying it out! With respect to the >multiple queues, I thought the driver already used just the one on >RELENG_8 ? If not, is there a way to force the existing driver to >use just the one queue ? 
> > ---Mike > > > >-------------------------------------------------------------------- >Mike Tancsa, tel +1 519 651 3400 >Sentex >Communications, >mike@sentex.net >Providing Internet since >1994 www.sentex.net >Cambridge, Ontario >Canada www.sentex.net/mike > -------------------------------------------------------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet since 1994 www.sentex.net Cambridge, Ontario Canada www.sentex.net/mike From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 00:00:23 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F29841065670 for ; Mon, 27 Sep 2010 00:00:23 +0000 (UTC) (envelope-from jfvogel@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 7D3268FC12 for ; Mon, 27 Sep 2010 00:00:22 +0000 (UTC) Received: by wyb33 with SMTP id 33so5665881wyb.13 for ; Sun, 26 Sep 2010 17:00:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=6y8KE2gFvHZafQhj8znBGYq8Z4Yssuc1FZ2hrL0Q+ok=; b=JGdtx+eQ0Mmg/wFU09gnL9NZQTWVos9OBoEtHx0tyNytvxZupv95LLxFmkTUx79x31 f0gpdzpMcCzpKFuf3gecCkxGwX9+qL3KGwe6iIIl9DL2ZK6ctLPQ2VMdIolrmj8KYsLJ QKcUvkOxuFTlyZJyb6nFRSA5Qf2ensAPRKLGM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=sid8FPM9neSsvDo16xZnVLAqPZ1EkOX12EBlpFbbF/R7as0+lb/yUtuSNk0rvGE1rs GLu5rrrF6bwo6ZyXLHrPwibBbuPjRtGejlXfUkUsSe38wtmdA7GPjwAtLg0IPHdqbghO Jh1Sbz1TLF3f69Q6QAKkQ3HHBeS9TyrSsQNs4= MIME-Version: 1.0 Received: by 10.227.152.18 with SMTP id e18mr5719427wbw.1.1285545621740; Sun, 26 Sep 2010 17:00:21 -0700 (PDT) Received: by 
10.216.48.20 with HTTP; Sun, 26 Sep 2010 17:00:21 -0700 (PDT) In-Reply-To: <201009262343.o8QNhgDG012676@lava.sentex.ca> References: <201006102031.o5AKVCH2016467@lava.sentex.ca> <201007021739.o62HdMOU092319@lava.sentex.ca> <20100702193654.GD10862@michelle.cdnetworks.com> <201008162107.o7GL76pA080191@lava.sentex.ca> <20100817185208.GA6482@michelle.cdnetworks.com> <201008171955.o7HJt67T087902@lava.sentex.ca> <20100817200020.GE6482@michelle.cdnetworks.com> <201009141759.o8EHxcZ0013539@lava.sentex.ca> <201009262157.o8QLvR0L012171@lava.sentex.ca> <201009262343.o8QNhgDG012676@lava.sentex.ca> Date: Sun, 26 Sep 2010 17:00:21 -0700 Message-ID: From: Jack Vogel To: Mike Tancsa Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org Subject: Re: RELENG_7 em problems (and RELENG_8) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 00:00:24 -0000 The system I've had stress tests running on has 82574 LOMs, so I hope it will solve the problem; I will see tomorrow morning how things have held up... Jack On Sun, Sep 26, 2010 at 4:43 PM, Mike Tancsa wrote: > At 06:19 PM 9/26/2010, Jack Vogel wrote: > >> Your em1 is using MSI not MSIX and thus can't have multiple queues. I'm >> not sure what's broken from what you show here. I will try to get the new >> driver out shortly for you to try. >> > > With this particular NIC, it will wedge under high load. I tried 2 > different motherboards and chipsets, with the same behaviour. 
> > ---Mike > > > Jack >> >> >> >> On Sun, Sep 26, 2010 at 2:57 PM, Mike Tancsa < >> mike@sentex.net> wrote: >> At 06:36 PM 9/24/2010, Jack Vogel wrote: >> There is a new revision of the em driver coming next week, its going thru >> some >> stress pounding over the weekend, if no issues show up I'll put it into >> HEAD. >> >> Yongari's changes in TX context handling which effects checksum and tso >> are added. I've also decided that multiple queues in 82574 just are a >> source >> of problems without a lot of benefit, so it still uses MSIX but with only >> 3 vectors, >> meaning it seperates TX and RX but has a single queue. >> >> >> Thanks, looking forward to trying it out! With respect to the multiple >> queues, I thought the driver already used just the one on RELENG_8 ? If >> not, is there a way to force the existing driver to use just the one queue ? >> >> On the box that has the NIC locking up, it shows >> >> em1@pci0:9:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086 rev=0x00 >> hdr=0x00 >> >> vendor = 'Intel Corporation' >> device = 'Intel 82574L Gigabit Ethernet Controller (82574L)' >> class = network >> subclass = ethernet >> cap 01[c8] = powerspec 2 supports D0 D3 current D0 >> cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message >> cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) >> >> and >> >> vmstat -i shows >> >> irq256: em0 5129063 353 >> irq257: em1 531251 36 >> >> in a wedged state, stats look like >> >> dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.0.5 >> dev.em.1.%driver: em >> dev.em.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PEX4.HART >> dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 >> subdevice=0x34ec class=0x020000 >> dev.em.1.%parent: pci9 >> dev.em.1.nvm: -1 >> dev.em.1.rx_int_delay: 0 >> dev.em.1.tx_int_delay: 66 >> dev.em.1.rx_abs_int_delay: 66 >> dev.em.1.tx_abs_int_delay: 66 >> dev.em.1.rx_processing_limit: 100 >> dev.em.1.link_irq: 0 >> dev.em.1.mbuf_alloc_fail: 0 >> 
dev.em.1.cluster_alloc_fail: 0 >> dev.em.1.dropped: 0 >> dev.em.1.tx_dma_fail: 0 >> dev.em.1.fc_high_water: 18432 >> dev.em.1.fc_low_water: 16932 >> dev.em.1.mac_stats.excess_coll: 0 >> dev.em.1.mac_stats.symbol_errors: 0 >> dev.em.1.mac_stats.sequence_errors: 0 >> dev.em.1.mac_stats.defer_count: 0 >> dev.em.1.mac_stats.missed_packets: 41522 >> dev.em.1.mac_stats.recv_no_buff: 19 >> dev.em.1.mac_stats.recv_errs: 0 >> dev.em.1.mac_stats.crc_errs: 0 >> dev.em.1.mac_stats.alignment_errs: 0 >> dev.em.1.mac_stats.coll_ext_errs: 0 >> dev.em.1.mac_stats.rx_overruns: 41398 >> dev.em.1.mac_stats.watchdog_timeouts: 0 >> dev.em.1.mac_stats.xon_recvd: 0 >> dev.em.1.mac_stats.xon_txd: 0 >> dev.em.1.mac_stats.xoff_recvd: 0 >> dev.em.1.mac_stats.xoff_txd: 0 >> dev.em.1.mac_stats.total_pkts_recvd: 95229129 >> dev.em.1.mac_stats.good_pkts_recvd: 95187607 >> dev.em.1.mac_stats.bcast_pkts_recvd: 79244 >> dev.em.1.mac_stats.mcast_pkts_recvd: 0 >> dev.em.1.mac_stats.rx_frames_64: 93680 >> dev.em.1.mac_stats.rx_frames_65_127: 1516349 >> dev.em.1.mac_stats.rx_frames_128_255: 4464941 >> dev.em.1.mac_stats.rx_frames_256_511: 4024 >> dev.em.1.mac_stats.rx_frames_512_1023: 2096067 >> dev.em.1.mac_stats.rx_frames_1024_1522: 87012546 >> dev.em.1.mac_stats.good_octets_recvd: 0 >> dev.em.1.mac_stats.good_octest_txd: 0 >> dev.em.1.mac_stats.total_pkts_txd: 66775098 >> dev.em.1.mac_stats.good_pkts_txd: 66775098 >> dev.em.1.mac_stats.bcast_pkts_txd: 509 >> dev.em.1.mac_stats.mcast_pkts_txd: 7 >> dev.em.1.mac_stats.tx_frames_64: 48038472 >> dev.em.1.mac_stats.tx_frames_65_127: 13402833 >> dev.em.1.mac_stats.tx_frames_128_255: 5324413 >> dev.em.1.mac_stats.tx_frames_256_511: 957 >> dev.em.1.mac_stats.tx_frames_512_1023: 319 >> dev.em.1.mac_stats.tx_frames_1024_1522: 8104 >> dev.em.1.mac_stats.tso_txd: 1069 >> dev.em.1.mac_stats.tso_ctx_fail: 0 >> dev.em.1.interrupts.asserts: 0 >> dev.em.1.interrupts.rx_pkt_timer: 0 >> dev.em.1.interrupts.rx_abs_timer: 0 >> dev.em.1.interrupts.tx_pkt_timer: 0 >> 
dev.em.1.interrupts.tx_abs_timer: 0 >> dev.em.1.interrupts.tx_queue_empty: 0 >> dev.em.1.interrupts.tx_queue_min_thresh: 0 >> dev.em.1.interrupts.rx_desc_min_thresh: 0 >> dev.em.1.interrupts.rx_overrun: 0 >> dev.em.1.host.breaker_tx_pkt: 0 >> dev.em.1.host.host_tx_pkt_discard: 0 >> dev.em.1.host.rx_pkt: 0 >> dev.em.1.host.breaker_rx_pkts: 0 >> dev.em.1.host.breaker_rx_pkt_drop: 0 >> dev.em.1.host.tx_good_pkt: 0 >> dev.em.1.host.breaker_tx_pkt_drop: 0 >> dev.em.1.host.rx_good_bytes: 0 >> dev.em.1.host.tx_good_bytes: 0 >> dev.em.1.host.length_errors: 0 >> dev.em.1.host.serdes_violation_pkt: 0 >> dev.em.1.host.header_redir_missed: 0 >> >> ifconfig down/up just panics or locks up the box when its in this state. >> I also have IPMI enabled on this nic, but it shows the same issue with it >> disabled. >> >> ---Mike >> >> >> >> -------------------------------------------------------------------- >> Mike Tancsa, tel +1 519 651 3400 >> Sentex Communications, mike@sentex.net >> Providing Internet since 1994 >> www.sentex.net >> Cambridge, Ontario Canada < >> http://www.sentex.net/mike>www.sentex.net/mike >> >> > -------------------------------------------------------------------- > Mike Tancsa, tel +1 519 651 3400 > Sentex Communications, mike@sentex.net > Providing Internet since 1994 www.sentex.net > Cambridge, Ontario Canada www.sentex.net/mike > > From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 08:04:14 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8A808106564A for ; Mon, 27 Sep 2010 08:04:14 +0000 (UTC) (envelope-from smithi@nimnet.asn.au) Received: from sola.nimnet.asn.au (paqi.nimnet.asn.au [115.70.110.159]) by mx1.freebsd.org (Postfix) with ESMTP id 0266C8FC25 for ; Mon, 27 Sep 2010 08:04:13 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by sola.nimnet.asn.au (8.14.2/8.14.2) with ESMTP id o8R84APt033650; Mon, 27 Sep 2010 
18:04:10 +1000 (EST) (envelope-from smithi@nimnet.asn.au) Date: Mon, 27 Sep 2010 18:04:09 +1000 (EST) From: Ian Smith To: Vitaly Magerya In-Reply-To: <4C9DB6F5.6010305@gmail.com> Message-ID: <20100927170317.I90633@sola.nimnet.asn.au> References: <20100224165203.GA10423@zod.isi.edu> <20100225152711.M16250@sola.nimnet.asn.au> <20100226013551.GA67689@zod.isi.edu> <20100922181029.D11124@sola.nimnet.asn.au> <20100922171008.GA92070@zod.isi.edu> <20100925181038.T11124@sola.nimnet.asn.au> <4C9DB6F5.6010305@gmail.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: freebsd-stable@freebsd.org, Ted Faber Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 08:04:14 -0000 On Sat, 25 Sep 2010, Vitaly Magerya wrote: > Ian Smith wrote: > > On Wed, 22 Sep 2010, Ted Faber wrote: > > > > > Sorry, Ian, I don't have anything new, wrt the ATA. > > > > Thanks Ted. Interesting that nobody else seems to have run into this > > issue, must be a (some?) Thinkpads thing .. > > FWIW, my Thinkpad T40 does the same thing on 8.1: after resume from S3 > it's unusably slow (it takes seconds between a key is pressed and it's > displayed on screen), and then after a while it works ok. That's interesting; since my original report last December I discovered that same behaviour with 8.0-R. During the 60s resume stall period, iff I'd suspended from a VTY, I found I could slowly (like maybe 3 seconds per character echoed) type a command, and some commands - possibly those cached? as there's no HD access - would run after another few seconds. 
In this way I discovered that 'date' commands reported the time some seconds after the resume (perhaps hours ago, or yesterday) until the stall ended, disk light flashed and normality resumed, sometimes with "calcru: time went backwards .." messages, most often for devd. Since upgrading to 8.1-STABLE that clue? has gone; nothing typed is echoed. Are you referring to 8.1-RELEASE or to 8-STABLE as at some date? > This has been like this in 7.0 too (except I don't know if it ever > recovered the speed; I remember shutting it down as soon as I saw how > slow it is). That's a difference then; 7.0-R then 7.2-STABLE (late December, anyway) had no such issues here on my T23. When it clears up after a wet week and I have some spare power again I'll try building a debug kernel, perhaps omitting and kldloading USB, and do some more tests before reporting further, probably in mobile@ and acpi@ again. I'll copy you and Ted when I do so. Thanks, Ian From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 09:01:55 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 095E0106566B for ; Mon, 27 Sep 2010 09:01:55 +0000 (UTC) (envelope-from smithi@nimnet.asn.au) Received: from sola.nimnet.asn.au (paqi.nimnet.asn.au [115.70.110.159]) by mx1.freebsd.org (Postfix) with ESMTP id 7F7B18FC1C for ; Mon, 27 Sep 2010 09:01:54 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by sola.nimnet.asn.au (8.14.2/8.14.2) with ESMTP id o8R91qsY037037; Mon, 27 Sep 2010 19:01:53 +1000 (EST) (envelope-from smithi@nimnet.asn.au) Date: Mon, 27 Sep 2010 19:01:52 +1000 (EST) From: Ian Smith To: Vitaly Magerya In-Reply-To: <20100927170317.I90633@sola.nimnet.asn.au> Message-ID: <20100927185838.R90633@sola.nimnet.asn.au> References: <20100224165203.GA10423@zod.isi.edu> <20100225152711.M16250@sola.nimnet.asn.au> <20100226013551.GA67689@zod.isi.edu> 
<20100922181029.D11124@sola.nimnet.asn.au> <20100922171008.GA92070@zod.isi.edu> <20100925181038.T11124@sola.nimnet.asn.au> <4C9DB6F5.6010305@gmail.com> <20100927170317.I90633@sola.nimnet.asn.au> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: Ted Faber , freebsd-stable@freebsd.org Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 09:01:55 -0000 On Mon, 27 Sep 2010, Ian Smith wrote: > In this way I discovered that 'date' commands reported the time some > seconds after the resume (perhaps hours ago, or yesterday) until the Sorry, that should say 'seconds after the _suspend_', not the resume. > stall ended, disk light flashed and normality resumed, sometimes with > "calcru: time went backwards .." messages, most often for devd. Since > upgrading to 8.1-STABLE that clue? has gone; nothing typed is echoed. 
cheers, Ian From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 11:23:52 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 12C911065675 for ; Mon, 27 Sep 2010 11:23:52 +0000 (UTC) (envelope-from serguey-grigoriev@yandex.ru) Received: from forward2.mail.yandex.net (forward2.mail.yandex.net [77.88.46.7]) by mx1.freebsd.org (Postfix) with ESMTP id B6DB58FC12 for ; Mon, 27 Sep 2010 11:23:51 +0000 (UTC) Received: from web44.yandex.ru (web44.yandex.ru [77.88.47.183]) by forward2.mail.yandex.net (Yandex) with ESMTP id A2F7D367807C for ; Mon, 27 Sep 2010 15:23:49 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1285586629; bh=BYYr2eCGMCIA84iudl/FfMgIBIS1lEvab2HvrjjbX9g=; h=From:To:Subject:MIME-Version:Message-Id:Date: Content-Transfer-Encoding:Content-Type; b=Fh+26rTilRQoIW4RRfxzTP9M9m/fB9Wbsv0pg1PZK9HW62voL0J2dBodAT3iqrmHR 7/G4mIT+6TZG4RpGwo9k4UBTgVh7Ws8dGpKX/Ofmnu6orswDZkqIgwjIjG3Y4bPB/+ gVUIOQsVVD/EZzAZFnUegS5Jxk1G+YUwsi3KLqBw= Received: from localhost (localhost.localdomain [127.0.0.1]) by web44.yandex.ru (Yandex) with ESMTP id 9E1DE9A0211 for ; Mon, 27 Sep 2010 15:23:49 +0400 (MSD) X-Yandex-Spam: 0 X-Yandex-Front: web44.yandex.ru X-Yandex-TimeMark: 1285586629 Received: from netman.spbcity.net (netman.spbcity.net [77.244.18.5]) by mail.yandex.ru with HTTP; Mon, 27 Sep 2010 15:23:48 +0400 From: S.N.Grigoriev To: freebsd-stable@freebsd.org MIME-Version: 1.0 Message-Id: <78821285586629@web44.yandex.ru> Date: Mon, 27 Sep 2010 15:23:48 +0400 X-Mailer: Yamail [ http://yandex.ru ] 5.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain Subject: snd_hda: how to duplicate output X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 
Sep 2010 11:23:52 -0000 Hi list, this is an excerpt from the 8.1R detailed Release Notes: 2.2.2.1 Multimedia Support [snipped] The snd_hda(4) driver now supports multichannel (4.0 and 7.1) playback support. The 5.1 mode support is disabled now due to unidentified synchronization problem. Devices which supports the 7.1 mode can handle the 5.1 operation via software upmix done by sound(4). Note that stereo stream is no longer duplicated to all ports. How can I restore the old behaviour, in which the stereo stream is duplicated to the black and grey sound card output connectors? Thanks, Serguey. From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 11:45:12 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B419010656E0 for ; Mon, 27 Sep 2010 11:45:12 +0000 (UTC) (envelope-from luke@digital-crocus.com) Received: from mail.digital-crocus.com (node2.digital-crocus.com [91.209.244.128]) by mx1.freebsd.org (Postfix) with ESMTP id 3A6078FC17 for ; Mon, 27 Sep 2010 11:45:11 +0000 (UTC) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=dkselector; d=hybrid-logic.co.uk; h=Received:Received:Subject:From:Reply-To:To:Content-Type:Organization:Date:Message-ID:Mime-Version:X-Mailer:X-Spam-Score:X-Digital-Crocus-Maillimit:X-Authenticated-Sender:X-Complaints:X-Admin:X-Abuse; b=Hju8T57E4Jx8AwSra5PXJY1zzV2KznnAcibm5UhvaTXgkxUvcjiMjzwHbq0UwqdX7Kxty9HHY5sI6UYM8My4r5ZZh7ufSsrttQnC0loA0JWezcmKzvbp+440Wz2jEUew; Received: from luke by mail.digital-crocus.com with local (Exim 4.69 (FreeBSD)) (envelope-from ) id 1P0C4F-00048k-NF for stable@freebsd.org; Mon, 27 Sep 2010 12:40:23 +0100 Received: from office.moo.com ([83.244.232.179] helo=[10.0.0.69]) by mail.digital-crocus.com with esmtpa (Exim 4.69 (FreeBSD)) (envelope-from ) id 1P0C4F-00048b-E7 for stable@freebsd.org; Mon, 27 Sep 2010 12:40:23 +0100 From: Luke Marsden To: stable@freebsd.org Content-Type: multipart/mixed; 
boundary="=-GXYOe9B75dxSqdOgXpqB" Organization: Hybrid Web Cluster Date: Mon, 27 Sep 2010 12:45:10 +0100 Message-ID: <1285587910.31122.633.camel@pow> Mime-Version: 1.0 X-Mailer: Evolution 2.28.3 X-Spam-Score: -0.7 X-Digital-Crocus-Maillimit: done X-Authenticated-Sender: luke X-Complaints: abuse@digital-crocus.com X-Admin: admin@digital-crocus.com X-Abuse: abuse@digital-crocus.com (Please include full headers in abuse reports) X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Subject: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: luke@hybrid-logic.co.uk List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 11:45:12 -0000 --=-GXYOe9B75dxSqdOgXpqB Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit Hi FreeBSD-stable, I'm having problems booting 8.1R on a KVM virtualised host backed on AMD hardware. It works flawlessly on Intel backed KVM. Please find attached the message I get on boot. This loops endlessly. Can anyone give me any advice on how to start tracking this down? I'm happy to give developers access to some paid instances on ElasticHosts where this problem occurs, if it helps debugging it. (It works fine in ElasticHosts' lon-p and sat-p data centres, but not in lon-b, and Intel vs. AMD is apparently the difference). -- Best Regards, Luke Marsden Hybrid Logic Ltd. 
Web: http://www.hybrid-cluster.com/ Hybrid Web Cluster - cloud web hosting based on FreeBSD and ZFS Mobile: +447791750420 --=-GXYOe9B75dxSqdOgXpqB-- From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 12:24:51 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 339A51065672 for ; Mon, 27 Sep 2010 12:24:51 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 7E8C28FC1F for ; Mon, 27 Sep 2010 12:24:50 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id PAA08511; Mon, 27 Sep 2010 15:24:45 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA08D0D.4030406@icyb.net.ua> Date: Mon, 27 Sep 2010 15:24:45 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: luke@hybrid-logic.co.uk References: <1285587910.31122.633.camel@pow> In-Reply-To: <1285587910.31122.633.camel@pow> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, Luke Marsden Subject: Re: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 12:24:51 -0000 on 27/09/2010 14:45 Luke Marsden said the following: > Hi FreeBSD-stable, > > I'm having problems booting 8.1R on a KVM virtualised host backed on AMD > hardware. It works flawlessly on Intel backed KVM. Please find attached ------------------------------------------------------------^^^^^^^^^^^^^ I tried but I can't. 
Maybe post a link? > the message I get on boot. This loops endlessly. > > Can anyone give me any advice on how to start tracking this down? I'm > happy to give developers access to some paid instances on ElasticHosts > where this problem occurs, if it helps debugging it. > > (It works fine in ElasticHosts' lon-p and sat-p data centres, but not in > lon-b, and Intel vs. AMD is apparently the difference). -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 13:53:58 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0F03B106566B for ; Mon, 27 Sep 2010 13:53:58 +0000 (UTC) (envelope-from luke@digital-crocus.com) Received: from mail.digital-crocus.com (node2.digital-crocus.com [91.209.244.128]) by mx1.freebsd.org (Postfix) with ESMTP id B85DB8FC12 for ; Mon, 27 Sep 2010 13:53:57 +0000 (UTC) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=dkselector; d=hybrid-logic.co.uk; h=Received:Received:Subject:From:Reply-To:To:In-Reply-To:References:Content-Type:Organization:Date:Message-ID:Mime-Version:X-Mailer:Content-Transfer-Encoding:X-Spam-Score:X-Digital-Crocus-Maillimit:X-Authenticated-Sender:X-Complaints:X-Admin:X-Abuse; b=lX2sgaQg3A8lps/5qAM9HXk8E3YSXcHhIz/P1RT5iWAr2LiYksTalxBdQIO5XPuwtp0jHl3G//9qCNuTxfTnqQQZ6kJbF8qUtmgHR2eAlzTL5Xsr7azZK7q5qPntVOtP; Received: from luke by mail.digital-crocus.com with local (Exim 4.69 (FreeBSD)) (envelope-from ) id 1P0E4q-000CMx-U5 for stable@freebsd.org; Mon, 27 Sep 2010 14:49:08 +0100 Received: from office.moo.com ([83.244.232.179] helo=[10.0.0.69]) by mail.digital-crocus.com with esmtpa (Exim 4.69 (FreeBSD)) (envelope-from ) id 1P0E4n-000CMK-1X for stable@freebsd.org; Mon, 27 Sep 2010 14:49:05 +0100 From: Luke Marsden To: stable@freebsd.org In-Reply-To: <4CA08D0D.4030406@icyb.net.ua> References: <1285587910.31122.633.camel@pow> <4CA08D0D.4030406@icyb.net.ua> Content-Type: text/plain; charset="UTF-8" 
Organization: Hybrid Web Cluster Date: Mon, 27 Sep 2010 14:53:51 +0100 Message-ID: <1285595631.31122.809.camel@pow> Mime-Version: 1.0 X-Mailer: Evolution 2.28.3 Content-Transfer-Encoding: 7bit X-Spam-Score: -1.0 X-Digital-Crocus-Maillimit: done X-Authenticated-Sender: luke X-Complaints: abuse@digital-crocus.com X-Admin: admin@digital-crocus.com X-Abuse: abuse@digital-crocus.com (Please include full headers in abuse reports) Cc: Subject: Re: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: luke@hybrid-logic.co.uk List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 13:53:58 -0000 On Mon, 2010-09-27 at 15:24 +0300, Andriy Gapon wrote: > on 27/09/2010 14:45 Luke Marsden said the following: > > Hi FreeBSD-stable, > > > > I'm having problems booting 8.1R on a KVM virtualised host backed on AMD > > hardware. It works flawlessly on Intel backed KVM. Please find attached > ------------------------------------------------------------^^^^^^^^^^^^^ > I tried but I can't. > Maybe post a link? Sorry, I guess the mailing list filters out attachments. Here you go: http://lukemarsden.net/8.1R-KVM-AMD-failure.png > > the message I get on boot. This loops endlessly. > > > > Can anyone give me any advice on how to start tracking this down? I'm > > happy to give developers access to some paid instances on ElasticHosts > > where this problem occurs, if it helps debugging it. > > > > (It works fine in ElasticHosts' lon-p and sat-p data centres, but not in > > lon-b, and Intel vs. AMD is apparently the difference). > -- Best Regards, Luke Marsden Hybrid Logic Ltd. 
Web: http://www.hybrid-cluster.com/ Hybrid Web Cluster - cloud web hosting based on FreeBSD and ZFS Mobile: +447791750420 From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 14:12:46 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D3DDB106564A for ; Mon, 27 Sep 2010 14:12:46 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 244368FC29 for ; Mon, 27 Sep 2010 14:12:45 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA10681; Mon, 27 Sep 2010 17:12:43 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA0A65A.90505@icyb.net.ua> Date: Mon, 27 Sep 2010 17:12:42 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: luke@hybrid-logic.co.uk References: <1285587910.31122.633.camel@pow> <4CA08D0D.4030406@icyb.net.ua> <1285595631.31122.809.camel@pow> In-Reply-To: <1285595631.31122.809.camel@pow> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, Luke Marsden Subject: Re: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 14:12:46 -0000 on 27/09/2010 16:53 Luke Marsden said the following: > On Mon, 2010-09-27 at 15:24 +0300, Andriy Gapon wrote: >> on 27/09/2010 14:45 Luke Marsden said the following: >>> Hi FreeBSD-stable, >>> >>> I'm having problems booting 8.1R on a KVM virtualised host backed on AMD >>> hardware. 
It works flawlessly on Intel backed KVM. Please find attached >> ------------------------------------------------------------^^^^^^^^^^^^^ >> I tried but I can't. >> Maybe post a link? > > Sorry, I guess the mailing list filters out attachments. Here you go: > > http://lukemarsden.net/8.1R-KVM-AMD-failure.png Do you have DDB+KDB options in your kernel? Are you able to capture any output before panic happens? Perhaps via serial console. P.S. how many CPUs are configured on the systems? Both total physical in hardware and visible to FreeBSD. -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 14:16:07 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E068C106566C for ; Mon, 27 Sep 2010 14:16:07 +0000 (UTC) (envelope-from nickolasbug@gmail.com) Received: from mail-ey0-f182.google.com (mail-ey0-f182.google.com [209.85.215.182]) by mx1.freebsd.org (Postfix) with ESMTP id 70F5F8FC08 for ; Mon, 27 Sep 2010 14:16:07 +0000 (UTC) Received: by eyx24 with SMTP id 24so1497494eyx.13 for ; Mon, 27 Sep 2010 07:16:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=lkYELPmcYqEk0KM09+j+fp0xn/q8cZrwTZd9rMaNOpo=; b=SwFaGnxRJWLYRkGobbaX3sX+WlNi1oGxUb5CDcfMVT8J/8zSLP2BaILnL9Z2rhcQj+ TDzxJAhZXcKyCCobpF3J+y3+l4UBrrHww31YRlVbkSSfZWqnWUflXUU4mph1c9H6DnSI 3MMf10cBuX1f6zSL/A+4ZXV5yfi5r3W0OhVw0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=gMBJaN9IYjIG4VPpyWPaST8+RNtA+tR3Am3gD5sJE1VdUNqqoWZtAq/hMdos4QM1bP F0qE8fpy5oRycWPVkZExpkX4fWc8JQrVH4JpuExnRTcWPdh7csoudYRbPiOwght8EOum voTiYwXhUZ8HOED7mpoMLsVlWdQXvjAj1Vwo8= MIME-Version: 1.0 Received: by 10.213.32.135 with SMTP id 
c7mr6141096ebd.2.1285596966109; Mon, 27 Sep 2010 07:16:06 -0700 (PDT) Received: by 10.213.113.142 with HTTP; Mon, 27 Sep 2010 07:16:06 -0700 (PDT) In-Reply-To: <1285595631.31122.809.camel@pow> References: <1285587910.31122.633.camel@pow> <4CA08D0D.4030406@icyb.net.ua> <1285595631.31122.809.camel@pow> Date: Mon, 27 Sep 2010 17:16:06 +0300 Message-ID: From: nickolasbug@gmail.com To: luke@hybrid-logic.co.uk Content-Type: text/plain; charset=ISO-8859-1 Cc: stable@freebsd.org Subject: Re: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 14:16:08 -0000 > http://lukemarsden.net/8.1R-KVM-AMD-failure.png This picture is useless. No debug symbols == no useful information. 1. Please, build your kernel with debug symbols. Your kernel configuration file should contain string makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols 2. 
Show kgdb output, as described here: http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug-gdb.html From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 14:55:53 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 856C01065672 for ; Mon, 27 Sep 2010 14:55:53 +0000 (UTC) (envelope-from mavbsd@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 0A7F58FC16 for ; Mon, 27 Sep 2010 14:55:52 +0000 (UTC) Received: by bwz15 with SMTP id 15so4455597bwz.13 for ; Mon, 27 Sep 2010 07:55:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:sender:message-id:date:from :user-agent:mime-version:to:subject:references:in-reply-to :x-enigmail-version:content-type:content-transfer-encoding; bh=6wngrMuKh8b2pZCbVGueCIJpSrs1grR3jWU1/7Yc9qo=; b=uHpD3m03mJD+DzcSm9/h3+uBbGTw/SFlRoBwTf8tC3Rh/CcLmgyu5UnqC0fIJeYhGg fEIzDC5tyqsZRrxU+Z4Q0TvnnXWY8l78a4GpJXyP2h72a9tPyU2q1JOf54k6935IJH8c SkEOCoVUipOrvFk8AgGD7fkJc/7idAK12ziKQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=sender:message-id:date:from:user-agent:mime-version:to:subject :references:in-reply-to:x-enigmail-version:content-type :content-transfer-encoding; b=ukRPISVRY7P/CgkuATCIsglGYOx5ofp2ih9R6o6V3lfEc1f6PsfldrUlN9ETC54d6d WijRZ4Yi0xd3cQ+rvx3cPuMzvUC6o7gBlJJK/tq901u9xrqrUuYYEXZpYlyQnkjc9eob SwcLyXm5URpIzKD2QFf1NKzpdZnPrfZzkzpPg= Received: by 10.204.98.66 with SMTP id p2mr5301099bkn.178.1285599351845; Mon, 27 Sep 2010 07:55:51 -0700 (PDT) Received: from mavbook.mavhome.dp.ua (pc.mavhome.dp.ua [212.86.226.226]) by mx.google.com with ESMTPS id 24sm4484853bkr.19.2010.09.27.07.55.49 (version=SSLv3 cipher=RC4-MD5); Mon, 27 Sep 2010 07:55:50 -0700 (PDT) Sender: Alexander Motin Message-ID: <4CA0B070.8080906@FreeBSD.org> Date: Mon, 27 Sep 
2010 17:55:44 +0300
From: Alexander Motin
User-Agent: Thunderbird 2.0.0.24 (X11/20100402)
MIME-Version: 1.0
To: "S.N.Grigoriev" , FreeBSD Stable
References:
In-Reply-To:
X-Enigmail-Version: 0.96.0
Content-Type: text/plain; charset=KOI8-R
Content-Transfer-Encoding: 7bit
Cc:
Subject: Re: snd_hda: how to duplicate output
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 27 Sep 2010 14:55:53 -0000

S.N.Grigoriev wrote:
> this is an excerpt from 8.1R detailed Release Notes:
>
> 2.2.2.1 Multimedia Support
> [snipped]
> The snd_hda(4) driver now supports multichannel (4.0 and 7.1)
> playback support. The 5.1 mode support is disabled now due to
> unidentified synchronization problem. Devices which supports
> the 7.1 mode can handle the 5.1 operation via software upmix
> done by sound(4). Note that stereo stream is no longer duplicated
> to all ports.
>
> I'm interested in how I can restore the old behaviour, when
> stereo stream is duplicated to the black and grey sound card
> output connectors?

The only way is to change channel mapping (chmap array) for stereo
streams in hdac_stream_setup() function.
-- Alexander Motin From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 15:29:29 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5F3A51065670 for ; Mon, 27 Sep 2010 15:29:29 +0000 (UTC) (envelope-from luke@digital-crocus.com) Received: from mail.digital-crocus.com (node2.digital-crocus.com [91.209.244.128]) by mx1.freebsd.org (Postfix) with ESMTP id 167748FC20 for ; Mon, 27 Sep 2010 15:29:28 +0000 (UTC) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=dkselector; d=hybrid-logic.co.uk; h=Received:Received:Subject:From:Reply-To:To:Cc:In-Reply-To:References:Content-Type:Organization:Date:Message-ID:Mime-Version:X-Mailer:Content-Transfer-Encoding:X-Spam-Score:X-Digital-Crocus-Maillimit:X-Authenticated-Sender:X-Complaints:X-Admin:X-Abuse; b=VS62ra2fZNgmApMkNZBzdicnAIYWjP2zgwYwU+aC9aDwlK+Iq2Y81yWS4UC1TFqIuxQC2rccNcx1AeH7cNE1BSa7hGWzwXDZGvQPEeKycOYHATS7Tel19C5BqVMayAlj; Received: from luke by mail.digital-crocus.com with local (Exim 4.69 (FreeBSD)) (envelope-from ) id 1P0FZI-000Dk4-IC for stable@freebsd.org; Mon, 27 Sep 2010 16:24:40 +0100 Received: from office.moo.com ([83.244.232.179] helo=[10.0.0.69]) by mail.digital-crocus.com with esmtpa (Exim 4.69 (FreeBSD)) (envelope-from ) id 1P0FZI-000Djy-7G; Mon, 27 Sep 2010 16:24:40 +0100 From: Luke Marsden To: stable@freebsd.org In-Reply-To: References: <1285587910.31122.633.camel@pow> <4CA08D0D.4030406@icyb.net.ua> <1285595631.31122.809.camel@pow> Content-Type: text/plain; charset="UTF-8" Organization: Hybrid Web Cluster Date: Mon, 27 Sep 2010 16:29:27 +0100 Message-ID: <1285601367.31122.909.camel@pow> Mime-Version: 1.0 X-Mailer: Evolution 2.28.3 Content-Transfer-Encoding: 7bit X-Spam-Score: -1.0 X-Digital-Crocus-Maillimit: done X-Authenticated-Sender: luke X-Complaints: abuse@digital-crocus.com X-Admin: admin@digital-crocus.com X-Abuse: abuse@digital-crocus.com (Please include full headers in abuse reports) 
Cc: team@hybrid-logic.co.uk Subject: Re: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: luke@hybrid-logic.co.uk List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 15:29:29 -0000 Hi all, Thanks for your responses. > 1. Please, build your kernel with debug symbols. > 2. Show kgdb output I will build a debug kernel as per your instructions and post the results as soon as I can. Likely in the next couple of days. I have secured us test hardware at ElasticHosts to debug this as necessary. As a reference point, 8.0R runs fine on this particular infrastructure: Linux KVM on AMD hardware. More detail to follow. Thank you. -- Best Regards, Luke Marsden Hybrid Logic Ltd. Web: http://www.hybrid-cluster.com/ Hybrid Web Cluster - cloud web hosting based on FreeBSD and ZFS Mobile: +447791750420 From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 18:06:59 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A6288106566B; Mon, 27 Sep 2010 18:06:59 +0000 (UTC) (envelope-from serguey-grigoriev@yandex.ru) Received: from forward13.mail.yandex.net (forward13.mail.yandex.net [95.108.130.120]) by mx1.freebsd.org (Postfix) with ESMTP id 553B48FC13; Mon, 27 Sep 2010 18:06:59 +0000 (UTC) Received: from web152.yandex.ru (web152.yandex.ru [95.108.131.165]) by forward13.mail.yandex.net (Yandex) with ESMTP id 8B97B108155C; Mon, 27 Sep 2010 22:06:57 +0400 (MSD) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1285610817; bh=wZfOS8xEKVUclWSRk/kaS6QPxHOhqaXXo2Bq8L172vY=; h=From:To:Cc:In-Reply-To:References:Subject:MIME-Version:Message-Id: Date:Content-Transfer-Encoding:Content-Type; b=Kf992h3q8ulIIDp+X0fKhkm3kssZmmom40me6U8/JZeMnnTVxNoLI9BQrrkvDHTZa 
QB4xYtA6N+XbglfZkK37QBTSTEYvo+AB+iQl7kcmy1gT0RK/t346bpjcnXk2xbmKeq
 J1iQPhaJMdSaFJdCXDc/036HObLHBC4qYCvd+eHk=
Received: from localhost (localhost.localdomain [127.0.0.1]) by web152.yandex.ru (Yandex) with ESMTP id 84F345668008; Mon, 27 Sep 2010 22:06:57 +0400 (MSD)
X-Yandex-Spam: 0
X-Yandex-Front: web152.yandex.ru
X-Yandex-TimeMark: 1285610817
Received: from [188.134.22.116] ([188.134.22.116]) by mail.yandex.ru with HTTP; Mon, 27 Sep 2010 22:06:45 +0400
From: S.N.Grigoriev
To: Alexander Motin
In-Reply-To: <4CA0B070.8080906@FreeBSD.org>
References: <4CA0B070.8080906@FreeBSD.org>
MIME-Version: 1.0
Message-Id: <182991285610805@web152.yandex.ru>
Date: Mon, 27 Sep 2010 22:06:45 +0400
X-Mailer: Yamail [ http://yandex.ru ] 5.0
Content-Transfer-Encoding: 7bit
Content-Type: text/plain
Cc: FreeBSD Stable
Subject: Re: snd_hda: how to duplicate output
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 27 Sep 2010 18:06:59 -0000

Alexander Motin wrote:
> S.N.Grigoriev wrote:
> > this is an excerpt from 8.1R detailed Release Notes:
> >
> > 2.2.2.1 Multimedia Support
> > [snipped]
> > The snd_hda(4) driver now supports multichannel (4.0 and 7.1)
> > playback support. The 5.1 mode support is disabled now due to
> > unidentified synchronization problem. Devices which supports
> > the 7.1 mode can handle the 5.1 operation via software upmix
> > done by sound(4). Note that stereo stream is no longer duplicated
> > to all ports.
> >
> > I'm interested in how I can restore the old behaviour, when
> > stereo stream is duplicated to the black and grey sound card
> > output connectors?
>
> The only way is to change channel mapping (chmap array) for stereo
> streams in hdac_stream_setup() function.
>

Alexander, thank you for your response.

Regards,
Serguey.
From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 18:57:00 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 69B421065696 for ; Mon, 27 Sep 2010 18:57:00 +0000 (UTC) (envelope-from vmagerya@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id E36158FC27 for ; Mon, 27 Sep 2010 18:56:59 +0000 (UTC) Received: by bwz15 with SMTP id 15so4743663bwz.13 for ; Mon, 27 Sep 2010 11:56:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=cst8FGDMS94ijGTvR/Zz35CO2UwVyoHx/zDMAavVlcI=; b=psUic+cbnA+9Jwp/5EqfB97AWGQxBsOT7J+/rZ03SqkkZ8vV1tFMsWW0Y7qJoRzcyM T0YKb87L+Uo9HLZwlrFyQ0dT1rGSwQBt05vf3m0z0GDfC0o3oACygzHk/xWJLKzmHTUx xS1GI3KxMCU02gm736NzPs4jrqMbiDfJ/p/WY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=evBQf9pAm//fw+DKEUEG1R485WhQ7sVB/z3ygZKfWDlxdE0WTrsuTupi6Y2jZyqYnK 5c1xm6uRlzSbcWko4xwYEIdjDg+i3kkc2bQfdGzN3hfGI+QLFaTRXujO2nWTkBUp5H6m /OsPndZLGQbA9oMyFBCa8f1y97P4gVbp0w+N4= Received: by 10.204.57.130 with SMTP id c2mr5656152bkh.144.1285613818563; Mon, 27 Sep 2010 11:56:58 -0700 (PDT) Received: from [172.16.0.6] (tx97.net [85.198.160.156]) by mx.google.com with ESMTPS id g12sm4748682bkb.2.2010.09.27.11.56.56 (version=TLSv1/SSLv3 cipher=RC4-MD5); Mon, 27 Sep 2010 11:56:57 -0700 (PDT) Message-ID: <4CA0E892.4010204@gmail.com> Date: Mon, 27 Sep 2010 21:55:14 +0300 From: Vitaly Magerya User-Agent: Thunderbird MIME-Version: 1.0 To: Ian Smith References: <20100224165203.GA10423@zod.isi.edu> <20100225152711.M16250@sola.nimnet.asn.au> 
<20100226013551.GA67689@zod.isi.edu> <20100922181029.D11124@sola.nimnet.asn.au> <20100922171008.GA92070@zod.isi.edu> <20100925181038.T11124@sola.nimnet.asn.au> <4C9DB6F5.6010305@gmail.com> <20100927170317.I90633@sola.nimnet.asn.au> In-Reply-To: <20100927170317.I90633@sola.nimnet.asn.au> Content-Type: text/plain; charset=KOI8-R Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org, Ted Faber Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 18:57:00 -0000 Ian Smith wrote: > [...] During the 60s resume stall period, iff > I'd suspended from a VTY, I found I could slowly (like maybe 3 seconds > per character echoed) type a command, and some commands - possibly those > cached? as there's no HD access - would run after another few seconds. > > In this way I discovered that 'date' commands reported the time some > seconds after the resume (perhaps hours ago, or yesterday) until the > stall ended, disk light flashed and normality resumed, sometimes with > "calcru: time went backwards .." messages, most often for devd. Yes, same here. 
I must add that some peripherals do not work normally after the
resume:
- the mouse doesn't work until I restart moused manually
- the network doesn't work: there's a message in dmesg about em0 going
  down before the sleep, and although ifconfig says that it's UP, it
  only starts working again after a manual "ifconfig em0 up" (except
  for host name resolution, which I can't repair for some reason)
- if there's a flash drive inserted, it fails to reattach, sometimes
  saying something like this:

    usbus3: port reset timeout
    uhub_reattach_port: port 1 reset failed, error=USB_ERR_TIMEOUT
    uhub_reattach_port: device problem (USB_ERR_TIMEOUT), disabling port 1

  Sometimes there's no message about USB timeout, but mounting that
  drive still fails with this error:

    mount_msdosfs: /dev/da0: Input/output error

  And this appears in the dmesg output:

    (da0:umass-sim0:0:0:0): AutoSense failed

  If I remove and insert the drive again, everything works though.

I also often (but not always) have this in dmesg:

    acpi_ec0: warning: EC done before starting event wait

I don't know if the above information will be useful to anyone, but if
someone wants to look into it, I can provide any further information on
request.

> Are you referring to 8.1-RELEASE or to 8-STABLE as at some date?

8.1-RELEASE-p1.

> > This has been like this in 7.0 too (except I don't know if it ever
> > recovered the speed; I remember shutting it down as soon as I saw how
> > slow it is).
>
> That's a difference then; 7.0-R then 7.2-STABLE (late December, anyway)
> had no such issues here on my T23.

That may have been a separate issue, but I can't recall the exact
symptoms now; I've been under the impression that sleep would never
work on my laptop, so I didn't experiment much (the fact that it does
sort of work now is news to me).
> When it clears up after a wet week and I have some spare power again > I'll try building a debug kernel, perhaps omitting and kldoading USB, > and do some more tests before reporting further, probably in mobile@ > and acpi@ again. I'll copy you and Ted when I do so. Please do. From owner-freebsd-stable@FreeBSD.ORG Mon Sep 27 20:21:32 2010 Return-Path: Delivered-To: freebsd-stable@FreeBSD.org Received: from [127.0.0.1] (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by hub.freebsd.org (Postfix) with ESMTP id 601601065693; Mon, 27 Sep 2010 20:21:31 +0000 (UTC) (envelope-from jkim@FreeBSD.org) From: Jung-uk Kim To: freebsd-stable@FreeBSD.org Date: Mon, 27 Sep 2010 16:21:15 -0400 User-Agent: KMail/1.6.2 References: <20100224165203.GA10423@zod.isi.edu> <20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> In-Reply-To: <4CA0E892.4010204@gmail.com> MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201009271621.17669.jkim@FreeBSD.org> Cc: Ted Faber , Vitaly Magerya , Ian Smith Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Sep 2010 20:21:32 -0000 On Monday 27 September 2010 02:55 pm, Vitaly Magerya wrote: > Ian Smith wrote: > > [...] During the 60s resume stall period, iff > > I'd suspended from a VTY, I found I could slowly (like maybe 3 > > seconds per character echoed) type a command, and some commands - > > possibly those cached? as there's no HD access - would run after > > another few seconds. 
> >
> > In this way I discovered that 'date' commands reported the time
> > some seconds after the resume (perhaps hours ago, or yesterday)
> > until the stall ended, disk light flashed and normality resumed,
> > sometimes with "calcru: time went backwards .." messages, most
> > often for devd.
>
> Yes, same here. I must add that some peripherals do not work
> normally after the resume:
> - the mouse doesn't work until I restart moused manually

--- >8 --- SNIP!!! --- >8 ---

If the mouse is connected to a PS/2 port, the following device flags
may help.

psm(4):

     bit 13 HOOKRESUME
            The built-in PS/2 pointing device of some laptop computers
            is somehow not operable immediately after the system
            `resumes' from the power saving mode, though it will
            eventually become available. There are reports that
            stimulating the device by performing I/O will help waking
            up the device quickly. This flag will enable a piece of
            code in the psm driver to hook the `resume' event and
            exercise some harmless I/O operations on the device.

     bit 14 INITAFTERSUSPEND
            This flag adds more drastic action for the above problem.
            It will cause the psm driver to reset and re-initialize the
            pointing device after the `resume' event. It has no effect
            unless the HOOKRESUME flag is set as well.

I always use hint.psm.0.flags="0x6000" in /boot/loader.conf, i.e., turn
on both HOOKRESUME and INITAFTERSUSPEND, to work around a similar
problem on a different laptop.

Can you please report the other problems in the appropriate ML?

em -> freebsd-net@
usb -> freebsd-usb@
acpi_ec -> freebsd-acpi@

BTW, the USB stack issue is a known problem AFAIK.
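The 0x6000 value in that hint follows from the two bit positions quoted from psm(4). A minimal Python sketch of the arithmetic, for anyone who wants to set only one of the flags; the helper name psm_flags is made up for illustration and is not part of any FreeBSD tool:

```python
# Bit positions as listed in the psm(4) excerpt quoted in this message.
HOOKRESUME_BIT = 13
INITAFTERSUSPEND_BIT = 14


def psm_flags(*bits):
    """Hypothetical helper: OR together one flag bit per listed position."""
    value = 0
    for bit in bits:
        value |= 1 << bit
    return value


flags = psm_flags(HOOKRESUME_BIT, INITAFTERSUSPEND_BIT)

# Format the value the way it would be written in /boot/loader.conf.
hint_line = 'hint.psm.0.flags="0x%x"' % flags
print(hint_line)
```

With only HOOKRESUME the same helper gives 0x2000; per the man page text, INITAFTERSUSPEND on its own (0x4000) would have no effect.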
Jung-uk Kim

From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 04:44:04 2010
Return-Path:
Delivered-To: stable@FreeBSD.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1E977106566B for ; Tue, 28 Sep 2010 04:44:04 +0000 (UTC) (envelope-from truckman@FreeBSD.org)
Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 9C4D68FC12 for ; Tue, 28 Sep 2010 04:44:03 +0000 (UTC)
Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8S4PA0d058131 for ; Mon, 27 Sep 2010 21:25:14 -0700 (PDT) (envelope-from truckman@FreeBSD.org)
Message-Id: <201009280425.o8S4PA0d058131@gw.catspoiler.org>
Date: Mon, 27 Sep 2010 21:25:10 -0700 (PDT)
From: Don Lewis
To: stable@FreeBSD.org
MIME-Version: 1.0
Content-Type: TEXT/plain; charset=us-ascii
Cc:
Subject: CPU time accounting broken on 8-STABLE machine after a few hours of uptime
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 28 Sep 2010 04:44:04 -0000

CPU time accounting is broken on one of my machines running 8-STABLE. I
ran a test with a simple program that just loops and consumes CPU time:

% time ./a.out
94.544u 0.000s 19:14.10 8.1% 62+2054k 0+0io 0pf+0w

The display in top shows the process with WCPU at 100%, but TIME
increments very slowly.

Several hours after booting, I got a bunch of "calcru: runtime went
backwards" messages, but they stopped right away and never appeared
again.

Aug 23 13:40:07 scratch ntpd[1159]: ntpd 4.2.4p5-a (1)
Aug 23 13:43:18 scratch ntpd[1160]: kernel time sync status change 2001
Aug 23 18:05:57 scratch dbus-daemon: [system] Reloaded configuration
Aug 23 18:06:16 scratch dbus-daemon: [system] Reloaded configuration
Aug 23 18:12:40 scratch ntpd[1160]: time reset +18.059948 s
[snip]
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 6836685136 usec to 5425839798 usec for pid 1526 (csh)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 4747 usec to 2403 usec for pid 1519 (csh)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 5265 usec to 2594 usec for pid 1494 (hald-addon-mouse-sy)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 7818 usec to 3734 usec for pid 1488 (console-kit-daemon)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 977 usec to 459 usec for pid 1480 (getty)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 958 usec to 450 usec for pid 1479 (getty)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 957 usec to 449 usec for pid 1478 (getty)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 952 usec to 447 usec for pid 1477 (getty)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 959 usec to 450 usec for pid 1476 (getty)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 975 usec to 458 usec for pid 1475 (getty)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1026 usec to 482 usec for pid 1474 (getty)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1333 usec to 626 usec for pid 1473 (getty)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 2469 usec to 1160 usec for pid 1440 (inetd)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 719 usec to 690 usec for pid 1402 (sshd)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 120486 usec to 56770 usec for pid 1360 (cupsd)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 6204 usec to 2914 usec for pid 1289 (dbus-daemon)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 179 usec to 84 usec for pid 1265 (moused)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 22156 usec to 10407 usec for pid 1041 (nfsd)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1292 usec to 607 usec for pid 1032 (mountd)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 8801 usec to 4134 usec for pid 664 (devd)
Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 19 usec to 9 usec for pid 9 (sctp_iterator)

If I reboot and run the test again, the CPU time accounting seems to be
working correctly.

% time ./a.out
1144.226u 0.000s 19:06.62 99.7% 5+168k 0+0io 0pf+0w

I'm not sure how long this problem has been present. I do remember
seeing the calcru messages with an August 23rd kernel. I have not seen
the calcru messages when running -CURRENT on the same hardware. I also
have not seen this same problem on my other Athlon 64 box running the
August 23rd kernel.
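
As a rough cross-check of the two `time ./a.out` results quoted in this message, the percentage column can be recomputed as (user + system) / elapsed. The Python sketch below only redoes that arithmetic from the quoted figures; the function names are illustrative, and it assumes the elapsed field is minutes:seconds as printed by csh.

```python
def elapsed_seconds(stamp):
    """Convert an elapsed time such as '19:14.10' (mm:ss.ff) to seconds."""
    total = 0.0
    for part in stamp.split(":"):
        total = total * 60 + float(part)
    return total


def cpu_percent(user, system, elapsed):
    """CPU utilisation in percent: accounted CPU time over wall-clock time."""
    return 100.0 * (user + system) / elapsed_seconds(elapsed)


# Figures copied from the two runs quoted in the message.
before = cpu_percent(94.544, 0.000, "19:14.10")
after = cpu_percent(1144.226, 0.000, "19:06.62")

print("before reboot: %.2f%%" % before)
print("after reboot:  %.2f%%" % after)
```

Both runs took about 19 minutes of wall-clock time, so the low figure in the first run comes from the accounted user time (about 95 s instead of about 1144 s), which is consistent with the slowly incrementing TIME column in top.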

Before reboot:
# sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000)
kern.timecounter.hardware: ACPI-fast
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 4294967295
kern.timecounter.tc.i8254.counter: 3534
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.ACPI-fast.counter: 8685335
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 1000
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 2204228369
kern.timecounter.tc.TSC.frequency: 2500018183
kern.timecounter.tc.TSC.quality: 800
kern.timecounter.invariant_tsc: 0

After reboot:
% sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000)
kern.timecounter.hardware: ACPI-fast
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 4294967295
kern.timecounter.tc.i8254.counter: 2241
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.ACPI-fast.counter: 4636239
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 1000
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 1429996208
kern.timecounter.tc.TSC.frequency: 2500020459
kern.timecounter.tc.TSC.quality: 800
kern.timecounter.invariant_tsc: 0

Here's my kernel config file (uni-processor i386 kernel running on an
Athlon 64 x2 CPU):

include GENERIC
nocpu I486_CPU
nocpu I586_CPU
nooptions SCHED_4BSD # 4BSD scheduler
options SCHED_ULE
# Debugging for use in -current
options KDB # Enable kernel debugger support.
options DDB # Support DDB.
options GDB # Support remote GDB.
nooptions SMP # Symmetric MultiProcessor Kernel
nodevice apic # I/O APIC
nodevice atapicd # ATAPI CDROM drives
device atapicam # emulate ATAPI devices as SCSI ditto via CAM

/var/run/dmesg.boot:
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
  The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.1-STABLE #6: Thu Sep 23 16:03:29 PDT 2010
  dl@scratch.catspoiler.org:/usr/obj/usr/src/sys/GENERICDDB i386
Timecounter "i8254" frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 4800+ (2500.02-MHz 686-class CPU)
  Origin = "AuthenticAMD" Id = 0x60fb1 Family = f Model = 6b Stepping = 1
  Features=0x178bfbff
  Features2=0x2001
  AMD Features=0xea500800
  AMD Features2=0x11f
real memory = 4294967296 (4096 MB)
avail memory = 3607351296 (3440 MB)
kbd1 at kbdmux0
ACPI Warning: Optional field Pm2ControlBlock has zero address or length: 0x0000000000000000/0x1 (20100331/tbfadt-655)
acpi0: on motherboard
acpi0: [ITHREAD]
acpi0: Power Button (fixed)
acpi0: reservation of 0, a0000 (3) failed
acpi0: reservation of 100000, dbdf0000 (3) failed
Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0
cpu0: on acpi0
acpi_hpet0: iomem 0xfeff0000-0xfeff03ff on acpi0
device_attach: acpi_hpet0 attach returned 12
acpi_button0: on acpi0
pcib0: port 0xcf8-0xcff on acpi0
pci0: on pcib0
pci0: at device 0.0 (no driver attached)
isab0: at device 1.0 on pci0
isa0: on isab0
pci0: at device 1.1 (no driver attached)
pci0: at device 1.2 (no driver attached)
ohci0: mem 0xfe02f000-0xfe02ffff irq 10 at device 2.0 on pci0
ohci0: [ITHREAD]
usbus0: on ohci0
ehci0: mem 0xfe02e000-0xfe02e0ff irq 11 at device 2.1 on pci0
ehci0: [ITHREAD]
usbus1: EHCI version 1.0
usbus1: on ehci0
ohci1: mem 0xfe02d000-0xfe02dfff irq 5 at device 4.0 on pci0
ohci1: [ITHREAD]
usbus2: on ohci1
ehci1: mem 0xfe02c000-0xfe02c0ff irq 10 at device 4.1 on pci0
ehci1: [ITHREAD]
usbus3: EHCI version 1.0
usbus3: on ehci1
atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 6.0 on pci0
ata0: on atapci0
ata0: [ITHREAD]
ata1: on atapci0
ata1: [ITHREAD]
pci0: at device 7.0 (no driver attached)
pcib1: at device 8.0 on pci0
pci1: on pcib1
fwohci0: mem 0xfd0ff000-0xfd0ff7ff,0xfd0f8000-0xfd0fbfff irq 11 at device 7.0 on pci1
fwohci0: [ITHREAD]
fwohci0: OHCI version 1.10 (ROM=1)
fwohci0: No. of Isochronous channels is 4.
fwohci0: EUI64 00:50:8d:00:00:99:f0:69
fwohci0: Phy 1394a available S400, 2 ports.
fwohci0: Link S400, max_rec 2048 bytes.
firewire0: on fwohci0
dcons_crom0: on firewire0
dcons_crom0: bus_addr 0x1090000
fwe0: on firewire0
if_fwe0: Fake Ethernet address: 02:50:8d:99:f0:69
fwe0: Ethernet address: 02:50:8d:99:f0:69
fwip0: on firewire0
fwip0: Firewire address: 00:50:8d:00:00:99:f0:69 @ 0xfffe00000000, S400, maxrec 2048
fwohci0: Initiate bus reset
fwohci0: fwohci_intr_core: BUS reset
fwohci0: fwohci_intr_core: node_id=0x00000000, SelfID Count=1, CYCLEMASTER mode
ahc0: port 0xcc00-0xccff mem 0xfd0fe000-0xfd0fefff irq 11 at device 9.0 on pci1
ahc0: [ITHREAD]
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
atapci1: port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xdc00-0xdc0f mem 0xfe026000-0xfe027fff irq 10 at device 9.0 on pci0
atapci1: [ITHREAD]
atapci1: AHCI v1.10 controller with 4 3Gbps ports, PM supported
ata2: on atapci1
ata2: [ITHREAD]
ata3: on atapci1
ata3: [ITHREAD]
ata4: on atapci1
ata4: [ITHREAD]
ata5: on atapci1
ata5: [ITHREAD]
nfe0: port 0xd800-0xd807 mem 0xfe02b000-0xfe02bfff,0xfe02a000-0xfe02a0ff,0xfe029000-0xfe02900f irq 15 at device 10.0 on pci0
miibus0: on nfe0
e1000phy0: PHY 1 on miibus0
e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto
nfe0: Ethernet address: 00:50:8d:9f:6d:e3
nfe0: [FILTER]
pcib2: at device 11.0 on pci0
pci2: on pcib2
pcib3: at device 12.0 on pci0
pci3: on pcib3
atapci2: port 0xac00-0xac7f mem 0xfdcff000-0xfdcff07f,0xfdcf8000-0xfdcfbfff irq 5 at device 0.0 on pci3
atapci2: [ITHREAD]
ata6: on atapci2
ata6: [ITHREAD]
ata7: on atapci2
ata7: [ITHREAD]
pcib4: at device 13.0 on pci0
pci4: on pcib4
pcib5: at device 14.0 on pci0
pci5: on pcib5
pcib6: at device 15.0 on pci0
pci6: on pcib6
pcib7: at device 16.0 on pci0
pci7: on pcib7
pcib8: at device 17.0 on pci0
pci8: on pcib8
vgapci0: mem 0xfb000000-0xfbffffff,0xe0000000-0xefffffff,0xfc000000-0xfcffffff irq 10 at device 18.0 on pci0
acpi_tz0: on acpi0
atrtc0: port 0x70-0x73 on acpi0
fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0
fdc0: [FILTER]
atkbdc0: port 0x60,0x64 irq 1 on acpi0
atkbd0: irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
atkbd0: [ITHREAD]
psm0: irq 12 on atkbdc0
psm0: [GIANT-LOCKED]
psm0: [ITHREAD]
psm0: model MouseMan+, device ID 0
pmtimer0 on isa0
sc0: at flags 0x100 on isa0
sc0: VGA <16 virtual consoles, flags=0x300>
vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
ppc0: parallel port not found.
powernow0: on cpu0 Timecounter "TSC" frequency 2500018183 Hz quality 800 Timecounters tick every 1.000 msec firewire0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me) firewire0: bus manager 0 usbus0: 12Mbps Full Speed USB v1.0 usbus1: 480Mbps High Speed USB v2.0 usbus2: 12Mbps Full Speed USB v1.0 usbus3: 480Mbps High Speed USB v2.0 ugen0.1: at usbus0 uhub0: on usbus0 ugen1.1: at usbus1 uhub1: on usbus1 ugen2.1: at usbus2 uhub2: on usbus2 ugen3.1: at usbus3 uhub3: on usbus3 uhub0: 6 ports with 6 removable, self powered uhub2: 6 ports with 6 removable, self powered ad6: 476940MB at ata3-master UDMA100 SATA 3Gb/s uhub1: 6 ports with 6 removable, self powered uhub3: 6 ports with 6 removable, self powered unknown: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00 sks=0x40 0x00 0x01 cd0 at ata0 bus 0 scbus1 target 0 lun 0 cd0: Removable CD-ROM SCSI-0 device cd0: 3.300MB/s transfers cd0: cd present [138590 x 2048 byte records] da0 at ahc0 bus 0 scbus0 target 0 lun 0 da0: Fixed Direct Access SCSI-3 device da0: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit) da0: Command Queueing enabled da0: 35003MB (71687370 512 byte sectors: 255H 63S/T 4462C) Trying to mount root from ufs:/dev/ad6s1a nfe0: link state changed to UP From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 05:11:57 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D7F3B106566C for ; Tue, 28 Sep 2010 05:11:57 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta13.westchester.pa.mail.comcast.net (qmta13.westchester.pa.mail.comcast.net [76.96.59.243]) by mx1.freebsd.org (Postfix) with ESMTP id 821498FC19 for ; Tue, 28 Sep 2010 05:11:56 +0000 (UTC) Received: from omta24.westchester.pa.mail.comcast.net ([76.96.62.76]) by qmta13.westchester.pa.mail.comcast.net with comcast id C4uU1f0061ei1Bg5D5BxsL; Tue, 28 Sep 2010 05:11:57 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by 
omta24.westchester.pa.mail.comcast.net with comcast id C5Bw1f0083LrwQ23k5Bw9a; Tue, 28 Sep 2010 05:11:57 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id DACCC9B418; Mon, 27 Sep 2010 22:11:54 -0700 (PDT) Date: Mon, 27 Sep 2010 22:11:54 -0700 From: Jeremy Chadwick To: Don Lewis Message-ID: <20100928051154.GA73859@icarus.home.lan> References: <201009280425.o8S4PA0d058131@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201009280425.o8S4PA0d058131@gw.catspoiler.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: stable@FreeBSD.org Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 05:11:57 -0000 On Mon, Sep 27, 2010 at 09:25:10PM -0700, Don Lewis wrote: > CPU time accounting is broken on one of my machines running 8-STABLE. I > ran a test with a simple program that just loops and consumes CPU time: > > % time ./a.out > 94.544u 0.000s 19:14.10 8.1% 62+2054k 0+0io 0pf+0w > > The display in top shows the process with WCPU at 100%, but TIME > increments very slowly. > > Several hours after booting, I got a bunch of "calcru: runtime went > backwards" messages, but they stopped right away and never appeared > again. 
> > Aug 23 13:40:07 scratch ntpd[1159]: ntpd 4.2.4p5-a (1) > Aug 23 13:43:18 scratch ntpd[1160]: kernel time sync status change 2001 > Aug 23 18:05:57 scratch dbus-daemon: [system] Reloaded configuration > Aug 23 18:06:16 scratch dbus-daemon: [system] Reloaded configuration > Aug 23 18:12:40 scratch ntpd[1160]: time reset +18.059948 s > [snip] > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 6836685136 usec to 5425839798 usec for pid 1526 (csh) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 4747 usec to 2403 usec for pid 1519 (csh) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 5265 usec to 2594 usec for pid 1494 (hald-addon-mouse-sy) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 7818 usec to 3734 usec for pid 1488 (console-kit-daemon) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 977 usec to 459 usec for pid 1480 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 958 usec to 450 usec for pid 1479 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 957 usec to 449 usec for pid 1478 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 952 usec to 447 usec for pid 1477 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 959 usec to 450 usec for pid 1476 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 975 usec to 458 usec for pid 1475 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1026 usec to 482 usec for pid 1474 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1333 usec to 626 usec for pid 1473 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 2469 usec to 1160 usec for pid 1440 (inetd) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 719 usec to 690 usec for pid 1402 (sshd) > Aug 23 23:49:06 scratch kernel: 
calcru: runtime went backwards from 120486 usec to 56770 usec for pid 1360 (cupsd) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 6204 usec to 2914 usec for pid 1289 (dbus-daemon) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 179 usec to 84 usec for pid 1265 (moused) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 22156 usec to 10407 usec for pid 1041 (nfsd) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1292 usec to 607 usec for pid 1032 (mountd) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 8801 usec to 4134 usec for pid 664 (devd) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 19 usec to 9 usec for pid 9 (sctp_iterator) > > > If I reboot and run the test again, the CPU time accounting seems to be > working correctly. > % time ./a.out > 1144.226u 0.000s 19:06.62 99.7% 5+168k 0+0io 0pf+0w > > > I'm not sure how long this problem has been present. I do remember > seeing the calcru messages with an August 23rd kernel. I have not seen > the calcru messages when running -CURRENT on the same hardware. I also > have not seen this same problem on my other Athlon 64 box running the > August 23rd kernel. 
> > Before reboot: > # sysctl kern.timecounter > kern.timecounter.tick: 1 > kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000) > kern.timecounter.hardware: ACPI-fast > kern.timecounter.stepwarnings: 0 > kern.timecounter.tc.i8254.mask: 4294967295 > kern.timecounter.tc.i8254.counter: 3534 > kern.timecounter.tc.i8254.frequency: 1193182 > kern.timecounter.tc.i8254.quality: 0 > kern.timecounter.tc.ACPI-fast.mask: 16777215 > kern.timecounter.tc.ACPI-fast.counter: 8685335 > kern.timecounter.tc.ACPI-fast.frequency: 3579545 > kern.timecounter.tc.ACPI-fast.quality: 1000 > kern.timecounter.tc.TSC.mask: 4294967295 > kern.timecounter.tc.TSC.counter: 2204228369 > kern.timecounter.tc.TSC.frequency: 2500018183 > kern.timecounter.tc.TSC.quality: 800 > kern.timecounter.invariant_tsc: 0 > > After reboot: > % sysctl kern.timecounter > kern.timecounter.tick: 1 > kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000) > kern.timecounter.hardware: ACPI-fast > kern.timecounter.stepwarnings: 0 > kern.timecounter.tc.i8254.mask: 4294967295 > kern.timecounter.tc.i8254.counter: 2241 > kern.timecounter.tc.i8254.frequency: 1193182 > kern.timecounter.tc.i8254.quality: 0 > kern.timecounter.tc.ACPI-fast.mask: 16777215 > kern.timecounter.tc.ACPI-fast.counter: 4636239 > kern.timecounter.tc.ACPI-fast.frequency: 3579545 > kern.timecounter.tc.ACPI-fast.quality: 1000 > kern.timecounter.tc.TSC.mask: 4294967295 > kern.timecounter.tc.TSC.counter: 1429996208 > kern.timecounter.tc.TSC.frequency: 2500020459 > kern.timecounter.tc.TSC.quality: 800 > kern.timecounter.invariant_tsc: 0 > > > > Here's my kernel config file (uni-processor i386 kernel running on an > Athlon 64 x2 CPU): > include GENERIC > > nocpu I486_CPU > nocpu I586_CPU > > nooptions SCHED_4BSD # 4BSD scheduler > options SCHED_ULE > > # Debugging for use in -current > options KDB # Enable kernel debugger support. > options DDB # Support DDB. > options GDB # Support remote GDB. 
> > nooptions SMP # Symmetric MultiProcessor Kernel > nodevice apic # I/O APIC > > nodevice atapicd # ATAPI CDROM drives > device atapicam # emulate ATAPI devices as SCSI ditto via CAM > > > /var/run/dmesg.boot: > Copyright (c) 1992-2010 The FreeBSD Project. > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 > The Regents of the University of California. All rights reserved. > FreeBSD is a registered trademark of The FreeBSD Foundation. > FreeBSD 8.1-STABLE #6: Thu Sep 23 16:03:29 PDT 2010 > dl@scratch.catspoiler.org:/usr/obj/usr/src/sys/GENERICDDB i386 > Timecounter "i8254" frequency 1193182 Hz quality 0 > CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 4800+ (2500.02-MHz 686-class CPU) > Origin = "AuthenticAMD" Id = 0x60fb1 Family = f Model = 6b Stepping = 1 > Features=0x178bfbff > Features2=0x2001 > AMD Features=0xea500800 > AMD Features2=0x11f > real memory = 4294967296 (4096 MB) > avail memory = 3607351296 (3440 MB) > kbd1 at kbdmux0 > ACPI Warning: Optional field Pm2ControlBlock has zero address or length: 0x0000000000000000/0x1 (20100331/tbfadt-655) > acpi0: on motherboard > acpi0: [ITHREAD] > acpi0: Power Button (fixed) > acpi0: reservation of 0, a0000 (3) failed > acpi0: reservation of 100000, dbdf0000 (3) failed > Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 > acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 > cpu0: on acpi0 > acpi_hpet0: iomem 0xfeff0000-0xfeff03ff on acpi0 > device_attach: acpi_hpet0 attach returned 12 > acpi_button0: on acpi0 > pcib0: port 0xcf8-0xcff on acpi0 > pci0: on pcib0 > pci0: at device 0.0 (no driver attached) > isab0: at device 1.0 on pci0 > isa0: on isab0 > pci0: at device 1.1 (no driver attached) > pci0: at device 1.2 (no driver attached) > ohci0: mem 0xfe02f000-0xfe02ffff irq 10 at device 2.0 on pci0 > ohci0: [ITHREAD] > usbus0: on ohci0 > ehci0: mem 0xfe02e000-0xfe02e0ff irq 11 at device 2.1 on pci0 > ehci0: [ITHREAD] > usbus1: EHCI version 1.0 > usbus1: on ehci0 > 
ohci1: mem 0xfe02d000-0xfe02dfff irq 5 at device 4.0 on pci0 > ohci1: [ITHREAD] > usbus2: on ohci1 > ehci1: mem 0xfe02c000-0xfe02c0ff irq 10 at device 4.1 on pci0 > ehci1: [ITHREAD] > usbus3: EHCI version 1.0 > usbus3: on ehci1 > atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xf000-0xf00f at device 6.0 on pci0 > ata0: on atapci0 > ata0: [ITHREAD] > ata1: on atapci0 > ata1: [ITHREAD] > pci0: at device 7.0 (no driver attached) > pcib1: at device 8.0 on pci0 > pci1: on pcib1 > fwohci0: mem 0xfd0ff000-0xfd0ff7ff,0xfd0f8000-0xfd0fbfff irq 11 at device 7.0 on pci1 > fwohci0: [ITHREAD] > fwohci0: OHCI version 1.10 (ROM=1) > fwohci0: No. of Isochronous channels is 4. > fwohci0: EUI64 00:50:8d:00:00:99:f0:69 > fwohci0: Phy 1394a available S400, 2 ports. > fwohci0: Link S400, max_rec 2048 bytes. > firewire0: on fwohci0 > dcons_crom0: on firewire0 > dcons_crom0: bus_addr 0x1090000 > fwe0: on firewire0 > if_fwe0: Fake Ethernet address: 02:50:8d:99:f0:69 > fwe0: Ethernet address: 02:50:8d:99:f0:69 > fwip0: on firewire0 > fwip0: Firewire address: 00:50:8d:00:00:99:f0:69 @ 0xfffe00000000, S400, maxrec 2048 > fwohci0: Initiate bus reset > fwohci0: fwohci_intr_core: BUS reset > fwohci0: fwohci_intr_core: node_id=0x00000000, SelfID Count=1, CYCLEMASTER mode > ahc0: port 0xcc00-0xccff mem 0xfd0fe000-0xfd0fefff irq 11 at device 9.0 on pci1 > ahc0: [ITHREAD] > aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs > atapci1: port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xdc00-0xdc0f mem 0xfe026000-0xfe027fff irq 10 at device 9.0 on pci0 > atapci1: [ITHREAD] > atapci1: AHCI v1.10 controller with 4 3Gbps ports, PM supported > ata2: on atapci1 > ata2: [ITHREAD] > ata3: on atapci1 > ata3: [ITHREAD] > ata4: on atapci1 > ata4: [ITHREAD] > ata5: on atapci1 > ata5: [ITHREAD] > nfe0: port 0xd800-0xd807 mem 0xfe02b000-0xfe02bfff,0xfe02a000-0xfe02a0ff,0xfe029000-0xfe02900f irq 15 at device 10.0 on pci0 > miibus0: on nfe0 > e1000phy0: PHY 1 on miibus0 > e1000phy0: 10baseT, 
10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto > nfe0: Ethernet address: 00:50:8d:9f:6d:e3 > nfe0: [FILTER] > pcib2: at device 11.0 on pci0 > pci2: on pcib2 > pcib3: at device 12.0 on pci0 > pci3: on pcib3 > atapci2: port 0xac00-0xac7f mem 0xfdcff000-0xfdcff07f,0xfdcf8000-0xfdcfbfff irq 5 at device 0.0 on pci3 > atapci2: [ITHREAD] > ata6: on atapci2 > ata6: [ITHREAD] > ata7: on atapci2 > ata7: [ITHREAD] > pcib4: at device 13.0 on pci0 > pci4: on pcib4 > pcib5: at device 14.0 on pci0 > pci5: on pcib5 > pcib6: at device 15.0 on pci0 > pci6: on pcib6 > pcib7: at device 16.0 on pci0 > pci7: on pcib7 > pcib8: at device 17.0 on pci0 > pci8: on pcib8 > vgapci0: mem 0xfb000000-0xfbffffff,0xe0000000-0xefffffff,0xfc000000-0xfcffffff irq 10 at device 18.0 on pci0 > acpi_tz0: on acpi0 > atrtc0: port 0x70-0x73 on acpi0 > fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 > fdc0: [FILTER] > atkbdc0: port 0x60,0x64 irq 1 on acpi0 > atkbd0: irq 1 on atkbdc0 > kbd0 at atkbd0 > atkbd0: [GIANT-LOCKED] > atkbd0: [ITHREAD] > psm0: irq 12 on atkbdc0 > psm0: [GIANT-LOCKED] > psm0: [ITHREAD] > psm0: model MouseMan+, device ID 0 > pmtimer0 on isa0 > sc0: at flags 0x100 on isa0 > sc0: VGA <16 virtual consoles, flags=0x300> > vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 > ppc0: parallel port not found. 
> powernow0: on cpu0
> Timecounter "TSC" frequency 2500018183 Hz quality 800
> Timecounters tick every 1.000 msec
> firewire0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me)
> firewire0: bus manager 0
> usbus0: 12Mbps Full Speed USB v1.0
> usbus1: 480Mbps High Speed USB v2.0
> usbus2: 12Mbps Full Speed USB v1.0
> usbus3: 480Mbps High Speed USB v2.0
> ugen0.1: at usbus0
> uhub0: on usbus0
> ugen1.1: at usbus1
> uhub1: on usbus1
> ugen2.1: at usbus2
> uhub2: on usbus2
> ugen3.1: at usbus3
> uhub3: on usbus3
> uhub0: 6 ports with 6 removable, self powered
> uhub2: 6 ports with 6 removable, self powered
> ad6: 476940MB at ata3-master UDMA100 SATA 3Gb/s
> uhub1: 6 ports with 6 removable, self powered
> uhub3: 6 ports with 6 removable, self powered
> unknown: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00 sks=0x40 0x00 0x01
> cd0 at ata0 bus 0 scbus1 target 0 lun 0
> cd0: Removable CD-ROM SCSI-0 device
> cd0: 3.300MB/s transfers
> cd0: cd present [138590 x 2048 byte records]
> da0 at ahc0 bus 0 scbus0 target 0 lun 0
> da0: Fixed Direct Access SCSI-3 device
> da0: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
> da0: Command Queueing enabled
> da0: 35003MB (71687370 512 byte sectors: 255H 63S/T 4462C)
> Trying to mount root from ufs:/dev/ad6s1a
> nfe0: link state changed to UP

Do you have something called Cool'n'Quiet enabled in your BIOS? Solely
as a test, try disabling it. If this solves the problem, add "device
cpufreq" to your kernel configuration, buildkernel/installkernel, and
re-enable the option in your BIOS. It's worth a shot anyway.

Here are some reference threads. They are old and may be centred on
Intel CPUs, but I imagine the same problem can happen on any platform
which adjusts clock frequencies:

http://lists.freebsd.org/pipermail/freebsd-questions/2006-October/133253.html
http://www.mail-archive.com/freebsd-stable@freebsd.org/msg95530.html

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 08:05:01 2010
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EB79B106566C for ; Tue, 28 Sep 2010 08:05:01 +0000 (UTC) (envelope-from jurgen@ish.com.au)
Received: from fish.ish.com.au (eth5921.nsw.adsl.internode.on.net [59.167.240.32]) by mx1.freebsd.org (Postfix) with ESMTP id 7384C8FC13 for ; Tue, 28 Sep 2010 08:05:01 +0000 (UTC)
Received: from [10.29.65.2] (port=57172) by fish.ish.com.au with esmtpa (Exim 4.69) (envelope-from ) id 1P0V0w-0003qa-2Z for freebsd-stable@freebsd.org; Tue, 28 Sep 2010 17:54:15 +1000
Message-ID: <4CA19F27.6050903@ish.com.au>
Date: Tue, 28 Sep 2010 17:54:15 +1000
From: Jurgen Weber
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4
MIME-Version: 1.0
To: freebsd-stable@freebsd.org
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Content-Filtered-By: Mailman/MimeDel 2.1.5
Subject: cpu timer issues
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Tue, 28 Sep 2010 08:05:02 -0000

Hello List

We have been having issues with some firewall machines of ours using
pfSense.

FreeBSD smash01.ish.com.au 7.2-RELEASE-p5 FreeBSD 7.2-RELEASE-p5 #0:
Sun Dec 6 23:20:31 EST 2009 sullrich@FreeBSD_7.2_pfSense_1.2.3_snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_SMP.7
i386

MotherBoard: http://www.supermicro.com/products/motherboard/Xeon3000/3200/X7SBi-LN4.cfm

Originally the systems showed a lot of packet loss, the system time
would fall behind, and the value of "#vmstat -i | grep timer" was
dropping below 2000. I was led to believe by the guys at pfSense that
this is where the value should sit. I would also receive errors in
messages that looked like " kernel: calcru: runtime went backwards from
244314 usec to 236341".

We tried a variety of things: disabling USB, turning off Intel
SpeedStep in the BIOS, disabling ACPI, etc. All had little to no
effect. The only thing that would right it was restarting the box, but
over time it would degrade again. I talked to SuperMicro and they said
that this is a FreeBSD issue and pretty much washed their hands of it.

After a couple of months of dealing with this and just rebooting the
systems regularly, the symptoms slowly but surely disappeared, e.g. the
kernel messages went away, the system time was not falling behind and I
was experiencing no packet loss, but the "#vmstat -i | grep timer"
value would continue to decrease over time. Eventually, I think, when
it finally got to 0 the machine restarted (I am only guessing here).

After this restart it worked again for a couple of hours and then it
restarted again.

After the second restart the system has not missed a beat; it has been
fine and the "#vmstat -i | grep timer" value remained near the 2000
mark... We set up some Zabbix monitoring to watch it. As mentioned, it
was fine for about a month. Until today. Today the value has dropped to
0, but the system has not restarted, and over the last couple of hours
the value has increased to 47.

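The "vmstat -i | grep timer" check described above can be scripted
rather than read by eye. What follows is only a minimal sketch in
Python, assuming the usual three-column vmstat -i layout (interrupt
name, total, rate). The sample text, the function names timer_rates and
low_timer_rates, and the 10% tolerance are illustrative assumptions and
are not taken from the affected machines. As far as I know the rate
column is an average since boot (total divided by uptime), so it reacts
slowly to a recent change.

```python
def timer_rates(vmstat_output):
    """Return {interrupt name: rate} for each 'timer' row of `vmstat -i` text."""
    rates = {}
    for line in vmstat_output.splitlines():
        if "timer" not in line:
            continue
        fields = line.split()
        # A data row is "<name words...> <total> <rate>"; skip anything else.
        if len(fields) < 3 or not fields[-1].isdigit():
            continue
        name = " ".join(fields[:-2])
        rates[name] = int(fields[-1])
    return rates


def low_timer_rates(rates, expected=2000, tolerance=0.10):
    """Return the timer rows whose rate is more than `tolerance` below `expected`."""
    floor = expected * (1 - tolerance)
    return {name: rate for name, rate in rates.items() if rate < floor}


# Hypothetical sample, made up for illustration only.
SAMPLE = """\
interrupt                          total       rate
irq1: atkbd0                          12          0
cpu0: timer                    172800000       2000
cpu1: timer                      4060800         47
Total                          176860812       2047
"""

if __name__ == "__main__":
    for name, rate in low_timer_rates(timer_rates(SAMPLE)).items():
        print(f"{name}: rate {rate} is below the expected ~2000")
```

On a real system the text would come from running vmstat -i; the
expected value of 2000 is simply the figure quoted in this message and
may differ on other configurations.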
This machine is mission critical; we have two in a failover scenario
(using pfSense's CARP features), and it seems unfortunate that we have
an issue with two brand new SuperMicro boxes that affects both
machines. While at the moment everything seems fine, I want to ensure
that I have no further issues. Does anyone have any suggestions?

Lastly, I have double-checked both of the below:
http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#CALCRU-NEGATIVE-RUNTIME
We disabled EIST.

http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#COMPUTER-CLOCK-SKEW

# dmesg | grep Timecounter
Timecounter "i8254" frequency 1193182 Hz quality 0
Timecounters tick every 1.000 msec
# sysctl kern.timecounter.hardware
kern.timecounter.hardware: i8254

Only have one timer to choose from.

Thanks

Jurgen

From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 09:30:54 2010
Delivered-To: freebsd-stable@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 126701065670 for ; Tue, 28 Sep 2010 09:30:54 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org)
Received: from QMTA11.westchester.pa.mail.comcast.net (qmta11.westchester.pa.mail.comcast.net [76.96.59.211]) by mx1.freebsd.org (Postfix) with ESMTP id B38878FC0C for ; Tue, 28 Sep 2010 09:30:53 +0000 (UTC)
Received: from omta03.westchester.pa.mail.comcast.net ([76.96.62.27]) by QMTA11.westchester.pa.mail.comcast.net with comcast id C9LC1f0030bG4ec5B9WtK5; Tue, 28 Sep 2010 09:30:53 +0000
Received: from koitsu.dyndns.org ([98.248.41.155]) by omta03.westchester.pa.mail.comcast.net with comcast id C9Ws1f0083LrwQ23P9Wt6F; Tue, 28 Sep 2010 09:30:53 +0000
Received: by icarus.home.lan (Postfix, from userid 1000) id 359719B418; Tue, 28 Sep 2010 02:30:51 -0700 (PDT)
Date: Tue, 28 Sep 2010 02:30:51 -0700
From: Jeremy Chadwick
To: Jurgen Weber
Message-ID: <20100928093051.GA59282@icarus.home.lan>
References: <4CA19F27.6050903@ish.com.au>
MIME-Version:
1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4CA19F27.6050903@ish.com.au>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: freebsd-stable@freebsd.org
Subject: Re: cpu timer issues
X-BeenThere: freebsd-stable@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Production branch of FreeBSD source code
X-List-Received-Date: Tue, 28 Sep 2010 09:30:54 -0000

On Tue, Sep 28, 2010 at 05:54:15PM +1000, Jurgen Weber wrote:
> Hello List
>
> We have been having issues with some firewall machines of ours using
> pfSense.
>
> FreeBSD smash01.ish.com.au 7.2-RELEASE-p5 FreeBSD 7.2-RELEASE-p5 #0:
> Sun Dec 6 23:20:31 EST 2009 sullrich@FreeBSD_7.2_pfSense_1.2.3_snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_SMP.7
> i386
>
> MotherBoard: http://www.supermicro.com/products/motherboard/Xeon3000/3200/X7SBi-LN4.cfm
>
> Originally the systems started out by showing a lot of packet loss,
> the system time would fall behind, and the value of "#vmstat -i |
> grep timer" was dropping below 2000. I was led to believe by the
> guys at pfSense that this is where the value should sit. I would
> also receive errors in messages that looked like " kernel: calcru:
> runtime went backwards from 244314 usec to 236341".
>
> We tried a variety of things, disabling USB, turning off the Intel
> Speed Step in the BIOS, disabling ACPI, etc, etc. All having little
> to no effect. The only thing that would right it is restarting the
> box but over time it would degrade again. I talked to the SuperMicro
> and they said that this is a FreeBSD issue and pretty much washed
> their hands of it.
>
> After a couple of months of dealing with this and just rebooting the
> systems regularly, the symptoms slowly but surely disappeared. eg.
> The kernel messages went away, the system time was not falling
> behind and I was experiencing no packet loss but the "#vmstat -i |
> grep timer" value would continue to decrease over time. Eventually I
> think, when it finally got to 0 the machine restarted (I am only
> guessing here).
>
> After this restart it worked again for a couple of hours and then it
> restarted again.
>
> After the second time the system has not missed a beat, it has been
> fine and the "#vmstat -i | grep timer" value remained near the 2000
> mark... We set up some zabbix monitoring to watch it. As mentioned it
> was fine for about a month. Until today. Today the value has dropped
> to 0, but the system has not restarted and over the last couple of
> hours the value has increased to 47.
>
> This machine is mission critical, we have two in a fail over
> scenario (using pfSense's CARP features) and it seems unfortunate
> that we have an issue with two brand new SuperMicro boxes that
> affect both machines. While at the moment everything seems fine I
> want to ensure that I have no further issues. Does anyone have any
> suggestions?
>
> Lastly I have double-checked both of the below:
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#CALCRU-NEGATIVE-RUNTIME
> We disabled EIST.
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#COMPUTER-CLOCK-SKEW
>
> # dmesg | grep Timecounter
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Timecounters tick every 1.000 msec
> # sysctl kern.timecounter.hardware
> kern.timecounter.hardware: i8254
>
> Only have one timer to choose from.

I have a subrevision of this motherboard in use in production, which
ran RELENG_7 and now runs RELENG_8, without any of the problems you
describe. I don't have any experience with the -LN4 submodel, although
I do have experience with the X7SBA-LN4.

Our hardware in question:

http://www.supermicro.com/products/system/1U/5015/SYS-5015B-MT.cfm

The machine in question consists of 4 disks (1 OS, 3 ZFS raidz1), uses
both NICs (two separate networks) at gigE rates, handles nightly
backups for all other servers, acts as an NFS server, a time source
(ntpd) for other servers on the network, and a serial console head.
Oh, it also has EIST enabled, and runs powerd with some minor
(well-known) tunings in loader.conf for it.

Secondly, here's our sysctl kern.timecounter tree on our system, in
addition to our SMBIOS details (proving the system is what I say it
is). Note that we have multiple timecounter choices, and ACPI-fast is
chosen. I would expect problems if i8254 was chosen, but the question
is why this is being chosen on your systems and why alternate
timecounter choices aren't available. You said you tried booting with
ACPI disabled, which might explain why ACPI-fast or ACPI-safe are
missing.

$ sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(-100) ACPI-fast(1000) i8254(0) dummy(-1000000)
kern.timecounter.hardware: ACPI-fast
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 47135
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.ACPI-fast.counter: 188736
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 1000
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 2830682562
kern.timecounter.tc.TSC.frequency: 2333508681
kern.timecounter.tc.TSC.quality: -100
kern.timecounter.smp_tsc: 0
kern.timecounter.invariant_tsc: 1

$ kenv | grep smbios
smbios.bios.reldate="07/24/2009"
smbios.bios.vendor="Phoenix Technologies LTD"
smbios.bios.version="1.30 "
smbios.chassis.maker="Supermicro"
smbios.chassis.serial="0123456789"
smbios.chassis.tag=" "
smbios.chassis.version="0123456789"
smbios.memory.enabled="8388608"
smbios.planar.maker="Supermicro"
smbios.planar.product="X7SBi"
smbios.planar.serial="0123456789"
smbios.planar.version="PCB Version"
smbios.socket.enabled="1"
smbios.socket.populated="1"
smbios.system.maker="Supermicro"
smbios.system.product="X7SBi"
smbios.system.serial="0123456789"
smbios.system.uuid="XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX"
smbios.system.version="0123456789"
smbios.version="2.5"

Thirdly, here are our BIOS settings (using BIOS 1.30, which is referred
to as "R 1.3a" on Supermicro's site):

--------------------
Supermicro SuperServer 5015B-MT BIOS Settings
=============================================
Current BIOS: 1.30
=============================================
Reset to Factory Defaults, then change:

* Main
  * Date --> Set to GMT, not local time!
  * Serial ATA --> Native Mode Operation --> Serial ATA
               --> SATA AHCI Enable --> Enabled
* Advanced
  * Boot Features --> Quiet Boot --> Disabled
  * I/O Device Configuration --> Serial port B --> Disabled
                             --> Parallel port --> Disabled
  * Console Redirection --> Com Port Address --> On-board COM A
                        --> Baud Rate --> 115.2K
                        --> Console Type --> VT100+
                        --> Continue C.R. after POST --> On (SEE NOTE #2)

NOTE #2: CR after POST
========================
If the system is running RELENG_7, ***do not*** enable this option.
The bootloader and thus kernel appear to get confused by who controls
the interrupt, and you end up without *any* serial console output,
period. RELENG_8 has addressed this problem, and you *should* enable
this feature when using that OS. This will allow you to see LAN option
ROM messages during PXE booting, or boot0 (if you use it; usually we
don't).
--------------------

Since you have two systems with the same problem, I really don't know
what to tell you.
What I can tell you is that we've run RELENG_7 and RELENG_8 on all of the following hardware without any problems: * Supermicro SuperServer 5015B-MTB http://www.supermicro.com/products/system/1U/5015/SYS-5015B-MT.cfm * Supermicro SuperServer 5015M-T+B http://www.supermicro.com/products/system/1U/5015/SYS-5015M-T_.cfm * Supermicro X7SBA http://www.supermicro.com/products/motherboard/Xeon3000/3210/X7SBA.cfm * Supermicro X7SBL-LN2 http://www.supermicro.com/products/motherboard/Xeon3000/3200/X7SBL-LN2.cfm Can you provide any tuning you do in loader.conf or sysctl.conf, as well as your kernel configuration? Otherwise, if you continue to have problems of this nature, I would strongly recommend replacing the hardware. Clock skew of this nature, at least based on what I've seen at my day/night job, is usually the sign of a crystal going bad on the motherboard. Yes, I realise you have two systems which are exhibiting the same behaviour, but for all I know a manufacturer (not Supermicro) released a batch of bad crystals into the market. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 10:07:31 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 364BC106564A for ; Tue, 28 Sep 2010 10:07:31 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 69E1B8FC0C for ; Tue, 28 Sep 2010 10:07:30 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA00785; Tue, 28 Sep 2010 13:07:22 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA1BE59.7060906@icyb.net.ua> Date: Tue, 28 Sep 2010 13:07:21 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jurgen Weber References: <4CA19F27.6050903@ish.com.au> In-Reply-To: <4CA19F27.6050903@ish.com.au> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 10:07:31 -0000 on 28/09/2010 10:54 Jurgen Weber said the following: > # dmesg | grep Timecounter > Timecounter "i8254" frequency 1193182 Hz quality 0 > Timecounters tick every 1.000 msec > # sysctl kern.timecounter.hardware > kern.timecounter.hardware: i8254 > > Only have one timer to choose from. Can you provide a little bit more of "hard" data than the above? Specifically, the following sysctls: kern.timecounter dev.cpu Output of vmstat -i. _Verbose_ boot dmesg. Please do not disable ACPI when taking this data. 
Preferably, upload it somewhere and post a link to it. -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 11:02:21 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 76F37106566B for ; Tue, 28 Sep 2010 11:02:21 +0000 (UTC) (envelope-from vf1100c@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id 080C48FC15 for ; Tue, 28 Sep 2010 11:02:20 +0000 (UTC) Received: by wyb33 with SMTP id 33so7542273wyb.13 for ; Tue, 28 Sep 2010 04:02:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:references:in-reply-to :mime-version:content-type:message-id:content-transfer-encoding:cc :x-mailer:from:subject:date:to; bh=zUiL5cH4xpCQLACPVp+MmtUItuJICLSn5qAFxViP3kA=; b=a45xQCv0gaxyzrtbHFCbf7wYHL0dSjv5veVHKrm503BnYTugIaZFO1BOzi7LvGv/Ml +6ANuNyzxRRJaXUyR7PYYfQAqg79nzofq54FDuoUqRYaFgwtMUtIa9xoY+UA8y2ggLuM 2p/hD6klBkPYYxiBXMUlrHj74X4orBTI3XzWI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=references:in-reply-to:mime-version:content-type:message-id :content-transfer-encoding:cc:x-mailer:from:subject:date:to; b=ejHxGO/tr6i0gz6bh8nKUYkumgY0EYSdemUHJlMQdAGFJvN9uK4XJWIwieSjgxthcp E/Cupopr3gGwJq6p/AgYJoAyv6jXb9vsEm4BbrhiCjbMMbi9Ij7B/+oPWi1gL9d9K0Tr Pao90OQoax1mcGxQ3W+nEOTYK6EJJTDZM4e6s= Received: by 10.227.69.195 with SMTP id a3mr7743238wbj.58.1285669987900; Tue, 28 Sep 2010 03:33:07 -0700 (PDT) Received: from [10.8.55.4] ([85.118.193.147]) by mx.google.com with ESMTPS id g9sm5845975wbh.19.2010.09.28.03.33.04 (version=TLSv1/SSLv3 cipher=RC4-MD5); Tue, 28 Sep 2010 03:33:06 -0700 (PDT) References: <4CA19F27.6050903@ish.com.au> In-Reply-To: <4CA19F27.6050903@ish.com.au> Mime-Version: 1.0 (iPhone Mail 8B117) Content-Type: text/plain; charset=us-ascii Message-Id: 
Content-Transfer-Encoding: quoted-printable X-Mailer: iPhone Mail (8B117) From: borislav nikolov Date: Tue, 28 Sep 2010 13:33:48 +0300 To: Jurgen Weber Cc: "freebsd-stable@freebsd.org" Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 11:02:21 -0000

On 28.09.2010, at 10:54, Jurgen Weber wrote:

> Hello List
>
> We have been having issues with some firewall machines of ours using pfSense.
>
> FreeBSD smash01.ish.com.au 7.2-RELEASE-p5 FreeBSD 7.2-RELEASE-p5 #0: Sun Dec 6 23:20:31 EST 2009 sullrich@FreeBSD_7.2_pfSense_1.2.3_snaps.pfsense.org:/usr/obj.pfSense/usr/pfSensesrc/src/sys/pfSense_SMP.7 i386
>
> MotherBoard: http://www.supermicro.com/products/motherboard/Xeon3000/3200/X7SBi-LN4.cfm
>
> Originally the systems started out by showing a lot of packet loss, the system time would fall behind, and the value of "#vmstat -i | grep timer" was dropping below 2000. I was led to believe by the guys at pfSense that this is where the value should sit. I would also receive errors in messages that looked like " kernel: calcru: runtime went backwards from 244314 usec to 236341".
>
> We tried a variety of things, disabling USB, turning off the Intel Speed Step in the BIOS, disabling ACPI, etc, etc. All having little to no effect. The only thing that would right it is restarting the box but over time it would degrade again. I talked to SuperMicro and they said that this is a FreeBSD issue and pretty much washed their hands of it.
>
> After a couple of months of dealing with this and just rebooting the systems regularly, the symptoms slowly but surely disappeared, e.g. the kernel messages went away, the system time was not falling behind and I was experiencing no packet loss, but the "#vmstat -i | grep timer" value would continue to decrease over time. Eventually, I think, when it finally got to 0 the machine restarted (I am only guessing here).
>
> After this restart it worked again for a couple of hours and then it restarted again.
>
> After the second time the system has not missed a beat, it has been fine and the "#vmstat -i | grep timer" value remained near the 2000 mark... We set up some zabbix monitoring to watch it. As mentioned it was fine for about a month. Until today. Today the value has dropped to 0, but the system has not restarted and over the last couple of hours the value has increased to 47.
>
> This machine is mission critical, we have two in a fail over scenario (using pfSense's CARP features) and it seems unfortunate that we have an issue with two brand new SuperMicro boxes that affects both machines. While at the moment everything seems fine I want to ensure that I have no further issues. Does anyone have any suggestions?
>
> Lastly I have double checked both of the below:
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#CALCRU-NEGATIVE-RUNTIME
> We disabled EIST.
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/faq/troubleshoot.html#COMPUTER-CLOCK-SKEW
>
> # dmesg | grep Timecounter
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Timecounters tick every 1.000 msec
> # sysctl kern.timecounter.hardware
> kern.timecounter.hardware: i8254
>
> Only have one timer to choose from.
>
> Thanks
>
> Jurgen
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"

Hello,

vmstat -i calculates the interrupt rate as interrupt count / uptime, and the interrupt count is a 32-bit integer. With high values of kern.hz it will overflow in a few days (with kern.hz=4000 it will happen every 12 days or so). If that is the case, use systat -vmstat 1 to get an accurate interrupt rate.

That is just FYI, because I was confused once and it scared me a bit, and I started changing counters until I noticed this.

p.s. please forgive my poor english

From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 11:24:38 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 622EE1065670; Tue, 28 Sep 2010 11:24:38 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from mail.digiware.nl (mail.ip6.digiware.nl [IPv6:2001:4cb8:1:106::2]) by mx1.freebsd.org (Postfix) with ESMTP id F003C8FC22; Tue, 28 Sep 2010 11:24:37 +0000 (UTC) Received: from localhost (localhost.digiware.nl [127.0.0.1]) by mail.digiware.nl (Postfix) with ESMTP id F0615153434; Tue, 28 Sep 2010 13:24:36 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from mail.digiware.nl ([127.0.0.1]) by localhost (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id d7i5zrXYXYnO; Tue, 28 Sep 2010 13:24:31 +0200 (CEST) Received: from [127.0.0.1] (opteron [192.168.10.67]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.digiware.nl (Postfix) with ESMTPSA id 08986153433; Tue, 28 Sep 2010 13:24:31 +0200 (CEST) Message-ID: <4CA1D06C.9050305@digiware.nl> Date: Tue, 28 Sep 2010 13:24:28 +0200 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows; U;
Windows NT 6.1; en-US; rv:1.9.2.9) Gecko/20100915 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: stable@freebsd.org, fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 11:24:38 -0000

Hi,

This is with stable as of yesterday, but with an untuned ZFS box I was still able to generate a kmem exhausted panic. Hard panic, just 3 lines.

The box contains 12Gb memory, runs on a 6 core (with HT) Xeon. 6* 2T WD black caviar in raidz2 with 2*512Mb mirrored log.

The box died while rsyncing 5.8T from its partnering system. (That was the only activity on the box.)

So the obvious conclusion would be that auto-tuning for ZFS on 8.1-Stable is not yet quite there.

So I guess that we still need tuning advice even for 8.1, to prevent a hard panic.

At the moment I am trying to 'zfs send | rsh zfs receive' the stuff, which seems to run at about 40Mb/sec and is a lot faster than the rsync stuff.
--WjW From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 11:50:50 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2050F106572F for ; Tue, 28 Sep 2010 11:50:50 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta07.westchester.pa.mail.comcast.net (qmta07.westchester.pa.mail.comcast.net [76.96.62.64]) by mx1.freebsd.org (Postfix) with ESMTP id C1AE58FC0A for ; Tue, 28 Sep 2010 11:50:49 +0000 (UTC) Received: from omta10.westchester.pa.mail.comcast.net ([76.96.62.28]) by qmta07.westchester.pa.mail.comcast.net with comcast id CAiv1f0040cZkys57BqqGH; Tue, 28 Sep 2010 11:50:50 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta10.westchester.pa.mail.comcast.net with comcast id CBqo1f00P3LrwQ23WBqp1Z; Tue, 28 Sep 2010 11:50:50 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 5DA489B418; Tue, 28 Sep 2010 04:50:47 -0700 (PDT) Date: Tue, 28 Sep 2010 04:50:47 -0700 From: Jeremy Chadwick To: Willem Jan Withagen Message-ID: <20100928115047.GA62142@icarus.home.lan> References: <4CA1D06C.9050305@digiware.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA1D06C.9050305@digiware.nl> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: stable@freebsd.org, fs@freebsd.org Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 11:50:50 -0000 On Tue, Sep 28, 2010 at 01:24:28PM +0200, Willem Jan Withagen wrote: > This is with stable as of yesterday,but with an un-tunned ZFS box I > was still able to generate a kmem exhausted panic. > Hard panic, just 3 lines. > > The box contains 12Gb memory, runs on a 6 core (with HT) xeon. 
> 6* 2T WD black caviar in raidz2 with 2*512Mb mirrored log. > > The box died while rsyncing 5.8T from its partnering system. > (that was the only activity on the box) It would help if you could provide output from the following commands (even after the box has rebooted): $ sysctl -a | egrep ^vm.kmem $ sysctl -a | egrep ^vfs.zfs.arc $ sysctl kstat.zfs.misc.arcstats > So the obvious would to conclude that auto-tuning voor ZFS on > 8.1-Stable is not yet quite there. > > So I guess that we still need tuning advice even for 8.1. > And thus prevent a hard panic. Andriy Gapon provides this general recommendation: http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/059114.html The advice I've given for RELENG_8 (as of the time of this writing), 8.1-STABLE, and 8.1-RELEASE, is that for amd64 you'll need to tune: vm.kmem_size vfs.zfs.arc_max An example machine: amd64, with 4GB physical RAM installed (3916MB available for use (verified via dmesg)) uses values: vm.kmem_size="4096M" vfs.zfs.arc_max="3584M" Another example machine: amd64, with 8GB physical RAM installed (7875MB available for use) uses values: vm.kmem_size="8192M" vfs.zfs.arc_max="6144M" I believe the trick -- Andriy, please correct me if I'm wrong -- is the tuning of vfs.zfs.arc_max, which is now a hard limit rather than a "high watermark". However, I believe there have been occasional reports of exhaustion panics despite both of these being set[1]. Those reports are being investigated on an individual basis. I set some other ZFS-related parameters as well (disabling prefetch, adjusting txg.timeout, etc.), but those shouldn't be necessary to gain stability at this point in time. I can't provide tuning advice for i386. [1]: http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/059109.html -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 12:16:21 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C9E1C106566B; Tue, 28 Sep 2010 12:16:21 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by mx1.freebsd.org (Postfix) with ESMTP id 81FA68FC18; Tue, 28 Sep 2010 12:16:21 +0000 (UTC) Received: from freebsd-current.sentex.ca (localhost [127.0.0.1]) by freebsd-current.sentex.ca (8.14.4/8.14.3) with ESMTP id o8SCGKam025289; Tue, 28 Sep 2010 08:16:20 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-current.sentex.ca (8.14.4/8.14.3/Submit) id o8SCGKdg025283; Tue, 28 Sep 2010 12:16:20 GMT (envelope-from tinderbox@freebsd.org) Date: Tue, 28 Sep 2010 12:16:20 GMT Message-Id: <201009281216.o8SCGKdg025283@freebsd-current.sentex.ca> X-Authentication-Warning: freebsd-current.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [releng_8_1 tinderbox] failure on powerpc/powerpc X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 12:16:21 -0000 TB --- 2010-09-28 12:06:50 - tinderbox 2.6 running on freebsd-current.sentex.ca TB --- 2010-09-28 12:06:50 - starting RELENG_8_1 tinderbox run for powerpc/powerpc TB --- 2010-09-28 12:06:50 - cleaning the object tree TB --- 2010-09-28 12:09:09 - cvsupping the source tree TB --- 2010-09-28 12:09:10 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup5.freebsd.org /tinderbox/RELENG_8_1/powerpc/powerpc/supfile TB --- 2010-09-28 12:16:20 - WARNING: /usr/bin/csup returned exit code 1 TB --- 2010-09-28 
12:16:20 - ERROR: unable to cvsup the source tree TB --- 2010-09-28 12:16:20 - 2.59 user 202.72 system 570.10 real http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8_1-powerpc-powerpc.full From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 12:22:09 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7CDBA1065672 for ; Tue, 28 Sep 2010 12:22:09 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id BEFE78FC20 for ; Tue, 28 Sep 2010 12:22:07 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id PAA03464; Tue, 28 Sep 2010 15:22:02 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA1DDE9.8090107@icyb.net.ua> Date: Tue, 28 Sep 2010 15:22:01 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> In-Reply-To: <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 12:22:09 -0000 on 28/09/2010 14:50 Jeremy Chadwick said the following: > I believe the trick -- Andriy, please correct me if I'm wrong -- is the Wouldn't hurt to CC 
me, so that I could do it :-)

> tuning of vfs.zfs.arc_max, which is now a hard limit rather than a "high
> watermark".

Not sure what you mean here. What is hard limit, what is high watermark, what is the difference and when is "now"? :-)

I believe that "the trick" is to set vm.kmem_size high enough, either using this tunable or vm.kmem_size_scale.

> However, I believe there have been occasional reports of exhaustion
> panics despite both of these being set[1]. Those reports are being
> investigated on an individual basis.

I don't believe that the report that you quote actually demonstrates what you say it does. Two quotes from it:
"During these panics no tuning or /boot/loader.conf values where present."
"Only after hitting this behaviour yesterday i created boot/loader.conf"

>
> [1]: http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/059109.html
>

--
Andriy Gapon

From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 13:23:58 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3B3D81065672 for ; Tue, 28 Sep 2010 13:23:58 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta06.emeryville.ca.mail.comcast.net (qmta06.emeryville.ca.mail.comcast.net [76.96.30.56]) by mx1.freebsd.org (Postfix) with ESMTP id 20C338FC0A for ; Tue, 28 Sep 2010 13:23:57 +0000 (UTC) Received: from omta06.emeryville.ca.mail.comcast.net ([76.96.30.51]) by qmta06.emeryville.ca.mail.comcast.net with comcast id CCPM1f00116AWCUA6DPxLX; Tue, 28 Sep 2010 13:23:57 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta06.emeryville.ca.mail.comcast.net with comcast id CDPw1f0013LrwQ28SDPwRH; Tue, 28 Sep 2010 13:23:57 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id EC1849B418; Tue, 28 Sep 2010 06:23:55 -0700 (PDT) Date: Tue, 28 Sep 2010 06:23:55 -0700 From: Jeremy Chadwick To: Andriy Gapon Message-ID:
<20100928132355.GA63149@icarus.home.lan> References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA1DDE9.8090107@icyb.net.ua> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 13:23:58 -0000

On Tue, Sep 28, 2010 at 03:22:01PM +0300, Andriy Gapon wrote:
> on 28/09/2010 14:50 Jeremy Chadwick said the following:
> > I believe the trick -- Andriy, please correct me if I'm wrong -- is the
>
> Wouldn't hurt to CC me, so that I could do it :-)
>
> > tuning of vfs.zfs.arc_max, which is now a hard limit rather than a "high
> > watermark".
>
> Not sure what you mean here.
> What is hard limit, what is high watermark, what is the difference and when is
> "now"? :-)

There was some speculation on the part of users a while back which led to this understanding. Folks were seeing actual ARC usage higher than what vfs.zfs.arc_max was set to (automatically or administratively). I believe it started here:

http://www.mailinglistarchive.com/freebsd-current@freebsd.org/msg28884.html

With the "high-water mark" statements being here:

http://www.mailinglistarchive.com/freebsd-current@freebsd.org/msg28887.html
http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2010-04/msg00129.html

The term implies that there is not an explicitly hard limit on the ARC utilisation/growth. As stated in the unix.derkeiler.com URL above, this behaviour was in fact changed. Why/when/how? I had to go digging up the commits -- this took me some time.
Here they are, labelled r197816, for RELENG_8 and RELENG_7 respectively. These were both committed on 2010/01/08 UTC:

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c#rev1.22.2.2
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c#rev1.15.2.6

In HEAD/CURRENT (yet to be MFC'd), it looks like the above code got removed on 2010/09/17 UTC, citing they should be "enforced by actual calculations of delta":

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c#rev1.46
http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c#rev1.45

So what's this "delta" code piece that's mentioned? That appears to have been committed to RELENG_8 on 2010/05/24 UTC (thus, between the above two dates):

http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c#rev1.22.2.4

(Side note: the "delta stuff" was never committed to RELENG_7 -- and that's fine. I'm pointing this out not out of retaliation or insult, but because people will almost certainly Google, find this post, and wonder if their 7.x machines might be affected.)

This situation with the ARC, and all its changes over time, is one of the reasons why I rant aggressively about the need for more communication transparency (re: what the changes actually affect). Most SAs and users don't follow commits.

> I believe that "the trick" is to set vm.kmem_size high enough, eitehr using this
> tunable or vm.kmem_size_scale.

Thanks for the clarification. I just wish I knew how vm.kmem_size_scale fit into the picture (meaning what it does, etc.). The sysctl description isn't very helpful. Again, my lack of VM knowledge...

> > However, I believe there have been occasional reports of exhaustion
> > panics despite both of these being set[1]. Those reports are being
> > investigated on an individual basis.
> > I don't believe that the report that you quote actually demonstrates what you say > it does. > Two quotes from it: > "During these panics no tuning or /boot/loader.conf values where present." > "Only after hitting this behaviour yesterday i created boot/loader.conf" > > > [1]: http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/059109.html You're right -- the report I'm quoting is not the one I thought it was. I'll see if I can dig up the correct mail/report. It could be that I'm thinking of something quite old (pre-ARC-changes (see above paragraphs)). I can barely keep track of all the changes going on. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 13:25:39 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 927601065673; Tue, 28 Sep 2010 13:25:39 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from mail.digiware.nl (mail.ip6.digiware.nl [IPv6:2001:4cb8:1:106::2]) by mx1.freebsd.org (Postfix) with ESMTP id DF9AC8FC17; Tue, 28 Sep 2010 13:25:38 +0000 (UTC) Received: from localhost (localhost.digiware.nl [127.0.0.1]) by mail.digiware.nl (Postfix) with ESMTP id 47227153434; Tue, 28 Sep 2010 15:25:37 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from mail.digiware.nl ([127.0.0.1]) by localhost (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Cri7K4Ph43q9; Tue, 28 Sep 2010 15:25:33 +0200 (CEST) Received: from [127.0.0.1] (unknown [192.168.254.10]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.digiware.nl (Postfix) with ESMTPSA id 63A24153433; Tue, 28 Sep 2010 15:25:33 +0200 (CEST) Message-ID: <4CA1ECCC.4070801@digiware.nl> Date: 
Tue, 28 Sep 2010 15:25:32 +0200 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.9) Gecko/20100915 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142@icarus.home.lan> In-Reply-To: <20100928115047.GA62142@icarus.home.lan> X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, "avg@icyb.net.ua >> Andriy Gapon" , fs@freebsd.org Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 13:25:39 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 28-9-2010 13:50, Jeremy Chadwick wrote:
> On Tue, Sep 28, 2010 at 01:24:28PM +0200, Willem Jan Withagen wrote:
>> This is with stable as of yesterday,but with an un-tunned ZFS box I
>> was still able to generate a kmem exhausted panic.
>> Hard panic, just 3 lines.
>>
>> The box contains 12Gb memory, runs on a 6 core (with HT) xeon.
>> 6* 2T WD black caviar in raidz2 with 2*512Mb mirrored log.
>>
>> The box died while rsyncing 5.8T from its partnering system.
>> (that was the only activity on the box)
>
> It would help if you could provide output from the following commands
> (even after the box has rebooted):

It is currently in the process of zfs receive of that same 5.8T.
> $ sysctl -a | egrep ^vm.kmem > $ sysctl -a | egrep ^vfs.zfs.arc > $ sysctl kstat.zfs.misc.arcstats > sysctl -a | egrep ^vm.kmem vm.kmem_size_scale: 3 vm.kmem_size_max: 329853485875 vm.kmem_size_min: 0 vm.kmem_size: 4156850176 > sysctl -a | egrep ^vfs.zfs.arc vfs.zfs.arc_meta_limit: 770777088 vfs.zfs.arc_meta_used: 33449648 vfs.zfs.arc_min: 385388544 vfs.zfs.arc_max: 3083108352 > sysctl kstat.zfs.misc.arcstats kstat.zfs.misc.arcstats.hits: 3119873 kstat.zfs.misc.arcstats.misses: 98710 kstat.zfs.misc.arcstats.demand_data_hits: 3043947 kstat.zfs.misc.arcstats.demand_data_misses: 3699 kstat.zfs.misc.arcstats.demand_metadata_hits: 67981 kstat.zfs.misc.arcstats.demand_metadata_misses: 90005 kstat.zfs.misc.arcstats.prefetch_data_hits: 121 kstat.zfs.misc.arcstats.prefetch_data_misses: 48 kstat.zfs.misc.arcstats.prefetch_metadata_hits: 7824 kstat.zfs.misc.arcstats.prefetch_metadata_misses: 4958 kstat.zfs.misc.arcstats.mru_hits: 34828 kstat.zfs.misc.arcstats.mru_ghost_hits: 21736 kstat.zfs.misc.arcstats.mfu_hits: 3077133 kstat.zfs.misc.arcstats.mfu_ghost_hits: 47605 kstat.zfs.misc.arcstats.allocated: 5507025 kstat.zfs.misc.arcstats.deleted: 5349715 kstat.zfs.misc.arcstats.stolen: 4468221 kstat.zfs.misc.arcstats.recycle_miss: 83995 kstat.zfs.misc.arcstats.mutex_miss: 231 kstat.zfs.misc.arcstats.evict_skip: 130461 kstat.zfs.misc.arcstats.evict_l2_cached: 0 kstat.zfs.misc.arcstats.evict_l2_eligible: 592200836608 kstat.zfs.misc.arcstats.evict_l2_ineligible: 11000092160 kstat.zfs.misc.arcstats.hash_elements: 20585 kstat.zfs.misc.arcstats.hash_elements_max: 150543 kstat.zfs.misc.arcstats.hash_collisions: 761847 kstat.zfs.misc.arcstats.hash_chains: 780 kstat.zfs.misc.arcstats.hash_chain_max: 6 kstat.zfs.misc.arcstats.p: 2266075295 kstat.zfs.misc.arcstats.c: 2410082200 kstat.zfs.misc.arcstats.c_min: 385388544 kstat.zfs.misc.arcstats.c_max: 3083108352 kstat.zfs.misc.arcstats.size: 2410286720 kstat.zfs.misc.arcstats.hdr_size: 7565040 kstat.zfs.misc.arcstats.data_size: 2394099200 
kstat.zfs.misc.arcstats.other_size: 8622480 kstat.zfs.misc.arcstats.l2_hits: 0 kstat.zfs.misc.arcstats.l2_misses: 0 kstat.zfs.misc.arcstats.l2_feeds: 0 kstat.zfs.misc.arcstats.l2_rw_clash: 0 kstat.zfs.misc.arcstats.l2_read_bytes: 0 kstat.zfs.misc.arcstats.l2_write_bytes: 0 kstat.zfs.misc.arcstats.l2_writes_sent: 0 kstat.zfs.misc.arcstats.l2_writes_done: 0 kstat.zfs.misc.arcstats.l2_writes_error: 0 kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0 kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0 kstat.zfs.misc.arcstats.l2_evict_reading: 0 kstat.zfs.misc.arcstats.l2_free_on_write: 0 kstat.zfs.misc.arcstats.l2_abort_lowmem: 0 kstat.zfs.misc.arcstats.l2_cksum_bad: 0 kstat.zfs.misc.arcstats.l2_io_error: 0 kstat.zfs.misc.arcstats.l2_size: 0 kstat.zfs.misc.arcstats.l2_hdr_size: 0 kstat.zfs.misc.arcstats.memory_throttle_count: 0 kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0 kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0 kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0 kstat.zfs.misc.arcstats.l2_write_in_l2: 0 kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0 kstat.zfs.misc.arcstats.l2_write_not_cacheable: 85908 kstat.zfs.misc.arcstats.l2_write_full: 0 kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0 kstat.zfs.misc.arcstats.l2_write_pios: 0 kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0 kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0 kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0 >> So the obvious would to conclude that auto-tuning voor ZFS on >> 8.1-Stable is not yet quite there. >> >> So I guess that we still need tuning advice even for 8.1. >> And thus prevent a hard panic. 
>
> Andriy Gapon provides this general recommendation:
>
> http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/059114.html
>
> The advice I've given for RELENG_8 (as of the time of this writing),
> 8.1-STABLE, and 8.1-RELEASE, is that for amd64 you'll need to tune:

Well, advice seems to vary, and the latest I understood was that
8.1-stable did not need any tuning. (The other system with a much older
kernel is tuned as to what most here are suggesting)
And I was sure led to believe that even since 8.0 panics were no longer
among us......

>
> vm.kmem_size
> vfs.zfs.arc_max

real memory = 12889096192 (12292 MB)
avail memory = 12408684544 (11833 MB)

So that prompts vm.kmem_size=18G.

From the other post:
> As to arc_max/arc_min, set them based on your needs according to general
> ZFS recommendations.

I'm seriously at a loss what general recommendations would be.

The other box has 8G
loader.conf:
vm.kmem_size="14G"  # 2* phys RAM size for ZFS perf.
vm.kmem_size_scale="1"
vfs.zfs.arc_min="1G"
vfs.zfs.arc_max="6G"

So I'd select something like 11G for arc_max on a box with 12G mem.

> I believe the trick -- Andriy, please correct me if I'm wrong -- is the
> tuning of vfs.zfs.arc_max, which is now a hard limit rather than a "high
> watermark".

> I can't provide tuning advice for i386.

This is amd64.
- --WjW -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (MingW32) iQEcBAEBAgAGBQJMoezMAAoJEP4k4K6R6rBhEScIAI/rZH5/VTmASMGyEYu4NZHU SSFo3TOSOkYPEJicd8/NgM7w7D3xgMA0Xse0fu3tQOsjX940Z6fUKvnM7LCX2OJK vvkW0LpGuKbv/9sFFvkklodjkArtRzzoptLtiCVsaYsoieRqnmYMpBxU9WFYCY2I HoRx1nMbArg2HvKPzeZjf9knnQaU6YOR/PUiFBo6YuHkDJ40noqRElewbPEiOVZz zqnUh90ZDFVdHMYNuZegOKtfSVCA1AifHR3e7+zn8jSco/+svESd7tBIxmHZWQ8u BA1AKyYVTHs+wKsTw2J7u1v8yg74HxJNyVqwPRP048Z8onoPlGgtnFCTWbl2ICU= =KiyH -----END PGP SIGNATURE----- From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 13:36:47 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1C76E106564A; Tue, 28 Sep 2010 13:36:47 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 27D078FC0C; Tue, 28 Sep 2010 13:36:45 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA04596; Tue, 28 Sep 2010 16:36:41 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA1EF69.4040402@icyb.net.ua> Date: Tue, 28 Sep 2010 16:36:41 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> In-Reply-To: <20100928132355.GA63149@icarus.home.lan> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list 
List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 13:36:47 -0000 on 28/09/2010 16:23 Jeremy Chadwick said the following: > On Tue, Sep 28, 2010 at 03:22:01PM +0300, Andriy Gapon wrote: >> on 28/09/2010 14:50 Jeremy Chadwick said the following: >>> I believe the trick -- Andriy, please correct me if I'm wrong -- is the >> >> Wouldn't hurt to CC me, so that I could do it :-) >> >>> tuning of vfs.zfs.arc_max, which is now a hard limit rather than a "high >>> watermark". >> >> Not sure what you mean here. >> What is hard limit, what is high watermark, what is the difference and when is >> "now"? :-) > > There was some speculation on the part of users a while back which lead > to this understanding. Folks were seeing actual ARC usage higher than > what vfs.zfs.arc_max was set to (automatically or administratively). I > believe it started here: > > http://www.mailinglistarchive.com/freebsd-current@freebsd.org/msg28884.html > > With the "high-water mark" statements being here: > > http://www.mailinglistarchive.com/freebsd-current@freebsd.org/msg28887.html > http://unix.derkeiler.com/Mailing-Lists/FreeBSD/stable/2010-04/msg00129.html > > The term implies that there is not an explicitly hard limit on the ARC > utilisation/growth. As stated in the unix.derkeiler.com URL above, this > behaviour was in fact changed. Why/when/how? I had to go digging up > the commits -- this took me some time. Here they are, labelled r197816, > for RELENG_8 and RELENG_7 respectively. 
These were both committed on > 2010/01/08 UTC: > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c#rev1.22.2.2 > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c#rev1.15.2.6 > > In HEAD/CURRENT (yet to be MFC'd), it looks like above code got removed > on 2010/09/17 UTC, citing they should be "enforced by actual > calculations of delta": > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c#rev1.46 > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c#rev1.45 > > So what's this "delta" code piece that's mentioned? That appears to be > have been committed to RELENG_8 on 2010/05/24 UTC (thus, between the > above two dates): > > http://www.freebsd.org/cgi/cvsweb.cgi/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c#rev1.22.2.4 > > (Side note: the "delta stuff" was never committed to RELENG_7 -- and > that's fine. I'm pointing this out not out of retaliation or insult, > but because people will almost certainly Google, find this post, and > wonder if their 7.x machines might be affected.) > > This situation with the ARC, and all its changes over time, is one of > the reasons why I rant aggressively about the need for more > communication transparency (re: what the changes actually affect). Most > SAs and users don't follow commits. Well, no time for me to dig through all that history. arc_max should be a hard limit and it is now. If it ever wasn't then it was a bug. Besides, "high watermark" is still an ambiguous term, for you it "implies" that it is not a hard limit, but for me it "implies" exactly a hard limit. Additionally, going from "non-hard limit" to a "hard limit" on ARC size should improve things memory-wise, not vice versa, right? :) P.S. 
All that I said above is a hint that this is a pointless branch of the thread :) -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 13:39:10 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 430CF10656B2; Tue, 28 Sep 2010 13:39:10 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 5AD548FC1E; Tue, 28 Sep 2010 13:39:09 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA04620; Tue, 28 Sep 2010 16:39:06 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA1EFF9.1050802@icyb.net.ua> Date: Tue, 28 Sep 2010 16:39:05 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> In-Reply-To: <20100928132355.GA63149@icarus.home.lan> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 13:39:10 -0000 on 28/09/2010 16:23 Jeremy Chadwick said the following: > On Tue, Sep 28, 2010 at 03:22:01PM +0300, Andriy Gapon wrote: >> I believe that "the trick" is to set vm.kmem_size high enough, eitehr using this >> tunable or 
vm.kmem_size_scale.
>
> Thanks for the clarification. I just wish I knew how vm.kmem_size_scale
> fit into the picture (meaning what it does, etc.). The sysctl
> description isn't very helpful. Again, my lack of VM knowledge...
>

Roughly, vm.kmem_size would get set to the available memory divided by vm.kmem_size_scale.

-- Andriy Gapon

From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 13:46:34 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 04837106566C; Tue, 28 Sep 2010 13:46:34 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 10D8B8FC08; Tue, 28 Sep 2010 13:46:32 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA04711; Tue, 28 Sep 2010 16:46:29 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA1F1B4.1020700@icyb.net.ua> Date: Tue, 28 Sep 2010 16:46:28 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Willem Jan Withagen References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142@icarus.home.lan> <4CA1ECCC.4070801@digiware.nl> In-Reply-To: <4CA1ECCC.4070801@digiware.nl> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 13:46:34 -0000

on 28/09/2010 16:25 Willem Jan Withagen said the following:
> Well, advice seems to vary, and the
latest I understood was that
> 8.1-stable did not need any tuning. (The other system with a much older
> kernel is tuned as to what most here are suggesting)
> And I was sure led to believe that even since 8.0 panics were no longer
> among us......

Well, now you have demonstrated for yourself that it is not always so.

>> vm.kmem_size
>> vfs.zfs.arc_max
>
> real memory = 12889096192 (12292 MB)
> avail memory = 12408684544 (11833 MB)
>
> So that prompts vm.kmem_size=18G.
>
> From the other post:
>> As to arc_max/arc_min, set them based on your needs according to general
>> ZFS recommendations.
>
> I'm seriously at a loss what general recommendations would be.

Have you asked Mr. Google? :)
- http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
  Search for "Memory and Dynamic Reconfiguration Recommendation"
- http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Limiting_the_ARC_Cache

Short version - decide how much memory you need for everything else but ZFS ARC.
If the autotuned value suits you, then you don't need to change anything.

> The other box has 8G
> loader.conf:
> vm.kmem_size="14G"  # 2* phys RAM size for ZFS perf.
> vm.kmem_size_scale="1"

No need to set both of the above.
vm.kmem_size overrides vm.kmem_size_scale.

> vfs.zfs.arc_min="1G"
> vfs.zfs.arc_max="6G"
>
> So I'd select something like 11G for arc_max on a box with 12G mem.
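The precedence between the two tunables can be sketched in a few lines of Python. This is only an illustration of the rule stated in this thread (an explicit vm.kmem_size wins; otherwise the size is roughly the available memory divided by vm.kmem_size_scale). The helper name approx_kmem_size is made up for this example, and the sketch deliberately ignores vm.kmem_size_min, vm.kmem_size_max and any other adjustment the kernel may apply, so the result is an approximation rather than what sysctl will report.

```python
GIB = 1 << 30

def approx_kmem_size(avail_mem, kmem_size=None, kmem_size_scale=3):
    """Hypothetical helper: approximate the resulting vm.kmem_size in bytes.

    Assumption: no clamping by vm.kmem_size_min / vm.kmem_size_max.
    """
    if kmem_size is not None:
        # An explicit vm.kmem_size overrides vm.kmem_size_scale.
        return kmem_size
    # Otherwise: roughly the available memory divided by the scale factor.
    return avail_mem // kmem_size_scale

# 12G box from this thread, no explicit vm.kmem_size, reported scale of 3.
print(approx_kmem_size(12408684544))

# 8G box: once vm.kmem_size="14G" is set, the scale setting is irrelevant.
print(approx_kmem_size(8 * GIB, kmem_size=14 * GIB, kmem_size_scale=1) // GIB)
```

For the 12G box the rough formula lands in the same ballpark as the 4156850176 bytes that sysctl reported for vm.kmem_size; an exact match is not expected from a rule of thumb.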
-- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 14:02:33 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0DD92106567A; Tue, 28 Sep 2010 14:02:33 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from mail.digiware.nl (mail.ip6.digiware.nl [IPv6:2001:4cb8:1:106::2]) by mx1.freebsd.org (Postfix) with ESMTP id 721E38FC15; Tue, 28 Sep 2010 14:02:32 +0000 (UTC) Received: from localhost (localhost.digiware.nl [127.0.0.1]) by mail.digiware.nl (Postfix) with ESMTP id C4C7515346A; Tue, 28 Sep 2010 16:02:31 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from mail.digiware.nl ([127.0.0.1]) by localhost (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id WAt0f0YQumbi; Tue, 28 Sep 2010 16:02:25 +0200 (CEST) Received: from [127.0.0.1] (unknown [192.168.254.10]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.digiware.nl (Postfix) with ESMTPSA id 9B7D215346C; Tue, 28 Sep 2010 16:02:25 +0200 (CEST) Message-ID: <4CA1F570.6000602@digiware.nl> Date: Tue, 28 Sep 2010 16:02:24 +0200 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.9) Gecko/20100915 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Andriy Gapon References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142@icarus.home.lan> <4CA1ECCC.4070801@digiware.nl> <4CA1F1B4.1020700@icyb.net.ua> In-Reply-To: <4CA1F1B4.1020700@icyb.net.ua> X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: 
, X-List-Received-Date: Tue, 28 Sep 2010 14:02:33 -0000

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 28-9-2010 15:46, Andriy Gapon wrote:
> on 28/09/2010 16:25 Willem Jan Withagen said the following:
>> Well, advice seems to vary, and the latest I understood was that
>> 8.1-stable did not need any tuning. (The other system with a much
>> older kernel is tuned as to what most here are suggesting) And I
>> was sure led to believe that even since 8.0 panics were no longer
>> among us......
>
> Well, now you have demonstrated yourself that it is not always so.

I thought I should share the knowledge. ;)
Which is not a bad thing for those who (are starting to) use ZFS.

I do not read commits, but I do read a lot of FreeBSD groups.
And for me there is still a shroud of black art over ZFS.
Just glad that my main fileserver doesn't crash (knock on wood).

>>> vm.kmem_size vfs.zfs.arc_max
>>
>> real memory = 12889096192 (12292 MB) avail memory = 12408684544
>> (11833 MB)
>>
>> So that prompts vm.kmem_size=18G.
>>
>> From the other post:
>>> As to arc_max/arc_min, set them based on your needs according to
>>> general ZFS recommendations.
>>
>> I'm seriously at a loss what general recommendations would be.
>
> Have you asked Mr. Google? :) -
> http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide
>
> Search for "Memory and Dynamic Reconfiguration Recommendation"
> -
> http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Limiting_the_ARC_Cache
>
> Short version - decide how much memory you need for everything else
> but ZFS ARC.
> If autotuned value suits you, then you don't need to change
> anything.

I do have (and have read) this document, but it still doesn't really give you
guidelines for tuning on FreeBSD. It is a fileserver without any serious
other apps.
I was using "auto-tuned", and that crashed my box. That is what started
this whole thread.
- --WjW -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (MingW32) iQEcBAEBAgAGBQJMofVwAAoJEP4k4K6R6rBhFaUH/3wahrGWO71+xBhHi/ayNoaf DfbOWMD262XfualJudPRgoji7xb9lGaRmd4emv7QBcDjqzmcsiyIeXskT5IYKj7P DvJDULIH66iKQrRZeIBouMXMhLfiLjjT85Lj1hE8fuGg8NAOv97dnUwvVIwC0/Ai yzeeEHYivCYbRmzBhISlAWjdpSXk7xVs6gZnaLUUp953+Uv/8KmNLeG+laoWn+Hn wdKHUG3kR0g/XwJIMc5dZzYvs2kdDPh47uLythoYGC0yaLCwtxLHqEGIPtb/Gypy nIIWxOGtueJo2HjpS0+HlX/pTRW8tfYzXTzKgFKDd90t9fDt2p18BPSexuJSLVc= =hSAg -----END PGP SIGNATURE----- From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 14:07:34 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D1CBA106566C; Tue, 28 Sep 2010 14:07:34 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id C6EA28FC1A; Tue, 28 Sep 2010 14:07:33 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA05060; Tue, 28 Sep 2010 17:07:29 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA1F6A0.20109@icyb.net.ua> Date: Tue, 28 Sep 2010 17:07:28 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Willem Jan Withagen References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142@icarus.home.lan> <4CA1ECCC.4070801@digiware.nl> <4CA1F1B4.1020700@icyb.net.ua> <4CA1F570.6000602@digiware.nl> In-Reply-To: <4CA1F570.6000602@digiware.nl> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD 
source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 14:07:34 -0000 on 28/09/2010 17:02 Willem Jan Withagen said the following: > I do have (read) this document, but still that doesn't really give you > guidelines for tuning on FreeBSD. It is a fileserver without any serious > other apps. > I was using "auto-tuned", and that crashed my box. That is what started > this whole thread. Well, as I've said, in my opinion FreeBSD-specific tuning ends at setting kmem size. -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 14:09:06 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0B4DE106564A; Tue, 28 Sep 2010 14:09:06 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from mail.digiware.nl (mail.ip6.digiware.nl [IPv6:2001:4cb8:1:106::2]) by mx1.freebsd.org (Postfix) with ESMTP id 7A4AC8FC25; Tue, 28 Sep 2010 14:09:05 +0000 (UTC) Received: from localhost (localhost.digiware.nl [127.0.0.1]) by mail.digiware.nl (Postfix) with ESMTP id D8026153434; Tue, 28 Sep 2010 16:09:04 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from mail.digiware.nl ([127.0.0.1]) by localhost (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id NQjP-WbBQvEU; Tue, 28 Sep 2010 16:09:02 +0200 (CEST) Received: from [127.0.0.1] (unknown [192.168.254.10]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.digiware.nl (Postfix) with ESMTPSA id 1AE54153433; Tue, 28 Sep 2010 16:09:02 +0200 (CEST) Message-ID: <4CA1F6FD.5090807@digiware.nl> Date: Tue, 28 Sep 2010 16:09:01 +0200 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.9) Gecko/20100915 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Andriy Gapon References: <4CA1D06C.9050305@digiware.nl> 
<20100928115047.GA62142@icarus.home.lan> <4CA1ECCC.4070801@digiware.nl> <4CA1F1B4.1020700@icyb.net.ua> <4CA1F570.6000602@digiware.nl> <4CA1F6A0.20109@icyb.net.ua> In-Reply-To: <4CA1F6A0.20109@icyb.net.ua> X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 14:09:06 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 28-9-2010 16:07, Andriy Gapon wrote: > on 28/09/2010 17:02 Willem Jan Withagen said the following: >> I do have (read) this document, but still that doesn't really give you >> guidelines for tuning on FreeBSD. It is a fileserver without any serious >> other apps. >> I was using "auto-tuned", and that crashed my box. That is what started >> this whole thread. > > Well, as I've said, in my opinion FreeBSD-specific tuning ends at setting kmem size. > I consider that a useful statement. 
- --WjW -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (MingW32) iQEcBAEBAgAGBQJMofb9AAoJEP4k4K6R6rBhqaUH/iFd1GG/pGLEKY+savwCRQDA iitWtiBnUVfscP3Cfy81Mrg0m3SNik+lgRD2ywC03jsE+6sJbExuw52G46RjpExc EleJZTW74KvbLHBnVQd+gWUoULKfGx4sZSBuYlkFpANhbrucpYmyPftbpFzmpD7N IOeeY6H7iOa4vnb03DLYY0iErL+ak8NtiSKqYTLYqDA/UWqVfOsvdcRbywrMIOoV JoaoD+65ZQpFYkugiFr7/BtcxXA9GJNpsUI+vIADbDgr77XmhKfu0ky4/Ci5f/L9 8YbEzhobOtRBTjX4/JAl60ZC2ToPwyZ8F4Al7Kj8r7FJnpnhddw7XlVXqEouJxQ= =X2gD -----END PGP SIGNATURE----- From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 14:25:44 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C0A471065695; Tue, 28 Sep 2010 14:25:44 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id CC02B8FC13; Tue, 28 Sep 2010 14:25:43 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA05445; Tue, 28 Sep 2010 17:25:39 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA1FAE3.9090200@icyb.net.ua> Date: Tue, 28 Sep 2010 17:25:39 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Willem Jan Withagen References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142@icarus.home.lan> <4CA1ECCC.4070801@digiware.nl> <4CA1F1B4.1020700@icyb.net.ua> <4CA1F570.6000602@digiware.nl> <4CA1F6A0.20109@icyb.net.ua> <4CA1F6FD.5090807@digiware.nl> In-Reply-To: <4CA1F6FD.5090807@digiware.nl> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org 
X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 14:25:44 -0000

on 28/09/2010 17:09 Willem Jan Withagen said the following:
> On 28-9-2010 16:07, Andriy Gapon wrote:
>> on 28/09/2010 17:02 Willem Jan Withagen said the following:
>>> I do have (read) this document, but still that doesn't really give you
>>> guidelines for tuning on FreeBSD. It is a fileserver without any serious
>>> other apps.
>>> I was using "auto-tuned", and that crashed my box. That is what started
>>> this whole thread.
>
>> Well, as I've said, in my opinion FreeBSD-specific tuning ends at setting kmem size.
>
>
> I consider that a useful statement.

Hm, looks like I've just given bad advice.
It seems that the auto-tuned arc_max is based on kmem size.
So if you use a kmem size that is larger than the available physical memory, then you
had better limit arc_max to the available memory minus 1GB or so, if the autotuned
value is larger than that.

I think this needs to be fixed in the code.
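As a worked illustration of the rule of thumb above, the following Python sketch computes a candidate vfs.zfs.arc_max. It is a sketch under stated assumptions, not the kernel's logic: the function name suggested_arc_max is hypothetical, the 1 GiB headroom is the "1GB or so" from the message, and the auto-tuned value is assumed to be kmem size minus 1 GiB, which is what the sysctl output posted earlier in this thread happens to show (4156850176 - 1073741824 = 3083108352).

```python
GIB = 1 << 30

def suggested_arc_max(kmem_size, phys_mem, autotuned_arc_max, headroom=GIB):
    """Hypothetical helper: apply the 'physical memory minus ~1GB' cap.

    The cap is applied only when kmem_size exceeds physical memory and
    the auto-tuned arc_max is larger than the cap.
    """
    cap = phys_mem - headroom
    if kmem_size > phys_mem and autotuned_arc_max > cap:
        return cap
    return autotuned_arc_max

# Box from this thread: 12G of physical memory, vm.kmem_size raised to 17G.
kmem = 17 * GIB
phys = 12 * GIB
autotuned = kmem - GIB  # assumption: auto-tuned arc_max = kmem size - 1 GiB

print(suggested_arc_max(kmem, phys, autotuned) // GIB)  # prints 11
```

With nominal sizes this yields 11G. When kmem size is left below physical memory, as in the auto-tuned output posted earlier, the function returns the auto-tuned value unchanged.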
-- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 14:30:28 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BFA661065675; Tue, 28 Sep 2010 14:30:28 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from mail.digiware.nl (mail.ip6.digiware.nl [IPv6:2001:4cb8:1:106::2]) by mx1.freebsd.org (Postfix) with ESMTP id 395C28FC13; Tue, 28 Sep 2010 14:30:28 +0000 (UTC) Received: from localhost (localhost.digiware.nl [127.0.0.1]) by mail.digiware.nl (Postfix) with ESMTP id 09F23153433; Tue, 28 Sep 2010 16:30:27 +0200 (CEST) X-Virus-Scanned: amavisd-new at digiware.nl Received: from mail.digiware.nl ([127.0.0.1]) by localhost (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id I07McwiaIwDQ; Tue, 28 Sep 2010 16:30:24 +0200 (CEST) Received: from [127.0.0.1] (unknown [192.168.254.10]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.digiware.nl (Postfix) with ESMTPSA id 87AB3153435; Tue, 28 Sep 2010 16:30:23 +0200 (CEST) Message-ID: <4CA1FBFE.3020107@digiware.nl> Date: Tue, 28 Sep 2010 16:30:22 +0200 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.9) Gecko/20100915 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Andriy Gapon References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142@icarus.home.lan> <4CA1ECCC.4070801@digiware.nl> <4CA1F1B4.1020700@icyb.net.ua> <4CA1F570.6000602@digiware.nl> <4CA1F6A0.20109@icyb.net.ua> <4CA1F6FD.5090807@digiware.nl> <4CA1FAE3.9090200@icyb.net.ua> In-Reply-To: <4CA1FAE3.9090200@icyb.net.ua> X-Enigmail-Version: 1.1.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list 
List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 14:30:28 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 28-9-2010 16:25, Andriy Gapon wrote: > on 28/09/2010 17:09 Willem Jan Withagen said the following: >> On 28-9-2010 16:07, Andriy Gapon wrote: >>> on 28/09/2010 17:02 Willem Jan Withagen said the following: >>>> I do have (read) this document, but still that doesn't really give you >>>> guidelines for tuning on FreeBSD. It is a fileserver without any serious >>>> other apps. >>>> I was using "auto-tuned", and that crashed my box. That is what started >>>> this whole thread. >> >>> Well, as I've said, in my opinion FreeBSD-specific tuning ends at setting kmem size. >> >> >> I consider that a useful statement. > > Hm, looks like I've just given a bad advice. > It seems that auto-tuned arc_max is based on kmem size. > So if you use kmem size that is larger than available physical memory, then you > better limit arc_max to the available memory minus 1GB or so, if the autotuned > value is larger than that. > > I think this needs to be fixed in the code. 
So in my case (no other serious apps) with 12G phys mem: vm.kmem_size=17G vfs.zfs.arc_max=11G - --WjW -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (MingW32) iQEcBAEBAgAGBQJMofv9AAoJEP4k4K6R6rBhrksH/0L7EP9oSi4hhITZTB0uIk8q 0IEKnc2ltnPUSFJXS9wP1r9iLzNFJJXGqrO1ZvZUFcJeXXwSzSjhD+zbd237yf/r f5nQ7yBNPd7MxZlZjDkIXB9ZJYuE1u0KMfuQSxptzOWB7oin8MpXHa1YdX6CVE7A 3+hSykteHFFqs8qwUSzoUs47r0dW2WxXE2qAEurelL6VFn++K86d32F5WNv/SX4u aN43r+/CgrjiJVNrxG+gchoicEnIaI90jepkjzpEMp8M85VF4skIZbflZrSSNheY Wzi4LD2h8dFf/La+9EB5AYkMgRcTvXcgNkppIsZ94nf7oSyYNZFuxLYC3ilQetY= =WYzV -----END PGP SIGNATURE----- From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 14:32:19 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BE0AE106564A; Tue, 28 Sep 2010 14:32:19 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id AE7CA8FC14; Tue, 28 Sep 2010 14:32:18 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id RAA05575; Tue, 28 Sep 2010 17:32:13 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA1FC6D.1060000@icyb.net.ua> Date: Tue, 28 Sep 2010 17:32:13 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Willem Jan Withagen References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142@icarus.home.lan> <4CA1ECCC.4070801@digiware.nl> <4CA1F1B4.1020700@icyb.net.ua> <4CA1F570.6000602@digiware.nl> <4CA1F6A0.20109@icyb.net.ua> <4CA1F6FD.5090807@digiware.nl> <4CA1FAE3.9090200@icyb.net.ua> <4CA1FBFE.3020107@digiware.nl> In-Reply-To: <4CA1FBFE.3020107@digiware.nl> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: 
stable@freebsd.org, fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 14:32:19 -0000 on 28/09/2010 17:30 Willem Jan Withagen said the following: > So in my case (no other serious apps) with 12G phys mem: > > vm.kmem_size=17G > vfs.zfs.arc_max=11G > Should be good. -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 14:40:01 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 89A39106566C for ; Tue, 28 Sep 2010 14:40:01 +0000 (UTC) (envelope-from sterling@camdensoftware.com) Received: from wh2.interactivevillages.com (wh2.interactivevillages.com [75.125.250.34]) by mx1.freebsd.org (Postfix) with ESMTP id 4C03B8FC14 for ; Tue, 28 Sep 2010 14:40:01 +0000 (UTC) Received: from 97-126-31-96.tukw.qwest.net ([97.126.31.96] helo=_HOSTNAME_) by wh2.interactivevillages.com with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.69) (envelope-from ) id 1P0bDo-0001ZT-QB for stable@FreeBSD.org; Tue, 28 Sep 2010 07:31:57 -0700 Received: by _HOSTNAME_ (sSMTP sendmail emulation); Tue, 28 Sep 2010 07:39:55 -0700 Date: Tue, 28 Sep 2010 07:39:55 -0700 From: Chip Camden To: stable@FreeBSD.org Message-ID: <20100928143955.GA33940@libertas.local.camdensoftware.com> Mail-Followup-To: stable@FreeBSD.org References: <201009280425.o8S4PA0d058131@gw.catspoiler.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="GvXjxJ+pjyke8COw" Content-Disposition: inline In-Reply-To: <201009280425.o8S4PA0d058131@gw.catspoiler.org> User-Agent: Mutt/1.4.2.3i Company: Camden Software Consulting URL: http://camdensoftware.com X-PGP-Key: 
http://pgp.mit.edu:11371/pks/lookup?search=0xD6DBAF91 X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - wh2.interactivevillages.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - camdensoftware.com Cc: Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 14:40:01 -0000

--GvXjxJ+pjyke8COw
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

Quoth Don Lewis on Monday, 27 September 2010:
> CPU time accounting is broken on one of my machines running 8-STABLE. I
> ran a test with a simple program that just loops and consumes CPU time:
>
> % time ./a.out
> 94.544u 0.000s 19:14.10 8.1% 62+2054k 0+0io 0pf+0w
>
> The display in top shows the process with WCPU at 100%, but TIME
> increments very slowly.
>
> Several hours after booting, I got a bunch of "calcru: runtime went
> backwards" messages, but they stopped right away and never appeared
> again.
>=20 > Aug 23 13:40:07 scratch ntpd[1159]: ntpd 4.2.4p5-a (1) > Aug 23 13:43:18 scratch ntpd[1160]: kernel time sync status change 2001 > Aug 23 18:05:57 scratch dbus-daemon: [system] Reloaded configuration > Aug 23 18:06:16 scratch dbus-daemon: [system] Reloaded configuration > Aug 23 18:12:40 scratch ntpd[1160]: time reset +18.059948 s > [snip] > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 68366= 85136 usec to 5425839798 usec for pid 1526 (csh) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 4747 = usec to 2403 usec for pid 1519 (csh) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 5265 = usec to 2594 usec for pid 1494 (hald-addon-mouse-sy) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 7818 = usec to 3734 usec for pid 1488 (console-kit-daemon) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 977 u= sec to 459 usec for pid 1480 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 958 u= sec to 450 usec for pid 1479 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 957 u= sec to 449 usec for pid 1478 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 952 u= sec to 447 usec for pid 1477 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 959 u= sec to 450 usec for pid 1476 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 975 u= sec to 458 usec for pid 1475 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1026 = usec to 482 usec for pid 1474 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1333 = usec to 626 usec for pid 1473 (getty) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 2469 = usec to 1160 usec for pid 1440 (inetd) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 719 u= sec to 690 usec for pid 1402 (sshd) > Aug 23 
23:49:06 scratch kernel: calcru: runtime went backwards from 12048= 6 usec to 56770 usec for pid 1360 (cupsd) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 6204 = usec to 2914 usec for pid 1289 (dbus-daemon) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 179 u= sec to 84 usec for pid 1265 (moused) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 22156= usec to 10407 usec for pid 1041 (nfsd) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1292 = usec to 607 usec for pid 1032 (mountd) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 8801 = usec to 4134 usec for pid 664 (devd) > Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 19 us= ec to 9 usec for pid 9 (sctp_iterator) >=20 >=20 > If I reboot and run the test again, the CPU time accounting seems to be > working correctly. > % time ./a.out > 1144.226u 0.000s 19:06.62 99.7% 5+168k 0+0io 0pf+0w >=20 I notice that before the calcru messages, ntpd reset the clock by 18 seconds -- that probably accounts for that. I don't know if that has any connection to time(1) running slower -- but perhaps ntpd is aggressively adjusting your clock? 
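
For reference, the percentage column in the two time(1) lines quoted above can be reproduced as (user + system) CPU time divided by wall-clock time. The following is only a rough sketch in Python; the helper name cpu_percent is made up for illustration and is not part of any FreeBSD tool, and the inputs are simply the figures from the quoted output.

```python
def cpu_percent(user_s, sys_s, elapsed):
    """Return CPU utilisation in percent from csh time(1)-style fields.

    `elapsed` is the wall-clock field as printed, e.g. "19:14.10"
    (minutes:seconds). An hours field, if present, is handled the same way.
    This helper is hypothetical and only mirrors the arithmetic.
    """
    wall = 0.0
    for part in elapsed.split(":"):
        wall = wall * 60 + float(part)   # fold h:m:s into seconds
    return 100.0 * (user_s + sys_s) / wall


# Figures taken from the two quoted runs of ./a.out.
broken = cpu_percent(94.544, 0.000, "19:14.10")
healthy = cpu_percent(1144.226, 0.000, "19:06.62")

print(f"broken run:  {broken:.2f}%")    # about 8.19%
print(f"healthy run: {healthy:.2f}%")   # about 99.79%
```

Both results are in line with the 8.1% and 99.7% that time(1) printed, so the percentage is at least internally consistent with the (too small) user time that was accounted on the broken run.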
--=20 Sterling (Chip) Camden | sterling@camdensoftware.com | 2048D/3A978E4F http://camdensoftware.com | http://chipstips.com | http://chipsquips= .com --GvXjxJ+pjyke8COw Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (FreeBSD) iQEcBAEBAgAGBQJMof47AAoJEIpckszW26+R6rsIAKXPA1hFi5teoxzc4L0qqZaH zZN1N/WW0hmZrkxi4XjgSg9v5J/8AEIdI49+4VTZpjKzLQ1bSdVgLA+5IAw8MTpk dqHmlEtZftX7Gg52WQzwrGZVtuI0+jHR399o8rl0oOs36m0UK6wpx2KrbmvTjvnS Q9tTsQOwXlWr7/8F37Kr3fHNnLaRbaw1Ga6RwbUN9j+b3j4BdjftAg0j6zq19b0o IZqfySIj8ur4TkrS6HMSut6yDr1qBuOQ+ntEuLMAx+kUt6H+FmdsPrwC/wvtYxbn 7Q5zxjlpc0QJciinkjZ2ZQmWL/xZ/cPCxrL49vwTkkV70CFxpdUBSQuiyG+xF9k= =Kry2 -----END PGP SIGNATURE----- --GvXjxJ+pjyke8COw-- From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 16:24:48 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A33D21065674; Tue, 28 Sep 2010 16:24:48 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from mail.wanderview.com (mail.wanderview.com [66.92.166.102]) by mx1.freebsd.org (Postfix) with ESMTP id 297328FC1B; Tue, 28 Sep 2010 16:24:47 +0000 (UTC) Received: from xykon.in.wanderview.com (xykon.in.wanderview.com [10.76.10.152]) (authenticated bits=0) by mail.wanderview.com (8.14.4/8.14.4) with ESMTP id o8SFo5c9027002 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Tue, 28 Sep 2010 15:50:06 GMT (envelope-from ben@wanderview.com) Mime-Version: 1.0 (Apple Message framework v1081) Content-Type: text/plain; charset=us-ascii From: Ben Kelly In-Reply-To: <4CA1EF69.4040402@icyb.net.ua> Date: Tue, 28 Sep 2010 11:50:05 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> To: 
Andriy Gapon X-Mailer: Apple Mail (2.1081) X-Spam-Score: -1.01 () ALL_TRUSTED,T_RP_MATCHES_RCVD X-Scanned-By: MIMEDefang 2.67 on 10.76.20.1 Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 16:24:48 -0000

On Sep 28, 2010, at 9:36 AM, Andriy Gapon wrote:
> Well, no time for me to dig through all that history.
> arc_max should be a hard limit and it is now. If it ever wasn't then it was a bug.

I believe the size of the arc could exceed the limit if your working set was larger than arc_max. The arc can't (couldn't then, anyway) evict data that is still referenced.

A contributing factor at the time was that the page daemon did not take into account back pressure from the arc when deciding which pages to move from active to inactive, etc. So data was more likely to be referenced and therefore forced to remain in the arc.

I'm not sure if this is still the current state. I seem to remember some changesets mentioning arc back pressure at some point, but I don't know the details.

- Ben= From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 16:30:16 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DC0DE1065694; Tue, 28 Sep 2010 16:30:16 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id E12CE8FC17; Tue, 28 Sep 2010 16:30:15 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id TAA07743; Tue, 28 Sep 2010 19:30:02 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA21809.7090504@icyb.net.ua> Date: Tue, 28 Sep 2010 19:30:01 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Ben Kelly References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> In-Reply-To: X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 16:30:17 -0000 on 28/09/2010 18:50 Ben Kelly said the following: > > On Sep 28, 2010, at 9:36 AM, Andriy Gapon wrote: >> Well, no time for me to dig through all that history. arc_max should be a >> hard limit and it is now. If it ever wasn't then it was a bug. 
>
> I believe the size of the arc could exceed the limit if your working set was
> larger than arc_max. The arc can't (couldn't then, anyway) evict data that is
> still referenced.

I think that you are correct and I was wrong.
ARC would still allocate a new buffer even if it's at or above arc_max and cannot
re-use any existing buffer.
But I think that this is more likely to happen with a "tiny" ARC size. I have a hard
time imagining a workload in which gigabytes of data would be simultaneously and
continuously used (see below for definition of "used").

> A contributing factor at the time was that the page daemon did not take into
> account back pressure from the arc when deciding which pages to move from
> active to inactive, etc. So data was more likely to be referenced and
> therefore forced to remain in the arc.

I don't think that this is what happened and I don't think that pagedaemon has
anything to do with the discussed issue.
I think that ARC buffers exist independently of pagedaemon and page cache.
I think that they are held only during the time when I/O is happening to or from them.

> I'm not sure if this is still the current state. I seem to remember some
> changesets mentioning arc back pressure at some point, but I don't know the
> details.

I think that backpressure has nothing to do with it.
If ZFS truly does I/O with all existing buffers and it needs a new buffer, then
the choices are limited: either block and wait, or go over the limit.
Apparently ZFS designers went with the latter option.
But as I've said, for non-tiny ARC sizes it's hard to imagine such an amount of
parallel I/O that it would tie up all ARC buffers.
Given the adaptive nature of ARC I can still see it happening, but only when ARC
size is near its minimum, not when it is at maximum.

It seems that kstat.zfs.misc.arcstats.recycle_miss is a counter of allocations
when ARC refused to grow and no existing buffer could be recycled, but this is
not the same as going above ARC maximum size.
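
To make the "block and wait, or go over the limit" choice concrete, here is a deliberately simplified toy model in Python. It is NOT the logic of arc_get_data_buf(); the class, its fields and the counters are all invented for illustration, and the real ARC tracks bytes, lists and states rather than a flat list of buffers. It only shows the policy described above: reuse an idle buffer if one exists, otherwise allocate past the limit instead of blocking.

```python
class ToyCache:
    """Toy buffer cache with a soft limit (hypothetical, not ZFS code)."""

    def __init__(self, limit):
        self.limit = limit            # plays the role of arc_max, in buffers
        self.buffers = []             # each buffer is a dict: {"busy": bool}
        self.recycled = 0             # times an idle buffer was reused
        self.over_limit_allocs = 0    # times we grew past the limit

    def get_buf(self):
        # Below the limit: just allocate.
        if len(self.buffers) < self.limit:
            buf = {"busy": True}
            self.buffers.append(buf)
            return buf
        # At or above the limit: try to recycle a buffer nobody is using.
        for buf in self.buffers:
            if not buf["busy"]:
                buf["busy"] = True
                self.recycled += 1
                return buf
        # Every buffer is in use: do not block, go over the limit.
        buf = {"busy": True}
        self.buffers.append(buf)
        self.over_limit_allocs += 1
        return buf


cache = ToyCache(limit=2)
a = cache.get_buf()
b = cache.get_buf()
c = cache.get_buf()               # both buffers busy, so the cache grows to 3
print(len(cache.buffers), cache.over_limit_allocs)

a["busy"] = False                 # I/O on the first buffer has finished
d = cache.get_buf()               # now an idle buffer can be recycled
print(d is a, cache.recycled, len(cache.buffers))
```

In this toy, the cache only exceeds its limit while every buffer is simultaneously in use, which is why the effect should be easier to hit with a small limit than with a large one.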
BTW, such allocation over the limit could be considered as a form of memory pressure from ARC on the rest of the system. P.S. The code is in arc_get_data_buf(). -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 16:46:44 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2144C106566C; Tue, 28 Sep 2010 16:46:44 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from mail.wanderview.com (mail.wanderview.com [66.92.166.102]) by mx1.freebsd.org (Postfix) with ESMTP id 9C4638FC1E; Tue, 28 Sep 2010 16:46:43 +0000 (UTC) Received: from xykon.in.wanderview.com (xykon.in.wanderview.com [10.76.10.152]) (authenticated bits=0) by mail.wanderview.com (8.14.4/8.14.4) with ESMTP id o8SGkc6j027489 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Tue, 28 Sep 2010 16:46:39 GMT (envelope-from ben@wanderview.com) Mime-Version: 1.0 (Apple Message framework v1081) Content-Type: text/plain; charset=us-ascii From: Ben Kelly In-Reply-To: <4CA21809.7090504@icyb.net.ua> Date: Tue, 28 Sep 2010 12:46:39 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: <71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <4CA21809.7090504@icyb.net.ua> To: Andriy Gapon X-Mailer: Apple Mail (2.1081) X-Spam-Score: -1.01 () ALL_TRUSTED,T_RP_MATCHES_RCVD X-Scanned-By: MIMEDefang 2.67 on 10.76.20.1 Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Tue, 28 Sep 2010 16:46:44 -0000

On Sep 28, 2010, at 12:30 PM, Andriy Gapon wrote:

> on 28/09/2010 18:50 Ben Kelly said the following:
>>
>> On Sep 28, 2010, at 9:36 AM, Andriy Gapon wrote:
>>> Well, no time for me to dig through all that history. arc_max should be a
>>> hard limit and it is now. If it ever wasn't then it was a bug.
>>
>> I believe the size of the arc could exceed the limit if your working set was
>> larger than arc_max. The arc can't (couldn't then, anyway) evict data that is
>> still referenced.
>
> I think that you are correct and I was wrong.
> ARC would still allocate a new buffer even if it's at or above arc_max and can not
> re-use any existing buffer.
> But I think that this is more likely to happen with "tiny" ARC size. I have hard
> time imagining a workload at which gigabytes of data would be simultaneously and
> continuously used (see below for definition of "used").
>
>> A contributing factor at the time was that the page daemon did not take into
>> account back pressure from the arc when deciding which pages to move from
>> active to inactive, etc. So data was more likely to be referenced and
>> therefore forced to remain in the arc.
>
> I don't think that this is what happened and I don't think that pagedaemon has
> anything to do with the discussed issue.
> I think that ARC buffers exist independently of pagedaemon and page cache.
> I think that they are held only during time when I/O is happening to or from them.

Hmm. My server is currently idle with no I/O happening:

kstat.zfs.misc.arcstats.c: 25165824
kstat.zfs.misc.arcstats.c_max: 46137344
kstat.zfs.misc.arcstats.size: 91863156

If what you say is true, this shouldn't happen, should it? This system is an i386 machine with kmem max at 800M and arc set to 40M. This is running head from April 6, 2010, so it is a bit old, though.

At one point I had patches running on my system that triggered the pagedaemon based on arc load and it did allow me to keep my arc below the max. Or at least I thought it did.

In any case, I've never really been able to wrap my head around the VFS layer and how it interacts with zfs. So I'm more than willing to believe I'm confused. Any insights are greatly appreciated.

Thanks!

- Ben

From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 17:15:45 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A7B6E1065673 for ; Tue, 28 Sep 2010 17:15:45 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 6B9A88FC16 for ; Tue, 28 Sep 2010 17:15:45 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8SHFY2W059599; Tue, 28 Sep 2010 10:15:38 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009281715.o8SHFY2W059599@gw.catspoiler.org> Date: Tue, 28 Sep 2010 10:15:34 -0700 (PDT) From: Don Lewis To: sterling@camdensoftware.com In-Reply-To: <20100928143955.GA33940@libertas.local.camdensoftware.com> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 17:15:45 -0000

On 28 Sep, Chip Camden wrote:
> Quoth Don Lewis on Monday, 27 September 2010:
>> CPU time accounting is broken on one of my machines running 8-STABLE.
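
To put those three counters in more familiar units, here is a small Python sketch that converts them to MiB and compares size against c_max. Only the three byte counts come from the sysctl output above; the variable names and formatting are just for illustration.

```python
MIB = 1024 * 1024

# Values copied from the kstat.zfs.misc.arcstats.* output above (bytes).
arcstats = {
    "c": 25165824,       # current ARC target size
    "c_max": 46137344,   # configured ARC maximum
    "size": 91863156,    # actual ARC size
}

for name, value in arcstats.items():
    print(f"arcstats.{name:<6} {value / MIB:8.2f} MiB")

ratio = arcstats["size"] / arcstats["c_max"]
print(f"size / c_max = {ratio:.2f}")
```

That works out to a 24 MiB target, a 44 MiB maximum, and an actual size of roughly 87.6 MiB, i.e. about twice c_max on an idle machine.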
I >> ran a test with a simple program that just loops and consumes CPU time: >> >> % time ./a.out >> 94.544u 0.000s 19:14.10 8.1% 62+2054k 0+0io 0pf+0w >> >> The display in top shows the process with WCPU at 100%, but TIME >> increments very slowly. >> >> Several hours after booting, I got a bunch of "calcru: runtime went >> backwards" messages, but they stopped right away and never appeared >> again. >> >> Aug 23 13:40:07 scratch ntpd[1159]: ntpd 4.2.4p5-a (1) >> Aug 23 13:43:18 scratch ntpd[1160]: kernel time sync status change 2001 >> Aug 23 18:05:57 scratch dbus-daemon: [system] Reloaded configuration >> Aug 23 18:06:16 scratch dbus-daemon: [system] Reloaded configuration >> Aug 23 18:12:40 scratch ntpd[1160]: time reset +18.059948 s >> [snip] >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 6836685136 usec to 5425839798 usec for pid 1526 (csh) >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 4747 usec to 2403 usec for pid 1519 (csh) >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 5265 usec to 2594 usec for pid 1494 (hald-addon-mouse-sy) >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 7818 usec to 3734 usec for pid 1488 (console-kit-daemon) >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 977 usec to 459 usec for pid 1480 (getty) >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 958 usec to 450 usec for pid 1479 (getty) >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 957 usec to 449 usec for pid 1478 (getty) >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 952 usec to 447 usec for pid 1477 (getty) >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 959 usec to 450 usec for pid 1476 (getty) >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 975 usec to 458 usec for pid 1475 (getty) >> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards 
from 1026 usec to 482 usec for pid 1474 (getty)
>> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1333 usec to 626 usec for pid 1473 (getty)
>> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 2469 usec to 1160 usec for pid 1440 (inetd)
>> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 719 usec to 690 usec for pid 1402 (sshd)
>> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 120486 usec to 56770 usec for pid 1360 (cupsd)
>> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 6204 usec to 2914 usec for pid 1289 (dbus-daemon)
>> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 179 usec to 84 usec for pid 1265 (moused)
>> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 22156 usec to 10407 usec for pid 1041 (nfsd)
>> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 1292 usec to 607 usec for pid 1032 (mountd)
>> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 8801 usec to 4134 usec for pid 664 (devd)
>> Aug 23 23:49:06 scratch kernel: calcru: runtime went backwards from 19 usec to 9 usec for pid 9 (sctp_iterator)
>>
>>
>> If I reboot and run the test again, the CPU time accounting seems to be
>> working correctly.
>> % time ./a.out
>> 1144.226u 0.000s 19:06.62 99.7% 5+168k 0+0io 0pf+0w
>>
>
>
> I notice that before the calcru messages, ntpd reset the clock by
> 18 seconds -- that probably accounts for that.

Interesting observation. Since this happened so early in the log, I
thought that this time change was the initial time change after boot,
but taking a closer look, the time change occurred about 4 1/2 hours
after boot. The calcru messages occurred another 5 1/2 hours after that.

I also just noticed that this log info was from the August 23rd kernel,
before I noticed the CPU time accounting problem, and not the latest
occurrence.
Here's the latest log info:

Sep 23 16:33:50 scratch ntpd[1144]: ntpd 4.2.4p5-a (1)
Sep 23 16:37:03 scratch ntpd[1145]: kernel time sync status change 2001
Sep 23 17:43:47 scratch ntpd[1145]: time reset +276.133928 s
Sep 23 17:43:47 scratch ntpd[1145]: kernel time sync status change 6001
Sep 23 17:47:15 scratch ntpd[1145]: kernel time sync status change 2001
Sep 23 19:02:48 scratch ntpd[1145]: time reset +291.507262 s
Sep 23 19:02:48 scratch ntpd[1145]: kernel time sync status change 6001
Sep 23 19:06:37 scratch ntpd[1145]: kernel time sync status change 2001
Sep 24 00:03:36 scratch kernel: calcru: runtime went backwards from 1120690857 usec to 367348485 usec for pid 1518 (csh)
Sep 24 00:03:36 scratch kernel: calcru: runtime went backwards from 5403 usec to 466 usec for pid 1477 (hald-addon-mouse-sy)
Sep 24 00:03:36 scratch kernel: calcru: runtime went backwards from 7511 usec to 1502 usec for pid 1472 (hald-runner)
Sep 24 00:03:36 scratch kernel: calcru: runtime went backwards from 17323 usec to 12470 usec for pid 1472 (hald-runner)
[snip]

The time jumps are even larger. There is still the large interval
between the last ntp message and the first calcru message.

My time source is another FreeBSD box with a GPS receiver on my LAN. My
other client machine isn't seeing these time jumps. The only messages
from ntp in its log from this period are these:

Sep 23 04:12:23 mousie ntpd[1111]: kernel time sync status change 6001
Sep 23 04:29:29 mousie ntpd[1111]: kernel time sync status change 2001
Sep 24 03:55:24 mousie ntpd[1111]: kernel time sync status change 6001
Sep 24 04:12:28 mousie ntpd[1111]: kernel time sync status change 2001

> I don't know if that has any connection to time(1) running slower -- but
> perhaps ntpd is aggressively adjusting your clock?

It seems to be pretty stable when the machine is idle:

% ntpq -c pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u    8   64  377    0.168   -0.081   0.007

Not too much degradation under CPU load:

% ntpq -c pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u   40   64  377    0.166   -0.156   0.026

I/O (dd if=/dev/ad6 of=/dev/null bs=512) doesn't appear to bother it
much, either.

% ntpq -c pe
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u   35   64  377    0.169   -0.106   0.009

From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 17:17:58 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 536271065695; Tue, 28 Sep 2010 17:17:58 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 667268FC08; Tue, 28 Sep 2010 17:17:56 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id UAA08291; Tue, 28 Sep 2010 20:17:44 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA22337.2010900@icyb.net.ua> Date: Tue, 28 Sep 2010 20:17:43 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Ben Kelly References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <4CA21809.7090504@icyb.net.ua>
<71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> In-Reply-To: <71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 17:17:58 -0000 on 28/09/2010 19:46 Ben Kelly said the following: > Hmm. My server is currently idle with no I/O happening: > > kstat.zfs.misc.arcstats.c: 25165824 > kstat.zfs.misc.arcstats.c_max: 46137344 > kstat.zfs.misc.arcstats.size: 91863156 > > If what you say is true, this shouldn't happen, should it? This system is an i386 machine with kmem max at 800M and arc set to 40M. This is running head from April 6, 2010, so it is a bit old, though. Well, your system is a bit old indeed. And the branch is unknown, so I can't really see what sources you have. And I am not sure if I'll be able to say anything about those sources. As to the numbers - yes, with current code I'd expect arcstats.size to go down to arcstats.c when there is no I/O. arc_reclaim_thread should do that. > At one point I had patches running on my system that triggered the pagedaemon based on arc load and it did allow me to keep my arc below the max. Or at least I thought it did. > > In any case, I've never really been able to wrap my head around the VFS layer and how it interacts with zfs. So I'm more than willing to believe I'm confused. Any insights are greatly appreciated. ARC is a ZFS private cache. ZFS doesn't use unified buffer/page cache. So ARC is not directly affected by pagedaemon. But this is not exactly VFS layer thing. 
-- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 17:24:36 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 303E31065673; Tue, 28 Sep 2010 17:24:36 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 744F88FC14; Tue, 28 Sep 2010 17:24:33 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id UAA08370; Tue, 28 Sep 2010 20:24:21 +0300 (EEST) (envelope-from avg@icyb.net.ua) Message-ID: <4CA224C5.8000202@icyb.net.ua> Date: Tue, 28 Sep 2010 20:24:21 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100920 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Ben Kelly References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <4CA21809.7090504@icyb.net.ua> <71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> <4CA22337.2010900@icyb.net.ua> In-Reply-To: <4CA22337.2010900@icyb.net.ua> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 17:24:36 -0000 on 28/09/2010 20:17 Andriy Gapon said the following: > on 28/09/2010 19:46 Ben Kelly said the following: >> If what you say is true, this shouldn't happen, 
should it? This system is an i386 machine with kmem max at 800M and arc set to 40M. This is running head from April 6, 2010, so it is a bit old, though. > > Well, your system is a bit old indeed. > And the branch is unknown, so I can't really see what sources you have. Apologies, missed "head" in your description of the system. -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 17:43:27 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 29CDF106564A for ; Tue, 28 Sep 2010 17:43:27 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta09.emeryville.ca.mail.comcast.net (qmta09.emeryville.ca.mail.comcast.net [76.96.30.96]) by mx1.freebsd.org (Postfix) with ESMTP id 0E3778FC08 for ; Tue, 28 Sep 2010 17:43:26 +0000 (UTC) Received: from omta15.emeryville.ca.mail.comcast.net ([76.96.30.71]) by qmta09.emeryville.ca.mail.comcast.net with comcast id CDaN1f00B1Y3wxoA9HjSiM; Tue, 28 Sep 2010 17:43:26 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta15.emeryville.ca.mail.comcast.net with comcast id CHjR1f00C3LrwQ28bHjRSN; Tue, 28 Sep 2010 17:43:26 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 7231F9B418; Tue, 28 Sep 2010 10:43:25 -0700 (PDT) Date: Tue, 28 Sep 2010 10:43:25 -0700 From: Jeremy Chadwick To: Don Lewis Message-ID: <20100928174325.GA69044@icarus.home.lan> References: <20100928143955.GA33940@libertas.local.camdensoftware.com> <201009281715.o8SHFY2W059599@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201009281715.o8SHFY2W059599@gw.catspoiler.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: stable@FreeBSD.org, sterling@camdensoftware.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 17:43:27 -0000

On Tue, Sep 28, 2010 at 10:15:34AM -0700, Don Lewis wrote:
> My time source is another FreeBSD box with a GPS receiver on my LAN. My
> other client machine isn't seeing these time jumps. The only messages
> from ntp in its log from this period are these:
>
> Sep 23 04:12:23 mousie ntpd[1111]: kernel time sync status change 6001
> Sep 23 04:29:29 mousie ntpd[1111]: kernel time sync status change 2001
> Sep 24 03:55:24 mousie ntpd[1111]: kernel time sync status change 6001
> Sep 24 04:12:28 mousie ntpd[1111]: kernel time sync status change 2001

I'm speaking purely about ntpd below this point -- almost certainly a
separate problem/issue, but I'll explain it anyway. I'm not under the
impression that the calcru messages indicate RTC clock drift, but I'd
need someone like John Baldwin to validate my statement.

Back to ntpd: you can address the above messages by adding "maxpoll 9"
to your "server" lines in ntp.conf. The comment we use in our ntp.conf
that documents the well-known problem:

# maxpoll 9 is used to work around PLL/FLL flipping, which happens at
# exactly 1024 seconds (the default maxpoll value). Another FreeBSD
# user recommended using 9 instead:
# http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031512.html

> > I don't know if that has any connection to time(1) running slower -- but
> > perhaps ntpd is aggressively adjusting your clock?
>
> It seems to be pretty stable when the machine is idle:
>
> % ntpq -c pe
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *gw.catspoiler.o .GPS.            1 u    8   64  377    0.168   -0.081   0.007
>
> Not too much degradation under CPU load:
>
> % ntpq -c pe
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *gw.catspoiler.o .GPS.            1 u   40   64  377    0.166   -0.156   0.026
>
> I/O (dd if=/dev/ad6 of=/dev/null bs=512) doesn't appear to bother it
> much, either.
>
> % ntpq -c pe
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
> *gw.catspoiler.o .GPS.            1 u   35   64  377    0.169   -0.106   0.009

Still speaking purely about ntpd:

The above doesn't indicate any problem. The deltas shown in delay,
offset, and jitter are all 100% legitimate. A dd (to induce more
interrupt use) isn't going to exacerbate the problem (depending on your
system configuration, IRQ setup, local APIC, etc.).

How about writing a small shell script that runs every minute in a
cronjob that does vmstat -i >> /some/file.log? Then when you see calcru
messages, look around the time frame where vmstat -i was run. Look for
high interrupt rates, aside from those associated with cpuX devices.

Next, you need to let ntpd run for quite a bit longer than what you did
above. Your poll maximum is only 64, indicating ntpd had recently been
restarted, or that your offset deviates greatly (my guess is ntpd being
restarted). "poll" will increase over time (64, 128, 256, 512, and
usually max out at 1024), depending on how "stable" the clock is. "when"
is a counter that increments, and ntpd does clock syncing (if needed)
once it reaches "poll". You'd see unstable system clock indications in
your syslog as well (indicated by actual +/- clock drift lines occurring
regularly; these aren't the same as 2001/6001 PLL/FLL mode flipping).

Sorry if this is a bit much to take in.

You might also try stopping ntpd, removing /var/db/ntpd.drift, and
restarting ntpd -- then check back in about 48 hours (no I'm not kidding).
This is especially necessary if you've replaced the motherboard or taken the disks from System A and stuck them in System B. All that said: I'm not convinced ntpd has anything to do with your problem. EIST or EIST-like capabilities (such as Cool'n'Quiet) are often the source of the problem. "device cpufreq" might solve your issue entirely, hard to say. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 18:30:33 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F148E1065693 for ; Tue, 28 Sep 2010 18:30:33 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id B3D998FC24 for ; Tue, 28 Sep 2010 18:30:33 +0000 (UTC) Received: from elsa.codelab.cz (localhost.codelab.cz [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 9D6C019E036 for ; Tue, 28 Sep 2010 20:12:03 +0200 (CEST) Received: from [192.168.1.2] (ip-86-49-61-235.net.upcbroadband.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 0D79919E030 for ; Tue, 28 Sep 2010 20:12:00 +0200 (CEST) Message-ID: <4CA22FF0.8060303@quip.cz> Date: Tue, 28 Sep 2010 20:12:00 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1.13) Gecko/20100914 SeaMonkey/2.0.8 MIME-Version: 1.0 To: freebsd-stable@freebsd.org Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 7bit Subject: fetch: Non-recoverable resolver failure X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 18:30:34 -0000 Hi, we are using fetch command from cron to run PHP scripts periodically and sometimes cron sends error e-mails like this: fetch: https://hiden.example.com/cron/fiveminutes: Non-recoverable resolver failure The exact lines from crontab are: */5 * * * * fetch -qo /dev/null "https://hiden.example.com/cron/fiveminutes" */5 * * * * fetch -qo /dev/null "http://another.example.com/wd.php?hash=cslhakjs87LJ3rysalj79" Network is working without problems, resolvers are working fine too. I also tried to use local instance of named at 127.0.0.1 but it did not fix the issue so it seems there is some problem with fetch in phase of resolving address. Note: target domains are hosted on the server it-self and named too. The system is FreeBSD 7.3-RELEASE-p2 i386 GENERIC Can somebody help me to diagnose this random fetch+resolver issue? Miroslav Lachman From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 18:40:07 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E64A9106566C; Tue, 28 Sep 2010 18:40:06 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from mail.wanderview.com (mail.wanderview.com [66.92.166.102]) by mx1.freebsd.org (Postfix) with ESMTP id 9EB6F8FC1D; Tue, 28 Sep 2010 18:40:06 +0000 (UTC) Received: from xykon.in.wanderview.com (xykon.in.wanderview.com [10.76.10.152]) (authenticated bits=0) by mail.wanderview.com (8.14.4/8.14.4) with ESMTP id o8SIdx1R028419 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Tue, 28 Sep 2010 18:40:00 GMT (envelope-from ben@wanderview.com) Mime-Version: 1.0 (Apple Message framework v1081) Content-Type: text/plain; charset=us-ascii From: Ben Kelly In-Reply-To: <4CA22337.2010900@icyb.net.ua> Date: Tue, 28 Sep 2010 14:40:00 -0400 
Content-Transfer-Encoding: quoted-printable Message-Id: References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <4CA21809.7090504@icyb.net.ua> <71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> <4CA22337.2010900@icyb.net.ua> To: Andriy Gapon X-Mailer: Apple Mail (2.1081) X-Spam-Score: -1.01 () ALL_TRUSTED,T_RP_MATCHES_RCVD X-Scanned-By: MIMEDefang 2.67 on 10.76.20.1 Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 18:40:07 -0000

On Sep 28, 2010, at 1:17 PM, Andriy Gapon wrote:
> on 28/09/2010 19:46 Ben Kelly said the following:
>> Hmm. My server is currently idle with no I/O happening:
>>
>> kstat.zfs.misc.arcstats.c: 25165824
>> kstat.zfs.misc.arcstats.c_max: 46137344
>> kstat.zfs.misc.arcstats.size: 91863156
>>
>> If what you say is true, this shouldn't happen, should it? This system is an i386 machine with kmem max at 800M and arc set to 40M. This is running head from April 6, 2010, so it is a bit old, though.
>
> Well, your system is a bit old indeed.
> And the branch is unknown, so I can't really see what sources you have.
> And I am not sure if I'll be able to say anything about those sources.

Quite old. I've been intending to update, but haven't found the time lately. I'll try to do the upgrade this weekend and see if it changes anything.

> As to the numbers - yes, with current code I'd expect arcstats.size to go down to
> arcstats.c when there is no I/O. arc_reclaim_thread should do that.
That's what I thought as well, but when I debugged it a year or two ago I found that the buffers were still referenced and thus could not be reclaimed. As far as I can remember, they needed a vfs/vnops call like zfs_vnops_inactive or zfs_vnops_reclaim to be executed in order to free the reference. What is responsible for making those calls?

>
>> At one point I had patches running on my system that triggered the pagedaemon based on arc load and it did allow me to keep my arc below the max. Or at least I thought it did.
>>
>> In any case, I've never really been able to wrap my head around the VFS layer and how it interacts with zfs. So I'm more than willing to believe I'm confused. Any insights are greatly appreciated.
>
> ARC is a ZFS private cache.
> ZFS doesn't use unified buffer/page cache.
> So ARC is not directly affected by pagedaemon.
> But this is not exactly VFS layer thing.

Can you explain the difference in how the vfs/vnode operations are called or used for those two situations?

I thought that the buffer cache was used by filesystems to implement these operations, so that the buffer cache was below the vfs/vnops layer. So while zfs implemented its operations in terms of the arc, things like UFS implemented vfs/vnops in terms of the buffer cache. I thought the layers further up the chain, like the page daemon, did not distinguish that much between these two implementations due to the VFS interface layer. (Although there seems to be a layering violation in that the buffer cache signals directly to the upper page daemon layer to trigger page reclamation.)

The old (ancient) patch I tried previously to help reduce the arc working set and allow it to shrink is here:

http://www.wanderview.com/svn/public/misc/zfs/zfs_kmem_limit.diff

Unfortunately, there are a couple of ideas on fighting fragmentation mixed into that patch. See the part about arc_reclaim_pages().
This patch did seem to allow my arc to stay under the target maximum even when under load that previously caused the system to exceed the maximum. When I update this weekend I'll try a stripped down version of the patch to see if it helps or not with the latest zfs.

Thanks for your help in understanding this stuff!

- Ben

From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 18:43:45 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C546110656A3 for ; Tue, 28 Sep 2010 18:43:45 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta03.westchester.pa.mail.comcast.net (qmta03.westchester.pa.mail.comcast.net [76.96.62.32]) by mx1.freebsd.org (Postfix) with ESMTP id 733C88FC12 for ; Tue, 28 Sep 2010 18:43:45 +0000 (UTC) Received: from omta23.westchester.pa.mail.comcast.net ([76.96.62.74]) by qmta03.westchester.pa.mail.comcast.net with comcast id CB5s1f00A1c6gX853Jjl8N; Tue, 28 Sep 2010 18:43:45 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta23.westchester.pa.mail.comcast.net with comcast id CJjk1f0063LrwQ23jJjk3B; Tue, 28 Sep 2010 18:43:45 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 21C7D9B418; Tue, 28 Sep 2010 11:43:43 -0700 (PDT) Date: Tue, 28 Sep 2010 11:43:43 -0700 From: Jeremy Chadwick To: Miroslav Lachman <000.fbsd@quip.cz> Message-ID: <20100928184343.GA70384@icarus.home.lan> References: <4CA22FF0.8060303@quip.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA22FF0.8060303@quip.cz> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org Subject: Re: fetch: Non-recoverable resolver failure X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date:
Tue, 28 Sep 2010 18:43:45 -0000 On Tue, Sep 28, 2010 at 08:12:00PM +0200, Miroslav Lachman wrote: > Hi, > > we are using fetch command from cron to run PHP scripts periodically > and sometimes cron sends error e-mails like this: > > fetch: https://hiden.example.com/cron/fiveminutes: Non-recoverable > resolver failure > > The exact lines from crontab are: > > */5 * * * * fetch -qo /dev/null > "https://hiden.example.com/cron/fiveminutes" > > */5 * * * * fetch -qo /dev/null > "http://another.example.com/wd.php?hash=cslhakjs87LJ3rysalj79" > > Network is working without problems, resolvers are working fine too. > I also tried to use local instance of named at 127.0.0.1 but it did > not fix the issue so it seems there is some problem with fetch in > phase of resolving address. > > Note: target domains are hosted on the server it-self and named too. > > The system is FreeBSD 7.3-RELEASE-p2 i386 GENERIC > > Can somebody help me to diagnose this random fetch+resolver issue? The error in question comes from the resolver library returning EAI_FAIL. This return code can be returned to all sorts of applications (not just fetch), although how each app handles it may differ. So, chances are you really do have something going on upstream from you (one of the nameservers you use might not be available at all times), and it probably clears very quickly (before you have a chance to manually/interactively investigate it). You're probably going to have to set up a combination of scripts that do tcpdump logging, and ktrace -t+ -i (and probably -a) logging (ex. ktrace -t+ -i -a -f /var/log/ktrace.fetch.out fetch -qo ...) to find out what's going on behind the scenes. The irregularity of the problem (re: "sometimes") warrants such. I'd recommend using something other than 127.0.0.1 as your resolver if you need to do tcpdump. 
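
A minimal sketch of the kind of logging wrapper described above, in plain sh. It is only an illustration under stated assumptions: the function name run_logged and the log path are made up for this example and are not part of fetch, ktrace, or the base system. It runs a command and, only when that command fails, appends a timestamped record with the exit status, the stderr text, and the current /etc/resolv.conf, so that failures can later be lined up against tcpdump or ktrace output.

```shell
#!/bin/sh
# Hypothetical helper, not part of any existing tool.
# LOG is an assumed location; override it in the environment if needed.
LOG="${LOG:-/var/log/fetch-wrap.log}"

run_logged() {
    # Capture only stderr of the wrapped command; its stdout is discarded.
    err=$("$@" 2>&1 >/dev/null)
    rc=$?
    if [ "$rc" -ne 0 ]; then
        {
            printf '%s rc=%s cmd=%s\n' \
                "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$rc" "$*"
            printf '  stderr: %s\n' "$err"
            # Snapshot the resolver configuration at the time of failure.
            sed 's/^/  resolv.conf: /' /etc/resolv.conf 2>/dev/null || true
        } >> "$LOG"
    fi
    # Hand the original exit status back to the caller (cron, in this case).
    return "$rc"
}
```

After sourcing the file, a cron job could call, for example, run_logged fetch -qo /dev/null "https://hiden.example.com/cron/fiveminutes". Because the wrapper simply passes its arguments through, putting the ktrace invocation quoted above in front of fetch should work the same way.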
Providing contents of your /etc/resolv.conf, as well as details about your network configuration on the machine (specifically if any firewall stacks (pf or ipfw) are in place) would help too. Some folks might want netstat -m output as well. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 19:04:27 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 390C81065670; Tue, 28 Sep 2010 19:04:27 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by mx1.freebsd.org (Postfix) with ESMTP id EB1EF8FC18; Tue, 28 Sep 2010 19:04:26 +0000 (UTC) Received: from freebsd-current.sentex.ca (localhost [127.0.0.1]) by freebsd-current.sentex.ca (8.14.4/8.14.3) with ESMTP id o8SJ4Pot095905; Tue, 28 Sep 2010 15:04:25 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-current.sentex.ca (8.14.4/8.14.3/Submit) id o8SJ4Pdt095904; Tue, 28 Sep 2010 19:04:25 GMT (envelope-from tinderbox@freebsd.org) Date: Tue, 28 Sep 2010 19:04:25 GMT Message-Id: <201009281904.o8SJ4Pdt095904@freebsd-current.sentex.ca> X-Authentication-Warning: freebsd-current.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [releng_8 tinderbox] failure on i386/pc98 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 19:04:27 -0000 TB --- 2010-09-28 18:55:35 - tinderbox 2.6 running on 
freebsd-current.sentex.ca TB --- 2010-09-28 18:55:35 - starting RELENG_8 tinderbox run for i386/pc98 TB --- 2010-09-28 18:55:35 - cleaning the object tree TB --- 2010-09-28 18:58:07 - cvsupping the source tree TB --- 2010-09-28 18:58:07 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup5.freebsd.org /tinderbox/RELENG_8/i386/pc98/supfile TB --- 2010-09-28 19:04:25 - WARNING: /usr/bin/csup returned exit code 1 TB --- 2010-09-28 19:04:25 - ERROR: unable to cvsup the source tree TB --- 2010-09-28 19:04:25 - 2.27 user 163.56 system 530.08 real http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8-i386-pc98.full From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 19:58:50 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C7EA0106574A for ; Tue, 28 Sep 2010 19:58:50 +0000 (UTC) (envelope-from vmagerya@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id DBC178FC1B for ; Tue, 28 Sep 2010 19:58:49 +0000 (UTC) Received: by bwz15 with SMTP id 15so68462bwz.13 for ; Tue, 28 Sep 2010 12:58:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=3MYjhy42AEf5GgJoGZsLOhByas9vBG3xiMlNgpaLvls=; b=EYnWXS5TEJkvrOV9DCQwZGh4LC1xW9YT3h/XAk2rIN9c91G7/zIgnhnO9DxUc5QrwJ bL0YqOWW/QVwtJEGkPsZDCti26kMmTNhCndRFq/CSIPk3Cakk7MRKrpLlhhJmmAxfLY3 Kgy2mbkPEbCIgoVQGRfsvv27R9k0u8DdPHazI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=Tev5m3JEf1Uljsg/OcEwtmSRppRJyZrzKbYfbXfRkPtiuvBzgGzDdxxCLRTj3C5l9h H5Ah27YUCguvxlmTjMO4PuGbqVqworLofK+QF2bytrKfb5e6QZM3HJvaTJkijgZkBBzF 
9TzS2XAfAyQJinYRyPrKECLxBP81sV79YgzbE= Received: by 10.204.82.137 with SMTP id b9mr380566bkl.127.1285703928524; Tue, 28 Sep 2010 12:58:48 -0700 (PDT) Received: from [172.16.0.6] (tx97.net [85.198.160.156]) by mx.google.com with ESMTPS id x13sm5950483bki.0.2010.09.28.12.58.46 (version=TLSv1/SSLv3 cipher=RC4-MD5); Tue, 28 Sep 2010 12:58:47 -0700 (PDT) Message-ID: <4CA2488D.7000101@gmail.com> Date: Tue, 28 Sep 2010 22:57:01 +0300 From: Vitaly Magerya User-Agent: Thunderbird MIME-Version: 1.0 To: Jung-uk Kim References: <20100224165203.GA10423@zod.isi.edu> <20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> <201009271621.17669.jkim@FreeBSD.org> In-Reply-To: <201009271621.17669.jkim@FreeBSD.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Ted Faber , freebsd-stable@FreeBSD.org, Ian Smith Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 19:58:50 -0000 Jung-uk Kim wrote: >> - the mouse doesn't work until I restart moused manually > > I always use hint.psm.0.flags="0x6000" in /boot/loader.conf, i.e., > turn on both HOOKRESUME and INITAFTERSUSPEND, to work around similar > problem on different laptop. Yes, that helps (after the stall period). > Can you please report other problems in the appropriate ML? > > em -> freebsd-net@ > usb -> freebsd-usb@ > acpi_ec -> freebsd-acpi@ I will try to do so. I'm not sure about acpi_ec issue though; it's only a warning, and it doesn't cause me any troubles. 
I also have this kernel message once in a few hours (seemingly random) if I used sleep/resume before: MCA: Bank 1, Status 0xe2000000000001f5 MCA: Global Cap 0x0000000000000005, Status 0x0000000000000000 MCA: Vendor "GenuineIntel", ID 0x695, APIC ID 0 MCA: CPU 0 UNCOR PCC OVER DCACHE L1 ??? error But once again, it doesn't really cause any problems. From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 20:05:59 2010 Return-Path: Delivered-To: freebsd-stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3D2531065675 for ; Tue, 28 Sep 2010 20:05:59 +0000 (UTC) (envelope-from cswiger@mac.com) Received: from asmtpout029.mac.com (asmtpout029.mac.com [17.148.16.104]) by mx1.freebsd.org (Postfix) with ESMTP id 22A8F8FC1E for ; Tue, 28 Sep 2010 20:05:58 +0000 (UTC) MIME-version: 1.0 Content-transfer-encoding: 7BIT Content-type: text/plain; charset=us-ascii Received: from cswiger1.apple.com ([17.209.4.71]) by asmtp029.mac.com (Sun Java(tm) System Messaging Server 6.3-8.01 (built Dec 16 2008; 32bit)) with ESMTPSA id <0L9H00C7K3TH9O70@asmtp029.mac.com> for freebsd-stable@FreeBSD.org; Tue, 28 Sep 2010 13:05:42 -0700 (PDT) X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 ipscore=0 phishscore=0 bulkscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx engine=6.0.2-1004200000 definitions=main-1009280148 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.0.10011,1.0.148,0.0.0000 definitions=2010-09-28_13:2010-09-28, 2010-09-28, 1970-01-01 signatures=0 From: Chuck Swiger In-reply-to: <4CA2488D.7000101@gmail.com> Date: Tue, 28 Sep 2010 13:05:41 -0700 Message-id: <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> References: <20100224165203.GA10423@zod.isi.edu> <20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> To: Vitaly Magerya X-Mailer: Apple Mail (2.1081) Cc: 
"freebsd-stable@freebsd.org List" Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 20:05:59 -0000 On Sep 28, 2010, at 12:57 PM, Vitaly Magerya wrote: > I also have this kernel message once in a few hours (seemingly random) > if I used sleep/resume before: > > MCA: Bank 1, Status 0xe2000000000001f5 > MCA: Global Cap 0x0000000000000005, Status 0x0000000000000000 > MCA: Vendor "GenuineIntel", ID 0x695, APIC ID 0 > MCA: CPU 0 UNCOR PCC OVER DCACHE L1 ??? error > > But once again, it doesn't really cause any problems. That is very likely to be a matter of luck. If I translate this MCA right, it looks to be an uncorrected error in L1 data cache on the CPU. Try to run something like prime95's torture test mode and see whether it fails overnight.... 
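
As a rough illustration of how such a status word can be translated by hand, here is a small sh sketch. It assumes the IA32_MCi_STATUS layout from the Intel SDM (VAL=63, OVER=62, UC=61, EN=60, MISCV=59, ADDRV=58, PCC=57, and the "000F 0001 RRRR TTLL" pattern for cache hierarchy errors in the low 16 bits). The function name decode_mca_status is made up for this example, it is not the kernel's own decoder, and it expects exactly 16 hex digits.

```shell
#!/bin/sh
# Hypothetical helper: decode flag bits and the low 16-bit MCA error code.
decode_mca_status() {
    hex=${1#0x}
    # Split into 32-bit halves so shell arithmetic stays in signed range.
    hi_hex=$(printf '%s' "$hex" | cut -c1-8)
    lo_hex=$(printf '%s' "$hex" | cut -c9-16)
    hi=$((0x$hi_hex))
    lo=$((0x$lo_hex))

    # Status bits 63..57 are bits 31..25 of the upper half.
    flags=""
    for spec in 31:VAL 30:OVER 29:UC 28:EN 27:MISCV 26:ADDRV 25:PCC; do
        bit=${spec%%:*}
        name=${spec#*:}
        if [ $(( ($hi >> $bit) & 1 )) -eq 1 ]; then
            flags="$flags $name"
        fi
    done
    printf 'flags:%s\n' "$flags"

    code=$(( $lo & 0xffff ))
    # Cache hierarchy errors match 000F 0001 RRRR TTLL (F is ignored here).
    if [ $(( $code & 0xef00 )) -eq $(( 0x0100 )) ]; then
        case $(( ($code >> 2) & 3 )) in
            0) tt=instruction ;; 1) tt=data ;; 2) tt=generic ;; *) tt=unknown ;;
        esac
        case $(( $code & 3 )) in
            0) ll=L0 ;; 1) ll=L1 ;; 2) ll=L2 ;; *) ll=generic ;;
        esac
        printf 'code: 0x%04x cache-hierarchy type=%s level=%s request=%d\n' \
            "$code" "$tt" "$ll" $(( ($code >> 4) & 15 ))
    else
        printf 'code: 0x%04x (not a cache-hierarchy code)\n' "$code"
    fi
}

decode_mca_status 0xe2000000000001f5
```

For the status value in this thread the sketch reports the VAL, OVER, UC and PCC flags and a data / L1 cache hierarchy code with request field 15, which is outside the request types I know to be defined and would explain the "???" in the kernel's own message.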
Regards, -- -Chuck From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 20:43:05 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 74B7A106564A for ; Tue, 28 Sep 2010 20:43:05 +0000 (UTC) (envelope-from vmagerya@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id F07518FC16 for ; Tue, 28 Sep 2010 20:43:04 +0000 (UTC) Received: by bwz15 with SMTP id 15so111889bwz.13 for ; Tue, 28 Sep 2010 13:43:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=x4SWs9Tm1C49vqXNcUZX/37SUBTx+glLsHYPLVvDbtQ=; b=PnmC2JK/0iJNNL/oFoKs+3kEumWrp3WP/oPpu9HJBsXyUufHdtxsI9BG8LwjEyynLr Kf3RooM0Up8MPEV16NIUN9BWCRqjsP2s55oCkjTeN1BRLPF3sScf+o/aaBo3DmVK8uiy 3C82vu8CqcQoWFN8H6e2VhGMx5Xei8zAodyy4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=XXJvIe/UwJHEwBURc8luq7maheX1PXZFCzDWtFxDv1C7YzBTsrKfvYoXm1pvdp+E8T OrMBYv8YZcKoZqh+o8e671Oo2Zvyk/fhNF3lTFaSU+GclUhrHGzOd50vrLGHjHvB5vUz 19zjV3Lh/aHDDtboplvY+qUkt7VZrrP0w0x6A= Received: by 10.204.76.205 with SMTP id d13mr440300bkk.93.1285706583291; Tue, 28 Sep 2010 13:43:03 -0700 (PDT) Received: from [172.16.0.6] (tx97.net [85.198.160.156]) by mx.google.com with ESMTPS id f18sm5998540bkf.15.2010.09.28.13.43.01 (version=TLSv1/SSLv3 cipher=RC4-MD5); Tue, 28 Sep 2010 13:43:02 -0700 (PDT) Message-ID: <4CA252F3.1080904@gmail.com> Date: Tue, 28 Sep 2010 23:41:23 +0300 From: Vitaly Magerya User-Agent: Thunderbird MIME-Version: 1.0 To: Chuck Swiger References: <20100224165203.GA10423@zod.isi.edu> 
<20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> In-Reply-To: <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 20:43:05 -0000 Chuck Swiger wrote: >> MCA: Bank 1, Status 0xe2000000000001f5 >> MCA: Global Cap 0x0000000000000005, Status 0x0000000000000000 >> MCA: Vendor "GenuineIntel", ID 0x695, APIC ID 0 >> MCA: CPU 0 UNCOR PCC OVER DCACHE L1 ??? error > > That is very likely to be a matter of luck. If I translate this MCA right, > it looks to be an uncorrected error in L1 data cache on the CPU. Try to run > something like prime95's torture test mode and see whether it fails overnight.... OK, started the test (it's math/mprime, for those who wonder). 
From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 20:59:10 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0B9310656A5 for ; Tue, 28 Sep 2010 20:59:10 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id 636D28FC2D for ; Tue, 28 Sep 2010 20:59:10 +0000 (UTC) Received: from elsa.codelab.cz (localhost.codelab.cz [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 1A3E919E030; Tue, 28 Sep 2010 22:59:08 +0200 (CEST) Received: from [192.168.1.2] (ip-86-49-61-235.net.upcbroadband.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 827CC19E02D; Tue, 28 Sep 2010 22:59:05 +0200 (CEST) Message-ID: <4CA25718.2000101@quip.cz> Date: Tue, 28 Sep 2010 22:59:04 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1.13) Gecko/20100914 SeaMonkey/2.0.8 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CA22FF0.8060303@quip.cz> <20100928184343.GA70384@icarus.home.lan> In-Reply-To: <20100928184343.GA70384@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: fetch: Non-recoverable resolver failure X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 20:59:10 -0000 Jeremy Chadwick wrote: > On Tue, Sep 28, 2010 at 08:12:00PM +0200, Miroslav Lachman wrote: >> Hi, >> >> we are using fetch command from cron to run PHP scripts periodically >> and sometimes cron sends error e-mails like this: >> >> fetch: 
https://hiden.example.com/cron/fiveminutes: Non-recoverable
>> resolver failure

[...]

>> Note: target domains are hosted on the server it-self and named too.
>>
>> The system is FreeBSD 7.3-RELEASE-p2 i386 GENERIC
>>
>> Can somebody help me to diagnose this random fetch+resolver issue?
>
> The error in question comes from the resolver library returning
> EAI_FAIL. This return code can be returned to all sorts of applications
> (not just fetch), although how each app handles it may differ. So,
> chances are you really do have something going on upstream from you (one
> of the nameservers you use might not be available at all times), and it
> probably clears very quickly (before you have a chance to
> manually/interactively investigate it).

The strange thing is that I have only one nameserver listed in
resolv.conf and it is the local one (127.0.0.1)! There were two "remote"
nameservers, but I switched to the local one to rule out remote
nameserver / network problems.

> You're probably going to have to set up a combination of scripts that do
> tcpdump logging, and ktrace -t+ -i (and probably -a) logging (ex. ktrace
> -t+ -i -a -f /var/log/ktrace.fetch.out fetch -qo ...) to find out what's
> going on behind the scenes. The irregularity of the problem (re:
> "sometimes") warrants such. I'd recommend using something other than
> 127.0.0.1 as your resolver if you need to do tcpdump.

I will try it... there will be a lot of output, as there are many
cronjobs and relatively high traffic on the webserver. But the fetch
resolver failure occurred only a few times a day.

> Providing contents of your /etc/resolv.conf, as well as details about
> your network configuration on the machine (specifically if any
> firewall stacks (pf or ipfw) are in place) would help too. Some folks
> might want netstat -m output as well.

There is nothing special in the network. The machine is a Sun Fire X2100
M2 with a bge1 NIC connected to a Cisco Linksys switch (100Mbps port)
with an uplink (1Gbps port) connected to a Cisco router with dual 10Gbps
connectivity. No firewalls in the path. There are more than 10 other
servers in the rack and we have no problems / error messages in logs
from other services / daemons related to DNS.

# cat /etc/resolv.conf
nameserver 127.0.0.1

# netstat -m
279/861/1140 mbufs in use (current/cache/total)
257/553/810/25600 mbuf clusters in use (current/cache/total/max)
257/313 mbuf+clusters out of packet secondary zone in use (current/cache)
5/306/311/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
603K/2545K/3149K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
13/470/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
3351782 requests for I/O initiated by sendfile
0 calls to protocol drain routines

(real IPs were replaced)

# ifconfig bge1
bge1: flags=8843 metric 0 mtu 1500
        options=9b
        ether 00:1e:68:2f:71:ab
        inet 1.2.3.40 netmask 0xffffff80 broadcast 1.2.3.127
        inet 1.2.3.41 netmask 0xffffffff broadcast 1.2.3.41
        inet 1.2.3.42 netmask 0xffffffff broadcast 1.2.3.42
        media: Ethernet autoselect (100baseTX )
        status: active

NIC is:

bge1@pci0:6:4:1: class=0x020000 card=0x534c108e chip=0x167814e4 rev=0xa3 hdr=0x00
    vendor     = 'Broadcom Corporation'
    device     = 'BCM5715C 10/100/100 PCIe Ethernet Controller'
    class      = network
    subclass   = ethernet

There is PF with some basic rules, mostly blocking incoming packets,
allowing all outgoing and scrubbing:

scrub in on bge1 all fragment reassemble
scrub out on bge1 all no-df random-id min-ttl 24 max-mss 1492 fragment reassemble
pass out on bge1 inet proto udp all keep state
pass out on bge1 inet proto tcp from 1.2.3.40 to any flags S/SA modulate state
pass out on bge1 inet proto tcp from 1.2.3.41 to any flags S/SA modulate state
pass out on bge1 inet proto tcp from 1.2.3.42 to any flags S/SA modulate state

modified PF options:

set timeout { frag 15, interval 5 }
set limit { frags 2500, states 5000 }
set optimization aggressive
set block-policy drop
set loginterface bge1
# Let loopback and internal interface traffic flow without restrictions
set skip on lo0

Thank you for your suggestions

Miroslav Lachman

From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 21:00:19 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A2B37106567A for ; Tue, 28 Sep 2010 21:00:19 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 661EA8FC18 for ; Tue, 28 Sep 2010 21:00:18 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8SL04CY060428; Tue, 28 Sep 2010 14:00:08 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009282100.o8SL04CY060428@gw.catspoiler.org> Date: Tue, 28 Sep 2010 14:00:04 -0700 (PDT) From: Don Lewis To: freebsd@jdc.parodius.com In-Reply-To: <20100928174325.GA69044@icarus.home.lan> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org, sterling@camdensoftware.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 21:00:19 -0000

On 28 Sep, Jeremy Chadwick wrote:
> On Tue, Sep 28,
2010 at 10:15:34AM -0700, Don Lewis wrote: >> My time source is another FreeBSD box with a GPS receiver on my LAN. My >> other client machine isn't seeing these time jumps. The only messages >> from ntp in its log from this period are these: >> >> Sep 23 04:12:23 mousie ntpd[1111]: kernel time sync status change 6001 >> Sep 23 04:29:29 mousie ntpd[1111]: kernel time sync status change 2001 >> Sep 24 03:55:24 mousie ntpd[1111]: kernel time sync status change 6001 >> Sep 24 04:12:28 mousie ntpd[1111]: kernel time sync status change 2001 > > I'm speaking purely about ntpd below this point -- almost certainly a > separate problem/issue, but I'll explain it anyway. I'm not under the > impression that the calcru messages indicate RTC clock drift, but I'd > need someone like John Baldwin to validate my statement. I don't think the problems are directly related. I think the calcru messages get triggered by clcok frequency changes that get detected and change the tick to usec conversion ratio. > Back to ntpd: you can addressing the above messages by adding "maxpoll > 9" to your "server" lines in ntp.conf. The comment we use in our > ntp.conf that documents the well-known problem: Thanks I'll try that. > # maxpoll 9 is used to work around PLL/FLL flipping, which happens at > # exactly 1024 seconds (the default maxpoll value). Another FreeBSD > # user recommended using 9 instead: > # http://lists.freebsd.org/pipermail/freebsd-stable/2006-December/031512.html > >> > I don't know if that has any connection to time(1) running slower -- but >> > perhaps ntpd is aggressively adjusting your clock? >> >> It seems to be pretty stable when the machine is idle: >> >> % ntpq -c pe >> remote refid st t when poll reach delay offset jitter >> ============================================================================== >> *gw.catspoiler.o .GPS. 
1 u 8 64 377 0.168 -0.081 0.007 >> >> Not too much degradation under CPU load: >> >> % ntpq -c pe >> remote refid st t when poll reach delay offset jitter >> ============================================================================== >> *gw.catspoiler.o .GPS. 1 u 40 64 377 0.166 -0.156 0.026 >> >> I/O (dd if=/dev/ad6 of=/dev/null bs=512) doesn't appear to bother it >> much, either. >> >> % ntpq -c pe >> remote refid st t when poll reach delay offset jitter >> ============================================================================== >> *gw.catspoiler.o .GPS. 1 u 35 64 377 0.169 -0.106 0.009 > > Still speaking purely about ntpd: > > The above doesn't indicate a single problem. The deltas shown in both > delay, offset, and jitter are all 100% legitimate. A dd (to induce more > interrupt use) isn't going to exacerbate the problem (depending on your > system configuration, IRQ setup, local APIC, etc.). I was hoping to do something to provoke clock interrupt loss. I don't see any problems when this machine is idle. The last two times that the calcru messages have occurred were when I booted this machine to build a bunch of ports. Offset and jitter always look really good whenever I've looked. > How about writing a small shell script that runs every minute in a > cronjob that does vmstat -i >> /some/file.log? Then when you see calcru > messages, look around the time frame where vmstat -i was run. Look for > high interrupt rates, aside from those associated with cpuX devices. Ok, I'll give this a try. Just for reference, this is what is currently reported: % vmstat -i interrupt total rate irq0: clk 60683442 1000 irq1: atkbd0 6 0 irq8: rtc 7765537 127 irq9: acpi0 13 0 irq10: ohci0 ehci1+ 10275064 169 irq11: fwohci0 ahc+ 132133 2 irq12: psm0 21 0 irq14: ata0 90982 1 irq15: nfe0 ata1 18363 0 I'm not sure why I'm getting USB interrupts. There aren't any USB devices plugged into this machine. 
# usbconfig dump_info ugen0.1: at usbus0, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON ugen1.1: at usbus1, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON ugen2.1: at usbus2, cfg=0 md=HOST spd=FULL (12Mbps) pwr=ON ugen3.1: at usbus3, cfg=0 md=HOST spd=HIGH (480Mbps) pwr=ON > Next, you need to let ntpd run for quite a bit longer than what you did > above. Your poll maximum is only 64, indicating ntpd had recently been > restarted, or that your offset deviates greatly (my guess is ntpd being > restarted). poll will increase over time (64, 128, 256, 512, and > usually max out at 1024), depending on how "stable" the clock is. when > is a counter that increments, and does clock syncing (if needed) once it > reaches poll. You'd see unstable system clock indications in your > syslog as well (indicated by actual +/- clock drift lines occurring > regularly. These aren't the same as 2001/6001 PLL/FLL mode flipping). > Sorry if this is a bit paragraph/much to take in. Yes, these readings were shortly after I rebooted the machine. It's been up a while longer now: % uptime 1:48PM up 16:44, 2 users, load averages: 0.00, 0.00, 0.00 % ntpq -c pe remote refid st t when poll reach delay offset jitter ============================================================================== *gw.catspoiler.o .GPS. 1 u 179 256 377 0.159 0.202 0.053 > You might also try stopping ntpd, removing /var/db/ntpd.drift, and > restarting ntpd -- then check back in about 48 hours (no I'm not > kidding). This is especially necessary if you've replaced the > motherboard or taken the disks from System A and stuck them in System B. I don't think the problem is the drift file. Most of the time ntp is really stable. I haven't seen any indication of clock drift that might be causing ntp to step the clock, but I haven't happened to look in the midst of port building. > All that said: I'm not convinced ntpd has anything to do with your > problem. 
EIST or EIST-like capabilities (such as Cool'n'Quiet) are > often the source of the problem. "device cpufreq" might solve your > issue entirely, hard to say. My kernel config includes GENERIC, which contains cpufreq. Also, kern.timecounter.hardware is ACPI-fast, which shouldn't be affected by CPU clock speed changes. This shows up in dmesg: powernow0: on cpu0 From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 21:11:27 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 605CD106566C for ; Tue, 28 Sep 2010 21:11:27 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 18A688FC13 for ; Tue, 28 Sep 2010 21:11:26 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8SLBFKB060447; Tue, 28 Sep 2010 14:11:19 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009282111.o8SLBFKB060447@gw.catspoiler.org> Date: Tue, 28 Sep 2010 14:11:15 -0700 (PDT) From: Don Lewis To: freebsd@jdc.parodius.com In-Reply-To: <201009282100.o8SL04CY060428@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org, sterling@camdensoftware.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 21:11:27 -0000 On 28 Sep, Don Lewis wrote: > % vmstat -i > interrupt total rate > irq0: clk 60683442 1000 > irq1: atkbd0 6 0 > irq8: rtc 7765537 127 > irq9: acpi0 13 0 > irq10: ohci0 ehci1+ 10275064 169 > irq11: fwohci0 ahc+ 132133 2 > irq12: psm0 21 0 > irq14: 
ata0 90982 1 > irq15: nfe0 ata1 18363 0 > > I'm not sure why I'm getting USB interrupts. There aren't any USB > devices plugged into this machine. Answer: irq 10 is also shared by vgapci0 and atapci1. From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 21:31:17 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A7A6D1065673; Tue, 28 Sep 2010 21:31:17 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id C35838FC13; Tue, 28 Sep 2010 21:31:15 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id AAA11500; Wed, 29 Sep 2010 00:31:00 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1P0hlL-0000Hk-Qa; Wed, 29 Sep 2010 00:30:59 +0300 Message-ID: <4CA25E92.4060904@icyb.net.ua> Date: Wed, 29 Sep 2010 00:30:58 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Ben Kelly References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <4CA21809.7090504@icyb.net.ua> <71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> <4CA22337.2010900@icyb.net.ua> In-Reply-To: X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production 
branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 21:31:17 -0000 on 28/09/2010 21:40 Ben Kelly said the following: > > On Sep 28, 2010, at 1:17 PM, Andriy Gapon wrote: > >> on 28/09/2010 19:46 Ben Kelly said the following: >>> Hmm. My server is currently idle with no I/O happening: >>> >>> kstat.zfs.misc.arcstats.c: 25165824 kstat.zfs.misc.arcstats.c_max: >>> 46137344 kstat.zfs.misc.arcstats.size: 91863156 >>> >>> If what you say is true, this shouldn't happen, should it? This system >>> is an i386 machine with kmem max at 800M and arc set to 40M. This is >>> running head from April 6, 2010, so it is a bit old, though. >> >> Well, your system is a bit old indeed. And the branch is unknown, so I >> can't really see what sources you have. And I am not sure if I'll be able >> to say anything about those sources. > > Quite old. I've been intending to update, but haven't found the time lately. > I'll try to do the upgrade this weekend and see if it changes anything. > >> As to the numbers - yes, with current code I'd expect arcstats.size to go >> down to arcstats.c when there is no I/O. arc_reclaim_thread should do >> that. > > That's what I thought as well, but when I debugged it a year or two ago I > found that the buffers were still referenced and thus could not be reclaimed. > As far as I can remember they needed a vfs/vnops like zfs_vnops_inactive or > zfs_vnops_reclaim to be executed in order to free the reference. What is > responsible for making those calls? It's time that we should start showing each other places in code :) Because I don't think that that's how the code works. E.g. I look at how zfs_read() calls dmu_read_uio() which calls dmu_buf_hold_array() and dmu_buf_rele_array() around uimove() call. From what I see, dmu_buf_hold_array() calls dmu_buf_hold_array_by_dnode() calls dbuf_hold() calls arc_buf_add_ref() or arc_buf_alloc(). 
And conversely, dmu_buf_rele_array() calls dbuf_rele() calls arc_buf_remove_ref(). So, I am quite sure that ARC buffers are held/referenced only during ongoing I/O to or from them. Perhaps, on the other hand, you had in mind life-cycle of other things (not ARC buffers) that are accounted against ARC size (with type ARC_SPACE_OTHER)? Such as e.g. dmu_buf_impl_t-s allocated in dbuf_create(). I have to admit that I haven't investigated behavior of that part of ARC-assigned memory. It's only a small proportion (~10%) of the whole ARC size on my systems. >>> At one point I had patches running on my system that triggered the >>> pagedaemon based on arc load and it did allow me to keep my arc below the >>> max. Or at least I thought it did. >>> >>> In any case, I've never really been able to wrap my head around the VFS >>> layer and how it interacts with zfs. So I'm more than willing to believe >>> I'm confused. Any insights are greatly appreciated. >> >> ARC is a ZFS private cache. ZFS doesn't use unified buffer/page cache. So >> ARC is not directly affected by pagedaemon. But this is not exactly VFS >> layer thing. > > Can you explain the difference in how the vfs/vnode operations are called or > used for those two situations? They are called exactly the same. VFS layer and code above it are not aware of FS implementation details. > I thought that the buffer cache was used by filesystems to implement these > operations. So that the buffer cache was below the vfs/vnops layer. So Buffer cache works as part of unified VM and its buffers use the same pages as page cache does. > while zfs implemented its operations in terms of the arc, things like UFS > implemented vfs/vnops in terms of the buffer cache. I thought the layers Yes. Filesystems like UFS are "sandwiched" between buffer cache and page cache, which work in concert. Also, they don't (have to) implement their own buffer/page caching policies, because it's all managed by unified VM system. 
On the contrary, ZFS has its own private cache. So, first of all, its data may be cached in two places at once - page cache and ARC. And, because of that, some assumptions of the higher level code get violated, so ZFS has to jump through the hoops to meet those assumptions (e.g. see UIO_NOCOPY). > further up the chain like the page daemon did not distinguish that much > between these two implementation due to the VFS interface layer. (Although Right, but see above. > there seems to be a layering violation in that the buffer cache signals > directly to the upper page daemon layer to trigger page reclamation.) Umm, not sure if that is a fact. > The old (ancient) patch I tried previously to help reduce the arc working set > and allow it to shrink is here: > > http://www.wanderview.com/svn/public/misc/zfs/zfs_kmem_limit.diff > > Unfortunately, there are a couple ideas on fighting fragmentation mixed into > that patch. See the part about arc_reclaim_pages(). This patch did seem to > allow my arc to stay under the target maximum even when under load that > previously caused the system to exceed the maximum. When I update this > weekend I'll try a stripped down version of the patch to see if it helps or > not with the latest zfs. > > Thanks for your help in understanding this stuff! The patch seems good, especially the part about taking into account the kmem fragmentation. But it also seems to be heavily tuned towards "tiny ARC" systems like yours, so I am not sure yet how suitable it is for "mainstream" systems. 
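For reference, the arcstats figures quoted earlier in this message can be compared mechanically. The following is only a minimal Python sketch, not anything from the ZFS sources: the helper names parse_arcstats() and arc_overshoot() are hypothetical, and the input is assumed to be sysctl-style "name: value" text such as the three kstat.zfs.misc.arcstats lines quoted above.

```python
# Hypothetical helpers for illustration only; they are not part of any FreeBSD tool.
ARCSTATS_PREFIX = "kstat.zfs.misc.arcstats."


def parse_arcstats(text):
    """Collect 'kstat.zfs.misc.arcstats.<name>: <value>' pairs from sysctl-style text.

    Quote markers and line wrapping are ignored: the text is split on
    whitespace and each '<name>:' token is paired with the number after it.
    """
    tokens = text.split()
    stats = {}
    for i, tok in enumerate(tokens[:-1]):
        if tok.startswith(ARCSTATS_PREFIX) and tok.endswith(":"):
            value = tokens[i + 1]
            if value.isdigit():
                stats[tok[len(ARCSTATS_PREFIX):-1]] = int(value)
    return stats


def arc_overshoot(stats):
    """Return (size - c, size - c_max) in bytes; positive means ARC is over the value."""
    return stats["size"] - stats["c"], stats["size"] - stats["c_max"]


if __name__ == "__main__":
    quoted = (
        "kstat.zfs.misc.arcstats.c: 25165824 kstat.zfs.misc.arcstats.c_max: "
        "46137344 kstat.zfs.misc.arcstats.size: 91863156"
    )
    stats = parse_arcstats(quoted)
    over_c, over_c_max = arc_overshoot(stats)
    print("size exceeds c by %d bytes, c_max by %d bytes" % (over_c, over_c_max))
```

With the numbers from the idle i386 machine, size is above both the current target (c) and the configured maximum (c_max), which is the discrepancy under discussion.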
-- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 22:01:27 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F036B106566C; Tue, 28 Sep 2010 22:01:27 +0000 (UTC) (envelope-from ben@wanderview.com) Received: from mail.wanderview.com (mail.wanderview.com [66.92.166.102]) by mx1.freebsd.org (Postfix) with ESMTP id 6EE088FC0A; Tue, 28 Sep 2010 22:01:27 +0000 (UTC) Received: from xykon.in.wanderview.com (xykon.in.wanderview.com [10.76.10.152]) (authenticated bits=0) by mail.wanderview.com (8.14.4/8.14.4) with ESMTP id o8SM1LVX031742 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Tue, 28 Sep 2010 22:01:21 GMT (envelope-from ben@wanderview.com) Mime-Version: 1.0 (Apple Message framework v1081) Content-Type: text/plain; charset=us-ascii From: Ben Kelly In-Reply-To: <4CA25E92.4060904@icyb.net.ua> Date: Tue, 28 Sep 2010 18:01:21 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: <5BD33772-C0EA-48A9-BE9A-C8FBAF0008D7@wanderview.com> References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <4CA21809.7090504@icyb.net.ua> <71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> <4CA22337.2010900@icyb.net.ua> <4CA25E92.4060904@icyb.net.ua> To: Andriy Gapon X-Mailer: Apple Mail (2.1081) X-Spam-Score: -1.01 () ALL_TRUSTED,T_RP_MATCHES_RCVD X-Scanned-By: MIMEDefang 2.67 on 10.76.20.1 Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 22:01:28 -0000 On 
Sep 28, 2010, at 5:30 PM, Andriy Gapon wrote: << snipped lots of good info here... probably won't have time to look at it in detail until the weekend >> >> there seems to be a layering violation in that the buffer cache signals >> directly to the upper page daemon layer to trigger page reclamation.) > > Umm, not sure if that is a fact. I was referring to the code in vfs_bio.c that used to twiddle vm_pageout_deficit directly. That seems to have been replaced with a call to vm_page_grab(). >> The old (ancient) patch I tried previously to help reduce the arc working set >> and allow it to shrink is here: >> >> http://www.wanderview.com/svn/public/misc/zfs/zfs_kmem_limit.diff >> >> Unfortunately, there are a couple ideas on fighting fragmentation mixed into >> that patch. See the part about arc_reclaim_pages(). This patch did seem to >> allow my arc to stay under the target maximum even when under load that >> previously caused the system to exceed the maximum. When I update this >> weekend I'll try a stripped down version of the patch to see if it helps or >> not with the latest zfs. >> >> Thanks for your help in understanding this stuff! > > The patch seems good, especially the part about taking into account the kmem > fragmentation. But it also seems to be heavily tuned towards "tiny ARC" systems > like yours, so I am not sure yet how suitable it is for "mainstream" systems. Thanks. Yea, there is a lot of aggressive tuning there. In particular, the slow growth algorithm is somewhat dubious. What I found, though, was that the fragmentation jumped whenever the arc was reduced in size, so it was an attempt to make the size slowly approach peak load without overshooting. A better long term solution would probably be to enhance UMA to support custom slab sizes on a zone-by-zone basis. That way all zfs/arc allocations can use slabs of 128k (at a memory efficiency penalty of course). 
I prototyped this with a dumbed down block pool allocator at one point and was able to avoid most, if not all, of the fragmentation. Adding the support to UMA seemed non-trivial, though. Thanks again for the information. I hope to get a chance to look at the code this weekend. - Ben From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 22:22:59 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F401A106566B; Tue, 28 Sep 2010 22:22:58 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 0E61D8FC08; Tue, 28 Sep 2010 22:22:57 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id BAA12308; Wed, 29 Sep 2010 01:22:45 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1P0iZR-0000Lq-6i; Wed, 29 Sep 2010 01:22:45 +0300 Message-ID: <4CA26AB4.3050108@icyb.net.ua> Date: Wed, 29 Sep 2010 01:22:44 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Ben Kelly References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <4CA21809.7090504@icyb.net.ua> <71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> <4CA22337.2010900@icyb.net.ua> <4CA25E92.4060904@icyb.net.ua> <5BD33772-C0EA-48A9-BE9A-C8FBAF0008D7@wanderview.com> In-Reply-To: <5BD33772-C0EA-48A9-BE9A-C8FBAF0008D7@wanderview.com> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit 
Cc: stable@freebsd.org, Willem Jan Withagen , fs@freebsd.org, Jeremy Chadwick Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 22:22:59 -0000 on 29/09/2010 01:01 Ben Kelly said the following: > Thanks. Yea, there is a lot of aggressive tuning there. In particular, the > slow growth algorithm is somewhat dubious. What I found, though, was that > the fragmentation jumped whenever the arc was reduced in size, so it was an > attempt to make the size slowly approach peak load without overshooting. > > A better long term solution would probably be to enhance UMA to support > custom slab sizes on a zone-by-zone basis. That way all zfs/arc allocations > can use slabs of 128k (at a memory efficiency penalty of course). I > prototyped this with a dumbed down block pool allocator at one point and was > able to avoid most, if not all, of the fragmentation. Adding the support to > UMA seemed non-trivial, though. BTW, have you seen my posts about UMA and ZFS on hackers@ ? I found it advantageous to use UMA for ZFS I/O buffers, but only after reducing size of per-CPU caches for the zones with large-sized items. I further modified the code in my local tree to completely disable per-CPU caches for items > 32KB. 
-- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 23:20:44 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 586A81065673 for ; Tue, 28 Sep 2010 23:20:44 +0000 (UTC) (envelope-from leroy.vanlogchem@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 12D758FC1D for ; Tue, 28 Sep 2010 23:20:42 +0000 (UTC) Received: by qwd6 with SMTP id 6so142056qwd.13 for ; Tue, 28 Sep 2010 16:20:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=TubZNVL6t5j3BkjAikk/8gflDjI8/ehMR6bvX13zAeQ=; b=ourgUcNNE2jXDA04MGBalijk2hwbQr+gASEzCuJUj37hpLtGhRZtfnAipV8yRkmce/ qLHrCb7aWKPneQuKBCzFSWzMcyHbAGt+axKtEu/o4IQTyLtczq9b1LvUMwAUYowoLKv/ kzfCSS9+GduT0mdnY/axCvZtolPcZH2qezLcc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=Smo/tYsSIwNG0YDQ3Fjy0ao2eKcOswl6hiSx7ZJ+LIK/lBXF5SH0aYdqhHhDSWb675 oIfrvWspWGeNQUnpuaMTKZKgOMlsCd32LQEEILYBWI0yOpUxMOzSizxy2kFiZxwrprEW zHA41zXGYQTzMcT8vsvRiqi4s3L2yinit5RVQ= MIME-Version: 1.0 Received: by 10.229.35.5 with SMTP id n5mr347847qcd.175.1285716041620; Tue, 28 Sep 2010 16:20:41 -0700 (PDT) Received: by 10.229.2.25 with HTTP; Tue, 28 Sep 2010 16:20:41 -0700 (PDT) Date: Wed, 29 Sep 2010 01:20:41 +0200 Message-ID: From: Leroy van Logchem To: freebsd-stable@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 23:20:44 -0000 >> Thanks for the 
clarification. I just wish I knew how vm.kmem_size_scale >> fit into the picture (meaning what it does, etc.). The sysctl >> description isn't very helpful. Again, my lack of VM knowledge... >> >Roughly, vm.kmem_size would get set to divided by >vm.kmem_size_scale. http://lists.freebsd.org/pipermail/freebsd-stable/2010-September/059114.html Thanks again for the explication, i was amiss after the post above. So increasing kmem_size_scale will reduce the resulting kmem_size. /*correct me if i'm wrong - "divided by" triggered this post*/ From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 23:36:47 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3035E106566C; Tue, 28 Sep 2010 23:36:47 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by mx1.freebsd.org (Postfix) with ESMTP id DDE2A8FC13; Tue, 28 Sep 2010 23:36:46 +0000 (UTC) Received: from freebsd-current.sentex.ca (localhost [127.0.0.1]) by freebsd-current.sentex.ca (8.14.4/8.14.3) with ESMTP id o8SNajE5027334; Tue, 28 Sep 2010 19:36:45 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-current.sentex.ca (8.14.4/8.14.3/Submit) id o8SNajx7027333; Tue, 28 Sep 2010 23:36:45 GMT (envelope-from tinderbox@freebsd.org) Date: Tue, 28 Sep 2010 23:36:45 GMT Message-Id: <201009282336.o8SNajx7027333@freebsd-current.sentex.ca> X-Authentication-Warning: freebsd-current.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [releng_8 tinderbox] failure on mips/mips X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 
Sep 2010 23:36:47 -0000 TB --- 2010-09-28 19:44:25 - tinderbox 2.6 running on freebsd-current.sentex.ca TB --- 2010-09-28 19:44:25 - starting RELENG_8 tinderbox run for mips/mips TB --- 2010-09-28 19:44:25 - cleaning the object tree TB --- 2010-09-28 19:45:51 - cvsupping the source tree TB --- 2010-09-28 19:45:51 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup5.freebsd.org /tinderbox/RELENG_8/mips/mips/supfile TB --- 2010-09-28 19:50:26 - building world TB --- 2010-09-28 19:50:26 - MAKEOBJDIRPREFIX=/obj TB --- 2010-09-28 19:50:26 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2010-09-28 19:50:26 - TARGET=mips TB --- 2010-09-28 19:50:26 - TARGET_ARCH=mips TB --- 2010-09-28 19:50:26 - TZ=UTC TB --- 2010-09-28 19:50:26 - __MAKE_CONF=/dev/null TB --- 2010-09-28 19:50:26 - cd /src TB --- 2010-09-28 19:50:26 - /usr/bin/make -B buildworld >>> World build started on Tue Sep 28 19:50:28 UTC 2010 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything [...] 
/obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 *** Error code 1 Stop in /src/usr.bin/tftp. *** Error code 1 Stop in /src/usr.bin. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. 
TB --- 2010-09-28 23:36:45 - WARNING: /usr/bin/make returned exit code 1 TB --- 2010-09-28 23:36:45 - ERROR: failed to build world TB --- 2010-09-28 23:36:45 - 2064.56 user 7803.58 system 13940.05 real http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8-mips-mips.full From owner-freebsd-stable@FreeBSD.ORG Tue Sep 28 23:45:03 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A6278106564A for ; Tue, 28 Sep 2010 23:45:03 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 631418FC12 for ; Tue, 28 Sep 2010 23:45:03 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8SNiqSK060715; Tue, 28 Sep 2010 16:44:56 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009282344.o8SNiqSK060715@gw.catspoiler.org> Date: Tue, 28 Sep 2010 16:44:52 -0700 (PDT) From: Don Lewis To: freebsd@jdc.parodius.com In-Reply-To: <20100928174325.GA69044@icarus.home.lan> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org, sterling@camdensoftware.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 28 Sep 2010 23:45:03 -0000 On 28 Sep, Jeremy Chadwick wrote: > Still speaking purely about ntpd: > > The above doesn't indicate a single problem. The deltas shown in both > delay, offset, and jitter are all 100% legitimate. 
A dd (to induce more
> interrupt use) isn't going to exacerbate the problem (depending on your
> system configuration, IRQ setup, local APIC, etc.).
>
> How about writing a small shell script that runs every minute in a
> cronjob that does vmstat -i >> /some/file.log? Then when you see calcru
> messages, look around the time frame where vmstat -i was run. Look for
> high interrupt rates, aside from those associated with cpuX devices.

Looking at the timestamps of things and comparing to my logs, I
discovered that the last instance of ntp instability happened when I was
running "make index" in /usr/ports. I tried it again with entertaining
results. After a while, the machine became unresponsive. I was logged
in over ssh and it stopped echoing keystrokes. In parallel I was
running a script that echoed the date, the results of "vmstat -i", and
the results of "ntpq -c pe". The latter showed jitter and offset going
insane. Eventually "make index" finished and the machine was responsive
again, but the time was way off and ntpd croaked because the necessary
time correction was too large. Nothing else anomalous showed up in the
logs. Hmn, about half an hour after ntpd died I started my CPU time
accounting test and two minutes into that test I got a spew of calcru
messages ...

Tue Sep 28 14:52:27 PDT 2010
interrupt                      total   rate
irq0: clk                   64077827    999
irq1: atkbd0                      26      0
irq8: rtc                    8199966    127
irq9: acpi0                       19      0
irq10: ohci0 ehci1+         10356112    161
irq11: fwohci0 ahc+           132133      2
irq12: psm0                       27      0
irq14: ata0                    96064      1
irq15: nfe0 ata1               23350      0
Total                       82885524   1293
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u  137  128  377   0.195    0.111   0.030

Tue Sep 28 14:53:27 PDT 2010
interrupt                      total   rate
irq0: clk                   64137854    999
irq1: atkbd0                      26      0
irq8: rtc                    8207648    127
irq9: acpi0                       19      0
irq10: ohci0 ehci1+         10360184    161
irq11: fwohci0 ahc+           132133      2
irq12: psm0                       27      0
irq14: ata0                    96154      1
irq15: nfe0 ata1               23379      0
Total                       82957424   1293
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u   56  128  377   0.195    0.111 853895.

Tue Sep 28 14:54:27 PDT 2010
interrupt                      total   rate
irq0: clk                   64197881    999
irq1: atkbd0                      26      0
irq8: rtc                    8215329    127
irq9: acpi0                       21      0
irq10: ohci0 ehci1+         10360777    161
irq11: fwohci0 ahc+           132133      2
irq12: psm0                       27      0
irq14: ata0                    96244      1
irq15: nfe0 ata1               23405      0
Total                       83025843   1293
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u  116  128  377   0.195    0.111 853895.

Tue Sep 28 14:55:27 PDT 2010
interrupt                      total   rate
irq0: clk                   64257907    999
irq1: atkbd0                      26      0
irq8: rtc                    8223011    127
irq9: acpi0                       21      0
irq10: ohci0 ehci1+         10360836    161
irq11: fwohci0 ahc+           132133      2
irq12: psm0                       27      0
irq14: ata0                    96334      1
irq15: nfe0 ata1               23424      0
Total                       83093719   1292
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 gw.catspoiler.o .GPS.            1 u   48  128  377   0.197  2259195 2091608

Tue Sep 28 14:56:27 PDT 2010
interrupt                      total   rate
irq0: clk                   64317933    999
irq1: atkbd0                      26      0
irq8: rtc                    8230692    127
irq9: acpi0                       21      0
irq10: ohci0 ehci1+         10360857    161
irq11: fwohci0 ahc+           132133      2
irq12: psm0                       27      0
irq14: ata0                    96424      1
irq15: nfe0 ata1               23448      0
Total                       83161561   1292
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 gw.catspoiler.o .GPS.            1 u  108  128  377   0.197  2259195 2091608

Tue Sep 28 14:57:27 PDT 2010
interrupt                      total   rate
irq0: clk                   64377960    999
irq1: atkbd0                      26      0
irq8: rtc                    8238374    127
irq9: acpi0                       21      0
irq10: ohci0 ehci1+         10360869    160
irq11: fwohci0 ahc+           132133      2
irq12: psm0                       27      0
irq14: ata0                    96514      1
irq15: nfe0 ata1               23469      0
Total                       83229393   1292
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 gw.catspoiler.o .GPS.            1 u   39  128  377   0.176  2259195 1909368

Tue Sep 28 14:59:51 PDT 2010
interrupt                      total   rate
irq0: clk                   64521959    999
irq1: atkbd0                      26      0
irq8: rtc                    8256801    127
irq9: acpi0                       21      0
irq10: ohci0 ehci1+         10360941    160
irq11: fwohci0 ahc+           132133      2
irq12: psm0                       27      0
irq14: ata0                    96730      1
irq15: nfe0 ata1               23641      0
Total                       83392279   1292
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 gw.catspoiler.o .GPS.            1 u   55  128  377   0.174  2259195 1707791

Tue Sep 28 15:00:51 PDT 2010
interrupt                      total   rate
irq0: clk                   64581986    999
irq1: atkbd0                      26      0
irq8: rtc                    8264482    127
irq9: acpi0                       21      0
irq10: ohci0 ehci1+         10361001    160
irq11: fwohci0 ahc+           132133      2
irq12: psm0                       27      0
irq14: ata0                    96820      1
irq15: nfe0 ata1               23658      0
Total                       83460154   1292
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 gw.catspoiler.o .GPS.            1 u  115  128  377   0.174  2259195 1707791

Sep 28 15:16:06 scratch ntpd[1141]: time correction of 2259 seconds exceeds sanity limit (1000); set clock manually to the correct UTC time.
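
The rate column printed by vmstat -i is an average since boot (total divided by uptime), so a short-lived interrupt storm would barely move it. A rough sketch of computing per-interval rates from successive totals instead is below. The irq0 totals and timestamps are copied from the snapshots above; the function and variable names are my own, and the timestamps come from date(1) on the same machine, so they are only as trustworthy as the clock under suspicion.

```python
from datetime import datetime

# (timestamp, cumulative irq0 "clk" total) copied from the vmstat -i snapshots above
samples = [
    ("14:52:27", 64077827),
    ("14:53:27", 64137854),
    ("14:54:27", 64197881),
    ("14:55:27", 64257907),
    ("14:56:27", 64317933),
    ("14:57:27", 64377960),
    ("14:59:51", 64521959),
    ("15:00:51", 64581986),
]


def interval_rates(samples):
    """Return (end_time, interrupts_per_second) for each consecutive pair of samples."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        start = datetime.strptime(t0, "%H:%M:%S")
        end = datetime.strptime(t1, "%H:%M:%S")
        elapsed = (end - start).total_seconds()
        rates.append((t1, (c1 - c0) / elapsed))
    return rates


for when, rate in interval_rates(samples):
    print(f"{when}  {rate:8.1f} irq0/s")
```

For these samples every interval works out to roughly 1000 clock interrupts per second, including the longer 14:57:27 to 14:59:51 gap, at least as measured against the system's own idea of the time.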
From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 01:08:24 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 03EDA1065672; Wed, 29 Sep 2010 01:08:24 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-qy0-f182.google.com (mail-qy0-f182.google.com [209.85.216.182]) by mx1.freebsd.org (Postfix) with ESMTP id 8DB6C8FC08; Wed, 29 Sep 2010 01:08:23 +0000 (UTC) Received: by qyk7 with SMTP id 7so434487qyk.13 for ; Tue, 28 Sep 2010 18:08:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type; bh=zs8nNjLa8+S++1leXXRprGA1GllZtA/+P7WblUbSuA4=; b=vPMfki3vezl7Cwbcpf+v5Cr8CJRtXDqiOMRCFpaEqDzzvAebWC8nY12GwUAPfTxp3a 4HpHXruhS/UsJc2pYY4kGaFGWqJ0iA+stJkLKDeE7MumEnkGifSCYc+NTNGDkEkKCmAk kv+E4MaVbFnMRvvc8a01S8YXOEduATm2pW5II= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; b=HXJtpsEkDkLYcIlYB8eyYZdEW7DRWljp2nRTM5YHwVuBWn+KS4WkVebgYZYcXzl3JY 0bQ6O5dejUCIdy5scgoicAQkrDXfxurR/y6VXMKcsvTqT50vOFFtjBB9xgPtpbD8i8Nk QxdPecCfi8qZZJSvCviiRfiq6B42Wky4m0hmk= MIME-Version: 1.0 Received: by 10.220.63.5 with SMTP id z5mr188074vch.105.1285720710225; Tue, 28 Sep 2010 17:38:30 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.220.176.77 with HTTP; Tue, 28 Sep 2010 17:38:30 -0700 (PDT) In-Reply-To: <4CA26AB4.3050108@icyb.net.ua> References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <4CA21809.7090504@icyb.net.ua> <71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> 
<4CA22337.2010900@icyb.net.ua> <4CA25E92.4060904@icyb.net.ua> <5BD33772-C0EA-48A9-BE9A-C8FBAF0008D7@wanderview.com> <4CA26AB4.3050108@icyb.net.ua> Date: Tue, 28 Sep 2010 17:38:30 -0700 X-Google-Sender-Auth: n2VHTRWuhNlv4iTT-adWLojl3Vg Message-ID: From: Artem Belevich To: Andriy Gapon Content-Type: text/plain; charset=ISO-8859-1 Cc: stable@freebsd.org, Willem Jan Withagen , Jeremy Chadwick , fs@freebsd.org, Ben Kelly Subject: Re: Still getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 01:08:24 -0000 On Tue, Sep 28, 2010 at 3:22 PM, Andriy Gapon wrote: > BTW, have you seen my posts about UMA and ZFS on hackers@ ? > I found it advantageous to use UMA for ZFS I/O buffers, but only after reducing > size of per-CPU caches for the zones with large-sized items. > I further modified the code in my local tree to completely disable per-CPU > caches for items > 32KB. Do you have updated patch disabling per-cpu caches for large items? I've just rebuilt FreeBSD-8 with your uma-2.diff (it needed r209050 from -head to compile) and so far things look good. I'll re-enable UMA for ZFS and see how it flies in a couple of days. 
--Artem From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 03:46:02 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4C8621065674 for ; Wed, 29 Sep 2010 03:46:02 +0000 (UTC) (envelope-from jurgen@ish.com.au) Received: from fish.ish.com.au (eth5921.nsw.adsl.internode.on.net [59.167.240.32]) by mx1.freebsd.org (Postfix) with ESMTP id D1B778FC15 for ; Wed, 29 Sep 2010 03:46:00 +0000 (UTC) Received: from ip-211.ish.com.au ([203.29.62.211]:56286 helo=ish.com.au) by fish.ish.com.au with esmtp (Exim 4.69) (envelope-from ) id 1P0ncD-0007qd-0n; Wed, 29 Sep 2010 13:45:57 +1000 Received: from [203.29.62.154] (HELO ip-154.ish.com.au) by ish.com.au (CommuniGate Pro SMTP 5.3.8) with ESMTP id 6224795; Wed, 29 Sep 2010 13:45:56 +1000 Message-ID: <4CA2B674.7050000@ish.com.au> Date: Wed, 29 Sep 2010 13:45:56 +1000 From: Jurgen Weber User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-GB; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CA19F27.6050903@ish.com.au> <20100928093051.GA59282@icarus.home.lan> In-Reply-To: <20100928093051.GA59282@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 03:46:02 -0000 Jeremy Thanks for having a look. 
Nothing in loader.conf # cat /etc/sysctl.conf # Do not send RSTs for packets to closed ports net.inet.tcp.blackhole=2 # Do not send ICMP port unreach messages for closed ports net.inet.udp.blackhole=1 # Generate random IP_ID's net.inet.ip.random_id=1 # Breaks RFC1379, but nobody uses it anyway net.inet.tcp.drop_synfin=1 net.inet.ip.redirect=1 net.inet.tcp.syncookies=1 net.inet.tcp.recvspace=65228 net.inet.tcp.sendspace=65228 # fastforwarding - see http://lists.freebsd.org/pipermail/freebsd-net/2004-January/002534.html net.inet.ip.fastforwarding=1 net.inet.tcp.delayed_ack=0 net.inet.udp.maxdgram=57344 kern.rndtest.verbose=0 net.link.bridge.pfil_onlyip=0 net.link.tap.user_open=1 # The system will attempt to calculate the bandwidth delay product for each connection and limit the amount of data queued to the network to just the amount required to maintain optimum throughput. net.inet.tcp.inflight.enable=1 net.inet.ip.portrange.first=1024 net.inet.ip.intr_queue_maxlen=1000 net.link.bridge.pfil_bridge=0 # Disable TCP extended debugging net.inet.tcp.log_debug=0 # Set a reasonable ICMPLimit net.inet.icmp.icmplim=500 # TSO causes problems with em(4) and reply-to, and isn't of much benefit in a firewall, disable. 
net.inet.tcp.tso=0 # kenv | grep smbios smbios.bios.reldate="12/19/2008" smbios.bios.vendor="Phoenix Technologies LTD" smbios.bios.version="1.2a " smbios.chassis.maker="Supermicro" smbios.chassis.serial="0123456789" smbios.chassis.tag=" " smbios.chassis.version="0123456789" smbios.planar.maker="Supermicro" smbios.planar.product="X7SBi-LN4" smbios.planar.serial="0123456789" smbios.planar.version="PCB Version" smbios.socket.enabled="1" smbios.socket.populated="1" smbios.system.maker="Supermicro" smbios.system.product="X7SBi-LN4" smbios.system.serial="0123456789" smbios.system.uuid="53d1a494-d663-a0e7-890b-8a0f00f08a0f" smbios.system.version="0123456789" # sysctl kern.timecounter kern.timecounter.tick: 1 kern.timecounter.choice: TSC(-100) i8254(0) dummy(-1000000) kern.timecounter.hardware: i8254 kern.timecounter.stepwarnings: 0 kern.timecounter.tc.i8254.mask: 65535 kern.timecounter.tc.i8254.counter: 27546 kern.timecounter.tc.i8254.frequency: 1193182 kern.timecounter.tc.i8254.quality: 0 kern.timecounter.tc.TSC.mask: 4294967295 kern.timecounter.tc.TSC.counter: 1322201372 kern.timecounter.tc.TSC.frequency: 2926018304 kern.timecounter.tc.TSC.quality: -100 kern.timecounter.smp_tsc: 0 kern.timecounter.invariant_tsc: 0 Thanks Jurgen On 28/09/10 7:30 PM, Jeremy Chadwick wrote: > Can you provide any tuning you do in loader.conf or sysctl.conf, as well > as your kernel configuration? 
--------------------------> ish http://www.ish.com.au Level 1, 30 Wilson Street Newtown 2042 Australia phone +61 2 9550 5001 fax +61 2 9550 4001 From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 03:49:43 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 352B7106564A for ; Wed, 29 Sep 2010 03:49:43 +0000 (UTC) (envelope-from jurgen@ish.com.au) Received: from fish.ish.com.au (eth5921.nsw.adsl.internode.on.net [59.167.240.32]) by mx1.freebsd.org (Postfix) with ESMTP id E7D518FC08 for ; Wed, 29 Sep 2010 03:49:42 +0000 (UTC) Received: from ip-211.ish.com.au ([203.29.62.211]:21445 helo=ish.com.au) by fish.ish.com.au with esmtp (Exim 4.69) (envelope-from ) id 1P0nfo-0007x3-0g; Wed, 29 Sep 2010 13:49:40 +1000 Received: from [203.29.62.154] (HELO ip-154.ish.com.au) by ish.com.au (CommuniGate Pro SMTP 5.3.8) with ESMTP id 6224797; Wed, 29 Sep 2010 13:49:40 +1000 Message-ID: <4CA2B753.4010107@ish.com.au> Date: Wed, 29 Sep 2010 13:49:39 +1000 From: Jurgen Weber User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-GB; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Andriy Gapon References: <4CA19F27.6050903@ish.com.au> <4CA1BE59.7060906@icyb.net.ua> In-Reply-To: <4CA1BE59.7060906@icyb.net.ua> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 03:49:43 -0000 Andriy You can find everything you are after here: http://pastebin.com/WH4V2W0F Thanks Jurgen On 28/09/10 8:07 PM, Andriy Gapon wrote: > on 28/09/2010 10:54 Jurgen Weber said the following: >> # dmesg | grep Timecounter >> 
Timecounter "i8254" frequency 1193182 Hz quality 0 >> Timecounters tick every 1.000 msec >> # sysctl kern.timecounter.hardware >> kern.timecounter.hardware: i8254 >> >> Only have one timer to choose from. > > Can you provide a little bit more of "hard" data than the above? > Specifically, the following sysctls: > kern.timecounter > dev.cpu > > Output of vmstat -i. > _Verbose_ boot dmesg. > > Please do not disable ACPI when taking this data. > Preferably, upload it somewhere and post a link to it. -- --------------------------> ish http://www.ish.com.au Level 1, 30 Wilson Street Newtown 2042 Australia phone +61 2 9550 5001 fax +61 2 9550 4001 From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 03:55:37 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2DD401065679 for ; Wed, 29 Sep 2010 03:55:37 +0000 (UTC) (envelope-from reichert@numachi.com) Received: from meisai.numachi.com (meisai.numachi.com [198.175.254.6]) by mx1.freebsd.org (Postfix) with SMTP id 8E9428FC24 for ; Wed, 29 Sep 2010 03:55:36 +0000 (UTC) Received: (qmail 47956 invoked by uid 1001); 29 Sep 2010 03:28:54 -0000 Date: Tue, 28 Sep 2010 23:28:54 -0400 From: Brian Reichert To: Miroslav Lachman <000.fbsd@quip.cz> Message-ID: <20100929032854.GA47898@numachi.com> References: <4CA22FF0.8060303@quip.cz> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA22FF0.8060303@quip.cz> User-Agent: Mutt/1.5.9i Cc: freebsd-stable@freebsd.org Subject: Re: fetch: Non-recoverable resolver failure X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 03:55:37 -0000 On Tue, Sep 28, 2010 at 08:12:00PM +0200, Miroslav Lachman wrote: > The exact lines from crontab 
are: > > */5 * * * * fetch -qo /dev/null > "https://hiden.example.com/cron/fiveminutes" > > */5 * * * * fetch -qo /dev/null > "http://another.example.com/wd.php?hash=cslhakjs87LJ3rysalj79" In addition to anything else, I suspect the question mark in double-quotes might cause some shell-related interpretation; perhaps single quotes will be safer... -- Brian Reichert 55 Crystal Ave. #286 Derry NH 03038-1725 USA BSD admin/developer at large From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 04:05:19 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 843B71065670 for ; Wed, 29 Sep 2010 04:05:19 +0000 (UTC) (envelope-from jurgen@ish.com.au) Received: from fish.ish.com.au (eth5921.nsw.adsl.internode.on.net [59.167.240.32]) by mx1.freebsd.org (Postfix) with ESMTP id 414638FC17 for ; Wed, 29 Sep 2010 04:05:18 +0000 (UTC) Received: from ip-211.ish.com.au ([203.29.62.211]:53945 helo=ish.com.au) by fish.ish.com.au with esmtp (Exim 4.69) (envelope-from ) id 1P0nuu-0008TU-0L; Wed, 29 Sep 2010 14:05:16 +1000 Received: from [203.29.62.154] (HELO ip-154.ish.com.au) by ish.com.au (CommuniGate Pro SMTP 5.3.8) with ESMTP id 6224829; Wed, 29 Sep 2010 14:05:15 +1000 Message-ID: <4CA2BAFB.60304@ish.com.au> Date: Wed, 29 Sep 2010 14:05:15 +1000 From: Jurgen Weber User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-GB; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: borislav nikolov References: <4CA19F27.6050903@ish.com.au> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: "freebsd-stable@freebsd.org" Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 04:05:19 
-0000

Interesting, using systat everything looks fine. The interrupts hang around 2000.

Thanks

Jurgen

On 28/09/10 8:33 PM, borislav nikolov wrote:
> Hello,
> vmstat -i calculates the interrupt rate based on interrupt count/uptime, and the interrupt count is a 32-bit integer.
> With high values of kern.hz it will overflow in a few days (with kern.hz=4000 it will happen every 12 days or so).
> If that is the case, use systat -vmstat 1 to get an accurate interrupt rate.
> That is just FYI, because I was confused once and it scared me a bit, and I started changing counters until I noticed this.
>
> p.s. please forgive my poor English

--
--------------------------> ish http://www.ish.com.au Level 1, 30 Wilson Street Newtown 2042 Australia phone +61 2 9550 5001 fax +61 2 9550 4001
From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 05:31:40 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 83BE6106564A for ; Wed, 29 Sep 2010 05:31:40 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 3E92D8FC1F for ; Wed, 29 Sep 2010 05:31:39 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8T5VRZJ061189; Tue, 28 Sep 2010 22:31:31 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009290531.o8T5VRZJ061189@gw.catspoiler.org> Date: Tue, 28 Sep 2010 22:31:27 -0700 (PDT) From: Don Lewis To: freebsd@jdc.parodius.com In-Reply-To: <201009282344.o8SNiqSK060715@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org, sterling@camdensoftware.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 05:31:40 -0000 On 28 Sep, Don Lewis wrote: > Looking at the timestamps of things and comparing to my logs, I > discovered that the last instance of ntp instability happened when I was > running "make index" in /usr/ports. I tried it again with entertaining > results. After a while, the machine became unresponsive. I was logged > in over ssh and it stopped echoing keystrokes. In parallel I was > running a script that echoed the date, the results of "vmstat -i", and > the results of "ntpq -c pe". The latter showed jitter and offset going > insane. Eventually "make index" finished and the machine was responsive > again, but the time was way off and ntpd croaked because the necessary > time correction was too large. Nothing else anomalous showed up in the > logs. Hmn, about half an hour after ntpd died I started my CPU time > accounting test and two minutes into that test I got a spew of calcru > messages ... I tried this experiment again using a kernel with WITNESS and DEBUG_VFS_LOCKS compiled in, and pinging this machine from another. Things look normal for a while, then the ping times get huge for a while and then recover. 

64 bytes from 192.168.101.3: icmp_seq=1169 ttl=64 time=0.135 ms
64 bytes from 192.168.101.3: icmp_seq=1170 ttl=64 time=0.141 ms
64 bytes from 192.168.101.3: icmp_seq=1171 ttl=64 time=0.130 ms
64 bytes from 192.168.101.3: icmp_seq=1172 ttl=64 time=0.131 ms
64 bytes from 192.168.101.3: icmp_seq=1173 ttl=64 time=0.128 ms
64 bytes from 192.168.101.3: icmp_seq=1174 ttl=64 time=38232.140 ms
64 bytes from 192.168.101.3: icmp_seq=1175 ttl=64 time=37231.309 ms
64 bytes from 192.168.101.3: icmp_seq=1176 ttl=64 time=36230.470 ms
64 bytes from 192.168.101.3: icmp_seq=1177 ttl=64 time=35229.632 ms
64 bytes from 192.168.101.3: icmp_seq=1178 ttl=64 time=34228.791 ms
64 bytes from 192.168.101.3: icmp_seq=1179 ttl=64 time=33227.953 ms
64 bytes from 192.168.101.3: icmp_seq=1180 ttl=64 time=32227.091 ms
64 bytes from 192.168.101.3: icmp_seq=1181 ttl=64 time=31226.262 ms
64 bytes from 192.168.101.3: icmp_seq=1182 ttl=64 time=30225.425 ms
64 bytes from 192.168.101.3: icmp_seq=1183 ttl=64 time=29224.597 ms
64 bytes from 192.168.101.3: icmp_seq=1184 ttl=64 time=28223.757 ms
64 bytes from 192.168.101.3: icmp_seq=1185 ttl=64 time=27222.918 ms
64 bytes from 192.168.101.3: icmp_seq=1186 ttl=64 time=26222.086 ms
64 bytes from 192.168.101.3: icmp_seq=1187 ttl=64 time=25221.164 ms
64 bytes from 192.168.101.3: icmp_seq=1188 ttl=64 time=24220.407 ms
64 bytes from 192.168.101.3: icmp_seq=1189 ttl=64 time=23219.575 ms
64 bytes from 192.168.101.3: icmp_seq=1190 ttl=64 time=22218.737 ms
64 bytes from 192.168.101.3: icmp_seq=1191 ttl=64 time=21217.905 ms
64 bytes from 192.168.101.3: icmp_seq=1192 ttl=64 time=20217.066 ms
64 bytes from 192.168.101.3: icmp_seq=1193 ttl=64 time=19216.228 ms
64 bytes from 192.168.101.3: icmp_seq=1194 ttl=64 time=18215.333 ms
64 bytes from 192.168.101.3: icmp_seq=1195 ttl=64 time=17214.503 ms
64 bytes from 192.168.101.3: icmp_seq=1196 ttl=64 time=16213.720 ms
64 bytes from 192.168.101.3: icmp_seq=1197 ttl=64 time=15210.912 ms
64 bytes from 192.168.101.3: icmp_seq=1198 ttl=64 time=14210.044 ms
64 bytes from 192.168.101.3: icmp_seq=1199 ttl=64 time=13209.194 ms
64 bytes from 192.168.101.3: icmp_seq=1200 ttl=64 time=12208.376 ms
64 bytes from 192.168.101.3: icmp_seq=1201 ttl=64 time=11207.536 ms
64 bytes from 192.168.101.3: icmp_seq=1202 ttl=64 time=10206.694 ms
64 bytes from 192.168.101.3: icmp_seq=1203 ttl=64 time=9205.816 ms
64 bytes from 192.168.101.3: icmp_seq=1204 ttl=64 time=8205.014 ms
64 bytes from 192.168.101.3: icmp_seq=1205 ttl=64 time=7204.186 ms
64 bytes from 192.168.101.3: icmp_seq=1206 ttl=64 time=6203.294 ms
64 bytes from 192.168.101.3: icmp_seq=1207 ttl=64 time=5202.510 ms
64 bytes from 192.168.101.3: icmp_seq=1208 ttl=64 time=4201.677 ms
64 bytes from 192.168.101.3: icmp_seq=1209 ttl=64 time=3200.851 ms
64 bytes from 192.168.101.3: icmp_seq=1210 ttl=64 time=2200.013 ms
64 bytes from 192.168.101.3: icmp_seq=1211 ttl=64 time=1199.100 ms
64 bytes from 192.168.101.3: icmp_seq=1212 ttl=64 time=198.331 ms
64 bytes from 192.168.101.3: icmp_seq=1213 ttl=64 time=0.129 ms
64 bytes from 192.168.101.3: icmp_seq=1214 ttl=64 time=58223.470 ms
64 bytes from 192.168.101.3: icmp_seq=1215 ttl=64 time=57222.637 ms
64 bytes from 192.168.101.3: icmp_seq=1216 ttl=64 time=56221.800 ms
64 bytes from 192.168.101.3: icmp_seq=1217 ttl=64 time=55220.960 ms
64 bytes from 192.168.101.3: icmp_seq=1218 ttl=64 time=54220.116 ms
64 bytes from 192.168.101.3: icmp_seq=1219 ttl=64 time=53219.282 ms
64 bytes from 192.168.101.3: icmp_seq=1220 ttl=64 time=52218.444 ms
64 bytes from 192.168.101.3: icmp_seq=1221 ttl=64 time=51217.618 ms
64 bytes from 192.168.101.3: icmp_seq=1222 ttl=64 time=50216.778 ms
64 bytes from 192.168.101.3: icmp_seq=1223 ttl=64 time=49215.932 ms
64 bytes from 192.168.101.3: icmp_seq=1224 ttl=64 time=48215.095 ms
64 bytes from 192.168.101.3: icmp_seq=1225 ttl=64 time=47214.262 ms
64 bytes from 192.168.101.3: icmp_seq=1226 ttl=64 time=46213.440 ms
64 bytes from 192.168.101.3: icmp_seq=1227 ttl=64 time=45212.623 ms
64 bytes from 192.168.101.3: icmp_seq=1228 ttl=64 time=44211.783 ms
64 bytes from 192.168.101.3: icmp_seq=1229 ttl=64 time=43210.903 ms
64 bytes from 192.168.101.3: icmp_seq=1230 ttl=64 time=42210.111 ms
64 bytes from 192.168.101.3: icmp_seq=1231 ttl=64 time=41209.274 ms
64 bytes from 192.168.101.3: icmp_seq=1232 ttl=64 time=40208.448 ms
64 bytes from 192.168.101.3: icmp_seq=1233 ttl=64 time=39207.608 ms
64 bytes from 192.168.101.3: icmp_seq=1234 ttl=64 time=38206.774 ms
64 bytes from 192.168.101.3: icmp_seq=1235 ttl=64 time=37205.842 ms
64 bytes from 192.168.101.3: icmp_seq=1236 ttl=64 time=36205.104 ms
64 bytes from 192.168.101.3: icmp_seq=1237 ttl=64 time=35204.270 ms
64 bytes from 192.168.101.3: icmp_seq=1238 ttl=64 time=34203.433 ms
64 bytes from 192.168.101.3: icmp_seq=1239 ttl=64 time=33202.603 ms
64 bytes from 192.168.101.3: icmp_seq=1240 ttl=64 time=32201.764 ms
64 bytes from 192.168.101.3: icmp_seq=1241 ttl=64 time=31200.924 ms
64 bytes from 192.168.101.3: icmp_seq=1242 ttl=64 time=30200.082 ms
64 bytes from 192.168.101.3: icmp_seq=1243 ttl=64 time=29198.883 ms
64 bytes from 192.168.101.3: icmp_seq=1244 ttl=64 time=28198.414 ms
64 bytes from 192.168.101.3: icmp_seq=1245 ttl=64 time=27197.434 ms
64 bytes from 192.168.101.3: icmp_seq=1246 ttl=64 time=26196.738 ms
64 bytes from 192.168.101.3: icmp_seq=1247 ttl=64 time=25195.912 ms
64 bytes from 192.168.101.3: icmp_seq=1248 ttl=64 time=24195.074 ms
64 bytes from 192.168.101.3: icmp_seq=1249 ttl=64 time=23194.231 ms
64 bytes from 192.168.101.3: icmp_seq=1250 ttl=64 time=22193.407 ms
64 bytes from 192.168.101.3: icmp_seq=1251 ttl=64 time=21192.565 ms
64 bytes from 192.168.101.3: icmp_seq=1252 ttl=64 time=20191.725 ms
64 bytes from 192.168.101.3: icmp_seq=1253 ttl=64 time=19190.852 ms
64 bytes from 192.168.101.3: icmp_seq=1254 ttl=64 time=18190.060 ms
64 bytes from 192.168.101.3: icmp_seq=1255 ttl=64 time=17189.220 ms
64 bytes from 192.168.101.3: icmp_seq=1256 ttl=64 time=16188.381 ms
64 bytes from 192.168.101.3: icmp_seq=1257 ttl=64 time=15183.118 ms
64 bytes from 192.168.101.3: icmp_seq=1258 ttl=64 time=14182.711 ms
64 bytes from 192.168.101.3: icmp_seq=1259 ttl=64 time=13181.876 ms
64 bytes from 192.168.101.3: icmp_seq=1260 ttl=64 time=12181.034 ms
64 bytes from 192.168.101.3: icmp_seq=1261 ttl=64 time=11180.192 ms
64 bytes from 192.168.101.3: icmp_seq=1262 ttl=64 time=10179.357 ms
64 bytes from 192.168.101.3: icmp_seq=1263 ttl=64 time=9178.522 ms
64 bytes from 192.168.101.3: icmp_seq=1264 ttl=64 time=8177.692 ms
64 bytes from 192.168.101.3: icmp_seq=1265 ttl=64 time=7176.850 ms
64 bytes from 192.168.101.3: icmp_seq=1266 ttl=64 time=6176.026 ms
64 bytes from 192.168.101.3: icmp_seq=1267 ttl=64 time=5175.185 ms
64 bytes from 192.168.101.3: icmp_seq=1268 ttl=64 time=4174.355 ms
64 bytes from 192.168.101.3: icmp_seq=1269 ttl=64 time=3173.479 ms
64 bytes from 192.168.101.3: icmp_seq=1270 ttl=64 time=2172.658 ms
64 bytes from 192.168.101.3: icmp_seq=1271 ttl=64 time=1171.835 ms
64 bytes from 192.168.101.3: icmp_seq=1272 ttl=64 time=170.971 ms
64 bytes from 192.168.101.3: icmp_seq=1273 ttl=64 time=0.138 ms
64 bytes from 192.168.101.3: icmp_seq=1274 ttl=64 time=0.162 ms
64 bytes from 192.168.101.3: icmp_seq=1275 ttl=64 time=0.133 ms
64 bytes from 192.168.101.3: icmp_seq=1276 ttl=64 time=0.140 ms
64 bytes from 192.168.101.3: icmp_seq=1277 ttl=64 time=0.138 ms
64 bytes from 192.168.101.3: icmp_seq=1278 ttl=64 time=0.132 ms
64 bytes from 192.168.101.3: icmp_seq=1279 ttl=64 time=0.132 ms
64 bytes from 192.168.101.3: icmp_seq=1280 ttl=64 time=0.132 ms
64 bytes from 192.168.101.3: icmp_seq=1281 ttl=64 time=0.129 ms

At that point the machine silently rebooted in spite of being compiled
with KDB and DDB and not KDB_UNATTENDED. This silent reboot is
reproducible.
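
One way to read the ping output above: if the sender emits one request per second (the ping(8) default, which is an assumption here since the command line is not shown), then adding each reply's round-trip time to its send time estimates when the reply actually came back. A minimal sketch, with a handful of samples copied from the first burst and helper names of my own choosing:

```python
# (icmp_seq, rtt_ms) pairs copied from the first burst in the ping output above
burst = [
    (1174, 38232.140),
    (1175, 37231.309),
    (1180, 32227.091),
    (1190, 22218.737),
    (1200, 12208.376),
    (1210, 2200.013),
    (1212, 198.331),
]

SEND_INTERVAL_S = 1.0  # assumption: one echo request per second (ping default)


def reply_arrival_times(samples, interval=SEND_INTERVAL_S):
    """Estimate when each reply arrived, in seconds relative to icmp_seq 0."""
    return [seq * interval + rtt_ms / 1000.0 for seq, rtt_ms in samples]


arrivals = reply_arrival_times(burst)
spread = max(arrivals) - min(arrivals)
print(f"arrivals between {min(arrivals):.3f}s and {max(arrivals):.3f}s "
      f"(spread {spread:.3f}s)")
```

Under that assumption the sampled replies all land within a few tens of milliseconds of each other, which would be consistent with the machine answering a backlog of queued requests in one burst after a stall of roughly 38 seconds, rather than with uniformly high latency.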
From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 07:00:19 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 87A2B1065674 for ; Wed, 29 Sep 2010 07:00:19 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta15.emeryville.ca.mail.comcast.net (qmta15.emeryville.ca.mail.comcast.net [76.96.27.228]) by mx1.freebsd.org (Postfix) with ESMTP id 6B3ED8FC18 for ; Wed, 29 Sep 2010 07:00:19 +0000 (UTC) Received: from omta20.emeryville.ca.mail.comcast.net ([76.96.30.87]) by qmta15.emeryville.ca.mail.comcast.net with comcast id CWot1f0011smiN4AFX0KRN; Wed, 29 Sep 2010 07:00:19 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta20.emeryville.ca.mail.comcast.net with comcast id CX0H1f00R3LrwQ28gX0JYf; Wed, 29 Sep 2010 07:00:18 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id B8EB09B418; Wed, 29 Sep 2010 00:00:17 -0700 (PDT) Date: Wed, 29 Sep 2010 00:00:17 -0700 From: Jeremy Chadwick To: Don Lewis Message-ID: <20100929070017.GA82362@icarus.home.lan> References: <201009282344.o8SNiqSK060715@gw.catspoiler.org> <201009290531.o8T5VRZJ061189@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201009290531.o8T5VRZJ061189@gw.catspoiler.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: stable@FreeBSD.org, sterling@camdensoftware.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 07:00:19 -0000 On Tue, Sep 28, 2010 at 10:31:27PM -0700, Don Lewis wrote: > On 28 Sep, Don Lewis wrote: > > > Looking at the timestamps of things and comparing to my logs, I > > discovered that the last 
instance of ntp instability happened when I was > > running "make index" in /usr/ports. I tried it again with entertaining > > results. After a while, the machine became unresponsive. I was logged > > in over ssh and it stopped echoing keystrokes. In parallel I was > > running a script that echoed the date, the results of "vmstat -i", and > > the results of "ntpq -c pe". The latter showed jitter and offset going > > insane. Eventually "make index" finished and the machine was responsive > > again, but the time was way off and ntpd croaked because the necessary > > time correction was too large. Nothing else anomalous showed up in the > > logs. Hmn, about half an hour after ntpd died I started my CPU time > > accounting test and two minutes into that test I got a spew of calcru > > messages ... > > I tried this experiment again using a kernel with WITNESS and > DEBUG_VFS_LOCKS compiled in, and pinging this machine from another. > Things look normal for a while, then the ping times get huge for a while > and then recover. 
> > 64 bytes from 192.168.101.3: icmp_seq=1169 ttl=64 time=0.135 ms > 64 bytes from 192.168.101.3: icmp_seq=1170 ttl=64 time=0.141 ms > 64 bytes from 192.168.101.3: icmp_seq=1171 ttl=64 time=0.130 ms > 64 bytes from 192.168.101.3: icmp_seq=1172 ttl=64 time=0.131 ms > 64 bytes from 192.168.101.3: icmp_seq=1173 ttl=64 time=0.128 ms > 64 bytes from 192.168.101.3: icmp_seq=1174 ttl=64 time=38232.140 ms > 64 bytes from 192.168.101.3: icmp_seq=1175 ttl=64 time=37231.309 ms > 64 bytes from 192.168.101.3: icmp_seq=1176 ttl=64 time=36230.470 ms > 64 bytes from 192.168.101.3: icmp_seq=1177 ttl=64 time=35229.632 ms > 64 bytes from 192.168.101.3: icmp_seq=1178 ttl=64 time=34228.791 ms > 64 bytes from 192.168.101.3: icmp_seq=1179 ttl=64 time=33227.953 ms > 64 bytes from 192.168.101.3: icmp_seq=1180 ttl=64 time=32227.091 ms > 64 bytes from 192.168.101.3: icmp_seq=1181 ttl=64 time=31226.262 ms > 64 bytes from 192.168.101.3: icmp_seq=1182 ttl=64 time=30225.425 ms > 64 bytes from 192.168.101.3: icmp_seq=1183 ttl=64 time=29224.597 ms > 64 bytes from 192.168.101.3: icmp_seq=1184 ttl=64 time=28223.757 ms > 64 bytes from 192.168.101.3: icmp_seq=1185 ttl=64 time=27222.918 ms > 64 bytes from 192.168.101.3: icmp_seq=1186 ttl=64 time=26222.086 ms > 64 bytes from 192.168.101.3: icmp_seq=1187 ttl=64 time=25221.164 ms > 64 bytes from 192.168.101.3: icmp_seq=1188 ttl=64 time=24220.407 ms > 64 bytes from 192.168.101.3: icmp_seq=1189 ttl=64 time=23219.575 ms > 64 bytes from 192.168.101.3: icmp_seq=1190 ttl=64 time=22218.737 ms > 64 bytes from 192.168.101.3: icmp_seq=1191 ttl=64 time=21217.905 ms > 64 bytes from 192.168.101.3: icmp_seq=1192 ttl=64 time=20217.066 ms > 64 bytes from 192.168.101.3: icmp_seq=1193 ttl=64 time=19216.228 ms > 64 bytes from 192.168.101.3: icmp_seq=1194 ttl=64 time=18215.333 ms > 64 bytes from 192.168.101.3: icmp_seq=1195 ttl=64 time=17214.503 ms > 64 bytes from 192.168.101.3: icmp_seq=1196 ttl=64 time=16213.720 ms > 64 bytes from 192.168.101.3: icmp_seq=1197 ttl=64 
time=15210.912 ms > 64 bytes from 192.168.101.3: icmp_seq=1198 ttl=64 time=14210.044 ms > 64 bytes from 192.168.101.3: icmp_seq=1199 ttl=64 time=13209.194 ms > 64 bytes from 192.168.101.3: icmp_seq=1200 ttl=64 time=12208.376 ms > 64 bytes from 192.168.101.3: icmp_seq=1201 ttl=64 time=11207.536 ms > 64 bytes from 192.168.101.3: icmp_seq=1202 ttl=64 time=10206.694 ms > 64 bytes from 192.168.101.3: icmp_seq=1203 ttl=64 time=9205.816 ms > 64 bytes from 192.168.101.3: icmp_seq=1204 ttl=64 time=8205.014 ms > 64 bytes from 192.168.101.3: icmp_seq=1205 ttl=64 time=7204.186 ms > 64 bytes from 192.168.101.3: icmp_seq=1206 ttl=64 time=6203.294 ms > 64 bytes from 192.168.101.3: icmp_seq=1207 ttl=64 time=5202.510 ms > 64 bytes from 192.168.101.3: icmp_seq=1208 ttl=64 time=4201.677 ms > 64 bytes from 192.168.101.3: icmp_seq=1209 ttl=64 time=3200.851 ms > 64 bytes from 192.168.101.3: icmp_seq=1210 ttl=64 time=2200.013 ms > 64 bytes from 192.168.101.3: icmp_seq=1211 ttl=64 time=1199.100 ms > 64 bytes from 192.168.101.3: icmp_seq=1212 ttl=64 time=198.331 ms > 64 bytes from 192.168.101.3: icmp_seq=1213 ttl=64 time=0.129 ms > 64 bytes from 192.168.101.3: icmp_seq=1214 ttl=64 time=58223.470 ms > 64 bytes from 192.168.101.3: icmp_seq=1215 ttl=64 time=57222.637 ms > 64 bytes from 192.168.101.3: icmp_seq=1216 ttl=64 time=56221.800 ms > 64 bytes from 192.168.101.3: icmp_seq=1217 ttl=64 time=55220.960 ms > 64 bytes from 192.168.101.3: icmp_seq=1218 ttl=64 time=54220.116 ms > 64 bytes from 192.168.101.3: icmp_seq=1219 ttl=64 time=53219.282 ms > 64 bytes from 192.168.101.3: icmp_seq=1220 ttl=64 time=52218.444 ms > 64 bytes from 192.168.101.3: icmp_seq=1221 ttl=64 time=51217.618 ms > 64 bytes from 192.168.101.3: icmp_seq=1222 ttl=64 time=50216.778 ms > 64 bytes from 192.168.101.3: icmp_seq=1223 ttl=64 time=49215.932 ms > 64 bytes from 192.168.101.3: icmp_seq=1224 ttl=64 time=48215.095 ms > 64 bytes from 192.168.101.3: icmp_seq=1225 ttl=64 time=47214.262 ms > 64 bytes from 192.168.101.3: 
icmp_seq=1226 ttl=64 time=46213.440 ms > 64 bytes from 192.168.101.3: icmp_seq=1227 ttl=64 time=45212.623 ms > 64 bytes from 192.168.101.3: icmp_seq=1228 ttl=64 time=44211.783 ms > 64 bytes from 192.168.101.3: icmp_seq=1229 ttl=64 time=43210.903 ms > 64 bytes from 192.168.101.3: icmp_seq=1230 ttl=64 time=42210.111 ms > 64 bytes from 192.168.101.3: icmp_seq=1231 ttl=64 time=41209.274 ms > 64 bytes from 192.168.101.3: icmp_seq=1232 ttl=64 time=40208.448 ms > 64 bytes from 192.168.101.3: icmp_seq=1233 ttl=64 time=39207.608 ms > 64 bytes from 192.168.101.3: icmp_seq=1234 ttl=64 time=38206.774 ms > 64 bytes from 192.168.101.3: icmp_seq=1235 ttl=64 time=37205.842 ms > 64 bytes from 192.168.101.3: icmp_seq=1236 ttl=64 time=36205.104 ms > 64 bytes from 192.168.101.3: icmp_seq=1237 ttl=64 time=35204.270 ms > 64 bytes from 192.168.101.3: icmp_seq=1238 ttl=64 time=34203.433 ms > 64 bytes from 192.168.101.3: icmp_seq=1239 ttl=64 time=33202.603 ms > 64 bytes from 192.168.101.3: icmp_seq=1240 ttl=64 time=32201.764 ms > 64 bytes from 192.168.101.3: icmp_seq=1241 ttl=64 time=31200.924 ms > 64 bytes from 192.168.101.3: icmp_seq=1242 ttl=64 time=30200.082 ms > 64 bytes from 192.168.101.3: icmp_seq=1243 ttl=64 time=29198.883 ms > 64 bytes from 192.168.101.3: icmp_seq=1244 ttl=64 time=28198.414 ms > 64 bytes from 192.168.101.3: icmp_seq=1245 ttl=64 time=27197.434 ms > 64 bytes from 192.168.101.3: icmp_seq=1246 ttl=64 time=26196.738 ms > 64 bytes from 192.168.101.3: icmp_seq=1247 ttl=64 time=25195.912 ms > 64 bytes from 192.168.101.3: icmp_seq=1248 ttl=64 time=24195.074 ms > 64 bytes from 192.168.101.3: icmp_seq=1249 ttl=64 time=23194.231 ms > 64 bytes from 192.168.101.3: icmp_seq=1250 ttl=64 time=22193.407 ms > 64 bytes from 192.168.101.3: icmp_seq=1251 ttl=64 time=21192.565 ms > 64 bytes from 192.168.101.3: icmp_seq=1252 ttl=64 time=20191.725 ms > 64 bytes from 192.168.101.3: icmp_seq=1253 ttl=64 time=19190.852 ms > 64 bytes from 192.168.101.3: icmp_seq=1254 ttl=64 time=18190.060 ms 
> 64 bytes from 192.168.101.3: icmp_seq=1255 ttl=64 time=17189.220 ms > 64 bytes from 192.168.101.3: icmp_seq=1256 ttl=64 time=16188.381 ms > 64 bytes from 192.168.101.3: icmp_seq=1257 ttl=64 time=15183.118 ms > 64 bytes from 192.168.101.3: icmp_seq=1258 ttl=64 time=14182.711 ms > 64 bytes from 192.168.101.3: icmp_seq=1259 ttl=64 time=13181.876 ms > 64 bytes from 192.168.101.3: icmp_seq=1260 ttl=64 time=12181.034 ms > 64 bytes from 192.168.101.3: icmp_seq=1261 ttl=64 time=11180.192 ms > 64 bytes from 192.168.101.3: icmp_seq=1262 ttl=64 time=10179.357 ms > 64 bytes from 192.168.101.3: icmp_seq=1263 ttl=64 time=9178.522 ms > 64 bytes from 192.168.101.3: icmp_seq=1264 ttl=64 time=8177.692 ms > 64 bytes from 192.168.101.3: icmp_seq=1265 ttl=64 time=7176.850 ms > 64 bytes from 192.168.101.3: icmp_seq=1266 ttl=64 time=6176.026 ms > 64 bytes from 192.168.101.3: icmp_seq=1267 ttl=64 time=5175.185 ms > 64 bytes from 192.168.101.3: icmp_seq=1268 ttl=64 time=4174.355 ms > 64 bytes from 192.168.101.3: icmp_seq=1269 ttl=64 time=3173.479 ms > 64 bytes from 192.168.101.3: icmp_seq=1270 ttl=64 time=2172.658 ms > 64 bytes from 192.168.101.3: icmp_seq=1271 ttl=64 time=1171.835 ms > 64 bytes from 192.168.101.3: icmp_seq=1272 ttl=64 time=170.971 ms > 64 bytes from 192.168.101.3: icmp_seq=1273 ttl=64 time=0.138 ms > 64 bytes from 192.168.101.3: icmp_seq=1274 ttl=64 time=0.162 ms > 64 bytes from 192.168.101.3: icmp_seq=1275 ttl=64 time=0.133 ms > 64 bytes from 192.168.101.3: icmp_seq=1276 ttl=64 time=0.140 ms > 64 bytes from 192.168.101.3: icmp_seq=1277 ttl=64 time=0.138 ms > 64 bytes from 192.168.101.3: icmp_seq=1278 ttl=64 time=0.132 ms > 64 bytes from 192.168.101.3: icmp_seq=1279 ttl=64 time=0.132 ms > 64 bytes from 192.168.101.3: icmp_seq=1280 ttl=64 time=0.132 ms > 64 bytes from 192.168.101.3: icmp_seq=1281 ttl=64 time=0.129 ms > > At that point the machine silently rebooted in spite of being compiled > with KDB and DDB and not KDB_UNATTENDED. This silent reboot is > reproducible.
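The quoted ping output has a recognizable shape: during each stall, the reported round-trip time drops by roughly 1000 ms per sequence number. That is what one would expect if the queued replies were all timestamped at nearly the same instant once processing resumed. The Python sketch below checks this for three replies copied from the first stall in the log. The function name is hypothetical, and the 1-second send interval (ping's default) is an assumption, not something stated in the thread.

```python
def apparent_arrival_ms(samples, interval_ms=1000.0):
    """Return seq * interval + rtt for each (icmp_seq, rtt_ms) sample.

    Assumes one request was sent every interval_ms (an assumption; the
    thread does not state the interval). If the replies were all
    timestamped at about the same instant, these values cluster tightly.
    """
    return [seq * interval_ms + rtt for seq, rtt in samples]


# Three consecutive replies from the first stall in the quoted log.
stall = [(1174, 38232.140), (1175, 37231.309), (1176, 36230.470)]

arrivals = apparent_arrival_ms(stall)
spread_ms = max(arrivals) - min(arrivals)
print(f"spread across stalled replies: {spread_ms:.3f} ms")
```

A spread of a few milliseconds, set against stalls of about 38 and 58 seconds, would suggest the replies were handled in a burst rather than delayed on the wire. It does not by itself say which host stalled or whether the clock was stepped.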
Given all the information here, in addition to the other portion of the thread (indicating ntpd reports extreme offset between the system clock and its stratum 1 source), I would say the motherboard is faulty or there is a system device which is behaving badly (possibly something pertaining to interrupts, but I don't know how to debug this on a low level). Can you boot verbosely and provide all of the output here or somewhere on the web? If possible, I would start by replacing the mainboard. The board looks to be a consumer-level board (I see an nfe(4) controller, for example). -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 07:23:18 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0569C106566B for ; Wed, 29 Sep 2010 07:23:18 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta02.emeryville.ca.mail.comcast.net (qmta02.emeryville.ca.mail.comcast.net [76.96.30.24]) by mx1.freebsd.org (Postfix) with ESMTP id E057E8FC1F for ; Wed, 29 Sep 2010 07:23:17 +0000 (UTC) Received: from omta05.emeryville.ca.mail.comcast.net ([76.96.30.43]) by qmta02.emeryville.ca.mail.comcast.net with comcast id CXJo1f0020vp7WLA2XPHJC; Wed, 29 Sep 2010 07:23:17 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta05.emeryville.ca.mail.comcast.net with comcast id CXPG1f0033LrwQ28RXPGTU; Wed, 29 Sep 2010 07:23:17 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 54F4A9B418; Wed, 29 Sep 2010 00:23:16 -0700 (PDT) Date: Wed, 29 Sep 2010 00:23:16 -0700 From: Jeremy Chadwick To: Miroslav Lachman <000.fbsd@quip.cz> Message-ID: <20100929072316.GA82514@icarus.home.lan> References: <4CA22FF0.8060303@quip.cz> 
<20100928184343.GA70384@icarus.home.lan> <4CA25718.2000101@quip.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA25718.2000101@quip.cz> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org Subject: Re: fetch: Non-recoverable resolver failure X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 07:23:18 -0000 On Tue, Sep 28, 2010 at 10:59:04PM +0200, Miroslav Lachman wrote: > Jeremy Chadwick wrote: > >On Tue, Sep 28, 2010 at 08:12:00PM +0200, Miroslav Lachman wrote: > >>Hi, > >> > >>we are using fetch command from cron to run PHP scripts periodically > >>and sometimes cron sends error e-mails like this: > >> > >>fetch: https://hiden.example.com/cron/fiveminutes: Non-recoverable > >>resolver failure > > [...] > > >>Note: target domains are hosted on the server it-self and named too. > >> > >>The system is FreeBSD 7.3-RELEASE-p2 i386 GENERIC > >> > >>Can somebody help me to diagnose this random fetch+resolver issue? > > > >The error in question comes from the resolver library returning > >EAI_FAIL. This return code can be returned to all sorts of applications > >(not just fetch), although how each app handles it may differ. So, > >chances are you really do have something going on upstream from you (one > >of the nameservers you use might not be available at all times), and it > >probably clears very quickly (before you have a chance to > >manually/interactively investigate it). > > The strange thing is that I have only one nameserver listed in > resolv.conf and it is the local one! 
(127.0.0.1) (there were two > "remote" nameservers, but I tried to switch to the local one to rule out > remote nameservers / network problems) > > >You're probably going to have to set up a combination of scripts that do > >tcpdump logging, and ktrace -t+ -i (and probably -a) logging (ex. ktrace > >-t+ -i -a -f /var/log/ktrace.fetch.out fetch -qo ...) to find out what's > >going on behind the scenes. The irregularity of the problem (re: > >"sometimes") warrants such. I'd recommend using something other than > >127.0.0.1 as your resolver if you need to do tcpdump. > > I will try it... there will be a lot of output as there are many > cron jobs and relatively high traffic on the webserver. But the fetch > resolver failure occurs only a few times a day. > > >Providing contents of your /etc/resolv.conf, as well as details about > >your network configuration on the machine (specifically if any > >firewall stacks (pf or ipfw) are in place) would help too. Some folks > >might want netstat -m output as well. > > There is nothing special in the network, the machine is Sun Fire > X2100 M2 with bge1 NIC connected to Cisco Linksys switch (100Mbps > port) with uplink (1Gbps port) connected to Cisco router with dual > 10Gbps connectivity. No firewalls in the path. There are more than > 10 other servers in the rack and we have no problems / error > messages in logs from other services / daemons related to DNS.
> > # cat /etc/resolv.conf > nameserver 127.0.0.1 > > > /# netstat -m > 279/861/1140 mbufs in use (current/cache/total) > 257/553/810/25600 mbuf clusters in use (current/cache/total/max) > 257/313 mbuf+clusters out of packet secondary zone in use (current/cache) > 5/306/311/12800 4k (page size) jumbo clusters in use > (current/cache/total/max) > 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max) > 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max) > 603K/2545K/3149K bytes allocated to network (current/cache/total) > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters) > 0/0/0 requests for jumbo clusters denied (4k/9k/16k) > 13/470/6656 sfbufs in use (current/peak/max) > 0 requests for sfbufs denied > 0 requests for sfbufs delayed > 3351782 requests for I/O initiated by sendfile > 0 calls to protocol drain routines > > > (real IPs were replaced) > > # ifconfig bge1 > bge1: flags=8843 metric 0 mtu 1500 > options=9b > ether 00:1e:68:2f:71:ab > inet 1.2.3.40 netmask 0xffffff80 broadcast 1.2.3.127 > inet 1.2.3.41 netmask 0xffffffff broadcast 1.2.3.41 > inet 1.2.3.42 netmask 0xffffffff broadcast 1.2.3.42 > media: Ethernet autoselect (100baseTX ) > status: active > > > NIC is: > > bge1@pci0:6:4:1: class=0x020000 card=0x534c108e > chip=0x167814e4 rev=0xa3 hdr=0x00 > vendor = 'Broadcom Corporation' > device = 'BCM5715C 10/100/100 PCIe Ethernet Controller' > class = network > subclass = ethernet > > > There is PF with some basic rules, mostly blocking incoming > packets, allowing all outgoing and scrubbing: > > scrub in on bge1 all fragment reassemble > scrub out on bge1 all no-df random-id min-ttl 24 max-mss 1492 > fragment reassemble > > pass out on bge1 inet proto udp all keep state > pass out on bge1 inet proto tcp from 1.2.3.40 to any flags S/SA > modulate state > pass out on bge1 inet proto tcp from 1.2.3.41 to any flags S/SA > modulate state > pass out on bge1 inet proto tcp from 1.2.3.42 to any flags S/SA > modulate state > > 
modified PF options:
>
> set timeout { frag 15, interval 5 }
> set limit { frags 2500, states 5000 }
> set optimization aggressive
> set block-policy drop
> set loginterface bge1
> # Let loopback and internal interface traffic flow without restrictions
> set skip on lo0

Please also provide "pfctl -s info" output, in addition to uname -a
output (you can hide the hostname), since the pf stack differs depending
on what FreeBSD version you're using.

Things that catch my eye as potential problems -- I don't have a way to
confirm these are responsible for your issue (DNS resolver lookups are
UDP-based, not TCP), but I want to point them out anyway.

1) "modulate state" is broken on FreeBSD. Taken from our pf.conf notes:

# Filtering (public interface only; see "set skip")
#
# NOTE: Do not use "modulate state", as it's known to be broken on FreeBSD.
# http://lists.freebsd.org/pipermail/freebsd-pf/2008-March/004227.html

2) "optimization aggressive" sounds dangerous given what pf.conf(5) says
about it. I'd like to know what it considers "idle".

3) I would also remove many of the options you have set in your "scrub
out" rule. Starting with a clean slate to see if things improve is
probably a good idea. As you'll see below, sometimes pf does things
which may be correct per IP specification but don't work quite right
with other vendors' IP stacks.

4) Your "set timeout" values look to be extreme. I would recommend
leaving these at their defaults given your situation.

5) This feature is not in use in your pf.conf, but I want to point it
out regardless. "reassemble tcp" is also broken in some way. Again taken
from our pf.conf notes:

# Normalization -- resolve/reduce traffic ambiguities.
#
# NOTE: Do NOT use 'reassemble tcp' as it definitely causes breakage.
# Issue may be related to other vendors' IP stacks, so let's leave it
# disabled.
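Since the failure is intermittent and, per the explanation earlier in this thread, corresponds to the resolver returning EAI_FAIL, a small probe that resolves the name in a loop and records each failure code may help line the failures up with tcpdump or ktrace captures. The Python sketch below is only an illustration: the function names are hypothetical, port 443 is assumed because the cron URL is https, and the resolver call is injectable so the loop can be tried without touching DNS.

```python
import socket
import time


def probe_resolver(host, attempts, resolve=socket.getaddrinfo, delay_s=0.0):
    """Resolve host repeatedly and collect resolver failures.

    Returns a list of (attempt_index, gai_error_code, message) tuples.
    `resolve` defaults to socket.getaddrinfo; pass a stand-in to
    exercise the loop without real DNS traffic.
    """
    failures = []
    for i in range(attempts):
        try:
            resolve(host, 443)  # port 443 assumed (https cron URL)
        except socket.gaierror as exc:
            failures.append((i, exc.errno, exc.strerror))
        if delay_s:
            time.sleep(delay_s)
    return failures


def is_non_recoverable(code):
    """True when the code is EAI_FAIL, the return code this thread
    associates with fetch's 'Non-recoverable resolver failure'."""
    return code == socket.EAI_FAIL
```

For example, calling probe_resolver("hiden.example.com", 300, delay_s=1.0) would try once a second for five minutes, with the placeholder hostname from the report swapped for the real one. This complements the tcpdump and ktrace logging suggested earlier; it does not replace it.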
-- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 07:25:09 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E4C321065670; Wed, 29 Sep 2010 07:25:09 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id EFA738FC0A; Wed, 29 Sep 2010 07:25:08 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA19592; Wed, 29 Sep 2010 10:24:51 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1P0r23-00037A-0A; Wed, 29 Sep 2010 10:24:51 +0300 Message-ID: <4CA2E9C2.3030806@icyb.net.ua> Date: Wed, 29 Sep 2010 10:24:50 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Artem Belevich References: <4CA1D06C.9050305@digiware.nl> <20100928115047.GA62142__15392.0458550148$1285675457$gmane$org@icarus.home.lan> <4CA1DDE9.8090107@icyb.net.ua> <20100928132355.GA63149@icarus.home.lan> <4CA1EF69.4040402@icyb.net.ua> <4CA21809.7090504@icyb.net.ua> <71D54408-4B97-4F7A-BD83-692D8D23461A@wanderview.com> <4CA22337.2010900@icyb.net.ua> <4CA25E92.4060904@icyb.net.ua> <5BD33772-C0EA-48A9-BE9A-C8FBAF0008D7@wanderview.com> <4CA26AB4.3050108@icyb.net.ua> In-Reply-To: X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@freebsd.org, Willem Jan Withagen , Jeremy Chadwick , fs@freebsd.org, Ben Kelly Subject: Re: Still 
getting kmem exhausted panic X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 07:25:10 -0000 on 29/09/2010 03:38 Artem Belevich said the following: > On Tue, Sep 28, 2010 at 3:22 PM, Andriy Gapon wrote: >> BTW, have you seen my posts about UMA and ZFS on hackers@ ? >> I found it advantageous to use UMA for ZFS I/O buffers, but only after reducing >> size of per-CPU caches for the zones with large-sized items. >> I further modified the code in my local tree to completely disable per-CPU >> caches for items > 32KB. > > Do you have updated patch disabling per-cpu caches for large items? > I've just rebuilt FreeBSD-8 with your uma-2.diff (it needed r209050 > from -head to compile) and so far things look good. I'll re-enable UMA > for ZFS and see how it flies in a couple of days. I've just uploaded uma-3.diff. It implements what uma-1.diff did, plus totally skips per-CPU caches for items > 32KB, and also has code from uma-2.diff for flushing per-CPU caches on significant memory shortage. Will appreciate your feedback. Thank you for testing! 
-- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 07:26:43 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 823471065670 for ; Wed, 29 Sep 2010 07:26:43 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id C56398FC1A for ; Wed, 29 Sep 2010 07:26:42 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id KAA19630; Wed, 29 Sep 2010 10:26:35 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1P0r3j-00037K-GD; Wed, 29 Sep 2010 10:26:35 +0300 Message-ID: <4CA2EA2B.1040706@icyb.net.ua> Date: Wed, 29 Sep 2010 10:26:35 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jurgen Weber References: <4CA19F27.6050903@ish.com.au> <4CA1BE59.7060906@icyb.net.ua> <4CA2B753.4010107@ish.com.au> In-Reply-To: <4CA2B753.4010107@ish.com.au> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 07:26:43 -0000 on 29/09/2010 06:49 Jurgen Weber said the following: > Andriy > > You can find everything you are after here: > > http://pastebin.com/WH4V2W0F Looks like this was with ACPI disabled? Can you try to re-enable it? Also, it doesn't look like the dmesg is verbose. 
> On 28/09/10 8:07 PM, Andriy Gapon wrote: >> on 28/09/2010 10:54 Jurgen Weber said the following: >>> # dmesg | grep Timecounter >>> Timecounter "i8254" frequency 1193182 Hz quality 0 >>> Timecounters tick every 1.000 msec >>> # sysctl kern.timecounter.hardware >>> kern.timecounter.hardware: i8254 >>> >>> Only have one timer to choose from. >> >> Can you provide a little bit more of "hard" data than the above? >> Specifically, the following sysctls: >> kern.timecounter >> dev.cpu >> >> Output of vmstat -i. >> _Verbose_ boot dmesg. >> >> Please do not disable ACPI when taking this data. >> Preferably, upload it somewhere and post a link to it. > -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 07:29:30 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 649D3106564A for ; Wed, 29 Sep 2010 07:29:30 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta03.emeryville.ca.mail.comcast.net (qmta03.emeryville.ca.mail.comcast.net [76.96.30.32]) by mx1.freebsd.org (Postfix) with ESMTP id 4C48D8FC1B for ; Wed, 29 Sep 2010 07:29:30 +0000 (UTC) Received: from omta12.emeryville.ca.mail.comcast.net ([76.96.30.44]) by qmta03.emeryville.ca.mail.comcast.net with comcast id CXTr1f0020x6nqcA3XVW65; Wed, 29 Sep 2010 07:29:30 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta12.emeryville.ca.mail.comcast.net with comcast id CXVU1f0073LrwQ28YXVUAg; Wed, 29 Sep 2010 07:29:29 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id A44D09B418; Wed, 29 Sep 2010 00:29:28 -0700 (PDT) Date: Wed, 29 Sep 2010 00:29:28 -0700 From: Jeremy Chadwick To: Jurgen Weber Message-ID: <20100929072928.GA82955@icarus.home.lan> References: <4CA19F27.6050903@ish.com.au> <4CA1BE59.7060906@icyb.net.ua> <4CA2B753.4010107@ish.com.au> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<4CA2B753.4010107@ish.com.au> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org, Andriy Gapon Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 07:29:30 -0000 On Wed, Sep 29, 2010 at 01:49:39PM +1000, Jurgen Weber wrote: > Andriy > > You can find everything you are after here: > > http://pastebin.com/WH4V2W0F The information provided here shows ACPI is disabled in addition to the boot not being verbose. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 07:40:00 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EFF85106566C for ; Wed, 29 Sep 2010 07:39:59 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id C1C008FC08 for ; Wed, 29 Sep 2010 07:39:59 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8T7dnol061377; Wed, 29 Sep 2010 00:39:53 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009290739.o8T7dnol061377@gw.catspoiler.org> Date: Wed, 29 Sep 2010 00:39:49 -0700 (PDT) From: Don Lewis To: freebsd@jdc.parodius.com In-Reply-To: <20100929070017.GA82362@icarus.home.lan> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org, sterling@camdensoftware.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: 
freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 07:40:00 -0000 On 29 Sep, Jeremy Chadwick wrote: > Given all the information here, in addition to the other portion of the > thread (indicating ntpd reports extreme offset between the system clock > and its stratum 1 source), I would say the motherboard is faulty or > there is a system device which is behaving badly (possibly something > pertaining to interrupts, but I don't know how to debug this on a low > level). Possible, but I haven't run into any problems running -CURRENT on this box with an SMP kernel. > Can you boot verbosely and provide all of the output here or somewhere > on the web? > If possible, I would start by replacing the mainboard. The board looks > to be a consumer-level board (I see an nfe(4) controller, for example). It's an Abit AN-M2 HD. The RAM is ECC. I haven't seen any machine check errors in the logs. I'll run prime95 as soon as I have a chance. 
From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 07:47:49 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E850D106566C for ; Wed, 29 Sep 2010 07:47:49 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta15.emeryville.ca.mail.comcast.net (qmta15.emeryville.ca.mail.comcast.net [76.96.27.228]) by mx1.freebsd.org (Postfix) with ESMTP id B5B918FC15 for ; Wed, 29 Sep 2010 07:47:49 +0000 (UTC) Received: from omta08.emeryville.ca.mail.comcast.net ([76.96.30.12]) by qmta15.emeryville.ca.mail.comcast.net with comcast id CXWo1f0010FhH24AFXnpdF; Wed, 29 Sep 2010 07:47:49 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta08.emeryville.ca.mail.comcast.net with comcast id CXno1f0093LrwQ28UXnowE; Wed, 29 Sep 2010 07:47:49 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 8DFC89B418; Wed, 29 Sep 2010 00:47:48 -0700 (PDT) Date: Wed, 29 Sep 2010 00:47:48 -0700 From: Jeremy Chadwick To: Don Lewis Message-ID: <20100929074748.GB83194@icarus.home.lan> References: <20100929070017.GA82362@icarus.home.lan> <201009290739.o8T7dnol061377@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201009290739.o8T7dnol061377@gw.catspoiler.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: stable@FreeBSD.org, sterling@camdensoftware.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 07:47:50 -0000 On Wed, Sep 29, 2010 at 12:39:49AM -0700, Don Lewis wrote: > On 29 Sep, Jeremy Chadwick wrote: > > > Given all the information here, in addition to the other portion of the > > thread (indicating ntpd 
reports extreme offset between the system clock > > and its stratum 1 source), I would say the motherboard is faulty or > > there is a system device which is behaving badly (possibly something > > pertaining to interrupts, but I don't know how to debug this on a low > > level). > > Possible, but I haven't run into any problems running -CURRENT on this > box with an SMP kernel. > > > Can you boot verbosely and provide all of the output here or somewhere > > on the web? > > > > > If possible, I would start by replacing the mainboard. The board looks > > to be a consumer-level board (I see an nfe(4) controller, for example). > > It's an Abit AN-M2 HD. The RAM is ECC. I haven't seen any machine > check errors in the logs. I'll run prime95 as soon as I have a chance. Thanks for the verbose boot. Since it works on -CURRENT, can you provide a verbose boot from that as well? Possibly someone made some changes between RELENG_8 and HEAD which fixed an issue, which could be MFC'd. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 08:41:42 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6D6F8106564A for ; Wed, 29 Sep 2010 08:41:42 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 1B7E98FC19 for ; Wed, 29 Sep 2010 08:41:41 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8T8fVul061470; Wed, 29 Sep 2010 01:41:35 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009290841.o8T8fVul061470@gw.catspoiler.org> Date: Wed, 29 Sep 2010 01:41:31 -0700 (PDT) From: Don Lewis To: freebsd@jdc.parodius.com In-Reply-To: <20100929074748.GB83194@icarus.home.lan> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org, sterling@camdensoftware.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 08:41:42 -0000 On 29 Sep, Jeremy Chadwick wrote: > On Wed, Sep 29, 2010 at 12:39:49AM -0700, Don Lewis wrote: >> On 29 Sep, Jeremy Chadwick wrote: >> >> > Given all the information here, in addition to the other portion of the >> > thread (indicating ntpd reports extreme offset between the system clock >> > and its stratum 1 source), I would say the motherboard is faulty or >> > there is a system device which is behaving badly (possibly something >> > pertaining to interrupts, but I don't know how to debug this on a low >> > level). 
>> >> Possible, but I haven't run into any problems running -CURRENT on this >> box with an SMP kernel. >> >> > Can you boot verbosely and provide all of the output here or somewhere >> > on the web? >> >> >> >> > If possible, I would start by replacing the mainboard. The board looks >> > to be a consumer-level board (I see an nfe(4) controller, for example). >> >> It's an Abit AN-M2 HD. The RAM is ECC. I haven't seen any machine >> check errors in the logs. I'll run prime95 as soon as I have a chance. > > Thanks for the verbose boot. Since it works on -CURRENT, can you > provide a verbose boot from that as well? Possibly someone made some > changes between RELENG_8 and HEAD which fixed an issue, which could be > MFC'd.

Even when I saw the weird ntp stepping problem and the calcru messages,
the system was still stable enough to build hundreds of ports. In the
most recent case, I built 800+ ports over several days without any other
hiccups.

It could also be a difference between SMP and !SMP. I just found a bug
that causes an immediate panic if lock profiling is enabled on a !SMP
kernel. This bug also exists in -CURRENT. Here's the patch:

Index: sys/sys/mutex.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/mutex.h,v
retrieving revision 1.105.2.1
diff -u -r1.105.2.1 mutex.h
--- sys/sys/mutex.h 3 Aug 2009 08:13:06 -0000 1.105.2.1
+++ sys/sys/mutex.h 29 Sep 2010 06:58:52 -0000
@@ -251,8 +251,11 @@
 #define _rel_spin_lock(mp) do { \
         if (mtx_recursed((mp))) \
                 (mp)->mtx_recurse--; \
-        else \
+        else { \
                 (mp)->mtx_lock = MTX_UNOWNED; \
+                LOCKSTAT_PROFILE_RELEASE_LOCK(LS_MTX_SPIN_UNLOCK_RELEASE, \
+                        mp); \
+        } \
         spinlock_exit(); \
 } while (0)
 #endif /* SMP */

After applying the above patch, I enabled lock profiling and got the
following results when I ran "make index":

I didn't see anything strange happening this time. I don't know if I got
lucky, or the change in kernel options "fixed" the bug.
From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 08:46:28 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AB3801065694 for ; Wed, 29 Sep 2010 08:46:28 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id BA06F8FC0A for ; Wed, 29 Sep 2010 08:46:27 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id LAA21537; Wed, 29 Sep 2010 11:46:24 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1P0sIy-0003Dz-Ha; Wed, 29 Sep 2010 11:46:24 +0300 Message-ID: <4CA2FCDF.2060609@icyb.net.ua> Date: Wed, 29 Sep 2010 11:46:23 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Don Lewis References: <201009282100.o8SL04CY060428@gw.catspoiler.org> <201009282111.o8SLBFKB060447@gw.catspoiler.org> In-Reply-To: <201009282111.o8SLBFKB060447@gw.catspoiler.org> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@FreeBSD.org, sterling@camdensoftware.com, freebsd@jdc.parodius.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 08:46:28 -0000 on 29/09/2010 00:11 Don Lewis said the following: > On 28 Sep, Don Lewis wrote: > > >> % vmstat -i >> interrupt total rate >> irq0: clk 60683442 1000 >> irq1: atkbd0 6 0 >> irq8: 
rtc 7765537 127 >> irq9: acpi0 13 0 >> irq10: ohci0 ehci1+ 10275064 169 >> irq11: fwohci0 ahc+ 132133 2 >> irq12: psm0 21 0 >> irq14: ata0 90982 1 >> irq15: nfe0 ata1 18363 0 >> >> I'm not sure why I'm getting USB interrupts. There aren't any USB >> devices plugged into this machine. > > Answer: irq 10 is also shared by vgapci0 and atapci1. Just curious why Local APIC timer isn't being used for hardclock on your system. -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 08:56:24 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 090301065672 for ; Wed, 29 Sep 2010 08:56:24 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id CD5548FC1D for ; Wed, 29 Sep 2010 08:56:23 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8T8uB2W061505; Wed, 29 Sep 2010 01:56:15 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009290856.o8T8uB2W061505@gw.catspoiler.org> Date: Wed, 29 Sep 2010 01:56:11 -0700 (PDT) From: Don Lewis To: avg@icyb.net.ua In-Reply-To: <4CA2FCDF.2060609@icyb.net.ua> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org, sterling@camdensoftware.com, freebsd@jdc.parodius.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 08:56:24 -0000 On 29 Sep, Andriy Gapon wrote: > on 29/09/2010 00:11 Don Lewis said the following: >> On 28 Sep, Don Lewis wrote: >> >> >>> % vmstat -i >>> interrupt total rate >>> irq0: 
clk 60683442 1000 >>> irq1: atkbd0 6 0 >>> irq8: rtc 7765537 127 >>> irq9: acpi0 13 0 >>> irq10: ohci0 ehci1+ 10275064 169 >>> irq11: fwohci0 ahc+ 132133 2 >>> irq12: psm0 21 0 >>> irq14: ata0 90982 1 >>> irq15: nfe0 ata1 18363 0 >>> >>> I'm not sure why I'm getting USB interrupts. There aren't any USB >>> devices plugged into this machine. >> >> Answer: irq 10 is also shared by vgapci0 and atapci1. > > Just curious why Local APIC timer isn't being used for hardclock on your system. I'm using the same kernel config as the one on a slower !SMP box which I'm trying to squeeze as much performance out of as possible. My kernel config file contains these statements: nooptions SMP nodevice apic Testing with an SMP kernel is on my TODO list. From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 09:04:17 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1F6F71065675; Wed, 29 Sep 2010 09:04:17 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 34EB88FC1C; Wed, 29 Sep 2010 09:04:15 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id MAA22234; Wed, 29 Sep 2010 12:04:14 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1P0saE-0003G5-5c; Wed, 29 Sep 2010 12:04:14 +0300 Message-ID: <4CA3010D.9080909@icyb.net.ua> Date: Wed, 29 Sep 2010 12:04:13 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Don Lewis References: <201009290856.o8T8uB2W061505@gw.catspoiler.org> In-Reply-To: <201009290856.o8T8uB2W061505@gw.catspoiler.org> X-Enigmail-Version: 
1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@FreeBSD.org, sterling@camdensoftware.com, freebsd@jdc.parodius.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 09:04:17 -0000 on 29/09/2010 11:56 Don Lewis said the following: > I'm using the same kernel config as the one on a slower !SMP box which > I'm trying to squeeze as much performance out of as possible. My kernel > config file contains these statements: > nooptions SMP > nodevice apic > > Testing with an SMP kernel is on my TODO list. SMP or not, it's really weird to see apic disabled nowadays. -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 14:50:18 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AEF4B106566B for ; Wed, 29 Sep 2010 14:50:18 +0000 (UTC) (envelope-from morgan.s.reed@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 4B2E08FC17 for ; Wed, 29 Sep 2010 14:50:17 +0000 (UTC) Received: by wwb17 with SMTP id 17so1103954wwb.31 for ; Wed, 29 Sep 2010 07:50:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:mime-version:received:from:date :message-id:subject:to:content-type; bh=f5V2jMlBDxxwrg6YIBCDxXvpN0qNRgSP1B2kLtA574E=; b=KlferynZkq63C5mzgyxZOuv7t7SUnT1a62YMocOEIFguOvlViK0Xtm1xaQb89gCmqX s2a3QZ6nUYMPImGf10fQoe0+CbT1l68ZDdZZD0iwke9eItTM3EpYZbKDYQrdhMVnaCk8 MyXFlvGiDHpTw9/tAQNnO5IyrIDl75eEE8Xac= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; 
h=mime-version:from:date:message-id:subject:to:content-type; b=mMm6cHX/tjolsEwlG+Hc/jm2NbeB1e+T6nz1DnkIOE/rvQbhkvn9GWA/VMisB6JWdz MYiORw+Strd2bJy1wHtQhEzzaQnRC3bO7G4Kf4bxBACHl+sSHDf5AcwSmNtyszCChO7F QQTkVhRFE99k84uooYsTEmevlE1jut/JD/WuU= Received: by 10.216.5.21 with SMTP id 21mr2596369wek.20.1285770307479; Wed, 29 Sep 2010 07:25:07 -0700 (PDT) MIME-Version: 1.0 Received: by 10.216.2.138 with HTTP; Wed, 29 Sep 2010 07:24:47 -0700 (PDT) From: Morgan Reed Date: Thu, 30 Sep 2010 00:24:47 +1000 Message-ID: To: stable@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: Subject: Diskless/readonly root booting issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 14:50:18 -0000 Hi all, I've been working on updating my semi-embedded images to 7.3-stable of late (I generally wait for .3+ releases), it's been a few years since the last time I did one of these and I'm having some issues getting my netboot test environment to behave itself. I'm sure it's something simple but I've spent quite a bit of time looking for answers and poking the system but no joy yet. Basically I use a PXE booted NFS root to test my reduced footprint image builds, the boot is working but init is attempting to remount / rw (in spite of it being marked ro in fstab) which of course fails because the directory is exported ro from the NFS server at which point the system dumps me to single user mode; === OUTPUT === Starting file system checks: udp: Netconfig database not found Mounting root filesystem rw failed, startup aborted ERROR: ABORTING BOOT (sending SIGTERM to parent)! 
Sep 30 09:60:02 init: /bin/sh on /etc/rc terminated abnormally, going to single user mode
Enter full pathname of shell or RETURN for /bin/sh:
============

Relevant configs from the diskless root

== rc.conf ==
ifconfig_le0="DHCP"
diskless_mount=/etc/rc.initdiskless
varsize=8192
varmfs="YES"
tmpsize=8192
tmpmfs="YES"
nfs_client_enable="YES"
dumpdev="NO"
=========

rc.initdiskless is the version from /usr/share/examples/rc.initdiskless

== fstab ==
192.168.2.2:/usr/fbtest / nfs ro 0 0
proc /proc procfs rw 0 0
========

== loader.conf ==
verbose_loading="YES"
autoboot_delay="2"
============

Kernel is (obviously) built with NFS_ROOT and NFSCLIENT, relatively
minimalist otherwise, have also tested with GENERIC, same result.

I must be forgetting something simple in all of this, I don't recall it
being terribly difficult to get this stuff working when I was doing my
original work with 6.3, though I don't recall the use of the initdiskless
script, IIRC I was using rc.diskless2 which (again IIRC) was later
replaced by /etc/rc.d/diskless but I've not been able to find this script
anywhere.

Any suggestions would be greatly appreciated at this point.
Thanks, Morgan Reed From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 15:44:10 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 68DC7106566B for ; Wed, 29 Sep 2010 15:44:10 +0000 (UTC) (envelope-from vmagerya@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id E55F58FC13 for ; Wed, 29 Sep 2010 15:44:09 +0000 (UTC) Received: by bwz15 with SMTP id 15so826169bwz.13 for ; Wed, 29 Sep 2010 08:44:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=3HX/YSzft76Mwx3RWt5YH6ySJH45nqfBWHY8NSV5FeY=; b=bEu0Tr3yjQcyeOBsmOj6BUdZtWHQD0KnhnBKe17fYusyolVmru5nzXMNqVATgE2/CB 8D1et7RJVnxboOGUKU2kExQeFyixjxzi6bmTghNDvcbHlIct+7033QazlxSukWLxt1x7 wLscjAXgyKesGfhxYOlStJ2SMKiFBUTaeKk7I= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=Kt3ccjIhZ/V7tRJjoMtMFJTriBESaCoLwhOV9UF5iULH7p/oMPXS4uXcHXveq6F43+ GADK8q0DKvHzzBVRfks0Wgs9CxA629SSft1jqMetKzs6MWnm4F/+wEfanmzHLiBdKSDe Ig8e4ehd8Obm/R5iaeBW8sf5oxEc5bXCn2Q3A= Received: by 10.204.81.203 with SMTP id y11mr1340090bkk.152.1285775048903; Wed, 29 Sep 2010 08:44:08 -0700 (PDT) Received: from [172.16.0.6] (tx97.net [85.198.160.156]) by mx.google.com with ESMTPS id y19sm6795997bkw.18.2010.09.29.08.44.06 (version=TLSv1/SSLv3 cipher=RC4-MD5); Wed, 29 Sep 2010 08:44:07 -0700 (PDT) Message-ID: <4CA35E64.1040101@gmail.com> Date: Wed, 29 Sep 2010 18:42:28 +0300 From: Vitaly Magerya User-Agent: Thunderbird MIME-Version: 1.0 To: Chuck Swiger References: <20100224165203.GA10423@zod.isi.edu> 
<20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> In-Reply-To: <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 15:44:10 -0000

Chuck Swiger wrote:
>> MCA: Bank 1, Status 0xe2000000000001f5
>> MCA: Global Cap 0x0000000000000005, Status 0x0000000000000000
>> MCA: Vendor "GenuineIntel", ID 0x695, APIC ID 0
>> MCA: CPU 0 UNCOR PCC OVER DCACHE L1 ??? error
>
> That is very likely to be a matter of luck. If I translate this MCA right,
> it looks to be an uncorrected error in L1 data cache on the CPU. Try to run
> something like prime95's torture test mode and see whether it fails overnight....

The test ran for 17 hours without any problems (or MCA messages), then I
put the laptop to sleep for 5 minutes, resumed the test, and after 30
minutes of it there are now two MCA messages in dmesg. Are they somehow
related, or is this a coincidence?
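For reference, the status word quoted above can be picked apart by hand. The
sketch below is not the kernel's decoder: the struct and function names are
made up for illustration, and the bit positions are the IA32_MCi_STATUS
layout as I understand it from the Intel SDM (VAL 63, OVER 62, UC 61, PCC 57,
cache-hierarchy error codes of the form 0000 0001 RRRR TTLL in the low 16
bits).

```c
#include <stdint.h>

/* IA32_MCi_STATUS flag bits (assumed from the Intel SDM layout). */
#define MC_STATUS_VAL   (1ULL << 63)  /* register contents are valid */
#define MC_STATUS_OVER  (1ULL << 62)  /* an earlier error was overwritten */
#define MC_STATUS_UC    (1ULL << 61)  /* error was not corrected */
#define MC_STATUS_PCC   (1ULL << 57)  /* processor context may be corrupt */

/* Hypothetical result type for this sketch. */
struct mca_decoded {
	int valid, overflow, uncorrected, pcc;
	int is_cache_error;    /* low word matches 0000 0001 RRRR TTLL */
	unsigned rrrr;         /* request type field */
	unsigned tt;           /* 0 = instruction, 1 = data, 2 = generic */
	unsigned ll;           /* 0 = L0, 1 = L1, 2 = L2, 3 = generic */
};

struct mca_decoded mca_decode(uint64_t status)
{
	struct mca_decoded d;
	uint16_t code = (uint16_t)(status & 0xffff);

	d.valid = (status & MC_STATUS_VAL) != 0;
	d.overflow = (status & MC_STATUS_OVER) != 0;
	d.uncorrected = (status & MC_STATUS_UC) != 0;
	d.pcc = (status & MC_STATUS_PCC) != 0;

	/* Bit 12 is masked out because later SDM revisions reuse it as a flag. */
	d.is_cache_error = (code & 0xef00) == 0x0100;
	d.rrrr = (code >> 4) & 0xf;
	d.tt = (code >> 2) & 0x3;
	d.ll = code & 0x3;
	return (d);
}
```

Feeding it 0xe2000000000001f5 gives VAL, OVER, UC and PCC all set, a cache
error with TT = 1 (data) and LL = 1 (L1), and RRRR = 0xf. That lines up with
the "UNCOR PCC OVER DCACHE L1" text, and 0xf is not one of the request types
I know from the SDM table, which would be consistent with the "???".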
From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 16:58:21 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3E61C106566C for ; Wed, 29 Sep 2010 16:58:21 +0000 (UTC) (envelope-from cswiger@mac.com) Received: from asmtpout023.mac.com (asmtpout023.mac.com [17.148.16.98]) by mx1.freebsd.org (Postfix) with ESMTP id 2384F8FC18 for ; Wed, 29 Sep 2010 16:58:20 +0000 (UTC) MIME-version: 1.0 Content-transfer-encoding: 7BIT Content-type: text/plain; charset=us-ascii Received: from cswiger1.apple.com ([17.209.4.71]) by asmtp023.mac.com (Sun Java(tm) System Messaging Server 6.3-8.01 (built Dec 16 2008; 32bit)) with ESMTPSA id <0L9I00CW8PSH6K30@asmtp023.mac.com> for freebsd-stable@freebsd.org; Wed, 29 Sep 2010 09:57:53 -0700 (PDT) X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 ipscore=0 phishscore=0 bulkscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx engine=6.0.2-1004200000 definitions=main-1009290103 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.0.10011,1.0.148,0.0.0000 definitions=2010-09-29_08:2010-09-29, 2010-09-29, 1970-01-01 signatures=0 From: Chuck Swiger In-reply-to: <4CA35E64.1040101@gmail.com> Date: Wed, 29 Sep 2010 09:57:53 -0700 Message-id: <0FDB4144-8BE4-4BA5-B911-8652E07D60C2@mac.com> References: <20100224165203.GA10423@zod.isi.edu> <20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> <4CA35E64.1040101@gmail.com> To: Vitaly Magerya X-Mailer: Apple Mail (2.1081) Cc: freebsd-stable@freebsd.org Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Wed, 29 Sep 2010 16:58:21 -0000 Hi-- On Sep 29, 2010, at 8:42 AM, Vitaly Magerya wrote: > The test run for 17 hours without any problems (or MCA messages), That part is good. At least starting from normal operation, your laptop is running stably under load.... > then I put the laptop for a 5 minute sleep, resumed the test, and after 30 > minutes of it there are now two MCA messages in dmesg. Are they somehow > related, or is this a coincidence? I doubt repeated coincidences. :-) Is prime95 testing running stable after waking from sleep? Regards, -- -Chuck From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 17:07:59 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8BAEF1065670 for ; Wed, 29 Sep 2010 17:07:59 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta15.emeryville.ca.mail.comcast.net (qmta15.emeryville.ca.mail.comcast.net [76.96.27.228]) by mx1.freebsd.org (Postfix) with ESMTP id 626D08FC14 for ; Wed, 29 Sep 2010 17:07:58 +0000 (UTC) Received: from omta18.emeryville.ca.mail.comcast.net ([76.96.30.74]) by qmta15.emeryville.ca.mail.comcast.net with comcast id Ccu11f0041bwxycAFh7ye9; Wed, 29 Sep 2010 17:07:58 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta18.emeryville.ca.mail.comcast.net with comcast id Ch7x1f00R3LrwQ28eh7y1e; Wed, 29 Sep 2010 17:07:58 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id A755C9B418; Wed, 29 Sep 2010 10:07:57 -0700 (PDT) Date: Wed, 29 Sep 2010 10:07:57 -0700 From: Jeremy Chadwick To: Chuck Swiger Message-ID: <20100929170757.GA94672@icarus.home.lan> References: <20100224165203.GA10423@zod.isi.edu> <20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> <4CA35E64.1040101@gmail.com> 
<0FDB4144-8BE4-4BA5-B911-8652E07D60C2@mac.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0FDB4144-8BE4-4BA5-B911-8652E07D60C2@mac.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org, Vitaly Magerya Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 17:07:59 -0000 On Wed, Sep 29, 2010 at 09:57:53AM -0700, Chuck Swiger wrote: > Hi-- > > On Sep 29, 2010, at 8:42 AM, Vitaly Magerya wrote: > > The test run for 17 hours without any problems (or MCA messages), > > That part is good. At least starting from normal operation, your laptop is running stably under load.... > > > then I put the laptop for a 5 minute sleep, resumed the test, and after 30 > > minutes of it there are now two MCA messages in dmesg. Are they somehow > > related, or is this a coincidence? > > I doubt repeated coincidences. :-) Is prime95 testing running stable after waking from sleep? He's not running Prime95 (native Win32 app), he's running ports/math/mprime under FreeBSD natively. I don't know if this application stresses hardware to the same degree Prime95 does; I've used Prime95 many times to burn in new workstations. The Thinkpad hardware he's on is """old""" (note the quotes), so I wouldn't be surprised if the CPU (Intel Pentium M) happens to induce a strange/odd MCA event as a result of going in/out of sleep state. It could be a general system bug of some sort as well (one which has no repercussions). Look at it this way: if his L1 cache was going bad, his system would be freaking out doing literally anything (booting the kernel for example); I'm under the impression Pentium M CPUs do not have ECC L1 cache. 
-- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 17:26:03 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E669F106564A for ; Wed, 29 Sep 2010 17:26:03 +0000 (UTC) (envelope-from vmagerya@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 530E88FC15 for ; Wed, 29 Sep 2010 17:26:02 +0000 (UTC) Received: by bwz15 with SMTP id 15so951282bwz.13 for ; Wed, 29 Sep 2010 10:26:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=u9dVemSwt3UKmW6ifWGeCQwa8kxxTiAomccE7dC3cH8=; b=iAeC2IjA+XWcIVvsPckb1ayFr/eZwb7PR7WwRyshiqV1FyntWKbCnQgpb94efdpWxt TlQwzbU9Q54/UIYK3LGz92g2zpCZU7bvw+5ajUHPy0uVBc+7DsU8KCx2H+gYEn4PL7+d cYFVQL8Ho4BFwYStPU66MaB3Oa46ufpaZoqE8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=sozXOEBxrRAFGnob120LLjFMXA+uhjlRG+q9zijWCGRj6jZVJltgnRuiy6rZNtV4oZ KqRxgny+UkFpRV7y2cJEP0dcFxrlRgkBmVDBGwoNGjmtHoZPkRs6a3Xc8mJmIyVRCrjW lPJIwiz3KoXIE9az51aNogpuxJqKgO4ksXTN0= Received: by 10.204.63.9 with SMTP id z9mr1551379bkh.66.1285781162098; Wed, 29 Sep 2010 10:26:02 -0700 (PDT) Received: from [172.16.0.6] (tx97.net [85.198.160.156]) by mx.google.com with ESMTPS id y2sm6891753bkx.8.2010.09.29.10.26.00 (version=TLSv1/SSLv3 cipher=RC4-MD5); Wed, 29 Sep 2010 10:26:01 -0700 (PDT) Message-ID: <4CA37645.8080303@gmail.com> Date: Wed, 29 Sep 
2010 20:24:21 +0300 From: Vitaly Magerya User-Agent: Thunderbird MIME-Version: 1.0 To: Jeremy Chadwick References: <20100224165203.GA10423@zod.isi.edu> <20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> <4CA35E64.1040101@gmail.com> <0FDB4144-8BE4-4BA5-B911-8652E07D60C2@mac.com> <20100929170757.GA94672@icarus.home.lan> In-Reply-To: <20100929170757.GA94672@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 17:26:04 -0000 Jeremy Chadwick wrote: > On Wed, Sep 29, 2010 at 09:57:53AM -0700, Chuck Swiger wrote: >> Is prime95 testing running stable after waking from sleep? Yes, 0 errors, 0 warnings. > The Thinkpad hardware he's on is """old""" (note the quotes), so I > wouldn't be surprised if the CPU (Intel Pentium M) happens to induce a > strange/odd MCA event as a result of going in/out of sleep state. It > could be a general system bug of some sort as well (one which has no > repercussions). Well, since it causes no other visible problems, it might just as well be a false alarm. 
From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 17:35:15 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C1796106567A for ; Wed, 29 Sep 2010 17:35:15 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta01.emeryville.ca.mail.comcast.net (qmta01.emeryville.ca.mail.comcast.net [76.96.30.16]) by mx1.freebsd.org (Postfix) with ESMTP id A225D8FC16 for ; Wed, 29 Sep 2010 17:35:15 +0000 (UTC) Received: from omta06.emeryville.ca.mail.comcast.net ([76.96.30.51]) by qmta01.emeryville.ca.mail.comcast.net with comcast id CbtW1f00416AWCUA1hbFx2; Wed, 29 Sep 2010 17:35:15 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta06.emeryville.ca.mail.comcast.net with comcast id ChbE1f0033LrwQ28ShbESQ; Wed, 29 Sep 2010 17:35:14 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id CCEA99B418; Wed, 29 Sep 2010 10:35:13 -0700 (PDT) Date: Wed, 29 Sep 2010 10:35:13 -0700 From: Jeremy Chadwick To: Chuck Swiger Message-ID: <20100929173513.GA95222@icarus.home.lan> References: <20100224165203.GA10423@zod.isi.edu> <20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> <4CA35E64.1040101@gmail.com> <0FDB4144-8BE4-4BA5-B911-8652E07D60C2@mac.com> <20100929170757.GA94672@icarus.home.lan> <2B9D8374-AA0A-4F2C-9681-5216204859F8@mac.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <2B9D8374-AA0A-4F2C-9681-5216204859F8@mac.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org, Vitaly Magerya Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 17:35:15 -0000 On Wed, Sep 29, 2010 at 10:16:13AM -0700, Chuck Swiger wrote: > On Sep 29, 2010, at 10:07 AM, Jeremy Chadwick wrote: > > On Wed, Sep 29, 2010 at 09:57:53AM -0700, Chuck Swiger wrote: > >> > >> I doubt repeated coincidences. :-) Is prime95 testing running stable after waking from sleep? > > > > He's not running Prime95 (native Win32 app), he's running > > ports/math/mprime under FreeBSD natively. I don't know if this > > application stresses hardware to the same degree Prime95 does; I've used > > Prime95 many times to burn in new workstations. > > It's doing the same math operations; something like "mprime -t" is the same as the Win32 test mode per the docs: > > -t Run the torture test. Same as Options/Torture Test. > > > The Thinkpad hardware he's on is """old""" (note the quotes), so I > > wouldn't be surprised if the CPU (Intel Pentium M) happens to induce a > > strange/odd MCA event as a result of going in/out of sleep state. It > > could be a general system bug of some sort as well (one which has no > > repercussions). > > That sounds reasonable to me, but I'm wary of uncorrected errors which seem to be reproducible to specific circumstances. > > > Look at it this way: if his L1 cache was going bad, his system would be > > freaking out doing literally anything (booting the kernel for example); > > I'm under the impression Pentium M CPUs do not have ECC L1 cache. > > Sure, if the MCA report is reflecting a legitimate problem, and it was happening more often than every few minutes, and it happened after a cold reboot rather than after wakeup from sleep.... :-) > > I place more faith in ~17 hours of Prime95/mprime working OK to validate that the hardware is not obviously broken. Oh, absolutely. If anything my statement was indirectly agreeing with your recommended test (sans being unsure how mprime behaved). 
:-) I wonder if there's CPU errata or something along those lines which might explain the behaviour. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 17:37:39 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5BE3F1065672 for ; Wed, 29 Sep 2010 17:37:39 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta14.westchester.pa.mail.comcast.net (qmta14.westchester.pa.mail.comcast.net [76.96.59.212]) by mx1.freebsd.org (Postfix) with ESMTP id 043A98FC1B for ; Wed, 29 Sep 2010 17:37:38 +0000 (UTC) Received: from omta11.westchester.pa.mail.comcast.net ([76.96.62.36]) by qmta14.westchester.pa.mail.comcast.net with comcast id CaxF1f0020mv7h05EhdfKh; Wed, 29 Sep 2010 17:37:39 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta11.westchester.pa.mail.comcast.net with comcast id Chdd1f0013LrwQ23XhddGE; Wed, 29 Sep 2010 17:37:38 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id C75EA9B418; Wed, 29 Sep 2010 10:37:35 -0700 (PDT) Date: Wed, 29 Sep 2010 10:37:35 -0700 From: Jeremy Chadwick To: Vitaly Magerya Message-ID: <20100929173735.GB95222@icarus.home.lan> References: <20100224165203.GA10423@zod.isi.edu> <20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> <4CA35E64.1040101@gmail.com> <0FDB4144-8BE4-4BA5-B911-8652E07D60C2@mac.com> <20100929170757.GA94672@icarus.home.lan> <4CA37645.8080303@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA37645.8080303@gmail.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: 
freebsd-stable@freebsd.org Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 17:37:39 -0000 On Wed, Sep 29, 2010 at 08:24:21PM +0300, Vitaly Magerya wrote: > Jeremy Chadwick wrote: > > On Wed, Sep 29, 2010 at 09:57:53AM -0700, Chuck Swiger wrote: > >> Is prime95 testing running stable after waking from sleep? > > Yes, 0 errors, 0 warnings. > > > The Thinkpad hardware he's on is """old""" (note the quotes), so I > > wouldn't be surprised if the CPU (Intel Pentium M) happens to induce a > > strange/odd MCA event as a result of going in/out of sleep state. It > > could be a general system bug of some sort as well (one which has no > > repercussions). > > Well, since it causes no other visible problems, it might just as well > be a false alarm. Highly possible. If it bothers you to the point where you'd rather not see it, you can disable MCA events by setting hw.mca.enabled="0" in /boot/loader.conf. I don't know of a way to conditionally ignore certain MCAs. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 18:04:20 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B1C58106566C for ; Wed, 29 Sep 2010 18:04:20 +0000 (UTC) (envelope-from dan@langille.org) Received: from nyi.unixathome.org (nyi.unixathome.org [64.147.113.42]) by mx1.freebsd.org (Postfix) with ESMTP id 6C6FB8FC14 for ; Wed, 29 Sep 2010 18:04:19 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id 2976C509A8 for ; Wed, 29 Sep 2010 19:04:19 +0100 (BST) X-Virus-Scanned: amavisd-new at unixathome.org Received: from nyi.unixathome.org ([127.0.0.1]) by localhost (nyi.unixathome.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id XHZClX6szfLZ for ; Wed, 29 Sep 2010 19:04:18 +0100 (BST) Received: from nyi.unixathome.org (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id 8864A509A3 for ; Wed, 29 Sep 2010 19:04:18 +0100 (BST) Received: from 68.64.144.221 (SquirrelMail authenticated user dan) by nyi.unixathome.org with HTTP; Wed, 29 Sep 2010 14:04:18 -0400 Message-ID: Date: Wed, 29 Sep 2010 14:04:18 -0400 From: "Dan Langille" To: freebsd-stable@freebsd.org User-Agent: SquirrelMail/1.4.20-RC2 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal Subject: zfs send/receive: is this slow? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 18:04:20 -0000 It's taken about 15 hours to copy 800GB. I'm sure there's some tuning I can do. 
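For scale, 800GB in 15 hours averages out to roughly 15 MB/s. A minimal
sketch of that arithmetic, assuming 1 GB = 1024 MB and treating both figures
as round numbers; the function name avg_mb_per_sec is made up for
illustration:

```c
/* Average rate in MB/s for `gigabytes` moved in `hours`, with 1 GB = 1024 MB. */
double avg_mb_per_sec(double gigabytes, double hours)
{
	return (gigabytes * 1024.0) / (hours * 3600.0);
}
```

Calling avg_mb_per_sec(800.0, 15.0) gives a value just above 15, which is
the number to hold against the per-disk and per-pool rates in the output
that follows.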
The system is now running: # zfs send storage/bacula@transfer | zfs receive storage/compressed/bacula All the drives are ATA-8 SATA 2.x device from systat: 1 users Load 0.36 0.58 0.57 Sep 29 13:47 Mem:KB REAL VIRTUAL VN PAGER SWAP PAGER Tot Share Tot Share Free in out in out Act 42012 7584 544044 11028 204492 count All 962356 8736 1074363k 18220 pages Proc: Interrupts r p d s w Csw Trp Sys Int Sof Flt 141 cow 9951 total 42 23k 668 3094 1951 2166 657 288 zfod ohci0 ohci ozfod ohci2 ohci 13.6%Sys 0.8%Intr 0.2%User 0.0%Nice 85.5%Idle %ozfod ahc0 irq20 | | | | | | | | | | | daefr ahci0 22 ======= 366 prcfr 2000 cpu0: time 26 dtbuf 47129 totfr 3 em0 irq256 Namei Name-cache Dir-cache 100000 desvn react 892 siis0 257 Calls hits % hits % 87983 numvn pdwak 1056 siis1 259 4608 4608 100 24981 frevn pdpgs 2000 cpu3: time intrn 2000 cpu1: time Disks ada0 ada1 ada2 ada3 ada4 ada5 ada6 1355484 wire 2000 cpu2: time KB/t 35.95 37.00 36.75 41.44 40.05 40.86 41.11 25936 act tps 306 299 301 267 276 271 269 2452756 inact MB/s 10.75 10.82 10.79 10.79 10.80 10.80 10.81 76664 cache %busy 27 50 25 37 27 27 27 127828 free 427728 buf $ zpool iostat 10 capacity operations bandwidth pool used avail read write read write ---------- ----- ----- ----- ----- ----- ----- storage 7.67T 5.02T 358 38 43.1M 1.96M storage 7.67T 5.02T 317 475 39.4M 30.9M storage 7.67T 5.02T 357 533 44.3M 34.4M storage 7.67T 5.02T 371 556 46.0M 35.8M storage 7.67T 5.02T 313 521 38.9M 28.7M storage 7.67T 5.02T 309 457 38.4M 30.4M storage 7.67T 5.02T 388 589 48.2M 37.8M storage 7.67T 5.02T 377 581 46.8M 36.5M storage 7.67T 5.02T 310 559 38.4M 30.4M storage 7.67T 5.02T 430 611 53.4M 41.3M $ zfs get all storage/compressed NAME PROPERTY VALUE SOURCE storage/compressed type filesystem - storage/compressed creation Tue Sep 28 20:35 2010 - storage/compressed used 856G - storage/compressed available 3.38T - storage/compressed referenced 44.8K - storage/compressed compressratio 1.60x - storage/compressed mounted yes - 
storage/compressed  quota                 none                   default
storage/compressed  reservation           none                   default
storage/compressed  recordsize            128K                   default
storage/compressed  mountpoint            /storage/compressed    default
storage/compressed  sharenfs              off                    default
storage/compressed  checksum              on                     default
storage/compressed  compression           on                     local
storage/compressed  atime                 on                     default
storage/compressed  devices               on                     default
storage/compressed  exec                  on                     default
storage/compressed  setuid                on                     default
storage/compressed  readonly              off                    default
storage/compressed  jailed                off                    default
storage/compressed  snapdir               hidden                 default
storage/compressed  aclmode               groupmask              default
storage/compressed  aclinherit            restricted             default
storage/compressed  canmount              on                     default
storage/compressed  shareiscsi            off                    default
storage/compressed  xattr                 off                    temporary
storage/compressed  copies                1                      default
storage/compressed  version               4                      -
storage/compressed  utf8only              off                    -
storage/compressed  normalization         none                   -
storage/compressed  casesensitivity       sensitive              -
storage/compressed  vscan                 off                    default
storage/compressed  nbmand                off                    default
storage/compressed  sharesmb              off                    default
storage/compressed  refquota              none                   default
storage/compressed  refreservation        none                   default
storage/compressed  primarycache          all                    default
storage/compressed  secondarycache        all                    default
storage/compressed  usedbysnapshots       0                      -
storage/compressed  usedbydataset         44.8K                  -
storage/compressed  usedbychildren        856G                   -
storage/compressed  usedbyrefreservation  0                      -

$ less /var/run/dmesg.boot
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.1-STABLE #0: Sat Sep 18 23:43:48 EDT 2010 dan@kraken.example.org:/usr/obj/usr/src/sys/KRAKEN amd64 Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: AMD Phenom(tm) II X4 945 Processor (3010.17-MHz K8-class CPU) Origin = "AuthenticAMD" Id = 0x100f42 Family = 10 Model = 4 Stepping = 2 Features=0x178bfbff Features2=0x802009 AMD Features=0xee500800 AMD Features2=0x37ff TSC: P-state invariant real memory = 4294967296 (4096 MB) avail memory = 4100673536 (3910 MB) ACPI APIC Table: <111909 APIC1708> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs FreeBSD/SMP: 1 package(s) x 4 core(s) cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 cpu2 (AP): APIC ID: 2 cpu3 (AP): APIC ID: 3 ACPI Warning: Optional field Pm2ControlBlock has zero address or length: 0x0000000000000000/0x1 (20100331/tbfadt-655) ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 acpi0: <111909 RSDT1708> on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) acpi0: reservation of fee00000, 1000 (3) failed acpi0: reservation of ffb80000, 80000 (3) failed acpi0: reservation of fec10000, 20 (3) failed acpi0: reservation of 0, a0000 (3) failed acpi0: reservation of 100000, dfe00000 (3) failed ACPI HPET table warning: Sequence is non-zero (2) Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0 cpu0: on acpi0 cpu1: on acpi0 cpu2: on acpi0 cpu3: on acpi0 acpi_hpet0: iomem 0xfed00000-0xfed003ff on acpi0 Timecounter "HPET" frequency 14318180 Hz quality 900 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pcib1: irq 18 at device 2.0 on pci0 pci7: on pcib1 em0: port 0xec00-0xec1f mem 0xfbfe0000-0xfbffffff,0xfbf00000-0xfbf7ffff,0xfbfdc000-0xfbfdffff irq 18 at device 0.0 on pci7 em0: Using MSI interrupt em0: [FILTER] em0: Ethernet address: 00:1b:21:51:ab:2d pcib2: irq 17 at device 5.0 on pci0 pci5: on pcib2 pcib3: irq 17 at device 0.0 on pci5 pci6: on pcib3 siis0: port 0xdc00-0xdc0f mem 0xfbeffc00-0xfbeffc7f,0xfbef0000-0xfbef7fff irq 
17 at device 4.0 on pci6 siis0: [ITHREAD] siisch0: at channel 0 on siis0 siisch0: [ITHREAD] siisch1: at channel 1 on siis0 siisch1: [ITHREAD] siisch2: at channel 2 on siis0 siisch2: [ITHREAD] siisch3: at channel 3 on siis0 siisch3: [ITHREAD] pcib4: irq 18 at device 6.0 on pci0 pci4: on pcib4 re0: port 0xc800-0xc8ff mem 0xfbdff000-0xfbdfffff irq 18 at device 0.0 on pci4 re0: Using 1 MSI messages re0: Chip rev. 0x38000000 re0: MAC rev. 0x00000000 miibus0: on re0 rgephy0: PHY 1 on miibus0 rgephy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto re0: Ethernet address: e0:cb:4e:42:f0:ff re0: [FILTER] pcib5: irq 19 at device 11.0 on pci0 pci2: on pcib5 pcib6: irq 19 at device 0.0 on pci2 pci3: on pcib6 siis1: port 0xbc00-0xbc0f mem 0xfbcffc00-0xfbcffc7f,0xfbcf0000-0xfbcf7fff irq 19 at device 4.0 on pci3 siis1: [ITHREAD] siisch4: at channel 0 on siis1 siisch4: [ITHREAD] siisch5: at channel 1 on siis1 siisch5: [ITHREAD] siisch6: at channel 2 on siis1 siisch6: [ITHREAD] siisch7: at channel 3 on siis1 siisch7: [ITHREAD] ahci0: port 0x9000-0x9007,0x8000-0x8003,0x7000-0x7007,0x6000-0x6003,0x5000-0x500f mem 0xfb6fe400-0xfb6fe7ff irq 22 at device 17.0 on pci0 ahci0: [ITHREAD] ahci0: AHCI v1.10 with 4 3Gbps ports, Port Multiplier supported ahcich0: at channel 0 on ahci0 ahcich0: [ITHREAD] ahcich1: at channel 1 on ahci0 ahcich1: [ITHREAD] ahcich2: at channel 2 on ahci0 ahcich2: [ITHREAD] ahcich3: at channel 3 on ahci0 ahcich3: [ITHREAD] ohci0: mem 0xfb6fa000-0xfb6fafff irq 16 at device 18.0 on pci0 ohci0: [ITHREAD] usbus0: on ohci0 ohci1: mem 0xfb6fb000-0xfb6fbfff irq 16 at device 18.1 on pci0 ohci1: [ITHREAD] usbus1: on ohci1 ehci0: mem 0xfb6fe800-0xfb6fe8ff irq 17 at device 18.2 on pci0 ehci0: [ITHREAD] ehci0: AMD SB600/700 quirk applied usbus2: EHCI version 1.0 usbus2: on ehci0 ohci2: mem 0xfb6fc000-0xfb6fcfff irq 18 at device 19.0 on pci0 ohci2: [ITHREAD] usbus3: on ohci2 ohci3: mem 0xfb6fd000-0xfb6fdfff irq 18 at device 19.1 on pci0 ohci3: 
[ITHREAD] usbus4: on ohci3 ehci1: mem 0xfb6fec00-0xfb6fecff irq 19 at device 19.2 on pci0 ehci1: [ITHREAD] ehci1: AMD SB600/700 quirk applied usbus5: EHCI version 1.0 usbus5: on ehci1 pci0: at device 20.0 (no driver attached) atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xff00-0xff0f at device 20.1 on pci0 ata0: on atapci0 ata0: [ITHREAD] ata1: on atapci0 ata1: [ITHREAD] isab0: at device 20.3 on pci0 isa0: on isab0 pcib7: at device 20.4 on pci0 pci1: on pcib7 ahc0: port 0xa800-0xa8ff mem 0xfb7df000-0xfb7dffff irq 20 at device 5.0 on pci1 ahc0: [ITHREAD] aic7880: Ultra Wide Channel A, SCSI Id=7, 16/253 SCBs vgapci0: mem 0xfb800000-0xfbbfffff,0xfb7f0000-0xfb7fffff irq 21 at device 6.0 on pci1 ohci4: mem 0xfb6ff000-0xfb6fffff irq 18 at device 20.5 on pci0 ohci4: [ITHREAD] usbus6: on ohci4 acpi_button0: on acpi0 atrtc0: port 0x70-0x71 irq 8 on acpi0 uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 uart0: [FILTER] fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fdc0: [FILTER] orm0: at iomem 0xc0000-0xc7fff,0xc8000-0xc87ff,0xc8800-0xc97ff on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 atkbdc0: at port 0x60,0x64 on isa0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] ppc0: cannot reserve I/O port range acpi_throttle0: on cpu0 hwpstate0: on cpu0 Timecounters tick every 1.000 msec (noperiph:siisch0:0:-1:-1): rescan already queued (noperiph:siisch2:0:-1:-1): rescan already queued (noperiph:siisch3:0:-1:-1): rescan already queued (noperiph:siisch4:0:-1:-1): rescan already queued (noperiph:siisch5:0:-1:-1): rescan already queued (noperiph:siisch6:0:-1:-1): rescan already queued (noperiph:siisch7:0:-1:-1): rescan already queued usbus0: 12Mbps Full Speed USB v1.0 usbus1: 12Mbps Full Speed USB v1.0 usbus2: 480Mbps High Speed USB v2.0 usbus3: 12Mbps Full Speed USB v1.0 usbus4: 12Mbps Full Speed USB v1.0 usbus5: 480Mbps High Speed 
USB v2.0 usbus6: 12Mbps Full Speed USB v1.0 ugen0.1: at usbus0 uhub0: on usbus0 ugen1.1: at usbus1 uhub1: on usbus1 ugen2.1: at usbus2 uhub2: on usbus2 ugen3.1: at usbus3 uhub3: on usbus3 ugen4.1: at usbus4 uhub4: on usbus4 ugen5.1: at usbus5 uhub5: on usbus5 ugen6.1: at usbus6 uhub6: on usbus6 uhub6: 2 ports with 2 removable, self powered uhub0: 3 ports with 3 removable, self powered uhub1: 3 ports with 3 removable, self powered uhub3: 3 ports with 3 removable, self powered uhub4: 3 ports with 3 removable, self powered uhub2: 6 ports with 6 removable, self powered uhub5: 6 ports with 6 removable, self powered ada0 at siisch0 bus 0 scbus0 target 0 lun 0 ada0: ATA-8 SATA 2.x device ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada0: Command Queueing enabled ada0: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C) ada1 at siisch2 bus 0 scbus2 target 0 lun 0 ada1: ATA-8 SATA 2.x device ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada1: Command Queueing enabled ada1: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C) ada2 at siisch3 bus 0 scbus3 target 0 lun 0 ada2: ATA-8 SATA 2.x device ada2: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada2: Command Queueing enabled ada2: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C) ada3 at siisch4 bus 0 scbus4 target 0 lun 0 ada3: ATA-8 SATA 2.x device ada3: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada3: Command Queueing enabled ada3: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C) ada4 at siisch5 bus 0 scbus5 target 0 lun 0 ada4: ATA-8 SATA 2.x device ada4: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada4: Command Queueing enabled ada4: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C) ada5 at siisch6 bus 0 scbus6 target 0 lun 0 ada5: ATA-8 SATA 2.x device ada5: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada5: Command Queueing enabled ada5: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C) ada6 at 
siisch7 bus 0 scbus7 target 0 lun 0 ada6: ATA-8 SATA 2.x device ada6: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada6: Command Queueing enabled ada6: 1907729MB (3907029168 512 byte sectors: 16H 63S/T 16383C) ada7 at ahcich0 bus 0 scbus8 target 0 lun 0 ada7: ATA-7 SATA 2.x device ada7: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada7: Command Queueing enabled ada7: 76319MB (156301488 512 byte sectors: 16H 63S/T 16383C) ada8 at ahcich2 bus 0 scbus10 target 0 lun 0 ada8: ATA-8 SATA 2.x device ada8: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes) ada8: Command Queueing enabled ada8: 152587MB (312500000 512 byte sectors: 16H 63S/T 16383C) SMP: AP CPU #3 Launched! cd0 at ahcich1 bus 0 scbus9 target 0 lun 0SMP: AP CPU #1 Launched! cd0: Removable CD-ROM SCSI-0 device cd0: 150.000MB/s transfers (SATA 1.x, UDMA5, ATAPI 12bytes, PIO 8192bytes)SMP: AP CPU #2 Launched! cd0: Attempt to query device size failed: NOT READY, Medium not present - tray closed GEOM_MIRROR: Device mirror/gm0 launched (1/2). GEOM_MIRROR: Device gm0: rebuilding provider ada7. GEOM: mirror/gm0s1: geometry does not match label (16h,63s != 255h,63s). Trying to mount root from ufs:/dev/mirror/gm0s1a WARNING: / was not properly dismounted ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present; to enable, add "vfs.zfs.prefetch_disable=0" to /boot/loader.conf. 
ZFS filesystem version 4 ZFS storage pool version 15 WARNING: /tmp was not properly dismounted WARNING: /usr was not properly dismounted WARNING: /var was not properly dismounted -- Dan Langille -- http://langille.org/ From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 19:57:28 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 826E6106564A for ; Wed, 29 Sep 2010 19:57:28 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 0CD058FC0A for ; Wed, 29 Sep 2010 19:57:27 +0000 (UTC) Received: by bwz15 with SMTP id 15so1124242bwz.13 for ; Wed, 29 Sep 2010 12:57:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=DpIrR9HCm56gCM8i0VJzZ3d4j7B1POvR9NCLUxhZp24=; b=wg3jADcF9Np2opjCJP0ZUeZUIJD6POG+BF4h/0s1rM9ZH099OQA4sRtMgvkIoLTwUP YAT8gXWQNtNftvu7nK+7Hcl8LBU/Dxe7a0nLJomr1AYpzSvszDdVOXKtg9KZe+iTMogb RjzarC00bT+uTLnDs/QiV3keb6f5JdyZixeC0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=ogwDr+DYdfqBPBNrPsvsDSxZM6whCUdocrHxyHyINZ+u+EIFoJlc1H7pS6hjzc+vzd G6zuD6esC148B/wZBq8xKIriE34Is3qn/z4V3CcpH0HaeWKu64AI7pKrf4vIL0CxmUEq xovp9NL2fXrSWqpWLQowD9Z7/SuHaxOu2jQA0= MIME-Version: 1.0 Received: by 10.204.123.137 with SMTP id p9mr1583786bkr.206.1285790246853; Wed, 29 Sep 2010 12:57:26 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.220.176.77 with HTTP; Wed, 29 Sep 2010 12:57:26 -0700 (PDT) In-Reply-To: References: Date: Wed, 29 Sep 2010 12:57:26 -0700 X-Google-Sender-Auth: 
8IGHdFDATTEt_m7I4tN78i9er70 Message-ID: From: Artem Belevich To: Dan Langille Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable@freebsd.org Subject: Re: zfs send/receive: is this slow? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 19:57:28 -0000

On Wed, Sep 29, 2010 at 11:04 AM, Dan Langille wrote:
> It's taken about 15 hours to copy 800GB.  I'm sure there's some tuning I
> can do.
>
> The system is now running:
>
> # zfs send storage/bacula@transfer | zfs receive storage/compressed/bacula

Try piping zfs data through mbuffer (misc/mbuffer in ports). I've found that it does help a lot to smooth out data flow and increase send/receive throughput even when send/receive happens on the same host. Run it with a buffer large enough to accommodate a few seconds' worth of write throughput for your target disks.
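
A minimal sketch of that sizing, assuming roughly 35 MB/s of sustained writes (in line with the zpool iostat figures Dan posted) and 5 seconds of buffering. Both numbers and the 128k block size are illustrative guesses, not measured optimums, and buffer_mb is a made-up helper. The script only prints the pipeline so it can be reviewed before being run for real.

```shell
#!/bin/sh
# Size the mbuffer memory buffer as: write throughput (MB/s) * seconds to absorb.
# buffer_mb is a hypothetical helper, not part of mbuffer or zfs.
buffer_mb() {
    echo $(( $1 * $2 ))
}

write_mb_s=35   # assumed sustained write rate of the target pool
seconds=5       # assumed "few seconds" of slack
size_mb=$(buffer_mb "$write_mb_s" "$seconds")

# -s sets mbuffer's block size, -m the total size of its memory buffer.
cmd="zfs send storage/bacula@transfer | mbuffer -s 128k -m ${size_mb}M | zfs receive storage/compressed/bacula"
echo "$cmd"
```

If the disks turn out to sustain more or less than that, only write_mb_s needs changing.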
Here's an example: http://blogs.everycity.co.uk/alasdair/2010/07/using-mbuffer-to-speed-up-slow-zfs-send-zfs-receive/

--Artem

From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 21:41:58 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D6D1B106566B for ; Wed, 29 Sep 2010 21:41:58 +0000 (UTC) (envelope-from kc5vdj.freebsd@gmail.com) Received: from mail-iw0-f182.google.com (mail-iw0-f182.google.com [209.85.214.182]) by mx1.freebsd.org (Postfix) with ESMTP id 744758FC14 for ; Wed, 29 Sep 2010 21:41:58 +0000 (UTC) Received: by iwn34 with SMTP id 34so1902414iwn.13 for ; Wed, 29 Sep 2010 14:41:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=Sdcgaf7/J73owxVfZrWy65z04vdTrLAHVaTK6qEoJXs=; b=iNh1pwc8rP/JoZHeVT1aqhlD1yqGlvP9HYsgeZn4glIgz/h82k+ctZg8PSeMlpIXS+ 8+CitsmHYwHgFl7pAbGVIzU62YcCimvA38D7lBCS5f2U5N4HVHmMKX0jA4RLKorPCKJr NZ0bdEuovd/srUuxSjgqnVqN8x+fQtiRNUktc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=nQBbX2qrqZL46ooZiaD48PVlX67hJ9wubruIdoXqPNz0NtrWITahkYi5L90bYB4iSJ nunSR2kyMTjjJIXYYv8yvIdJb2cVMUJxjYZEhurm0qWomruV6vGEnq0q/7zS+zU7RiUK L+A7IUiwT87ot+bxLZIpexrWJaFPXoHWfH3Lo= Received: by 10.231.15.76 with SMTP id j12mr2481592iba.30.1285796508672; Wed, 29 Sep 2010 14:41:48 -0700 (PDT) Received: from argus.electron-tube.net (desm-47-213.dsl.netins.net [167.142.47.213]) by mx.google.com with ESMTPS id h8sm9362198ibk.3.2010.09.29.14.41.47 (version=TLSv1/SSLv3 cipher=RC4-MD5); Wed, 29 Sep 2010 14:41:48 -0700 (PDT) Message-ID: <4CA3AFA2.6070703@gmail.com> Date: Wed, 29 Sep 2010 16:29:06 -0500 From: Jim
Bryant User-Agent: Thunderbird 2.0.0.24 (X11/20100911) MIME-Version: 1.0 To: Warren Block References: <4C9A7943.1020806@gmail.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: wifi issues under -stable X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 21:41:59 -0000 Warren Block wrote: > On Wed, 22 Sep 2010, Jim Bryant wrote: > >> i have two laptops, both are Compaq(HP) C300 series using the >> motherboards with the 945GM chipset and using T7200 and T7600 Core2 >> Duos. >> >> One (this one) has an intel pro wireless 3945ABG installed, which >> returns: >> >> wpi0: irq 18 at device 0.0 on pci6 >> wpi0: Driver Revision 20071127 >> wpi0: 0x1000 bytes of rid 0x10 res 3 failed (0, 0xffffffffffffffff). >> wpi0: could not allocate memory resource >> device_attach: wpi0 attach returned 6 >> >> and the broadcom used in the other does the exact same thing. >> >> I'm thinking that this isn't really a problem with the wifi, but may >> be a mini-pci-e issue. > > Don't know about the Intel, but some Broadcoms work. Please show what > you're doing in /boot/loader.conf and /etc/rc.conf. > the other machine has configs pretty much the same as this one. the problem seems to be at the kernel level tho, at probe time. 4:19:52pm argus(14): cat /boot/loader.conf beastie_disable="YES" # Turn the beastie boot menu on and off# Beginning of the block added by the VMware software vmxnet_load="YES" # End of the block added by the VMware software 4:20:02pm argus(15): cat /etc/rc.conf # -- sysinstall generated deltas -- # Tue Mar 23 21:07:50 2010 # Created: Tue Mar 23 21:07:50 2010 # Enable network daemons for user convenience. # Please make all changes to this file, not to /etc/defaults/rc.conf. 
# This file now contains just the overrides from /etc/defaults/rc.conf. accounting_enable="YES" gateway_enable="YES" hostname="argus.root.com" ifconfig_rl0="inet 192.168.0.2 192.168.0.1 netmask 255.255.255.0" inetd_enable="YES" ipv6_enable="NO" keyrate="fast" lpd_enable="YES" moused_enable="YES" moused_port="/dev/psm0" moused_type="auto" named_enable="YES" nfs_client_enable="YES" nfs_reserved_port_only="YES" nfs_server_enable="YES" router="/sbin/routed" router_enable="YES" router_flags="-q" rpc_lockd_enable="YES" rpc_statd_enable="YES" rpcbind_enable="YES" rwhod_enable="NO" saver="NO" scrnmap="NO" sshd_enable="YES" blanktime="300" font8x14="cp437-8x14" font8x16="cp437-8x16" font8x8="cp437-8x8" allscreens_flags="-c blink MODE_280" # Set this vidcontrol mode for all virtual screens allscreens_kbdflags="-r fast" # Set this kbdcontrol mode for all virtual screens fusefs_enable="YES" mysql_enable="YES" lircd_enable="YES" 4:23:36pm argus(17): pciconf -lv hostb0@pci0:0:0:0: class=0x060000 card=0x30a5103c chip=0x27a08086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = '955XM/945GM/PM/GMS/940GML Express Processor to DRAM Controller' class = bridge subclass = HOST-PCI vgapci0@pci0:0:2:0: class=0x030000 card=0x30a5103c chip=0x27a28086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = 'Mobile 945GM/GU Express Integrated Graphics Controller' class = display subclass = VGA vgapci1@pci0:0:2:1: class=0x038000 card=0x30a5103c chip=0x27a68086 rev=0x03 hdr=0x00 vendor = 'Intel Corporation' device = 'Mobile 945GM/GU Express Integrated Graphics Controller' class = display hdac0@pci0:0:27:0: class=0x040300 card=0x30a5103c chip=0x27d88086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = 'IDT High Definition Audio Driver (BA101897)' class = multimedia subclass = HDA pcib1@pci0:0:28:0: class=0x060400 card=0x30a5103c chip=0x27d08086 rev=0x01 hdr=0x01 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) PCIe Root Port' class = bridge subclass = PCI-PCI 
pcib2@pci0:0:28:2: class=0x060400 card=0x30a5103c chip=0x27d48086 rev=0x01 hdr=0x01 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) PCIe Root Port' class = bridge subclass = PCI-PCI uhci0@pci0:0:29:0: class=0x0c0300 card=0x30a5103c chip=0x27c88086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) USB Universal Host Controller' class = serial bus subclass = USB uhci1@pci0:0:29:1: class=0x0c0300 card=0x30a5103c chip=0x27c98086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) USB Universal Host Controller' class = serial bus subclass = USB uhci2@pci0:0:29:2: class=0x0c0300 card=0x30a5103c chip=0x27ca8086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) USB Universal Host Controller' class = serial bus subclass = USB ehci0@pci0:0:29:7: class=0x0c0320 card=0x30a5103c chip=0x27cc8086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801G (ICH7 Family) USB 2.0 Enhanced Host Controller' class = serial bus subclass = USB pcib3@pci0:0:30:0: class=0x060401 card=0x30a5103c chip=0x24488086 rev=0xe1 hdr=0x01 vendor = 'Intel Corporation' device = '82801 Family (ICH2/3/4/5/6/7/8/9-M) Hub Interface to PCI Bridge' class = bridge subclass = PCI-PCI isab0@pci0:0:31:0: class=0x060100 card=0x30a5103c chip=0x27b98086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801GBM (ICH7-M) LPC Interface Controller' class = bridge subclass = PCI-ISA atapci0@pci0:0:31:2: class=0x010180 card=0x30a5103c chip=0x27c48086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = '82801GBM/GHM (ICH7-M Family) Serial ATA Storage Controller' class = mass storage subclass = ATA ichsmb0@pci0:0:31:3: class=0x0c0500 card=0x30a5103c chip=0x27da8086 rev=0x01 hdr=0x00 vendor = 'Intel Corporation' device = 'Intel[R] 82801G (ICH7 Family) C- 27DA (82801G)' class = serial bus subclass = SMBus wpi0@pci0:6:0:0: class=0x028000 card=0x135b103c chip=0x42228086 rev=0x02 hdr=0x00 vendor = 'Intel Corporation' 
device = 'Intel 3945ABG Wireless LAN controller (10208086)' class = network rl0@pci0:8:8:0: class=0x020000 card=0x30a5103c chip=0x813910ec rev=0x10 hdr=0x00 vendor = 'Realtek Semiconductor' device = 'Realtek RTL8139 Family PCI Fast Ethernet NIC (RTL-8139/8139C/8139D)' class = network subclass = ethernet From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 21:51:52 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8357F106566B for ; Wed, 29 Sep 2010 21:51:52 +0000 (UTC) (envelope-from jurgen@ish.com.au) Received: from fish.ish.com.au (eth5921.nsw.adsl.internode.on.net [59.167.240.32]) by mx1.freebsd.org (Postfix) with ESMTP id 4156D8FC0C for ; Wed, 29 Sep 2010 21:51:52 +0000 (UTC) Received: from ip-211.ish.com.au ([203.29.62.211]:22727 helo=ish.com.au) by fish.ish.com.au with esmtp (Exim 4.69) (envelope-from ) id 1P14Z3-0001r0-2n; Thu, 30 Sep 2010 07:51:49 +1000 Received: from [203.29.62.154] (HELO ip-154.ish.com.au) by ish.com.au (CommuniGate Pro SMTP 5.3.8) with ESMTP id 6250573; Thu, 30 Sep 2010 07:51:49 +1000 Message-ID: <4CA3B4F5.4070005@ish.com.au> Date: Thu, 30 Sep 2010 07:51:49 +1000 From: Jurgen Weber User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-GB; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CA19F27.6050903@ish.com.au> <4CA1BE59.7060906@icyb.net.ua> <4CA2B753.4010107@ish.com.au> <20100929072928.GA82955@icarus.home.lan> In-Reply-To: <20100929072928.GA82955@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org, Andriy Gapon Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 
Sep 2010 21:51:52 -0000 I do not understand what you mean by a verbose dmesg...... looking at the man page there is no verbose option for dmesg except what I completed (dmesg -a). Once that is clarified I can reboot the backup machine and turn on ACPI for you. On 29/09/10 5:29 PM, Jeremy Chadwick wrote: > On Wed, Sep 29, 2010 at 01:49:39PM +1000, Jurgen Weber wrote: >> Andriy >> >> You can find everything you are after here: >> >> http://pastebin.com/WH4V2W0F > > The information provided here shows ACPI is disabled in addition to the > boot not being verbose. > -- --------------------------> ish http://www.ish.com.au Level 1, 30 Wilson Street Newtown 2042 Australia phone +61 2 9550 5001 fax +61 2 9550 4001 From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 21:51:52 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 93C26106566C for ; Wed, 29 Sep 2010 21:51:52 +0000 (UTC) (envelope-from jurgen@ish.com.au) Received: from fish.ish.com.au (eth5921.nsw.adsl.internode.on.net [59.167.240.32]) by mx1.freebsd.org (Postfix) with ESMTP id 270DC8FC08 for ; Wed, 29 Sep 2010 21:51:51 +0000 (UTC) Received: from ip-211.ish.com.au ([203.29.62.211]:39420 helo=ish.com.au) by fish.ish.com.au with esmtp (Exim 4.69) (envelope-from ) id 1P14Yu-0001ql-2x; Thu, 30 Sep 2010 07:51:40 +1000 Received: from [203.29.62.154] (HELO ip-154.ish.com.au) by ish.com.au (CommuniGate Pro SMTP 5.3.8) with ESMTP id 6250571; Thu, 30 Sep 2010 07:51:40 +1000 Message-ID: <4CA3B4EC.90106@ish.com.au> Date: Thu, 30 Sep 2010 07:51:40 +1000 From: Jurgen Weber User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-GB; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Andriy Gapon References: <4CA19F27.6050903@ish.com.au> <4CA1BE59.7060906@icyb.net.ua> <4CA2B753.4010107@ish.com.au> <4CA2EA2B.1040706@icyb.net.ua> In-Reply-To: <4CA2EA2B.1040706@icyb.net.ua> Content-Type: 
text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 21:51:52 -0000 Hi I do not understand what you mean by a verbose dmesg...... looking at the man page there is no verbose option for dmesg except what I completed (dmesg -a). Once that is clarified I can reboot the backup machine and turn on ACPI for you. Thanks On 29/09/10 5:26 PM, Andriy Gapon wrote: > on 29/09/2010 06:49 Jurgen Weber said the following: >> Andriy >> >> You can find everything you are after here: >> >> http://pastebin.com/WH4V2W0F > > Looks like this was with ACPI disabled? > Can you try to re-enable it? > Also, it doesn't look like the dmesg is verbose. > > >> On 28/09/10 8:07 PM, Andriy Gapon wrote: >>> on 28/09/2010 10:54 Jurgen Weber said the following: >>>> # dmesg | grep Timecounter >>>> Timecounter "i8254" frequency 1193182 Hz quality 0 >>>> Timecounters tick every 1.000 msec >>>> # sysctl kern.timecounter.hardware >>>> kern.timecounter.hardware: i8254 >>>> >>>> Only have one timer to choose from. >>> >>> Can you provide a little bit more of "hard" data than the above? >>> Specifically, the following sysctls: >>> kern.timecounter >>> dev.cpu >>> >>> Output of vmstat -i. >>> _Verbose_ boot dmesg. >>> >>> Please do not disable ACPI when taking this data. >>> Preferably, upload it somewhere and post a link to it. 
>> > > -- --------------------------> ish http://www.ish.com.au Level 1, 30 Wilson Street Newtown 2042 Australia phone +61 2 9550 5001 fax +61 2 9550 4001 From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 21:56:03 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C40DC106566C for ; Wed, 29 Sep 2010 21:56:03 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta06.emeryville.ca.mail.comcast.net (qmta06.emeryville.ca.mail.comcast.net [76.96.30.56]) by mx1.freebsd.org (Postfix) with ESMTP id 9DB358FC1E for ; Wed, 29 Sep 2010 21:56:03 +0000 (UTC) Received: from omta11.emeryville.ca.mail.comcast.net ([76.96.30.36]) by qmta06.emeryville.ca.mail.comcast.net with comcast id Cchx1f0040mlR8UA6lw3TR; Wed, 29 Sep 2010 21:56:03 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta11.emeryville.ca.mail.comcast.net with comcast id Clw11f00J3LrwQ28Xlw1gT; Wed, 29 Sep 2010 21:56:02 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 8D03E9B418; Wed, 29 Sep 2010 14:56:01 -0700 (PDT) Date: Wed, 29 Sep 2010 14:56:01 -0700 From: Jeremy Chadwick To: Jurgen Weber Message-ID: <20100929215601.GA99844@icarus.home.lan> References: <4CA19F27.6050903@ish.com.au> <4CA1BE59.7060906@icyb.net.ua> <4CA2B753.4010107@ish.com.au> <20100929072928.GA82955@icarus.home.lan> <4CA3B4F5.4070005@ish.com.au> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA3B4F5.4070005@ish.com.au> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org, Andriy Gapon Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 21:56:03 -0000 On Thu, Sep 30, 2010 at 07:51:49AM +1000, 
Jurgen Weber wrote:
> I do not understand what you mean by a verbose dmesg...... looking
> at the man page there is no verbose option for dmesg except what I
> completed (dmesg -a).
>
> Once that is clarified I can reboot the backup machine and turn on
> ACPI for you.
>
> On 29/09/10 5:29 PM, Jeremy Chadwick wrote:
> >On Wed, Sep 29, 2010 at 01:49:39PM +1000, Jurgen Weber wrote:
> >>Andriy
> >>
> >>You can find everything you are after here:
> >>
> >>http://pastebin.com/WH4V2W0F
> >
> >The information provided here shows ACPI is disabled in addition to the
> >boot not being verbose.

When the machine boots (when loader starts), you'll see the FreeBSD logo with a menu of choices (boot, boot with ACPI disabled, single user mode, etc.). One of them is boot verbosely; I think it's #5, labelled "Boot with verbose logging" or something like that. Choose that.

That will cause your machine to boot with ACPI enabled, in addition to booting verbosely. There will be a LOT more information printed on the screen during the boot process, and it should be visible in /var/log/messages after the machine is started. This is the information we're looking for.

HTH!

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 21:58:05 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 84B98106564A for ; Wed, 29 Sep 2010 21:58:05 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id CCAF48FC08 for ; Wed, 29 Sep 2010 21:58:04 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id AAA05697; Thu, 30 Sep 2010 00:57:57 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1P14ez-0004X5-1D; Thu, 30 Sep 2010 00:57:57 +0300 Message-ID: <4CA3B664.4070805@icyb.net.ua> Date: Thu, 30 Sep 2010 00:57:56 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jurgen Weber References: <4CA19F27.6050903@ish.com.au> <4CA1BE59.7060906@icyb.net.ua> <4CA2B753.4010107@ish.com.au> <4CA2EA2B.1040706@icyb.net.ua> <4CA3B4EC.90106@ish.com.au> In-Reply-To: <4CA3B4EC.90106@ish.com.au> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 21:58:05 -0000 on 30/09/2010 00:51 Jurgen Weber said the following: > Hi > > I do not understand what you mean by a verbose dmesg...... looking at the man > page there is no verbose option for dmesg except what I completed (dmesg -a). 
> > Once that is clarified I can reboot the backup machine and turn on ACPI for you. Verbose dmesg is produced when kernel is booted with verbose logging. Either boot -v on loader prompt. Or '5' (IIRC) in loader menu. Or nextboot -k kernel -o -v before reboot. Or verbose_loading="YES" in loader.conf. -- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 22:27:32 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 14739106566C for ; Wed, 29 Sep 2010 22:27:32 +0000 (UTC) (envelope-from jurgen@ish.com.au) Received: from fish.ish.com.au (eth5921.nsw.adsl.internode.on.net [59.167.240.32]) by mx1.freebsd.org (Postfix) with ESMTP id 989228FC08 for ; Wed, 29 Sep 2010 22:27:31 +0000 (UTC) Received: from ip-211.ish.com.au ([203.29.62.211]:61377 helo=ish.com.au) by fish.ish.com.au with esmtp (Exim 4.69) (envelope-from ) id 1P157X-0002t8-03; Thu, 30 Sep 2010 08:27:27 +1000 Received: from [203.29.62.154] (HELO ip-154.ish.com.au) by ish.com.au (CommuniGate Pro SMTP 5.3.8) with ESMTP id 6250585; Thu, 30 Sep 2010 08:27:26 +1000 Message-ID: <4CA3BD4E.8000906@ish.com.au> Date: Thu, 30 Sep 2010 08:27:26 +1000 From: Jurgen Weber User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-GB; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jeremy Chadwick , Andriy Gapon References: <4CA19F27.6050903@ish.com.au> <4CA1BE59.7060906@icyb.net.ua> <4CA2B753.4010107@ish.com.au> <20100929072928.GA82955@icarus.home.lan> <4CA3B4F5.4070005@ish.com.au> <20100929215601.GA99844@icarus.home.lan> In-Reply-To: <20100929215601.GA99844@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: 
, List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 22:27:32 -0000

Gentlemen

Ah, ok. Learn something new every day. Fantastic. The first time the machine stopped during the boot process, but that is ok; the 2nd time we had success.

http://pastebin.com/r4UWdN7U

I am not sure if ACPI is on. Jeremy, you mention below that it should be on just by booting with this option, so let me know if there are any problems there.

Thanks

Jurgen

On 30/09/10 7:56 AM, Jeremy Chadwick wrote:
> On Thu, Sep 30, 2010 at 07:51:49AM +1000, Jurgen Weber wrote:
>> I do not understand what you mean by a verbose dmesg...... looking
>> at the man page there is no verbose option for dmesg except what I
>> completed (dmesg -a).
>>
>> Once that is clarified I can reboot the backup machine and turn on
>> ACPI for you.
>>
>> On 29/09/10 5:29 PM, Jeremy Chadwick wrote:
>>> On Wed, Sep 29, 2010 at 01:49:39PM +1000, Jurgen Weber wrote:
>>>> Andriy
>>>>
>>>> You can find everything you are after here:
>>>>
>>>> http://pastebin.com/WH4V2W0F
>>>
>>> The information provided here shows ACPI is disabled in addition to the
>>> boot not being verbose.
>
> When the machine boots (when loader starts), you'll see the FreeBSD logo
> with a menu of choices (boot, boot with ACPI disabled, single user mode,
> etc.). One of them is boot verbosely; I think it's #5, labelled "Boot
> with verbose logging" or something like that.
>
> Choose that. That will cause your machine to boot with ACPI enabled, in
> addition to booting verbosely. There will be a LOT more information
> printed on the screen during the boot process, and it should be visible
> in /var/log/messages after the machine is started. This is the
> information we're looking for.
>
> HTH!
> -- --------------------------> ish http://www.ish.com.au Level 1, 30 Wilson Street Newtown 2042 Australia phone +61 2 9550 5001 fax +61 2 9550 4001 From owner-freebsd-stable@FreeBSD.ORG Wed Sep 29 23:27:43 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1D563106566B for ; Wed, 29 Sep 2010 23:27:43 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id DDEF48FC12 for ; Wed, 29 Sep 2010 23:27:42 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8TNRTPa077418; Wed, 29 Sep 2010 16:27:33 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009292327.o8TNRTPa077418@gw.catspoiler.org> Date: Wed, 29 Sep 2010 16:27:29 -0700 (PDT) From: Don Lewis To: avg@icyb.net.ua In-Reply-To: <4CA3010D.9080909@icyb.net.ua> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org, sterling@camdensoftware.com, freebsd@jdc.parodius.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 29 Sep 2010 23:27:43 -0000 On 29 Sep, Andriy Gapon wrote: > on 29/09/2010 11:56 Don Lewis said the following: >> I'm using the same kernel config as the one on a slower !SMP box which >> I'm trying to squeeze as much performance out of as possible. My kernel >> config file contains these statements: >> nooptions SMP >> nodevice apic >> >> Testing with an SMP kernel is on my TODO list. > > SMP or not, it's really weird to see apic disabled nowadays. 
I tried enabling apic and got worse results. I saw ping RTTs as high as 67 seconds. Here's the timer info with apic enabled:

# sysctl kern.timecounter
kern.timecounter.tick: 1
kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000)
kern.timecounter.hardware: ACPI-fast
kern.timecounter.stepwarnings: 0
kern.timecounter.tc.i8254.mask: 65535
kern.timecounter.tc.i8254.counter: 53633
kern.timecounter.tc.i8254.frequency: 1193182
kern.timecounter.tc.i8254.quality: 0
kern.timecounter.tc.ACPI-fast.mask: 16777215
kern.timecounter.tc.ACPI-fast.counter: 7988816
kern.timecounter.tc.ACPI-fast.frequency: 3579545
kern.timecounter.tc.ACPI-fast.quality: 1000
kern.timecounter.tc.TSC.mask: 4294967295
kern.timecounter.tc.TSC.counter: 1341917999
kern.timecounter.tc.TSC.frequency: 2500014018
kern.timecounter.tc.TSC.quality: 800
kern.timecounter.invariant_tsc: 0

Here's the verbose boot info with apic:

I've also experimented with SMP as well as SCHED_4BSD (all previous testing was with !SMP and SCHED_ULE). I still see occasional problems with SCHED_4BSD and !SMP, but so far I have not seen any problems with SCHED_ULE and SMP.

I did manage to catch the problem with lock profiling enabled:

I'm currently testing SMP some more to verify if it really avoids this problem.
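[Editorial sketch, not part of the original message.] As a rough aid to reading the kern.timecounter output above: the number in parentheses after each name in kern.timecounter.choice is that counter's quality, and the kernel normally prefers the highest-quality one, which is consistent with kern.timecounter.hardware reporting ACPI-fast. The helper names below are hypothetical, and the rollover estimate assumes a counter wraps after mask + 1 ticks.

```python
import re


def parse_choice(line):
    """Parse 'TSC(800) ACPI-fast(1000) ...' into a {name: quality} dict."""
    return {name: int(q) for name, q in re.findall(r"(\S+?)\((-?\d+)\)", line)}


def best_timecounter(line):
    """Return the name of the counter with the highest quality value."""
    choices = parse_choice(line)
    return max(choices, key=choices.get)


def wrap_seconds(mask, frequency):
    """Seconds until a counter with this mask and frequency (Hz) rolls over.

    Assumption: the counter wraps after mask + 1 ticks.
    """
    return (mask + 1) / frequency


# Values copied from the sysctl output quoted in this message.
choice = "TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000)"
print(best_timecounter(choice))
print(round(wrap_seconds(16777215, 3579545), 2))  # ACPI-fast, 24-bit counter
```

With the figures from this message, the 24-bit ACPI-fast counter would roll over roughly every 4.7 seconds, while the 16-bit i8254 would do so in well under a tenth of a second.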
From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 01:20:50 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AADD0106566C for ; Thu, 30 Sep 2010 01:20:50 +0000 (UTC) (envelope-from dan@langille.org) Received: from nyi.unixathome.org (nyi.unixathome.org [64.147.113.42]) by mx1.freebsd.org (Postfix) with ESMTP id 7B3158FC18 for ; Thu, 30 Sep 2010 01:20:50 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id 77868509B2; Thu, 30 Sep 2010 02:20:49 +0100 (BST) X-Virus-Scanned: amavisd-new at unixathome.org Received: from nyi.unixathome.org ([127.0.0.1]) by localhost (nyi.unixathome.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id LQjYsoOaaDRP; Thu, 30 Sep 2010 02:20:49 +0100 (BST) Received: from smtp-auth.unixathome.org (smtp-auth.unixathome.org [10.4.7.7]) (Authenticated sender: hidden) by nyi.unixathome.org (Postfix) with ESMTPSA id 3C285509A8 ; Thu, 30 Sep 2010 02:20:49 +0100 (BST) Message-ID: <4CA3E5F0.7000603@langille.org> Date: Wed, 29 Sep 2010 21:20:48 -0400 From: Dan Langille Organization: The FreeBSD Diary User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Artem Belevich References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: zfs send/receive: is this slow? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 01:20:50 -0000 On 9/29/2010 3:57 PM, Artem Belevich wrote: > On Wed, Sep 29, 2010 at 11:04 AM, Dan Langille wrote: >> It's taken about 15 hours to copy 800GB. I'm sure there's some tuning I >> can do. 
>> >> The system is now running: >> >> # zfs send storage/bacula@transfer | zfs receive storage/compressed/bacula > > Try piping zfs data through mbuffer (misc/mbuffer in ports). I've > found that it does help a lot to smooth out data flow and increase > send/receive throughput even when send/receive happens on the same > host. Run it with a buffer large enough to accommodate few seconds > worth of write throughput for your target disks. Thanks. I just installed it. I'll use it next time. I don't want to interrupt this one. I'd like to see how long it takes. Then compare. > Here's an example: > http://blogs.everycity.co.uk/alasdair/2010/07/using-mbuffer-to-speed-up-slow-zfs-send-zfs-receive/ That looks really good. Thank you. -- Dan Langille - http://langille.org/ From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 06:09:17 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9A9E2106564A for ; Thu, 30 Sep 2010 06:09:17 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id A32688FC14 for ; Thu, 30 Sep 2010 06:09:16 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id JAA11931; Thu, 30 Sep 2010 09:09:07 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1P1CKI-0007Fh-RT; Thu, 30 Sep 2010 09:09:06 +0300 Message-ID: <4CA42982.5040602@icyb.net.ua> Date: Thu, 30 Sep 2010 09:09:06 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jurgen Weber References: <4CA19F27.6050903@ish.com.au> <4CA1BE59.7060906@icyb.net.ua> 
<4CA2B753.4010107@ish.com.au> <20100929072928.GA82955@icarus.home.lan> <4CA3B4F5.4070005@ish.com.au> <20100929215601.GA99844@icarus.home.lan> <4CA3BD4E.8000906@ish.com.au> In-Reply-To: <4CA3BD4E.8000906@ish.com.au> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org, Jeremy Chadwick Subject: Re: cpu timer issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 06:09:17 -0000 on 30/09/2010 01:27 Jurgen Weber said the following: > Gentlemen > > Ah, ok. Learn something new everyday. Fantastic. The first time the machine > stopped during the boot process, but that is ok the 2nd time we have success. > > http://pastebin.com/r4UWdN7U > > I am not sure if ACPI is on, Jeremy you mention below that it should be in just > by booting with this option so let me know if there are any problems there. If you disabled it in BIOS, you have to re-enable it there. There is no magic. 
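[Editorial sketch, not part of the original message.] One way to approach the "is ACPI on?" question from a saved boot log, under stated assumptions: a FreeBSD boot with ACPI active normally shows an acpi0 device attaching (another dmesg in this archive contains a line beginning "acpi0:"). The function name and the sample excerpts are made up for illustration, and a missing line is only a hint, not proof.

```python
def acpi_attached(dmesg_text):
    """Return True if any line of the boot log starts with 'acpi0:'."""
    return any(
        line.lstrip().startswith("acpi0:") for line in dmesg_text.splitlines()
    )


# Hypothetical excerpts; a real check would read the text of
# /var/run/dmesg.boot or the output of "dmesg -a" instead.
with_acpi = (
    'Timecounter "i8254" frequency 1193182 Hz quality 0\n'
    "acpi0: on motherboard\n"
)
without_acpi = (
    'Timecounter "i8254" frequency 1193182 Hz quality 0\n'
    "cpu0 on motherboard\n"
)

print(acpi_attached(with_acpi), acpi_attached(without_acpi))
```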
-- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 06:11:27 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6E759106567A; Thu, 30 Sep 2010 06:11:27 +0000 (UTC) (envelope-from avg@icyb.net.ua) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 728AB8FC18; Thu, 30 Sep 2010 06:11:26 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id JAA11957; Thu, 30 Sep 2010 09:11:23 +0300 (EEST) (envelope-from avg@icyb.net.ua) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1P1CMV-0007Fz-3S; Thu, 30 Sep 2010 09:11:23 +0300 Message-ID: <4CA42A0A.6090003@icyb.net.ua> Date: Thu, 30 Sep 2010 09:11:22 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.9) Gecko/20100918 Lightning/1.0b2 Thunderbird/3.1.4 MIME-Version: 1.0 To: Don Lewis References: <201009292327.o8TNRTPa077418@gw.catspoiler.org> In-Reply-To: <201009292327.o8TNRTPa077418@gw.catspoiler.org> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: stable@FreeBSD.org, sterling@camdensoftware.com, freebsd@jdc.parodius.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 06:11:27 -0000 on 30/09/2010 02:27 Don Lewis said the following: > On 29 Sep, Andriy Gapon wrote: >> on 29/09/2010 11:56 Don Lewis said the following: >>> I'm using the same kernel config as the one on a slower !SMP box which >>> I'm 
trying to squeeze as much performance out of as possible. My kernel
>>> config file contains these statements:
>>> nooptions SMP
>>> nodevice apic
>>>
>>> Testing with an SMP kernel is on my TODO list.
>>
>> SMP or not, it's really weird to see apic disabled nowadays.
>
> I tried enabling apic and got worse results. I saw ping RTTs as high as
> 67 seconds. Here's the timer info with apic enabled:

I didn't expect anything to change in this output with APIC enabled.

> # sysctl kern.timecounter
> kern.timecounter.tick: 1
> kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-1000000)
> kern.timecounter.hardware: ACPI-fast
> kern.timecounter.stepwarnings: 0
> kern.timecounter.tc.i8254.mask: 65535
> kern.timecounter.tc.i8254.counter: 53633
> kern.timecounter.tc.i8254.frequency: 1193182
> kern.timecounter.tc.i8254.quality: 0
> kern.timecounter.tc.ACPI-fast.mask: 16777215
> kern.timecounter.tc.ACPI-fast.counter: 7988816
> kern.timecounter.tc.ACPI-fast.frequency: 3579545
> kern.timecounter.tc.ACPI-fast.quality: 1000
> kern.timecounter.tc.TSC.mask: 4294967295
> kern.timecounter.tc.TSC.counter: 1341917999
> kern.timecounter.tc.TSC.frequency: 2500014018
> kern.timecounter.tc.TSC.quality: 800
> kern.timecounter.invariant_tsc: 0
>
> Here's the verbose boot info with apic:
>

vmstat -i ?

> I've also experimented with SMP as well as SCHED_4BSD (all previous
> testing was with !SMP and SCHED_ULE). I still see occasional problems
> with SCHED_4BSD and !SMP, but so far I have not seen any problems with
> SCHED_ULE and SMP.

Good!

> I did manage to catch the problem with lock profiling enabled:
>
> I'm currently testing SMP some more to verify if it really avoids this
> problem.

OK.
-- Andriy Gapon From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 06:49:27 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 33C57106564A for ; Thu, 30 Sep 2010 06:49:27 +0000 (UTC) (envelope-from amvandemore@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8BC898FC14 for ; Thu, 30 Sep 2010 06:49:26 +0000 (UTC) Received: by fxm9 with SMTP id 9so1365224fxm.13 for ; Wed, 29 Sep 2010 23:49:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=nl2Av61VLS9fA7MtNSHtG2YRkNYeP0g1dwUv5JUB+nQ=; b=kpDiE0SPPi9rRclwBW8EwhXHepr2BgC1q4/dc/eTLpx6gqzoPhNo6eWCrcNby7gyQK HqKfdUs/36H4sU5DoGO6sZ/apz+mF7nlzgTTzNjCYNLHAbm1Rp2h1WYmeOlPvb8sJaFV DZuSkzlEP87nmQ12jPR4GfkYgUEv71ngcMDFI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=CCSIPSIT4PiO3YyoIimZcVCaj5HERSfk39Ii3mzGmf0JYejTSEccaoDugVx7njKjFP 83PeYKsRjmsaKl/t7iDevxnF3lnBFKusFvn0oi06uufUWXPM6ZnCiRmivAfmRBCWL6l0 fSIAUK6nBW4471/Nm+UkGNw4JyzJmhRtZchSw= MIME-Version: 1.0 Received: by 10.223.121.208 with SMTP id i16mr3036436far.46.1285829364294; Wed, 29 Sep 2010 23:49:24 -0700 (PDT) Received: by 10.223.120.139 with HTTP; Wed, 29 Sep 2010 23:49:24 -0700 (PDT) Date: Thu, 30 Sep 2010 01:49:24 -0500 Message-ID: From: Adam Vande More To: freebsd-stable@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: MCA messages in dmesg X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 
06:49:27 -0000

For a while now, my home server has been acting up. Actually, it had a bad set of RAM long ago; I replaced it and it worked fine. It has been acting weird again, and I've found this in dmesg:

MCA: Bank 0, Status 0xf200000000000800
MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x6fb, APIC ID 2
MCA: CPU 2 UNCOR PCC OVER BUSL0 Source ERR Memory
MCA: Bank 0, Status 0xf200000000000800
MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
MCA: Vendor "GenuineIntel", ID 0x6fb, APIC ID 3
MCA: CPU 3 UNCOR PCC OVER BUSL0 Source ERR Memory

I really don't know what MCA is, but that looks like possibly bad RAM again. I have some other DIMMs I can try, but I was hoping someone had some info on exactly what those messages mean. One concern is that the motherboard is bad and hosing the memory.

Some more info:

FreeBSD vbox.galacticdominator.com 8.1-STABLE FreeBSD 8.1-STABLE #0: Mon Aug 2 11:19:16 CDT 2010 adam@vbox.galacticdominator.com:/usr/obj/usr/src/sys/GENERIC amd64

smbios.bios.reldate="01/22/2008"
smbios.bios.vendor="Phoenix Technologies, LTD"
smbios.bios.version="6.00 PG"
smbios.chassis.maker="NVIDIA"
smbios.chassis.serial=" "
smbios.chassis.tag=" "
smbios.chassis.version="NFORCE 680i LT SLI"
smbios.memory.enabled="4194304"
smbios.planar.maker="NVIDIA"
smbios.planar.product="NFORCE 680i LT SLI"
smbios.planar.serial="1"
smbios.planar.version="2"
smbios.socket.enabled="1"
smbios.socket.populated="1"
smbios.system.maker="NVIDIA"
smbios.system.product="NFORCE 680i LT SLI"
smbios.system.serial="1"
smbios.system.uuid="86fe600d-034b-0400-0000-000000000000"
smbios.system.version="2"
smbios.version="2.4"

Normal dmesg:

Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.1-STABLE #0: Mon Aug 2 11:19:16 CDT 2010 adam@vbox.galacticdominator.com:/usr/obj/usr/src/sys/GENERIC amd64 Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Core(TM)2 Quad CPU Q6600 @ 2.40GHz (2700.03-MHz K8-class CPU) Origin = "GenuineIntel" Id = 0x6fb Family = 6 Model = f Stepping = 11 Features=0xbfebfbff Features2=0xe3bd AMD Features=0x20100800 AMD Features2=0x1 TSC: P-state invariant real memory = 4294967296 (4096 MB) avail memory = 4073664512 (3884 MB) ACPI APIC Table: FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs FreeBSD/SMP: 1 package(s) x 4 core(s) cpu0 (BSP): APIC ID: 0 cpu1 (AP): APIC ID: 1 cpu2 (AP): APIC ID: 2 cpu3 (AP): APIC ID: 3 ioapic0: Changing APIC ID to 4 ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 acpi0: on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) acpi0: reservation of 0, a0000 (3) failed acpi0: reservation of 100000, afdf0000 (3) failed Timecounter "ACPI-fast" frequency 3579545 Hz quality 1000 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 cpu0: on acpi0 cpu1: on acpi0 cpu2: on acpi0 cpu3: on acpi0 acpi_hpet0: iomem 0xfeff0000-0xfeff03ff on acpi0 device_attach: acpi_hpet0 attach returned 12 acpi_button0: on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pci0: at device 0.1 (no driver attached) pci0: at device 0.2 (no driver attached) pci0: at device 0.3 (no driver attached) pci0: at device 0.4 (no driver attached) pci0: at device 0.5 (no driver attached) pci0: at device 0.6 (no driver attached) pci0: at device 0.7 (no driver attached) pci0: at device 1.0 (no driver attached) pci0: at device 1.1 (no driver attached) pci0: at device 1.2 (no driver attached) pci0: at device 1.3 (no driver attached) pci0: at device 1.4 (no driver attached) pci0: at device 1.5 (no driver attached) pci0: at device 1.6 (no driver attached) pci0: at device 2.0 (no driver attached) pci0: at device 2.1 (no driver attached) pci0: at device 2.2 (no driver attached) pcib1: at device 3.0 on 
pci0 pci1: on pcib1 vgapci0: port 0x8c00-0x8c7f mem 0xcc000000-0xccffffff,0xb0000000-0xbfffffff,0xcd000000-0xcdffffff irq 16 at device 0.0 on pci1 nvidia0: on vgapci0 vgapci0: child nvidia0 requested pci_enable_busmaster vgapci0: child nvidia0 requested pci_enable_io vgapci0: child nvidia0 requested pci_enable_io nvidia0: [ITHREAD] pci0: at device 9.0 (no driver attached) isab0: port 0xfc00-0xfc7f at device 10.0 on pci0 isa0: on isab0 pci0: at device 10.1 (no driver attached) ohci0: mem 0xcffff000-0xcfffffff irq 20 at device 11.0 on pci0 ohci0: [ITHREAD] usbus0: on ohci0 ehci0: mem 0xcfffe000-0xcfffe0ff irq 21 at device 11.1 on pci0 ehci0: [ITHREAD] usbus1: EHCI version 1.0 usbus1: on ehci0 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xec00-0xec0f at device 13.0 on pci0 ata0: on atapci0 ata0: [ITHREAD] ata1: on atapci0 ata1: [ITHREAD] atapci1: port 0x9f0-0x9f7,0xbf0-0xbf3,0x970-0x977,0xb70-0xb73,0xd800-0xd80f mem 0xcfffd000-0xcfffdfff irq 22 at device 14.0 on pci0 atapci1: [ITHREAD] ata2: on atapci1 ata2: [ITHREAD] ata3: on atapci1 ata3: [ITHREAD] atapci2: port 0x9e0-0x9e7,0xbe0-0xbe3,0x960-0x967,0xb60-0xb63,0xc400-0xc40f mem 0xcfffc000-0xcfffcfff irq 23 at device 14.1 on pci0 atapci2: [ITHREAD] ata4: on atapci2 ata4: [ITHREAD] ata5: on atapci2 ata5: [ITHREAD] atapci3: port 0xc000-0xc007,0xbc00-0xbc03,0xb800-0xb807,0xb400-0xb403,0xb000-0xb00f mem 0xcfffb000-0xcfffbfff irq 20 at device 14.2 on pci0 atapci3: [ITHREAD] ata6: on atapci3 ata6: [ITHREAD] ata7: on atapci3 ata7: [ITHREAD] pcib2: at device 15.0 on pci0 pci2: on pcib2 fwohci0: mem 0xcfeff000-0xcfeff7ff,0xcfef8000-0xcfefbfff irq 19 at device 7.0 on pci2 fwohci0: [ITHREAD] fwohci0: OHCI version 1.10 (ROM=0) fwohci0: No. of Isochronous channels is 4. fwohci0: EUI64 0d:60:fe:86:00:04:4b:03 fwohci0: Phy 1394a available S400, 2 ports. fwohci0: Link S400, max_rec 2048 bytes. 
firewire0: on fwohci0 fwe0: on firewire0 if_fwe0: Fake Ethernet address: 0e:60:fe:04:4b:03 fwe0: Ethernet address: 0e:60:fe:04:4b:03 fwip0: on firewire0 fwip0: Firewire address: 0d:60:fe:86:00:04:4b:03 @ 0xfffe00000000, S400, maxrec 2048 dcons_crom0: on firewire0 dcons_crom0: bus_addr 0xa60c4000 fwohci0: Initiate bus reset fwohci0: fwohci_intr_core: BUS reset fwohci0: fwohci_intr_core: node_id=0x00000000, SelfID Count=1, CYCLEMASTER mode xl0: <3Com 3c900B-TPO Etherlink XL> port 0x9c00-0x9c7f mem 0xcfefe000-0xcfefe07f irq 18 at device 10.0 on pci2 xl0: selecting 10baseT transceiver, half duplex xl0: Ethernet address: 00:04:76:d2:50:25 xl0: [ITHREAD] hdac0: mem 0xcfff4000-0xcfff7fff irq 21 at device 15.1 on pci0 hdac0: HDA Driver Revision: 20100226_0142 hdac0: [ITHREAD] nfe0: port 0xac00-0xac07 mem 0xcfffa000-0xcfffafff,0xcfff9000-0xcfff90ff,0xcfff8000-0xcfff800f irq 22 at device 18.0 on pci0 miibus0: on nfe0 e1000phy0: PHY 2 on miibus0 e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 1000baseT-FDX, auto nfe0: Ethernet address: 00:04:4b:04:01:28 nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] nfe0: [FILTER] atrtc0: port 0x70-0x71,0x72-0x73 on acpi0 fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fdc0: [FILTER] uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 uart0: [FILTER] orm0: at iomem 0xd0000-0xd3fff,0xd4000-0xd57ff on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 atkbdc0: at port 0x60,0x64 on isa0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] ppc0: cannot reserve I/O port range est0: on cpu0 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 9210921060c061b device_attach: est0 attach returned 6 p4tcc0: on cpu0 est1: on cpu1 est: CPU supports Enhanced Speedstep, but is not recognized. 
est: cpu_vendor GenuineIntel, msr 9210921060c061b device_attach: est1 attach returned 6 p4tcc1: on cpu1 est2: on cpu2 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 9210921060c061b device_attach: est2 attach returned 6 p4tcc2: on cpu2 est3: on cpu3 est: CPU supports Enhanced Speedstep, but is not recognized. est: cpu_vendor GenuineIntel, msr 9210921060c061b device_attach: est3 attach returned 6 p4tcc3: on cpu3 firewire0: 1 nodes, maxhop <= 0 cable IRM irm(0) (me) firewire0: bus manager 0 ZFS filesystem version 3 ZFS storage pool version 14 Timecounters tick every 1.000 msec vboxdrv: fAsync=0 offMin=0x25b offMax=0x159f usbus0: 12Mbps Full Speed USB v1.0 usbus1: 480Mbps High Speed USB v2.0 ugen0.1: at usbus0 uhub0: on usbus0 ugen1.1: at usbus1 uhub1: on usbus1 uhub0: 10 ports with 10 removable, self powered acd0: DVDR at ata2-master UDMA100 SATA 1.5Gb/s ad8: 953869MB at ata4-master UDMA100 SATA 3Gb/s ad12: 76319MB at ata6-master UDMA100 SATA 3Gb/s ad14: 953869MB at ata7-master UDMA100 SATA 3Gb/s hdac0: HDA Codec #0: Realtek ALC885 pcm0: at cad 0 nid 1 on hdac0 pcm1: at cad 0 nid 1 on hdac0 acd0: FAILURE - INQUIRY ILLEGAL REQUEST asc=0x24 ascq=0x00 sks=0x40 0x00 0x01 cd0 at ata2 bus 0 scbus0 target 0 lun 0 cd0: Removable CD-ROM SCSI-0 device cd0: 100.000MB/s transfers cd0: cd present [3073 x 2048 byte records] SMP: AP CPU #1 Launched! SMP: AP CPU #3 Launched! SMP: AP CPU #2 Launched! Root mount waiting for: usbus1 uhub1: 10 ports with 10 removable, self powered Trying to mount root from zfs:tank GEOM: zvol/tank/usr/home/django-zvols1: geometry does not match label (16h,63s != 255h,63s). GEOM: zvol/tank/usr/home/horde-zvols1: geometry does not match label (16h,63s != 255h,63s). 
vboxnet0: Ethernet address: 0a:00:27:00:00:00 nfe0: link state changed to UP nfe0: promiscuous mode enabled -- Adam Vande More From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 06:51:54 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 69D4410656A3 for ; Thu, 30 Sep 2010 06:51:54 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta14.emeryville.ca.mail.comcast.net (qmta14.emeryville.ca.mail.comcast.net [76.96.27.212]) by mx1.freebsd.org (Postfix) with ESMTP id 51B238FC19 for ; Thu, 30 Sep 2010 06:51:53 +0000 (UTC) Received: from omta12.emeryville.ca.mail.comcast.net ([76.96.30.44]) by qmta14.emeryville.ca.mail.comcast.net with comcast id Cudx1f0040x6nqcAEurt1h; Thu, 30 Sep 2010 06:51:53 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta12.emeryville.ca.mail.comcast.net with comcast id Curr1f00B3LrwQ28YurrFX; Thu, 30 Sep 2010 06:51:52 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 7592A9B418; Wed, 29 Sep 2010 23:51:51 -0700 (PDT) Date: Wed, 29 Sep 2010 23:51:51 -0700 From: Jeremy Chadwick To: freebsd-stable@freebsd.org Message-ID: <20100930065151.GA9634@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Ed Schouten , ale@FreeBSD.org Subject: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 06:51:54 -0000 Something interesting I've come across which happens on both RELENG_7 and RELENG_8 (indicating it's not a problem with the older tty code or the newer pty/pts code), and it's reproducible on Linux (sort of...). 
mysqld_safe appears to hold a pty/tty open even after the process has been backgrounded. I can understand how/why this might occur, just not in this particular case.

I had a colleague test the situation on his Linux machine. He was able to confirm that:

1) "mysqld_safe > /dev/null 2>&1 &" never released the tty
2) "nohup mysqld_safe > /dev/null 2>&1 &" did release the tty

With regards to test #1, looking in /proc/{pid}/fd showed that STDIN was being held open. I recommended he point STDIN to /dev/null as so:

"mysqld_safe < /dev/null > /dev/null 2>&1 &"

Which also solved the problem.

On FreeBSD it's a different story. Below, mysql-server was started as root on pts/1. The open file descriptors all point to /dev/null, so I'm not sure why the pty/tty is being held open.

icarus# ps -aux -U mysql
USER    PID %CPU %MEM   VSZ   RSS  TT  STAT STARTED      TIME COMMAND
mysql 10078  0.2  0.3 35100 11032   1  S    11:38PM   0:00.02 [mysqld]
mysql  9997  0.0  0.0  8228  1592   1  S    11:38PM   0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mys

icarus# procstat -f 9997
  PID COMM     FD T V FLAGS    REF OFFSET PRO NAME
 9997 sh      cwd v d --------   -      - -   /root
 9997 sh     root v d --------   -      - -   /
 9997 sh        0 v c r-------   1      0 -   /dev/null
 9997 sh        1 v c -w------   2      0 -   /dev/null
 9997 sh        2 v c -w------   2      0 -   /dev/null

icarus# procstat -f 10078
  PID COMM     FD T V FLAGS    REF OFFSET PRO NAME
10078 mysqld  cwd v d --------   -      - -   /storage/mysql
10078 mysqld root v d --------   -      - -   /
10078 mysqld    0 v c r-------   1      0 -   /dev/null
10078 mysqld    1 v r rwa-----   1  32048 -   /storage/mysql/icarus.home.lan.err
10078 mysqld    2 v r rwa-----   1  32380 -   /storage/mysql/icarus.home.lan.err

At this point I log out of pts/1 and log back in to the machine (which sticks me on pts/2 as a result of the problem).
Looking again, we see: icarus# ps -aux -U mysql USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND mysql 9997 0.0 0.0 8228 1592 1- I 11:38PM 0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mys mysql 10078 0.0 0.3 35100 11032 1- I 11:38PM 0:00.02 [mysqld] With absolutely no change in procstat output relevant to fds 0/1/2. Yet pts/1 still appears held open by something: icarus# ls -l /dev/pts total 0 crw--w---- 1 jdc tty 0, 116 Sep 29 23:44 0 crw-rw-rw- 1 root wheel 0, 115 Sep 29 23:41 1 crw--w---- 1 jdc tty 0, 117 Sep 29 23:44 2 fstat also shows no indication of anything using pts/1: icarus# fstat /dev/pts/1 USER CMD PID FD MOUNT INUM MODE SZ|DV R/W NAME icarus# fstat | grep pts/1 icarus# Ideas? -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 07:00:44 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6AC9E106567A; Thu, 30 Sep 2010 07:00:44 +0000 (UTC) (envelope-from amvandemore@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id A9F688FC22; Thu, 30 Sep 2010 07:00:43 +0000 (UTC) Received: by fxm9 with SMTP id 9so1368610fxm.13 for ; Thu, 30 Sep 2010 00:00:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=fZSo8Daw1EDq3wsaZlazef6yi9wKWhBAVjGwmmThSZE=; b=j5s+mbdTo9DrlrbTkiTltiJcEuH369Fu+NAio/HkDnWleyr3OG60r8XygZF043z8c7 DCvYa9LSDQDbrzUir1tp8wXlpCLPYCJPaRg2K2wQ51GhUJOMogjIKCGflLjJ+zJfB+Jn epid7O7T+3/EacavzpjBwbJSzOsyYoeIq/Ipc= DomainKey-Signature: a=rsa-sha1; c=nofws; 
d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=fq6jjqbzhSu4tN7RKWnjDH07VHmNhDfZXNmCKVlWHvwUVJJeDs4DIZ/TzXm3GiWfjp yuyKyTKRuf8VShoJBTMtzrThgT6iIPM6Hf/LCcImaQ9GdO6kxLPf+jjpodu3tpaWYjWG K1lSmCd3sWwb91J3Th32XPwYMOXBHmetwnQBQ= MIME-Version: 1.0 Received: by 10.223.119.211 with SMTP id a19mr3102810far.4.1285830042659; Thu, 30 Sep 2010 00:00:42 -0700 (PDT) Received: by 10.223.120.139 with HTTP; Thu, 30 Sep 2010 00:00:42 -0700 (PDT) In-Reply-To: <20100930065151.GA9634@icarus.home.lan> References: <20100930065151.GA9634@icarus.home.lan> Date: Thu, 30 Sep 2010 02:00:42 -0500 Message-ID: From: Adam Vande More To: Jeremy Chadwick Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: Ed Schouten , freebsd-stable@freebsd.org, ale@freebsd.org Subject: Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 07:00:44 -0000

On Thu, Sep 30, 2010 at 1:51 AM, Jeremy Chadwick wrote:

> Something interesting I've come across which happens on both RELENG_7
> and RELENG_8 (indicating it's not a problem with the older tty code or
> the newer pty/pts code), and it's reproducible on Linux (sort of...).
>
> mysqld_safe appears to hold a pty/tty open even after the process has
> been backgrounded. I can understand how/why this might occur, just not
> in this particular case.
>

Actually came across this the other day:

http://lists.freebsd.org/pipermail/freebsd-ports/2010-July/062417.html

It appears you aren't the only one to notice the issue.
-- Adam Vande More From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 07:03:34 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B1FD1106566C; Thu, 30 Sep 2010 07:03:34 +0000 (UTC) (envelope-from ed@hoeg.nl) Received: from mx0.hoeg.nl (unknown [IPv6:2a01:4f8:101:5343::aa]) by mx1.freebsd.org (Postfix) with ESMTP id 262828FC16; Thu, 30 Sep 2010 07:03:34 +0000 (UTC) Received: by mx0.hoeg.nl (Postfix, from userid 1000) id 46E3C2A28CF9; Thu, 30 Sep 2010 09:03:33 +0200 (CEST) Date: Thu, 30 Sep 2010 09:03:33 +0200 From: Ed Schouten To: Jeremy Chadwick Message-ID: <20100930070333.GU87427@hoeg.nl> References: <20100930065151.GA9634@icarus.home.lan> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="QxSStYAgvEtE+iQJ" Content-Disposition: inline In-Reply-To: <20100930065151.GA9634@icarus.home.lan> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org, ale@FreeBSD.org Subject: Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 07:03:34 -0000 --QxSStYAgvEtE+iQJ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable Hi Jeremy, * Jeremy Chadwick wrote: > 1) "mysqld_safe > /dev/null 2>&1 &" never released the tty > 2) "nohup mysqld_safe > /dev/null 2>&1 &" did release the tty What happens if you run the following command? daemon -cf mysqld_safe The point is that FreeBSD's pts(4) driver only deallocates TTYs when it's really sure nothing uses it anymore. 
Even if there is not a single file descriptor referring to the slave device, it has to wait until there exist no processes which have the TTY as its controlling TTY. The `pstat -t' command is quite useful to figure out whether there is still a session associated with the TTY. See the following thread: http://lists.freebsd.org/pipermail/freebsd-ports/2010-July/062417.html --=20 Ed Schouten WWW: http://80386.nl/ --QxSStYAgvEtE+iQJ Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.16 (FreeBSD) iEYEARECAAYFAkykNkUACgkQ52SDGA2eCwUNZQCfe9pqbfGllCbI8eBnrUUeMNb5 ebQAnitV3htvjRs9sEzipAVR6viULUQl =bmVt -----END PGP SIGNATURE----- --QxSStYAgvEtE+iQJ-- From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 07:28:20 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AF84E1065674 for ; Thu, 30 Sep 2010 07:28:20 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta02.emeryville.ca.mail.comcast.net (qmta02.emeryville.ca.mail.comcast.net [76.96.30.24]) by mx1.freebsd.org (Postfix) with ESMTP id 95B118FC20 for ; Thu, 30 Sep 2010 07:28:20 +0000 (UTC) Received: from omta01.emeryville.ca.mail.comcast.net ([76.96.30.11]) by qmta02.emeryville.ca.mail.comcast.net with comcast id Cv4Q1f0030EPchoA2vULjP; Thu, 30 Sep 2010 07:28:20 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta01.emeryville.ca.mail.comcast.net with comcast id CvUK1f0033LrwQ28MvUKVd; Thu, 30 Sep 2010 07:28:19 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 0C82D9B418; Thu, 30 Sep 2010 00:28:19 -0700 (PDT) Date: Thu, 30 Sep 2010 00:28:19 -0700 From: Jeremy Chadwick To: Ed Schouten Message-ID: <20100930072819.GA10678@icarus.home.lan> References: <20100930065151.GA9634@icarus.home.lan> <20100930070333.GU87427@hoeg.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<20100930070333.GU87427@hoeg.nl> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org, ale@FreeBSD.org Subject: Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 07:28:20 -0000 On Thu, Sep 30, 2010 at 09:03:33AM +0200, Ed Schouten wrote: > Hi Jeremy, > > * Jeremy Chadwick wrote: > > 1) "mysqld_safe > /dev/null 2>&1 &" never released the tty > > 2) "nohup mysqld_safe > /dev/null 2>&1 &" did release the tty > > What happens if you run the following command? > > daemon -cf mysqld_safe Let's try it and find out. This is all being done from pts/2. icarus# ps -auxwww -U mysql | grep mysqld_safe mysql 9997 0.0 0.0 8228 1592 1- I 11:38PM 0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mysql/my.cnf --user=mysql --datadir=/storage/mysql --pid-file=/storage/mysql/icarus.home.lan.pid --skip-innodb icarus# /usr/local/etc/rc.d/mysql-server stop Stopping mysql. Waiting for PIDS: 10078. icarus# daemon -c -f -u mysql /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mysql/my.cnf --user=mysql --datadir=/storage/mysql --pid-file=/storage/mysql/icarus.home.lan.pid --skip-innodb icarus# ps -auxwww -U mysql USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND mysql 11036 0.0 0.0 8228 1600 ?? Is 12:21AM 0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mysql/my.cnf --user=mysql --datadir=/storage/mysql --pid-file=/storage/mysql/icarus.home.lan.pid --skip-innodb mysql 11116 0.0 0.3 35100 11032 ?? I 12:21AM 0:00.02 [mysqld] icarus# exit $ exit [another window, different tty] icarus# pstat -t | grep pts/2 icarus# Summary: looks good to me. > The point is that FreeBSD's pts(4) driver only deallocates TTYs when > it's really sure nothing uses it anymore. 
Even if there is not a single
> file descriptor referring to the slave device, it has to wait until
> there exist no processes which have the TTY as its controlling TTY.

Ah I see. Well that would explain the difference between Linux and
FreeBSD then -- it sounds like Linux has a one-off with regards to fds
that point to /dev/null.

> The `pstat -t' command is quite useful to figure out whether there is
> still a session associated with the TTY.
>
> See the following thread:
>
> http://lists.freebsd.org/pipermail/freebsd-ports/2010-July/062417.html

Ahhh, two people pointing me to the same thread, sweet. :-) I wasn't
subscribed to -ports back in July, else I'd almost certainly have said
something then.

It's exactly as you stated in that thread -- the tty is in "G" state
(waiting to be freed/process to exit). Please note the below output was
obtained *before* attempting the "daemon -cf" stuff you recommended.

icarus# pstat -t | grep pts/1
pts/1 0 0 0 0 0 0 0 0 9372 0 G

Until rc(8) can be updated to support daemon(8) natively, the ~76 ports
which Do The Wrong Thing(tm) should get updated to do it this way. Ones
like mysqlXX-server should be placed high on the priority list given
their popularity/importance.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 07:56:18 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 43417106566C for ; Thu, 30 Sep 2010 07:56:18 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta01.emeryville.ca.mail.comcast.net (qmta01.emeryville.ca.mail.comcast.net [76.96.30.16]) by mx1.freebsd.org (Postfix) with ESMTP id 280278FC0C for ; Thu, 30 Sep 2010 07:56:17 +0000 (UTC) Received: from omta17.emeryville.ca.mail.comcast.net ([76.96.30.73]) by qmta01.emeryville.ca.mail.comcast.net with comcast id CvF11f0041afHeLA1vwH9s; Thu, 30 Sep 2010 07:56:17 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta17.emeryville.ca.mail.comcast.net with comcast id CvwG1f00A3LrwQ28dvwGEu; Thu, 30 Sep 2010 07:56:17 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 88B719B418; Thu, 30 Sep 2010 00:56:16 -0700 (PDT) Date: Thu, 30 Sep 2010 00:56:16 -0700 From: Jeremy Chadwick To: Alex Dupre Message-ID: <20100930075616.GA11519@icarus.home.lan> References: <20100930065151.GA9634@icarus.home.lan> <20100930070333.GU87427@hoeg.nl> <20100930072819.GA10678@icarus.home.lan> <4CA43C91.5040000@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA43C91.5040000@FreeBSD.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Ed Schouten , freebsd-stable@freebsd.org Subject: Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 07:56:18 -0000 On Thu, Sep 30, 2010 at 09:30:25AM +0200, Alex Dupre wrote: > Jeremy Chadwick ha scritto: > > Until rc(8) can be updated to support daemon(8) natively, > > This 
would be the Right Thing IMHO. > > > the ~76 ports > > which Do The Wrong Thing(tm) should get updated to do it this way. Ones > > like mysqlXX-server should be placed high on the priority list given > > their popularity/importance. > > If you have an already tested patch for the mysql rc script, I'll commit > it asap. Just finished it for databases/mysql51-server. Tested on RELENG_8 with the below variables in use, and also tested with mysql_limits="yes". mysql_enable="yes" mysql_dbdir="/storage/mysql" mysql_args="--skip-innodb" Should work fine on RELENG_7 since it has /usr/sbin/daemon too. Tested using stop, start, and restart. I can test a reboot if you'd like, just let me know. Validation: icarus# /usr/local/etc/rc.d/mysql-server stop Stopping mysql. Waiting for PIDS: 12015. icarus# /usr/local/etc/rc.d/mysql-server start Starting mysql. icarus# ps -auxwww -U mysql USER PID %CPU %MEM VSZ RSS TT STAT STARTED TIME COMMAND mysql 12271 0.0 0.0 8228 1600 ?? Is 12:53AM 0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/storage/mysql/my.cnf --user=mysql --datadir=/storage/mysql --pid-file=/storage/mysql/icarus.home.lan.pid --skip-innodb mysql 12352 0.0 0.3 35100 11032 ?? I 12:53AM 0:00.02 [mysqld] I'll also take this opportunity to point this out, since I'm certain someone will mention it: daemon's -u argument would be ideal except that it """breaks""" when using rc.subr's xxx_user variable (which uses su(1) to change credentials/spawn $command). With both in use, daemon then fails on setusercontext(), which in turn fails because of initgroups() returning EPERM -- and this does make sense. So let's not use daemon -u in rc.subr for the time being. The diff is pretty obvious/simple (2 line change), so the other databases/mysqlXX-server ports can be upgraded in the same manner. 
--- files/mysql-server.sh.in.orig 2010-03-27 03:24:53.000000000 -0700
+++ files/mysql-server.sh.in 2010-09-30 00:45:38.000000000 -0700
@@ -35,8 +35,8 @@
 mysql_user="mysql"
 mysql_limits_args="-e -U ${mysql_user}"
 pidfile="${mysql_dbdir}/`/bin/hostname`.pid"
-command="%%PREFIX%%/bin/mysqld_safe"
-command_args="--defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args} > /dev/null 2>&1 &"
+command="/usr/sbin/daemon"
+command_args="-c -f /usr/local/bin/mysqld_safe --defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args}"
 procname="%%PREFIX%%/libexec/mysqld"
 start_precmd="${name}_prestart"
 start_postcmd="${name}_poststart"

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 07:57:09 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A29BD106564A for ; Thu, 30 Sep 2010 07:57:09 +0000 (UTC) (envelope-from ale@FreeBSD.org) Received: from andxor.it (relay.andxor.it [195.223.2.3]) by mx1.freebsd.org (Postfix) with SMTP id D11088FC12 for ; Thu, 30 Sep 2010 07:57:08 +0000 (UTC) Received: (qmail 53239 invoked from network); 30 Sep 2010 07:30:26 -0000 Received: from unknown (HELO ale.andxor.it) (192.168.2.5) by andxor.it with SMTP; 30 Sep 2010 07:30:26 -0000 Message-ID: <4CA43C91.5040000@FreeBSD.org> Date: Thu, 30 Sep 2010 09:30:25 +0200 From: Alex Dupre User-Agent: Thunderbird 2.0.0.22 (X11/20090624) MIME-Version: 1.0 To: Jeremy Chadwick References: <20100930065151.GA9634@icarus.home.lan> <20100930070333.GU87427@hoeg.nl> <20100930072819.GA10678@icarus.home.lan> In-Reply-To: <20100930072819.GA10678@icarus.home.lan> Content-Type:
text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Ed Schouten , freebsd-stable@freebsd.org Subject: Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 07:57:09 -0000 Jeremy Chadwick ha scritto: > Until rc(8) can be updated to support daemon(8) natively, This would be the Right Thing IMHO. > the ~76 ports > which Do The Wrong Thing(tm) should get updated to do it this way. Ones > like mysqlXX-server should be placed high on the priority list given > their popularity/importance. If you have an already tested patch for the mysql rc script, I'll commit it asap. -- Alex Dupre From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 08:45:09 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DF1851065670 for ; Thu, 30 Sep 2010 08:45:09 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id 988248FC17 for ; Thu, 30 Sep 2010 08:45:09 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1P1ETa-000NXr-O1; Thu, 30 Sep 2010 10:26:50 +0200 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.2 To: Morgan Reed In-reply-to: References: Comments: In-reply-to Morgan Reed message dated "Thu, 30 Sep 2010 00:24:47 +1000." 
Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Thu, 30 Sep 2010 10:26:50 +0200 From: Daniel Braniss Message-ID: Cc: stable@freebsd.org Subject: Re: Diskless/readonly root booting issues X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 08:45:10 -0000 > Hi all, > > I've been working on updating my semi-embedded images to > 7.3-stable of late (I generally wait for .3+ releases), it's been a > few years since the last time I did one of these and I'm having some > issues getting my netboot test environment to behave itself. > > I'm sure it's something simple but I've spent quite a bit of time > looking for answers and poking the system but no joy yet. > > Basically I use a PXE booted NFS root to test my reduced footprint > image builds, the boot is working but init is attempting to remount / > rw (in spite of it being marked ro in fstab) which of course fails > because the directory is exported ro from the NFS server at which > point the system dumps me to single user mode; > > === OUTPUT === > > Starting file system checks: > udp: Netconfig database not found > Mounting root filesystem rw failed, startup aborted > ERROR: ABORTING BOOT (sending SIGTERM to parent)! 
> Sep 30 09:60:02 init: /bin/sh on /etc/rc terminated abnormally, going > to single user mode > Enter full pathname of shell or RETURN for /bin/sh: > > ============ > > Relevant configs from the diskless root > > == rc.conf == > > ifconfig_le0="DHCP" > > diskless_mount=/etc/rc.initdiskless > > varsize=8192 > varmfs="YES" > > tmpsize=8192 > tmpmfs="YES" > > nfs_client_enable="YES" > > dumpdev="NO" > > ========= > > rc.initdiskless is the version from /usr/share/examples/rc.initdiskless > > == fstab == > > 192.168.2.2:/usr/fbtest / nfs ro 0 0 > proc /proc procfs rw 0 0 > > ======== > > == loader.conf == > > verbose_loading="YES" > > autoboot_delay="2" > > ============ > > Kernel is (obviously) built with NFS_ROOT and NFSCLIENT, relatively > minimalist otherwise, have also tested with GENERIC, same result. > > I must be forgetting something simple in all of this, I don't recall > it being terribly difficult to get this stuff working when I was doing > my original work with 6.3, though I don't recall the use of the > initdiskless script, IIRC I was using rc.diskless2 which (again IIRC) > was later replaced by /etc/rc.d/diskless but I've not been able to > find this script anywhere. > > Any suggestions would be greatly appreciated at this point. > > Thanks, > > Morgan Reed firstly, you should be using the latest pxeboot, it passes the root file-handle to the kernel, so no need to remount it, so remove the line from the fstab. secondly, try using /etc/rc.initdiskless - which is the default. 
use the KISS method :-) danny From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 08:54:22 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7B16D106566B; Thu, 30 Sep 2010 08:54:22 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by mx1.freebsd.org (Postfix) with ESMTP id 3D0BD8FC14; Thu, 30 Sep 2010 08:54:21 +0000 (UTC) Received: from freebsd-current.sentex.ca (localhost [127.0.0.1]) by freebsd-current.sentex.ca (8.14.4/8.14.3) with ESMTP id o8U8sL7U021470; Thu, 30 Sep 2010 04:54:21 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-current.sentex.ca (8.14.4/8.14.3/Submit) id o8U8sL9X021469; Thu, 30 Sep 2010 08:54:21 GMT (envelope-from tinderbox@freebsd.org) Date: Thu, 30 Sep 2010 08:54:21 GMT Message-Id: <201009300854.o8U8sL9X021469@freebsd-current.sentex.ca> X-Authentication-Warning: freebsd-current.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [releng_8_0 tinderbox] failure on ia64/ia64 X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 08:54:22 -0000 TB --- 2010-09-30 08:07:07 - tinderbox 2.6 running on freebsd-current.sentex.ca TB --- 2010-09-30 08:07:07 - starting RELENG_8_0 tinderbox run for ia64/ia64 TB --- 2010-09-30 08:07:07 - cleaning the object tree TB --- 2010-09-30 08:10:48 - cvsupping the source tree TB --- 2010-09-30 08:10:48 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_8_0/ia64/ia64/supfile TB --- 2010-09-30 08:54:21 - WARNING: /usr/bin/csup returned exit code 1 TB --- 2010-09-30 08:54:21 
- ERROR: unable to cvsup the source tree TB --- 2010-09-30 08:54:21 - 1.22 user 134.47 system 2833.60 real http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8_0-ia64-ia64.full From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 09:39:04 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 78B6C1065672; Thu, 30 Sep 2010 09:39:04 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by mx1.freebsd.org (Postfix) with ESMTP id 39FDF8FC12; Thu, 30 Sep 2010 09:39:03 +0000 (UTC) Received: from freebsd-current.sentex.ca (localhost [127.0.0.1]) by freebsd-current.sentex.ca (8.14.4/8.14.3) with ESMTP id o8U9d3b9081286; Thu, 30 Sep 2010 05:39:03 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-current.sentex.ca (8.14.4/8.14.3/Submit) id o8U9d3B9081285; Thu, 30 Sep 2010 09:39:03 GMT (envelope-from tinderbox@freebsd.org) Date: Thu, 30 Sep 2010 09:39:03 GMT Message-Id: <201009300939.o8U9d3B9081285@freebsd-current.sentex.ca> X-Authentication-Warning: freebsd-current.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [releng_8_0 tinderbox] failure on mips/mips X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 09:39:04 -0000 TB --- 2010-09-30 08:54:21 - tinderbox 2.6 running on freebsd-current.sentex.ca TB --- 2010-09-30 08:54:21 - starting RELENG_8_0 tinderbox run for mips/mips TB --- 2010-09-30 08:54:21 - cleaning the object tree TB --- 2010-09-30 08:56:19 - cvsupping the source tree TB --- 2010-09-30 08:56:19 - /usr/bin/csup -z -r 3 -g -L 1 -h 
cvsup.sentex.ca /tinderbox/RELENG_8_0/mips/mips/supfile TB --- 2010-09-30 09:39:03 - WARNING: /usr/bin/csup returned exit code 1 TB --- 2010-09-30 09:39:03 - ERROR: unable to cvsup the source tree TB --- 2010-09-30 09:39:03 - 0.80 user 76.49 system 2681.91 real http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8_0-mips-mips.full From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 13:06:52 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 828941065674 for ; Thu, 30 Sep 2010 13:06:52 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta15.emeryville.ca.mail.comcast.net (qmta15.emeryville.ca.mail.comcast.net [76.96.27.228]) by mx1.freebsd.org (Postfix) with ESMTP id 6554C8FC08 for ; Thu, 30 Sep 2010 13:06:51 +0000 (UTC) Received: from omta11.emeryville.ca.mail.comcast.net ([76.96.30.36]) by qmta15.emeryville.ca.mail.comcast.net with comcast id Cztg1f0020mlR8UAF16rkj; Thu, 30 Sep 2010 13:06:51 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta11.emeryville.ca.mail.comcast.net with comcast id D16p1f00K3LrwQ28X16q62; Thu, 30 Sep 2010 13:06:50 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 9391E9B418; Thu, 30 Sep 2010 06:06:49 -0700 (PDT) Date: Thu, 30 Sep 2010 06:06:49 -0700 From: Jeremy Chadwick To: Paul Mather Message-ID: <20100930130649.GA18206@icarus.home.lan> References: <20100930065151.GA9634@icarus.home.lan> <20100930070333.GU87427@hoeg.nl> <20100930072819.GA10678@icarus.home.lan> <4CA43C91.5040000@FreeBSD.org> <20100930075616.GA11519@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Ed Schouten , freebsd-stable@FreeBSD.org, Alex Dupre Subject: Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: 
list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 13:06:52 -0000 On Thu, Sep 30, 2010 at 08:53:07AM -0400, Paul Mather wrote: > On Sep 30, 2010, at 3:56 AM, Jeremy Chadwick wrote: > > > The diff is pretty obvious/simple (2 line change), so the other > > databases/mysqlXX-server ports can be upgraded in the same manner. > > > > --- files/mysql-server.sh.in.orig 2010-03-27 03:24:53.000000000 -0700 > > +++ files/mysql-server.sh.in 2010-09-30 00:45:38.000000000 -0700 > > @@ -35,8 +35,8 @@ > > mysql_user="mysql" > > mysql_limits_args="-e -U ${mysql_user}" > > pidfile="${mysql_dbdir}/`/bin/hostname`.pid" > > -command="%%PREFIX%%/bin/mysqld_safe" > > -command_args="--defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args} > /dev/null 2>&1 &" > > +command="/usr/sbin/daemon" > > +command_args="-c -f /usr/local/bin/mysqld_safe --defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args}" > > Shouldn't this be "-c -f %%PREFIX%%/bin/mysqld_safe ..." rather than hard-coding /usr/local? Yes. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 13:14:07 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D148C106566C for ; Thu, 30 Sep 2010 13:14:07 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id C22B88FC13 for ; Thu, 30 Sep 2010 13:14:05 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o8U8nr8r081019; Thu, 30 Sep 2010 01:49:57 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201009300849.o8U8nr8r081019@gw.catspoiler.org> Date: Thu, 30 Sep 2010 01:49:53 -0700 (PDT) From: Don Lewis To: avg@icyb.net.ua In-Reply-To: <4CA42A0A.6090003@icyb.net.ua> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org, sterling@camdensoftware.com, freebsd@jdc.parodius.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 13:14:07 -0000 On 30 Sep, Andriy Gapon wrote: > on 30/09/2010 02:27 Don Lewis said the following: > vmstat -i ? I didn't see anything odd in the vmstat -i output that I posted to the list earlier. It looked more or less normal as the ntp offset suddenly went insane. >> I did manage to catch the problem with lock profiling enabled: >> >> I'm currently testing SMP some more to verify if it really avoids this >> problem. > > OK. I wasn't able to cause SMP on stable to break. The silent reboots that I was seeing with WITNESS go away if I add WITNESS_SKIPSPIN. Witness doesn't complain about anything. 
I tested -CURRENT and !SMP seems to work ok. One difference in terms of hardware between the two tests is that I'm using a SATA drive when testing -STABLE and a SCSI drive when testing -CURRENT. At this point, I think the biggest clues are going to be in the lock profile results. From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 13:34:28 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 710AC1065674 for ; Thu, 30 Sep 2010 13:34:28 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2-6.sentex.ca [IPv6:2607:f3e0:80:80::2]) by mx1.freebsd.org (Postfix) with ESMTP id 138B18FC17 for ; Thu, 30 Sep 2010 13:34:27 +0000 (UTC) Received: from lava.sentex.ca (pyroxene.sentex.ca [199.212.134.18]) by smarthost2.sentex.ca (8.14.4/8.14.4) with ESMTP id o8UDYJhQ096400 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Thu, 30 Sep 2010 09:34:19 -0400 (EDT) (envelope-from mike@sentex.net) Received: from mdt-xp.sentex.net (simeon.sentex.ca [192.168.43.27]) by lava.sentex.ca (8.14.4/8.14.4) with ESMTP id o8UDYJiT017075; Thu, 30 Sep 2010 09:34:19 -0400 (EDT) (envelope-from mike@sentex.net) Message-Id: <201009301334.o8UDYJiT017075@lava.sentex.ca> X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9 Date: Thu, 30 Sep 2010 09:34:16 -0400 To: Jack Vogel From: Mike Tancsa In-Reply-To: References: <201006102031.o5AKVCH2016467@lava.sentex.ca> <201007021739.o62HdMOU092319@lava.sentex.ca> <20100702193654.GD10862@michelle.cdnetworks.com> <201008162107.o7GL76pA080191@lava.sentex.ca> <20100817185208.GA6482@michelle.cdnetworks.com> <201008171955.o7HJt67T087902@lava.sentex.ca> <20100817200020.GE6482@michelle.cdnetworks.com> <201009141759.o8EHxcZ0013539@lava.sentex.ca> <201009262157.o8QLvR0L012171@lava.sentex.ca> <201009262343.o8QNhgDG012676@lava.sentex.ca> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; 
format=flowed X-Scanned-By: MIMEDefang 2.67 on 205.211.164.50 Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org Subject: Re: RELENG_7 em problems (and RELENG_8) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 13:34:28 -0000 At 08:00 PM 9/26/2010, Jack Vogel wrote: >The system I've had stress tests running on has 82574 LOMs, so I hope it >will solve the problem, will see tomorrow morning at how things have held >up... I pulled a copy of sys/dev/e1000 from HEAD and copied onto my RELENG_8 box. I had another nic lock up last night :( Anyways, now running with the driver from HEAD on RELENG_8 amd64 em0: port 0x4040-0x405f mem 0xb4400000-0xb441ffff,0xb4425000-0xb4425fff irq 16 at device 25.0 on pci0 em0: Using an MSI interrupt em0: [FILTER] em0: Ethernet address: 00:15:17:ed:68:a5 em1: port 0x2000-0x201f mem 0xb4100000-0xb411ffff,0xb4120000-0xb4123fff irq 16 at device 0.0 on pci9 em1: Using MSIX interrupts with 3 vectors em1: [ITHREAD] em1: [ITHREAD] em1: [ITHREAD] em1: Ethernet address: 00:15:17:ed:68:a4 em0@pci0:0:25:0: class=0x020000 card=0x34ec8086 chip=0x10ef8086 rev=0x05 hdr=0x00 vendor = 'Intel Corporation' class = network subclass = ethernet cap 01[c8] = powerspec 2 supports D0 D3 current D0 cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message cap 13[e0] = PCI Advanced Features: FLR TP em1@pci0:9:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086 rev=0x00 hdr=0x00 vendor = 'Intel Corporation' device = 'Intel 82574L Gigabit Ethernet Controller (82574L)' class = network subclass = ethernet cap 01[c8] = powerspec 2 supports D0 D3 current D0 cap 05[d0] = MSI supports 1 message, 64 bit cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) cap 11[a0] = MSI-X supports 5 messages in map 0x1c enabled ecap 0001[100] = AER 1 0 fatal 0 non-fatal 0 
corrected ecap 0003[140] = Serial 1 001517ffffed68a4 interrupt total rate irq4: uart0 2283 6 irq16: siis0 4332 11 irq18: arcmsr0 137175 372 irq19: twa0 18805 51 irq21: ehci0 2734 7 irq23: ehci1 675 1 cpu0: timer 733804 1994 irq256: em0 73195 198 irq257: em1:rx 0 238 0 irq258: em1:tx 0 37 0 irq260: ahci0 4328 11 cpu1: timer 725637 1971 cpu3: timer 725709 1972 cpu2: timer 725688 1971 Total 3154640 8572 ---Mike >Jack > > >On Sun, Sep 26, 2010 at 4:43 PM, Mike Tancsa ><mike@sentex.net> wrote: >At 06:19 PM 9/26/2010, Jack Vogel wrote: >Your em1 is using MSI not MSIX and thus can't have multiple queues. I'm >not sure whats broken from what you show here. I will try to get the new >driver out shortly for you to try. > > >With this particular NIC, it will wedge under high load. I tried 2 >different motherboards and chipsets the same behaviour. > > ---Mike > > >Jack > > > >On Sun, Sep 26, 2010 at 2:57 PM, Mike Tancsa ><mike@sentex.net> wrote: >At 06:36 PM 9/24/2010, Jack Vogel wrote: >There is a new revision of the em driver coming next week, its going thru some >stress pounding over the weekend, if no issues show up I'll put it into HEAD. > >Yongari's changes in TX context handling which effects checksum and tso >are added. I've also decided that multiple queues in 82574 just are a source >of problems without a lot of benefit, so it still uses MSIX but with >only 3 vectors, >meaning it seperates TX and RX but has a single queue. > > >Thanks, looking forward to trying it out! With respect to the >multiple queues, I thought the driver already used just the one on >RELENG_8 ? If not, is there a way to force the existing driver to >use just the one queue ? 
> >On the box that has the NIC locking up, it shows > >em1@pci0:9:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086 >rev=0x00 hdr=0x00 > > vendor = 'Intel Corporation' > device = 'Intel 82574L Gigabit Ethernet Controller (82574L)' > class = network > subclass = ethernet > cap 01[c8] = powerspec 2 supports D0 D3 current D0 > cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message > cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) > >and > >vmstat -i shows > >irq256: em0 5129063 353 >irq257: em1 531251 36 > >in a wedged state, stats look like > >dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.0.5 >dev.em.1.%driver: em >dev.em.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PEX4.HART >dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 >subdevice=0x34ec class=0x020000 >dev.em.1.%parent: pci9 >dev.em.1.nvm: -1 >dev.em.1.rx_int_delay: 0 >dev.em.1.tx_int_delay: 66 >dev.em.1.rx_abs_int_delay: 66 >dev.em.1.tx_abs_int_delay: 66 >dev.em.1.rx_processing_limit: 100 >dev.em.1.link_irq: 0 >dev.em.1.mbuf_alloc_fail: 0 >dev.em.1.cluster_alloc_fail: 0 >dev.em.1.dropped: 0 >dev.em.1.tx_dma_fail: 0 >dev.em.1.fc_high_water: 18432 >dev.em.1.fc_low_water: 16932 >dev.em.1.mac_stats.excess_coll: 0 >dev.em.1.mac_stats.symbol_errors: 0 >dev.em.1.mac_stats.sequence_errors: 0 >dev.em.1.mac_stats.defer_count: 0 >dev.em.1.mac_stats.missed_packets: 41522 >dev.em.1.mac_stats.recv_no_buff: 19 >dev.em.1.mac_stats.recv_errs: 0 >dev.em.1.mac_stats.crc_errs: 0 >dev.em.1.mac_stats.alignment_errs: 0 >dev.em.1.mac_stats.coll_ext_errs: 0 >dev.em.1.mac_stats.rx_overruns: 41398 >dev.em.1.mac_stats.watchdog_timeouts: 0 >dev.em.1.mac_stats.xon_recvd: 0 >dev.em.1.mac_stats.xon_txd: 0 >dev.em.1.mac_stats.xoff_recvd: 0 >dev.em.1.mac_stats.xoff_txd: 0 >dev.em.1.mac_stats.total_pkts_recvd: 95229129 >dev.em.1.mac_stats.good_pkts_recvd: 95187607 >dev.em.1.mac_stats.bcast_pkts_recvd: 79244 >dev.em.1.mac_stats.mcast_pkts_recvd: 0 >dev.em.1.mac_stats.rx_frames_64: 93680 
>dev.em.1.mac_stats.rx_frames_65_127: 1516349 >dev.em.1.mac_stats.rx_frames_128_255: 4464941 >dev.em.1.mac_stats.rx_frames_256_511: 4024 >dev.em.1.mac_stats.rx_frames_512_1023: 2096067 >dev.em.1.mac_stats.rx_frames_1024_1522: 87012546 >dev.em.1.mac_stats.good_octets_recvd: 0 >dev.em.1.mac_stats.good_octest_txd: 0 >dev.em.1.mac_stats.total_pkts_txd: 66775098 >dev.em.1.mac_stats.good_pkts_txd: 66775098 >dev.em.1.mac_stats.bcast_pkts_txd: 509 >dev.em.1.mac_stats.mcast_pkts_txd: 7 >dev.em.1.mac_stats.tx_frames_64: 48038472 >dev.em.1.mac_stats.tx_frames_65_127: 13402833 >dev.em.1.mac_stats.tx_frames_128_255: 5324413 >dev.em.1.mac_stats.tx_frames_256_511: 957 >dev.em.1.mac_stats.tx_frames_512_1023: 319 >dev.em.1.mac_stats.tx_frames_1024_1522: 8104 >dev.em.1.mac_stats.tso_txd: 1069 >dev.em.1.mac_stats.tso_ctx_fail: 0 >dev.em.1.interrupts.asserts: 0 >dev.em.1.interrupts.rx_pkt_timer: 0 >dev.em.1.interrupts.rx_abs_timer: 0 >dev.em.1.interrupts.tx_pkt_timer: 0 >dev.em.1.interrupts.tx_abs_timer: 0 >dev.em.1.interrupts.tx_queue_empty: 0 >dev.em.1.interrupts.tx_queue_min_thresh: 0 >dev.em.1.interrupts.rx_desc_min_thresh: 0 >dev.em.1.interrupts.rx_overrun: 0 >dev.em.1.host.breaker_tx_pkt: 0 >dev.em.1.host.host_tx_pkt_discard: 0 >dev.em.1.host.rx_pkt: 0 >dev.em.1.host.breaker_rx_pkts: 0 >dev.em.1.host.breaker_rx_pkt_drop: 0 >dev.em.1.host.tx_good_pkt: 0 >dev.em.1.host.breaker_tx_pkt_drop: 0 >dev.em.1.host.rx_good_bytes: 0 >dev.em.1.host.tx_good_bytes: 0 >dev.em.1.host.length_errors: 0 >dev.em.1.host.serdes_violation_pkt: 0 >dev.em.1.host.header_redir_missed: 0 > >ifconfig down/up just panics or locks up the box when its in this >state. I also have IPMI enabled on this nic, but it shows the same >issue with it disabled. 
> > ---Mike > > > >-------------------------------------------------------------------- >Mike Tancsa, tel +1 519 651 3400 >Sentex Communications, >mike@sentex.net >Providing Internet since >1994 ><http://www.sentex.net>www.sentex.net >Cambridge, Ontario >Canada ><http://www.sentex.net/mike>www.sentex.net/mike > > >-------------------------------------------------------------------- >Mike Tancsa, tel +1 519 651 3400 >Sentex >Communications, >mike@sentex.net >Providing Internet since >1994 www.sentex.net >Cambridge, Ontario >Canada www.sentex.net/mike > -------------------------------------------------------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet since 1994 www.sentex.net Cambridge, Ontario Canada www.sentex.net/mike From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 13:59:47 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4225B1065679 for ; Thu, 30 Sep 2010 13:59:47 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id ABEC78FC1B for ; Thu, 30 Sep 2010 13:59:46 +0000 (UTC) Received: from elsa.codelab.cz (localhost.codelab.cz [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 5E20819E030; Thu, 30 Sep 2010 15:59:45 +0200 (CEST) Received: from [192.168.1.2] (ip-86-49-61-235.net.upcbroadband.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id AEC8919E02D; Thu, 30 Sep 2010 15:59:41 +0200 (CEST) Message-ID: <4CA497CD.1020609@quip.cz> Date: Thu, 30 Sep 2010 15:59:41 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1.13) Gecko/20100914 SeaMonkey/2.0.8 MIME-Version: 1.0 To: Jeremy Chadwick References: 
<4CA22FF0.8060303@quip.cz> <20100928184343.GA70384@icarus.home.lan> <4CA25718.2000101@quip.cz> <20100929072316.GA82514@icarus.home.lan> In-Reply-To: <20100929072316.GA82514@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: fetch: Non-recoverable resolver failure X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 13:59:47 -0000 Jeremy Chadwick wrote: > On Tue, Sep 28, 2010 at 10:59:04PM +0200, Miroslav Lachman wrote: >> Jeremy Chadwick wrote: >>> On Tue, Sep 28, 2010 at 08:12:00PM +0200, Miroslav Lachman wrote: >>>> Hi, >>>> >>>> we are using fetch command from cron to run PHP scripts periodically >>>> and sometimes cron sends error e-mails like this: >>>> >>>> fetch: https://hiden.example.com/cron/fiveminutes: Non-recoverable >>>> resolver failure >> >> [...] >> >>>> Note: target domains are hosted on the server it-self and named too. >>>> >>>> The system is FreeBSD 7.3-RELEASE-p2 i386 GENERIC >>>> >>>> Can somebody help me to diagnose this random fetch+resolver issue? [...] 
>> There is PF with some basic rules, mostly blocking incoming
>> packets, allowing all outgoing and scrubbing:
>>
>> scrub in on bge1 all fragment reassemble
>> scrub out on bge1 all no-df random-id min-ttl 24 max-mss 1492
>> fragment reassemble
>>
>> pass out on bge1 inet proto udp all keep state
>> pass out on bge1 inet proto tcp from 1.2.3.40 to any flags S/SA
>> modulate state
>> pass out on bge1 inet proto tcp from 1.2.3.41 to any flags S/SA
>> modulate state
>> pass out on bge1 inet proto tcp from 1.2.3.42 to any flags S/SA
>> modulate state
>>
>> modified PF options:
>>
>> set timeout { frag 15, interval 5 }
>> set limit { frags 2500, states 5000 }
>> set optimization aggressive
>> set block-policy drop
>> set loginterface bge1
>> # Let loopback and internal interface traffic flow without restrictions
>> set skip on lo0
>
> Please also provide "pfctl -s info" output, in addition to uname -a
> output (you can hide the hostname), since the pf stack differs depending
> on what FreeBSD version you're using.
# pfctl -s info
No ALTQ support in kernel
ALTQ related functions disabled
Status: Enabled for 32 days 11:31:02          Debug: Urgent

Interface Stats for bge1              IPv4             IPv6
  Bytes In                     37064314787                0
  Bytes Out                   279633869976                0
  Packets In
    Passed                       214057477                0
    Blocked                        1180125                0
  Packets Out
    Passed                       272266744                0
    Blocked                         128777                0

State Table                          Total             Rate
  current entries                      181
  searches                       518860439          184.9/s
  inserts                         16608172            5.9/s
  removals                        16607991            5.9/s
Counters
  match                           17951131            6.4/s
  bad-offset                             0            0.0/s
  fragment                              23            0.0/s
  short                                  0            0.0/s
  normalize                              4            0.0/s
  memory                                 0            0.0/s
  bad-timestamp                          0            0.0/s
  congestion                             0            0.0/s
  ip-option                              0            0.0/s
  proto-cksum                         3095            0.0/s
  state-mismatch                     16707            0.0/s
  state-insert                           0            0.0/s
  state-limit                            0            0.0/s
  src-limit                              0            0.0/s
  synproxy                               0            0.0/s

uname: 7.3-RELEASE-p2 FreeBSD 7.3-RELEASE-p2 #0: Mon Jul 12 19:04:04 UTC 2010
root@i386-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC i386

> Things that catch my eye as potential problems -- I don't have a way to
> confirm these are responsible for your issue (DNS resolver lookups are
> UDP-based, not TCP), but I want to point them out anyway.
>
> 1) "modulate state" is broken on FreeBSD. Taken from our pf.conf notes:
>
> # Filtering (public interface only; see "set skip")
> #
> # NOTE: Do not use "modulate state", as it's known to be broken on FreeBSD.
> # http://lists.freebsd.org/pipermail/freebsd-pf/2008-March/004227.html
>
> 2) "optimization aggressive" sounds dangerous given what pf.conf(5) says
> about it. I'd like to know what it considers "idle".
>
> 3) I would also remove many of the options you have set in your "scrub
> out" rule. Starting with a clean slate to see if things improve is
> probably a good idea. As you'll see below, sometimes pf does things
> which may be correct per IP specification but don't work quite right
> with other vendors' IP stacks.
>
> 4) Your "set timeout" values look to be extreme. I would recommend
> leaving these at their defaults given your situation.
>
> 5) This feature is not in use in your pf.conf, but I want to point out
> regardless. "reassemble tcp" is also broken in some way. Again taken
> from our pf.conf notes:
>
> # Normalization -- resolve/reduce traffic ambiguities.
> #
> # NOTE: Do NOT use 'reassemble tcp' as it definitely causes breakage.
> # Issue may be related to other vendors' IP stacks, so let's leave it
> # disabled.

Thank you for all your hints about PF! Maybe it's time to consider
refactoring our standard pf.conf, which was written years ago...

The original problem seems to be a problem with how the resolver on
FreeBSD 7.3 works. This machine was upgraded from 7.2 a few weeks ago and
we did not have this problem before.

I added '|| dig hiden.example.com' to the crontab so I get dig output in
the case of fetch failure:

*/5 * * * * fetch -qo /dev/null "https://hiden.example.com/cron/fiveminutes" || dig hiden.example.com

The domain has its TTL set to 360 seconds, and each fetch "Non-recoverable
resolver failure" occurs exactly at the time when the TTL has expired and
a new query to the authoritative nameservers must be made:

; <<>> DiG 9.4.-ESV <<>> hiden.example.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30191
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 0

;; QUESTION SECTION:
;hiden.example.com. IN A

;; ANSWER SECTION:
hiden.example.com. 360 IN CNAME server.example.com.
server.example.com. 360 IN A 1.2.3.49

;; AUTHORITY SECTION:
example.com. 224 IN NS ns1.ignum.com.
example.com. 224 IN NS ns2.ignum.cz.

;; Query time: 395 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Sep 30 11:30:16 2010
;; MSG SIZE rcvd: 135

Note: real domains and IPs were replaced with example.com / 1.2.3.49

I made a simple script to run dig queries for the affected domains every
3 minutes from cron, logging to a file.
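[Editorial illustration, not part of the original message.] A periodic dig check with logging, of the kind described here, might look roughly like the sketch below. It is not the script the poster used: the function names, the log path and the 30-second timeout are all assumptions made for illustration. It only relies on the `status:` field that dig prints in its header line, as seen in the output above.

```python
import re
import subprocess
import time

# dig prints e.g. ";; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30191"
STATUS_RE = re.compile(r"status: (\w+)")


def dig_status(output):
    """Extract the response status from dig output, or NO-RESPONSE if absent."""
    match = STATUS_RE.search(output)
    return match.group(1) if match else "NO-RESPONSE"


def log_line(domain, output, now=None):
    """Build one timestamped log line for a single dig run."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    return f"{stamp} {domain} {dig_status(output)}"


def check(domain, logfile):
    """Run dig once for `domain` and append the result to `logfile`.

    Hypothetical helper: assumes a `dig` binary is on PATH.
    """
    result = subprocess.run(
        ["dig", domain], capture_output=True, text=True, timeout=30
    )
    with open(logfile, "a") as handle:
        handle.write(log_line(domain, result.stdout) + "\n")
```

Calling something like `check("hiden.example.com", "/var/log/dig-check.log")` from a cron entry every few minutes would then leave one status line per run, which can be lined up against the times of the fetch failures.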
The script has been in use for one day and did not log any error response
(resolving with the dig command works fine), and we got only one
occurrence of the fetch "Non-recoverable resolver failure", at the time
when the cached DNS entry expired (the one above). By coincidence, the dig
query from the script was made at the same time as the fetch job. The same
DNS answer was e-mailed from cron and logged to the file by the script.

So my thought is that the DNS cache server (locally running BIND) is
working fine, and the authoritative nameservers too, but resolving the
domain for the first time and passing the reply to fetch fails for an
unknown reason.

I will try curl or wget instead of fetch to see whether the symptoms
persist.

Miroslav Lachman

From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 14:13:01 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BF0701065672; Thu, 30 Sep 2010 14:13:01 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 8CF038FC19; Thu, 30 Sep 2010 14:13:01 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 2A67A46B66; Thu, 30 Sep 2010 10:13:01 -0400 (EDT) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id EC0DC8A04E; Thu, 30 Sep 2010 10:12:56 -0400 (EDT) From: John Baldwin To: freebsd-stable@freebsd.org Date: Thu, 30 Sep 2010 09:37:03 -0400 User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; ) References: <20100224165203.GA10423@zod.isi.edu> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> In-Reply-To: <4CA2488D.7000101@gmail.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: 
<201009300937.03434.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Thu, 30 Sep 2010 10:12:57 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: Ted Faber , Vitaly Magerya , Jung-uk Kim , Ian Smith Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 14:13:01 -0000 On Tuesday, September 28, 2010 3:57:01 pm Vitaly Magerya wrote: > Jung-uk Kim wrote: > >> - the mouse doesn't work until I restart moused manually > > > > I always use hint.psm.0.flags="0x6000" in /boot/loader.conf, i.e., > > turn on both HOOKRESUME and INITAFTERSUSPEND, to work around similar > > problem on different laptop. > > Yes, that helps (after the stall period). > > > Can you please report other problems in the appropriate ML? > > > > em -> freebsd-net@ > > usb -> freebsd-usb@ > > acpi_ec -> freebsd-acpi@ > > I will try to do so. > > I'm not sure about acpi_ec issue though; it's only a warning, and it > doesn't cause me any troubles. > > I also have this kernel message once in a few hours (seemingly random) > if I used sleep/resume before: > > MCA: Bank 1, Status 0xe2000000000001f5 > MCA: Global Cap 0x0000000000000005, Status 0x0000000000000000 > MCA: Vendor "GenuineIntel", ID 0x695, APIC ID 0 > MCA: CPU 0 UNCOR PCC OVER DCACHE L1 ??? error > > But once again, it doesn't really cause any problems. A true uncorrected machine check would trigger a MC# fault and panic. I think this is just garbage in the MCx banks. Are you running the latest 8-stable? 
The change to reset the banks on resume was MFC'd in r210509 on July 26. -- John Baldwin From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 14:13:03 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 988F51065675 for ; Thu, 30 Sep 2010 14:13:03 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 68D688FC1B for ; Thu, 30 Sep 2010 14:13:03 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 1C7EE46B6C; Thu, 30 Sep 2010 10:13:03 -0400 (EDT) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 70FA48A04F; Thu, 30 Sep 2010 10:13:01 -0400 (EDT) From: John Baldwin To: freebsd-stable@freebsd.org Date: Thu, 30 Sep 2010 09:40:43 -0400 User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; ) References: In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201009300940.43136.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Thu, 30 Sep 2010 10:13:02 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: Adam Vande More Subject: Re: MCA messages in dmesg X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 14:13:03 -0000 On Thursday, September 30, 2010 2:49:24 
am Adam Vande More wrote: > For awhile now, my home server has been acting up. Actually it had a bad > set of RAM long ago, replaced and it and worked fine. It's been weird again > now, and I've found this in dmesg: > > MCA: Bank 0, Status 0xf200000000000800 > MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000 > MCA: Vendor "GenuineIntel", ID 0x6fb, APIC ID 2 > MCA: CPU 2 UNCOR PCC OVER BUSL0 Source ERR Memory > MCA: Bank 0, Status 0xf200000000000800 > MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000 > MCA: Vendor "GenuineIntel", ID 0x6fb, APIC ID 3 > MCA: CPU 3 UNCOR PCC OVER BUSL0 Source ERR Memory Are you getting a panic when this happens? -- John Baldwin From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 14:55:03 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0D7251065670; Thu, 30 Sep 2010 14:55:03 +0000 (UTC) (envelope-from vmagerya@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 601968FC0C; Thu, 30 Sep 2010 14:55:02 +0000 (UTC) Received: by bwz15 with SMTP id 15so1875324bwz.13 for ; Thu, 30 Sep 2010 07:55:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=V2zef8Se8AjJuljOAU9jKog39DS4jx9HJ5gKa3h0NMk=; b=eRu3MFPG598QvSG+vwtpxknQIdOovvDmXMx408fJntOE7budGdmR/sw4R9RwDwvKqZ HSzed49J3+86+iUfBHzxDvbf4KoMIXdz97PpfcAt+CatwopQ1Qq5eEytgfV4TpW+7y+Z 5AmOLG7VE65sO/azE5djbeA3Olc4VdjH3Uf8s= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; 
b=ckP9vU5LqCG0ayhWooAI0RD+vOXwSZliQwllWc8QZbIXi6BIotShy7zge/J2h8WpoQ lnpjXMSgW8ehOoNojKF1/Gkrb2WciA3JmCG97y42KXFfDDOq3J1RlHjQQrEZzHcDSf3g MUhCasQW4wpQy79I+LgNFOmtJOHD8U/umD7Ys= Received: by 10.204.117.13 with SMTP id o13mr2849339bkq.48.1285858501255; Thu, 30 Sep 2010 07:55:01 -0700 (PDT) Received: from [172.16.0.6] (tx97.net [85.198.160.156]) by mx.google.com with ESMTPS id g12sm7829535bkb.14.2010.09.30.07.54.57 (version=TLSv1/SSLv3 cipher=RC4-MD5); Thu, 30 Sep 2010 07:54:58 -0700 (PDT) Message-ID: <4CA4A45A.6010809@gmail.com> Date: Thu, 30 Sep 2010 17:53:14 +0300 From: Vitaly Magerya User-Agent: Thunderbird MIME-Version: 1.0 To: John Baldwin References: <20100224165203.GA10423@zod.isi.edu> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> <201009300937.03434.jhb@freebsd.org> In-Reply-To: <201009300937.03434.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Ted Faber , freebsd-stable@freebsd.org, Ian Smith , Jung-uk Kim Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 14:55:03 -0000 John Baldwin wrote: > A true uncorrected machine check would trigger a MC# fault and panic. I think > this is just garbage in the MCx banks. Are you running the latest 8-stable? No, 8.1-RELEASE. 
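[Editorial illustration, not part of the original messages.] The MCA status words quoted in this thread can be checked by hand against the flags the kernel prints. The sketch below assumes the flag-bit layout of the IA32_MCi_STATUS register as documented in the Intel SDM (VAL in bit 63 down to PCC in bit 57, error code in the low 16 bits). The function names are hypothetical and it makes no attempt to interpret the error code itself.

```python
# Flag bits in the upper part of an IA32_MCi_STATUS value
# (assumed layout, taken from the Intel SDM machine-check chapter).
MCI_STATUS_FLAGS = [
    (63, "VAL"),    # register holds a valid error record
    (62, "OVER"),   # an earlier record was overwritten (overflow)
    (61, "UC"),     # error was not corrected by hardware
    (60, "EN"),     # reporting of this error was enabled
    (59, "MISCV"),  # MCi_MISC holds additional information
    (58, "ADDRV"),  # MCi_ADDR holds a valid address
    (57, "PCC"),    # processor context may be corrupt
]


def decode_mci_status(status):
    """Return the names of the flag bits set in a 64-bit MCi_STATUS value."""
    return [name for bit, name in MCI_STATUS_FLAGS if (status >> bit) & 1]


def mca_error_code(status):
    """Return the 16-bit MCA error code held in the low bits of the status."""
    return status & 0xFFFF
```

For the two values posted in this thread, `decode_mci_status(0xe2000000000001f5)` gives VAL, OVER, UC and PCC, and `decode_mci_status(0xf200000000000800)` gives the same flags plus EN, which agrees with the "UNCOR PCC OVER" text in the kernel messages.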
From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 16:33:27 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E232F1065670 for ; Thu, 30 Sep 2010 16:33:26 +0000 (UTC) (envelope-from amvandemore@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 3A5CC8FC12 for ; Thu, 30 Sep 2010 16:33:25 +0000 (UTC) Received: by fxm9 with SMTP id 9so1833970fxm.13 for ; Thu, 30 Sep 2010 09:33:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=oD+EMpo8S72DhzbKckLqss+62Lc4K6LNHlveg5Or92w=; b=KPIVXUSEjOunoCO+lCLG56+E5bqpF3LJfGHIihLsZBwENAhWWM+88Kl4YTDiWEKMPy zWopLbytiUDRbBNyuqn3eBAgvQzvE/GbT7kxAdsJNzJf4cQTMOfLb/3mxm5UpIPodjtS La28umc/btfO7dkokWnM9jjYPuMY922vzN9SY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=euZw9RCUaHcqYIefQ5LzFebzkbmSVpzwx+FqewQpJI/EODfeEJFmytsM3F/Ac8QkQu EdZPpiJKmIC1DTiUeQxQ/k5z76Q48ZMAmyHU6xvpI4gIdyOcgRojNRKu3f3f+dHipD7u dSolKSFpiJoes30LL79KyzpZpnj3DCOCnyRgA= MIME-Version: 1.0 Received: by 10.223.124.148 with SMTP id u20mr4054233far.57.1285864404439; Thu, 30 Sep 2010 09:33:24 -0700 (PDT) Received: by 10.223.120.139 with HTTP; Thu, 30 Sep 2010 09:33:24 -0700 (PDT) In-Reply-To: <201009300940.43136.jhb@freebsd.org> References: <201009300940.43136.jhb@freebsd.org> Date: Thu, 30 Sep 2010 11:33:24 -0500 Message-ID: From: Adam Vande More To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-stable@freebsd.org Subject: Re: MCA messages in dmesg X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: 
Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 16:33:27 -0000

On Thu, Sep 30, 2010 at 8:40 AM, John Baldwin wrote:

> On Thursday, September 30, 2010 2:49:24 am Adam Vande More wrote:
> > For awhile now, my home server has been acting up. Actually it had a bad
> > set of RAM long ago, replaced and it and worked fine. It's been weird again
> > now, and I've found this in dmesg:
> >
> > MCA: Bank 0, Status 0xf200000000000800
> > MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
> > MCA: Vendor "GenuineIntel", ID 0x6fb, APIC ID 2
> > MCA: CPU 2 UNCOR PCC OVER BUSL0 Source ERR Memory
> > MCA: Bank 0, Status 0xf200000000000800
> > MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
> > MCA: Vendor "GenuineIntel", ID 0x6fb, APIC ID 3
> > MCA: CPU 3 UNCOR PCC OVER BUSL0 Source ERR Memory
>
> Are you getting a panic when this happens?
>

Its symptoms vary, but yes, I think so. The box is headless, so I depend on
logs after boot to see what happens. Sometimes the box panics and powers
off with no warning, and other times it just seems to hit a stall state
where everything becomes unresponsive and I have to manually power off.
-- Adam Vande More From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 17:25:16 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E22101065756; Thu, 30 Sep 2010 17:25:16 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id B528F8FC1C; Thu, 30 Sep 2010 17:25:16 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 6A2E046B2C; Thu, 30 Sep 2010 13:25:16 -0400 (EDT) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 91E768A03C; Thu, 30 Sep 2010 13:25:15 -0400 (EDT) From: John Baldwin To: freebsd-stable@freebsd.org Date: Thu, 30 Sep 2010 13:23:23 -0400 User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; ) References: <20100224165203.GA10423@zod.isi.edu> <201009300937.03434.jhb@freebsd.org> <4CA4A45A.6010809@gmail.com> In-Reply-To: <4CA4A45A.6010809@gmail.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201009301323.23381.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Thu, 30 Sep 2010 13:25:15 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: Ted Faber , Vitaly Magerya , Jung-uk Kim , Ian Smith Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 17:25:17 -0000 On Thursday, September 30, 2010 10:53:14 am Vitaly Magerya wrote: > John Baldwin wrote: > > A true uncorrected machine check would trigger a MC# fault and panic. I think > > this is just garbage in the MCx banks. Are you running the latest 8-stable? > > No, 8.1-RELEASE. Ok, that almost certainly explains it then. -- John Baldwin From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 17:25:20 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3435F106566B for ; Thu, 30 Sep 2010 17:25:20 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 025068FC0C for ; Thu, 30 Sep 2010 17:25:20 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id 97D9846B2E; Thu, 30 Sep 2010 13:25:19 -0400 (EDT) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 671E48A04E; Thu, 30 Sep 2010 13:25:16 -0400 (EDT) From: John Baldwin To: freebsd-stable@freebsd.org Date: Thu, 30 Sep 2010 13:25:15 -0400 User-Agent: KMail/1.13.5 (FreeBSD/7.3-CBSD-20100819; KDE/4.4.5; amd64; ; ) References: <201009300940.43136.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201009301325.15113.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Thu, 30 Sep 2010 13:25:18 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.6 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: 
Adam Vande More Subject: Re: MCA messages in dmesg X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 17:25:20 -0000

On Thursday, September 30, 2010 12:33:24 pm Adam Vande More wrote:
> On Thu, Sep 30, 2010 at 8:40 AM, John Baldwin wrote:
>
> > On Thursday, September 30, 2010 2:49:24 am Adam Vande More wrote:
> > > For awhile now, my home server has been acting up. Actually it had a bad
> > > set of RAM long ago, replaced and it and worked fine. It's been weird
> > again
> > > now, and I've found this in dmesg:
> > >
> > > MCA: Bank 0, Status 0xf200000000000800
> > > MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
> > > MCA: Vendor "GenuineIntel", ID 0x6fb, APIC ID 2
> > > MCA: CPU 2 UNCOR PCC OVER BUSL0 Source ERR Memory
> > > MCA: Bank 0, Status 0xf200000000000800
> > > MCA: Global Cap 0x0000000000000806, Status 0x0000000000000000
> > > MCA: Vendor "GenuineIntel", ID 0x6fb, APIC ID 3
> > > MCA: CPU 3 UNCOR PCC OVER BUSL0 Source ERR Memory
> >
> > Are you getting a panic when this happens?
> >
>
> It's symptoms vary, but yes I think so. The box is headless, so I depend on
> logs after boot to see what happens. Sometimes the box panics and powers
> off with no warning, and other times it just seems to hit a stall state
> where everything become unresponsive and I have to manually power off.

Ok, it is a memory error of some sort, but mcelog claims it is a transaction
timeout rather than an ECC error, per se:

HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 2 BANK 0
MCG status:
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: BUS Level-0 Local-CPU-originated-request Generic Memory-access Request-timeout Error
BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
STATUS f200000000000800 MCGSTATUS 0
MCGCAP 806 APICID 2 SOCKETID 0
CPUID Vendor Intel Family 6 Model 15
HARDWARE ERROR. This is *NOT* a software problem!
Please contact your hardware vendor
CPU 3 BANK 0
MCG status:
MCi status:
Error overflow
Uncorrected error
Error enabled
Processor context corrupt
MCA: BUS Level-0 Local-CPU-originated-request Generic Memory-access Request-timeout Error
BQ_DCU_READ_TYPE BQ_ERR_HARD_TYPE BQ_ERR_HARD_TYPE
STATUS f200000000000800 MCGSTATUS 0
MCGCAP 806 APICID 3 SOCKETID 0
CPUID Vendor Intel Family 6 Model 15

I've no idea what specific hardware is busted (memory or motherboard or CPU),
but I suspect something is likely broken.

--
John Baldwin

From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 18:45:37 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3074B106566C; Thu, 30 Sep 2010 18:45:37 +0000 (UTC) (envelope-from amvandemore@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8EE008FC08; Thu, 30 Sep 2010 18:45:36 +0000 (UTC) Received: by fxm9 with SMTP id 9so1997037fxm.13 for ; Thu, 30 Sep 2010 11:45:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=cH0gpt0NVscu96qXUXadzBHcGVkxxMPvuv+ZLDOEshs=; b=T7bMZUkE9QUmIOXyxjYV1dBmggevt3psM1vzm906I1xEhdHWyWsP8E4apnRbB4Kg9i deDWpxWPFixZs5oKC656WDCKep+W4+ESFlwQHKNWImkNHw059x9KT5SL2HNs+EeJ9izo govTJFU3jnXqEL9mVUNTILYFP1Aqn17bRYjEs= 
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=PDbxNPZ2RZL5ZrlgiZqN+QAOdZexrPqEMR7bBSHONHpxd35MCWGwHiXl2hAOjMC1fI GLxTxSMb+sIixi9uTlVc4SMwtK//UkcxN87bHz48BIsAJikJSU/rf8Y9DD7tDaiUhuS8 4YcAKT57UHMDRJ4iS7hDuy1OVstEMYbrBRsHU= MIME-Version: 1.0 Received: by 10.223.114.69 with SMTP id d5mr2561700faq.58.1285872334979; Thu, 30 Sep 2010 11:45:34 -0700 (PDT) Received: by 10.223.120.139 with HTTP; Thu, 30 Sep 2010 11:45:34 -0700 (PDT) In-Reply-To: <201009301325.15113.jhb@freebsd.org> References: <201009300940.43136.jhb@freebsd.org> <201009301325.15113.jhb@freebsd.org> Date: Thu, 30 Sep 2010 13:45:34 -0500 Message-ID: From: Adam Vande More To: John Baldwin Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-stable@freebsd.org Subject: Re: MCA messages in dmesg X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 18:45:37 -0000 On Thu, Sep 30, 2010 at 12:25 PM, John Baldwin wrote: > Ok, it is a memory error of some sort, but mcelog claims it is a > transaction > timeout rather than an ECC error, per se: > > > I've no idea what specific hardware is busted (memory or motherboard or > CPU), > but I suspect something is likely broken. > Thanks for looking into it, I'm going to play around with BIOS voltages to see if I can achieve some stability since I don't have much to lose trying that first. The system may work fine for a week or more, then have a really bad day. I've made some raises to the cpu voltage and we'll see how that goes. 
-- Adam Vande More From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 18:57:54 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1CDEE1065782 for ; Thu, 30 Sep 2010 18:57:54 +0000 (UTC) (envelope-from luke@digital-crocus.com) Received: from mail.digital-crocus.com (node2.digital-crocus.com [91.209.244.128]) by mx1.freebsd.org (Postfix) with ESMTP id C87F18FC1E for ; Thu, 30 Sep 2010 18:57:53 +0000 (UTC) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=dkselector; d=hybrid-logic.co.uk; h=Received:Received:Subject:From:Reply-To:To:Cc:In-Reply-To:References:Content-Type:Organization:Date:Message-ID:Mime-Version:X-Mailer:Content-Transfer-Encoding:X-Spam-Score:X-Digital-Crocus-Maillimit:X-Authenticated-Sender:X-Complaints:X-Admin:X-Abuse; b=q7gr51cDFPn6rN+cYc3+rDOs468LYVBnbnZkjuYA+L0V31tkxZLNtUo1btdER+5GRme+p5zX0b4GLo7vFLzc6kaML+WpScROf778XEvTI9FgeO3cgss7iLDGL2S+2ZKq; Received: from luke by mail.digital-crocus.com with local (Exim 4.69 (FreeBSD)) (envelope-from ) id 1P1OFc-0007Uz-31 for stable@freebsd.org; Thu, 30 Sep 2010 19:53:04 +0100 Received: from 127cr.net ([78.105.122.99] helo=[192.168.1.22]) by mail.digital-crocus.com with esmtpa (Exim 4.69 (FreeBSD)) (envelope-from ) id 1P1OFb-0007Ur-On; Thu, 30 Sep 2010 19:53:04 +0100 From: Luke Marsden To: stable@freebsd.org In-Reply-To: <1285601367.31122.909.camel@pow> References: <1285587910.31122.633.camel@pow> <4CA08D0D.4030406@icyb.net.ua> <1285595631.31122.809.camel@pow> <1285601367.31122.909.camel@pow> Content-Type: text/plain; charset="UTF-8" Organization: Hybrid Web Cluster Date: Thu, 30 Sep 2010 19:57:51 +0100 Message-ID: <1285873071.21063.786.camel@pow> Mime-Version: 1.0 X-Mailer: Evolution 2.28.3 Content-Transfer-Encoding: 7bit X-Spam-Score: -1.0 X-Digital-Crocus-Maillimit: done X-Authenticated-Sender: luke X-Complaints: abuse@digital-crocus.com X-Admin: admin@digital-crocus.com X-Abuse: 
abuse@digital-crocus.com (Please include full headers in abuse reports) Cc: team@hybrid-logic.co.uk, support@elastichosts.com Subject: Re: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: luke@hybrid-logic.co.uk List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 18:57:54 -0000

Hi FreeBSD-stable,

> > 1. Please, build your kernel with debug symbols.
> > 2. Show kgdb output

I could not convince the kernel to dump (it was looping forever but not panicking), but I have managed to compile a kernel with debugging symbols and DDB, which immediately drops into the debugger when the problem occurs; see the screenshot at:

http://lukemarsden.net/kvm-panic.png

Progress, I sense.

I tried typing 'panic' on the understanding that this should force a panic and cause it to dump core to the configured swap device (I have set dump* in /etc/rc.conf) so that I could get you the kgdb output, but it just looped back into the debugger. This issue seems to occur very early in the boot process.

I would like to invite anyone with the skills and the inclination to have a poke around with this directly over VNC to email me off-list, and I will turn on the VM and send you the VNC credentials. My email address is:

luke [at] hybrid-logic.co.uk

Or you can catch me on Skype at luke.marsden. I'm in GMT+1.

I look forward to hearing from you ;-)

--
Best Regards,
Luke Marsden
Hybrid Logic Ltd.
Web: http://www.hybrid-cluster.com/ Hybrid Web Cluster - cloud web hosting based on FreeBSD and ZFS Mobile: +447791750420 From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 22:55:59 2010 Return-Path: Delivered-To: freebsd-stable@FreeBSD.org Received: from [127.0.0.1] (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by hub.freebsd.org (Postfix) with ESMTP id C81D7106564A; Thu, 30 Sep 2010 22:55:59 +0000 (UTC) (envelope-from jkim@FreeBSD.org) From: Jung-uk Kim To: freebsd-stable@FreeBSD.org, luke@hybrid-logic.co.uk Date: Thu, 30 Sep 2010 18:55:50 -0400 User-Agent: KMail/1.6.2 References: <1285587910.31122.633.camel@pow> <1285601367.31122.909.camel@pow> <1285873071.21063.786.camel@pow> In-Reply-To: <1285873071.21063.786.camel@pow> MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201009301855.51841.jkim@FreeBSD.org> Cc: team@hybrid-logic.co.uk, support@elastichosts.com Subject: Re: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 22:56:00 -0000 On Thursday 30 September 2010 02:57 pm, Luke Marsden wrote: > Hi FreeBSD-stable, > > > > 1. Please, build your kernel with debug symbols. > > > 2. Show kgdb output > > I could not convince the kernel to dump (it was looping forever but > not panicing), but I have managed to compiled a kernel with > debugging symbols and DDB which immediately drops into the debugger > when the problem occurs, see screenshot at: > > http://lukemarsden.net/kvm-panic.png It seems MCA capability is advertised by the CPUID translator but writing to the MSRs causes GPF. In other words, it seems like a CPU emulator bug. A simple workaround is 'set hw.mca.enabled=0' from the loader prompt. 
If it works, add hw.mca.enabled="0" in /boot/loader.conf to make it permanent. MCA does not make any sense in emulation any way. Jung-uk Kim From owner-freebsd-stable@FreeBSD.ORG Thu Sep 30 23:19:04 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7A09C106564A for ; Thu, 30 Sep 2010 23:19:04 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta07.westchester.pa.mail.comcast.net (qmta07.westchester.pa.mail.comcast.net [76.96.62.64]) by mx1.freebsd.org (Postfix) with ESMTP id 22F478FC16 for ; Thu, 30 Sep 2010 23:19:03 +0000 (UTC) Received: from omta21.westchester.pa.mail.comcast.net ([76.96.62.72]) by qmta07.westchester.pa.mail.comcast.net with comcast id D6yB1f0071ZXKqc57BK4Rp; Thu, 30 Sep 2010 23:19:04 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta21.westchester.pa.mail.comcast.net with comcast id DBK21f00W3LrwQ23hBK3Ya; Thu, 30 Sep 2010 23:19:04 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 59C929B418; Thu, 30 Sep 2010 16:19:01 -0700 (PDT) Date: Thu, 30 Sep 2010 16:19:01 -0700 From: Jeremy Chadwick To: luke@hybrid-logic.co.uk Message-ID: <20100930231901.GA30388@icarus.home.lan> References: <1285587910.31122.633.camel@pow> <4CA08D0D.4030406@icyb.net.ua> <1285595631.31122.809.camel@pow> <1285601367.31122.909.camel@pow> <1285873071.21063.786.camel@pow> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1285873071.21063.786.camel@pow> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: stable@freebsd.org, team@hybrid-logic.co.uk, support@elastichosts.com Subject: Re: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 30 Sep 2010 23:19:04 
-0000 On Thu, Sep 30, 2010 at 07:57:51PM +0100, Luke Marsden wrote: > Hi FreeBSD-stable, > > > > 1. Please, build your kernel with debug symbols. > > > 2. Show kgdb output > > I could not convince the kernel to dump (it was looping forever but not > panicing), but I have managed to compiled a kernel with debugging > symbols and DDB which immediately drops into the debugger when the > problem occurs, see screenshot at: > > http://lukemarsden.net/kvm-panic.png > > Progress, I sense. > > I tried typing 'panic' on the understanding that this should force a > panic and cause it would dump core to the configured swap device (I have > set dump* in /etc/rc.conf) so that I could get you the kgdb output, but > it just looped back into the debugger. Try "call doadump" instead. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 00:35:01 2010 Return-Path: Delivered-To: freebsd-stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1C50A1065672 for ; Fri, 1 Oct 2010 00:35:01 +0000 (UTC) (envelope-from paul@gromit.dlib.vt.edu) Received: from lennier.cc.vt.edu (lennier.cc.vt.edu [198.82.162.213]) by mx1.freebsd.org (Postfix) with ESMTP id C519E8FC12 for ; Fri, 1 Oct 2010 00:35:00 +0000 (UTC) Received: from dagger.cc.vt.edu (dagger.cc.vt.edu [198.82.163.114]) by lennier.cc.vt.edu (8.13.8/8.13.8) with ESMTP id o8UCr7cQ018814; Thu, 30 Sep 2010 08:53:07 -0400 Received: from auth3.smtp.vt.edu (EHLO auth3.smtp.vt.edu) ([198.82.161.152]) by dagger.cc.vt.edu (MOS 4.1.8-GA FastPath queued) with ESMTP id MJI86570; Thu, 30 Sep 2010 08:53:07 -0400 (EDT) Received: from gromit.tower.lib.vt.edu (gromit.tower.lib.vt.edu [128.173.51.22]) (authenticated bits=0) by auth3.smtp.vt.edu (8.13.8/8.13.8) with ESMTP id 
o8UCr7oK008090 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Thu, 30 Sep 2010 08:53:07 -0400 Mime-Version: 1.0 (Apple Message framework v1081) Content-Type: text/plain; charset=us-ascii From: Paul Mather In-Reply-To: <20100930075616.GA11519@icarus.home.lan> Date: Thu, 30 Sep 2010 08:53:07 -0400 Content-Transfer-Encoding: quoted-printable Message-Id: References: <20100930065151.GA9634@icarus.home.lan> <20100930070333.GU87427@hoeg.nl> <20100930072819.GA10678@icarus.home.lan> <4CA43C91.5040000@FreeBSD.org> <20100930075616.GA11519@icarus.home.lan> To: Jeremy Chadwick X-Mailer: Apple Mail (2.1081) X-Mirapoint-Received-SPF: 198.82.161.152 auth3.smtp.vt.edu paul@gromit.dlib.vt.edu 5 none X-Mirapoint-IP-Reputation: reputation=neutral-1, source=Fixed, refid=n/a, actions=MAILHURDLE SPF TAG X-Junkmail-Status: score=10/50, host=dagger.cc.vt.edu X-Junkmail-SD-Raw: score=unknown, refid=str=0001.0A020203.4CA48833.028E,ss=1,fgs=0, ip=0.0.0.0, so=2009-09-22 00:05:22, dmn=2009-09-10 00:05:08, mode=single engine X-Junkmail-IWF: false Cc: Ed Schouten , freebsd-stable@FreeBSD.org, Alex Dupre Subject: Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 00:35:01 -0000 On Sep 30, 2010, at 3:56 AM, Jeremy Chadwick wrote: > The diff is pretty obvious/simple (2 line change), so the other > databases/mysqlXX-server ports can be upgraded in the same manner. 
>
> --- files/mysql-server.sh.in.orig 2010-03-27 03:24:53.000000000 -0700
> +++ files/mysql-server.sh.in 2010-09-30 00:45:38.000000000 -0700
> @@ -35,8 +35,8 @@
> mysql_user="mysql"
> mysql_limits_args="-e -U ${mysql_user}"
> pidfile="${mysql_dbdir}/`/bin/hostname`.pid"
> -command="%%PREFIX%%/bin/mysqld_safe"
> -command_args="--defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args} > /dev/null 2>&1 &"
> +command="/usr/sbin/daemon"
> +command_args="-c -f /usr/local/bin/mysqld_safe --defaults-extra-file=${mysql_dbdir}/my.cnf --user=${mysql_user} --datadir=${mysql_dbdir} --pid-file=${pidfile} ${mysql_args}"

Shouldn't this be "-c -f %%PREFIX%%/bin/mysqld_safe ..." rather than hard-coding /usr/local?

> procname="%%PREFIX%%/libexec/mysqld"
> start_precmd="${name}_prestart"
> start_postcmd="${name}_poststart"
>
> --
> | Jeremy Chadwick jdc@parodius.com |
> | Parodius Networking http://www.parodius.com/ |
> | UNIX Systems Administrator Mountain View, CA, USA |
> | Making life hard for others since 1977.
PGP: 4BD6C0CB | Cheers, Paul.= From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 00:56:01 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 33BF41065672 for ; Fri, 1 Oct 2010 00:56:01 +0000 (UTC) (envelope-from luke@digital-crocus.com) Received: from mail.digital-crocus.com (node2.digital-crocus.com [91.209.244.128]) by mx1.freebsd.org (Postfix) with ESMTP id DEE378FC12 for ; Fri, 1 Oct 2010 00:56:00 +0000 (UTC) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=dkselector; d=hybrid-logic.co.uk; h=Received:Received:Subject:From:To:Cc:In-Reply-To:References:Content-Type:Organization:Date:Message-ID:Mime-Version:X-Mailer:Content-Transfer-Encoding:X-Spam-Score:X-Digital-Crocus-Maillimit:X-Authenticated-Sender:X-Complaints:X-Admin:X-Abuse; b=BdF8hksn6Yo4Q7HH25IoaMERaiI3t7Aq2G7A7utQlFNIlcOoQg90kaZAknfapp7UaDlyNIgFZV7B4eiUy3jMnjXL6EuiRZIQJ/jzFKapyawPjC58iRdjz2goTT5Hpu68; Received: from luke by mail.digital-crocus.com with local (Exim 4.69 (FreeBSD)) (envelope-from ) id 1P1TMB-000JWu-0Z for freebsd-stable@freebsd.org; Fri, 01 Oct 2010 01:20:11 +0100 Received: from 127cr.net ([78.105.122.99] helo=[192.168.1.22]) by mail.digital-crocus.com with esmtpa (Exim 4.69 (FreeBSD)) (envelope-from ) id 1P1TM8-000JUW-Gf; Fri, 01 Oct 2010 01:20:10 +0100 From: Luke Marsden To: Jung-uk Kim In-Reply-To: <201009301855.51841.jkim@FreeBSD.org> References: <1285587910.31122.633.camel@pow> <1285601367.31122.909.camel@pow> <1285873071.21063.786.camel@pow> <201009301855.51841.jkim@FreeBSD.org> Content-Type: text/plain; charset="UTF-8" Organization: Hybrid Logic Date: Fri, 01 Oct 2010 01:22:02 +0100 Message-ID: <1285892522.21063.1437.camel@pow> Mime-Version: 1.0 X-Mailer: Evolution 2.28.3 Content-Transfer-Encoding: 7bit X-Spam-Score: -1.0 X-Digital-Crocus-Maillimit: done X-Authenticated-Sender: luke X-Complaints: abuse@digital-crocus.com X-Admin: admin@digital-crocus.com 
X-Abuse: abuse@digital-crocus.com (Please include full headers in abuse reports) Cc: support@elastichosts.com, team@hybrid-logic.co.uk, freebsd-stable@FreeBSD.org Subject: Re: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 00:56:01 -0000 On Thu, 2010-09-30 at 18:55 -0400, Jung-uk Kim wrote: > It seems MCA capability is advertised by the CPUID translator but > writing to the MSRs causes GPF. In other words, it seems like a CPU > emulator bug. A simple workaround is 'set hw.mca.enabled=0' from the > loader prompt. If it works, add hw.mca.enabled="0" > in /boot/loader.conf to make it permanent. MCA does not make any > sense in emulation any way. Awesome, this allows us to boot 8.1R on Linux KVM with AMD hardware! Thank you very much. This has just doubled our number of availability zones. -- Best Regards, Luke Marsden Hybrid Logic Ltd. 
Web: http://www.hybrid-cluster.com/ Hybrid Web Cluster - cloud web hosting based on FreeBSD and ZFS Mobile: +447791750420 From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 03:15:38 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0BB55106566B for ; Fri, 1 Oct 2010 03:15:38 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id ACA848FC0C for ; Fri, 1 Oct 2010 03:15:37 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o913FOhn099303; Thu, 30 Sep 2010 20:15:29 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201010010315.o913FOhn099303@gw.catspoiler.org> Date: Thu, 30 Sep 2010 20:15:24 -0700 (PDT) From: Don Lewis To: avg@icyb.net.ua In-Reply-To: <4CA42A0A.6090003@icyb.net.ua> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org, sterling@camdensoftware.com, freebsd@jdc.parodius.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 03:15:38 -0000 On 30 Sep, Andriy Gapon wrote: > on 30/09/2010 02:27 Don Lewis said the following: >> I tried enabling apic and got worse results. I saw ping RTTs as high as >> 67 seconds. Here's the timer info with apic enabled: [snip] >> Here's the verbose boot info with apic: >> > > vmstat -i ? 
Here's the vmstat -i output at the time the machine starts experiencing freezes and ntp goes insane:

Thu Sep 30 11:38:57 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           10          0
irq12: psm0                           18          0
irq14: ata0                         2845          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      74628         40
cpu0: timer                      3676399       1999
irq256: nfe0                        3915          2
Total                            3758132       2043
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u  129  128  377    0.185   -0.307   0.020

Thu Sep 30 11:39:59 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           10          0
irq12: psm0                           18          0
irq14: ata0                         2935          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      78954         41
cpu0: timer                      3796447       1998
irq256: nfe0                        4090          2
Total                            3882771       2043
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u   61  128  377    0.185   -0.307   0.023

Thu Sep 30 11:40:59 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           10          0
irq12: psm0                           18          0
irq14: ata0                         3025          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      85038         43
cpu0: timer                      3916483       1998
irq256: nfe0                        4247          2
Total                            4009138       2045
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u  121  128  377    0.185   -0.307   0.023

Thu Sep 30 11:41:59 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           10          0
irq12: psm0                           18          0
irq14: ata0                         3115          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      89099         44
cpu0: timer                      4036529       1998
irq256: nfe0                        4384          2
Total                            4133472       2046
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u   54  128  377    0.185   -0.307 43008.9

Thu Sep 30 11:42:59 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           11          0
irq12: psm0                           18          0
irq14: ata0                         3205          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      92111         44
cpu0: timer                      4156575       1998
irq256: nfe0                        4421          2
Total                            4256658       2046
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
*gw.catspoiler.o .GPS.            1 u  114  128  377    0.185   -0.307 43008.9

Thu Sep 30 11:43:59 PDT 2010
interrupt                          total       rate
irq1: atkbd0                           6          0
irq9: acpi0                           12          0
irq12: psm0                           18          0
irq14: ata0                         3295          1
irq17: ahc0                          310          0
irq19: fwohci0                         1          0
irq22: ehci0+                      92132         43
cpu0: timer                      4276621       1998
irq256: nfe0                        4444          2
Total                            4376839       2045
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 gw.catspoiler.o .GPS.            1 u   44  128  377    0.177  113790. 105350.

I also hacked a kernel compiled with SMP so that it only finds one CPU core. The machine still freezes and causes long ping RTTs and ntp insanity.

BTW, I first tried the above test by disabling the second core using the machdep.hlt_cpus sysctl knob. The results were most entertaining. When I tried to run "make index", it would hang early on. It looked like SCHED_ULE was trying to schedule the processes at both ends of a pipe on both CPU cores even though it should have only been trying to use one core. The downstream process would then wait forever in piperd.
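
A rough cross-check of the samples above, offered as a sketch rather than anything run on the affected machine: the rate column of vmstat -i is the running total divided by uptime, so it smooths over short-term changes. Differencing consecutive "cpu0: timer" totals gives the rate within each sampling interval instead. The timestamps and counters below are copied from the samples; the SAMPLES list and the interval_rates helper are names invented for this illustration. The timestamps come from the machine's own clock, which is the thing under suspicion, so the result is only as trustworthy as that clock.

```python
from datetime import datetime

# (timestamp, "cpu0: timer" total) pairs copied from the vmstat -i samples above.
SAMPLES = [
    ("11:38:57", 3676399),
    ("11:39:59", 3796447),
    ("11:40:59", 3916483),
    ("11:41:59", 4036529),
    ("11:42:59", 4156575),
    ("11:43:59", 4276621),
]


def interval_rates(samples):
    """Return interrupts per second between each pair of consecutive samples."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        start = datetime.strptime(t0, "%H:%M:%S")
        end = datetime.strptime(t1, "%H:%M:%S")
        seconds = (end - start).total_seconds()
        rates.append((c1 - c0) / seconds)
    return rates


if __name__ == "__main__":
    for rate in interval_rates(SAMPLES):
        print(f"{rate:.1f}")
```

With these inputs the first (62-second) interval works out to roughly 1936 interrupts per second and the remaining 60-second intervals to about 2001, so by the machine's own timestamps the timer interrupt rate stays close to the nominal 2000 per second even in the samples where ntp reports huge jitter.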
From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 07:45:10 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 039BF1065696 for ; Fri, 1 Oct 2010 07:45:10 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id B521A8FC1B for ; Fri, 1 Oct 2010 07:45:09 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1P1a0v-0009QQ-Rt for freebsd-stable@freebsd.org; Fri, 01 Oct 2010 09:26:41 +0200 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.2 To: freebsd-stable@freebsd.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 01 Oct 2010 09:26:41 +0200 From: Daniel Braniss Message-ID: Subject: boot0cfg problems X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 07:45:10 -0000 In a not so distant past, boot0cfg -sn ... used to work, then it only partialy worked, it would modify the data in boot but not the mbr, for which 'gpart -s set active -in ...' modified the mbr. 
Now # boot0cfg -s1 -v /dev/mfid0 boot0cfg: write_mbr: /dev/mfid0: Operation not permitted but: # boot0cfg -v /dev/mfid0 # flag start chs type end chs offset size 1 0x80 0: 1: 1 0xa5 1023:212:63 63 41943006 2 0x00 1023:255:63 0xa5 1023:169:63 41943069 41943006 3 0x00 1023:255:63 0xa5 1023:126:63 83886075 41943006 4 0x00 1023:255:63 0xa5 1023:201:63 125829081 1046478825 version=2.0 drive=0x80 mask=0x3 ticks=182 bell=# (0x23) options=packet,update,nosetdrv volume serial ID 9090-9090 default_selection=F2 (Slice 2) From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 07:50:35 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 10EDC1065670 for ; Fri, 1 Oct 2010 07:50:35 +0000 (UTC) (envelope-from kamikaze@bsdforen.de) Received: from mail.bsdforen.de (bsdforen.de [212.204.60.79]) by mx1.freebsd.org (Postfix) with ESMTP id C2DF48FC08 for ; Fri, 1 Oct 2010 07:50:33 +0000 (UTC) Received: from mobileKamikaze.norad (HSI-KBW-078-042-098-160.hsi3.kabel-badenwuerttemberg.de [78.42.98.160]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.bsdforen.de (Postfix) with ESMTP id E4C2D8A2830 for ; Fri, 1 Oct 2010 09:50:32 +0200 (CEST) Message-ID: <4CA592C8.5040008@bsdforen.de> Date: Fri, 01 Oct 2010 09:50:32 +0200 From: Dominic Fandrey User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-GB; rv:1.9.1.12) Gecko/20100918 Thunderbird/3.0.8 MIME-Version: 1.0 To: freebsd-stable@freebsd.org X-Enigmail-Version: 1.0.1 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Subject: pkg_version output does not match manual page X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 07:50:35 -0000 > pkg_version -IoL= 
graphics/dri                        >
java/eclipse-eclemma                !
graphics/libGL                      >
graphics/libGLU                     >
graphics/libdrm                     >
graphics/libglut                    >
graphics/mesa-demos                 >
games/openarena                     >
games/openarena-data                !
games/openarena-oax                 !

The ports java/eclipse-eclemma, games/openarena-data and games/openarena-oax are all ports of mine that await commit (ports/144849 from March and ports/146818 from May).

The pkg_version(1) manual page states:

    !   The installed package exists in the index but for some reason, pkg_version was unable to compare the version number of the installed package with the corresponding entry in the index.

Well, these installed packages definitely do not exist in the INDEX. I would expect ? instead:

    ?   The installed package does not appear in the index. This could be due to an out of date index or a package taken from a PR that has not yet been committed.

Does anyone else have this problem?

Regards

--
A: Because it fouls the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 08:24:07 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 10140106566B; Fri, 1 Oct 2010 08:24:07 +0000 (UTC) (envelope-from chris@arachsys.com) Received: from alpha.arachsys.com (alpha.arachsys.com [91.203.57.7]) by mx1.freebsd.org (Postfix) with ESMTP id C96D88FC15; Fri, 1 Oct 2010 08:24:06 +0000 (UTC) Received: from [83.104.159.199] (helo=miranda.arachsys.com) by alpha.arachsys.com with esmtpa (Exim 4.52) id 1P1aXU-0000q6-7g; Fri, 01 Oct 2010 09:00:20 +0100 Date: Fri, 1 Oct 2010 09:00:18 +0100 From: Chris Webb To: Luke Marsden Message-ID: <20101001080017.GB2371@arachsys.com> Mail-Followup-To: support@elastichosts.com, Luke Marsden , Jung-uk Kim , freebsd-stable@FreeBSD.org, team@hybrid-logic.co.uk References: <1285587910.31122.633.camel@pow> <1285601367.31122.909.camel@pow> <1285873071.21063.786.camel@pow> <201009301855.51841.jkim@FreeBSD.org> <1285892522.21063.1437.camel@pow> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1285892522.21063.1437.camel@pow> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: support@elastichosts.com, team@hybrid-logic.co.uk, freebsd-stable@FreeBSD.org, Jung-uk Kim Subject: Re: Problem running 8.1R on KVM with AMD hosts X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: support@elastichosts.com List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 08:24:07 -0000 Luke Marsden writes: > On Thu, 2010-09-30 at 18:55 -0400, Jung-uk Kim wrote: > > It seems MCA capability is advertised by the CPUID translator but > > writing to the MSRs causes GPF. In other words, it seems like a CPU > > emulator bug. A simple workaround is 'set hw.mca.enabled=0' from the > > loader prompt. 
If it works, add hw.mca.enabled="0"
> > in /boot/loader.conf to make it permanent. MCA does not make any
> > sense in emulation any way.

Many thanks for tracking this one down for us!

> Awesome, this allows us to boot 8.1R on Linux KVM with AMD hardware!

I'll patch the system on lon-b not to advertise MCA this morning, Luke. It'll be interesting to try again once I've done this, to check whether it fixes normal booting without the extra hw.mca.enabled="0" flag.

Best wishes,

Chris.
From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 09:01:45 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9E2F0106566B for ; Fri, 1 Oct 2010 09:01:45 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta09.emeryville.ca.mail.comcast.net (qmta09.emeryville.ca.mail.comcast.net [76.96.30.96]) by mx1.freebsd.org (Postfix) with ESMTP id 860EF8FC08 for ; Fri, 1 Oct 2010 09:01:45 +0000 (UTC) Received: from omta04.emeryville.ca.mail.comcast.net ([76.96.30.35]) by qmta09.emeryville.ca.mail.comcast.net with comcast id DLxB1f0020lTkoCA9M1lYq; Fri, 01 Oct 2010 09:01:45 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta04.emeryville.ca.mail.comcast.net with comcast id DM1j1f00J3LrwQ28QM1kND; Fri, 01 Oct 2010 09:01:44 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 906D79B418; Fri, 1 Oct 2010 02:01:43 -0700 (PDT) Date: Fri, 1 Oct 2010 02:01:43 -0700 From: Jeremy Chadwick To: Daniel Braniss Message-ID: <20101001090143.GA40450@icarus.home.lan> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org Subject: Re: boot0cfg problems X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help:
List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 09:01:45 -0000 On Fri, Oct 01, 2010 at 09:26:41AM +0200, Daniel Braniss wrote: > In a not so distant past, boot0cfg -sn ... used to work, then it only > partialy worked, it would modify the data in boot but not the mbr, for > which 'gpart -s set active -in ...' modified the mbr. Now > # boot0cfg -s1 -v /dev/mfid0 > boot0cfg: write_mbr: /dev/mfid0: Operation not permitted > but: > # boot0cfg -v /dev/mfid0 > # flag start chs type end chs offset size > 1 0x80 0: 1: 1 0xa5 1023:212:63 63 41943006 > 2 0x00 1023:255:63 0xa5 1023:169:63 41943069 41943006 > 3 0x00 1023:255:63 0xa5 1023:126:63 83886075 41943006 > 4 0x00 1023:255:63 0xa5 1023:201:63 125829081 1046478825 > > version=2.0 drive=0x80 mask=0x3 ticks=182 bell=# (0x23) > options=packet,update,nosetdrv > volume serial ID 9090-9090 > default_selection=F2 (Slice 2) Can you try doing "sysctl kern.geom.debugflags=16" first? -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 11:20:44 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1FC89106564A for ; Fri, 1 Oct 2010 11:20:44 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id CCC6F8FC0C for ; Fri, 1 Oct 2010 11:20:43 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1P1dfO-000BRI-MZ; Fri, 01 Oct 2010 13:20:42 +0200 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.2 To: Jeremy Chadwick In-reply-to: <20101001090143.GA40450@icarus.home.lan> References: <20101001090143.GA40450@icarus.home.lan> Comments: In-reply-to Jeremy Chadwick message dated "Fri, 01 Oct 2010 02:01:43 -0700." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 01 Oct 2010 13:20:42 +0200 From: Daniel Braniss Message-ID: Cc: freebsd-stable@freebsd.org Subject: Re: boot0cfg problems X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 11:20:44 -0000 > On Fri, Oct 01, 2010 at 09:26:41AM +0200, Daniel Braniss wrote: > > In a not so distant past, boot0cfg -sn ... used to work, then it only > > partialy worked, it would modify the data in boot but not the mbr, for > > which 'gpart -s set active -in ...' modified the mbr. 
Now > > # boot0cfg -s1 -v /dev/mfid0 > > boot0cfg: write_mbr: /dev/mfid0: Operation not permitted > > but: > > # boot0cfg -v /dev/mfid0 > > # flag start chs type end chs offset size > > 1 0x80 0: 1: 1 0xa5 1023:212:63 63 41943006 > > 2 0x00 1023:255:63 0xa5 1023:169:63 41943069 41943006 > > 3 0x00 1023:255:63 0xa5 1023:126:63 83886075 41943006 > > 4 0x00 1023:255:63 0xa5 1023:201:63 125829081 1046478825 > > > > version=2.0 drive=0x80 mask=0x3 ticks=182 bell=# (0x23) > > options=packet,update,nosetdrv > > volume serial ID 9090-9090 > > default_selection=F2 (Slice 2) > > Can you try doing "sysctl kern.geom.debugflags=16" first? >
This is not really foot-shooting :-), but
- the error message is gone,
- the slice info is updated,
- but the active bit in the MBR is not! Some BIOSes rely on it.
Looking at the changes made to boot0cfg.c, there is now an err(...) call, which exits, before the boot is updated. I changed it to a warn(...) and the old behaviour is back.
BTW,
a- the gpart command should have been: gpart set -a active -i n ...
b- this works with kern.geom.debugflags=0.
thanks, danny From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 11:34:36 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4606A106566C for ; Fri, 1 Oct 2010 11:34:36 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta01.emeryville.ca.mail.comcast.net (qmta01.emeryville.ca.mail.comcast.net [76.96.30.16]) by mx1.freebsd.org (Postfix) with ESMTP id 2BA5F8FC15 for ; Fri, 1 Oct 2010 11:34:35 +0000 (UTC) Received: from omta16.emeryville.ca.mail.comcast.net ([76.96.30.72]) by qmta01.emeryville.ca.mail.comcast.net with comcast id DPPN1f0061ZMdJ4A1Pab2f; Fri, 01 Oct 2010 11:34:35 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta16.emeryville.ca.mail.comcast.net with comcast id DPaa1f0093LrwQ28cPaaEn; Fri, 01 Oct 2010 11:34:35 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 354909B418; Fri, 1 Oct 2010 04:34:34 -0700 (PDT) Date: Fri, 1 Oct 2010 04:34:34 -0700 From: Jeremy Chadwick To: Daniel Braniss Message-ID: <20101001113434.GA43360@icarus.home.lan> References: <20101001090143.GA40450@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org Subject: Re: boot0cfg problems X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 11:34:36 -0000 On Fri, Oct 01, 2010 at 01:20:42PM +0200, Daniel Braniss wrote: > > On Fri, Oct 01, 2010 at 09:26:41AM +0200, Daniel Braniss wrote: > > > In a not so distant past, boot0cfg -sn ... 
used to work, then it only > > > partialy worked, it would modify the data in boot but not the mbr, for > > > which 'gpart -s set active -in ...' modified the mbr. Now > > > # boot0cfg -s1 -v /dev/mfid0 > > > boot0cfg: write_mbr: /dev/mfid0: Operation not permitted > > > but: > > > # boot0cfg -v /dev/mfid0 > > > # flag start chs type end chs offset size > > > 1 0x80 0: 1: 1 0xa5 1023:212:63 63 41943006 > > > 2 0x00 1023:255:63 0xa5 1023:169:63 41943069 41943006 > > > 3 0x00 1023:255:63 0xa5 1023:126:63 83886075 41943006 > > > 4 0x00 1023:255:63 0xa5 1023:201:63 125829081 1046478825 > > > > > > version=2.0 drive=0x80 mask=0x3 ticks=182 bell=# (0x23) > > > options=packet,update,nosetdrv > > > volume serial ID 9090-9090 > > > default_selection=F2 (Slice 2) > > > > Can you try doing "sysctl kern.geom.debugflags=16" first? > > > this is not realy foot-shooting :-), but > - the error msg is gone, > - the slice info is updated, > - but the active bit in the mbr is not! - some bioses rely on it. > looking at changes done to boot0cfg.c there is now an err(...) call which > does an exit, before the boot is updated. I changed it to a warn(...) and the > old > behaviour is back. > BTW, > a- gpart command should have been: gpart set -a active -i n ... > b- this works with kern.geom.debugflags=0. Bit 4 (hence 0x10, or 16 decimal) in kern.geom.debugflags is described as: 0x10 (allow foot shooting) Allow writing to Rank 1 providers. This would, for example, allow the super-user to overwrite the MBR on the root disk or write random sectors elsewhere to a mounted disk. The implica‐ tions are obvious. I read this as: "you can't modify the MBR of a root disk unless bit 4 of this sysctl is set". Sector 0 holds the MBR, and boot0cfg modifies the MBR. So can you explain what you mean by "this really isn't foot-shooting?" I mean, even the NOTE section of the boot0cfg(8) man page documents what I'm trying to say. 
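As a rough illustration of the bit arithmetic described above (a sketch only: the constant and function names are invented for this example, and only the value 0x10 comes from the quoted man page text), the check can be written in Python as:

```python
# Hypothetical name for illustration; the man page only gives the value 0x10.
FOOT_SHOOTING = 0x10  # bit 4 of kern.geom.debugflags, 16 in decimal


def allows_foot_shooting(debugflags: int) -> bool:
    """Return True if bit 4 (0x10) is set in a debugflags value."""
    return bool(debugflags & FOOT_SHOOTING)


# Bit 4, 0x10 and decimal 16 are three spellings of the same value.
assert FOOT_SHOOTING == 1 << 4 == 16

if __name__ == "__main__":
    for value in (0, 16, 0x11):
        print(value, allows_foot_shooting(value))
```

Setting the sysctl to 16 sets only this bit; a value such as 0x11 also passes the check because bit 4 is still set.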
Anyway, if the MBR did get updated without kern.geom.debugflags having bit 4 set, then wouldn't this indicate there's a bug in GEOM's "sector 0" protection? -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 12:55:32 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7A3151065694; Fri, 1 Oct 2010 12:55:32 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (relay04.stack.nl [IPv6:2001:610:1108:5010::107]) by mx1.freebsd.org (Postfix) with ESMTP id 3C2108FC0C; Fri, 1 Oct 2010 12:55:32 +0000 (UTC) Received: from turtle.stack.nl (turtle.stack.nl [IPv6:2001:610:1108:5010::132]) by mx1.stack.nl (Postfix) with ESMTP id CDADC1DD687; Fri, 1 Oct 2010 14:55:30 +0200 (CEST) Received: by turtle.stack.nl (Postfix, from userid 1677) id B542C172A0; Fri, 1 Oct 2010 14:55:30 +0200 (CEST) Date: Fri, 1 Oct 2010 14:55:30 +0200 From: Jilles Tjoelker To: Ed Schouten Message-ID: <20101001125530.GA52375@stack.nl> References: <20100930065151.GA9634@icarus.home.lan> <20100930070333.GU87427@hoeg.nl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20100930070333.GU87427@hoeg.nl> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-stable@freebsd.org, ale@FreeBSD.org, Jeremy Chadwick Subject: Re: mysqld_safe holding open a pty/tty on FreeBSD (7.x and 8.x) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 12:55:32 -0000 On Thu, Sep 30, 2010 at 09:03:33AM +0200, Ed Schouten wrote: > * Jeremy Chadwick wrote: > > 1) 
"mysqld_safe > /dev/null 2>&1 &" never released the tty > > 2) "nohup mysqld_safe > /dev/null 2>&1 &" did release the tty > What happens if you run the following command? > daemon -cf mysqld_safe > The point is that FreeBSD's pts(4) driver only deallocates TTYs when > it's really sure nothing uses it anymore. Even if there is not a single > file descriptor referring to the slave device, it has to wait until > there exist no processes which have the TTY as its controlling TTY. In fact, POSIX allows dissociating the controlling terminal from the session when all file descriptors for it (in any session) have been closed. See SUSv4 XBD 11.1.3 The Controlling Terminal. Once the terminal has been dissociated, it is no longer in use at all and can, in case of a pty, be cleaned up. Implementing this may be an interesting idea. Of course, this will cause opening /dev/tty to fail in some cases where it previously succeeded, but it seems uncommon. Somewhat unrelated, I think that starting daemons with daemon(8), /dev/null 2>&1 or similar is inferior to implementing daemonizing in the program itself. Think of the poor soul who needs to install and start N daemons full of bugs and configuration errors: it is better if such errors show up on the console instead of being hidden away in a log file. 
-- Jilles Tjoelker From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 15:07:10 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 342BC1065696 for ; Fri, 1 Oct 2010 15:07:10 +0000 (UTC) (envelope-from bruce@cran.org.uk) Received: from muon.cran.org.uk (unknown [IPv6:2a01:348:0:15:5d59:5c40:0:1]) by mx1.freebsd.org (Postfix) with ESMTP id E922B8FC14 for ; Fri, 1 Oct 2010 15:07:09 +0000 (UTC) Received: from muon.cran.org.uk (localhost [127.0.0.1]) by muon.cran.org.uk (Postfix) with ESMTP id 32B46E615F; Fri, 1 Oct 2010 16:07:09 +0100 (BST) Received: from unknown (client-82-31-11-222.midd.adsl.virginmedia.com [82.31.11.222]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by muon.cran.org.uk (Postfix) with ESMTPSA; Fri, 1 Oct 2010 16:07:08 +0100 (BST) Date: Fri, 1 Oct 2010 16:07:03 +0100 From: Bruce Cran To: Jeremy Chadwick Message-ID: <20101001160703.00005fc3@unknown> In-Reply-To: <20101001113434.GA43360@icarus.home.lan> References: <20101001090143.GA40450@icarus.home.lan> <20101001113434.GA43360@icarus.home.lan> X-Mailer: Claws Mail 3.7.6 (GTK+ 2.16.6; i586-pc-mingw32msvc) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: freebsd-stable@freebsd.org Subject: Re: boot0cfg problems X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 15:07:10 -0000 On Fri, 1 Oct 2010 04:34:34 -0700 Jeremy Chadwick wrote: > Anyway, if the MBR did get updated without kern.geom.debugflags having > bit 4 set, then wouldn't this indicate there's a bug in GEOM's "sector > 0" protection? Or that it knows that updating the active byte is harmless. 
-- Bruce Cran From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 15:10:53 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0C5F31065670 for ; Fri, 1 Oct 2010 15:10:53 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id B6CE28FC0C for ; Fri, 1 Oct 2010 15:10:52 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1P1hG7-000EG1-ET; Fri, 01 Oct 2010 17:10:51 +0200 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.2 To: Jeremy Chadwick In-reply-to: <20101001113434.GA43360@icarus.home.lan> References: <20101001090143.GA40450@icarus.home.lan> <20101001113434.GA43360@icarus.home.lan> Comments: In-reply-to Jeremy Chadwick message dated "Fri, 01 Oct 2010 04:34:34 -0700." Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Date: Fri, 01 Oct 2010 17:10:51 +0200 From: Daniel Braniss Message-ID: Cc: freebsd-stable@freebsd.org Subject: Re: boot0cfg problems X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 15:10:53 -0000 > On Fri, Oct 01, 2010 at 01:20:42PM +0200, Daniel Braniss wrote: > > > On Fri, Oct 01, 2010 at 09:26:41AM +0200, Daniel Braniss wrote: > > > > In a not so distant past, boot0cfg -sn ... used to work, then it only > > > > partialy worked, it would modify the data in boot but not the mbr, for > > > > which 'gpart -s set active -in ...' modified the mbr.
Now > > > > # boot0cfg -s1 -v /dev/mfid0 > > > > boot0cfg: write_mbr: /dev/mfid0: Operation not permitted > > > > but: > > > > # boot0cfg -v /dev/mfid0 > > > > # flag start chs type end chs offset size > > > > 1 0x80 0: 1: 1 0xa5 1023:212:63 63 41943006 > > > > 2 0x00 1023:255:63 0xa5 1023:169:63 41943069 41943006 > > > > 3 0x00 1023:255:63 0xa5 1023:126:63 83886075 41943006 > > > > 4 0x00 1023:255:63 0xa5 1023:201:63 125829081 1046478825 > > > > > > > > version=2.0 drive=0x80 mask=0x3 ticks=182 bell=# (0x23) > > > > options=packet,update,nosetdrv > > > > volume serial ID 9090-9090 > > > > default_selection=F2 (Slice 2) > > > > > > Can you try doing "sysctl kern.geom.debugflags=16" first? > > > > > this is not realy foot-shooting :-), but > > - the error msg is gone, > > - the slice info is updated, > > - but the active bit in the mbr is not! - some bioses rely on it. > > looking at changes done to boot0cfg.c there is now an err(...) call which > > does an exit, before the boot is updated. I changed it to a warn(...) and the > > old > > behaviour is back. > > BTW, > > a- gpart command should have been: gpart set -a active -i n ... > > b- this works with kern.geom.debugflags=0. > > Bit 4 (hence 0x10, or 16 decimal) in kern.geom.debugflags is described > as: > > 0x10 (allow foot shooting) > Allow writing to Rank 1 providers. This would, for example, > allow the super-user to overwrite the MBR on the root disk or > write random sectors elsewhere to a mounted disk. The > implications are obvious. > > I read this as: "you can't modify the MBR of a root disk unless bit 4 of > this sysctl is set". Sector 0 holds the MBR, and boot0cfg modifies the > MBR. So can you explain what you mean by "this really isn't > foot-shooting?" I mean, even the NOTE section of the boot0cfg(8) man > page documents what I'm trying to say.
> > Anyway, if the MBR did get updated without kern.geom.debugflags having > bit 4 set, then wouldn't this indicate there's a bug in GEOM's "sector > 0" protection?

But the MBR did NOT get updated by boot0cfg. gpart does succeed, however, but gpart knows nothing about the other things boot0cfg knows about, like which slice to boot from (not to be confused with the current active slice), what bell to ring, etc. These are (or used to be) updated before the last change.

Anyway, as you correctly pointed out, the problem is in GEOM being somewhat overprotective :-)

From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 15:45:35 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 39CED106566B for ; Fri, 1 Oct 2010 15:45:35 +0000 (UTC) (envelope-from dan@langille.org) Received: from nyi.unixathome.org (nyi.unixathome.org [64.147.113.42]) by mx1.freebsd.org (Postfix) with ESMTP id EB7798FC15 for ; Fri, 1 Oct 2010 15:45:34 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id D3422509A5; Fri, 1 Oct 2010 16:45:33 +0100 (BST) X-Virus-Scanned: amavisd-new at unixathome.org Received: from nyi.unixathome.org ([127.0.0.1]) by localhost (nyi.unixathome.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 349bLhnPpn-4; Fri, 1 Oct 2010 16:45:33 +0100 (BST) Received: from nyi.unixathome.org (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id EED5D509A3; Fri, 1 Oct 2010 16:45:32 +0100 (BST) Received: from 68.64.144.221 (SquirrelMail authenticated user dan) by nyi.unixathome.org with HTTP; Fri, 1 Oct 2010 11:45:33 -0400 Message-ID: In-Reply-To: References: Date: Fri, 1 Oct 2010 11:45:33 -0400 From: "Dan Langille" To: "Artem Belevich" User-Agent: SquirrelMail/1.4.20-RC2 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal)
Importance: Normal Cc: freebsd-stable@freebsd.org Subject: Re: zfs send/receive: is this slow? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 15:45:35 -0000 On Wed, September 29, 2010 3:57 pm, Artem Belevich wrote: > On Wed, Sep 29, 2010 at 11:04 AM, Dan Langille wrote: >> It's taken about 15 hours to copy 800GB. I'm sure there's some tuning I >> can do. >> >> The system is now running: >> >> # zfs send storage/bacula@transfer | zfs receive >> storage/compressed/bacula > > Try piping zfs data through mbuffer (misc/mbuffer in ports). I've > found that it does help a lot to smooth out data flow and increase > send/receive throughput even when send/receive happens on the same > host. Run it with a buffer large enough to accommodate few seconds > worth of write throughput for your target disks. > > Here's an example: > http://blogs.everycity.co.uk/alasdair/2010/07/using-mbuffer-to-speed-up-slow-zfs-send-zfs-receive/ I'm failing. In one session: # mbuffer -s 128k -m 1G -I 9090 | zfs receive storage/compressed/bacula-mbuffer Assertion failed: ((err == 0) && (bsize == sizeof(rcvsize))), function openNetworkInput, file mbuffer.c, line 1358. cannot receive: failed to read from stream In the other session: # time zfs send storage/bacula@transfer | mbuffer -s 128k -m 1G -O 10.55.0.44:9090 Assertion failed: ((err == 0) && (bsize == sizeof(sndsize))), function openNetworkOutput, file mbuffer.c, line 897. 
warning: cannot send 'storage/bacula@transfer': Broken pipe Abort trap: 6 (core dumped) real 0m17.709s user 0m0.000s sys 0m2.502s -- Dan Langille -- http://langille.org/ From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 15:53:26 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CAFC6106566B for ; Fri, 1 Oct 2010 15:53:26 +0000 (UTC) (envelope-from dan@langille.org) Received: from nyi.unixathome.org (nyi.unixathome.org [64.147.113.42]) by mx1.freebsd.org (Postfix) with ESMTP id 99EA58FC17 for ; Fri, 1 Oct 2010 15:53:26 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id CC225509A5; Fri, 1 Oct 2010 16:53:25 +0100 (BST) X-Virus-Scanned: amavisd-new at unixathome.org Received: from nyi.unixathome.org ([127.0.0.1]) by localhost (nyi.unixathome.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id QC+dUZquPq+Z; Fri, 1 Oct 2010 16:53:25 +0100 (BST) Received: from nyi.unixathome.org (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id EA8E4509A3; Fri, 1 Oct 2010 16:53:24 +0100 (BST) Received: from 68.64.144.221 (SquirrelMail authenticated user dan) by nyi.unixathome.org with HTTP; Fri, 1 Oct 2010 11:53:25 -0400 Message-ID: <55a0e58dd844285fbb50cb2904820943.squirrel@nyi.unixathome.org> In-Reply-To: References: Date: Fri, 1 Oct 2010 11:53:25 -0400 From: "Dan Langille" To: "Artem Belevich" User-Agent: SquirrelMail/1.4.20-RC2 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal Cc: freebsd-stable@freebsd.org Subject: Re: zfs send/receive: is this slow? 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 15:53:26 -0000 On Fri, October 1, 2010 11:45 am, Dan Langille wrote: > > On Wed, September 29, 2010 3:57 pm, Artem Belevich wrote: >> On Wed, Sep 29, 2010 at 11:04 AM, Dan Langille wrote: >>> It's taken about 15 hours to copy 800GB. I'm sure there's some tuning >>> I >>> can do. >>> >>> The system is now running: >>> >>> # zfs send storage/bacula@transfer | zfs receive >>> storage/compressed/bacula >> >> Try piping zfs data through mbuffer (misc/mbuffer in ports). I've >> found that it does help a lot to smooth out data flow and increase >> send/receive throughput even when send/receive happens on the same >> host. Run it with a buffer large enough to accommodate few seconds >> worth of write throughput for your target disks. >> >> Here's an example: >> http://blogs.everycity.co.uk/alasdair/2010/07/using-mbuffer-to-speed-up-slow-zfs-send-zfs-receive/ > > I'm failing. In one session: > > # mbuffer -s 128k -m 1G -I 9090 | zfs receive > storage/compressed/bacula-mbuffer > Assertion failed: ((err == 0) && (bsize == sizeof(rcvsize))), function > openNetworkInput, file mbuffer.c, line 1358. > cannot receive: failed to read from stream > > > In the other session: > > # time zfs send storage/bacula@transfer | mbuffer -s 128k -m 1G -O > 10.55.0.44:9090 > Assertion failed: ((err == 0) && (bsize == sizeof(sndsize))), function > openNetworkOutput, file mbuffer.c, line 897. > warning: cannot send 'storage/bacula@transfer': Broken pipe > Abort trap: 6 (core dumped) > > real 0m17.709s > user 0m0.000s > sys 0m2.502s My installed mbuffer was out of date. 
After an upgrade: # mbuffer -s 128k -m 1G -I 9090 | zfs receive storage/compressed/bacula-mbuffer mbuffer: warning: unable to set socket buffer size: No buffer space available in @ 0.0 kB/s, out @ 0.0 kB/s, 1897 MB total, buffer 100% full # time zfs send storage/bacula@transfer | mbuffer -s 128k -m 1G -O ::1:9090 mbuffer: warning: unable to set socket buffer size: No buffer space available in @ 4343 kB/s, out @ 2299 kB/s, 3104 MB total, buffer 85% full -- Dan Langille -- http://langille.org/ From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 16:07:46 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 123B9106566C; Fri, 1 Oct 2010 16:07:46 +0000 (UTC) (envelope-from tinderbox@freebsd.org) Received: from freebsd-current.sentex.ca (freebsd-current.sentex.ca [64.7.128.98]) by mx1.freebsd.org (Postfix) with ESMTP id BC3AF8FC12; Fri, 1 Oct 2010 16:07:45 +0000 (UTC) Received: from freebsd-current.sentex.ca (localhost [127.0.0.1]) by freebsd-current.sentex.ca (8.14.4/8.14.3) with ESMTP id o91G7iUS017699; Fri, 1 Oct 2010 12:07:44 -0400 (EDT) (envelope-from tinderbox@freebsd.org) Received: (from tinderbox@localhost) by freebsd-current.sentex.ca (8.14.4/8.14.3/Submit) id o91G7i5x017698; Fri, 1 Oct 2010 16:07:44 GMT (envelope-from tinderbox@freebsd.org) Date: Fri, 1 Oct 2010 16:07:44 GMT Message-Id: <201010011607.o91G7i5x017698@freebsd-current.sentex.ca> X-Authentication-Warning: freebsd-current.sentex.ca: tinderbox set sender to FreeBSD Tinderbox using -f Sender: FreeBSD Tinderbox From: FreeBSD Tinderbox To: FreeBSD Tinderbox , , Precedence: bulk Cc: Subject: [releng_8 tinderbox] failure on mips/mips X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 16:07:46 -0000 TB --- 2010-10-01 
12:12:52 - tinderbox 2.6 running on freebsd-current.sentex.ca TB --- 2010-10-01 12:12:52 - starting RELENG_8 tinderbox run for mips/mips TB --- 2010-10-01 12:12:52 - cleaning the object tree TB --- 2010-10-01 12:14:04 - cvsupping the source tree TB --- 2010-10-01 12:14:04 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/RELENG_8/mips/mips/supfile TB --- 2010-10-01 12:21:22 - building world TB --- 2010-10-01 12:21:22 - MAKEOBJDIRPREFIX=/obj TB --- 2010-10-01 12:21:22 - PATH=/usr/bin:/usr/sbin:/bin:/sbin TB --- 2010-10-01 12:21:22 - TARGET=mips TB --- 2010-10-01 12:21:22 - TARGET_ARCH=mips TB --- 2010-10-01 12:21:22 - TZ=UTC TB --- 2010-10-01 12:21:22 - __MAKE_CONF=/dev/null TB --- 2010-10-01 12:21:22 - cd /src TB --- 2010-10-01 12:21:22 - /usr/bin/make -B buildworld >>> World build started on Fri Oct 1 12:21:24 UTC 2010 >>> Rebuilding the temporary build tree >>> stage 1.1: legacy release compatibility shims >>> stage 1.2: bootstrap tools >>> stage 2.1: cleaning up the object tree >>> stage 2.2: rebuilding the object tree >>> stage 2.3: build tools >>> stage 3: cross tools >>> stage 4.1: building includes >>> stage 4.2: building libraries >>> stage 4.3: make dependencies >>> stage 4.4: building everything [...] 
/obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1899 /obj/mips/src/tmp/usr/bin/ld: BFD 2.15 [FreeBSD] 2004-05-23 assertion fail /src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elfxx-mips.c:1902 *** Error code 1 Stop in /src/usr.bin/tftp. *** Error code 1 Stop in /src/usr.bin. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. *** Error code 1 Stop in /src. 
TB --- 2010-10-01 16:07:44 - WARNING: /usr/bin/make returned exit code 1 TB --- 2010-10-01 16:07:44 - ERROR: failed to build world TB --- 2010-10-01 16:07:44 - 2072.30 user 7750.23 system 14091.93 real http://tinderbox.freebsd.org/tinderbox-releng_8-RELENG_8-mips-mips.full From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 18:07:57 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3A97D1065673 for ; Fri, 1 Oct 2010 18:07:57 +0000 (UTC) (envelope-from cswiger@mac.com) Received: from asmtpout025.mac.com (asmtpout025.mac.com [17.148.16.100]) by mx1.freebsd.org (Postfix) with ESMTP id 1D9AA8FC15 for ; Fri, 1 Oct 2010 18:07:56 +0000 (UTC) MIME-version: 1.0 Content-transfer-encoding: 7BIT Content-type: text/plain; CHARSET=US-ASCII Received: from cswiger1.apple.com (unknown [17.209.4.71]) by asmtp025.mac.com (Oracle Communications Messaging Exchange Server 7u4-18.01 64bit (built Jul 15 2010)) with ESMTPSA id <0L9I00J4XQN16S00@asmtp025.mac.com> for freebsd-stable@freebsd.org; Wed, 29 Sep 2010 10:16:14 -0700 (PDT) X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.0.10011,1.0.148,0.0.0000 definitions=2010-09-29_08:2010-09-29, 2010-09-29, 1970-01-01 signatures=0 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 spamscore=0 ipscore=0 phishscore=0 bulkscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx engine=6.0.2-1004200000 definitions=main-1009290108 From: Chuck Swiger In-reply-to: <20100929170757.GA94672@icarus.home.lan> Date: Wed, 29 Sep 2010 10:16:13 -0700 Message-id: <2B9D8374-AA0A-4F2C-9681-5216204859F8@mac.com> References: <20100224165203.GA10423@zod.isi.edu> <20100927170317.I90633@sola.nimnet.asn.au> <4CA0E892.4010204@gmail.com> <201009271621.17669.jkim@FreeBSD.org> <4CA2488D.7000101@gmail.com> <04FA16F2-26AD-425D-9E4A-2A923219B73E@mac.com> <4CA35E64.1040101@gmail.com> <0FDB4144-8BE4-4BA5-B911-8652E07D60C2@mac.com> 
<20100929170757.GA94672@icarus.home.lan> To: Jeremy Chadwick X-Mailer: Apple Mail (2.1081) Cc: freebsd-stable@freebsd.org, Vitaly Magerya Subject: Re: resume slow on Thinkpad T42 FreeBSD 8-STABLE X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 18:07:57 -0000 On Sep 29, 2010, at 10:07 AM, Jeremy Chadwick wrote: > On Wed, Sep 29, 2010 at 09:57:53AM -0700, Chuck Swiger wrote: >> >> I doubt repeated coincidences. :-) Is prime95 testing running stable after waking from sleep? > > He's not running Prime95 (native Win32 app), he's running > ports/math/mprime under FreeBSD natively. I don't know if this > application stresses hardware to the same degree Prime95 does; I've used > Prime95 many times to burn in new workstations. It's doing the same math operations; something like "mprime -t" is the same as the Win32 test mode per the docs: -t Run the torture test. Same as Options/Torture Test. > The Thinkpad hardware he's on is """old""" (note the quotes), so I > wouldn't be surprised if the CPU (Intel Pentium M) happens to induce a > strange/odd MCA event as a result of going in/out of sleep state. It > could be a general system bug of some sort as well (one which has no > repercussions). That sounds reasonable to me, but I'm wary of uncorrected errors which seem to be reproducible to specific circumstances. > Look at it this way: if his L1 cache was going bad, his system would be > freaking out doing literally anything (booting the kernel for example); > I'm under the impression Pentium M CPUs do not have ECC L1 cache. Sure, if the MCA report is reflecting a legitimate problem, and it was happening more often than every few minutes, and it happened after a cold reboot rather than after wakeup from sleep.... 
:-) I place more faith in ~17 hours of Prime95/mprime working OK to validate that the hardware is not obviously broken. Regards, -- -Chuck From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 18:47:15 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8FA59106564A for ; Fri, 1 Oct 2010 18:47:15 +0000 (UTC) (envelope-from lioux@FreeBSD.org) Received: from goat.gigo.com (ipv6.gigo.com [IPv6:2001:470:1:18::2]) by mx1.freebsd.org (Postfix) with ESMTP id 705198FC12 for ; Fri, 1 Oct 2010 18:47:15 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by goat.gigo.com (Postfix) with ESMTP id 3492311571 for ; Fri, 1 Oct 2010 11:47:15 -0700 (PDT) Received: from goat.gigo.com ([127.0.0.1]) by localhost (vette.gigo.com [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id SheogTnyNfyp for ; Fri, 1 Oct 2010 11:47:15 -0700 (PDT) Received: from 189.72.204.181 (unknown [189.72.204.181]) by goat.gigo.com (Postfix) with ESMTPSA id 46D6E1156B for ; Fri, 1 Oct 2010 11:47:14 -0700 (PDT) Received: (qmail 89396 invoked by uid 80); 1 Oct 2010 15:46:37 -0300 Received: from exxodus.fedaykin.here (exxodus.fedaykin.here [10.0.0.2]) by exxodus.fedaykin.here (Horde Framework) with HTTP; Fri, 01 Oct 2010 15:47:01 -0300 Message-ID: <20101001154701.980319bwibw4wyw5@exxodus.fedaykin.here> Date: Fri, 01 Oct 2010 15:47:01 -0300 From: Mario Sergio Fujikawa Ferreira To: Kostik Belousov References: <20100919222837.70629.qmail@exxodus.fedaykin.here> <20100922132801.GV2389@deviant.kiev.zoral.com.ua> (sfid-20100922_11541_49CE72C2) In-Reply-To: <20100922132801.GV2389@deviant.kiev.zoral.com.ua> (sfid-20100922_11541_49CE72C2) MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4) Cc: freebsd-stable@freebsd.org Subject: Re: 
Panic with chromium and 8.1-STABLE (Thu Sep 16 09:52:17 BRT 2010) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 18:47:15 -0000 Quoting Kostik Belousov : > On Sun, Sep 19, 2010 at 07:28:13PM -0300, Mario Sergio Fujikawa > Ferreira wrote: >> Hi, >> >> I've just began trying chrome web browser from >> http://chromium.hybridsource.org/ but it triggered 2 panics on my >> 8.1-STABLE system. >> >> $ uname -a >> FreeBSD exxodus.fedaykin.here 8.1-STABLE FreeBSD 8.1-STABLE #26: >> Thu Sep 16 09:52:17 BRT 2010 >> lioux@exxodus:/usr/obj/usr/src/sys/LIOUX amd64 >> >> The panic information is: >> >> ------------ >> panic: vm_page_unwire: invalid wire count: 0 >> cpuid = 0 >> KDB: enter: panic >> >> 0xffffff006ecce000: tag ufs, type VREG >> usecount 1, writecount 1, refcount 4 mountedhere 0 >> flags () >> v_object 0xffffff0151489870 ref 0 pages 8 >> lock type ufs: EXCL by thread 0xffffff00200947c0 (pid 25025) >> ino 119526591, on dev ufs/fsusr >> >> 0xffffff011107f938: tag ufs, type VREG >> usecount 0, writecount 0, refcount 4 mountedhere 0 >> flags (VV_NOSYNC|VI_DOINGINACT) >> v_object 0xffffff0151f7f870 ref 0 pages 1284 >> lock type ufs: EXCL by thread 0xffffff01882cc7c0 (pid 26689) >> ino 263, on dev md0 >> ------------ >> >> I've made available 2 ddb textdumps at: >> >> http://people.freebsd.org/~lioux/panic/2010091900/textdump.tar.0 >> http://people.freebsd.org/~lioux/panic/2010091900/textdump.tar.1 >> >> I was able to use chrome prior to this latest kernel update. >> Now, I can reproduce a kernel panic even browsing www.google.com >> >> Please, let me know if I can provide any further information. > > Does it panic if you remove ZERO_COPY_SOCKETS option from the kernel > config ? > Right on the spot. Removing ZERO_COPY_SOCKETS stopped the panics. 
The panics restart if I add them again. Regards, -- Mario S F Ferreira - DF - Brazil - "I guess this is a signature." feature, n: a documented bug | bug, n: an undocumented feature From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 18:51:14 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 07CD51065672 for ; Fri, 1 Oct 2010 18:51:14 +0000 (UTC) (envelope-from dan@langille.org) Received: from nyi.unixathome.org (nyi.unixathome.org [64.147.113.42]) by mx1.freebsd.org (Postfix) with ESMTP id BC7F18FC1A for ; Fri, 1 Oct 2010 18:51:13 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id 1CF9B509A5 for ; Fri, 1 Oct 2010 19:51:13 +0100 (BST) X-Virus-Scanned: amavisd-new at unixathome.org Received: from nyi.unixathome.org ([127.0.0.1]) by localhost (nyi.unixathome.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Jz2Gjc73nKUC for ; Fri, 1 Oct 2010 19:51:12 +0100 (BST) Received: from nyi.unixathome.org (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id 3C36D509A3 for ; Fri, 1 Oct 2010 19:51:12 +0100 (BST) Received: from 68.64.144.221 (SquirrelMail authenticated user dan) by nyi.unixathome.org with HTTP; Fri, 1 Oct 2010 14:51:12 -0400 Message-ID: <45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> In-Reply-To: References: Date: Fri, 1 Oct 2010 14:51:12 -0400 From: "Dan Langille" To: freebsd-stable@freebsd.org User-Agent: SquirrelMail/1.4.20-RC2 MIME-Version: 1.0 Content-Type: text/plain;charset=iso-8859-1 Content-Transfer-Encoding: 8bit X-Priority: 3 (Normal) Importance: Normal Subject: Re: zfs send/receive: is this slow? 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 18:51:14 -0000

On Wed, September 29, 2010 2:04 pm, Dan Langille wrote:
> $ zpool iostat 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> storage     7.67T  5.02T    358     38  43.1M  1.96M
> storage     7.67T  5.02T    317    475  39.4M  30.9M
> storage     7.67T  5.02T    357    533  44.3M  34.4M
> storage     7.67T  5.02T    371    556  46.0M  35.8M
> storage     7.67T  5.02T    313    521  38.9M  28.7M
> storage     7.67T  5.02T    309    457  38.4M  30.4M
> storage     7.67T  5.02T    388    589  48.2M  37.8M
> storage     7.67T  5.02T    377    581  46.8M  36.5M
> storage     7.67T  5.02T    310    559  38.4M  30.4M
> storage     7.67T  5.02T    430    611  53.4M  41.3M

Now that I'm using mbuffer:

$ zpool iostat 10
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
storage     9.96T  2.73T  2.01K    131   151M  6.72M
storage     9.96T  2.73T    615    515  76.3M  33.5M
storage     9.96T  2.73T    360    492  44.7M  33.7M
storage     9.96T  2.73T    388    554  48.3M  38.4M
storage     9.96T  2.73T    403    562  50.1M  39.6M
storage     9.96T  2.73T    313    468  38.9M  28.0M
storage     9.96T  2.73T    462    677  57.3M  22.4M
storage     9.96T  2.73T    383    581  47.5M  21.6M
storage     9.96T  2.72T    142    571  17.7M  15.4M
storage     9.96T  2.72T     80    598  10.0M  18.8M
storage     9.96T  2.72T    718    503  89.1M  13.6M
storage     9.96T  2.72T    594    517  73.8M  14.1M
storage     9.96T  2.72T    367    528  45.6M  15.1M
storage     9.96T  2.72T    338    520  41.9M  16.4M
storage     9.96T  2.72T    348    499  43.3M  21.5M
storage     9.96T  2.72T    398    553  49.4M  14.4M
storage     9.96T  2.72T    346    481  43.0M  6.78M

If anything, it's slower.

The above was without -s 128. The following used that setting:

$ zpool iostat 10
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
storage     9.78T  2.91T  1.98K    137   149M  6.92M
storage     9.78T  2.91T    761    577  94.4M  42.6M
storage     9.78T  2.91T    462    411  57.4M  24.6M
storage     9.78T  2.91T    492    497  61.1M  27.6M
storage     9.78T  2.91T    632    446  78.5M  22.5M
storage     9.78T  2.91T    554    414  68.7M  21.8M
storage     9.78T  2.91T    459    434  57.0M  31.4M
storage     9.78T  2.91T    398    570  49.4M  32.7M
storage     9.78T  2.91T    338    495  41.9M  26.5M
storage     9.78T  2.91T    358    526  44.5M  33.3M
storage     9.78T  2.91T    385    555  47.8M  39.8M
storage     9.78T  2.91T    271    453  33.6M  23.3M
storage     9.78T  2.91T    270    456  33.5M  28.8M

From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 19:43:34 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0E2B106566B for ; Fri, 1 Oct 2010 19:43:34 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta07.emeryville.ca.mail.comcast.net (qmta07.emeryville.ca.mail.comcast.net [76.96.30.64]) by mx1.freebsd.org (Postfix) with ESMTP id B588F8FC0C for ; Fri, 1 Oct 2010 19:43:34 +0000 (UTC) Received: from omta12.emeryville.ca.mail.comcast.net ([76.96.30.44]) by qmta07.emeryville.ca.mail.comcast.net with comcast id DWsz1f0060x6nqcA7Xja16; Fri, 01 Oct 2010 19:43:34 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta12.emeryville.ca.mail.comcast.net with comcast id DXjZ1f0033LrwQ28YXjZuy; Fri, 01 Oct 2010 19:43:33 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 18A819B418; Fri, 1 Oct 2010 12:43:33 -0700 (PDT) Date: Fri, 1 Oct 2010 12:43:33 -0700 From: Jeremy Chadwick To: Dan Langille Message-ID: <20101001194333.GA51297@icarus.home.lan> References: <45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To:
<45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org Subject: Re: zfs send/receive: is this slow? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 19:43:34 -0000

On Fri, Oct 01, 2010 at 02:51:12PM -0400, Dan Langille wrote:
>
> On Wed, September 29, 2010 2:04 pm, Dan Langille wrote:
> > $ zpool iostat 10
> >                capacity     operations    bandwidth
> > pool         used  avail   read  write   read  write
> > ----------  -----  -----  -----  -----  -----  -----
> > storage     7.67T  5.02T    358     38  43.1M  1.96M
> > storage     7.67T  5.02T    317    475  39.4M  30.9M
> > storage     7.67T  5.02T    357    533  44.3M  34.4M
> > storage     7.67T  5.02T    371    556  46.0M  35.8M
> > storage     7.67T  5.02T    313    521  38.9M  28.7M
> > storage     7.67T  5.02T    309    457  38.4M  30.4M
> > storage     7.67T  5.02T    388    589  48.2M  37.8M
> > storage     7.67T  5.02T    377    581  46.8M  36.5M
> > storage     7.67T  5.02T    310    559  38.4M  30.4M
> > storage     7.67T  5.02T    430    611  53.4M  41.3M
>
> Now that I'm using mbuffer:
>
> $ zpool iostat 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> storage     9.96T  2.73T  2.01K    131   151M  6.72M
> storage     9.96T  2.73T    615    515  76.3M  33.5M
> storage     9.96T  2.73T    360    492  44.7M  33.7M
> storage     9.96T  2.73T    388    554  48.3M  38.4M
> storage     9.96T  2.73T    403    562  50.1M  39.6M
> storage     9.96T  2.73T    313    468  38.9M  28.0M
> storage     9.96T  2.73T    462    677  57.3M  22.4M
> storage     9.96T  2.73T    383    581  47.5M  21.6M
> storage     9.96T  2.72T    142    571  17.7M  15.4M
> storage     9.96T  2.72T     80    598  10.0M  18.8M
> storage     9.96T  2.72T    718    503  89.1M  13.6M
> storage     9.96T  2.72T    594    517  73.8M  14.1M
> storage     9.96T  2.72T    367    528  45.6M  15.1M
> storage     9.96T  2.72T    338    520  41.9M  16.4M
> storage     9.96T  2.72T    348    499  43.3M  21.5M
> storage     9.96T  2.72T    398    553  49.4M  14.4M
> storage     9.96T  2.72T    346    481  43.0M  6.78M
>
> If anything, it's slower.
>
> The above was without -s 128. The following used that setting:
>
> $ zpool iostat 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> storage     9.78T  2.91T  1.98K    137   149M  6.92M
> storage     9.78T  2.91T    761    577  94.4M  42.6M
> storage     9.78T  2.91T    462    411  57.4M  24.6M
> storage     9.78T  2.91T    492    497  61.1M  27.6M
> storage     9.78T  2.91T    632    446  78.5M  22.5M
> storage     9.78T  2.91T    554    414  68.7M  21.8M
> storage     9.78T  2.91T    459    434  57.0M  31.4M
> storage     9.78T  2.91T    398    570  49.4M  32.7M
> storage     9.78T  2.91T    338    495  41.9M  26.5M
> storage     9.78T  2.91T    358    526  44.5M  33.3M
> storage     9.78T  2.91T    385    555  47.8M  39.8M
> storage     9.78T  2.91T    271    453  33.6M  23.3M
> storage     9.78T  2.91T    270    456  33.5M  28.8M

For what it's worth, this mimics the behaviour I saw long ago when using flexbackup[1] (which used SSH) to back up numerous machines on our local gigE network. flexbackup strongly advocates use of mbuffer or afio to attempt to buffer I/O between source and destination.

What I witnessed was I/O rates that were either identical or worse (most of the time, worse) when mbuffer was used (regardless of what I chose for -s and -m).

I switched to rsnapshot (which uses rsync via SSH) for a lot of reasons which are outside of the scope of this topic. I don't care to get into a discussion about the I/O bottlenecks stock OpenSSH has (vs. one patched with the high-performance patches) either; the point is that mbuffer did absolutely nothing or made things worse. This[2] didn't impress me either.

[1]: http://www.edwinh.org/flexbackup/
[2]: http://www.edwinh.org/flexbackup/faq.html#Common%20problems4

--
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 21:56:42 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5AF021065672 for ; Fri, 1 Oct 2010 21:56:42 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-gw0-f54.google.com (mail-gw0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id 124658FC1B for ; Fri, 1 Oct 2010 21:56:41 +0000 (UTC) Received: by gwb15 with SMTP id 15so1653695gwb.13 for ; Fri, 01 Oct 2010 14:56:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=w6a3yxfmlPCGR0iAhodqDc9Gamw86bkaXPcR6NFxw8g=; b=u4fv0oS9vV8U4n/kwCVPo7DD0cexCtjLTSCT5TZQpDmWO6n0b937up1OIdRFbwaaM9 ECCXdoWnsPV5OFyFNDEXnuAmj8YAyQbBC2re+ZBCHcwjxcmSYFDKRf73iUdRdJTz1ySJ u4CbjJTMDLMQKn48862H5DUDet/wgYNRrJwCA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=sXWPVpduKNJ+6XaqEbiYa0SgmiJhO94u2dkLeqdWzQmCNR62zQe3hnrQFAgJytUFg9 xGA6VE5SqR0Sn6NTUikNEyHAvv5PKF9azL9cMQFP9PZ/mYoTDzKziCTgFva1JaBvcLsB vG0bLPbdyBXCMYg5R5Q8VpsIs6oLFjWgp0Fxg= MIME-Version: 1.0 Received: by 10.236.102.147 with SMTP id d19mr1526266yhg.69.1285970201271; Fri, 01 Oct 2010 14:56:41 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.220.176.77 with HTTP; Fri, 1 Oct 2010 14:56:41 -0700 (PDT) In-Reply-To: <45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> References: <45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> Date: Fri, 1 Oct 2010 14:56:41 -0700 X-Google-Sender-Auth: ela4voIjdbjswM2V8g7FeJPleRw Message-ID: From: Artem Belevich To: Dan Langille 
Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable@freebsd.org Subject: Re: zfs send/receive: is this slow? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 21:56:42 -0000 Hmm. It did help me a lot when I was replicating ~2TB worth of data over GigE. Without mbuffer things were roughly in the ballpark of your numbers. With mbuffer I've got around 100MB/s. Assuming that you have two boxes connected via ethernet, it would be good to check that nobody generates PAUSE frames. Some time back I've discovered that el-cheapo switch I've been using for some reason could not keep up with traffic bursts and generated tons of PAUSE frames that severely limited throughput. If you're using Intel adapters, check xon/xoff counters in "sysctl dev.em.0.mac_stats". If you see them increasing, that may explain slow speed. If you have a switch between your boxes, try bypassing it and connect boxes directly. 
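A rough way to act on the advice above, assuming an Intel em(4) adapter at unit 0 as in the sysctl name quoted in this message: snapshot the flow-control counters twice and print only the ones that moved. The function names (flow_counters, changed_counters, watch_pause_frames) and the 10-second default interval are made up for this sketch; only "sysctl dev.em.0.mac_stats" and the xon/xoff wording come from the thread.

```sh
#!/bin/sh
# Sketch only: helper names and defaults are assumptions, not part of
# any FreeBSD tool. Works on "name: value" lines as printed by sysctl.

# Keep only the flow-control (xon/xoff) lines from sysctl output on stdin.
flow_counters() {
    grep -i -E 'xon|xoff'
}

# Print every counter whose value differs between two snapshot files.
# $1 = snapshot taken first, $2 = snapshot taken later.
changed_counters() {
    awk -F': ' '
        NR == FNR { before[$1] = $2; next }   # first file: remember values
        ($1 in before) && $2 != before[$1] {  # second file: report changes
            printf "%s: %s -> %s\n", $1, before[$1], $2
        }
    ' "$1" "$2"
}

# Take two snapshots of the given sysctl subtree, INTERVAL seconds apart.
# $1 = sysctl subtree (default dev.em.0.mac_stats), $2 = seconds (default 10).
watch_pause_frames() {
    oid="${1:-dev.em.0.mac_stats}"
    interval="${2:-10}"
    snap_before=$(mktemp) || return 1
    snap_after=$(mktemp) || return 1
    sysctl "$oid" | flow_counters > "$snap_before"
    sleep "$interval"
    sysctl "$oid" | flow_counters > "$snap_after"
    changed_counters "$snap_before" "$snap_after"
    rm -f "$snap_before" "$snap_after"
}
```

Calling "watch_pause_frames dev.em.0.mac_stats 10" on each box while a transfer is running should print nothing if no xon/xoff counter changed during the interval; any line it does print is a counter worth looking at.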
--Artem

On Fri, Oct 1, 2010 at 11:51 AM, Dan Langille wrote:
>
> On Wed, September 29, 2010 2:04 pm, Dan Langille wrote:
>> $ zpool iostat 10
>>                capacity     operations    bandwidth
>> pool         used  avail   read  write   read  write
>> ----------  -----  -----  -----  -----  -----  -----
>> storage     7.67T  5.02T    358     38  43.1M  1.96M
>> storage     7.67T  5.02T    317    475  39.4M  30.9M
>> storage     7.67T  5.02T    357    533  44.3M  34.4M
>> storage     7.67T  5.02T    371    556  46.0M  35.8M
>> storage     7.67T  5.02T    313    521  38.9M  28.7M
>> storage     7.67T  5.02T    309    457  38.4M  30.4M
>> storage     7.67T  5.02T    388    589  48.2M  37.8M
>> storage     7.67T  5.02T    377    581  46.8M  36.5M
>> storage     7.67T  5.02T    310    559  38.4M  30.4M
>> storage     7.67T  5.02T    430    611  53.4M  41.3M
>
> Now that I'm using mbuffer:
>
> $ zpool iostat 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> storage     9.96T  2.73T  2.01K    131   151M  6.72M
> storage     9.96T  2.73T    615    515  76.3M  33.5M
> storage     9.96T  2.73T    360    492  44.7M  33.7M
> storage     9.96T  2.73T    388    554  48.3M  38.4M
> storage     9.96T  2.73T    403    562  50.1M  39.6M
> storage     9.96T  2.73T    313    468  38.9M  28.0M
> storage     9.96T  2.73T    462    677  57.3M  22.4M
> storage     9.96T  2.73T    383    581  47.5M  21.6M
> storage     9.96T  2.72T    142    571  17.7M  15.4M
> storage     9.96T  2.72T     80    598  10.0M  18.8M
> storage     9.96T  2.72T    718    503  89.1M  13.6M
> storage     9.96T  2.72T    594    517  73.8M  14.1M
> storage     9.96T  2.72T    367    528  45.6M  15.1M
> storage     9.96T  2.72T    338    520  41.9M  16.4M
> storage     9.96T  2.72T    348    499  43.3M  21.5M
> storage     9.96T  2.72T    398    553  49.4M  14.4M
> storage     9.96T  2.72T    346    481  43.0M  6.78M
>
> If anything, it's slower.
>
> The above was without -s 128.  The following used that setting:
>
> $ zpool iostat 10
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> storage     9.78T  2.91T  1.98K    137   149M  6.92M
> storage     9.78T  2.91T    761    577  94.4M  42.6M
> storage     9.78T  2.91T    462    411  57.4M  24.6M
> storage     9.78T  2.91T    492    497  61.1M  27.6M
> storage     9.78T  2.91T    632    446  78.5M  22.5M
> storage     9.78T  2.91T    554    414  68.7M  21.8M
> storage     9.78T  2.91T    459    434  57.0M  31.4M
> storage     9.78T  2.91T    398    570  49.4M  32.7M
> storage     9.78T  2.91T    338    495  41.9M  26.5M
> storage     9.78T  2.91T    358    526  44.5M  33.3M
> storage     9.78T  2.91T    385    555  47.8M  39.8M
> storage     9.78T  2.91T    271    453  33.6M  23.3M
> storage     9.78T  2.91T    270    456  33.5M  28.8M
>
>
> _______________________________________________
> freebsd-stable@freebsd.org mailing list >
http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 23:00:16 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1987A106566B for ; Fri, 1 Oct 2010 23:00:16 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id E18C18FC1F for ; Fri, 1 Oct 2010 23:00:14 +0000 (UTC) Received: by qwd6 with SMTP id 6so2154038qwd.13 for ; Fri, 01 Oct 2010 16:00:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=sCXAY6tIx6/OL29Y7iATwJ63wKUXk4/VUhVJV/Ygs1g=; b=PUgfmbIlWO7Fp7K16dLlNekQRT3xffbL2oa5sRBtIySs28RP1hS0ffyRP36NK3MMwz VACHs+csKYeLvGaQcCJs0B626EP47HkDFs/LdSwsCOL7ZzamOgWNoiWuU4hiF1vaUkd7 uNcwIIHZRxnratjwSOuW6LUCi3Yg9borXv9hA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=WkPQ2EbB/ckyS5k12wcGJQwjHC3v5kWvT9K2hlCkDq80Sk7qHJzB/OUiuiAoHo9x1w eYrwmUluyAB6BagePKiOXj69uH8gdS4g8ZMT5OzZDXNsyB+0GLfgY8aJ95SJSC+aHIck sfHgAHaudlSrACXfHuEtzTfTuTId+SY/46dMw= MIME-Version: 1.0 Received: by 10.220.63.15 with SMTP id z15mr1453796vch.70.1285974013979; Fri, 01 Oct 2010 16:00:13 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.220.176.77 with HTTP; Fri, 1 Oct 2010 16:00:13 -0700 (PDT) In-Reply-To: <4C511EF8-591C-4BB9-B7AA-30D5C3DDC0FF@langille.org> References: <45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> 
<4C511EF8-591C-4BB9-B7AA-30D5C3DDC0FF@langille.org> Date: Fri, 1 Oct 2010 16:00:13 -0700 X-Google-Sender-Auth: aVwa1-MUmdu9FqDXfCCWBCNQlWk Message-ID: From: Artem Belevich To: Dan Langille Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: "freebsd-stable@freebsd.org" Subject: Re: zfs send/receive: is this slow? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 23:00:16 -0000 On Fri, Oct 1, 2010 at 3:49 PM, Dan Langille wrote: > FYI: this is all on the same box. In one of the previous emails you've used this command line: > # mbuffer -s 128k -m 1G -I 9090 | zfs receive You've used mbuffer in network client mode. I assumed that you did do your transfer over network. If you're running send/receive locally just pipe the data through mbuffer -- zfs send|mbuffer|zfs receive --Artem > > -- > Dan Langille > http://langille.org/ > > > On Oct 1, 2010, at 5:56 PM, Artem Belevich wrote: > >> Hmm. It did help me a lot when I was replicating ~2TB worth of data >> over GigE. Without mbuffer things were roughly in the ballpark of your >> numbers. With mbuffer I've got around 100MB/s. >> >> Assuming that you have two boxes connected via ethernet, it would be >> good to check that nobody generates PAUSE frames. Some time back I've >> discovered that el-cheapo switch I've been using for some reason could >> not keep up with traffic bursts and generated tons of PAUSE frames >> that severely limited throughput. >> >> If you're using Intel adapters, check xon/xoff counters in "sysctl >> dev.em.0.mac_stats". If you see them increasing, that may explain slow >> speed. >> If you have a switch between your boxes, try bypassing it and connect >> boxes directly. 
>>
>> --Artem
>>
>>
>>
>> On Fri, Oct 1, 2010 at 11:51 AM, Dan Langille wrote:
>>>
>>> On Wed, September 29, 2010 2:04 pm, Dan Langille wrote:
>>>> $ zpool iostat 10
>>>>                capacity     operations    bandwidth
>>>> pool         used  avail   read  write   read  write
>>>> ----------  -----  -----  -----  -----  -----  -----
>>>> storage     7.67T  5.02T    358     38  43.1M  1.96M
>>>> storage     7.67T  5.02T    317    475  39.4M  30.9M
>>>> storage     7.67T  5.02T    357    533  44.3M  34.4M
>>>> storage     7.67T  5.02T    371    556  46.0M  35.8M
>>>> storage     7.67T  5.02T    313    521  38.9M  28.7M
>>>> storage     7.67T  5.02T    309    457  38.4M  30.4M
>>>> storage     7.67T  5.02T    388    589  48.2M  37.8M
>>>> storage     7.67T  5.02T    377    581  46.8M  36.5M
>>>> storage     7.67T  5.02T    310    559  38.4M  30.4M
>>>> storage     7.67T  5.02T    430    611  53.4M  41.3M
>>>
>>> Now that I'm using mbuffer:
>>>
>>> $ zpool iostat 10
>>>                capacity     operations    bandwidth
>>> pool         used  avail   read  write   read  write
>>> ----------  -----  -----  -----  -----  -----  -----
>>> storage     9.96T  2.73T  2.01K    131   151M  6.72M
>>> storage     9.96T  2.73T    615    515  76.3M  33.5M
>>> storage     9.96T  2.73T    360    492  44.7M  33.7M
>>> storage     9.96T  2.73T    388    554  48.3M  38.4M
>>> storage     9.96T  2.73T    403    562  50.1M  39.6M
>>> storage     9.96T  2.73T    313    468  38.9M  28.0M
>>> storage     9.96T  2.73T    462    677  57.3M  22.4M
>>> storage     9.96T  2.73T    383    581  47.5M  21.6M
>>> storage     9.96T  2.72T    142    571  17.7M  15.4M
>>> storage     9.96T  2.72T     80    598  10.0M  18.8M
>>> storage     9.96T  2.72T    718    503  89.1M  13.6M
>>> storage     9.96T  2.72T    594    517  73.8M  14.1M
>>> storage     9.96T  2.72T    367    528  45.6M  15.1M
>>> storage     9.96T  2.72T    338    520  41.9M  16.4M
>>> storage     9.96T  2.72T    348    499  43.3M  21.5M
>>> storage     9.96T  2.72T    398    553  49.4M  14.4M
>>> storage     9.96T  2.72T    346    481  43.0M  6.78M
>>>
>>> If anything, it's slower.
>>>
>>> The above was without -s 128.  The following used that setting:
>>>
>>> $ zpool iostat 10
>>>                capacity     operations    bandwidth
>>> pool         used  avail   read  write   read  write
>>> ----------  -----  -----  -----  -----  -----  -----
>>> storage     9.78T  2.91T  1.98K    137   149M  6.92M
>>> storage     9.78T  2.91T    761    577  94.4M  42.6M
>>> storage     9.78T  2.91T    462    411  57.4M  24.6M
>>> storage     9.78T  2.91T    492    497  61.1M  27.6M
>>> storage     9.78T  2.91T    632    446  78.5M  22.5M
>>> storage     9.78T  2.91T    554    414  68.7M  21.8M
>>> storage     9.78T  2.91T    459    434  57.0M  31.4M
>>> storage     9.78T  2.91T    398    570  49.4M  32.7M
>>> storage     9.78T  2.91T    338    495  41.9M  26.5M
>>> storage     9.78T  2.91T    358    526  44.5M  33.3M
>>> storage     9.78T  2.91T    385    555  47.8M  39.8M
>>> storage     9.78T  2.91T    271    453  33.6M  23.3M
>>> storage     9.78T  2.91T    270    456  33.5M  28.8M
>>>
>>>
>>> _______________________________________________
>>> freebsd-stable@freebsd.org mailing list
>>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>>>
>>
>
From owner-freebsd-stable@FreeBSD.ORG Fri Oct 1 23:02:34 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E3549106566B for ; Fri, 1 Oct 2010 23:02:34 +0000 (UTC) (envelope-from dan@langille.org) Received: from schemailmta04.cingularme.com (schemailmta04.cingularme.com [209.183.37.58]) by mx1.freebsd.org (Postfix) with ESMTP id A22C48FC0A for ; Fri, 1 Oct 2010 23:02:34 +0000 (UTC) Received: from [10.113.171.20] (really [172.16.130.170]) by schemailmta05.cingularme.com (InterMail vM.6.01.04.00 201-2131-118-20041027) with ESMTP id <20101001224938.SMOZ2109.schemailmta05.cingularme.com@[10.113.171.20]>; Fri, 1 Oct 2010 17:49:38 -0500 References: <45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> In-Reply-To: Mime-Version: 1.0 (iPhone Mail 8B117) Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii Message-Id: <4C511EF8-591C-4BB9-B7AA-30D5C3DDC0FF@langille.org> X-Mailer: iPhone Mail (8B117) From: Dan Langille Date: Fri, 1 Oct 2010 18:49:17 -0400 To: Artem Belevich X-Cloudmark-Analysis: v=1.0 c=1 a=QmPrZqdC1joA:10 a=kj9zAlcOel0A:10 a=MBIA0dz7AAAA:8 a=6I5d2MoRAAAA:8 a=7pBNprJxB9wivWM7RxUA:9 a=h-A2-4QYhiGPzBT9tXYA:7 a=Qu1MvQ0wV3ggC2VjVtHLjQMgLzIA:4 a=CjuIK1q_8ugA:10 a=9ZX6gfnTnSoA:10 a=SV7veod9ZcQA:10 Cc: "freebsd-stable@freebsd.org" Subject: Re: zfs send/receive: is this slow?
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 01 Oct 2010 23:02:35 -0000 FYI: this is all on the same box. -- Dan Langille http://langille.org/ On Oct 1, 2010, at 5:56 PM, Artem Belevich wrote: > Hmm. It did help me a lot when I was replicating ~2TB worth of data > over GigE. Without mbuffer things were roughly in the ballpark of your > numbers. With mbuffer I've got around 100MB/s. > > Assuming that you have two boxes connected via ethernet, it would be > good to check that nobody generates PAUSE frames. Some time back I've > discovered that el-cheapo switch I've been using for some reason could > not keep up with traffic bursts and generated tons of PAUSE frames > that severely limited throughput. > > If you're using Intel adapters, check xon/xoff counters in "sysctl > dev.em.0.mac_stats". If you see them increasing, that may explain slow > speed. > If you have a switch between your boxes, try bypassing it and connect > boxes directly. 
>
> --Artem
>
>
>
> On Fri, Oct 1, 2010 at 11:51 AM, Dan Langille wrote:
>>
>> On Wed, September 29, 2010 2:04 pm, Dan Langille wrote:
>>> $ zpool iostat 10
>>>                capacity     operations    bandwidth
>>> pool         used  avail   read  write   read  write
>>> ----------  -----  -----  -----  -----  -----  -----
>>> storage     7.67T  5.02T    358     38  43.1M  1.96M
>>> storage     7.67T  5.02T    317    475  39.4M  30.9M
>>> storage     7.67T  5.02T    357    533  44.3M  34.4M
>>> storage     7.67T  5.02T    371    556  46.0M  35.8M
>>> storage     7.67T  5.02T    313    521  38.9M  28.7M
>>> storage     7.67T  5.02T    309    457  38.4M  30.4M
>>> storage     7.67T  5.02T    388    589  48.2M  37.8M
>>> storage     7.67T  5.02T    377    581  46.8M  36.5M
>>> storage     7.67T  5.02T    310    559  38.4M  30.4M
>>> storage     7.67T  5.02T    430    611  53.4M  41.3M
>>
>> Now that I'm using mbuffer:
>>
>> $ zpool iostat 10
>>                capacity     operations    bandwidth
>> pool         used  avail   read  write   read  write
>> ----------  -----  -----  -----  -----  -----  -----
>> storage     9.96T  2.73T  2.01K    131   151M  6.72M
>> storage     9.96T  2.73T    615    515  76.3M  33.5M
>> storage     9.96T  2.73T    360    492  44.7M  33.7M
>> storage     9.96T  2.73T    388    554  48.3M  38.4M
>> storage     9.96T  2.73T    403    562  50.1M  39.6M
>> storage     9.96T  2.73T    313    468  38.9M  28.0M
>> storage     9.96T  2.73T    462    677  57.3M  22.4M
>> storage     9.96T  2.73T    383    581  47.5M  21.6M
>> storage     9.96T  2.72T    142    571  17.7M  15.4M
>> storage     9.96T  2.72T     80    598  10.0M  18.8M
>> storage     9.96T  2.72T    718    503  89.1M  13.6M
>> storage     9.96T  2.72T    594    517  73.8M  14.1M
>> storage     9.96T  2.72T    367    528  45.6M  15.1M
>> storage     9.96T  2.72T    338    520  41.9M  16.4M
>> storage     9.96T  2.72T    348    499  43.3M  21.5M
>> storage     9.96T  2.72T    398    553  49.4M  14.4M
>> storage     9.96T  2.72T    346    481  43.0M  6.78M
>>
>> If anything, it's slower.
>>
>> The above was without -s 128. The following used that setting:
>>
>> $ zpool iostat 10
>>                capacity     operations    bandwidth
>> pool         used  avail   read  write   read  write
>> ----------  -----  -----  -----  -----  -----  -----
>> storage     9.78T  2.91T  1.98K    137   149M  6.92M
>> storage     9.78T  2.91T    761    577  94.4M  42.6M
>> storage     9.78T  2.91T    462    411  57.4M  24.6M
>> storage     9.78T  2.91T    492    497  61.1M  27.6M
>> storage     9.78T  2.91T    632    446  78.5M  22.5M
>> storage     9.78T  2.91T    554    414  68.7M  21.8M
>> storage     9.78T  2.91T    459    434  57.0M  31.4M
>> storage     9.78T  2.91T    398    570  49.4M  32.7M
>> storage     9.78T  2.91T    338    495  41.9M  26.5M
>> storage     9.78T  2.91T    358    526  44.5M  33.3M
>> storage     9.78T  2.91T    385    555  47.8M  39.8M
>> storage     9.78T  2.91T    271    453  33.6M  23.3M
>> storage     9.78T  2.91T    270    456  33.5M  28.8M
>>
>>
>> _______________________________________________
>> freebsd-stable@freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>>
>
From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 00:32:47 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8EAD6106566C for ; Sat, 2 Oct 2010 00:32:47 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (adsl-75-1-14-242.dsl.scrm01.sbcglobal.net [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 5EDA38FC08 for ; Sat, 2 Oct 2010 00:32:47 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id o920WZKG028379; Fri, 1 Oct 2010 17:32:39 -0700 (PDT) (envelope-from truckman@FreeBSD.org) Message-Id: <201010020032.o920WZKG028379@gw.catspoiler.org> Date: Fri, 1 Oct 2010 17:32:35 -0700 (PDT) From: Don Lewis To: avg@icyb.net.ua In-Reply-To: <201009300849.o8U8nr8r081019@gw.catspoiler.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: stable@FreeBSD.org,
sterling@camdensoftware.com, freebsd@jdc.parodius.com Subject: Re: CPU time accounting broken on 8-STABLE machine after a few hours of uptime X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 00:32:47 -0000

On 30 Sep, Don Lewis wrote:

> The silent reboots that I was seeing with WITNESS go away if I add
> WITNESS_SKIPSPIN. Witness doesn't complain about anything.

I've tracked down the silent reboot problem. It happens when a userland sysctl call gets down into calcru1(), which tries to print a "calcru: .." message. Eventually sc_puts() wants to grab a spin lock, which causes a call to witness, which detects a lock order reversal. This recurses into printf(), which dives back into the console code and eventually triggers a panic. I'm still gathering the details on this and I'll see what I can come up with for a fix.

> I tested -CURRENT and !SMP seems to work ok. One difference in terms of
> hardware between the two tests is that I'm using a SATA drive when
> testing -STABLE and a SCSI drive when testing -CURRENT.

I'm not able to trigger the problem with -CURRENT when it is running on a SCSI drive, but I do see the freezes, long ping RTTs, and ntp insanity when running a !SMP -CURRENT kernel on my SATA drive with an 8.1-STABLE world.
From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 01:32:48 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8D790106564A for ; Sat, 2 Oct 2010 01:32:48 +0000 (UTC) (envelope-from dan@langille.org) Received: from nyi.unixathome.org (nyi.unixathome.org [64.147.113.42]) by mx1.freebsd.org (Postfix) with ESMTP id 51B4F8FC0A for ; Sat, 2 Oct 2010 01:32:47 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id 7EFCC509A8; Sat, 2 Oct 2010 02:32:46 +0100 (BST) X-Virus-Scanned: amavisd-new at unixathome.org Received: from nyi.unixathome.org ([127.0.0.1]) by localhost (nyi.unixathome.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id IbRvDlQGMV5q; Sat, 2 Oct 2010 02:32:46 +0100 (BST) Received: from smtp-auth.unixathome.org (smtp-auth.unixathome.org [10.4.7.7]) (Authenticated sender: hidden) by nyi.unixathome.org (Postfix) with ESMTPSA id 2BCE9508AD ; Sat, 2 Oct 2010 02:32:46 +0100 (BST) Message-ID: <4CA68BBD.6060601@langille.org> Date: Fri, 01 Oct 2010 21:32:45 -0400 From: Dan Langille Organization: The FreeBSD Diary User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Artem Belevich , freebsd-stable References: <45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> <4C511EF8-591C-4BB9-B7AA-30D5C3DDC0FF@langille.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: zfs send/receive: is this slow? 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 01:32:48 -0000 On 10/1/2010 7:00 PM, Artem Belevich wrote: > On Fri, Oct 1, 2010 at 3:49 PM, Dan Langille wrote: >> FYI: this is all on the same box. > > In one of the previous emails you've used this command line: >> # mbuffer -s 128k -m 1G -I 9090 | zfs receive > > You've used mbuffer in network client mode. I assumed that you did do > your transfer over network. > > If you're running send/receive locally just pipe the data through > mbuffer -- zfs send|mbuffer|zfs receive As soon as I opened this email I knew what it would say. # time zfs send storage/bacula@transfer | mbuffer | zfs receive storage/compressed/bacula-mbuffer in @ 197 MB/s, out @ 205 MB/s, 1749 MB total, buffer 0% full $ zpool iostat 10 10 capacity operations bandwidth pool used avail read write read write ---------- ----- ----- ----- ----- ----- ----- storage 9.78T 2.91T 1.11K 336 92.0M 17.3M storage 9.78T 2.91T 769 436 95.5M 30.5M storage 9.78T 2.91T 797 853 98.9M 78.5M storage 9.78T 2.91T 865 962 107M 78.0M storage 9.78T 2.91T 828 881 103M 82.6M storage 9.78T 2.90T 1023 1.12K 127M 91.0M storage 9.78T 2.90T 1.01K 1.01K 128M 89.3M storage 9.79T 2.90T 962 1.08K 119M 89.1M storage 9.79T 2.90T 1.09K 1.25K 139M 67.8M Big difference. :) > > --Artem > >> >> -- >> Dan Langille >> http://langille.org/ >> >> >> On Oct 1, 2010, at 5:56 PM, Artem Belevich wrote: >> >>> Hmm. It did help me a lot when I was replicating ~2TB worth of data >>> over GigE. Without mbuffer things were roughly in the ballpark of your >>> numbers. With mbuffer I've got around 100MB/s. >>> >>> Assuming that you have two boxes connected via ethernet, it would be >>> good to check that nobody generates PAUSE frames. 
Some time back I've >>> discovered that el-cheapo switch I've been using for some reason could >>> not keep up with traffic bursts and generated tons of PAUSE frames >>> that severely limited throughput. >>> >>> If you're using Intel adapters, check xon/xoff counters in "sysctl >>> dev.em.0.mac_stats". If you see them increasing, that may explain slow >>> speed. >>> If you have a switch between your boxes, try bypassing it and connect >>> boxes directly. >>> >>> --Artem >>> >>> >>> >>> On Fri, Oct 1, 2010 at 11:51 AM, Dan Langille wrote: >>>> >>>> On Wed, September 29, 2010 2:04 pm, Dan Langille wrote: >>>>> $ zpool iostat 10 >>>>> capacity operations bandwidth >>>>> pool used avail read write read write >>>>> ---------- ----- ----- ----- ----- ----- ----- >>>>> storage 7.67T 5.02T 358 38 43.1M 1.96M >>>>> storage 7.67T 5.02T 317 475 39.4M 30.9M >>>>> storage 7.67T 5.02T 357 533 44.3M 34.4M >>>>> storage 7.67T 5.02T 371 556 46.0M 35.8M >>>>> storage 7.67T 5.02T 313 521 38.9M 28.7M >>>>> storage 7.67T 5.02T 309 457 38.4M 30.4M >>>>> storage 7.67T 5.02T 388 589 48.2M 37.8M >>>>> storage 7.67T 5.02T 377 581 46.8M 36.5M >>>>> storage 7.67T 5.02T 310 559 38.4M 30.4M >>>>> storage 7.67T 5.02T 430 611 53.4M 41.3M >>>> >>>> Now that I'm using mbuffer: >>>> >>>> $ zpool iostat 10 >>>> capacity operations bandwidth >>>> pool used avail read write read write >>>> ---------- ----- ----- ----- ----- ----- ----- >>>> storage 9.96T 2.73T 2.01K 131 151M 6.72M >>>> storage 9.96T 2.73T 615 515 76.3M 33.5M >>>> storage 9.96T 2.73T 360 492 44.7M 33.7M >>>> storage 9.96T 2.73T 388 554 48.3M 38.4M >>>> storage 9.96T 2.73T 403 562 50.1M 39.6M >>>> storage 9.96T 2.73T 313 468 38.9M 28.0M >>>> storage 9.96T 2.73T 462 677 57.3M 22.4M >>>> storage 9.96T 2.73T 383 581 47.5M 21.6M >>>> storage 9.96T 2.72T 142 571 17.7M 15.4M >>>> storage 9.96T 2.72T 80 598 10.0M 18.8M >>>> storage 9.96T 2.72T 718 503 89.1M 13.6M >>>> storage 9.96T 2.72T 594 517 73.8M 14.1M >>>> storage 9.96T 2.72T 367 528 
45.6M 15.1M >>>> storage 9.96T 2.72T 338 520 41.9M 16.4M >>>> storage 9.96T 2.72T 348 499 43.3M 21.5M >>>> storage 9.96T 2.72T 398 553 49.4M 14.4M >>>> storage 9.96T 2.72T 346 481 43.0M 6.78M >>>> >>>> If anything, it's slower. >>>> >>>> The above was without -s 128. The following used that setting: >>>> >>>> $ zpool iostat 10 >>>> capacity operations bandwidth >>>> pool used avail read write read write >>>> ---------- ----- ----- ----- ----- ----- ----- >>>> storage 9.78T 2.91T 1.98K 137 149M 6.92M >>>> storage 9.78T 2.91T 761 577 94.4M 42.6M >>>> storage 9.78T 2.91T 462 411 57.4M 24.6M >>>> storage 9.78T 2.91T 492 497 61.1M 27.6M >>>> storage 9.78T 2.91T 632 446 78.5M 22.5M >>>> storage 9.78T 2.91T 554 414 68.7M 21.8M >>>> storage 9.78T 2.91T 459 434 57.0M 31.4M >>>> storage 9.78T 2.91T 398 570 49.4M 32.7M >>>> storage 9.78T 2.91T 338 495 41.9M 26.5M >>>> storage 9.78T 2.91T 358 526 44.5M 33.3M >>>> storage 9.78T 2.91T 385 555 47.8M 39.8M >>>> storage 9.78T 2.91T 271 453 33.6M 23.3M >>>> storage 9.78T 2.91T 270 456 33.5M 28.8M >>>> >>>> >>>> _______________________________________________ >>>> freebsd-stable@freebsd.org mailing list >>>> http://lists.freebsd.org/mailman/listinfo/freebsd-stable >>>> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" >>>> >>> >> > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" > -- Dan Langille - http://langille.org/ From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 01:43:19 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F3FA71065670 for ; Sat, 2 Oct 2010 01:43:18 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com [209.85.216.175]) by mx1.freebsd.org 
(Postfix) with ESMTP id A18BE8FC1A for ; Sat, 2 Oct 2010 01:43:18 +0000 (UTC) Received: by qyk8 with SMTP id 8so47712qyk.13 for ; Fri, 01 Oct 2010 18:43:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:sender:received :in-reply-to:references:date:x-google-sender-auth:message-id:subject :from:to:cc:content-type:content-transfer-encoding; bh=3WXaW0LJDHjhW5mR857r7RpxoUacsxzNPoYkyIZ+aM0=; b=e7k4N8uzl5Z+ai5/fCSUJee30jRXD0nRSf3a5e4IleXt4ke1iDL8boqysBmkVl/oOG y22QGdf41aRNvh0zKMk1ZGZ3j7GGqtnT6Y9dlLyPghxQk2xcvObRLUsKoX6ieMVOBSlV /GyqZ6jA5XoVWwJMiy8UZfiij0wXW3SiKTWVM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=gyntrWQuP72ZzXEbmOQwRpjB0f5vjiGlv4jWikHvBwAGTiBunapiHdQMizVVDLFg6E ESogij02Zegs6TYaAtUNklE6jfKN/u+oKmp13xATbR+eccE/ZK45i3O7XIuj0lurgAHN rMbOAEdJaCKddRVh4yiyWq4FUQYueDegsPen4= MIME-Version: 1.0 Received: by 10.220.57.15 with SMTP id a15mr1589769vch.142.1285983795447; Fri, 01 Oct 2010 18:43:15 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.220.176.77 with HTTP; Fri, 1 Oct 2010 18:43:15 -0700 (PDT) In-Reply-To: <4CA68BBD.6060601@langille.org> References: <45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> <4C511EF8-591C-4BB9-B7AA-30D5C3DDC0FF@langille.org> <4CA68BBD.6060601@langille.org> Date: Fri, 1 Oct 2010 18:43:15 -0700 X-Google-Sender-Auth: 1MRLxlT7OdslGm4_fsfC7xqEdOs Message-ID: From: Artem Belevich To: Dan Langille Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable Subject: Re: zfs send/receive: is this slow? 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 01:43:19 -0000 > As soon as I opened this email I knew what it would say. > > > # time zfs send storage/bacula@transfer | mbuffer | zfs receive > storage/compressed/bacula-mbuffer > in @ 197 MB/s, out @ 205 MB/s, 1749 MB total, buffer 0% full ... > Big difference. :) I'm glad it helped. Does anyone know why sending/receiving stuff via loopback is so much slower compared to pipe? --Artem From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 02:58:47 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 08634106564A for ; Sat, 2 Oct 2010 02:58:46 +0000 (UTC) (envelope-from jamesbrandongooch@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id 569AF8FC12 for ; Sat, 2 Oct 2010 02:58:45 +0000 (UTC) Received: by wwb17 with SMTP id 17so4846335wwb.31 for ; Fri, 01 Oct 2010 19:58:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=BppFsgJFemV+PdKx6XgHtVK2PQVePkDmXWzKd6ySCYg=; b=oNQf4sYR6JrK6OxCwBjV+sNm+OCMMeI+0Uk/wH96ct0us34wQEdVDC6lfh1ZhyC/2R SFV1agR1X4QkUNBU2B0qqHw4YuzdJ1AnpRWMj9LXiOL14l9kU0we8xXYwKNBfCucAArd hiUBcVI8ofdHt/SJYTJYJPHkWCWP2moZYl4Bk= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=LRBy0pMJL6cep05z5B4d309gqDa/mjXZQsSMrqvQCJ0LLJ5cWUL9mLMJxCmTUOCZj4 
fHDxbHJ3sKn7KjHkaZV0uUamgmz8JO3BqaVvvB6FBQihs6RZ76dsH0iAel1u4FXbivEH 2PAhNoR8aLkLF3W66tGCxbNPUPJarKecHuzvA= MIME-Version: 1.0 Received: by 10.216.23.206 with SMTP id v56mr2798831wev.67.1285986684347; Fri, 01 Oct 2010 19:31:24 -0700 (PDT) Received: by 10.216.133.133 with HTTP; Fri, 1 Oct 2010 19:31:24 -0700 (PDT) In-Reply-To: References: <45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> <4C511EF8-591C-4BB9-B7AA-30D5C3DDC0FF@langille.org> <4CA68BBD.6060601@langille.org> Date: Fri, 1 Oct 2010 21:31:24 -0500 Message-ID: From: Brandon Gooch To: Artem Belevich Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-stable , Dan Langille Subject: Re: zfs send/receive: is this slow? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 02:58:47 -0000 On Fri, Oct 1, 2010 at 8:43 PM, Artem Belevich wrote: >> As soon as I opened this email I knew what it would say. >> >> >> # time zfs send storage/bacula@transfer | mbuffer | zfs receive >> storage/compressed/bacula-mbuffer >> in @ 197 MB/s, out @ 205 MB/s, 1749 MB total, buffer 0% full > ... >> Big difference. :) > > I'm glad it helped. > > Does anyone know why sending/receiving stuff via loopback is so much > slower compared to pipe? 
This may shed some light on that topic: http://lists.freebsd.org/pipermail/freebsd-current/2010-September/019877.html From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 03:07:28 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 54B5D106564A for ; Sat, 2 Oct 2010 03:07:28 +0000 (UTC) (envelope-from sean@gothic.net.au) Received: from visi.gothic.net.au (visi.gothic.net.au [115.64.131.102]) by mx1.freebsd.org (Postfix) with ESMTP id AA8818FC08 for ; Sat, 2 Oct 2010 03:07:27 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by visi.gothic.net.au (Postfix) with SMTP id 8C0CB12ADE for ; Sat, 2 Oct 2010 12:51:19 +1000 (EST) Received: from visi.gothic.net.au (localhost [127.0.0.1]) by visi.gothic.net.au (Postfix) with ESMTP id F2A0312ADB; Sat, 2 Oct 2010 12:51:18 +1000 (EST) X-Virus-Scanned: amavisd-new at gothic.net.au Received: from visi.gothic.net.au ([127.0.0.1]) by visi.gothic.net.au (visi.gothic.net.au [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id Q68LpTNgnQoE; Sat, 2 Oct 2010 12:51:09 +1000 (EST) Received: from sean-mbookpro.gothic.net.au (sean-mbookpro.gothic.net.au [10.168.1.40]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) (Authenticated sender: sean) by visi.gothic.net.au (Postfix) with ESMTPSA id 74CAB12AD4; Sat, 2 Oct 2010 12:51:09 +1000 (EST) Mime-Version: 1.0 (Apple Message framework v1081) Content-Type: text/plain; charset=us-ascii From: Sean In-Reply-To: Date: Sat, 2 Oct 2010 12:51:09 +1000 Content-Transfer-Encoding: quoted-printable Message-Id: References: <45cfd27021fb93f9b0877a1596089776.squirrel@nyi.unixathome.org> <4C511EF8-591C-4BB9-B7AA-30D5C3DDC0FF@langille.org> <4CA68BBD.6060601@langille.org> To: Artem Belevich X-Mailer: Apple Mail (2.1081) Cc: freebsd-stable , Dan Langille Subject: Re: zfs send/receive: is this slow? 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 03:07:28 -0000 On 02/10/2010, at 11:43 AM, Artem Belevich wrote: >> As soon as I opened this email I knew what it would say. >> >> >> # time zfs send storage/bacula@transfer | mbuffer | zfs receive >> storage/compressed/bacula-mbuffer >> in @ 197 MB/s, out @ 205 MB/s, 1749 MB total, buffer 0% full > .. >> Big difference. :) > > I'm glad it helped. > > Does anyone know why sending/receiving stuff via loopback is so much > slower compared to pipe? The data goes up and down the entire network stack, and in and out of TCP buffers at both ends, which might add some overhead; other factors may limit it as well. Increasing TCP buffers and disabling delayed ACKs might help. Nagle might also have to be disabled. (Delayed ACKs and Nagle in combination can interact in odd ways.) > > --Artem > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 07:37:41 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E70921065673 for ; Sat, 2 Oct 2010 07:37:41 +0000 (UTC) (envelope-from telbizov@gmail.com) Received: from mail-qy0-f175.google.com (mail-qy0-f175.google.com [209.85.216.175]) by mx1.freebsd.org (Postfix) with ESMTP id 9C54E8FC0A for ; Sat, 2 Oct 2010 07:37:41 +0000 (UTC) Received: by qyk8 with SMTP id 8so318020qyk.13 for ; Sat, 02 Oct 2010 00:37:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; 
h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=TBzSi9/4I0v85UDULLj2CkV6Lm1SlFvlVhGAfs5KPqY=; b=Ymul3nSZUCmWluB6z7sT1tS9E1Kw9UfgdT5lVT34cmqwYdiPywXMkNjE42c7P++h0o 61G1hrFvDD2ZV4qG+4oP++hBeSBaWJp0KugDV85xZufrITT1X5I37RqPiayvKi45EPa8 zjiA6dYTD2GqiToxRsyFL7NH0gG8H2c3nnp8c= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=VVQnMoScXr7YBuCDDjJj6qI1fT/pQITF0PktKPhVPIOVavgSbubm0IXHXg41BL7c8U 69t2qF21VZI75ECInf5IY1DQ4g58KL63Hxe9Y/ECsTgCn1PBU/+5EIKGYyIbzNfV9cSp mu1ywoYZev0WI9gbcJe7+MtCjxxPj/wq/MX60= MIME-Version: 1.0 Received: by 10.229.96.16 with SMTP id f16mr4672720qcn.255.1286003212105; Sat, 02 Oct 2010 00:06:52 -0700 (PDT) Received: by 10.229.191.132 with HTTP; Sat, 2 Oct 2010 00:06:52 -0700 (PDT) Date: Sat, 2 Oct 2010 00:06:52 -0700 Message-ID: From: Rumen Telbizov To: freebsd-stable@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: MySQL performance concern X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 07:37:42 -0000 Hello everyone, I am experimenting with MySQL running on FreeBSD and comparing with another (older) setup running on a Linux box. My results show that performance on Linux is significantly better than FreeBSD although the hardware is weaker. I'd appreciate your comments and ideas. Here's the setup: 1) FreeBSD 8.1-STABLE amd64 (Tue Sep 14 15:29:22 PDT 2010) running on a SuperMicro machine with 2 x Dual Core Xeon E5502 1.87Ghz ; 4 x SAS 15K in RAID10 setup under ZFS (two mirrored pairs) and 2 x SSD X25-E partitioned for: 8G for ZIL and the rest for L2ARC; 16G ram with 8 of them given to mysql and tons of free. 
2) Gentoo Linux with 3 SATA disks in hardware RAID5, with a similar CPU/motherboard and the same amount of memory. The sole application that runs is a Python script which inserts a batch of lines at a time. Only MyISAM is used as the table format. Here's the problem: the Linux box manages to push around *5800* inserts/second, while the FreeBSD box manages only *4000*/second. The MySQL version is 5.1.51. During this load the disk subsystem on FreeBSD is pretty much idle (both the SSDs and the SAS disks). CPU utilization attributed to mysqld is only around 30%, so I am clearly heavily under-utilizing the hardware. Linuxthreads support for 64-bit architectures is not available, so I couldn't try that, but aside from that I tried recompiling mysql with all the different Makefile options available, without any effect. Changing the recordsize in ZFS to 8K doesn't make any difference. I also tried the Percona binary, without any luck. Let me know what additional information would be useful and I'll provide it here. Thank you in advance for your comments and suggestions. 
Cheers, Rumen Telbizov From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 08:20:47 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 488E8106566B for ; Sat, 2 Oct 2010 08:20:47 +0000 (UTC) (envelope-from m.seaman@infracaninophile.co.uk) Received: from smtp.infracaninophile.co.uk (smtp6.infracaninophile.co.uk [IPv6:2001:8b0:151:1:3fd3:cd67:fafa:3d78]) by mx1.freebsd.org (Postfix) with ESMTP id 986F08FC0C for ; Sat, 2 Oct 2010 08:20:46 +0000 (UTC) Received: from seedling.black-earth.co.uk (seedling.black-earth.co.uk [81.187.76.163]) (authenticated bits=0) by smtp.infracaninophile.co.uk (8.14.4/8.14.4) with ESMTP id o928KgLg002097 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sat, 2 Oct 2010 09:20:42 +0100 (BST) (envelope-from m.seaman@infracaninophile.co.uk) X-DKIM: Sendmail DKIM Filter v2.8.3 smtp.infracaninophile.co.uk o928KgLg002097 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=infracaninophile.co.uk; s=201001-infracaninophile; t=1286007642; bh=rg5n/rYhvsVhugPlEGxn+8dKZcVYn5qvO/nJbB3XHLw=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type:Cc:Content-Type:Date:From:In-Reply-To: Message-ID:Mime-Version:References:To; z=Message-ID:=20<4CA6EB52.1080504@infracaninophile.co.uk>|Date:=20S at,=2002=20Oct=202010=2009:20:34=20+0100|From:=20Matthew=20Seaman= 20|Organization:=20Infracaninophi le|User-Agent:=20Mozilla/5.0=20(Macintosh=3B=20U=3B=20Intel=20Mac= 20OS=20X=2010.6=3B=20en-GB=3B=20rv:1.9.2.9)=20Gecko/20100915=20Thu nderbird/3.1.4|MIME-Version:=201.0|To:=20Rumen=20Telbizov=20|CC:=20freebsd-stable@freebsd.org|Subject:=20Re:=20M ySQL=20performance=20concern|References:=20|In-Reply-To:=20|X-Enigmail-Ver sion:=201.1.1|OpenPGP:=20id=3D60AE908C|Content-Type:=20multipart/s igned=3B=20micalg=3Dpgp-sha1=3B=0D=0A=20protocol=3D"application/pg 
p-signature"=3B=0D=0A=20boundary=3D"------------enigFA676F08284032 5875787B01"; b=sGieHEUMiNOMJ1NChLklVkUYRKG8GqPXrTdCiMkxuyo6L03CfYAyfmwTeExIjXJY/ ZTyUf22EYwHAdmyZIcSzNBkFjYxN5rbhqiw4WLF9qovYt4tFAf4vTee6xCK9Darm03 FSZKJiG3D3QdzWMlxdhkSlUv5KMjHsCfHTgsIGCs= Message-ID: <4CA6EB52.1080504@infracaninophile.co.uk> Date: Sat, 02 Oct 2010 09:20:34 +0100 From: Matthew Seaman Organization: Infracaninophile User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-GB; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Rumen Telbizov References: In-Reply-To: X-Enigmail-Version: 1.1.1 OpenPGP: id=60AE908C Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enigFA676F082840325875787B01" X-Virus-Scanned: clamav-milter 0.96.3 at lucid-nonsense.infracaninophile.co.uk X-Virus-Status: Clean X-Spam-Status: No, score=-0.6 required=5.0 tests=BAYES_05,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,SPF_FAIL autolearn=no version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on lucid-nonsense.infracaninophile.co.uk Cc: freebsd-stable@freebsd.org Subject: Re: MySQL performance concern X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 08:20:47 -0000 This is an OpenPGP/MIME signed message (RFC 2440 and 3156) --------------enigFA676F082840325875787B01 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable On 02/10/2010 08:06:52, Rumen Telbizov wrote: > Hello everyone, >=20 > I am experimenting with MySQL running on FreeBSD and comparing with ano= ther > (older) setup running on a Linux box. > My results show that performance on Linux is significantly better than > FreeBSD although the hardware is weaker. > I'd appreciate your comments and ideas. 
> > Here's the setup: > > 1) FreeBSD 8.1-STABLE amd64 (Tue Sep 14 15:29:22 PDT 2010) running on a > SuperMicro machine with 2 x Dual Core > Xeon E5502 1.87Ghz ; 4 x SAS 15K in RAID10 setup under ZFS (two mirrored > pairs) and 2 x SSD X25-E partitioned > for: 8G for ZIL and the rest for L2ARC; 16G ram with 8 of them given to > mysql and tons of free. > > 2) Linux Gentoo with 3 SATA disks in hardware RAID5 with similar > cpu/motherboard and same memory size. > > The sole application that runs is a python script which inserts a batch of > lines at a time. Only myisam is used as a format. > Here's the problem: On the Linux box it manages to push around > *5800*inserts/second while on the FreeBSD box > it's only *4000/*second. > > MySQL version is 5.1.51 > > During this load the disk subsystem on FreeBSD is pretty much idle (both the > SSDs and the SAS disks). CPU utilization > contributed to mysqld is only around 30%. So I am clearly heavily > under-utilizing the hardware. > Linuxthreads support for 64bit architectures is not available so I couldn't > try that but aside from that I tried recompiling > mysql with all the different Makefile options available without any effect. > Changing the recordsize in zfs to 8K doesn't make any difference. > Tried percona binary without any luck. > > Let me know what additional information would be useful and I'll provide it > here. > > Thank you in advance for your comments and suggestions. Um... a fairly obvious point, but have you tuned the mysql configuration appropriately on both machines? I'd guess you have, but you didn't mention it. As I recall, the default configuration you get out of the box with mysql is suitable for a machine with something like 64MB RAM. Not at all appropriate nowadays where dedicated DB server hardware would be more likely to have 64*G*B than 64*M*B... Cheers, Matthew -- Dr Matthew J Seaman MA, D.Phil. 
7 Priory Courtyard Flat 3 PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate JID: matthew@infracaninophile.co.uk Kent, CT11 9PW --------------enigFA676F082840325875787B01 Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.14 (Darwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkym61oACgkQ8Mjk52CukIy28QCggeJLygQzvIyswTHstLCA6TDN mcwAnRJFLKOSK/rBKOIn2BgAMYZm4PpU =JNnG -----END PGP SIGNATURE----- --------------enigFA676F082840325875787B01-- From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 09:17:23 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ECA6F106566B for ; Sat, 2 Oct 2010 09:17:23 +0000 (UTC) (envelope-from bruce@cran.org.uk) Received: from muon.cran.org.uk (unknown [IPv6:2a01:348:0:15:5d59:5c40:0:1]) by mx1.freebsd.org (Postfix) with ESMTP id 86C488FC15 for ; Sat, 2 Oct 2010 09:17:23 +0000 (UTC) Received: from muon.cran.org.uk (localhost [127.0.0.1]) by muon.cran.org.uk (Postfix) with ESMTP id 2DC5DE7F82 for ; Sat, 2 Oct 2010 10:17:22 +0100 (BST) Received: from unknown (client-82-31-11-222.midd.adsl.virginmedia.com [82.31.11.222]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by muon.cran.org.uk (Postfix) with ESMTPSA for ; Sat, 2 Oct 2010 10:17:21 +0100 (BST) Date: Sat, 2 Oct 2010 10:17:19 +0100 From: Bruce Cran To: freebsd-stable@freebsd.org Message-ID: <20101002101719.000043c7@unknown> X-Mailer: Claws Mail 3.7.6 (GTK+ 2.16.6; i586-pc-mingw32msvc) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Subject: 60s boot hang on Xen running 8-STABLE when using ATA_CAM X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: 
list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 09:17:24 -0000 I rebuilt my 8-STABLE kernel today using ATA_CAM to use the ada driver on my Xen VPS and found that the boot will hang for 60s, apparently because a CD image hasn't been configured for the virtual CD-ROM drive? After the "run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config" message, the boot continues and the system appears to work fine. Timecounter "TSC" frequency 2000085054 Hz quality 800 Timecounters tick every 10.000 msec lo0: bpf attached ata0: reset tp1 mask=03 ostat0=50 ostat1=00 ata0: stat0=0x50 err=0x01 lsb=0x00 msb=0x00 ata0: stat1=0x00 err=0x01 lsb=0xff msb=0xff ata0: reset tp2 stat0=50 stat1=00 devices=0x1 (aprobe0:ata0:0:0:0): SIGNATURE: 0000 ata1: reset tp1 mask=03 ostat0=50 ostat1=41 ata1: stat0=0x50 err=0x01 lsb=0xff msb=0xff ata1: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb ata1: reset tp2 stat0=50 stat1=00 devices=0x20001 (aprobe0:ata1:0:0:0): SIGNATURE: 0000 ata1: reset tp1 mask=03 ostat0=50 ostat1=00 ata1: stat0=0x50 err=0x01 lsb=0xff msb=0xff ata1: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb ata1: reset tp2 stat0=50 stat1=00 devices=0x20001 (aprobe0:ata1:0:0:0): ATA_IDENTIFY. ACB: ec 00 00 00 00 40 00 00 00 00 00 00 (aprobe0:ata1:0:0:0): CAM status: Command timeout (aprobe0:ata1:0:0:0): SIGNATURE: 0000 run_interrupt_driven_hooks: still waiting after 60 seconds for xpt_config ata1: reset tp1 mask=03 ostat0=50 ostat1=00 ata1: stat0=0x50 err=0x01 lsb=0xff msb=0xff ata1: stat1=0x00 err=0x01 lsb=0x14 msb=0xeb ata1: reset tp2 stat0=50 stat1=00 devices=0x20001 (aprobe0:ata1:0:0:0): ATA_IDENTIFY. 
ACB: ec 00 00 00 00 40 00 00 00 00 00 00 (aprobe0:ata1:0:0:0): CAM status: Command timeout (aprobe0:ata1:0:1:0): SIGNATURE: eb14 ada0 at ata0 bus 0 scbus0 target 0 lun 0 ada0: ATA-7GEOM: new disk ada0 device ada0: Serial Number QM00001 ada0: 16.700MB/s transfers (WDMA2, PIO 8192bytes) ada0: 10752MB (22020096 512 byte sectors: 16H 63S/T 16383C) A full verbose dmesg is available at http://www.cran.org.uk/~brucec/freebsd/dmesg.ATA_CAM.hang.txt -- Bruce Cran From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 11:44:11 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CA9D71065670 for ; Sat, 2 Oct 2010 11:44:11 +0000 (UTC) (envelope-from ronald-freebsd8@klop.yi.org) Received: from smtp-out1.tiscali.nl (smtp-out1.tiscali.nl [195.241.79.176]) by mx1.freebsd.org (Postfix) with ESMTP id 616568FC14 for ; Sat, 2 Oct 2010 11:44:11 +0000 (UTC) Received: from [212.123.145.58] (helo=sjakie.klop.ws) by smtp-out1.tiscali.nl with esmtp (Exim) (envelope-from ) id 1P20Ve-0003UR-AW; Sat, 02 Oct 2010 13:44:10 +0200 Received: from 212-123-145-58.ip.telfort.nl (localhost [127.0.0.1]) by sjakie.klop.ws (Postfix) with ESMTP id 413F1421A; Sat, 2 Oct 2010 13:44:07 +0200 (CEST) Content-Type: text/plain; charset=us-ascii; format=flowed; delsp=yes To: freebsd-stable@freebsd.org, "Rumen Telbizov" References: Date: Sat, 02 Oct 2010 13:44:06 +0200 MIME-Version: 1.0 From: "Ronald Klop" Message-ID: In-Reply-To: User-Agent: Opera Mail/10.62 (FreeBSD) Content-Transfer-Encoding: quoted-printable Cc: Subject: Re: MySQL performance concern X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 11:44:11 -0000 On Sat, 02 Oct 2010 09:06:52 +0200, Rumen Telbizov =20 wrote: > Hello everyone, > > I am 
experimenting with MySQL running on FreeBSD and comparing with > another > (older) setup running on a Linux box. > My results show that performance on Linux is significantly better than > FreeBSD although the hardware is weaker. > I'd appreciate your comments and ideas. > > Here's the setup: > > 1) FreeBSD 8.1-STABLE amd64 (Tue Sep 14 15:29:22 PDT 2010) running on a > SuperMicro machine with 2 x Dual Core > Xeon E5502 1.87Ghz ; 4 x SAS 15K in RAID10 setup under ZFS (two mirrored > pairs) and 2 x SSD X25-E partitioned > for: 8G for ZIL and the rest for L2ARC; 16G ram with 8 of them given to > mysql and tons of free. > > 2) Linux Gentoo with 3 SATA disks in hardware RAID5 with similar > cpu/motherboard and same memory size. > > The sole application that runs is a python script which inserts a batch > of > lines at a time. Only myisam is used as a format. > Here's the problem: On the Linux box it manages to push around > *5800*inserts/second while on the FreeBSD box > it's only *4000/*second. > > MySQL version is 5.1.51 > > During this load the disk subsystem on FreeBSD is pretty much idle (both > the > SSDs and the SAS disks). CPU utilization > contributed to mysqld is only around 30%. So I am clearly heavily > under-utilizing the hardware. > Linuxthreads support for 64bit architectures is not available so I > couldn't > try that but aside from that I tried recompiling > mysql with all the different Makefile options available without any > effect. > Changing the recordsize in zfs to 8K doesn't make any difference. > Tried percona binary without any luck. > > Let me know what additional information would be useful and I'll provide > it > here. > > Thank you in advance for your comments and suggestions. > > Cheers, > Rumen Telbizov Your app is single-threaded, I presume, so the multiple cores are not relevant in this story. Do you have the same indexes on the tables on both servers? Do they both connect to mysql the same way? 
Unix sockets or =20 localhost? Do they both run mysql 5.1.51, because you mention the Linux one is older= ? Ronald. From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 13:17:00 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9DC891065674 for ; Sat, 2 Oct 2010 13:17:00 +0000 (UTC) (envelope-from peter@pean.org) Received: from smtprelay-b12.telenor.se (smtprelay-b12.telenor.se [62.127.194.21]) by mx1.freebsd.org (Postfix) with ESMTP id 5D0DC8FC1E for ; Sat, 2 Oct 2010 13:17:00 +0000 (UTC) Received: from ipb4.telenor.se (ipb4.telenor.se [195.54.127.167]) by smtprelay-b12.telenor.se (Postfix) with ESMTP id 7FFC0EA5CC for ; Sat, 2 Oct 2010 14:55:03 +0200 (CEST) X-SENDER-IP: [85.225.7.221] X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ar8VAEbIpkxV4QfdPGdsb2JhbAAHlDCOEgEBAQE1wVOFRASKPA X-IronPort-AV: E=Sophos;i="4.57,271,1283724000"; d="scan'208";a="1677914032" Received: from c-dd07e155.166-7-64736c14.cust.bredbandsbolaget.se (HELO [172.25.0.40]) ([85.225.7.221]) by ipb4.telenor.se with ESMTP; 02 Oct 2010 14:55:03 +0200 From: =?iso-8859-1?Q?Peter_Ankerst=E5l?= Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Date: Sat, 2 Oct 2010 14:55:02 +0200 Message-Id: <1C68D21A-9539-473D-AEDB-9A8CCE4956F1@pean.org> To: stable@freebsd.org Mime-Version: 1.0 (Apple Message framework v1081) X-Mailer: Apple Mail (2.1081) Cc: Subject: device names changes for adX. X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 13:17:00 -0000 Hi, When I installed FreeBSD 8.1-RELEASE (freebsd-update) the adX devices = changed index number and the machine obviously didnt boot. Due to this I hesitate to install 8.1 = on my servers remote. 
How do I know if and to what the devices will change? -- Peter Ankerst=E5l peter@pean.org http://www.pean.org/ From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 13:23:21 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0A990106566B for ; Sat, 2 Oct 2010 13:23:21 +0000 (UTC) (envelope-from amarat@ksu.ru) Received: from mx7.ksu.ru (honey.ksu.ru [193.232.252.54]) by mx1.freebsd.org (Postfix) with ESMTP id 057D18FC19 for ; Sat, 2 Oct 2010 13:23:19 +0000 (UTC) X-IronPort-AV: E=Sophos;i="4.57,271,1283716800"; d="p7s'?scan'208";a="1111223" Received: from mail.ksu.ru (HELO ruby.ksu.ru) ([193.232.252.56]) by iport2.ksu.ru with ESMTP; 02 Oct 2010 17:23:16 +0400 X-Pass-Through: Kazan State University Network Received: from zealot.ksu.ru ([194.85.245.161]) by ksu.ru (8.13.4/8.13.4) with ESMTP id o92DNHJ2016893; Sat, 2 Oct 2010 13:23:17 GMT Received: from zealot.ksu.ru (localhost.lnet [127.0.0.1]) by zealot.ksu.ru (8.14.4/8.14.4) with ESMTP id o92DMVN0001325; Sat, 2 Oct 2010 17:22:31 +0400 (MSD) (envelope-from amarat@ksu.ru) Message-ID: <4CA73215.9030908@ksu.ru> Date: Sat, 02 Oct 2010 17:22:29 +0400 From: "Marat N.Afanasyev" User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.13) Gecko/20101001 Firefox/3.0.7 MIME-Version: 1.0 To: =?UTF-8?B?UGV0ZXIgQW5rZXJzdMOlbA==?= References: <1C68D21A-9539-473D-AEDB-9A8CCE4956F1@pean.org> In-Reply-To: <1C68D21A-9539-473D-AEDB-9A8CCE4956F1@pean.org> Content-Type: multipart/signed; protocol="application/pkcs7-signature"; micalg=sha1; boundary="------------ms050804030000040101070903" X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: stable@freebsd.org Subject: Re: device names changes for adX. 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 13:23:21 -0000 This is a cryptographically signed message in MIME format. --------------ms050804030000040101070903 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: quoted-printable Peter Ankerst=C3=A5l wrote: > Hi, > > When I installed FreeBSD 8.1-RELEASE (freebsd-update) the adX devices c= hanged index number and > the machine obviously didnt boot. Due to this I hesitate to install 8.1= on my servers remote. How do I know > if and to what the devices will change? > > > -- > Peter Ankerst=C3=A5l > peter@pean.org > http://www.pean.org/ > > _______________________________________________ > freebsd-stable@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-stable > To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.or= g" > label your filesystems and mount them by label rather than by device=20 name. 
see man glabel --=20 SY, Marat --------------ms050804030000040101070903-- From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 13:42:45 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1C29C106566C for ; Sat, 2 Oct 2010 13:42:45 +0000 (UTC) (envelope-from martin@saturn.pcs.ms) Received: from mail4.hostpark.net (mail4.hostpark.net [212.243.197.34]) by mx1.freebsd.org (Postfix) with ESMTP id 9D5A68FC12 for ; Sat, 2 Oct 2010 13:42:44 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mail4.hostpark.net (Postfix) with ESMTP id E878A4B9EB; Sat, 2 Oct 2010 15:11:30 +0200 (CEST) X-Virus-Scanned: by Hostpark/NetZone Mailprotection at hostpark.net Received: from mail4.hostpark.net ([127.0.0.1]) by localhost (mail4.hostpark.net [127.0.0.1]) (amavisd-new, port 10124) with ESMTP id CGskBMOquUAZ; Sat, 2 Oct 2010 15:11:30 +0200 (CEST) Received: from saturn.pcs.ms (246-171.62-81.cust.bluewin.ch [81.62.171.246]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail4.hostpark.net (Postfix) with ESMTP id 70D064B9B2; Sat, 2 Oct 2010 15:11:30 +0200 (CEST) Received: from saturn.pcs.ms (localhost [127.0.0.1]) by saturn.pcs.ms (8.14.4/8.14.4) with ESMTP id o92DB7Ou031073 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NOT); Sat, 2 Oct 2010 15:11:07 +0200 (CEST) (envelope-from martin@saturn.pcs.ms) Received: (from martin@localhost) by saturn.pcs.ms (8.14.4/8.14.4/Submit) id o92DB7Q7031072; Sat, 2 Oct 2010 15:11:07 +0200 (CEST) (envelope-from martin) Date: Sat, 2 Oct 2010 15:11:07 +0200 From: Martin Schweizer To: freebsd-stable@freebsd.org Message-ID: <20101002131106.GH74320@saturn.pcs.ms> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Organization: PC-Service M. 
Schweizer GmbH, CH-8608 Bubikon, Switzerland User-Agent: Mutt/1.5.20 (2009-06-14) Subject: Broken SASL/Kerberos authentication: openldap client GSSAPI authentication segfaults on FreeBSD 8.1 Release too X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Martin Schweizer List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 13:42:45 -0000 Hello I use the system as a mail server (Cyrus Impad) which I authenticate against Kerberos5 (Windows Active Directory) with Cyrus SASL (saslauthd -a kerberos5). Here are the details: cyrus-imapd-2.3.16_2 The cyrus mail server, supporting POP3 and IMAP4 protocols cyrus-sasl-2.1.23 RFC 2222 SASL (Simple Authentication and Security Layer) cyrus-sasl-saslauthd-2.1.23 SASL authentication server for cyrus-sasl2 My system: FreeBSD acsvfbsd04.acutronic.ch 8.1-RELEASE FreeBSD 8.1-RELEASE #0: Thu Sep 30 12:33:18 CEST 2010 martin@acsvfbsd04.acutronic.ch:/usr/obj/usr/src/sys/GENERIC i386 After I upgaded from 7.2 to 8.1 the SASL authentication (with Kerberos5) is broken. See http://docs.freebsd.org/cgi/getmsg.cgi?fetch=301304+0+archive/2010/freebsd-stable/20100718.freebsd-stable and the following threads. See alo PR 147454. I did what you suggested in different threads around july regarding the subject: 1. cvsup a fresh copy of RELEASE 8.1 in /usr/src 2. Now I apply the patch in /usr/src with patch -p1 -E < patch name 3. 
Now I make buildworld && make buidlkernel && make installkernel and I get the following messages: cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb acc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/ker cc -fpic -DPIC -O2 -pipe 
-I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb make: don't know how to make /usr/obj/usr/src/tmp/usr/lib/libpthread.a. Stop *** Error code 2 Stop in /usr/src. What I'm doing wrong? Kind regards, -- Martin Schweizer PC-Service M. Schweizer GmbH; Bannholzstrasse 6; CH-8608 Bubikon Tel. +41 55 243 30 00; Fax: +41 55 243 33 22 From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 13:43:32 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7D35E1065670 for ; Sat, 2 Oct 2010 13:43:32 +0000 (UTC) (envelope-from dan@langille.org) Received: from nyi.unixathome.org (nyi.unixathome.org [64.147.113.42]) by mx1.freebsd.org (Postfix) with ESMTP id 4DF008FC16 for ; Sat, 2 Oct 2010 13:43:32 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id 7369F509E3 for ; Sat, 2 Oct 2010 14:43:31 +0100 (BST) X-Virus-Scanned: amavisd-new at unixathome.org Received: from nyi.unixathome.org ([127.0.0.1]) by localhost (nyi.unixathome.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id gw8UFyjs+vXi for ; Sat, 2 Oct 2010 14:43:31 +0100 (BST) Received: from smtp-auth.unixathome.org (smtp-auth.unixathome.org [10.4.7.7]) (Authenticated sender: hidden) by nyi.unixathome.org (Postfix) with ESMTPSA id 0084D509A3 for ; Sat, 2 Oct 2010 14:43:30 +0100 (BST) Message-ID: <4CA73702.5080203@langille.org> Date: Sat, 02 Oct 2010 09:43:30 -0400 From: Dan Langille Organization: The FreeBSD Diary User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: freebsd-stable Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: out of HDD space - zfs degraded X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 
Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 13:43:32 -0000 Overnight I was running a zfs send | zfs receive (both within the same system / zpool). The system ran out of space, a drive went off line, and the system is degraded. This is a raidz2 array running on FreeBSD 8.1-STABLE #0: Sat Sep 18 23:43:48 EDT 2010. The following logs are also available at http://www.langille.org/tmp/zfs-space.txt <- no line wrapping This is what was running: # time zfs send storage/bacula@transfer | mbuffer | zfs receive storage/compressed/bacula-mbuffer in @ 0.0 kB/s, out @ 0.0 kB/s, 3670 GB total, buffer 100% fullcannot receive new filesystem stream: out of space mbuffer: error: outputThread: error writing to at offset 0x395917c4000: Broken pipe summary: 3670 GByte in 10 h 40 min 97.8 MB/s mbuffer: warning: error during output to : Broken pipe warning: cannot send 'storage/bacula@transfer': Broken pipe real 640m48.423s user 8m52.660s sys 211m40.862s Looking in the logs, I see this: Oct 2 00:50:53 kraken kernel: (ada0:siisch0:0:0:0): lost device Oct 2 00:50:54 kraken kernel: siisch0: Timeout on slot 30 Oct 2 00:50:54 kraken kernel: siisch0: siis_timeout is 00040000 ss 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 Oct 2 00:50:54 kraken kernel: siisch0: Error while READ LOG EXT Oct 2 00:50:55 kraken kernel: siisch0: Timeout on slot 30 Oct 2 00:50:55 kraken kernel: siisch0: siis_timeout is 00040000 ss 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 Oct 2 00:50:55 kraken kernel: siisch0: Error while READ LOG EXT Oct 2 00:50:56 kraken kernel: siisch0: Timeout on slot 30 Oct 2 00:50:56 kraken kernel: siisch0: siis_timeout is 00040000 ss 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 Oct 2 00:50:56 kraken kernel: siisch0: Error while READ LOG EXT Oct 2 00:50:57 kraken kernel: siisch0: Timeout on slot 30 Oct 2 00:50:57 
kraken kernel: siisch0: siis_timeout is 00040000 ss 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
Oct 2 00:50:57 kraken kernel: siisch0: Error while READ LOG EXT
Oct 2 00:50:58 kraken kernel: siisch0: Timeout on slot 30
Oct 2 00:50:58 kraken kernel: siisch0: siis_timeout is 00040000 ss 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
Oct 2 00:50:58 kraken kernel: siisch0: Error while READ LOG EXT
Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage path=/dev/gpt/disk06-live offset=270336 size=8192 error=6

Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): Synchronize cache failed
Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): removing device entry

Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage path=/dev/gpt/disk06-live offset=2000187564032 size=8192 error=6
Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage path=/dev/gpt/disk06-live offset=2000187826176 size=8192 error=6

$ zpool status
  pool: storage
 state: DEGRADED
 scrub: scrub in progress for 5h32m, 17.16% done, 26h44m to go
config:

        NAME                 STATE     READ WRITE CKSUM
        storage              DEGRADED     0     0     0
          raidz2             DEGRADED     0     0     0
            gpt/disk01-live  ONLINE       0     0     0
            gpt/disk02-live  ONLINE       0     0     0
            gpt/disk03-live  ONLINE       0     0     0
            gpt/disk04-live  ONLINE       0     0     0
            gpt/disk05-live  ONLINE       0     0     0
            gpt/disk06-live  REMOVED      0     0     0
            gpt/disk07-live  ONLINE       0     0     0

$ zfs list
NAME                        USED  AVAIL  REFER  MOUNTPOINT
storage                    6.97T  1.91T  1.75G  /storage
storage/bacula             4.72T  1.91T  4.29T  /storage/bacula
storage/compressed         2.25T  1.91T  46.9K  /storage/compressed
storage/compressed/bacula  2.25T  1.91T  42.7K  /storage/compressed/bacula
storage/pgsql              5.50G  1.91T  5.50G  /storage/pgsql

$ sudo camcontrol devlist
Password:
at scbus2 target 0 lun 0 (pass1,ada1)
at scbus3 target 0 lun 0 (pass2,ada2)
at scbus4 target 0 lun 0 (pass3,ada3)
at scbus5 target 0 lun 0 (pass4,ada4)
at scbus6 target 0 lun 0 (pass5,ada5)
at scbus7 target 0 lun 0 (pass6,ada6)
at scbus8 target 0 lun 0 (pass7,ada7)
at scbus9 target 0 lun 0 (cd0,pass8)
at scbus10 target 0 lun 0 (pass9,ada8)

I'm not yet sure if the drive is fully dead or not. This is not a hot-swap box.

I'm guessing the first step is to get ada0 back online and then back into the zpool. However, I'm reluctant to do a 'camcontrol scan' on this box as it froze up the system the last time I tried that:

http://docs.freebsd.org/cgi/mid.cgi?4C78FF01.5020500

Any suggestions for getting the drive back online and the zpool stabilized?

-- 
Dan Langille - http://langille.org/

From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 14:08:40 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DEC11106566B for ; Sat, 2 Oct 2010 14:08:40 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta12.westchester.pa.mail.comcast.net (qmta12.westchester.pa.mail.comcast.net [76.96.59.227]) by mx1.freebsd.org (Postfix) with ESMTP id A11118FC0C for ; Sat, 2 Oct 2010 14:08:40 +0000 (UTC) Received: from omta04.westchester.pa.mail.comcast.net ([76.96.62.35]) by qmta12.westchester.pa.mail.comcast.net with comcast id DpXp1f0020ldTLk5Cq8gZ9; Sat, 02 Oct 2010 14:08:40 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta04.westchester.pa.mail.comcast.net with comcast id Dq8f1f00g3LrwQ23Qq8gvz; Sat, 02 Oct 2010 14:08:40 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 52E899B418; Sat, 2 Oct 2010 07:08:38 -0700 (PDT) Date: Sat, 2 Oct 2010 07:08:38 -0700 From: Jeremy Chadwick To: Peter =?iso-8859-1?Q?Ankerst=E5l?= Message-ID: <20101002140838.GA70283@icarus.home.lan> References: <1C68D21A-9539-473D-AEDB-9A8CCE4956F1@pean.org> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <1C68D21A-9539-473D-AEDB-9A8CCE4956F1@pean.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: stable@freebsd.org Subject: Re: device names changes for adX.
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 14:08:40 -0000 On Sat, Oct 02, 2010 at 02:55:02PM +0200, Peter Ankerstl wrote: > When I installed FreeBSD 8.1-RELEASE (freebsd-update) the adX devices changed index number and > the machine obviously didnt boot. Due to this I hesitate to install 8.1 on my servers remote. How do I know > if and to what the devices will change? Please see this thread: http://www.mail-archive.com/freebsd-stable@freebsd.org/msg112349.html -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 14:11:48 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8F4C11065674 for ; Sat, 2 Oct 2010 14:11:48 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta02.emeryville.ca.mail.comcast.net (qmta02.emeryville.ca.mail.comcast.net [76.96.30.24]) by mx1.freebsd.org (Postfix) with ESMTP id 754DB8FC08 for ; Sat, 2 Oct 2010 14:11:48 +0000 (UTC) Received: from omta15.emeryville.ca.mail.comcast.net ([76.96.30.71]) by qmta02.emeryville.ca.mail.comcast.net with comcast id DpdG1f0061Y3wxoA2qBnyU; Sat, 02 Oct 2010 14:11:47 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta15.emeryville.ca.mail.comcast.net with comcast id DqBm1f00P3LrwQ28bqBnoX; Sat, 02 Oct 2010 14:11:47 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id CAC519B418; Sat, 2 Oct 2010 07:11:46 -0700 (PDT) Date: Sat, 2 Oct 2010 07:11:46 -0700 From: Jeremy Chadwick To: Martin Schweizer Message-ID: <20101002141146.GB70283@icarus.home.lan> References: 
<20101002131106.GH74320@saturn.pcs.ms> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20101002131106.GH74320@saturn.pcs.ms> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org Subject: Re: Broken SASL/Kerberos authentication: openldap client GSSAPI authentication segfaults on FreeBSD 8.1 Release too X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 14:11:48 -0000 On Sat, Oct 02, 2010 at 03:11:07PM +0200, Martin Schweizer wrote: > [...] > 3. Now I make buildworld && make buidlkernel && make installkernel and I get the following messages: > [...] > > cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb > make: don't know how to make /usr/obj/usr/src/tmp/usr/lib/libpthread.a. Stop > *** Error code 2 > Stop in /usr/src. > > What I'm doing wrong? Did you specify any -j flags during your "make buildworld" (ex. "make -j2 buildworld")? If so, please remove them and restart the build. Then you will see where the actual compile/make error happens. From the above output, it doesn't look like it's related to the Kerberos or libgssapi stuff. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 14:19:24 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A54AB106566B for ; Sat, 2 Oct 2010 14:19:24 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta15.westchester.pa.mail.comcast.net (qmta15.westchester.pa.mail.comcast.net [76.96.59.228]) by mx1.freebsd.org (Postfix) with ESMTP id 533B48FC08 for ; Sat, 2 Oct 2010 14:19:23 +0000 (UTC) Received: from omta24.westchester.pa.mail.comcast.net ([76.96.62.76]) by qmta15.westchester.pa.mail.comcast.net with comcast id Doo91f0051ei1Bg5FqKQTJ; Sat, 02 Oct 2010 14:19:24 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta24.westchester.pa.mail.comcast.net with comcast id DqKP1f00A3LrwQ23kqKPQS; Sat, 02 Oct 2010 14:19:24 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 0A5619B418; Sat, 2 Oct 2010 07:19:22 -0700 (PDT) Date: Sat, 2 Oct 2010 07:19:22 -0700 From: Jeremy Chadwick To: Dan Langille Message-ID: <20101002141921.GC70283@icarus.home.lan> References: <4CA73702.5080203@langille.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA73702.5080203@langille.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable Subject: Re: out of HDD space - zfs degraded X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 14:19:24 -0000 On Sat, Oct 02, 2010 at 09:43:30AM -0400, Dan Langille wrote: > Overnight I was running a zfs send | zfs receive (both within the > same system / zpool). The system ran out of space, a drive went off > line, and the system is degraded. 
> > This is a raidz2 array running on FreeBSD 8.1-STABLE #0: Sat Sep 18 > 23:43:48 EDT 2010. > > The following logs are also available at > http://www.langille.org/tmp/zfs-space.txt <- no line wrapping > > This is what was running: > > # time zfs send storage/bacula@transfer | mbuffer | zfs receive > storage/compressed/bacula-mbuffer > in @ 0.0 kB/s, out @ 0.0 kB/s, 3670 GB total, buffer 100% > fullcannot receive new filesystem stream: out of space > mbuffer: error: outputThread: error writing to at offset > 0x395917c4000: Broken pipe > > summary: 3670 GByte in 10 h 40 min 97.8 MB/s > mbuffer: warning: error during output to : Broken pipe > warning: cannot send 'storage/bacula@transfer': Broken pipe > > real 640m48.423s > user 8m52.660s > sys 211m40.862s > > > Looking in the logs, I see this: > > Oct 2 00:50:53 kraken kernel: (ada0:siisch0:0:0:0): lost device > Oct 2 00:50:54 kraken kernel: siisch0: Timeout on slot 30 > Oct 2 00:50:54 kraken kernel: siisch0: siis_timeout is 00040000 ss > 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 > Oct 2 00:50:54 kraken kernel: siisch0: Error while READ LOG EXT > Oct 2 00:50:55 kraken kernel: siisch0: Timeout on slot 30 > Oct 2 00:50:55 kraken kernel: siisch0: siis_timeout is 00040000 ss > 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 > Oct 2 00:50:55 kraken kernel: siisch0: Error while READ LOG EXT > Oct 2 00:50:56 kraken kernel: siisch0: Timeout on slot 30 > Oct 2 00:50:56 kraken kernel: siisch0: siis_timeout is 00040000 ss > 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 > Oct 2 00:50:56 kraken kernel: siisch0: Error while READ LOG EXT > Oct 2 00:50:57 kraken kernel: siisch0: Timeout on slot 30 > Oct 2 00:50:57 kraken kernel: siisch0: siis_timeout is 00040000 ss > 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 > Oct 2 00:50:57 kraken kernel: siisch0: Error while READ LOG EXT > Oct 2 00:50:58 kraken kernel: siisch0: Timeout on slot 30 > Oct 2 00:50:58 kraken kernel: 
siisch0: siis_timeout is 00040000 ss
> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
> Oct 2 00:50:58 kraken kernel: siisch0: Error while READ LOG EXT
> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage
> path=/dev/gpt/disk06-live offset=270336 size=8192 error=6
>
> Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): Synchronize
> cache failed
> Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): removing device entry
>
> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage
> path=/dev/gpt/disk06-live offset=2000187564032 size=8192 error=6
> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage
> path=/dev/gpt/disk06-live offset=2000187826176 size=8192 error=6
>
> $ zpool status
>   pool: storage
>  state: DEGRADED
>  scrub: scrub in progress for 5h32m, 17.16% done, 26h44m to go
> config:
>
>         NAME                 STATE     READ WRITE CKSUM
>         storage              DEGRADED     0     0     0
>           raidz2             DEGRADED     0     0     0
>             gpt/disk01-live  ONLINE       0     0     0
>             gpt/disk02-live  ONLINE       0     0     0
>             gpt/disk03-live  ONLINE       0     0     0
>             gpt/disk04-live  ONLINE       0     0     0
>             gpt/disk05-live  ONLINE       0     0     0
>             gpt/disk06-live  REMOVED      0     0     0
>             gpt/disk07-live  ONLINE       0     0     0
>
> $ zfs list
> NAME                        USED  AVAIL  REFER  MOUNTPOINT
> storage                    6.97T  1.91T  1.75G  /storage
> storage/bacula             4.72T  1.91T  4.29T  /storage/bacula
> storage/compressed         2.25T  1.91T  46.9K  /storage/compressed
> storage/compressed/bacula  2.25T  1.91T  42.7K  /storage/compressed/bacula
> storage/pgsql              5.50G  1.91T  5.50G  /storage/pgsql
>
> $ sudo camcontrol devlist
> Password:
> at scbus2 target 0 lun 0 (pass1,ada1)
> at scbus3 target 0 lun 0 (pass2,ada2)
> at scbus4 target 0 lun 0 (pass3,ada3)
> at scbus5 target 0 lun 0 (pass4,ada4)
> at scbus6 target 0 lun 0 (pass5,ada5)
> at scbus7 target 0 lun 0 (pass6,ada6)
> at scbus8 target 0 lun 0 (pass7,ada7)
> at scbus9 target 0 lun 0 (cd0,pass8)
> at scbus10 target 0 lun 0 (pass9,ada8)
>
> I'm not yet sure if the drive is fully dead or not. This is not a
> hot-swap box.
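
With seven members the quoted zpool status table is easy to read by eye, but the same check can be scripted. A minimal sketch, assuming the five-column NAME STATE READ WRITE CKSUM layout shown in the quoted output; the function name not_online is made up for illustration and is not part of any ZFS tool:

```sh
# Hypothetical helper (not part of ZFS): print pool members whose STATE
# column is one of the non-ONLINE states. It reads `zpool status` text
# on stdin, so it can also be run against saved output.
not_online() {
    # Config rows have at least five columns (NAME STATE READ WRITE CKSUM);
    # the NF test skips the two-field "state: DEGRADED" summary line, and
    # the header row is skipped because "STATE" is not in the list below.
    awk 'NF >= 5 && $2 ~ /^(DEGRADED|FAULTED|OFFLINE|REMOVED|UNAVAIL)$/ { print $1, $2 }'
}
```

Usage would be `zpool status storage | not_online`. Fed the unquoted output from the original message, it should list storage, raidz2 and gpt/disk06-live, the last being the member that needs attention.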

It looks to me like the disk labelled gpt/disk06-live literally stopped responding to commands. The errors you see are coming from the OS and the siis(4) controller, and both indicate the actual hard disk isn't responding to the ATA command READ LOG EXT. error=6 is ENXIO, "Device not configured".

I can't see how/why running out of space would cause this. It looks more like you had a hardware issue of some sort during the course of the operations you were running. It may not have happened until now because you hadn't written to that area of the disk before (there could be bad sectors there, or physical media/platter problems).

Please provide smartctl -a output for the drive that's gpt/disk06-live, which I assume is /dev/ada6 (glabel sure makes correlation easy, doesn't it? Sigh...). Please put the results up on the web somewhere rather than copy-pasting them, otherwise I have to do a bunch of manual work with regards to line wrapping, etc. I'll provide an analysis of the SMART stats for you, to see if anything crazy happened to the disk itself.

-- 
| Jeremy Chadwick jdc@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977.
PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 14:29:52 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 67BD61065673 for ; Sat, 2 Oct 2010 14:29:52 +0000 (UTC) (envelope-from peter@pean.org) Received: from smtprelay-h22.telenor.se (smtprelay-h22.telenor.se [195.54.99.197]) by mx1.freebsd.org (Postfix) with ESMTP id 232AD8FC0C for ; Sat, 2 Oct 2010 14:29:51 +0000 (UTC) Received: from ipb4.telenor.se (ipb4.telenor.se [195.54.127.167]) by smtprelay-h22.telenor.se (Postfix) with ESMTP id 7CCCFEA6E5 for ; Sat, 2 Oct 2010 16:29:50 +0200 (CEST) X-SENDER-IP: [85.225.7.221] X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AicSAIvepkxV4QfdPGdsb2JhbAAHokIBAQEBNcF1hUQEiU9t X-IronPort-AV: E=Sophos;i="4.57,271,1283724000"; d="scan'208";a="1677950914" Received: from c-dd07e155.166-7-64736c14.cust.bredbandsbolaget.se (HELO [172.25.0.40]) ([85.225.7.221]) by ipb4.telenor.se with ESMTP; 02 Oct 2010 16:29:50 +0200 Mime-Version: 1.0 (Apple Message framework v1081) Content-Type: text/plain; charset=iso-8859-1 From: =?iso-8859-1?Q?Peter_Ankerst=E5l?= In-Reply-To: <4CA73215.9030908@ksu.ru> Date: Sat, 2 Oct 2010 16:29:49 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: References: <1C68D21A-9539-473D-AEDB-9A8CCE4956F1@pean.org> <4CA73215.9030908@ksu.ru> To: "Marat N.Afanasyev" X-Mailer: Apple Mail (2.1081) Cc: stable@freebsd.org Subject: Re: device names changes for adX. X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 14:29:52 -0000 > Peter Ankerst=E5l wrote: >> Hi, >>=20 >> When I installed FreeBSD 8.1-RELEASE (freebsd-update) the adX devices = changed index number and >> the machine obviously didnt boot. 
Due to this I hesitate to install = 8.1 on my servers remote. How do I know >> if and to what the devices will change? >>=20 >>=20 >> -- >> Peter Ankerst=E5l >> peter@pean.org >> http://www.pean.org/ >>=20 >> _______________________________________________ >> freebsd-stable@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-stable >> To unsubscribe, send any mail to = "freebsd-stable-unsubscribe@freebsd.org" >>=20 > label your filesystems and mount them by label rather than by device = name. see >=20 > man glabel >=20 > --=20 > SY, Marat >=20 Thanks, I may try that. But how will this affect ZFS raidz set up to use = ad-drives? Like this: tank ONLINE 0 0 0 raidz1 ONLINE 0 0 0 ad10s2 ONLINE 0 0 0 ad12 ONLINE 0 0 0 ad14 ONLINE 0 0 0 ad16 ONLINE 0 0 0 From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 14:32:19 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8B616106564A for ; Sat, 2 Oct 2010 14:32:19 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2-6.sentex.ca [IPv6:2607:f3e0:80:80::2]) by mx1.freebsd.org (Postfix) with ESMTP id 2E9988FC12 for ; Sat, 2 Oct 2010 14:32:19 +0000 (UTC) Received: from lava.sentex.ca (pyroxene.sentex.ca [199.212.134.18]) by smarthost2.sentex.ca (8.14.4/8.14.4) with ESMTP id o92EWCuW017129 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 2 Oct 2010 10:32:12 -0400 (EDT) (envelope-from mike@sentex.net) Received: from mdt-xp.sentex.net (simeon.sentex.ca [192.168.43.27]) by lava.sentex.ca (8.14.4/8.14.4) with ESMTP id o92EWAIs033670; Sat, 2 Oct 2010 10:32:10 -0400 (EDT) (envelope-from mike@sentex.net) Message-Id: <201010021432.o92EWAIs033670@lava.sentex.ca> X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9 Date: Sat, 02 Oct 2010 10:32:02 -0400 To: Jack Vogel From: Mike Tancsa In-Reply-To: References: 
<201006102031.o5AKVCH2016467@lava.sentex.ca> <201007021739.o62HdMOU092319@lava.sentex.ca> <20100702193654.GD10862@michelle.cdnetworks.com> <201008162107.o7GL76pA080191@lava.sentex.ca> <20100817185208.GA6482@michelle.cdnetworks.com> <201008171955.o7HJt67T087902@lava.sentex.ca> <20100817200020.GE6482@michelle.cdnetworks.com> <201009141759.o8EHxcZ0013539@lava.sentex.ca> <201009262157.o8QLvR0L012171@lava.sentex.ca> <201009262343.o8QNhgDG012676@lava.sentex.ca> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed X-Scanned-By: MIMEDefang 2.67 on 205.211.164.50 Cc: pyunyh@gmail.com, freebsd-stable@freebsd.org Subject: Re: RELENG_7 em problems (and RELENG_8) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 14:32:19 -0000

Hi Jack,
Two quick notes about the new driver. On the server that was having nic lockups, so far so good. Saturday AM, the box would take a lot of level0 dumps as well as do about 70Mb/s of outbound rsync traffic. By now, the nic would have wedged at least once. So far so good!
On a different, new box, I decided to try HEAD with the new driver, and ran into problems with the onboard nic

em0@pci0:0:25:0: class=0x020000 card=0x00368086 chip=0x10f08086 rev=0x06 hdr=0x00
    vendor = 'Intel Corporation'
    class = network
    subclass = ethernet
    cap 01[c8] = powerspec 2 supports D0 D3 current D0
    cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message
    cap 13[e0] = PCI Advanced Features: FLR TP

em0: port 0xf020-0xf03f mem 0xfe500000-0xfe51ffff,0xfe527000-0xfe527fff irq 20 at device 25.0 on pci0
em0: Using MSI interrupt
em0: [FILTER]
em0: Ethernet address: 70:71:bc:09:5e:aa

This is an Intel-branded desktop board

acpi0: on motherboard

I find I have to disable rx and tx csum on the interface, otherwise there are a lot of re-transmits due to missed packets. tcpdump implies the packets are going out, but they never seem to actually get out. The motherboard is at the office on an unmanaged switch right now, so I don't have any stats from the switch. But tcpdump shows a lot of outbound re-transmits. Turning off rxcsum and txcsum fixes the problem.
dev.em.0.%desc: Intel(R) PRO/1000 Network Connection 7.0.8 dev.em.0.%driver: em dev.em.0.%location: slot=25 function=0 handle=\_SB_.PCI0.GBE_ dev.em.0.%pnpinfo: vendor=0x8086 device=0x10f0 subvendor=0x8086 subdevice=0x0036 class=0x020000 dev.em.0.%parent: pci0 dev.em.0.nvm: -1 dev.em.0.rx_int_delay: 0 dev.em.0.tx_int_delay: 66 dev.em.0.rx_abs_int_delay: 66 dev.em.0.tx_abs_int_delay: 66 dev.em.0.rx_processing_limit: 100 dev.em.0.link_irq: 0 dev.em.0.mbuf_alloc_fail: 0 dev.em.0.cluster_alloc_fail: 0 dev.em.0.dropped: 0 dev.em.0.tx_dma_fail: 0 dev.em.0.rx_overruns: 0 dev.em.0.watchdog_timeouts: 0 dev.em.0.device_control: 1074790976 dev.em.0.rx_control: 67141634 dev.em.0.fc_high_water: 8192 dev.em.0.fc_low_water: 6692 dev.em.0.queue0.txd_head: 15 dev.em.0.queue0.txd_tail: 17 dev.em.0.queue0.tx_irq: 0 dev.em.0.queue0.no_desc_avail: 0 dev.em.0.queue0.rxd_head: 843 dev.em.0.queue0.rxd_tail: 842 dev.em.0.queue0.rx_irq: 0 dev.em.0.mac_stats.excess_coll: 0 dev.em.0.mac_stats.single_coll: 0 dev.em.0.mac_stats.multiple_coll: 0 dev.em.0.mac_stats.late_coll: 0 dev.em.0.mac_stats.collision_count: 0 dev.em.0.mac_stats.symbol_errors: 0 dev.em.0.mac_stats.sequence_errors: 0 dev.em.0.mac_stats.defer_count: 0 dev.em.0.mac_stats.missed_packets: 0 dev.em.0.mac_stats.recv_no_buff: 0 dev.em.0.mac_stats.recv_undersize: 0 dev.em.0.mac_stats.recv_fragmented: 0 dev.em.0.mac_stats.recv_oversize: 0 dev.em.0.mac_stats.recv_jabber: 0 dev.em.0.mac_stats.recv_errs: 0 dev.em.0.mac_stats.crc_errs: 0 dev.em.0.mac_stats.alignment_errs: 0 dev.em.0.mac_stats.coll_ext_errs: 0 dev.em.0.mac_stats.xon_recvd: 80 dev.em.0.mac_stats.xon_txd: 0 dev.em.0.mac_stats.xoff_recvd: 82 dev.em.0.mac_stats.xoff_txd: 0 dev.em.0.mac_stats.total_pkts_recvd: 35697 dev.em.0.mac_stats.good_pkts_recvd: 35535 dev.em.0.mac_stats.bcast_pkts_recvd: 231 dev.em.0.mac_stats.mcast_pkts_recvd: 85 dev.em.0.mac_stats.rx_frames_64: 0 dev.em.0.mac_stats.rx_frames_65_127: 0 dev.em.0.mac_stats.rx_frames_128_255: 0 
dev.em.0.mac_stats.rx_frames_256_511: 0 dev.em.0.mac_stats.rx_frames_512_1023: 0 dev.em.0.mac_stats.rx_frames_1024_1522: 0 dev.em.0.mac_stats.good_octets_recvd: 14878015 dev.em.0.mac_stats.good_octets_txd: 14051783 dev.em.0.mac_stats.total_pkts_txd: 45313 dev.em.0.mac_stats.good_pkts_txd: 45313 dev.em.0.mac_stats.bcast_pkts_txd: 3 dev.em.0.mac_stats.mcast_pkts_txd: 5 dev.em.0.mac_stats.tx_frames_64: 0 dev.em.0.mac_stats.tx_frames_65_127: 0 dev.em.0.mac_stats.tx_frames_128_255: 0 dev.em.0.mac_stats.tx_frames_256_511: 0 dev.em.0.mac_stats.tx_frames_512_1023: 0 dev.em.0.mac_stats.tx_frames_1024_1522: 0 dev.em.0.mac_stats.tso_txd: 2788 dev.em.0.mac_stats.tso_ctx_fail: 0 dev.em.0.interrupts.asserts: 48733 dev.em.0.interrupts.rx_pkt_timer: 0 dev.em.0.interrupts.rx_abs_timer: 0 dev.em.0.interrupts.tx_pkt_timer: 0 dev.em.0.interrupts.tx_abs_timer: 0 dev.em.0.interrupts.tx_queue_empty: 0 dev.em.0.interrupts.tx_queue_min_thresh: 0 dev.em.0.interrupts.rx_desc_min_thresh: 0 dev.em.0.interrupts.rx_overrun: 0 dev.em.0.wake: 0 At 08:00 PM 9/26/2010, Jack Vogel wrote: >The system I've had stress tests running on has 82574 LOMs, so I hope it >will solve the problem, will see tomorrow morning at how things have held >up... > >Jack > > >On Sun, Sep 26, 2010 at 4:43 PM, Mike Tancsa ><mike@sentex.net> wrote: >At 06:19 PM 9/26/2010, Jack Vogel wrote: >Your em1 is using MSI not MSIX and thus can't have multiple queues. I'm >not sure whats broken from what you show here. I will try to get the new >driver out shortly for you to try. > > >With this particular NIC, it will wedge under high load. I tried 2 >different motherboards and chipsets the same behaviour. > > ---Mike > > >Jack > > > >On Sun, Sep 26, 2010 at 2:57 PM, Mike Tancsa ><mike@sentex.net> wrote: >At 06:36 PM 9/24/2010, Jack Vogel wrote: >There is a new revision of the em driver coming next week, its going thru some >stress pounding over the weekend, if no issues show up I'll put it into HEAD. 
> >Yongari's changes in TX context handling which effects checksum and tso >are added. I've also decided that multiple queues in 82574 just are a source >of problems without a lot of benefit, so it still uses MSIX but with >only 3 vectors, >meaning it seperates TX and RX but has a single queue. > > >Thanks, looking forward to trying it out! With respect to the >multiple queues, I thought the driver already used just the one on >RELENG_8 ? If not, is there a way to force the existing driver to >use just the one queue ? > >On the box that has the NIC locking up, it shows > >em1@pci0:9:0:0: class=0x020000 card=0x34ec8086 chip=0x10d38086 >rev=0x00 hdr=0x00 > > vendor = 'Intel Corporation' > device = 'Intel 82574L Gigabit Ethernet Controller (82574L)' > class = network > subclass = ethernet > cap 01[c8] = powerspec 2 supports D0 D3 current D0 > cap 05[d0] = MSI supports 1 message, 64 bit enabled with 1 message > cap 10[e0] = PCI-Express 1 endpoint max data 128(256) link x1(x1) > >and > >vmstat -i shows > >irq256: em0 5129063 353 >irq257: em1 531251 36 > >in a wedged state, stats look like > >dev.em.1.%desc: Intel(R) PRO/1000 Network Connection 7.0.5 >dev.em.1.%driver: em >dev.em.1.%location: slot=0 function=0 handle=\_SB_.PCI0.PEX4.HART >dev.em.1.%pnpinfo: vendor=0x8086 device=0x10d3 subvendor=0x8086 >subdevice=0x34ec class=0x020000 >dev.em.1.%parent: pci9 >dev.em.1.nvm: -1 >dev.em.1.rx_int_delay: 0 >dev.em.1.tx_int_delay: 66 >dev.em.1.rx_abs_int_delay: 66 >dev.em.1.tx_abs_int_delay: 66 >dev.em.1.rx_processing_limit: 100 >dev.em.1.link_irq: 0 >dev.em.1.mbuf_alloc_fail: 0 >dev.em.1.cluster_alloc_fail: 0 >dev.em.1.dropped: 0 >dev.em.1.tx_dma_fail: 0 >dev.em.1.fc_high_water: 18432 >dev.em.1.fc_low_water: 16932 >dev.em.1.mac_stats.excess_coll: 0 >dev.em.1.mac_stats.symbol_errors: 0 >dev.em.1.mac_stats.sequence_errors: 0 >dev.em.1.mac_stats.defer_count: 0 >dev.em.1.mac_stats.missed_packets: 41522 >dev.em.1.mac_stats.recv_no_buff: 19 >dev.em.1.mac_stats.recv_errs: 0 
>dev.em.1.mac_stats.crc_errs: 0 >dev.em.1.mac_stats.alignment_errs: 0 >dev.em.1.mac_stats.coll_ext_errs: 0 >dev.em.1.mac_stats.rx_overruns: 41398 >dev.em.1.mac_stats.watchdog_timeouts: 0 >dev.em.1.mac_stats.xon_recvd: 0 >dev.em.1.mac_stats.xon_txd: 0 >dev.em.1.mac_stats.xoff_recvd: 0 >dev.em.1.mac_stats.xoff_txd: 0 >dev.em.1.mac_stats.total_pkts_recvd: 95229129 >dev.em.1.mac_stats.good_pkts_recvd: 95187607 >dev.em.1.mac_stats.bcast_pkts_recvd: 79244 >dev.em.1.mac_stats.mcast_pkts_recvd: 0 >dev.em.1.mac_stats.rx_frames_64: 93680 >dev.em.1.mac_stats.rx_frames_65_127: 1516349 >dev.em.1.mac_stats.rx_frames_128_255: 4464941 >dev.em.1.mac_stats.rx_frames_256_511: 4024 >dev.em.1.mac_stats.rx_frames_512_1023: 2096067 >dev.em.1.mac_stats.rx_frames_1024_1522: 87012546 >dev.em.1.mac_stats.good_octets_recvd: 0 >dev.em.1.mac_stats.good_octest_txd: 0 >dev.em.1.mac_stats.total_pkts_txd: 66775098 >dev.em.1.mac_stats.good_pkts_txd: 66775098 >dev.em.1.mac_stats.bcast_pkts_txd: 509 >dev.em.1.mac_stats.mcast_pkts_txd: 7 >dev.em.1.mac_stats.tx_frames_64: 48038472 >dev.em.1.mac_stats.tx_frames_65_127: 13402833 >dev.em.1.mac_stats.tx_frames_128_255: 5324413 >dev.em.1.mac_stats.tx_frames_256_511: 957 >dev.em.1.mac_stats.tx_frames_512_1023: 319 >dev.em.1.mac_stats.tx_frames_1024_1522: 8104 >dev.em.1.mac_stats.tso_txd: 1069 >dev.em.1.mac_stats.tso_ctx_fail: 0 >dev.em.1.interrupts.asserts: 0 >dev.em.1.interrupts.rx_pkt_timer: 0 >dev.em.1.interrupts.rx_abs_timer: 0 >dev.em.1.interrupts.tx_pkt_timer: 0 >dev.em.1.interrupts.tx_abs_timer: 0 >dev.em.1.interrupts.tx_queue_empty: 0 >dev.em.1.interrupts.tx_queue_min_thresh: 0 >dev.em.1.interrupts.rx_desc_min_thresh: 0 >dev.em.1.interrupts.rx_overrun: 0 >dev.em.1.host.breaker_tx_pkt: 0 >dev.em.1.host.host_tx_pkt_discard: 0 >dev.em.1.host.rx_pkt: 0 >dev.em.1.host.breaker_rx_pkts: 0 >dev.em.1.host.breaker_rx_pkt_drop: 0 >dev.em.1.host.tx_good_pkt: 0 >dev.em.1.host.breaker_tx_pkt_drop: 0 >dev.em.1.host.rx_good_bytes: 0 >dev.em.1.host.tx_good_bytes: 0 
>dev.em.1.host.length_errors: 0 >dev.em.1.host.serdes_violation_pkt: 0 >dev.em.1.host.header_redir_missed: 0 > >ifconfig down/up just panics or locks up the box when its in this >state. I also have IPMI enabled on this nic, but it shows the same >issue with it disabled. > > ---Mike > > > >-------------------------------------------------------------------- >Mike Tancsa, tel +1 519 651 3400 >Sentex Communications, >mike@sentex.net >Providing Internet since >1994 ><http://www.sentex.net>www.sentex.net >Cambridge, Ontario >Canada ><http://www.sentex.net/mike>www.sentex.net/mike > > >-------------------------------------------------------------------- >Mike Tancsa, tel +1 519 651 3400 >Sentex >Communications, >mike@sentex.net >Providing Internet since >1994 www.sentex.net >Cambridge, Ontario >Canada www.sentex.net/mike > -------------------------------------------------------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet since 1994 www.sentex.net Cambridge, Ontario Canada www.sentex.net/mike From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 14:37:20 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E3DAF106566C; Sat, 2 Oct 2010 14:37:19 +0000 (UTC) (envelope-from mike@sentex.net) Received: from smarthost2.sentex.ca (smarthost2-6.sentex.ca [IPv6:2607:f3e0:80:80::2]) by mx1.freebsd.org (Postfix) with ESMTP id 23AF58FC14; Sat, 2 Oct 2010 14:37:19 +0000 (UTC) Received: from lava.sentex.ca (pyroxene.sentex.ca [199.212.134.18]) by smarthost2.sentex.ca (8.14.4/8.14.4) with ESMTP id o92EbB1j017473 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Sat, 2 Oct 2010 10:37:11 -0400 (EDT) (envelope-from mike@sentex.net) Received: from mdt-xp.sentex.net (simeon.sentex.ca [192.168.43.27]) by lava.sentex.ca (8.14.4/8.14.4) with ESMTP id o92EbAIl033701; Sat, 2 Oct 2010 10:37:10 -0400 (EDT) 
(envelope-from mike@sentex.net) Message-Id: <201010021437.o92EbAIl033701@lava.sentex.ca> X-Mailer: QUALCOMM Windows Eudora Version 7.1.0.9 Date: Sat, 02 Oct 2010 10:37:01 -0400 To: "Li, Qing" From: Mike Tancsa In-Reply-To: <201009171759.o8HHxCJM037780@lava.sentex.ca> References: <201008312102.o7VL2MJr000894@lava.sentex.ca> <201009012255.o81MtMXn009701@lava.sentex.ca> <201009081512.o88FCIq8064280@lava.sentex.ca> <201009081535.o88FZKQS064396@lava.sentex.ca> <201009101651.o8AGp8uU080952@lava.sentex.ca> <201009171759.o8HHxCJM037780@lava.sentex.ca> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; format=flowed X-Scanned-By: MIMEDefang 2.67 on 205.211.164.50 Cc: freebsd-stable@freebsd.org Subject: RE: if_rtdel: error 47 (netgraph or mpd issue?) X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 14:37:20 -0000 FYI, I disabled ipv6 in mpd as well as set ipv6_enable="NO" and the box has been stable for 2 weeks now. Previously, it would crash every 5 days or so. Something in inet6 or mpd ? ---Mike At 01:59 PM 9/17/2010, Mike Tancsa wrote: >At 12:51 PM 9/10/2010, Mike Tancsa wrote: > > >>FYI, I enabled witness in the kernel and am seeing the following >> >> >>uma_zalloc_arg: zone "128" with the following non-sleepable locks held: >>exclusive rw ifnet_rw (ifnet_rw) r = 0 (0xc0b56ec4) locked @ >>/usr/src/sys/net/if.c:419 > > >Hi, > Another crash. 
I had it break to the serial debugger this time > > >Fatal trap 12: page fault while in kernel mode >cpuid = 1; apic id = 01 >fault virtual address = 0x24 >fault code = supervisor read, page not present >instruction pointer = 0x20:0xc64c79e4 >stack pointer = 0x28:0xe7c84864 >frame pointer = 0x28:0xe7c84a9c >code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, def32 1, gran 1 >processor eflags = interrupt enabled, resume, IOPL = 0 >current process = 1280 (mpd5) >[thread pid 1280 tid 100096 ] >Stopped at ng_path2noderef+0x174: testb $0x1,0x24(%esi) >db> bt >Tracing pid 1280 tid 100096 td 0xc58f7780 >ng_path2noderef(cace4b80,cb0a5350,e7c84ab8,e7c84ab4,0,...) at >ng_path2noderef+0x174 >ng_address_path(cace4b80,c64d4400,cb0a5350,0,28885ba0,...) at >ng_address_path+0x40 >ngc_send(cb66db44,0,cb2f4500,cba946f0,0,...) at ngc_send+0x182 >sosend_generic(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend_generic+0x50d >sosend(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend+0x3f >kern_sendit(c58f7780,8d,e7c84c60,0,0,...) at kern_sendit+0x107 >sendit(0,cba946f0,7,e7c84c7c,1,...) at sendit+0xb1 >sendto(c58f7780,e7c84cf8,c093d225,c091bcfe,282,...) at sendto+0x48 >syscall(e7c84d38) at syscall+0x1da >Xint0x80_syscall() at Xint0x80_syscall+0x21 >--- syscall (133, FreeBSD ELF32, sendto), eip = 0x284b13c7, esp = >0xbf9fe4cc, ebp = 0xbf9fe4f8 --- >db> where >Tracing pid 1280 tid 100096 td 0xc58f7780 >ng_path2noderef(cace4b80,cb0a5350,e7c84ab8,e7c84ab4,0,...) at >ng_path2noderef+0x174 >ng_address_path(cace4b80,c64d4400,cb0a5350,0,28885ba0,...) at >ng_address_path+0x40 >ngc_send(cb66db44,0,cb2f4500,cba946f0,0,...) at ngc_send+0x182 >sosend_generic(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend_generic+0x50d >sosend(cb66db44,cba946f0,e7c84bec,0,0,...) at sosend+0x3f >kern_sendit(c58f7780,8d,e7c84c60,0,0,...) at kern_sendit+0x107 >sendit(0,cba946f0,7,e7c84c7c,1,...) at sendit+0xb1 >sendto(c58f7780,e7c84cf8,c093d225,c091bcfe,282,...) 
at sendto+0x48 >syscall(e7c84d38) at syscall+0x1da >Xint0x80_syscall() at Xint0x80_syscall+0x21 >--- syscall (133, FreeBSD ELF32, sendto), eip = 0x284b13c7, esp = >0xbf9fe4cc, ebp = 0xbf9fe4f8 --- >db> show locks >exclusive sx so_snd_sx (so_snd_sx) r = 0 (0xcb66dc64) locked @ >/usr/src/sys/kern/uipc_sockbuf.c:148 >db> show alllocks >Process 1928 (sshd) thread 0xc6402a00 (100094) >exclusive sx so_rcv_sx (so_rcv_sx) r = 0 (0xc669a898) locked @ >/usr/src/sys/kern/uipc_sockbuf.c:148 >Process 1281 (ng_queue) thread 0xc58f6a00 (100057) >shared rw radix node head (radix node head) r = 0 (0xc56e1580) >locked @ /usr/src/sys/net/route.c:362 >Process 1280 (mpd5) thread 0xc58f7780 (100096) >exclusive sx so_snd_sx (so_snd_sx) r = 0 (0xcb66dc64) locked @ >/usr/src/sys/kern/uipc_sockbuf.c:148 >db> call doadump() >Physical memory: 2032 MB >Dumping 274 MB: 259 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3 >Dump complete > > > > >panic: > >GNU gdb 6.1.1 [FreeBSD] >Copyright 2004 Free Software Foundation, Inc. >GDB is free software, covered by the GNU General Public License, and you are >welcome to change it and/or distribute copies of it under certain conditions. >Type "show copying" to see the conditions. >There is absolutely no warranty for GDB. Type "show warranty" for details. >This GDB was configured as "i386-marcel-freebsd"... 
> >Unread portion of the kernel message buffer: > > >Fatal trap 12: page fault while in kernel mode >cpuid = 1; apic id = 01 >fault virtual address = 0x24 >fault code = supervisor read, page not present >instruction pointer = 0x20:0xc64c79e4 >stack pointer = 0x28:0xe7c84864 >frame pointer = 0x28:0xe7c84a9c >code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, def32 1, gran 1 >processor eflags = interrupt enabled, resume, IOPL = 0 >current process = 1280 (mpd5) >Physical memory: 2032 MB >Dumping 274 MB: 259 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3 > >#0 doadump () at pcpu.h:231 >231 pcpu.h: No such file or directory. > in pcpu.h >(kgdb) #0 doadump () at pcpu.h:231 >#1 0xc04a5899 in db_fncall (dummy1=1, dummy2=0, dummy3=-1061510048, > dummy4=0xe7c84600 "") at /usr/src/sys/ddb/db_command.c:548 >#2 0xc04a5c91 in db_command (last_cmdp=0xc09cf71c, cmd_table=0x0, dopager=1) > at /usr/src/sys/ddb/db_command.c:445 >#3 0xc04a5dea in db_command_loop () at /usr/src/sys/ddb/db_command.c:498 >#4 0xc04a7c6d in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_main.c:229 >#5 0xc069c7ae in kdb_trap (type=12, code=0, tf=0xe7c84824) > at /usr/src/sys/kern/subr_kdb.c:535 >#6 0xc08aabcf in trap_fatal (frame=0xe7c84824, eva=36) > at /usr/src/sys/i386/i386/trap.c:929 >#7 0xc08aadf0 in trap_pfault (frame=0xe7c84824, usermode=0, eva=36) > at /usr/src/sys/i386/i386/trap.c:851 >#8 0xc08ab5e3 in trap (frame=0xe7c84824) at /usr/src/sys/i386/i386/trap.c:533 >#9 0xc088ecdc in calltrap () at /usr/src/sys/i386/i386/exception.s:166 >#10 0xc64c79e4 in ng_path2noderef (here=0xcace4b80, > address=0xcb0a5350 "ctrl", destp=0xe7c84ab8, lasthook=0xe7c84ab4) > at > /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:1757 >#11 0xc64c7d40 in ng_address_path (here=0xcace4b80, item=0xc64d4400, > address=0xcb0a5350 "ctrl", retaddr=0) > at > /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:3536 >#12 0xc64c2662 in ngc_send (so=0xcb66db44, flags=0, 
m=0xcb2f4500, > addr=0xcba946f0, control=0x0, td=0xc58f7780) > at /usr/src/sys/modules/netgraph/socket/../../../netgraph/ng_socket.c:296 >#13 0xc06cf68d in sosend_generic (so=0xcb66db44, addr=0xcba946f0, > uio=0xe7c84bec, top=0xcb2f4500, control=0x0, flags=0, td=0xc58f7780) > at /usr/src/sys/kern/uipc_socket.c:1260 >#14 0xc06cbe2f in sosend (so=0xcb66db44, addr=0xcba946f0, uio=0xe7c84bec, > top=0x0, control=0x0, flags=0, td=0xc58f7780) > at /usr/src/sys/kern/uipc_socket.c:1304 >#15 0xc06d21f7 in kern_sendit (td=0xc58f7780, s=141, mp=0xe7c84c60, flags=0, > control=0x0, segflg=UIO_USERSPACE) > at /usr/src/sys/kern/uipc_syscalls.c:788 >#16 0xc06d23f1 in sendit (td=0xc58f7780, s=141, mp=0xe7c84c60, flags=0) > at /usr/src/sys/kern/uipc_syscalls.c:724 >#17 0xc06d2508 in sendto (td=0xc58f7780, uap=0xe7c84cf8) > at /usr/src/sys/kern/uipc_syscalls.c:840 >#18 0xc08aafea in syscall (frame=0xe7c84d38) > at /usr/src/sys/i386/i386/trap.c:1111 >#19 0xc088ed41 in Xint0x80_syscall () > at /usr/src/sys/i386/i386/exception.s:264 >#20 0x00000033 in ?? () >Previous frame inner to this frame (corrupt stack?) > > >(kgdb) up 10 >#10 0xc64c79e4 in ng_path2noderef (here=0xcace4b80, >address=0xcb0a5350 "ctrl", destp=0xe7c84ab8, lasthook=0xe7c84ab4) > at > /usr/src/sys/modules/netgraph/netgraph/../../../netgraph/ng_base.c:1757 >1757 NG_NODE_UNREF(oldnode); /* XXX another race */ >(kgdb) list >1752 * instead of the direct hook in this crawl? 
>1753 */ >1754 oldnode = node; >1755 if ((node = NG_PEER_NODE(hook))) >1756 NG_NODE_REF(node); /* XXX RACE */ >1757 NG_NODE_UNREF(oldnode); /* XXX another race */ >1758 if (NG_NODE_NOT_VALID(node)) { >1759 NG_NODE_UNREF(node); /* XXX more races */ >1760 node = NULL; >1761 } >(kgdb) > >(kgdb) p *hook >$3 = {hk_name = "ctrl", '\0' , hk_private = >0xcb90a5c0, hk_flags = 0, hk_type = 0, hk_peer = 0xcab92e80, > hk_node = 0xcace4b80, hk_hooks = {le_next = 0x0, le_prev = > 0xcace4bb4}, hk_rcvmsg = 0, hk_rcvdata = 0, hk_refs = 2} >(kgdb) >(kgdb) p *node >Cannot access memory at address 0x0 >(kgdb) > >_______________________________________________ >freebsd-stable@freebsd.org mailing list >http://lists.freebsd.org/mailman/listinfo/freebsd-stable >To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" -------------------------------------------------------------------- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, mike@sentex.net Providing Internet since 1994 www.sentex.net Cambridge, Ontario Canada www.sentex.net/mike From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 14:59:53 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CE67D1065670 for ; Sat, 2 Oct 2010 14:59:53 +0000 (UTC) (envelope-from prvs=189184ef51=killing@multiplay.co.uk) Received: from mail1.multiplay.co.uk (mail1.multiplay.co.uk [85.236.96.23]) by mx1.freebsd.org (Postfix) with ESMTP id 42A748FC16 for ; Sat, 2 Oct 2010 14:59:53 +0000 (UTC) X-MDAV-Processed: mail1.multiplay.co.uk, Sat, 02 Oct 2010 15:49:26 +0100 X-Spam-Processed: mail1.multiplay.co.uk, Sat, 02 Oct 2010 15:49:22 +0100 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on mail1.multiplay.co.uk X-Spam-Level: X-Spam-Status: No, score=-5.0 required=6.0 tests=USER_IN_WHITELIST shortcircuit=ham autolearn=disabled version=3.2.5 Received: from r2d2 by mail1.multiplay.co.uk (MDaemon PRO v10.0.4) 
with ESMTP id md50011344263.msg for ; Sat, 02 Oct 2010 15:49:21 +0100 X-Authenticated-Sender: Killing@multiplay.co.uk X-MDRemoteIP: 188.220.16.49 X-Return-Path: prvs=189184ef51=killing@multiplay.co.uk X-Envelope-From: killing@multiplay.co.uk X-MDaemon-Deliver-To: freebsd-stable@freebsd.org Message-ID: From: "Steven Hartland" To: "Rumen Telbizov" , References: Date: Sat, 2 Oct 2010 15:48:10 +0100 MIME-Version: 1.0 Content-Type: text/plain; format=flowed; charset="iso-8859-1"; reply-type=original Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 6.00.2900.5931 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5994 Cc: Subject: Re: MySQL performance concern X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 14:59:54 -0000

When you say similar hardware, what's the actual spec?
What do you have set in my.cnf?
What config options are you using for zfs?

----- Original Message -----
From: "Rumen Telbizov"
To:
Sent: Saturday, October 02, 2010 8:06 AM
Subject: MySQL performance concern

> Hello everyone,
>
> I am experimenting with MySQL running on FreeBSD and comparing with another
> (older) setup running on a Linux box.
> My results show that performance on Linux is significantly better than
> FreeBSD although the hardware is weaker.
> I'd appreciate your comments and ideas.
>
> Here's the setup:
>
> 1) FreeBSD 8.1-STABLE amd64 (Tue Sep 14 15:29:22 PDT 2010) running on a
> SuperMicro machine with 2 x Dual Core
> Xeon E5502 1.87Ghz ; 4 x SAS 15K in RAID10 setup under ZFS (two mirrored
> pairs) and 2 x SSD X25-E partitioned
> for: 8G for ZIL and the rest for L2ARC; 16G ram with 8 of them given to
> mysql and tons of free.
>
> 2) Linux Gentoo with 3 SATA disks in hardware RAID5 with similar
> cpu/motherboard and same memory size.
>
> The sole application that runs is a python script which inserts a batch of
> lines at a time. Only myisam is used as a format.
> Here's the problem: On the Linux box it manages to push around
> *5800* inserts/second while on the FreeBSD box
> it's only *4000*/second.
>
> MySQL version is 5.1.51
>
> During this load the disk subsystem on FreeBSD is pretty much idle (both the
> SSDs and the SAS disks). CPU utilization
> attributed to mysqld is only around 30%. So I am clearly heavily
> under-utilizing the hardware.
> Linuxthreads support for 64bit architectures is not available so I couldn't
> try that but aside from that I tried recompiling
> mysql with all the different Makefile options available without any effect.
> Changing the recordsize in zfs to 8K doesn't make any difference.
> Tried percona binary without any luck.
>
> Let me know what additional information would be useful and I'll provide it
> here.
>
> Thank you in advance for your comments and suggestions.
>
> Cheers,
> Rumen Telbizov
> _______________________________________________
> freebsd-stable@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
>

================================================
This e.mail is private and confidential between Multiplay (UK) Ltd. and the person or entity to whom it is addressed. In the event of misdirection, the recipient is prohibited from using, copying, printing or otherwise disseminating it or any information contained in it. In the event of misdirection, illegible or incomplete transmission please telephone +44 845 868 1337 or return the E.mail to postmaster@multiplay.co.uk.
From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 15:13:41 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 91F36106566B for ; Sat, 2 Oct 2010 15:13:41 +0000 (UTC) (envelope-from m.seaman@infracaninophile.co.uk) Received: from smtp.infracaninophile.co.uk (smtp6.infracaninophile.co.uk [IPv6:2001:8b0:151:1:3fd3:cd67:fafa:3d78]) by mx1.freebsd.org (Postfix) with ESMTP id DF4568FC0A for ; Sat, 2 Oct 2010 15:13:40 +0000 (UTC) Received: from seedling.black-earth.co.uk (seedling.black-earth.co.uk [81.187.76.163]) (authenticated bits=0) by smtp.infracaninophile.co.uk (8.14.4/8.14.4) with ESMTP id o92FDaVP027329 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO); Sat, 2 Oct 2010 16:13:37 +0100 (BST) (envelope-from m.seaman@infracaninophile.co.uk) X-DKIM: Sendmail DKIM Filter v2.8.3 smtp.infracaninophile.co.uk o92FDaVP027329 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=infracaninophile.co.uk; s=201001-infracaninophile; t=1286032417; bh=J0WRu2aRXcyLdJSx8R6qIEim7nHywIqco3sZoWwOHqE=; h=Message-ID:Date:From:MIME-Version:To:CC:Subject:References: In-Reply-To:Content-Type:Cc:Content-Type:Date:From:In-Reply-To: Message-ID:Mime-Version:References:To; z=Message-ID:=20<4CA74C18.2020405@infracaninophile.co.uk>|Date:=20S at,=2002=20Oct=202010=2016:13:28=20+0100|From:=20Matthew=20Seaman= 20|Organization:=20Infracaninophi le|User-Agent:=20Mozilla/5.0=20(Macintosh=3B=20U=3B=20Intel=20Mac= 20OS=20X=2010.6=3B=20en-GB=3B=20rv:1.9.2.9)=20Gecko/20100915=20Thu nderbird/3.1.4|MIME-Version:=201.0|To:=20=3D?UTF-8?B?UGV0ZXIgQW5rZ XJzdMOlbA=3D=3D?=3D=20|CC:=20"Marat=20N.Afanasyev" =20,=20stable@freebsd.org|Subject:=20Re:=20device=2 0names=20changes=20for=20adX.|References:=20<1C68D21A-9539-473D-AE DB-9A8CCE4956F1@pean.org>=09<4CA73215.9030908@ksu.ru>=20|In-Reply-To:=20|X-Enigmail-Version:=201.1.1|Ope 
nPGP:=20id=3D60AE908C|Content-Type:=20multipart/signed=3B=20micalg =3Dpgp-sha1=3B=0D=0A=20protocol=3D"application/pgp-signature"=3B=0 D=0A=20boundary=3D"------------enig6F1BD6F6B75A7669257774CC"; b=uFi24ekZCP2XIR9SLf+ahsTjgcReV0APDYol6OjcSCg1mKoymaOf9gqGjyk9haA8y 6VRVQehmfdlydJxZQB2WZ2BSFArZgNbabns9tHNuWWkcxGin5g8Iej6QEVStzc7KZx UDuB6qfRvq2RB/EbFxit9o8auMzV6U0NAZHxGLM4= Message-ID: <4CA74C18.2020405@infracaninophile.co.uk> Date: Sat, 02 Oct 2010 16:13:28 +0100 From: Matthew Seaman Organization: Infracaninophile User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-GB; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: =?UTF-8?B?UGV0ZXIgQW5rZXJzdMOlbA==?= References: <1C68D21A-9539-473D-AEDB-9A8CCE4956F1@pean.org> <4CA73215.9030908@ksu.ru> In-Reply-To: X-Enigmail-Version: 1.1.1 OpenPGP: id=60AE908C Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="------------enig6F1BD6F6B75A7669257774CC" X-Virus-Scanned: clamav-milter 0.96.3 at lucid-nonsense.infracaninophile.co.uk X-Virus-Status: Clean X-Spam-Status: No, score=-0.6 required=5.0 tests=BAYES_05,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,SPF_FAIL autolearn=no version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on lucid-nonsense.infracaninophile.co.uk Cc: "Marat N.Afanasyev" , stable@freebsd.org Subject: Re: device names changes for adX. 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 15:13:41 -0000

This is an OpenPGP/MIME signed message (RFC 2440 and 3156)
--------------enig6F1BD6F6B75A7669257774CC
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On 02/10/2010 15:29:49, Peter Ankerstål wrote:
>> Peter Ankerstål wrote:
>>> When I installed FreeBSD 8.1-RELEASE (freebsd-update) the adX devices changed index number and
>>> the machine obviously didn't boot. Due to this I hesitate to install 8.1 on my servers remote. How do I know
>>> if and to what the devices will change?

>> label your filesystems and mount them by label rather than by device name. see
>>
>> man glabel

> Thanks, I may try that. But how will this affect ZFS raidz set up to use ad-drives?
>
> Like this:
>
> 	tank        ONLINE       0     0     0
> 	  raidz1    ONLINE       0     0     0
> 	    ad10s2  ONLINE       0     0     0
> 	    ad12    ONLINE       0     0     0
> 	    ad14    ONLINE       0     0     0
> 	    ad16    ONLINE       0     0     0

It actually shouldn't matter. ZFS writes metadata about the zpools, vdevs etc. it knows about onto the drives, and if it can read the drive and see the metadata, it can reconstruct itself. You can take disks out of a ZFS setup, shuffle them, stick them back into the wrong slots, and ZFS will still work.

Similarly, you can use glabel to name the disks, and switch to using that, and everything should still work. In fact, if you glabel the disks while ZFS is still active, it should instantly recognise all the new names and update the 'zpool status' output on the fly.

Actually, that last is probably the right way to do things -- you'll update the zpool.cache that way, which means that ZFS will come up without any remedial action after reboot.

	Cheers,

	Matthew

--
Dr Matthew J Seaman MA, D.Phil.
7 Priory Courtyard Flat 3 PGP: http://www.infracaninophile.co.uk/pgpkey Ramsgate JID: matthew@infracaninophile.co.uk Kent, CT11 9PW --------------enig6F1BD6F6B75A7669257774CC Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG/MacGPG2 v2.0.14 (Darwin) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAkynTCAACgkQ8Mjk52CukIxIXQCfZjARgmx5JqyXsWTg+t+zCg9C VBcAoIh+/LFFRRLqFe+EID8Y9gIs5Abc =hsgs -----END PGP SIGNATURE----- --------------enig6F1BD6F6B75A7669257774CC-- From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 15:19:57 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3CEF3106566B for ; Sat, 2 Oct 2010 15:19:57 +0000 (UTC) (envelope-from martin@saturn.pcs.ms) Received: from mail4.hostpark.net (mail4.hostpark.net [212.243.197.34]) by mx1.freebsd.org (Postfix) with ESMTP id BB2648FC0A for ; Sat, 2 Oct 2010 15:19:55 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mail4.hostpark.net (Postfix) with ESMTP id 3AE274BB6C; Sat, 2 Oct 2010 17:19:55 +0200 (CEST) X-Virus-Scanned: by Hostpark/NetZone Mailprotection at hostpark.net Received: from mail4.hostpark.net ([127.0.0.1]) by localhost (mail4.hostpark.net [127.0.0.1]) (amavisd-new, port 10124) with ESMTP id gWX7qYEq25Jk; Sat, 2 Oct 2010 17:19:55 +0200 (CEST) Received: from saturn.pcs.ms (246-171.62-81.cust.bluewin.ch [81.62.171.246]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail4.hostpark.net (Postfix) with ESMTP id CBF954BA9B; Sat, 2 Oct 2010 17:19:54 +0200 (CEST) Received: from saturn.pcs.ms (localhost [127.0.0.1]) by saturn.pcs.ms (8.14.4/8.14.4) with ESMTP id o92FJVH7049959 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NOT); 
Sat, 2 Oct 2010 17:19:31 +0200 (CEST) (envelope-from martin@saturn.pcs.ms) Received: (from martin@localhost) by saturn.pcs.ms (8.14.4/8.14.4/Submit) id o92FJVWC049958; Sat, 2 Oct 2010 17:19:31 +0200 (CEST) (envelope-from martin) Date: Sat, 2 Oct 2010 17:19:30 +0200 From: Martin Schweizer To: Jeremy Chadwick Message-ID: <20101002151930.GI74320@saturn.pcs.ms> References: <20101002131106.GH74320@saturn.pcs.ms> <20101002141146.GB70283@icarus.home.lan> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20101002141146.GB70283@icarus.home.lan> X-Organization: PC-Service M. Schweizer GmbH, CH-8608 Bubikon, Switzerland User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-stable@freebsd.org Subject: Re: Broken SASL/Kerberos authentication: openldap client GSSAPI authentication segfaults on FreeBSD 8.1 Release too X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Martin Schweizer List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 15:19:57 -0000 Hello Jeremy Am Sat, Oct 02, 2010 at 07:11:46AM -0700 Jeremy Chadwick schrieb: > On Sat, Oct 02, 2010 at 03:11:07PM +0200, Martin Schweizer wrote: > > [...] > > 3. Now I make buildworld && make buidlkernel && make installkernel and I get the following messages: > > [...] > > > > cc -fpic -DPIC -O2 -pipe -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi -I/usr/src/kerberos5/lib/libgssapi/../../../crypto/heimdal/lib/gssapi/krb5 -I/usr/src/kerb > > make: don't know how to make /usr/obj/usr/src/tmp/usr/lib/libpthread.a. Stop > > *** Error code 2 > > Stop in /usr/src. > > > > What I'm doing wrong? > > Did you specify any -j flags during your "make buildworld" (ex. "make > -j2 buildworld")? > > If so, please remove them and restart the build. Then you will see > where the actual compile/make error happens. 
From the above output, it > doesn't look like it's related to the Kerberos or libgssapi stuff. No, I did not use any flags when I did start make buildworld. Regards, -- Martin Schweizer PC-Service M. Schweizer GmbH; Bannholzstrasse 6; CH-8608 Bubikon Tel. +41 55 243 30 00; Fax: +41 55 243 33 22 From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 15:20:00 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 189341065670 for ; Sat, 2 Oct 2010 15:20:00 +0000 (UTC) (envelope-from bruce@cran.org.uk) Received: from muon.cran.org.uk (unknown [IPv6:2a01:348:0:15:5d59:5c40:0:1]) by mx1.freebsd.org (Postfix) with ESMTP id CBA9E8FC0C for ; Sat, 2 Oct 2010 15:19:59 +0000 (UTC) Received: from muon.cran.org.uk (localhost [127.0.0.1]) by muon.cran.org.uk (Postfix) with ESMTP id 245C5E63CD; Sat, 2 Oct 2010 16:19:59 +0100 (BST) Received: from unknown (client-82-31-11-222.midd.adsl.virginmedia.com [82.31.11.222]) (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by muon.cran.org.uk (Postfix) with ESMTPSA; Sat, 2 Oct 2010 16:19:58 +0100 (BST) Date: Sat, 2 Oct 2010 16:19:56 +0100 From: Bruce Cran To: Matthew Seaman Message-ID: <20101002161956.00007257@unknown> In-Reply-To: <4CA74C18.2020405@infracaninophile.co.uk> References: <1C68D21A-9539-473D-AEDB-9A8CCE4956F1@pean.org> <4CA73215.9030908@ksu.ru> <4CA74C18.2020405@infracaninophile.co.uk> X-Mailer: Claws Mail 3.7.6 (GTK+ 2.16.6; i586-pc-mingw32msvc) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: "Marat N.Afanasyev" , stable@freebsd.org, Peter =?ISO-8859-1?Q?Ankerst=E5l?= Subject: Re: device names changes for adX. 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 15:20:00 -0000 On Sat, 02 Oct 2010 16:13:28 +0100 Matthew Seaman wrote: > In fact, if you glabel the disks while ZFS is still active, it should > instantly recognise all the new names and update the 'zpool status' > output on the fly. Actually, that last is probably the right way to > do things -- you'll update the zpool.cache that way, which means that > ZFS will come up without any remedial action after reboot. Since glabel writes metadata to disk, won't doing this on a disk with a filesystem corrupt something? -- Bruce Cran From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 17:46:27 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AC9281065670 for ; Sat, 2 Oct 2010 17:46:27 +0000 (UTC) (envelope-from mezz.freebsd@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id 406A18FC17 for ; Sat, 2 Oct 2010 17:46:26 +0000 (UTC) Received: by fxm9 with SMTP id 9so3380237fxm.13 for ; Sat, 02 Oct 2010 10:46:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:date:message-id :subject:from:to:content-type; bh=p91muJOAFfoWl9FWhQgjiS5FLu62sFL+0NsUoh85l2Q=; b=NpikBIqKs2x+gh2ziLl7mh9Mt62/nNxFNAXxZ753HwyV0SXw6tGIpe2DILgHWhqR3N /5a33PfUbhp1XLNVUj/hEro0PQdOReszO8Q35EDMQvzQd3BSfFPXvoIb3GCmTpM3y78i UxsxPZiOLBPatGX1rOQiJD7y2VhC2QRML5Q9s= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=duJo3clm7RA9fIJXnLriLk/qZXeC8Wpd0JF4p+sRBa1e4MhXAaAo4QnMdJvzyXaejn 
BdNt2/u6I/ILXwv+yiDPhNm6qvTqOLMYypDoklWrMkx1r+sfVW+6sTUU8X9bpX+TvfJH veUBetG1ngqXwEo0OEOBLId1F+tUizkyhYfhs= MIME-Version: 1.0 Received: by 10.223.114.19 with SMTP id c19mr7125725faq.29.1286040142388; Sat, 02 Oct 2010 10:22:22 -0700 (PDT) Received: by 10.223.126.207 with HTTP; Sat, 2 Oct 2010 10:22:22 -0700 (PDT) Date: Sat, 2 Oct 2010 12:22:22 -0500 Message-ID: From: Jeremy Messenger To: stable@FreeBSD.org Content-Type: text/plain; charset=ISO-8859-1 Cc: Subject: utmp.h exists or not in RELENG_8? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 17:46:27 -0000 My system is RELENG_8 and I have checkout by via csup today. It shows that utmp.h still exists in RELENG_8. But when I see this PR: http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/149945 I have decided to check in the http://www.freebsd.org/cgi/cvsweb.cgi/src/include/?only_with_tag=RELENG_8 ... It shows that utmp.h has been removed. But in the http://sources.freebsd.org/RELENG_8/src/include/ shows a different story as it exists. I am confusing... Is it supposed to be deleted in CVS when it did the SVN->CVS? Or what? I don't have svn installed in my system at the moment, so can't check it now. Please add me in the CC as I am not in the list. 
Cheers, Mezz -- mezz.freebsd@gmail.com - mezz@FreeBSD.org FreeBSD GNOME Team http://www.FreeBSD.org/gnome/ - gnome@FreeBSD.org From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 19:16:18 2010 Return-Path: Delivered-To: stable@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7B956106564A for ; Sat, 2 Oct 2010 19:16:18 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (relay04.stack.nl [IPv6:2001:610:1108:5010::107]) by mx1.freebsd.org (Postfix) with ESMTP id 408D88FC18 for ; Sat, 2 Oct 2010 19:16:18 +0000 (UTC) Received: from turtle.stack.nl (turtle.stack.nl [IPv6:2001:610:1108:5010::132]) by mx1.stack.nl (Postfix) with ESMTP id 468D61DD635; Sat, 2 Oct 2010 21:16:17 +0200 (CEST) Received: by turtle.stack.nl (Postfix, from userid 1677) id 386D9172E5; Sat, 2 Oct 2010 21:16:17 +0200 (CEST) Date: Sat, 2 Oct 2010 21:16:17 +0200 From: Jilles Tjoelker To: Jeremy Messenger Message-ID: <20101002191617.GA73249@stack.nl> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.20 (2009-06-14) Cc: stable@FreeBSD.org Subject: Re: utmp.h exists or not in RELENG_8? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 19:16:18 -0000 On Sat, Oct 02, 2010 at 12:22:22PM -0500, Jeremy Messenger wrote: > My system is RELENG_8 and I have checkout by via csup today. It shows > that utmp.h still exists in RELENG_8. But when I see this PR: > http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/149945 > I have decided to check in the > http://www.freebsd.org/cgi/cvsweb.cgi/src/include/?only_with_tag=RELENG_8 > ... It shows that utmp.h has been removed. 
But in the > http://sources.freebsd.org/RELENG_8/src/include/ shows a different > story as it exists. I am confusing... Is it supposed to be deleted in > CVS when it did the SVN->CVS? Or what? I don't have svn installed in > my system at the moment, so can't check it now. utmp.h has been removed in HEAD (9.x) but is still present in 8.x and earlier branches. It looks like cvsweb is buggy in this area. The build error in ports/149945 may be caused by a stray utmpx related file found by the configure process. Partly because the various unix variant developers have made a mess of utmp/utmpx, the code to use it is rather fragile. -- Jilles Tjoelker From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 19:18:31 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8CB1D106564A for ; Sat, 2 Oct 2010 19:18:31 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 475EB8FC0C for ; Sat, 2 Oct 2010 19:18:31 +0000 (UTC) Received: by qwd6 with SMTP id 6so2654436qwd.13 for ; Sat, 02 Oct 2010 12:18:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=n3qlw4D/5F3C3f20dWXpUy1hsOvKHGjVP+4qZtHMxpw=; b=ItC+2NCyvOhxp50hKpx7VLRscEGwDgcNdrqVQb6/D4saIcoPE7V5N/vukKTcPdFjyx xf0phcr21KBXd7SSL3xtSC/qfXNccSOM8M8PteBE9NZNEWz1Hw8CF3ZzsrKZ44G5Cc0g /uGr82oiuRuSyxCriFEcAAAeU60KU6ho1y26Y= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=FyVHlli5U36w7FePT3k/IzCzpKGnJ4BFAMtW3fKKGNksHI5FHOPjW5aKVz9mTA3MPy dq9Ux9QznOnOQRhgzckyPrgh6/AwlyYjVAz0Ny2zQW8Q3wHcJ0DQxsKUkcjsZlvQ4uzz q9OJjOrICc2T/Ly3P1okRRW6ZEH3TpR/Xu/nA= 
MIME-Version: 1.0 Received: by 10.224.65.91 with SMTP id h27mr5168728qai.13.1286047110302; Sat, 02 Oct 2010 12:18:30 -0700 (PDT) Received: by 10.229.50.8 with HTTP; Sat, 2 Oct 2010 12:18:30 -0700 (PDT) In-Reply-To: References: Date: Sat, 2 Oct 2010 23:18:30 +0400 Message-ID: From: Sergey Kandaurov To: Jeremy Messenger Content-Type: text/plain; charset=ISO-8859-1 Cc: stable@freebsd.org Subject: Re: utmp.h exists or not in RELENG_8? X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 19:18:31 -0000 On 2 October 2010 21:22, Jeremy Messenger wrote: > My system is RELENG_8 and I have checkout by via csup today. It shows > that utmp.h still exists in RELENG_8. But when I see this PR: > > http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/149945 > > I have decided to check in the > http://www.freebsd.org/cgi/cvsweb.cgi/src/include/?only_with_tag=RELENG_8 > ... It shows that utmp.h has been removed. But in the > http://sources.freebsd.org/RELENG_8/src/include/ shows a different > story as it exists. I am confusing... Is it supposed to be deleted in > CVS when it did the SVN->CVS? Or what? I don't have svn installed in > my system at the moment, so can't check it now. > > Please add me in the CC as I am not in the list. > I have a suspect, cvsweb.cgi handles such case incorrectly, i.e. when file is removed in MAIN branch, but it exists in BRANCH_X, then passing the only_with_tag=RELENG_X will show such file in Attic. 
-- wbr, pluknet From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 19:58:35 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A82A8106564A for ; Sat, 2 Oct 2010 19:58:35 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id 65A398FC14 for ; Sat, 2 Oct 2010 19:58:35 +0000 (UTC) Received: from elsa.codelab.cz (localhost.codelab.cz [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 8B08419E027 for ; Sat, 2 Oct 2010 21:58:32 +0200 (CEST) Received: from [192.168.1.2] (ip-86-49-61-235.net.upcbroadband.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 62AB619E023 for ; Sat, 2 Oct 2010 21:58:28 +0200 (CEST) Message-ID: <4CA78EE3.9020005@quip.cz> Date: Sat, 02 Oct 2010 21:58:27 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1.13) Gecko/20100914 SeaMonkey/2.0.8 MIME-Version: 1.0 To: freebsd-stable Content-Type: text/plain; charset=ISO-8859-2; format=flowed Content-Transfer-Encoding: 7bit Subject: is there a bug in AWK on 6.x and 7.x (fixed in 8.x)? 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 19:58:35 -0000

I think there is a bug in AWK in base of FreeBSD 6.x and 7.x (tested on 6.4
i386 and 7.3 i386)

I have this simple test case, where I want 2 columns from GeoIP CSV file:

awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv

It should produce output like this:

# awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0"-"1.7.255.255"
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

(above is taken from FreeBSD 8.1 i386)

On FreeBSD 6.4 and 7.3 it results in broken first line:

awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

There are no errors in the CSV file. It doesn't matter if I delete the
affected first line from the file: whichever line comes first is the one
that is printed unsplit. It is reproducible with a handmade file:

# cat test.csv
"1.9.0.0","1.9.255.255","17367040","17432575","MY","Malaysia"
"1.10.10.0","1.10.10.255","17435136","17435391","AU","Australia"
"1.11.0.0","1.11.255.255","17498112","17563647","KR","Korea, Republic of"
"1.12.0.0","1.15.255.255","17563648","17825791","CN","China"
"1.16.0.0","1.19.255.255","17825792","18087935","KR","Korea, Republic of"
"1.21.0.0","1.21.255.255","18153472","18219007","JP","Japan"

# awk 'FS="," { print $1"-"$2 }' test.csv
"1.9.0.0","1.9.255.255","17367040","17432575","MY","Malaysia"-
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"
"1.16.0.0"-"1.19.255.255"
"1.21.0.0"-"1.21.255.255"

As it works in 8.1, can it be fixed in 7-STABLE? (I don't know if it was
fixed on purpose or if it is a side effect of the newer version of AWK in
8.x)

Should I file a PR for it?
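
One possible reading of the 6.4/7.3 output, sketched under stated assumptions: in the command above, FS="," sits in the pattern position, so the assignment is evaluated once per record, after that record has already been read. If a record is split with the FS that was in effect when it was read (which is how I understand POSIX awk and gawk to behave), the first record is still split on whitespace, contains no whitespace, and so ends up entirely in $1. The toy model below is plain Python rather than awk; run_awk and its parameters are hypothetical names, and it models nothing beyond print $1"-"$2. Why 8.1 prints a split first line is not modelled here.

```python
def run_awk(lines, fs_in_pattern=None, fs_at_start=None):
    """Toy model of awk's record loop for the program: print $1"-"$2

    fs_at_start   models FS being set before any input is read
                  (for example -F ",", -v FS="," or BEGIN { FS="," }).
    fs_in_pattern models FS="," written in the pattern position, where
                  the assignment runs once per record.
    Assumption: a record is split with the FS in effect when it is read.
    """
    fs = fs_at_start if fs_at_start is not None else " "
    out = []
    for record in lines:
        # Split with the current FS; the default " " means runs of whitespace.
        fields = record.split() if fs == " " else record.split(fs)
        if fs_in_pattern is not None:
            # The pattern assigns FS. The result is a non-empty string, so
            # the pattern is true and the action runs, but only records
            # read after this point are split with the new separator.
            fs = fs_in_pattern
        first = fields[0] if len(fields) > 0 else ""
        second = fields[1] if len(fields) > 1 else ""
        out.append(first + "-" + second)
    return out


rows = [
    '"1.9.0.0","1.9.255.255","17367040","17432575","MY","Malaysia"',
    '"1.10.10.0","1.10.10.255","17435136","17435391","AU","Australia"',
]

print("FS assigned in the pattern:")
for line in run_awk(rows, fs_in_pattern=","):
    print(line)

print("FS set before the first record:")
for line in run_awk(rows, fs_at_start=","):
    print(line)
```

Under this model the first variant reproduces the unsplit first line shown for 6.4 and 7.3, while setting FS before the first record is read splits every line the same way.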
Miroslav Lachman From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 20:18:21 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5CE4E1065672 for ; Sat, 2 Oct 2010 20:18:21 +0000 (UTC) (envelope-from telbizov@gmail.com) Received: from mail-qy0-f182.google.com (mail-qy0-f182.google.com [209.85.216.182]) by mx1.freebsd.org (Postfix) with ESMTP id 1607B8FC18 for ; Sat, 2 Oct 2010 20:18:20 +0000 (UTC) Received: by qyk33 with SMTP id 33so2324269qyk.13 for ; Sat, 02 Oct 2010 13:18:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type; bh=m8XrcdzOUcQfo2D+NeZe99as+VNnBOWOZqCwSeaj+o8=; b=NdQXZS6qQimUuR+FUxYSJMOBKQAy4PvcGxQJ87bagEVra26nL18ZBLT2bB1bJioN9o Ggq1i89dk2z7JaltvicJn40HSTyf+0V70j4o2jAPZd1z8jKo1T3UeLw/JNx/+AlculMN mbBSiDtzYXe8KPykQy8ft3Xo0ccRhphb/nPuo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=TSlNMPi4gqH405QgTqo+BEgon2OtSNBp7yKn2N5mjj424ve2q267DiwUn3Wq9M8rzN xI5Hd+oVf34KBYBIgjYO0hVyRuFzIIXgQbOmp3iWsp4YlnxPzYMvyE/6abZAhWtNcDl6 qxjQRsaTYvM2J0aCQZ4UKTMzf3idLDBxbVEWc= MIME-Version: 1.0 Received: by 10.229.240.76 with SMTP id kz12mr5390911qcb.65.1286050700405; Sat, 02 Oct 2010 13:18:20 -0700 (PDT) Received: by 10.229.191.132 with HTTP; Sat, 2 Oct 2010 13:18:20 -0700 (PDT) In-Reply-To: References: Date: Sat, 2 Oct 2010 13:18:20 -0700 Message-ID: From: Rumen Telbizov To: Steven Hartland Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-stable@freebsd.org Subject: Re: MySQL performance concern X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 20:18:21 -0000

Hello everyone,

Here's the requested information below:

FreeBSD
mysql 5.1.51:

my.cnf:
skip-external-locking
key_buffer_size = 8192M
max_allowed_packet = 16M
table_open_cache = 2048
sort_buffer_size = 64M
read_buffer_size = 8M
read_rnd_buffer_size = 16M
myisam_sort_buffer_size = 256M
thread_cache_size = 64
query_cache_size = 32M
thread_concurrency = 8
max_heap_table_size = 6G

hardware:
FreeBSD 8.1-STABLE amd64 (Tue Sep 14 15:29:22 PDT 2010) running on a
SuperMicro machine with X8DTU motherboard and 2 x Dual Core Xeon E5502
1.87Ghz; 4 x SAS 15K in RAID10 setup under ZFS (two mirrored pairs) and
2 x SSD X25-E partitioned for: 8G for ZIL and the rest for L2ARC; 16G RAM.
Disk controller is LSI 4Hi in IT (Initiator Target) mode.

--
Linux
Gentoo (2.6.18-164.10.1.el5.028stab067.4)
mysql 5.1.50
--

my.cnf:
skip-external-locking
key_buffer = 4G
max_heap_table_size = 6G
max_allowed_packet = 1M
table_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M

Linux runs as an OpenVZ VE inside CentOS. It's the only VE and has all the
memory allocated to it.

hardware node:
2 x Xeon Quad E5410 @ 2.33GHz on SuperMicro X7DBU motherboard; 16G RAM;
4 SATA 1T disks in hardware raid 5 attached to a 3ware controller; NO SSDs

Some other notes:

* It is indeed a single thread which inserts into the mysql so yes it's
  only one core which handles the application and another one for MySQL.
  What is interesting here, like I mentioned, is that on FreeBSD mysql
  process doesn't get more than 30-40% CPU utilization. So it has a lot
  of headroom. gstat also shows 0% disk load

* It is exactly the same database schema. In fact it's only one table
  that's inserted heavily into. It is a partition table with only one
  HASH index which looks something like this:
  PRIMARY KEY (`IntField`,`DateField`,`Varchar150Field`) USING HASH.
  The speed difference is obvious right from the beginning. I don't have
  to wait for any data to accrue to see a degradation. I don't wait for
  more than a 100'000 records to be processed.

* Application maintains only 1 local TCP connection to mysql. They both
  run on the same host

* As for the ZFS. Here's the pool configuration:

  pool: tank
config:

        NAME            STATE     READ WRITE CKSUM
        tank            ONLINE       0     0     0
          mirror        ONLINE       0     0     0
            gpt/tank0   ONLINE       0     0     0
            gpt/tank1   ONLINE       0     0     0
          mirror        ONLINE       0     0     0
            gpt/tank2   ONLINE       0     0     0
            gpt/tank3   ONLINE       0     0     0
        logs            ONLINE       0     0     0
          mirror        ONLINE       0     0     0
            gpt/zil0    ONLINE       0     0     0
            gpt/zil1    ONLINE       0     0     0
        cache
          gpt/l2arc0    ONLINE       0     0     0
          gpt/l2arc1    ONLINE       0     0     0

  pool: zroot
config:

        NAME            STATE     READ WRITE CKSUM
        zroot           ONLINE       0     0     0
          mirror        ONLINE       0     0     0
            gpt/zroot0  ONLINE       0     0     0
            gpt/zroot1  ONLINE       0     0     0

zroot is a couple of small partitions from two of the same SAS disks.
zil and l2arc are 8 and 22G partitions from 32G SSDs

I pretty much have no zfs tuning done since from what I've found there
shouldn't be any needed since I'm running 8.1 on a 64bit machine.
Let me know if you'd like me to experiment with any ...

Some additional information:

# sysctl vm.kmem_size
vm.kmem_size: 5539958784

# sysctl vm.kmem_size_max
vm.kmem_size_max: 329853485875

# sysctl vfs.zfs.arc_max
vfs.zfs.arc_max: 4466216960

I think this answers all the questions so far.
Let me know what you think. I might be missing something obvious.
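
For a rough sense of scale, the numbers posted for the FreeBSD box can be added up. This is only a sketch under loud assumptions: it treats the MyISAM key buffer, the query cache and the ZFS ARC ceiling as if all three could fill completely at the same time, and it ignores per-connection buffers, in-memory (heap) tables and everything else the kernel and userland need. The variable names are mine; the values are copied from the my.cnf and sysctl output in the message.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

# Values copied from the FreeBSD section of the message.
key_buffer_size = 8192 * MIB   # my.cnf: key_buffer_size = 8192M
query_cache_size = 32 * MIB    # my.cnf: query_cache_size = 32M
arc_max = 4466216960           # sysctl vfs.zfs.arc_max, in bytes
ram_total = 16 * GIB           # "16G RAM"


def to_gib(num_bytes):
    """Convert a byte count to GiB as a float."""
    return num_bytes / GIB


# Assumption: these three caches are simply additive in the worst case.
cache_ceiling = key_buffer_size + query_cache_size + arc_max
remaining = ram_total - cache_ceiling

print("cache ceiling: %.2f GiB" % to_gib(cache_ceiling))
print("left over:     %.2f GiB" % to_gib(remaining))
```

The result says nothing about the insert rate by itself; it only shows how much of the 16G is spoken for if those caches fill, which may be worth keeping in mind before raising vfs.zfs.arc_max or the key buffer further.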
Thank you, Rumen Telbizov From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 20:40:53 2010 Return-Path: Delivered-To: stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 452611065673 for ; Sat, 2 Oct 2010 20:40:53 +0000 (UTC) (envelope-from mezz.freebsd@gmail.com) Received: from mail-fx0-f54.google.com (mail-fx0-f54.google.com [209.85.161.54]) by mx1.freebsd.org (Postfix) with ESMTP id CF6C38FC12 for ; Sat, 2 Oct 2010 20:40:52 +0000 (UTC) Received: by fxm9 with SMTP id 9so3434884fxm.13 for ; Sat, 02 Oct 2010 13:40:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:content-type; bh=efSccB1rkyMuAw9Hrtpjppgz7VMRO1LrqP/hS7i7vUY=; b=XMcUrDxXrVhrvciV0lMFirixAS5LcTuI0V6krP806VOF0ZIlUGvi/tMgyAGd+fIas0 /GeKXMYDfr0nrXHzjkYamnsEmk1rboaNCiZgx1pgp3s3TMtgLxDICkkjkXZBlVQkhgMB TmMZf2dyrG5JdtTRpGLHMJQPVE7/bDmK7LEoo= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type; b=OuWJP1ueYJylqu4iJNrK2sEFqe5rmfcg2Ezg1V/bLd6nESl9/rRPii507TupeqeQ0Y KoJIAUEltfP5WOfaJG7RmIVMSzZ1Jr1mF4Hk7rZAC6BIit1V/rBuLUfdAVM5LmSDPOaD N4IK9BpD4WwEy9LgVN7WAdpYViENFoFZnFLTY= MIME-Version: 1.0 Received: by 10.223.105.145 with SMTP id t17mr95354fao.88.1286052049059; Sat, 02 Oct 2010 13:40:49 -0700 (PDT) Received: by 10.223.126.207 with HTTP; Sat, 2 Oct 2010 13:40:49 -0700 (PDT) In-Reply-To: References: Date: Sat, 2 Oct 2010 15:40:49 -0500 Message-ID: From: Jeremy Messenger To: stable@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Cc: Subject: Re: utmp.h exists or not in RELENG_8? 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 20:40:53 -0000 Thanks all, it's cvsweb bug then.. Cheers, Mezz On Sat, Oct 2, 2010 at 12:22 PM, Jeremy Messenger wrote: > My system is RELENG_8 and I have checkout by via csup today. It shows > that utmp.h still exists in RELENG_8. But when I see this PR: > > http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/149945 > > I have decided to check in the > http://www.freebsd.org/cgi/cvsweb.cgi/src/include/?only_with_tag=RELENG_8 > ... It shows that utmp.h has been removed. But in the > http://sources.freebsd.org/RELENG_8/src/include/ shows a different > story as it exists. I am confusing... Is it supposed to be deleted in > CVS when it did the SVN->CVS? Or what? I don't have svn installed in > my system at the moment, so can't check it now. > > Please add me in the CC as I am not in the list. 
> > Cheers, > Mezz > > > -- > mezz.freebsd@gmail.com - mezz@FreeBSD.org > FreeBSD GNOME Team > http://www.FreeBSD.org/gnome/ - gnome@FreeBSD.org > -- mezz.freebsd@gmail.com - mezz@FreeBSD.org FreeBSD GNOME Team http://www.FreeBSD.org/gnome/ - gnome@FreeBSD.org From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 21:21:11 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DD2E71065673 for ; Sat, 2 Oct 2010 21:21:10 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id 7C3B58FC13 for ; Sat, 2 Oct 2010 21:21:10 +0000 (UTC) Received: from elsa.codelab.cz (localhost.codelab.cz [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 0E50819E027; Sat, 2 Oct 2010 23:21:09 +0200 (CEST) Received: from [192.168.1.2] (ip-86-49-61-235.net.upcbroadband.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 485E419E023; Sat, 2 Oct 2010 23:21:06 +0200 (CEST) Message-ID: <4CA7A241.4050507@quip.cz> Date: Sat, 02 Oct 2010 23:21:05 +0200 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1.13) Gecko/20100914 SeaMonkey/2.0.8 MIME-Version: 1.0 To: Damian Weber References: <4CA78EE3.9020005@quip.cz> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable Subject: Re: is there a bug in AWK on 6.x and 7.x (fixed in 8.x)? 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 21:21:11 -0000

Damian Weber wrote:
>
>
> On Sat, 2 Oct 2010, Miroslav Lachman wrote:
>
>> Date: Sat, 02 Oct 2010 21:58:27 +0200
>> From: Miroslav Lachman<000.fbsd@quip.cz>
>> To: freebsd-stable
>> Subject: is there a bug in AWK on 6.x and 7.x (fixed in 8.x)?
>>
>> I think there is a bug in AWK in base of FreeBSD 6.x and 7.x (tested on 6.4
>> i386 and 7.3 i386)
>>
>> I have this simple test case, where I want 2 columns from GeoIP CSV file:
>>
>> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv
>>
>> It should produce output like this:
>>
>> # awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
>> "1.0.0.0"-"1.7.255.255"
>> "1.9.0.0"-"1.9.255.255"
>> "1.10.10.0"-"1.10.10.255"
>> "1.11.0.0"-"1.11.255.255"
>> "1.12.0.0"-"1.15.255.255"
>>
>> (above is taken from FreeBSD 8.1 i386)
>>
>> On FreeBSD 6.4 and 7.3 it results in broken first line:
>>
>> awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
>> "1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
>> "1.9.0.0"-"1.9.255.255"
>> "1.10.10.0"-"1.10.10.255"
>> "1.11.0.0"-"1.11.255.255"
>> "1.12.0.0"-"1.15.255.255"
>>
>
> Are you sure the command above contains a valid variable assignment?

I am not an AWK expert, so maybe you are right. I just found this
difference between 7.x and 8.x. But if it works for the other lines, why
doesn't it work for the first line too?

Anyway, thank you for the working examples, I will use them!

Another working example from 6.4 is:

awk -F "," '{ print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0"-"1.7.255.255"
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

> The following works on both 7.3-STABLE and 8.1-STABLE
>
> $ awk -v FS="," '{ print $1"-"$2; }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0"-"1.7.255.255"
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
>
> The following works as well
>
> $ awk '{ print $1"-"$2; }' FS="," GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0"-"1.7.255.255"
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> Or, using a BEGIN section for assignment...
>
> $ awk 'BEGIN {FS=","} { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0"-"1.7.255.255"
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> As a side note, gawk shows the following output on 7-STABLE and 8-STABLE
> $ gawk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
> "1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
> "1.9.0.0"-"1.9.255.255"
> "1.10.10.0"-"1.10.10.255"
> "1.11.0.0"-"1.11.255.255"
> "1.12.0.0"-"1.15.255.255"
>
> ... which means the new behaviour of awk on 8-STABLE seems to break
> compatibility with gawk at that point.
> > -- Damian From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 21:48:45 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 930B3106564A for ; Sat, 2 Oct 2010 21:48:45 +0000 (UTC) (envelope-from dweber@htw-saarland.de) Received: from theia.rz.uni-saarland.de (theia.rz.uni-saarland.de [134.96.7.31]) by mx1.freebsd.org (Postfix) with ESMTP id 215BE8FC0C for ; Sat, 2 Oct 2010 21:48:44 +0000 (UTC) Received: from zdve-mailx.htw-saarland.de (zdve-mailx.htw-saarland.de [134.96.208.108]) by theia.rz.uni-saarland.de (8.14.1/8.14.0) with ESMTP id o92KwcmG023815; Sat, 2 Oct 2010 22:58:38 +0200 Received: from magritte.htw-saarland.de (magritte.htw-saarland.de [134.96.216.98]) by zdve-mailx.htw-saarland.de (8.13.8/8.13.8) with ESMTP id o92KwbKJ023288; Sat, 2 Oct 2010 22:58:37 +0200 (CEST) Date: Sat, 2 Oct 2010 22:58:34 +0200 (CEST) From: Damian Weber To: Miroslav Lachman <000.fbsd@quip.cz> In-Reply-To: <4CA78EE3.9020005@quip.cz> Message-ID: References: <4CA78EE3.9020005@quip.cz> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Virus-Scanned: clamav-milter 0.96 at zdve-mailx X-Virus-Status: Clean X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (theia.rz.uni-saarland.de [134.96.7.31]); Sat, 02 Oct 2010 22:58:38 +0200 (CEST) X-AntiVirus: checked by AntiVir MailGate (version: 2.1.2-14; AVE: 7.9.4.72; VDF: 7.10.12.111; host: AntiVir1) Cc: freebsd-stable Subject: Re: is there a bug in AWK on 6.x and 7.x (fixed in 8.x)? 
X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 21:48:45 -0000 On Sat, 2 Oct 2010, Miroslav Lachman wrote: > Date: Sat, 02 Oct 2010 21:58:27 +0200 > From: Miroslav Lachman <000.fbsd@quip.cz> > To: freebsd-stable > Subject: is there a bug in AWK on 6.x and 7.x (fixed in 8.x)? > > I think there is a bug in AWK in base of FreeBSD 6.x and 7.x (tested on 6.4 > i386 and 7.3 i386) > > I have this simple test case, where I want 2 columns from GeoIP CSV file: > > awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv > > It should produce output like this: > > # awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5 > "1.0.0.0"-"1.7.255.255" > "1.9.0.0"-"1.9.255.255" > "1.10.10.0"-"1.10.10.255" > "1.11.0.0"-"1.11.255.255" > "1.12.0.0"-"1.15.255.255" > > (above is taken from FreeBSD 8.1 i386) > > On FreeBSD 6.4 and 7.3 it results in broken first line: > > awk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5 > "1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"- > "1.9.0.0"-"1.9.255.255" > "1.10.10.0"-"1.10.10.255" > "1.11.0.0"-"1.11.255.255" > "1.12.0.0"-"1.15.255.255" > Are you sure the command above contains a valid variable assignment? The following works on both 7.3-STABLE and 8.1-STABLE $ awk -v FS="," '{ print $1"-"$2; }' GeoIPCountryWhois.csv | head -n 5 "1.0.0.0"-"1.7.255.255" "1.9.0.0"-"1.9.255.255" "1.10.10.0"-"1.10.10.255" "1.11.0.0"-"1.11.255.255" "1.12.0.0"-"1.15.255.255" The following works as well $ awk '{ print $1"-"$2; }' FS="," GeoIPCountryWhois.csv | head -n 5 "1.0.0.0"-"1.7.255.255" "1.9.0.0"-"1.9.255.255" "1.10.10.0"-"1.10.10.255" "1.11.0.0"-"1.11.255.255" "1.12.0.0"-"1.15.255.255" Or, using a BEGIN section for assignment... 
$ awk 'BEGIN {FS=","} { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0"-"1.7.255.255"
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

As a side note, gawk shows the following output on 7-STABLE and 8-STABLE

$ gawk 'FS="," { print $1"-"$2 }' GeoIPCountryWhois.csv | head -n 5
"1.0.0.0","1.7.255.255","16777216","17301503","AU","Australia"-
"1.9.0.0"-"1.9.255.255"
"1.10.10.0"-"1.10.10.255"
"1.11.0.0"-"1.11.255.255"
"1.12.0.0"-"1.15.255.255"

... which means the new behaviour of awk on 8-STABLE seems to break
compatibility with gawk at that point.

-- Damian

From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 22:09:29 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3190C106564A for ; Sat, 2 Oct 2010 22:09:29 +0000 (UTC) (envelope-from dan@langille.org) Received: from nyi.unixathome.org (nyi.unixathome.org [64.147.113.42]) by mx1.freebsd.org (Postfix) with ESMTP id C65E18FC13 for ; Sat, 2 Oct 2010 22:09:28 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id 56795509B6; Sat, 2 Oct 2010 23:09:28 +0100 (BST) X-Virus-Scanned: amavisd-new at unixathome.org Received: from nyi.unixathome.org ([127.0.0.1]) by localhost (nyi.unixathome.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id UvKGIEeAWBtB; Sat, 2 Oct 2010 23:09:28 +0100 (BST) Received: from smtp-auth.unixathome.org (smtp-auth.unixathome.org [10.4.7.7]) (Authenticated sender: hidden) by nyi.unixathome.org (Postfix) with ESMTPSA id 0A62A508D8 ; Sat, 2 Oct 2010 23:09:28 +0100 (BST) Message-ID: <4CA7AD95.9040703@langille.org> Date: Sat, 02 Oct 2010 18:09:25 -0400 From: Dan Langille Organization: The FreeBSD Diary User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jeremy Chadwick , freebsd-stable
References: <4CA73702.5080203@langille.org> <20101002141921.GC70283@icarus.home.lan> In-Reply-To: <20101002141921.GC70283@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: out of HDD space - zfs degraded X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 22:09:29 -0000 On 10/2/2010 10:19 AM, Jeremy Chadwick wrote: > On Sat, Oct 02, 2010 at 09:43:30AM -0400, Dan Langille wrote: >> Overnight I was running a zfs send | zfs receive (both within the >> same system / zpool). The system ran out of space, a drive went off >> line, and the system is degraded. >> >> This is a raidz2 array running on FreeBSD 8.1-STABLE #0: Sat Sep 18 >> 23:43:48 EDT 2010. >> >> The following logs are also available at >> http://www.langille.org/tmp/zfs-space.txt<- no line wrapping >> >> This is what was running: >> >> # time zfs send storage/bacula@transfer | mbuffer | zfs receive >> storage/compressed/bacula-mbuffer >> in @ 0.0 kB/s, out @ 0.0 kB/s, 3670 GB total, buffer 100% >> fullcannot receive new filesystem stream: out of space >> mbuffer: error: outputThread: error writing to at offset >> 0x395917c4000: Broken pipe >> >> summary: 3670 GByte in 10 h 40 min 97.8 MB/s >> mbuffer: warning: error during output to: Broken pipe >> warning: cannot send 'storage/bacula@transfer': Broken pipe >> >> real 640m48.423s >> user 8m52.660s >> sys 211m40.862s >> >> >> Looking in the logs, I see this: >> >> Oct 2 00:50:53 kraken kernel: (ada0:siisch0:0:0:0): lost device >> Oct 2 00:50:54 kraken kernel: siisch0: Timeout on slot 30 >> Oct 2 00:50:54 kraken kernel: siisch0: siis_timeout is 00040000 ss >> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 >> Oct 2 00:50:54 kraken kernel: siisch0: Error while READ LOG EXT >> Oct 2 
00:50:55 kraken kernel: siisch0: Timeout on slot 30 >> Oct 2 00:50:55 kraken kernel: siisch0: siis_timeout is 00040000 ss >> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 >> Oct 2 00:50:55 kraken kernel: siisch0: Error while READ LOG EXT >> Oct 2 00:50:56 kraken kernel: siisch0: Timeout on slot 30 >> Oct 2 00:50:56 kraken kernel: siisch0: siis_timeout is 00040000 ss >> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 >> Oct 2 00:50:56 kraken kernel: siisch0: Error while READ LOG EXT >> Oct 2 00:50:57 kraken kernel: siisch0: Timeout on slot 30 >> Oct 2 00:50:57 kraken kernel: siisch0: siis_timeout is 00040000 ss >> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 >> Oct 2 00:50:57 kraken kernel: siisch0: Error while READ LOG EXT >> Oct 2 00:50:58 kraken kernel: siisch0: Timeout on slot 30 >> Oct 2 00:50:58 kraken kernel: siisch0: siis_timeout is 00040000 ss >> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 >> Oct 2 00:50:58 kraken kernel: siisch0: Error while READ LOG EXT >> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage >> path=/dev/gpt/disk06-live offset=270336 size=8192 error=6 >> >> Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): Synchronize >> cache failed >> Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): removing device entry >> >> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage >> path=/dev/gpt/disk06-live offset=2000187564032 size=8192 error=6 >> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage >> path=/dev/gpt/disk06-live offset=2000187826176 size=8192 error=6 >> >> $ zpool status >> pool: storage >> state: DEGRADED >> scrub: scrub in progress for 5h32m, 17.16% done, 26h44m to go >> config: >> >> NAME STATE READ WRITE CKSUM >> storage DEGRADED 0 0 0 >> raidz2 DEGRADED 0 0 0 >> gpt/disk01-live ONLINE 0 0 0 >> gpt/disk02-live ONLINE 0 0 0 >> gpt/disk03-live ONLINE 0 0 0 >> gpt/disk04-live ONLINE 0 0 0 >> gpt/disk05-live ONLINE 0 0 0 >> gpt/disk06-live 
REMOVED 0 0 0 >> gpt/disk07-live ONLINE 0 0 0 >> >> $ zfs list >> NAME USED AVAIL REFER MOUNTPOINT >> storage 6.97T 1.91T 1.75G /storage >> storage/bacula 4.72T 1.91T 4.29T /storage/bacula >> storage/compressed 2.25T 1.91T 46.9K /storage/compressed >> storage/compressed/bacula 2.25T 1.91T 42.7K /storage/compressed/bacula >> storage/pgsql 5.50G 1.91T 5.50G /storage/pgsql >> >> $ sudo camcontrol devlist >> Password: >> at scbus2 target 0 lun 0 (pass1,ada1) >> at scbus3 target 0 lun 0 (pass2,ada2) >> at scbus4 target 0 lun 0 (pass3,ada3) >> at scbus5 target 0 lun 0 (pass4,ada4) >> at scbus6 target 0 lun 0 (pass5,ada5) >> at scbus7 target 0 lun 0 (pass6,ada6) >> at scbus8 target 0 lun 0 (pass7,ada7) >> at scbus9 target 0 lun 0 (cd0,pass8) >> at scbus10 target 0 lun 0 (pass9,ada8) >> >> I'm not yet sure if the drive is fully dead or not. This is not a >> hot-swap box. > > It looks to me like the disk labelled gpt/disk06-live literally stopped > responding to commands. The errors you see are coming from the OS and > the siis(4) controller, and both indicate the actual hard disk isn't > responding to the ATA command READ LOG EXT. error=6 means Device not > configured. > > I can't see how/why running out of space would cause this. It looks > more like that you had a hardware issue of some sort happen during the > course of the operations you were running. It may not have happened > until now because you hadn't utilised writes to that area of the disk > (could have bad sectors there, or physical media/platter problems). > > Please provide smartctl -a output for the drive that's gpt/disk06-live, > which I assume is /dev/ada6 (glabel sure makes correlation easy, doesn't > it? Sigh...). Please put the results up on the web somewhere, not > copy-pasted, otherwise I have to do a bunch of manual work with regards > to line wrapping/etc... I'll provide an analysis of SMART stats for > you, to see if anything crazy happened to the disk itself. 
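
As an aside on the error=6 in the ZFS lines quoted above: on FreeBSD, errno 6 is ENXIO ("Device not configured"), which matches the reading given in the quoted reply. Below is a minimal sketch for pulling the vdev path and error code out of such log lines. It assumes the syslog format shown in the quoted excerpt; summarize_vdev_errors is a hypothetical helper name, not a ZFS or FreeBSD tool, and only error 6 is mapped to a name.

```shell
# Sample line copied from the log excerpt quoted above.
log_line='Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage path=/dev/gpt/disk06-live offset=270336 size=8192 error=6'

# Hypothetical helper: reads syslog lines on stdin and prints
# "<vdev path> <error name>" for every ZFS vdev I/O failure line.
summarize_vdev_errors() {
    awk '
        /ZFS: vdev I\/O failure/ {
            path = ""; err = ""
            # Walk the key=value tokens and keep the two we care about.
            for (i = 1; i <= NF; i++) {
                n = split($i, kv, "=")
                if (n == 2 && kv[1] == "path")  path = kv[2]
                if (n == 2 && kv[1] == "error") err = kv[2]
            }
            # Only errno 6 (ENXIO) is named here; others are printed numerically.
            if (err + 0 == 6) { name = "ENXIO" } else { name = "errno " err }
            print path, name
        }'
}

printf '%s\n' "$log_line" | summarize_vdev_errors
```

Feeding it /var/log/messages instead of the single sample line would list every affected vdev, which helps when matching a GPT label to the device that CAM reported as lost.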
It is ada0, I'm sure, based on the 'lost device' mentioned in
/var/log/messages above.

I'm getting nowhere. /dev/ada0 does not exist so there is nothing
for smartctl to work on.

$ sudo smartctl -a /dev/ada0
smartctl 5.39.1 2010-01-28 r3054 [FreeBSD 8.1-STABLE amd64] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

/dev/ada0: Unable to detect device type
Smartctl: please specify device type with the -d option.

Use smartctl -h to get a usage summary

$ sudo smartctl -d ata -a /dev/ada0
smartctl 5.39.1 2010-01-28 r3054 [FreeBSD 8.1-STABLE amd64] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net

Smartctl open device: /dev/ada0 failed: No such file or directory

$ ls -l /dev/ada0*
ls: /dev/ada0*: No such file or directory

I am tempted to reboot or do a camcontrol scan.

-- 
Dan Langille - http://langille.org/

From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 22:30:08 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 235B5106566B for ; Sat, 2 Oct 2010 22:30:08 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta06.westchester.pa.mail.comcast.net (qmta06.westchester.pa.mail.comcast.net [76.96.62.56]) by mx1.freebsd.org (Postfix) with ESMTP id C2EF18FC13 for ; Sat, 2 Oct 2010 22:30:07 +0000 (UTC) Received: from omta16.westchester.pa.mail.comcast.net ([76.96.62.88]) by qmta06.westchester.pa.mail.comcast.net with comcast id Dxtq1f0041uE5Es56yW81J; Sat, 02 Oct 2010 22:30:08 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta16.westchester.pa.mail.comcast.net with comcast id DyW61f00B3LrwQ23cyW7li; Sat, 02 Oct 2010 22:30:08 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 666A39B418; Sat, 2 Oct 2010 15:30:05 -0700 (PDT) Date: Sat, 2 Oct 2010 15:30:05 -0700 From: Jeremy Chadwick To: Rumen Telbizov Message-ID: 
<20101002223005.GA78136@icarus.home.lan> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable@freebsd.org, Steven Hartland Subject: Re: MySQL performance concern X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 22:30:08 -0000

On Sat, Oct 02, 2010 at 01:18:20PM -0700, Rumen Telbizov wrote:
> Hello everyone,
>
> Here's the requested information below:

The tunings between your Linux and FreeBSD instances differ severely,
and some of the variables don't even exist any longer (example:
table_cache is now known as table_open_cache as of MySQL 5.1.3, and
probably key_buffer vs. key_buffer_size too).

Can you please rule out MySQL tunings being responsible for the problem?
Here are your configuration bits, more sanely written:

                            FreeBSD         Linux
--------------------------  --------------  ------------------
MySQL version               5.1.51          5.1.50
--------------------------  --------------  ------------------
my.cnf tuning               FreeBSD         Linux
--------------------------  --------------  ------------------
key_buffer_size             8 GB            ????
key_buffer                  ????            4 GB
max_allowed_packet          16 MB           1 MB
table_open_cache            2048            ????
table_cache                 ????            64
sort_buffer_size            64 MB           512 KB
read_buffer_size            8 MB            256 KB
read_rnd_buffer_size        16 MB           512 KB
net_buffer_length           ????            8 KB
myisam_sort_buffer_size     256 MB          8 MB
thread_cache_size           64              ????
query_cache_size            32 MB           ????
thread_concurrency          8               ????
max_heap_table_size         6 GB            6 GB
--------------------------  --------------  ------------------

Can you also please provide "top" output for the mysqld process on
FreeBSD?

> * As for the ZFS. Here's the pool configuration:

If you move things to UFS2, does the problem disappear?
You might not be seeing any disk I/O on the filesystem with gstat because ZFS ARC could have all of the data in it. -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. PGP: 4BD6C0CB | From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 22:36:28 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CF0D1106566B for ; Sat, 2 Oct 2010 22:36:28 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta13.westchester.pa.mail.comcast.net (qmta13.westchester.pa.mail.comcast.net [76.96.59.243]) by mx1.freebsd.org (Postfix) with ESMTP id 7BBA48FC17 for ; Sat, 2 Oct 2010 22:36:28 +0000 (UTC) Received: from omta15.westchester.pa.mail.comcast.net ([76.96.62.87]) by qmta13.westchester.pa.mail.comcast.net with comcast id DxCQ1f0071swQuc5DycU1D; Sat, 02 Oct 2010 22:36:28 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta15.westchester.pa.mail.comcast.net with comcast id DycT1f0083LrwQ23bycUAp; Sat, 02 Oct 2010 22:36:28 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 4FF199B418; Sat, 2 Oct 2010 15:36:26 -0700 (PDT) Date: Sat, 2 Oct 2010 15:36:26 -0700 From: Jeremy Chadwick To: Dan Langille Message-ID: <20101002223626.GB78136@icarus.home.lan> References: <4CA73702.5080203@langille.org> <20101002141921.GC70283@icarus.home.lan> <4CA7AD95.9040703@langille.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA7AD95.9040703@langille.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable Subject: Re: out of HDD space - zfs degraded X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Sat, 02 Oct 2010 22:36:28 -0000 On Sat, Oct 02, 2010 at 06:09:25PM -0400, Dan Langille wrote: > On 10/2/2010 10:19 AM, Jeremy Chadwick wrote: > >On Sat, Oct 02, 2010 at 09:43:30AM -0400, Dan Langille wrote: > >>Overnight I was running a zfs send | zfs receive (both within the > >>same system / zpool). The system ran out of space, a drive went off > >>line, and the system is degraded. > >> > >>This is a raidz2 array running on FreeBSD 8.1-STABLE #0: Sat Sep 18 > >>23:43:48 EDT 2010. > >> > >>The following logs are also available at > >>http://www.langille.org/tmp/zfs-space.txt<- no line wrapping > >> > >>This is what was running: > >> > >># time zfs send storage/bacula@transfer | mbuffer | zfs receive > >>storage/compressed/bacula-mbuffer > >>in @ 0.0 kB/s, out @ 0.0 kB/s, 3670 GB total, buffer 100% > >>fullcannot receive new filesystem stream: out of space > >>mbuffer: error: outputThread: error writing to at offset > >>0x395917c4000: Broken pipe > >> > >>summary: 3670 GByte in 10 h 40 min 97.8 MB/s > >>mbuffer: warning: error during output to: Broken pipe > >>warning: cannot send 'storage/bacula@transfer': Broken pipe > >> > >>real 640m48.423s > >>user 8m52.660s > >>sys 211m40.862s > >> > >> > >>Looking in the logs, I see this: > >> > >>Oct 2 00:50:53 kraken kernel: (ada0:siisch0:0:0:0): lost device > >>Oct 2 00:50:54 kraken kernel: siisch0: Timeout on slot 30 > >>Oct 2 00:50:54 kraken kernel: siisch0: siis_timeout is 00040000 ss > >>40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 > >>Oct 2 00:50:54 kraken kernel: siisch0: Error while READ LOG EXT > >>Oct 2 00:50:55 kraken kernel: siisch0: Timeout on slot 30 > >>Oct 2 00:50:55 kraken kernel: siisch0: siis_timeout is 00040000 ss > >>40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 > >>Oct 2 00:50:55 kraken kernel: siisch0: Error while READ LOG EXT > >>Oct 2 00:50:56 kraken kernel: siisch0: Timeout on slot 30 > >>Oct 2 00:50:56 kraken kernel: siisch0: siis_timeout is 
00040000 ss > >>40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 > >>Oct 2 00:50:56 kraken kernel: siisch0: Error while READ LOG EXT > >>Oct 2 00:50:57 kraken kernel: siisch0: Timeout on slot 30 > >>Oct 2 00:50:57 kraken kernel: siisch0: siis_timeout is 00040000 ss > >>40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 > >>Oct 2 00:50:57 kraken kernel: siisch0: Error while READ LOG EXT > >>Oct 2 00:50:58 kraken kernel: siisch0: Timeout on slot 30 > >>Oct 2 00:50:58 kraken kernel: siisch0: siis_timeout is 00040000 ss > >>40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 > >>Oct 2 00:50:58 kraken kernel: siisch0: Error while READ LOG EXT > >>Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage > >>path=/dev/gpt/disk06-live offset=270336 size=8192 error=6 > >> > >>Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): Synchronize > >>cache failed > >>Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): removing device entry > >> > >>Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage > >>path=/dev/gpt/disk06-live offset=2000187564032 size=8192 error=6 > >>Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage > >>path=/dev/gpt/disk06-live offset=2000187826176 size=8192 error=6 > >> > >>$ zpool status > >> pool: storage > >> state: DEGRADED > >> scrub: scrub in progress for 5h32m, 17.16% done, 26h44m to go > >>config: > >> > >> NAME STATE READ WRITE CKSUM > >> storage DEGRADED 0 0 0 > >> raidz2 DEGRADED 0 0 0 > >> gpt/disk01-live ONLINE 0 0 0 > >> gpt/disk02-live ONLINE 0 0 0 > >> gpt/disk03-live ONLINE 0 0 0 > >> gpt/disk04-live ONLINE 0 0 0 > >> gpt/disk05-live ONLINE 0 0 0 > >> gpt/disk06-live REMOVED 0 0 0 > >> gpt/disk07-live ONLINE 0 0 0 > >> > >>$ zfs list > >>NAME USED AVAIL REFER MOUNTPOINT > >>storage 6.97T 1.91T 1.75G /storage > >>storage/bacula 4.72T 1.91T 4.29T /storage/bacula > >>storage/compressed 2.25T 1.91T 46.9K /storage/compressed > >>storage/compressed/bacula 2.25T 1.91T 42.7K 
/storage/compressed/bacula > >>storage/pgsql 5.50G 1.91T 5.50G /storage/pgsql > >> > >>$ sudo camcontrol devlist > >>Password: > >> at scbus2 target 0 lun 0 (pass1,ada1) > >> at scbus3 target 0 lun 0 (pass2,ada2) > >> at scbus4 target 0 lun 0 (pass3,ada3) > >> at scbus5 target 0 lun 0 (pass4,ada4) > >> at scbus6 target 0 lun 0 (pass5,ada5) > >> at scbus7 target 0 lun 0 (pass6,ada6) > >> at scbus8 target 0 lun 0 (pass7,ada7) > >> at scbus9 target 0 lun 0 (cd0,pass8) > >> at scbus10 target 0 lun 0 (pass9,ada8) > >> > >>I'm not yet sure if the drive is fully dead or not. This is not a > >>hot-swap box. > > > >It looks to me like the disk labelled gpt/disk06-live literally stopped > >responding to commands. The errors you see are coming from the OS and > >the siis(4) controller, and both indicate the actual hard disk isn't > >responding to the ATA command READ LOG EXT. error=6 means Device not > >configured. > > > >I can't see how/why running out of space would cause this. It looks > >more like that you had a hardware issue of some sort happen during the > >course of the operations you were running. It may not have happened > >until now because you hadn't utilised writes to that area of the disk > >(could have bad sectors there, or physical media/platter problems). > > > >Please provide smartctl -a output for the drive that's gpt/disk06-live, > >which I assume is /dev/ada6 (glabel sure makes correlation easy, doesn't > >it? Sigh...). Please put the results up on the web somewhere, not > >copy-pasted, otherwise I have to do a bunch of manual work with regards > >to line wrapping/etc... I'll provide an analysis of SMART stats for > >you, to see if anything crazy happened to the disk itself. > > It is ada0, I'm sure, based on the 'lost device' mentioned in > /var/log/messages above. > > I'm getting nowhere. /dev/ada0 does not exist so there is nothing > for smartctl to work on. > > [...] 
>
> $ ls -l /dev/ada0*
> ls: /dev/ada0*: No such file or directory

Okay, so gpt/disk06-live is /dev/ada0. (I won't ask why the label is
called "disk06", but whatever. :-) )

> I am tempted to reboot or do a camcontrol scan.

DO NOT REBOOT. You can try the following -- I'm not sure whether to
use scbus0 or scbus1 as the argument however, since I don't know what
scbusX number ada0 was attached to previously. "dmesg" from when the
machine booted would show this.

camcontrol reset scbusX
camcontrol rescan scbusX

If the disk comes back, please smartctl -a it.

-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 23:23:21 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1666D106564A for ; Sat, 2 Oct 2010 23:23:21 +0000 (UTC) (envelope-from dan@langille.org) Received: from nyi.unixathome.org (nyi.unixathome.org [64.147.113.42]) by mx1.freebsd.org (Postfix) with ESMTP id B8E4C8FC13 for ; Sat, 2 Oct 2010 23:23:20 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by nyi.unixathome.org (Postfix) with ESMTP id 1E0BB508D8; Sun, 3 Oct 2010 00:23:20 +0100 (BST) X-Virus-Scanned: amavisd-new at unixathome.org Received: from nyi.unixathome.org ([127.0.0.1]) by localhost (nyi.unixathome.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id UthwVOVZhghc; Sun, 3 Oct 2010 00:23:19 +0100 (BST) Received: from smtp-auth.unixathome.org (smtp-auth.unixathome.org [10.4.7.7]) (Authenticated sender: hidden) by nyi.unixathome.org (Postfix) with ESMTPSA id C6052508AD ; Sun, 3 Oct 2010 00:23:19 +0100 (BST) Message-ID: <4CA7BEE4.9050201@langille.org> Date: Sat, 02 Oct 2010 19:23:16 -0400 From: Dan Langille Organization: The FreeBSD Diary User-Agent: Mozilla/5.0 (Windows; 
U; Windows NT 6.0; en-US; rv:1.9.2.9) Gecko/20100915 Thunderbird/3.1.4 MIME-Version: 1.0 To: Jeremy Chadwick References: <4CA73702.5080203@langille.org> <20101002141921.GC70283@icarus.home.lan> <4CA7AD95.9040703@langille.org> <20101002223626.GB78136@icarus.home.lan> In-Reply-To: <20101002223626.GB78136@icarus.home.lan> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-stable Subject: Re: out of HDD space - zfs degraded X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 23:23:21 -0000 On 10/2/2010 6:36 PM, Jeremy Chadwick wrote: > On Sat, Oct 02, 2010 at 06:09:25PM -0400, Dan Langille wrote: >> On 10/2/2010 10:19 AM, Jeremy Chadwick wrote: >>> On Sat, Oct 02, 2010 at 09:43:30AM -0400, Dan Langille wrote: >>>> Overnight I was running a zfs send | zfs receive (both within the >>>> same system / zpool). The system ran out of space, a drive went off >>>> line, and the system is degraded. >>>> >>>> This is a raidz2 array running on FreeBSD 8.1-STABLE #0: Sat Sep 18 >>>> 23:43:48 EDT 2010. 
>>>> >>>> The following logs are also available at >>>> http://www.langille.org/tmp/zfs-space.txt<- no line wrapping >>>> >>>> This is what was running: >>>> >>>> # time zfs send storage/bacula@transfer | mbuffer | zfs receive >>>> storage/compressed/bacula-mbuffer >>>> in @ 0.0 kB/s, out @ 0.0 kB/s, 3670 GB total, buffer 100% >>>> fullcannot receive new filesystem stream: out of space >>>> mbuffer: error: outputThread: error writing to at offset >>>> 0x395917c4000: Broken pipe >>>> >>>> summary: 3670 GByte in 10 h 40 min 97.8 MB/s >>>> mbuffer: warning: error during output to: Broken pipe >>>> warning: cannot send 'storage/bacula@transfer': Broken pipe >>>> >>>> real 640m48.423s >>>> user 8m52.660s >>>> sys 211m40.862s >>>> >>>> >>>> Looking in the logs, I see this: >>>> >>>> Oct 2 00:50:53 kraken kernel: (ada0:siisch0:0:0:0): lost device >>>> Oct 2 00:50:54 kraken kernel: siisch0: Timeout on slot 30 >>>> Oct 2 00:50:54 kraken kernel: siisch0: siis_timeout is 00040000 ss >>>> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 >>>> Oct 2 00:50:54 kraken kernel: siisch0: Error while READ LOG EXT >>>> Oct 2 00:50:55 kraken kernel: siisch0: Timeout on slot 30 >>>> Oct 2 00:50:55 kraken kernel: siisch0: siis_timeout is 00040000 ss >>>> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 >>>> Oct 2 00:50:55 kraken kernel: siisch0: Error while READ LOG EXT >>>> Oct 2 00:50:56 kraken kernel: siisch0: Timeout on slot 30 >>>> Oct 2 00:50:56 kraken kernel: siisch0: siis_timeout is 00040000 ss >>>> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 >>>> Oct 2 00:50:56 kraken kernel: siisch0: Error while READ LOG EXT >>>> Oct 2 00:50:57 kraken kernel: siisch0: Timeout on slot 30 >>>> Oct 2 00:50:57 kraken kernel: siisch0: siis_timeout is 00040000 ss >>>> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 >>>> Oct 2 00:50:57 kraken kernel: siisch0: Error while READ LOG EXT >>>> Oct 2 00:50:58 kraken kernel: siisch0: Timeout on slot 30 >>>> Oct 
2 00:50:58 kraken kernel: siisch0: siis_timeout is 00040000 ss >>>> 40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000 >>>> Oct 2 00:50:58 kraken kernel: siisch0: Error while READ LOG EXT >>>> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage >>>> path=/dev/gpt/disk06-live offset=270336 size=8192 error=6 >>>> >>>> Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): Synchronize >>>> cache failed >>>> Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): removing device entry >>>> >>>> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage >>>> path=/dev/gpt/disk06-live offset=2000187564032 size=8192 error=6 >>>> Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage >>>> path=/dev/gpt/disk06-live offset=2000187826176 size=8192 error=6 >>>> >>>> $ zpool status >>>> pool: storage >>>> state: DEGRADED >>>> scrub: scrub in progress for 5h32m, 17.16% done, 26h44m to go >>>> config: >>>> >>>> NAME STATE READ WRITE CKSUM >>>> storage DEGRADED 0 0 0 >>>> raidz2 DEGRADED 0 0 0 >>>> gpt/disk01-live ONLINE 0 0 0 >>>> gpt/disk02-live ONLINE 0 0 0 >>>> gpt/disk03-live ONLINE 0 0 0 >>>> gpt/disk04-live ONLINE 0 0 0 >>>> gpt/disk05-live ONLINE 0 0 0 >>>> gpt/disk06-live REMOVED 0 0 0 >>>> gpt/disk07-live ONLINE 0 0 0 >>>> >>>> $ zfs list >>>> NAME USED AVAIL REFER MOUNTPOINT >>>> storage 6.97T 1.91T 1.75G /storage >>>> storage/bacula 4.72T 1.91T 4.29T /storage/bacula >>>> storage/compressed 2.25T 1.91T 46.9K /storage/compressed >>>> storage/compressed/bacula 2.25T 1.91T 42.7K /storage/compressed/bacula >>>> storage/pgsql 5.50G 1.91T 5.50G /storage/pgsql >>>> >>>> $ sudo camcontrol devlist >>>> Password: >>>> at scbus2 target 0 lun 0 (pass1,ada1) >>>> at scbus3 target 0 lun 0 (pass2,ada2) >>>> at scbus4 target 0 lun 0 (pass3,ada3) >>>> at scbus5 target 0 lun 0 (pass4,ada4) >>>> at scbus6 target 0 lun 0 (pass5,ada5) >>>> at scbus7 target 0 lun 0 (pass6,ada6) >>>> at scbus8 target 0 lun 0 (pass7,ada7) >>>> at scbus9 target 0 lun 0 (cd0,pass8) 
>>>> at scbus10 target 0 lun 0 (pass9,ada8) >>>> >>>> I'm not yet sure if the drive is fully dead or not. This is not a >>>> hot-swap box. >>> >>> It looks to me like the disk labelled gpt/disk06-live literally stopped >>> responding to commands. The errors you see are coming from the OS and >>> the siis(4) controller, and both indicate the actual hard disk isn't >>> responding to the ATA command READ LOG EXT. error=6 means Device not >>> configured. >>> >>> I can't see how/why running out of space would cause this. It looks >>> more like that you had a hardware issue of some sort happen during the >>> course of the operations you were running. It may not have happened >>> until now because you hadn't utilised writes to that area of the disk >>> (could have bad sectors there, or physical media/platter problems). >>> >>> Please provide smartctl -a output for the drive that's gpt/disk06-live, >>> which I assume is /dev/ada6 (glabel sure makes correlation easy, doesn't >>> it? Sigh...). Please put the results up on the web somewhere, not >>> copy-pasted, otherwise I have to do a bunch of manual work with regards >>> to line wrapping/etc... I'll provide an analysis of SMART stats for >>> you, to see if anything crazy happened to the disk itself. >> >> It is ada0, I'm sure, based on the 'lost device' mentioned in >> /var/log/messages above. >> >> I'm getting nowhere. /dev/ada0 does not exist so there is nothing >> for smartctl to work on. >> >> [...] >> >> $ ls -l /dev/ada0* >> ls: /dev/ada0*: No such file or directory > > Okay, so gpt/disk06-live is /dev/ada0. (I won't ask why the label is > called "disk06", but whatever. :-) ) > >> I am tempted to reboot or do a camcontrol scan. > > DO NOT REBOOT. You can try the following -- I'm not sure whether to > use scbus0 or scbus1 as the argument however, since I don't know what > scbusX number ada0 was attached to previously. "dmesg" from when the > machine booted would show this. 
>
> camcontrol reset scbusX
> camcontrol rescan scbusX

I see this in /var/run/dmesg.boot:

ada0 at siisch0 bus 0 scbus0 target 0 lun 0
ada0: ATA-8 SATA 2.x device
ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled

$ sudo camcontrol reset scbus0
Password:
Reset of bus 0 was successful

$ sudo camcontrol rescan scbus0
Re-scan of bus 0 was successful

> If the disk comes back, please smartctl -a it.

It didn't come back:

$ ls /dev/ada*
/dev/ada1    /dev/ada2p1  /dev/ada4    /dev/ada5p1  /dev/ada7
/dev/ada1p1  /dev/ada3    /dev/ada4p1  /dev/ada6    /dev/ada8
/dev/ada2    /dev/ada3p1  /dev/ada5    /dev/ada6p1

FYI, there's nothing new in /var/log/messages as a result of those
commands.

-- 
Dan Langille - http://langille.org/

From owner-freebsd-stable@FreeBSD.ORG Sat Oct 2 23:50:25 2010 Return-Path: Delivered-To: freebsd-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D148A106566C for ; Sat, 2 Oct 2010 23:50:25 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta11.emeryville.ca.mail.comcast.net (qmta11.emeryville.ca.mail.comcast.net [76.96.27.211]) by mx1.freebsd.org (Postfix) with ESMTP id B563B8FC15 for ; Sat, 2 Oct 2010 23:50:25 +0000 (UTC) Received: from omta19.emeryville.ca.mail.comcast.net ([76.96.30.76]) by qmta11.emeryville.ca.mail.comcast.net with comcast id DzpT1f0041eYJf8ABzqRnK; Sat, 02 Oct 2010 23:50:25 +0000 Received: from koitsu.dyndns.org ([98.248.41.155]) by omta19.emeryville.ca.mail.comcast.net with comcast id DzqQ1f00A3LrwQ201zqQqd; Sat, 02 Oct 2010 23:50:25 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 47C7C9B418; Sat, 2 Oct 2010 16:50:24 -0700 (PDT) Date: Sat, 2 Oct 2010 16:50:24 -0700 From: Jeremy Chadwick To: Dan Langille Message-ID: <20101002235024.GA80643@icarus.home.lan> References: <4CA73702.5080203@langille.org> <20101002141921.GC70283@icarus.home.lan> <4CA7AD95.9040703@langille.org> 
<20101002223626.GB78136@icarus.home.lan> <4CA7BEE4.9050201@langille.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4CA7BEE4.9050201@langille.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-stable Subject: Re: out of HDD space - zfs degraded X-BeenThere: freebsd-stable@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Production branch of FreeBSD source code List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Oct 2010 23:50:26 -0000 On Sat, Oct 02, 2010 at 07:23:16PM -0400, Dan Langille wrote: > On 10/2/2010 6:36 PM, Jeremy Chadwick wrote: > >On Sat, Oct 02, 2010 at 06:09:25PM -0400, Dan Langille wrote: > >>On 10/2/2010 10:19 AM, Jeremy Chadwick wrote: > >>>On Sat, Oct 02, 2010 at 09:43:30AM -0400, Dan Langille wrote: > >>>>Overnight I was running a zfs send | zfs receive (both within the > >>>>same system / zpool). The system ran out of space, a drive went off > >>>>line, and the system is degraded. > >>>> > >>>>This is a raidz2 array running on FreeBSD 8.1-STABLE #0: Sat Sep 18 > >>>>23:43:48 EDT 2010. 
> >>>>
> >>>>The following logs are also available at
> >>>>http://www.langille.org/tmp/zfs-space.txt <- no line wrapping
> >>>>
> >>>>This is what was running:
> >>>>
> >>>># time zfs send storage/bacula@transfer | mbuffer | zfs receive
> >>>>storage/compressed/bacula-mbuffer
> >>>>in @ 0.0 kB/s, out @ 0.0 kB/s, 3670 GB total, buffer 100%
> >>>>fullcannot receive new filesystem stream: out of space
> >>>>mbuffer: error: outputThread: error writing to at offset
> >>>>0x395917c4000: Broken pipe
> >>>>
> >>>>summary: 3670 GByte in 10 h 40 min 97.8 MB/s
> >>>>mbuffer: warning: error during output to: Broken pipe
> >>>>warning: cannot send 'storage/bacula@transfer': Broken pipe
> >>>>
> >>>>real 640m48.423s
> >>>>user 8m52.660s
> >>>>sys 211m40.862s
> >>>>
> >>>>
> >>>>Looking in the logs, I see this:
> >>>>
> >>>>Oct 2 00:50:53 kraken kernel: (ada0:siisch0:0:0:0): lost device
> >>>>Oct 2 00:50:54 kraken kernel: siisch0: Timeout on slot 30
> >>>>Oct 2 00:50:54 kraken kernel: siisch0: siis_timeout is 00040000 ss
> >>>>40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
> >>>>Oct 2 00:50:54 kraken kernel: siisch0: Error while READ LOG EXT
> >>>>Oct 2 00:50:55 kraken kernel: siisch0: Timeout on slot 30
> >>>>Oct 2 00:50:55 kraken kernel: siisch0: siis_timeout is 00040000 ss
> >>>>40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
> >>>>Oct 2 00:50:55 kraken kernel: siisch0: Error while READ LOG EXT
> >>>>Oct 2 00:50:56 kraken kernel: siisch0: Timeout on slot 30
> >>>>Oct 2 00:50:56 kraken kernel: siisch0: siis_timeout is 00040000 ss
> >>>>40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
> >>>>Oct 2 00:50:56 kraken kernel: siisch0: Error while READ LOG EXT
> >>>>Oct 2 00:50:57 kraken kernel: siisch0: Timeout on slot 30
> >>>>Oct 2 00:50:57 kraken kernel: siisch0: siis_timeout is 00040000 ss
> >>>>40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
> >>>>Oct 2 00:50:57 kraken kernel: siisch0: Error while READ LOG EXT
> >>>>Oct 2 00:50:58 kraken kernel: siisch0: Timeout on slot 30
> >>>>Oct 2 00:50:58 kraken kernel: siisch0: siis_timeout is 00040000 ss
> >>>>40000000 rs 40000000 es 00000000 sts 801f0040 serr 00000000
> >>>>Oct 2 00:50:58 kraken kernel: siisch0: Error while READ LOG EXT
> >>>>Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage
> >>>>path=/dev/gpt/disk06-live offset=270336 size=8192 error=6
> >>>>
> >>>>Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): Synchronize
> >>>>cache failed
> >>>>Oct 2 00:50:59 kraken kernel: (ada0:siisch0:0:0:0): removing device entry
> >>>>
> >>>>Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage
> >>>>path=/dev/gpt/disk06-live offset=2000187564032 size=8192 error=6
> >>>>Oct 2 00:50:59 kraken root: ZFS: vdev I/O failure, zpool=storage
> >>>>path=/dev/gpt/disk06-live offset=2000187826176 size=8192 error=6
> >>>>
> >>>>$ zpool status
> >>>>  pool: storage
> >>>> state: DEGRADED
> >>>> scrub: scrub in progress for 5h32m, 17.16% done, 26h44m to go
> >>>>config:
> >>>>
> >>>>        NAME                 STATE     READ WRITE CKSUM
> >>>>        storage              DEGRADED     0     0     0
> >>>>          raidz2             DEGRADED     0     0     0
> >>>>            gpt/disk01-live  ONLINE       0     0     0
> >>>>            gpt/disk02-live  ONLINE       0     0     0
> >>>>            gpt/disk03-live  ONLINE       0     0     0
> >>>>            gpt/disk04-live  ONLINE       0     0     0
> >>>>            gpt/disk05-live  ONLINE       0     0     0
> >>>>            gpt/disk06-live  REMOVED      0     0     0
> >>>>            gpt/disk07-live  ONLINE       0     0     0
> >>>>
> >>>>$ zfs list
> >>>>NAME                        USED  AVAIL  REFER  MOUNTPOINT
> >>>>storage                    6.97T  1.91T  1.75G  /storage
> >>>>storage/bacula             4.72T  1.91T  4.29T  /storage/bacula
> >>>>storage/compressed         2.25T  1.91T  46.9K  /storage/compressed
> >>>>storage/compressed/bacula  2.25T  1.91T  42.7K  /storage/compressed/bacula
> >>>>storage/pgsql              5.50G  1.91T  5.50G  /storage/pgsql
> >>>>
> >>>>$ sudo camcontrol devlist
> >>>>Password:
> >>>> at scbus2 target 0 lun 0 (pass1,ada1)
> >>>> at scbus3 target 0 lun 0 (pass2,ada2)
> >>>> at scbus4 target 0 lun 0 (pass3,ada3)
> >>>> at scbus5 target 0 lun 0 (pass4,ada4)
> >>>> at scbus6 target 0 lun 0 (pass5,ada5)
> >>>> at scbus7 target 0 lun 0 (pass6,ada6)
> >>>> at scbus8 target 0 lun 0 (pass7,ada7)
> >>>> at scbus9 target 0 lun 0 (cd0,pass8)
> >>>> at scbus10 target 0 lun 0 (pass9,ada8)
> >>>>
> >>>>I'm not yet sure if the drive is fully dead or not. This is not a
> >>>>hot-swap box.
> >>>
> >>>It looks to me like the disk labelled gpt/disk06-live literally stopped
> >>>responding to commands. The errors you see are coming from the OS and
> >>>the siis(4) controller, and both indicate the actual hard disk isn't
> >>>responding to the ATA command READ LOG EXT. error=6 means Device not
> >>>configured.
> >>>
> >>>I can't see how/why running out of space would cause this. It looks
> >>>more like that you had a hardware issue of some sort happen during the
> >>>course of the operations you were running. It may not have happened
> >>>until now because you hadn't utilised writes to that area of the disk
> >>>(could have bad sectors there, or physical media/platter problems).
> >>>
> >>>Please provide smartctl -a output for the drive that's gpt/disk06-live,
> >>>which I assume is /dev/ada6 (glabel sure makes correlation easy, doesn't
> >>>it? Sigh...). Please put the results up on the web somewhere, not
> >>>copy-pasted, otherwise I have to do a bunch of manual work with regards
> >>>to line wrapping/etc... I'll provide an analysis of SMART stats for
> >>>you, to see if anything crazy happened to the disk itself.
> >>
> >>It is ada0, I'm sure, based on the 'lost device' mentioned in
> >>/var/log/messages above.
> >>
> >>I'm getting nowhere. /dev/ada0 does not exist so there is nothing
> >>for smartctl to work on.
> >>
> >>[...]
> >>
> >>$ ls -l /dev/ada0*
> >>ls: /dev/ada0*: No such file or directory
> >
> >Okay, so gpt/disk06-live is /dev/ada0. (I won't ask why the label is
> >called "disk06", but whatever. :-) )
> >
> >>I am tempted to reboot or do a camcontrol scan.
> >
> >DO NOT REBOOT. You can try the following -- I'm not sure whether to
> >use scbus0 or scbus1 as the argument however, since I don't know what
> >scbusX number ada0 was attached to previously. "dmesg" from when the
> >machine booted would show this.
> >
> >camcontrol reset scbusX
> >camcontrol rescan scbusX
>
> I see this in /var/run/dmesg.boot:
>
> ada0 at siisch0 bus 0 scbus0 target 0 lun 0
> ada0: ATA-8 SATA 2.x device
> ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
> ada0: Command Queueing enabled
>
> $ sudo camcontrol reset scbus0
> Password:
> Reset of bus 0 was successful
>
> $ sudo camcontrol rescan scbus0
> Re-scan of bus 0 was successful
>
> >If the disk comes back, please smartctl -a it.
>
> It didn't come back:
>
> $ ls /dev/ada*
> /dev/ada1    /dev/ada2p1  /dev/ada4    /dev/ada5p1  /dev/ada7
> /dev/ada1p1  /dev/ada3    /dev/ada4p1  /dev/ada6    /dev/ada8
> /dev/ada2    /dev/ada3p1  /dev/ada5    /dev/ada6p1
>
> FYI, there's nothing new in /var/log/messages as a result of those
> commands.

Then I would recommend power-cycling (not rebooting or pressing of the
reset button) the machine. There's a good chance the ada0 disk has
fallen off the bus and needs a full power-cycle, since a LUN scan didn't
result in its reappearance.

I see this kind of problem on a weekly basis at my workplace, in 3
different datacenters, with Fujitsu SCSI-3 disks. A system reboot
doesn't make the disk reappear on the bus, nor does a reset (pressing of
the reset button). Only a full power-cycle works. And when I say
weekly, I'm not exaggerating.

I realise your disks are Hitachi not Fujitsu, and are SATA not SCSI, but
it really doesn't matter -- there are cases where the drive firmware is
wedged so hard that a physical power-cycle is required.

If a power-cycle works, smartctl -a /dev/ada0 the disk and save the
SMART stats somewhere. If the same disk fails in this way again, I
strongly recommend advance RMA'ing it (to ensure you get a completely
different disk).

Good luck!
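
For anyone repeating the reset/rescan step above, the scbus number can be
read out of the boot-time dmesg text rather than guessed. What follows is
only a rough sketch under stated assumptions: the function names
scbus_for_disk and recovery_commands are made up for illustration, the
parsing assumes attach lines shaped like the "ada0 at siisch0 bus 0 scbus0
target 0 lun 0" line quoted above, and the camcontrol commands are printed
for review instead of being run.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch).

# Print the scbus name a disk was attached to, reading dmesg text on stdin.
# Assumes an attach line of the form quoted in this thread:
#   ada0 at siisch0 bus 0 scbus0 target 0 lun 0
scbus_for_disk() {
    awk -v disk="$1" '$1 == disk && $2 == "at" {
        # Scan the remaining fields for the first "scbusN" token.
        for (i = 3; i <= NF; i++)
            if ($i ~ /^scbus[0-9]+$/) { print $i; exit }
    }'
}

# Print (do not execute) the reset/rescan pair for the bus of the given disk.
# Returns non-zero if no attach line for that disk is found on stdin.
recovery_commands() {
    bus=$(scbus_for_disk "$1")
    if [ -z "$bus" ]; then
        echo "no scbus found for $1" >&2
        return 1
    fi
    echo "camcontrol reset $bus"
    echo "camcontrol rescan $bus"
}
```

Run as "recovery_commands ada0 < /var/run/dmesg.boot"; with the dmesg line
quoted above this should print the reset and rescan commands for scbus0.
Printing rather than executing is deliberate, so the bus can be
double-checked before anything is reset.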
-- 
| Jeremy Chadwick                                   jdc@parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |