From owner-freebsd-hackers@FreeBSD.ORG Sun Jan 20 00:11:04 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 0046B25D for ; Sun, 20 Jan 2013 00:11:03 +0000 (UTC) (envelope-from scott4long@yahoo.com) Received: from nm15-vm4.bullet.mail.ne1.yahoo.com (nm15-vm4.bullet.mail.ne1.yahoo.com [98.138.91.175]) by mx1.freebsd.org (Postfix) with ESMTP id BD7856EC for ; Sun, 20 Jan 2013 00:11:03 +0000 (UTC) Received: from [98.138.90.48] by nm15.bullet.mail.ne1.yahoo.com with NNFMP; 20 Jan 2013 00:10:57 -0000 Received: from [98.138.84.45] by tm1.bullet.mail.ne1.yahoo.com with NNFMP; 20 Jan 2013 00:10:57 -0000 Received: from [127.0.0.1] by smtp113.mail.ne1.yahoo.com with NNFMP; 20 Jan 2013 00:10:57 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1358640657; bh=5wZ1oqPghfnXLKaF++uLOenHoGZSRLmZWjPlIXY+/YI=; h=X-Yahoo-Newman-Id:X-Yahoo-Newman-Property:X-YMail-OSG:X-Yahoo-SMTP:Received:Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc:Content-Transfer-Encoding:Message-Id:References:To:X-Mailer; b=Dwv6ky5Hsq/5wb1zuvlPzt2/DaXS4RS9z9TdTXWUuXVFuBi+TCoQ6qKNm2/HYKcjVlWM4xwFGLqpJ+jKFu712yZ/sYqc/8bQs1NBAoZ//TfZ1dbMAEJ5/m9Z36v33Zz6ewXjAC2Z0kgwzH7MzfpYNVK5vTA2KtAUcD09Uox0AAU= X-Yahoo-Newman-Id: 565520.14309.bm@smtp113.mail.ne1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: mvZTaN0VM1kI6fZMmHgSMCKft_bu7TKse8ELx.f1YdFriRR d_plXLAdEqch9aQmIdvGprV_3GVJ1nkAOff.3H2l6vvccHyx7DDmszMclGpo GkCzSf93O8LXcTcfUBMKI._GcsrJufBwyoNWJqMT9c2ZYLRzUT8SoorfoGTX T81TLWMLlKqSau9lD6.Xm1s6b6.IUH_TSaLZTNZMN8RGjGHxMEbrdejrFlF_ UoLlgk5F3wLiwy7P5Vkr6GV2N9Edn1Z8VIW7yLHbvab.vnWpQeYlSF8SsuGP mwzBjXlTHoJTADpsAS1fncMdbRpjN59b80NPTI_10BPyPxqqAdCSDyS1618o F1Ket1HQjhD02ziTujNvh.KgsNepZ127dkSDmmH.KYKpGBZxjtqkqv08rwkc zvlil8VDvDiDb8Qo0.5T6OLwbELhh1vDwfitkK7a7NanAkj80NUoy2Rr3Q5A L8G13y.6.qEq5 X-Yahoo-SMTP: clhABp.swBB7fs.LwIJpv3jkWgo2NU8- Received: from 
[192.168.1.108] (scott4long@216.237.246.3 with plain) by smtp113.mail.ne1.yahoo.com with SMTP; 19 Jan 2013 16:10:57 -0800 PST Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: IBM blade server abysmal disk write performances From: Scott Long In-Reply-To: Date: Sat, 19 Jan 2013 17:10:57 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: References: <6C0B86E6-195C-4D35-AE40-3D2F9F6D28FB@yahoo.com> <1358544287.32417.251.camel@revolution.hippie.lan> <50F9CFEB.5060302@feral.com> <50F9DB9A.9050303@gmail.com> <50FABB71.6050406@freebsd.org> To: Wojciech Puchar X-Mailer: Apple Mail (2.1499) X-Mailman-Approved-At: Sun, 20 Jan 2013 00:36:31 +0000 Cc: Karim Fodil-Lemelin , "freebsd-hackers@freebsd.org Hackers" , "gibbs@FreeBSD.org Gibbs" , "mjacob@FreeBSD.org Jacob" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 20 Jan 2013 00:11:04 -0000

On Jan 19, 2013, at 4:33 PM, Wojciech Puchar wrote:

>> to be enabled to get any speed-up from tagged commands. This was no
>> risk with SCSI drives, since the cache did not make the drives lie
>
> i see no correlation between interface type and possibility of lying about command completion.
>

Any interface that enables write cache will lie about write completions. This is true for SAS, SATA, SCSI, and PATA (and probably FC and iSCSI). That's the whole point of the write cache =-)

Where things got interesting was in the days of SCSI vs PATA. There was no tagged queuing for PATA, except for a hack that allowed CDROMs to disconnect from the shared bus. So you only got 1 command at a time, and you paid a serialized latency penalty. The only way to get reasonable write performance on PATA was to enable the write cache.
Meanwhile, SCSI had TCQ and could amortize the latency penalty to the point where performance with TCQ and no WC was almost as good as with WC. This made SCSI the clear choice for performance + data safety.

With SATA vs SAS, the gap is much narrower. The TCQ command set (still used by SAS) is still better than the NCQ command set, but the differences are minor enough that it doesn't matter for most applications.

Scott

From owner-freebsd-hackers@FreeBSD.ORG Sun Jan 20 01:43:18 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id EA5AA3F4; Sun, 20 Jan 2013 01:43:18 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (gw.catspoiler.org [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id AFDE3987; Sun, 20 Jan 2013 01:43:18 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id r0K1OAld019768; Sat, 19 Jan 2013 17:24:15 -0800 (PST) (envelope-from truckman@FreeBSD.org) Message-Id: <201301200124.r0K1OAld019768@gw.catspoiler.org> Date: Sat, 19 Jan 2013 17:24:10 -0800 (PST) From: Don Lewis Subject: Re: IBM blade server abysmal disk write performances To: wojtek@wojtek.tensor.gdynia.pl In-Reply-To: MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: scott4long@yahoo.com, freebsd-hackers@FreeBSD.org, dieterbsd@gmail.com, scottl@FreeBSD.org, gibbs@FreeBSD.org, mjacob@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 20 Jan 2013 01:43:19 -0000

On 18 Jan, Wojciech Puchar wrote:
> If computer have UPS then write caching is fine. even if FreeBSD crash,
> disk would write data

I've had my share of sudden UPS failures over the years.
Probably more than half have been during an automatic battery self test. UPS goes on battery, and then *boom*, everything shuts down. At that point the UPS helpfully indicates that the battery needs to be replaced. This seems to happen more frequently once the batteries get to be about 4 years old. I've started replacing them after 3 years. My next big build will have redundant PSUs, each connected to a separate UPS. From owner-freebsd-hackers@FreeBSD.ORG Sun Jan 20 01:46:26 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B0D404F7; Sun, 20 Jan 2013 01:46:26 +0000 (UTC) (envelope-from truckman@FreeBSD.org) Received: from gw.catspoiler.org (gw.catspoiler.org [75.1.14.242]) by mx1.freebsd.org (Postfix) with ESMTP id 957329A8; Sun, 20 Jan 2013 01:46:26 +0000 (UTC) Received: from FreeBSD.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.13.3/8.13.3) with ESMTP id r0K1kFxY019796; Sat, 19 Jan 2013 17:46:19 -0800 (PST) (envelope-from truckman@FreeBSD.org) Message-Id: <201301200146.r0K1kFxY019796@gw.catspoiler.org> Date: Sat, 19 Jan 2013 17:46:14 -0800 (PST) From: Don Lewis Subject: Re: IBM blade server abysmal disk write performances To: se@FreeBSD.org In-Reply-To: <50FABB71.6050406@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Cc: fodillemlinkarim@gmail.com, freebsd-hackers@FreeBSD.org, gibbs@FreeBSD.org, scottl@FreeBSD.org, mjacob@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 20 Jan 2013 01:46:26 -0000 On 19 Jan, Stefan Esser wrote: > I seem to remember, that drives of that time required the write cache > to be enabled to get any speed-up from tagged commands. 
This was no
> risk with SCSI drives, since the cache did not make the drives lie
> about command completion (i.e. the status for the write was only
> returned when the cached data had been written to disk, independently
> of the write cache enable).

For a very long time, all of the SCSI drives that I have purchased have come with the WCE bit turned on. I always had to remember to use camcontrol to turn it off. When I last benchmarked it quite a few years ago, buildworld times were about the same with either setting, and my filesystems were a lot safer with WCE off, which UFS+SU depends on. I've also seen drives dynamically drop the number of supported tags when WCE was on and the write cache started getting full, which made CAM unhappy.

I've been using SCSI for anything important for all these years except on my laptop. I haven't yet switched to SATA because I haven't put together a new system since NCQ support made it into -STABLE. The hard drives in my -CURRENT machine are cast-offs from my primary machine. Just doin' my part to make sure legacy support isn't broken ...
From owner-freebsd-hackers@FreeBSD.ORG Sun Jan 20 14:07:35 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C163164C; Sun, 20 Jan 2013 14:07:35 +0000 (UTC) (envelope-from wjw@digiware.nl) Received: from mail.digiware.nl (unknown [IPv6:2001:4cb8:90:ffff::3]) by mx1.freebsd.org (Postfix) with ESMTP id 85A2E3EE; Sun, 20 Jan 2013 14:07:35 +0000 (UTC) Received: from rack1.digiware.nl (localhost.digiware.nl [127.0.0.1]) by mail.digiware.nl (Postfix) with ESMTP id F13AE153434; Sun, 20 Jan 2013 15:07:33 +0100 (CET) X-Virus-Scanned: amavisd-new at digiware.nl Received: from mail.digiware.nl ([127.0.0.1]) by rack1.digiware.nl (rack1.digiware.nl [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Ur3f4ylVkqfn; Sun, 20 Jan 2013 15:07:33 +0100 (CET) Received: from [192.168.10.10] (vaio [192.168.10.10]) (using TLSv1 with cipher DHE-RSA-CAMELLIA256-SHA (256/256 bits)) (No client certificate requested) by mail.digiware.nl (Postfix) with ESMTPSA id 436D8153433; Sun, 20 Jan 2013 15:07:32 +0100 (CET) Message-ID: <50FBFA2C.2010504@digiware.nl> Date: Sun, 20 Jan 2013 15:07:40 +0100 From: Willem Jan Withagen User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: Ian Lepore Subject: Re: Failsafe on kernel panic References: <201301161513.27016.jhb@freebsd.org> <1358392725.32417.179.camel@revolution.hippie.lan> In-Reply-To: <1358392725.32417.179.camel@revolution.hippie.lan> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Mailman-Approved-At: Sun, 20 Jan 2013 15:27:46 +0000 Cc: freebsd-hackers@FreeBSD.org, Sami Halabi , "freebsd-stable@freebsd.org" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: 
Sun, 20 Jan 2013 14:07:35 -0000

On 17-1-2013 4:18, Ian Lepore wrote:
> On Wed, 2013-01-16 at 23:27 +0200, Sami Halabi wrote:
>> Thank you for your response, very helpful.
>> one question - how do i configure auto-reboot once kernel panic occurs?
>>
>> Sami
>>
>
> From src/sys/conf/NOTES, this may be what you're looking for...
>
> #
> # Don't enter the debugger for a panic. Intended for unattended operation
> # where you may want to enter the debugger from the console, but still want
> # the machine to recover from a panic.
> #
> options KDB_UNATTENDED
>
> But I think it only has meaning if you have option KDB in effect,
> otherwise it should just reboot itself after a 15 second pause.

Well, it is not the magical fix-all solution.
Last night I had to drive to the colo (lucky for me a 5 min drive) because I could not get a system to reboot/recover from a crash. Upon arrival the system was crashed and halted on the message: rebooting in 15 sec.

By then those 15 secs had already been running for about 10-20 minutes. Physically rebooting or resetting ended up in the same position: rebooting in 15 sec. Without ever getting to actually rebooting.

So if I (you) have servers 2 hours away, I usually try to work on upgrading/rebooting during business hours. And remote hands can get me out of trouble....

IPMI is another nice way of getting at the server in these cases. But that requires a lot more infra and tinkering.
--WjW From owner-freebsd-hackers@FreeBSD.ORG Sun Jan 20 22:26:57 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 67E80C1F; Sun, 20 Jan 2013 22:26:57 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-la0-f53.google.com (mail-la0-f53.google.com [209.85.215.53]) by mx1.freebsd.org (Postfix) with ESMTP id C141A8A7; Sun, 20 Jan 2013 22:26:56 +0000 (UTC) Received: by mail-la0-f53.google.com with SMTP id fn20so5431393lab.40 for ; Sun, 20 Jan 2013 14:26:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:date:message-id:subject:from:to :content-type; bh=JuNTwzghQnLFi7b4kiekbY/7yuz5RXU+skBR1484Yts=; b=jG8/p/Q+tBT0MGMnibYW3KSGnqU4BC0pjxAKeBYrRpvxu6Y7c6dPSzZ4oOon5t9X3p QPrfrHOL5JYV9bfr+oE5lDunivuHHfAy71SvZFFaW6seynZuLX6VlDMa91hhixjcNvJ4 abCmG0u6RvklvMgRrWClvSBswaXki54+gtM1xT23ScDaBiBoB1SasbXXwTlXb5Ft6Kwm jZF2npvpAu+Q5rH9XbPfnWmDVQtf+26o1Yb7GWpbhRZJrkInODdyV4Oc1OnBVUsMDckg UrIbyCMxPeZViXEa+t2jyyUS6WxuWTJkTLHIUxwSMLDDzDb/bpiKjuriFYyAGV9PNkSL Zytw== MIME-Version: 1.0 X-Received: by 10.112.28.9 with SMTP id x9mr6710216lbg.27.1358720810293; Sun, 20 Jan 2013 14:26:50 -0800 (PST) Received: by 10.112.6.38 with HTTP; Sun, 20 Jan 2013 14:26:50 -0800 (PST) Date: Sun, 20 Jan 2013 17:26:50 -0500 Message-ID: Subject: ZFS regimen: scrub, scrub, scrub and scrub again. 
From: Zaphod Beeblebrox To: freebsd-fs , FreeBSD Hackers Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 20 Jan 2013 22:26:57 -0000

Please don't misinterpret this post: ZFS's ability to recover from fairly catastrophic failures is pretty stellar, but I'm wondering if there can be a little room for improvement.

I use RAID pretty much everywhere. I don't like to lose data and disks are cheap. I have a fair amount of experience with all flavors ... and ZFS has become a go-to filesystem for most of my applications.

One of the best recommendations I can give for ZFS is its crash-recoverability. As a counter example, if you have most hardware RAID going or a software whole-disk raid, after a crash it will generally declare one disk as good and the other disk as "to be repaired" ... after which a full surface scan of the affected disks --- reading one and writing the other --- ensues. On my Windows desktop, the pair of 2T's take 3 or 4 hours to do this. A pair of green 2T's can take over 6. You don't lose any data, but you have severely reduced performance until it's repaired.

The rub is that you know only one or two blocks could possibly even be different ... and that this is a highly unoptimized way of going about the problem.

ZFS is smart on this point: it will recover on reboot with a minimum amount of fuss. Even if you dislodge a drive ... so that it's missing the last 'n' transactions, ZFS seems to figure this out (which I thought was extra kudos).

MY PROBLEM comes from problems that scrub can fix.

Let's talk, in specific, about my home array. It has 9x 1.5T and 8x 2T in a RAID-Z configuration (2 sets, obviously).
The drives themselves are housed (4 each) in external drive bays with a single SATA connection for each. I think I have spoken of this here before.

A full scrub of my drives weighs in at 36 hours or so.

Now around Christmas, while moving some things, I managed to pull the plug on one cabinet of 4 drives. It was likely that the only active use of the filesystem was an automated cvs checkin (backup) given that the errors only appeared on the cvs directory.

IN-THE-END, no data was lost, but I had to scrub 4 times to remove the complaints, which showed like this from "zpool status -v"

    errors: Permanent errors have been detected in the following files:

        vr2/cvs:<0x1c1>

Now ... this is just an example: after each scrub, the hex number was different. I also couldn't actually find the error on the cvs filesystem, as a side note. Not many files are stored there, and they all seemed to be present.

MY TAKEAWAY from this is that 2 major improvements could be made to ZFS:

1) a pause for scrub... such that long scrubs could be paused during working hours.

2) going back over errors... during each scrub, the "new" error was found before the old error was cleared. Then this new error gets similarly cleared by the next scrub. It seems that if the scrub returned to this new found error after fixing the "known" errors, this could save whole new scrub runs from being required.
From owner-freebsd-hackers@FreeBSD.ORG Sun Jan 20 23:02:05 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 8B6124E6 for ; Sun, 20 Jan 2013 23:02:05 +0000 (UTC) (envelope-from yuri@rawbw.com) Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45]) by mx1.freebsd.org (Postfix) with ESMTP id 706D2A13 for ; Sun, 20 Jan 2013 23:02:05 +0000 (UTC) Received: from eagle.yuri.org (stunnel@localhost [127.0.0.1]) (authenticated bits=0) by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id r0KN1xWP034885 for ; Sun, 20 Jan 2013 15:01:59 -0800 (PST) (envelope-from yuri@rawbw.com) Message-ID: <50FC7767.4050207@rawbw.com> Date: Sun, 20 Jan 2013 15:01:59 -0800 From: Yuri User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130112 Thunderbird/17.0.2 MIME-Version: 1.0 To: hackers@freebsd.org Subject: How to validate the variable size memory block in ioctl handler? Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 20 Jan 2013 23:02:05 -0000

I am implementing an ioctl that reads/writes a variable-size structure. The allocated size is supplied by the caller in the structure itself.

struct my_struct {
    int len; // allocated size
    other_struct s[1];
};

The ioctl request id is defined as _IOWR('X', , my_struct)

How can I validate from the ioctl function handler (for some device) that the whole (variable-size) block of bytes is RW accessible in the process memory space? Should I call copyout/copyin for this, or is there some shorter way? EFAULT should be returned in case of validation failure.

As I understand, macros like _IOR, _IOWR do validation based on the size of structure supplied to them.
So that the handler procedures don't have to do that. I was expecting to find among them some macro that would work for such variable size structure, but it isn't there. (Not sure if this is possible language-wise). Yuri From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 00:59:46 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 42B1D1E9 for ; Mon, 21 Jan 2013 00:59:46 +0000 (UTC) (envelope-from mdf356@gmail.com) Received: from mail-qc0-f173.google.com (mail-qc0-f173.google.com [209.85.216.173]) by mx1.freebsd.org (Postfix) with ESMTP id D8658E80 for ; Mon, 21 Jan 2013 00:59:45 +0000 (UTC) Received: by mail-qc0-f173.google.com with SMTP id b12so3523979qca.32 for ; Sun, 20 Jan 2013 16:59:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=k9zI6O/9NZZVxXr5r2kiEDFUWs93gMu8Eesx55GYS3I=; b=pPN+A+XqO/U8asbbdURiyiVa9u98xi0aeEqdcQV4sVKaRPrNoPH4eTpu2q1IJLCU4v 7uqz1oA6ThWv14JfrAUfQLSVCb53HUIBN4mP6KF/kFr+t26TgNrpaAYjt84cyFUyg4Hg kKhqaSJq/pb3XZJaTeJHU/7YLCIAdSCZ3+CheTyFloLTLNjvqK+SD6V/l/n1JFxbLaqk Km42SBn85gA4u+e52K6xUdi6tTRiZEC33DSR/u1frtSKrGc9IHrwzKpN0k5jplsffxrL FJUDj4TtHi3ixtkovjLmr/npS4QKoYPDyIwI23F3LdXIGoDAO1ZuxytvYmMN67TkU+tg SMyg== MIME-Version: 1.0 X-Received: by 10.229.77.13 with SMTP id e13mr4173030qck.69.1358729979427; Sun, 20 Jan 2013 16:59:39 -0800 (PST) Sender: mdf356@gmail.com Received: by 10.229.156.18 with HTTP; Sun, 20 Jan 2013 16:59:39 -0800 (PST) In-Reply-To: <50FC7767.4050207@rawbw.com> References: <50FC7767.4050207@rawbw.com> Date: Sun, 20 Jan 2013 16:59:39 -0800 X-Google-Sender-Auth: KjrsqUxlShXlXyzk1QeBZo0sC9c Message-ID: Subject: Re: How to validate the variable size memory block in ioctl handler? 
From: mdf@FreeBSD.org To: Yuri Content-Type: text/plain; charset=ISO-8859-1 Cc: hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 00:59:46 -0000

On Sun, Jan 20, 2013 at 3:01 PM, Yuri wrote:
> I am implementing an ioctl that reads/writes variable size structure.
> Allocated size is supplied by the caller in the structure itself.
> struct my_struct {
> int len; // allocated size
> other_struct s[1];
> };
> ioctl request id is defined as _IOWR('X', , my_struct)
>
> How to validate from the ioctl function handler (for some device) that the
> whole (variable size) block of bytes is RW accessible in the process memory
> space?
> Should I call copyout/copyin for this, or there is some shorter way?
> EFAULT should be returned in case of validation failure.
>
> As I understand, macros like _IOR, _IOWR do validation based on the size of
> structure supplied to them. So that the handler procedures don't have to do
> that.
> I was expecting to find among them some macro that would work for such
> variable size structure, but it isn't there. (Not sure if this is possible
> language-wise).

You'll need to pass in more than the above, probably, as the kernel's ioctl() function has copied in the specified number of bytes already. I.e. the value passed to your ioctl handler is already in the kernel space, and unless it's 4 bytes, was malloc(9)'d and copyin'd (if it's an IN parameter). The size used is the size passed to the _IOC() macro.

To do what you want it sounds like you want your handler to take something like:

struct var_ioctl {
    int len;
    void *data;
};

Then the handler itself would have to use copyin/copyout to access the data. There's no simpler way.
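For illustration, the shape of such a handler can be sketched in plain userland C. This is only a simulation under stated assumptions, not kernel code: `fake_copyin` is a hypothetical stand-in for copyin(9) that treats one static buffer as the caller's whole address space, and `MY_MAX_LEN` and `handle_var_ioctl` are invented names. It shows the order of operations: bound the caller-supplied length first, then let the copy itself report EFAULT for an inaccessible range.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical "user address space": only this region counts as accessible. */
static unsigned char user_mem[256];

/* Fixed-size argument that the generic ioctl layer has already copied in. */
struct var_ioctl {
    int   len;   /* caller-supplied payload size in bytes */
    void *data;  /* caller-supplied pointer to the payload */
};

#define MY_MAX_LEN 128 /* assumed driver-specific upper bound */

/* Stand-in for copyin(9): EFAULT unless [uaddr, uaddr + len) lies in user_mem. */
static int fake_copyin(const void *uaddr, void *kaddr, size_t len)
{
    uintptr_t start = (uintptr_t)uaddr;
    uintptr_t base = (uintptr_t)user_mem;

    if (start < base || start > base + sizeof(user_mem) ||
        len > base + sizeof(user_mem) - start)
        return EFAULT;
    memcpy(kaddr, uaddr, len);
    return 0;
}

/* Handler sketch: validate the length, allocate, then copy the payload in. */
static int handle_var_ioctl(const struct var_ioctl *arg, unsigned char **out)
{
    unsigned char *buf;
    int error;

    assert(arg != NULL && out != NULL);
    if (arg->len <= 0 || arg->len > MY_MAX_LEN)
        return EINVAL;                  /* refuse unbounded allocations */
    buf = malloc((size_t)arg->len);
    if (buf == NULL)
        return ENOMEM;
    error = fake_copyin(arg->data, buf, (size_t)arg->len);
    if (error != 0) {
        free(buf);
        return error;                   /* EFAULT for an inaccessible range */
    }
    *out = buf;                         /* caller frees the copy */
    return 0;
}
```

In a real driver the copy would be done with copyin(9)/copyout(9) and a kernel allocator; the explicit upper bound is what keeps a caller from forcing an arbitrarily large allocation.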
Cheers, matthew From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 02:50:22 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1DB02AE9; Mon, 21 Jan 2013 02:50:22 +0000 (UTC) (envelope-from yuri@rawbw.com) Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45]) by mx1.freebsd.org (Postfix) with ESMTP id F2FED29F; Mon, 21 Jan 2013 02:50:21 +0000 (UTC) Received: from eagle.yuri.org (stunnel@localhost [127.0.0.1]) (authenticated bits=0) by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id r0L2oKJm068072; Sun, 20 Jan 2013 18:50:21 -0800 (PST) (envelope-from yuri@rawbw.com) Message-ID: <50FCACEC.8000100@rawbw.com> Date: Sun, 20 Jan 2013 18:50:20 -0800 From: Yuri User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130112 Thunderbird/17.0.2 MIME-Version: 1.0 To: mdf@freebsd.org Subject: Re: How to validate the variable size memory block in ioctl handler? References: <50FC7767.4050207@rawbw.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 02:50:22 -0000 On 01/20/2013 16:59, mdf@freebsd.org wrote: > To do what you want it sounds like you want your handler to take something like: > > struct var_ioctl { > int len; > void *data; > }; > > Then then handler itself would have to use copyin/copyout to access > the data. There's no simpler way. I think I found the simpler way, see the draft patch below. Generic macro _IOWRE will handle the case when the first integer in ioctl parameter holds the actual size of the structure. 
This way of passing the variable array sizes is quite common in various APIs. Other potential uses would also benefit from this.

Yuri

Index: sys/kern/sys_generic.c
===================================================================
--- sys/kern/sys_generic.c (revision 245654)
+++ sys/kern/sys_generic.c (working copy)
@@ -640,6 +640,7 @@
     int arg, error;
     u_int size;
     caddr_t data;
+    int vsize;

     if (uap->com > 0xffffffff) {
         printf(
@@ -654,6 +655,14 @@
      * copied to/from the user's address space.
      */
     size = IOCPARM_LEN(com);
+    if (size == IOC_VARIABLE) {
+        /* first integer has the length of the memory */
+        error = copyin(uap->data, (caddr_t)&vsize, sizeof(vsize));
+        if (error)
+            return (error);
+        size = (u_int)vsize;
+    }
     if ((size > IOCPARM_MAX) ||
         ((com & (IOC_VOID | IOC_IN | IOC_OUT)) == 0) ||
 #if defined(COMPAT_FREEBSD5) || defined(COMPAT_FREEBSD4) || defined(COMPAT_43)
Index: sys/sys/ioccom.h
===================================================================
--- sys/sys/ioccom.h (revision 245654)
+++ sys/sys/ioccom.h (working copy)
@@ -50,6 +50,7 @@
 #define IOC_IN 0x80000000 /* copy in parameters */
 #define IOC_INOUT (IOC_IN|IOC_OUT)
 #define IOC_DIRMASK (IOC_VOID|IOC_OUT|IOC_IN)
+#define IOC_VARIABLE IOCPARM_MASK /* parameters size in parameters */

 #define _IOC(inout,group,num,len) ((unsigned long) \
     ((inout) | (((len) & IOCPARM_MASK) << 16) | ((group) << 8) | (num)))
@@ -59,6 +60,9 @@
 #define _IOW(g,n,t) _IOC(IOC_IN, (g), (n), sizeof(t))
 /* this should be _IORW, but stdio got there first */
 #define _IOWR(g,n,t) _IOC(IOC_INOUT, (g), (n), sizeof(t))
+#define _IORE(g,n) _IOC(IOC_OUT, (g), (n), IOC_VARIABLE)
+#define _IOWE(g,n) _IOC(IOC_IN, (g), (n), IOC_VARIABLE)
+#define _IOWRE(g,n) _IOC(IOC_INOUT, (g), (n), IOC_VARIABLE)

 #ifdef _KERNEL

From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 03:15:44 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 40CF68ED for ; Mon,
21 Jan 2013 03:15:44 +0000 (UTC) (envelope-from mdf356@gmail.com) Received: from mail-qa0-f42.google.com (mail-qa0-f42.google.com [209.85.216.42]) by mx1.freebsd.org (Postfix) with ESMTP id 07A9A636 for ; Mon, 21 Jan 2013 03:15:43 +0000 (UTC) Received: by mail-qa0-f42.google.com with SMTP id hg5so6447676qab.8 for ; Sun, 20 Jan 2013 19:15:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=xrThzg3kKJNNeOb4/iV9ZnloALJ/aFzOgvK1GqsnLtI=; b=qD4Y4NO0HtGTjSxUEYJzmQlzP76RN2pvRlYG1RuBlGZtt/wN3dtKl5yIz5fdzs/TTD aw2oPR4ysos3xfMVv3hGNnd3FlCglQnE/GZklk02KU9psZa54FY9rWmhuhidmhFFubtq LBg+MWVGRBW909s14BlAsqu499fC2qvdDK/LfaYPR+EZWa1LIehJn0tIyxHT/gcckHxn 3ymUtLLvzlvjZBmee31RsyGENGrO1HJf1xP5a48jlRnwSvp2O9iJ7Wqolm3vjppc0Crx Bue0cXKpHa1FLesEqYTxXpSGNzpVDXi4Dj9irILb/3+/5D9bJUiXyuqad5b31c2DOExE LqBg== MIME-Version: 1.0 X-Received: by 10.229.203.28 with SMTP id fg28mr412857qcb.103.1358738137438; Sun, 20 Jan 2013 19:15:37 -0800 (PST) Sender: mdf356@gmail.com Received: by 10.229.156.18 with HTTP; Sun, 20 Jan 2013 19:15:37 -0800 (PST) In-Reply-To: <50FCACEC.8000100@rawbw.com> References: <50FC7767.4050207@rawbw.com> <50FCACEC.8000100@rawbw.com> Date: Sun, 20 Jan 2013 19:15:37 -0800 X-Google-Sender-Auth: W8LwcusvKTK56ivMHGvol2kWUa4 Message-ID: Subject: Re: How to validate the variable size memory block in ioctl handler? 
From: mdf@FreeBSD.org To: Yuri Content-Type: text/plain; charset=ISO-8859-1 Cc: hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 03:15:44 -0000 On Sun, Jan 20, 2013 at 6:50 PM, Yuri wrote: > On 01/20/2013 16:59, mdf@freebsd.org wrote: > > To do what you want it sounds like you want your handler to take something > like: > > struct var_ioctl { > int len; > void *data; > }; > > Then then handler itself would have to use copyin/copyout to access > the data. There's no simpler way. > > > I think I found the simpler way, see the draft patch below. > Generic macro _IOWRE will handle the case when the first integer in ioctl > parameter holds the actual size of the structure. > This way of passing the variable array sizes is quite common in various > APIs. > Other potential uses would also benefit from this. > > Yuri > > > Index: sys/kern/sys_generic.c > =================================================================== > --- sys/kern/sys_generic.c (revision 245654) > +++ sys/kern/sys_generic.c (working copy) > @@ -640,6 +640,7 @@ > int arg, error; > u_int size; > caddr_t data; > + int vsize; > > if (uap->com > 0xffffffff) { > printf( > @@ -654,6 +655,14 @@ > * copied to/from the user's address space. 
> */ > size = IOCPARM_LEN(com); > + if (size == IOC_VARIABLE) { > + /* first integer has the length of the memory */ > + error = copyin(uap->data, (caddr_t)&vsize, sizeof(vsize)); > + if (error) > + return (error); > + size = (u_int)vsize; > + } > if ((size > IOCPARM_MAX) || > ((com & (IOC_VOID | IOC_IN | IOC_OUT)) == 0) || > #if defined(COMPAT_FREEBSD5) || defined(COMPAT_FREEBSD4) || > defined(COMPAT_43) > Index: sys/sys/ioccom.h > =================================================================== > --- sys/sys/ioccom.h (revision 245654) > +++ sys/sys/ioccom.h (working copy) > @@ -50,6 +50,7 @@ > #define IOC_IN 0x80000000 /* copy in parameters */ > #define IOC_INOUT (IOC_IN|IOC_OUT) > #define IOC_DIRMASK (IOC_VOID|IOC_OUT|IOC_IN) > +#define IOC_VARIABLE IOCPARM_MASK /* parameters size in parameters */ > > #define _IOC(inout,group,num,len) ((unsigned long) \ > ((inout) | (((len) & IOCPARM_MASK) << 16) | ((group) << 8) | (num))) > @@ -59,6 +60,9 @@ > #define _IOW(g,n,t) _IOC(IOC_IN, (g), (n), sizeof(t)) > /* this should be _IORW, but stdio got there first */ > #define _IOWR(g,n,t) _IOC(IOC_INOUT, (g), (n), sizeof(t)) > +#define _IORE(g,n) _IOC(IOC_OUT, (g), (n), IOC_VARIABLE) > +#define _IOWE(g,n) _IOC(IOC_IN, (g), (n), IOC_VARIABLE) > +#define _IOWRE(g,n) _IOC(IOC_INOUT, (g), (n), IOC_VARIABLE) > > #ifdef _KERNEL This would be fine for a local patch but it breaks existing (valid) uses that have exactly 8191 bytes of data, so it wouldn't be suitable for the main FreeBSD repository. Also, in general one wants to have limits on syscalls that can force a kernel malloc of any size, as it leads to denial of service attacks or crashes by requesting the kernel over-allocate memory. 
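The 8191-byte collision can be reproduced in a few lines of userland C. This is a sketch under assumptions: the macro bodies follow the quoted patch, while `IOCPARM_SHIFT` (13), `IOCPARM_LEN` and the `IOC_OUT` value are taken from memory of FreeBSD's sys/sys/ioccom.h of that era and are consistent with the 8191 figure above; `struct big_fixed` and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed to match sys/sys/ioccom.h: a 13-bit length field. */
#define IOCPARM_SHIFT   13
#define IOCPARM_MASK    ((1 << IOCPARM_SHIFT) - 1)      /* 8191 */
#define IOCPARM_LEN(x)  (((x) >> 16) & IOCPARM_MASK)
#define IOC_OUT         0x40000000
#define IOC_IN          0x80000000
#define IOC_INOUT       (IOC_IN | IOC_OUT)
#define _IOC(inout, group, num, len) ((unsigned long) \
    ((inout) | (((len) & IOCPARM_MASK) << 16) | ((group) << 8) | (num)))
#define _IOWR(g, n, t)  _IOC(IOC_INOUT, (g), (n), sizeof(t))

/* From the draft patch: an all-ones length means "size is in the data". */
#define IOC_VARIABLE    IOCPARM_MASK
#define _IOWRE(g, n)    _IOC(IOC_INOUT, (g), (n), IOC_VARIABLE)

/* Hypothetical existing ioctl argument of exactly 8191 bytes. */
struct big_fixed { char bytes[8191]; };

/* Returns 1 when a fixed 8191-byte command equals the variable-size one. */
static int length_field_is_ambiguous(void)
{
    unsigned long fixed_cmd = _IOWR('X', 1, struct big_fixed);
    unsigned long var_cmd = _IOWRE('X', 1);

    return fixed_cmd == var_cmd &&
        IOCPARM_LEN(fixed_cmd) == (unsigned long)IOC_VARIABLE;
}

/* Sanity check of the sketch itself. */
static void self_check(void)
{
    assert(IOCPARM_MASK == 8191);
    assert(length_field_is_ambiguous());
}
```

Because both commands encode to the same number, the generic ioctl code could not tell a legitimate fixed-size 8191-byte argument from a request to read the size out of the first integer.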
Cheers, matthew From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 03:55:11 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 5A09C3E0; Mon, 21 Jan 2013 03:55:11 +0000 (UTC) (envelope-from yuri@rawbw.com) Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45]) by mx1.freebsd.org (Postfix) with ESMTP id 438747CB; Mon, 21 Jan 2013 03:55:11 +0000 (UTC) Received: from eagle.yuri.org (stunnel@localhost [127.0.0.1]) (authenticated bits=0) by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id r0L3tAnp075930; Sun, 20 Jan 2013 19:55:10 -0800 (PST) (envelope-from yuri@rawbw.com) Message-ID: <50FCBC1D.4070905@rawbw.com> Date: Sun, 20 Jan 2013 19:55:09 -0800 From: Yuri User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130112 Thunderbird/17.0.2 MIME-Version: 1.0 To: mdf@freebsd.org Subject: Re: How to validate the variable size memory block in ioctl handler? References: <50FC7767.4050207@rawbw.com> <50FCACEC.8000100@rawbw.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 03:55:11 -0000 On 01/20/2013 19:15, mdf@freebsd.org wrote: > This would be fine for a local patch but it breaks existing (valid) > uses that have exactly 8191 bytes of data, so it wouldn't be suitable > for the main FreeBSD repository. Also, in general one wants to have > limits on syscalls that can force a kernel malloc of any size, as it > leads to denial of service attacks or crashes by requesting the kernel > over-allocate memory. Both problems are easily fixable. 
Current len range can be preserved by encoding this case into an 'inout' parameter of _IOC instead. IOC_VOID is only used when no IOC_IN/IOC_OUT is set, so all 3 bits would mean _IORWE. And arbitrarily high parameter size can be explicitly limited in sys_generic.c to IOCPARM_MAX. Yuri From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 07:02:56 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C14442E8; Mon, 21 Jan 2013 07:02:56 +0000 (UTC) (envelope-from bra@fsn.hu) Received: from people.fsn.hu (people.fsn.hu [195.228.252.137]) by mx1.freebsd.org (Postfix) with ESMTP id 21EF8E54; Mon, 21 Jan 2013 07:02:55 +0000 (UTC) Received: by people.fsn.hu (Postfix, from userid 1001) id E6FE9F8FFE4; Mon, 21 Jan 2013 07:53:01 +0100 (CET) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.3 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MF-ACE0E1EA [pR: 6.5807] X-CRM114-CacheID: sfid-20130121_07525_18F2FEA6 X-CRM114-Status: Good ( pR: 6.5807 ) X-DSPAM-Result: Whitelisted X-DSPAM-Processed: Mon Jan 21 07:53:01 2013 X-DSPAM-Confidence: 0.7600 X-DSPAM-Probability: 0.0000 X-DSPAM-Signature: 50fce5cd875961076684440 X-DSPAM-Factors: 27, From*Attila Nagy , 0.00010, wrote+>, 0.00217, >+>, 0.00365, >+>, 0.00365, In-Reply-To*mail.gmail.com>, 0.00375, References*mail.gmail.com>, 0.00389, wrote, 0.00573, and+set, 0.00616, zfs, 0.00692, Subject*ZFS, 0.00692, I+haven't, 0.00921, >+1), 0.01000, Date*07+52, 0.99000, that+long, 0.01000, Date*52+58, 0.99000, here?, 0.01375, From*Attila, 0.01825, relevant, 0.01825, To*gmail.com>, 0.02321, delay+or, 0.02713, values, 0.02713, be+set, 0.02713, From*Nagy, 0.02713, (maybe, 0.02713, 23+26, 0.02713, Number+of, 0.02798, X-Spambayes-Classification: ham; 0.00 Received: from japan.t-online.private (japan.t-online.co.hu [195.228.243.99]) by people.fsn.hu (Postfix) with ESMTPSA id F0D9DF8FFD5; Mon, 
21 Jan 2013 07:52:58 +0100 (CET) Message-ID: <50FCE5CA.7080006@fsn.hu> Date: Mon, 21 Jan 2013 07:52:58 +0100 From: Attila Nagy MIME-Version: 1.0 To: Zaphod Beeblebrox Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 07:02:56 -0000

Hi,

On 01/20/13 23:26, Zaphod Beeblebrox wrote:
>
> 1) a pause for scrub... such that long scrubs could be paused during
> working hours.
>

It is not exactly a pause, but wouldn't adjusting scrub_delay work here?

vfs.zfs.scrub_delay: Number of ticks to delay scrub

Set this to a high value during working hours, and set it back to its
normal (or an even lower) value outside working hours.
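A rough sketch of that idea in shell, meant to be driven from cron. Only the sysctl name comes from the message above; the working-hours window (08:00 to 17:59), the two delay values, and the function name `scrub_delay_for_hour` are assumptions for illustration and would need tuning on a real system.

```sh
#!/bin/sh
# Sketch only: choose a vfs.zfs.scrub_delay value from the hour of day.
# Assumptions: working hours are 08:00-17:59; both values are placeholders.
BUSY_DELAY=20     # hypothetical "high" value for working hours
IDLE_DELAY=4      # hypothetical "normal" value outside working hours

# Print the delay to use for a given hour (0-23). Hypothetical helper.
scrub_delay_for_hour()
{
	local _hour

	_hour=${1#0}            # drop one leading zero, as printed by date +%H
	_hour=${_hour:-0}       # "0" or "00" ends up as 0
	if [ "${_hour}" -ge 8 ] && [ "${_hour}" -lt 18 ]; then
		echo "${BUSY_DELAY}"
	else
		echo "${IDLE_DELAY}"
	fi
}

# Possible use from root's crontab, left commented out in this sketch:
# sysctl vfs.zfs.scrub_delay="$(scrub_delay_for_hour "$(date +%H)")"
```

Keeping the decision in a small function makes it easy to check the schedule without touching the live sysctl.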
(maybe resilver delay, or some other values should also be set, I haven't yet read the relevant code) From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 11:12:56 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2CF77807; Mon, 21 Jan 2013 11:12:56 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id 983D8930; Mon, 21 Jan 2013 11:12:55 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0LBCjTd001085; Mon, 21 Jan 2013 12:12:45 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0LBCjTT001082; Mon, 21 Jan 2013 12:12:45 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Mon, 21 Jan 2013 12:12:45 +0100 (CET) From: Wojciech Puchar To: Zaphod Beeblebrox Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Mon, 21 Jan 2013 12:12:46 +0100 (CET) Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 11:12:56 -0000 > Please don't misinterpret this post: ZFS's ability to recover from fairly > catastrophic failures is pretty stellar, but I'm wondering if there can be from my testing it is exactly opposite. 
You have to see a difference between marketing and reality.

> a little room for improvement.
>
> I use RAID pretty much everywhere. I don't like to loose data and disks
> are cheap. I have a fair amount of experience with all flavors ... and ZFS

just like me. And because i want performance and - as you described - disks
are cheap - i use RAID-1 (gmirror).

> has become a go-to filesystem for most of my applications.

My applications don't tolerate low performance, overcomplexity and high
risk of data loss. That's why i use properly tuned UFS, gmirror, and prefer
not to use gstripe but have multiple filesystems

> One of the best recommendations I can give for ZFS is it's
> crash-recoverability.

Which is marketing, not truth. If you want bullet-proof recoverability,
UFS beats everything i've ever seen.

If you want FAST crash recovery, use softupdates+journal, available in
FreeBSD 9.

> As a counter example, if you have most hardware RAID
> going or a software whole-disk raid, after a crash it will generally
> declare one disk as good and the other disk as "to be repaired" ... after
> which a full surface scan of the affected disks --- reading one and writing
> the other --- ensues.

true. gmirror does it, but you can defer the mirror rebuild, which i use. I
have a script that sends me a mail when gmirror is degraded, and i - after
finding out the cause of the problem, and possibly replacing the disk - run
the rebuild after work hours, so no slowdown is experienced.

> ZFS is smart on this point: it will recover on reboot with a minimum amount
> of fuss. Even if you dislodge a drive ... so that it's missing the last
> 'n' transactions, ZFS seems to figure this out (which I thought was extra
> cudos).

Yes this is marketing. practice is somehow different. as you discovered
yourself.

>
> MY PROBLEM comes from problems that scrub can fix.
>
> Let's talk, in specific, about my home array. It has 9x 1.5T and 8x 2T in
> a RAID-Z configuration (2 sets, obviously).

While RAID-Z is already a king of bad performance, i assume
you mean two POOLS, not 2 RAID-Z sets. if you mixed 2 different RAID-Z pools you would
spread load unevenly and make performance even worse.

>
> A full scrub of my drives weighs in at 36 hours or so.

which is funny as ZFS is marketed as doing this efficient (like checking
only used space).

dd if=/dev/disk of=/dev/null bs=2m would take no more than a few hours.
and you may do all in parallel.

> vr2/cvs:<0x1c1>
>
> Now ... this is just an example: after each scrub, the hex number was

seems like scrub simply does not do its work right.

> before the old error was cleared. Then this new error gets similarly
> cleared by the next scrub. It seems that if the scrub returned to this new
> found error after fixing the "known" errors, this could save whole new
> scrub runs from being required.

Even better - use UFS.
For both bullet proof recoverability and performance.
If you need help in tuning you may ask me privately.

From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 11:13:46 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id EBD3D999; Mon, 21 Jan 2013 11:13:46 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id 5B2BB94E; Mon, 21 Jan 2013 11:13:46 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0LBDhPm001251; Mon, 21 Jan 2013 12:13:43 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0LBDhXr001248; Mon, 21 Jan 2013 12:13:43 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Mon, 21 Jan 2013 12:13:43 +0100 (CET) From: Wojciech Puchar To: Dieter BSD Subject: Re:
IBM blade server abysmal disk write performances In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Mon, 21 Jan 2013 12:13:43 +0100 (CET) Cc: freebsd-hackers@freebsd.org, gibbs@freebsd.org, scottl@freebsd.org, mjacob@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 11:13:47 -0000 > > Interesting. Is there a way to tell, other than coming up with > some way to actually test it, whether a particular drive waits until my crappy laptop hard drive behave the same no matter if i turn write cache on, off or leave default. seems like it is always on. From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 11:14:44 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id AEA61AAC; Mon, 21 Jan 2013 11:14:44 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id 0AAD996B; Mon, 21 Jan 2013 11:14:43 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0LBEfvc001519; Mon, 21 Jan 2013 12:14:41 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0LBEfpr001508; Mon, 21 Jan 2013 12:14:41 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Mon, 21 Jan 2013 12:14:41 +0100 (CET) From: Wojciech Puchar To: Scott 
Long Subject: Re: IBM blade server abysmal disk write performances In-Reply-To: Message-ID: References: <6C0B86E6-195C-4D35-AE40-3D2F9F6D28FB@yahoo.com> <1358544287.32417.251.camel@revolution.hippie.lan> <50F9CFEB.5060302@feral.com> <50F9DB9A.9050303@gmail.com> <50FABB71.6050406@freebsd.org> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Mon, 21 Jan 2013 12:14:41 +0100 (CET) Cc: Karim Fodil-Lemelin , "freebsd-hackers@freebsd.org Hackers" , "gibbs@FreeBSD.org Gibbs" , "mjacob@FreeBSD.org Jacob" X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 11:14:44 -0000 > With SATA vs SAS, the gap is much narrower. The TCQ command set > (still used by SAS) is still better than the NCQ command set, but the in what point TCQ is exactly better than SATA NCQ. 
From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 11:15:27 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 792DFBBF; Mon, 21 Jan 2013 11:15:27 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id D42B797F; Mon, 21 Jan 2013 11:15:26 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0LBFOop001722; Mon, 21 Jan 2013 12:15:24 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0LBFOgT001719; Mon, 21 Jan 2013 12:15:24 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Mon, 21 Jan 2013 12:15:24 +0100 (CET) From: Wojciech Puchar To: Don Lewis Subject: Re: IBM blade server abysmal disk write performances In-Reply-To: <201301200124.r0K1OAld019768@gw.catspoiler.org> Message-ID: References: <201301200124.r0K1OAld019768@gw.catspoiler.org> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Mon, 21 Jan 2013 12:15:24 +0100 (CET) Cc: scott4long@yahoo.com, freebsd-hackers@FreeBSD.org, dieterbsd@gmail.com, scottl@FreeBSD.org, gibbs@FreeBSD.org, mjacob@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 11:15:27 -0000 > I've had my share of sudden UPS failures over the years. Probably more everything can fail. 
That's why serious sysadmins do proper backup, no matter what "safety features" are used in their servers. From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 13:19:14 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C7D63666 for ; Mon, 21 Jan 2013 13:19:14 +0000 (UTC) (envelope-from feld@feld.me) Received: from feld.me (unknown [IPv6:2607:f4e0:100:300::2]) by mx1.freebsd.org (Postfix) with ESMTP id 9560EFF0 for ; Mon, 21 Jan 2013 13:19:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=feld.me; s=blargle; h=Message-Id:From:Mime-Version:Subject:To:Date:Content-Type; bh=N+X1xoaybnyGoduT2ETARgqbCxsNt3+xFdrMmgkYcts=; b=KS0pWEUab0Udy7k1EkVJ7O0gpSzzcI54IMK08hHblGRFzATQBWcaAkdf6xcaNDkXtgrfd74Ql1IenKdSMfDYvA+IQb9kYl1yPzrKB7cL2XJlzdLZh/I+PfvNZphBC3Z4; Received: from localhost ([127.0.0.1] helo=mwi1.coffeenet.org) by feld.me with esmtp (Exim 4.80.1 (FreeBSD)) (envelope-from ) id 1TxHHN-0007MS-EB for freebsd-hackers@freebsd.org; Mon, 21 Jan 2013 07:19:13 -0600 Received: from feld@feld.me by mwi1.coffeenet.org (Archiveopteryx 3.1.4) with esmtpsa id 1358774347-17998-17996/5/1; Mon, 21 Jan 2013 13:19:07 +0000 Content-Type: text/plain; format=flowed; delsp=yes Date: Mon, 21 Jan 2013 07:19:07 -0600 To: freebsd-hackers@freebsd.org Subject: ipv6 equivalent to ipv4_addr_IF in network.subr? Mime-Version: 1.0 From: Mark Felder Message-Id: User-Agent: Opera Mail/12.12 (FreeBSD) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 13:19:14 -0000 Hi all, At work we have several standalone webservers with lots of IPs... let's say x.x.x.100 - 200. That's a LOT of "ifconfig_IF_alias0, alias1, alias2..." 
to maintain, and it's also painful when we need to move an IP to a
different server, which happens occasionally. The right solution for this
is to use ranges with ipv4_addr_IF="x.x.x.100-200/24" and if you need to
move an IP you just create a gap. For example, if we needed to move the
IP .126 we'd just change it to:

> ipv4_addr_IF="x.x.x.100-125/24 x.x.x.127-200/32"

This works great! But what about IPv6? We use corresponding IPv6 IPs so if
a customer actually wants IPv6 enabled it's as easy as adding the AAAA
record. So this leaves us with having to maintain 100 aliases again, and
when you create a gap you have to renumber all of those alias numbers or
leave things like "ifconfig_IF_alias67="inet6 up" strewn throughout the
config to fill the gaps. It's just not something worth maintaining long
term and I'd like a way to do ranges for IPv6 as well.

I've been playing with adding ipv6_addr_IF support to network.subr and it
certainly works, but the main problem is that I'm only dealing with decimal
ranges. This would *not* work with any IPv6 hex ranges unless someone more
clever than I can think of a good way to code that up.

Mostly a blatant ripoff of ipv4_addrs_common(), we come up with this:

> # ipv6_addrs_common if action
> # Evaluate the ifconfig_if_ipv6 arguments for interface $if and
> # use $action to add or remove ipv6 addresses from $if.
> ipv6_addrs_common()
> {
> 	local _ret _if _action _cidr _cidr_addr
> 	local _ipaddr _prefixlen _range _ipnet _iplow _iphigh _ipcount
> 	_ret=1
> 	_if=$1
> 	_action=$2
> 	# get ipv6-addresses
> 	_cidr_addr=`get_if_var $_if ipv6_addrs_IF`
> 	for _cidr in ${_cidr_addr}; do
> 		_ipaddr=${_cidr%%/*}
> 		_prefixlen="/"${_cidr##*/}
> 		_range=${_ipaddr##*:}
> 		_ipnet=${_ipaddr%:*}
> 		_iplow=${_range%-*}
> 		_iphigh=${_range#*-}
> 		# clear prefixlen when removing aliases
> 		if [ "${_action}" = "-alias" ]; then
> 			_prefixlen=""
> 		fi
> 		_ipcount=${_iplow}
> 		while [ "${_ipcount}" -le "${_iphigh}" ]; do
> 			eval "ifconfig ${_if} inet6 ${_action} ${_ipnet}:${_ipcount}${_prefixlen}"
> 			_ipcount=$((${_ipcount}+1))
> 			_ret=0
> 			# only the first ipaddr in a subnet needs the real prefixlen
> 			if [ "${_action}" != "-alias" ]; then
> 				_prefixlen="/128"
> 			fi
> 		done
> 	done
> 	return $_ret
> }
>

But again, it has no concept of any non-decimal ranges. However, this would
still be invaluable to us and perhaps anyone else out there managing large
numbers of IPs on a server.

So two questions:

1) With its current limitations (decimal ranges only) would this ever be
accepted into network.subr?

2) Can anyone assist me with correctly modifying ipv6if() so this works
standalone? Without ipv6if() modification it will always return 1 and skip
setting up any ipv6 addresses on the interface because it doesn't find any
ifconfig_IF_ipv6 or ipv6_ifconfig_IF in rc.conf.

Thanks!
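One possible way to handle hexadecimal ranges in the last group, sketched in plain sh. This is an illustration under assumptions, not a tested network.subr change: the function name `ipv6_expand_hex_range` is hypothetical, it only expands a "low-high" range in the final 16-bit group, and it expects that group to be present (an address ending in "::" is not handled). It relies on the shell accepting `0x` constants in arithmetic expansion and on `printf '%x'`.

```sh
#!/bin/sh
# Sketch only, not part of network.subr: expand a hexadecimal "low-high"
# range in the last group of an IPv6 address, one address per line.
# Hypothetical helper name.
ipv6_expand_hex_range()
{
	local _addr _ipnet _range _low _high _cur

	_addr=$1
	_ipnet=${_addr%:*}              # everything before the last colon
	_range=${_addr##*:}             # last group, e.g. "9-b" or "ff"
	_low=$((0x${_range%-*}))        # hex to decimal
	_high=$((0x${_range#*-}))       # same as _low when there is no "-"
	_cur=${_low}
	while [ "${_cur}" -le "${_high}" ]; do
		printf '%s:%x\n' "${_ipnet}" "${_cur}"
		_cur=$((_cur + 1))
	done
}
```

For example, `ipv6_expand_hex_range 2001:db8::9-b` prints 2001:db8::9, 2001:db8::a and 2001:db8::b, one per line. The inner loop of the function above could iterate over such output instead of counting in decimal, though whether that fits the rest of network.subr is untested here.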
From owner-freebsd-hackers@FreeBSD.ORG Mon Jan 21 21:06:10 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B6416FFF; Mon, 21 Jan 2013 21:06:10 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.dawidek.net (garage.dawidek.net [91.121.88.72]) by mx1.freebsd.org (Postfix) with ESMTP id 7D0A1D92; Mon, 21 Jan 2013 21:06:10 +0000 (UTC) Received: from localhost (unknown [86.188.145.194]) by mail.dawidek.net (Postfix) with ESMTPSA id B9CF875A; Mon, 21 Jan 2013 22:03:36 +0100 (CET) Date: Mon, 21 Jan 2013 22:06:45 +0100 From: Pawel Jakub Dawidek To: mdf@FreeBSD.org Subject: Re: kmem_map auto-sizing and size dependencies Message-ID: <20130121210645.GC1341@garage.freebsd.pl> References: <50F96A67.9080203@freebsd.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="iFRdW5/EC4oqxDHL" Content-Disposition: inline In-Reply-To: X-OS: FreeBSD 10.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-hackers , freebsd-current@freebsd.org, Andre Oppermann X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jan 2013 21:06:10 -0000 --iFRdW5/EC4oqxDHL Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Fri, Jan 18, 2013 at 08:26:04AM -0800, mdf@FreeBSD.org wrote:
> > Should it be set to a larger initial value based on min(physical,KVM) space
> > available?
>
> It needs to be smaller than the physical space, [...]

Or larger, as the address space can get fragmented and you might not be
able to allocate memory even if you have physical pages available.
--=20 Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://tupytaj.pl --iFRdW5/EC4oqxDHL Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlD9reUACgkQForvXbEpPzRQuACgmpMuaWnlzrwGLDg8via2mpRB H/MAn0osPB9G8vejrumWSQaYnHc8khDu =hBju -----END PGP SIGNATURE----- --iFRdW5/EC4oqxDHL-- From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 06:35:51 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 1A699365 for ; Tue, 22 Jan 2013 06:35:51 +0000 (UTC) (envelope-from bogorodskiy@gmail.com) Received: from mail-la0-f41.google.com (mail-la0-f41.google.com [209.85.215.41]) by mx1.freebsd.org (Postfix) with ESMTP id 94FE07EA for ; Tue, 22 Jan 2013 06:35:50 +0000 (UTC) Received: by mail-la0-f41.google.com with SMTP id fo12so1544668lab.0 for ; Mon, 21 Jan 2013 22:35:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:sender:date:from:to:subject:message-id:mime-version :content-type:content-disposition:user-agent; bh=iBP5r/cvTfqiyhWAuvvOFpsjhqqo97u317WmoaJgrZs=; b=QQI4XZgJqNizF89OMMAQhjnUvpW1s+s/+/rt1mWWliC8OEmeepKR3grBTrPK81CiuO Eoj4pVxmxd1hfdNRIxbvVQ4mgGkwa5RaITws/qcWO3DWJka1zfPrp6JTrKcFBQNUWAEl 8JmSgyB/Ou2arT8BeUnDhV3EYUUnYBBHT9Rm+spiVQ8ADfzsCIduDcJToaGoRRVIcpoC tv7r6jqTAwjKpfwFLSZZWAsXLgV+RtdWTZX1JhgD5Zaob7A7ziyNP6lXWALDnqmNKxRH UQ289uGjbTrKyJ7u3VVSzsIltYOxOhCr48M+CoAkPakfKuRcUwX0dTob+0p/eg3ijTit SFFw== X-Received: by 10.112.10.71 with SMTP id g7mr8633746lbb.70.1358836549115; Mon, 21 Jan 2013 22:35:49 -0800 (PST) Received: from ritual.srt.mirantis.net ([91.207.132.67]) by mx.google.com with ESMTPS id ee5sm6426596lbb.14.2013.01.21.22.35.47 (version=TLSv1 cipher=RC4-SHA bits=128/128); Mon, 21 Jan 2013 22:35:48 -0800 (PST) Sender: Roman Bogorodskiy Date: Tue, 22 Jan 2013 
10:34:51 +0400 From: Roman Bogorodskiy To: freebsd-hackers@freebsd.org Subject: Creating tap interface with custom name Message-ID: <20130122063450.GB39359@ritual.srt.mirantis.net> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="/WwmFnJnmDyWGHa4" Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 06:35:51 -0000 --/WwmFnJnmDyWGHa4 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline

Hi,

I have a thing I want to do and I'm not sure what the best way to do it
is; hopefully somebody can suggest a solution.

The idea is to create 'tap' interfaces with a custom name, e.g. foobarN
instead of tapN, from an application.

So the current workflow looks this way:

 - a request comes in to create a new 'foobar' interface
 - a 'tap' device is created via SIOCIFCREATE2
 - the code finds the first free interface name it could use by going
   through foobar0, foobar1, etc. until it finds an unused name
 - the code renames the 'tap' device to the new 'foobarN' via SIOCSIFNAME

As you can see, this code is not optimal and very fragile. It spends a lot
of time trying to find an appropriate interface name, and it will break if
something happens while it's running (the tap device is deleted by a user
before it gets renamed, a new foobarN is created manually before the code
creates it, etc).

Is there any better way to implement this?
Roman Bogorodskiy --/WwmFnJnmDyWGHa4 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.12 (FreeBSD) iQEcBAEBAgAGBQJQ/jMJAAoJEMltX/4IwiJq6sMH/A5lw1WCWzUxdI+SKU1Ay+AH t0HGcYxmMS8ybHF4ehahAFli0lrp5slvdEXQ90feaByah+5UYEJm++9pQRrXOe+n e3hL3Dsk/dabK+x0zTvy/0qs6/d8+R01v+Wt2qqIlJiTVkqsvY7pwNXZlJeimfMY 9x7hY4Ewu8EvvG8rQwkFQ9Q7SpiBKQvHMhDhliBiM1H5+uRF16iP3oFfD/3hJNa6 3Tt8EIVa3BuCLzD59ddi58WbgOdTTctcBoyJZnr0GKkBPkeMi27F5uE0U5SbCedI /WdtCx/3ZNZGSDWy6TE1VYkMoWqyeg4sfCM+U9xgMx40lDKK/awd49wkfvf/QYg= =1sJS -----END PGP SIGNATURE----- --/WwmFnJnmDyWGHa4-- From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 07:36:55 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id B2E1FA07; Tue, 22 Jan 2013 07:36:55 +0000 (UTC) (envelope-from peter@rulingia.com) Received: from vps.rulingia.com (host-122-100-2-194.octopus.com.au [122.100.2.194]) by mx1.freebsd.org (Postfix) with ESMTP id 4545C9FF; Tue, 22 Jan 2013 07:36:54 +0000 (UTC) Received: from server.rulingia.com (c220-239-253-186.belrs5.nsw.optusnet.com.au [220.239.253.186]) by vps.rulingia.com (8.14.5/8.14.5) with ESMTP id r0M7anI0064423 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 22 Jan 2013 18:36:51 +1100 (EST) (envelope-from peter@rulingia.com) X-Bogosity: Ham, spamicity=0.000000 Received: from server.rulingia.com (localhost.rulingia.com [127.0.0.1]) by server.rulingia.com (8.14.5/8.14.5) with ESMTP id r0M7ahxi050714 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 22 Jan 2013 18:36:43 +1100 (EST) (envelope-from peter@server.rulingia.com) Received: (from peter@localhost) by server.rulingia.com (8.14.5/8.14.5/Submit) id r0M7afm9050710; Tue, 22 Jan 2013 18:36:41 +1100 (EST) (envelope-from peter) Date: Tue, 22 Jan 2013 18:36:41 +1100 From: Peter Jeremy To: Wojciech Puchar Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
Message-ID: <20130122073641.GH30633@server.rulingia.com> References: MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="n+lFg1Zro7sl44OB" Content-Disposition: inline In-Reply-To: X-PGP-Key: http://www.rulingia.com/keys/peter.pgp User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 07:36:55 -0000 --n+lFg1Zro7sl44OB Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On 2013-Jan-21 12:12:45 +0100, Wojciech Puchar wrote:
>That's why i use properly tuned UFS, gmirror, and prefer not to use
>gstripe but have multiple filesystems

When I started using ZFS, I didn't fully trust it so I had a gmirrored
UFS root (including a full src tree). Over time, I found that gmirror
plus UFS was giving me more problems than ZFS. In particular, I was
seeing behaviour that suggested that the mirrors were out of sync, even
though gmirror insisted they were in sync. Unfortunately, there is no
way to get gmirror to verify the mirroring or to get UFS to check
correctness of data or metadata (fsck can only check metadata
consistency). I've since moved to a ZFS root.

>Which is marketing, not truth. If you want bullet-proof recoverability,
>UFS beats everything i've ever seen.

I've seen the opposite. One big difference is that ZFS is designed to
ensure it returns the data that was written to it whereas UFS just
returns the bytes it finds where it thinks it wrote your data. One
side effect of this is that ZFS is far fussier about hardware quality -
since it checksums everything, it is likely to pick up glitches that
UFS doesn't notice.

>If you want FAST crash recovery, use softupdates+journal, available in
>FreeBSD 9.

I'll admit that I haven't used SU+J but one downside of SU+J is that it
prevents the use of snapshots, which in turn prevents the (safe) use of
dump(8) (which is the official tool for UFS backups) on live filesystems.

>> of fuss. Even if you dislodge a drive ... so that it's missing the last
>> 'n' transactions, ZFS seems to figure this out (which I thought was extra
>> cudos).
>
>Yes this is marketing. practice is somehow different. as you discovered
>yourself.

Most of the time this works as designed. It's possible there are bugs in
the implementation.

>While RAID-Z is already a king of bad performance,

I don't believe RAID-Z is any worse than RAID5. Do you have any actual
measurements to back up your claim?

> i assume
>you mean two POOLS, not 2 RAID-Z sets. if you mixed 2 different RAID-Z pools you would
>spread load unevenly and make performance even worse.

There's no real reason why you couldn't have 2 different vdevs in the
same pool.

>> A full scrub of my drives weighs in at 36 hours or so.
>
>which is funny as ZFS is marketed as doing this efficient (like checking
>only used space).

It _does_ only check used space but it does so in logical order rather
than physical order. For a fragmented pool, this means random accesses.

>Even better - use UFS.

Then you'll never know that your data has been corrupted.

>For both bullet proof recoverability and performance.

use ZFS.
--=20 Peter Jeremy --n+lFg1Zro7sl44OB Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iEYEARECAAYFAlD+QYkACgkQ/opHv/APuIdH7QCfQcSzk1BtPmFuSWNBqH/UUZL0 r+kAoKU/ks97MatHjPwjXl2BarlMyOzg =KFNN -----END PGP SIGNATURE----- --n+lFg1Zro7sl44OB-- From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 11:21:22 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id A0246BE1 for ; Tue, 22 Jan 2013 11:21:22 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id 580A2A21 for ; Tue, 22 Jan 2013 11:21:22 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1Txbuq-0007Zr-MY for freebsd-hackers@freebsd.org; Tue, 22 Jan 2013 13:21:20 +0200 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: FreeBSD Hackers Subject: pmbr: Boot loader too large Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 22 Jan 2013 13:21:20 +0200 From: Daniel Braniss Message-ID: X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 11:21:22 -0000 hi, this is the output from gpart show: => 34 976773101 ada0 GPT (465G) 34 2048 1 freebsd-boot (1.0M) 2082 4194304 2 freebsd-ufs [bootme] (2.0G) 4196386 12582912 3 freebsd-swap (6.0G) 16779298 959993837 4 freebsd-zfs (457G) => 34 976773101 ada1 GPT (465G) 34 2048 1 freebsd-boot (1.0M) 2082 4194304 2 freebsd-ufs (2.0G) 4196386 12582912 3 freebsd-swap (6.0G) 16779298 959993837 4 freebsd-zfs (457G) I also did: gpart bootcode -b /boot/pmbr ada0 I'm trying to boot and get Boot loader too large not matter if I boot from disk or 
pxe. The pmbr is 512 bytes, so what causes it to overshoot? I don't know x86 assembler (nor want to :-), but the comment says: 545k should be enough so what's going on? thanks, danny From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 11:26:56 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id DB997D6F for ; Tue, 22 Jan 2013 11:26:56 +0000 (UTC) (envelope-from trond@fagskolen.gjovik.no) Received: from smtp.fagskolen.gjovik.no (smtp.fagskolen.gjovik.no [IPv6:2001:700:1100:1:200:ff:fe00:b]) by mx1.freebsd.org (Postfix) with ESMTP id 52070A6D for ; Tue, 22 Jan 2013 11:26:55 +0000 (UTC) Received: from mail.fig.ol.no (localhost [127.0.0.1]) by mail.fig.ol.no (8.14.5/8.14.5) with ESMTP id r0MBOxRP049901 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 22 Jan 2013 12:24:59 +0100 (CET) (envelope-from trond@fagskolen.gjovik.no) Received: from localhost (trond@localhost) by mail.fig.ol.no (8.14.5/8.14.5/Submit) with ESMTP id r0MBOxMK049898; Tue, 22 Jan 2013 12:24:59 +0100 (CET) (envelope-from trond@fagskolen.gjovik.no) X-Authentication-Warning: mail.fig.ol.no: trond owned process doing -bs Date: Tue, 22 Jan 2013 12:24:59 +0100 (CET) From: =?ISO-8859-1?Q?Trond_Endrest=F8l?= Sender: Trond.Endrestol@fagskolen.gjovik.no To: Daniel Braniss Subject: Re: pmbr: Boot loader too large In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) Organization: Fagskolen Innlandet OpenPGP: url=http://fig.ol.no/~trond/trond.key MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="2055831798-439718327-1358853899=:41917" X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on mail.fig.ol.no Cc: FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: 
Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 11:26:56 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --2055831798-439718327-1358853899=:41917 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 Content-Transfer-Encoding: 8BIT On Tue, 22 Jan 2013 13:21+0200, Daniel Braniss wrote: > hi, > this is the output from gpart show: > => 34 976773101 ada0 GPT (465G) > 34 2048 1 freebsd-boot (1.0M) > 2082 4194304 2 freebsd-ufs [bootme] (2.0G) > 4196386 12582912 3 freebsd-swap (6.0G) > 16779298 959993837 4 freebsd-zfs (457G) > > => 34 976773101 ada1 GPT (465G) > 34 2048 1 freebsd-boot (1.0M) > 2082 4194304 2 freebsd-ufs (2.0G) > 4196386 12582912 3 freebsd-swap (6.0G) > 16779298 959993837 4 freebsd-zfs (457G) > > I also did: > gpart bootcode -b /boot/pmbr ada0 > > I'm trying to boot and get > Boot loader too large > > not matter if I boot from disk or pxe. > The pmbr is 512 bytes, so what causes it to overshoot? > I don't know x86 assembler (nor want to :-), but the comment says: > 545k should be enough > so what's going on? > thanks, > danny A freebsd-boot partition must never be larger than 128K, i.e. 65536 512B blocks. -- +-------------------------------+------------------------------------+ | Vennlig hilsen, | Best regards, | | Trond Endrestøl, | Trond Endrestøl, | | IT-ansvarlig, | System administrator, | | Fagskolen Innlandet, | Gjøvik Technical College, Norway, | | tlf. mob. 952 62 567, | Cellular...: +47 952 62 567, | | sentralbord 61 14 54 00. | Switchboard: +47 61 14 54 00. 
| +-------------------------------+------------------------------------+ --2055831798-439718327-1358853899=:41917-- From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 11:42:31 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 5D6E2F89 for ; Tue, 22 Jan 2013 11:42:31 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id 0AAE2B17 for ; Tue, 22 Jan 2013 11:42:30 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1TxcFC-00082M-E2 for freebsd-hackers@freebsd.org; Tue, 22 Jan 2013 13:42:22 +0200 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: FreeBSD Hackers Subject: Re: solved: pmbr: Boot loader too large In-reply-to: References: Comments: In-reply-to Daniel Braniss message dated "Tue, 22 Jan 2013 13:21:20 +0200." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 22 Jan 2013 13:42:22 +0200 From: Daniel Braniss Message-ID: X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 11:42:31 -0000 > hi, > this is the output from gpart show: > => 34 976773101 ada0 GPT (465G) > 34 2048 1 freebsd-boot (1.0M) > 2082 4194304 2 freebsd-ufs [bootme] (2.0G) > 4196386 12582912 3 freebsd-swap (6.0G) > 16779298 959993837 4 freebsd-zfs (457G) > > => 34 976773101 ada1 GPT (465G) > 34 2048 1 freebsd-boot (1.0M) > 2082 4194304 2 freebsd-ufs (2.0G) > 4196386 12582912 3 freebsd-swap (6.0G) > 16779298 959993837 4 freebsd-zfs (457G) > > I also did: > gpart bootcode -b /boot/pmbr ada0 > > I'm trying to boot and get > Boot loader too large > > not matter if I boot from disk or pxe. 
> The pmbr is 512 bytes, so what causes it to overshoot? > I don't know x86 assembler (nor want to :-), but the comment says: > 545k should be enough > so what's going on? never underestimate the human stupidity (mine in this case) nor of the boot. pmbr will load the whole partition, which was 1M, instead of the size of gptboot :-( reducing the size of the slice/partition fixed the issue. From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 12:24:35 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id E211CEFA for ; Tue, 22 Jan 2013 12:24:35 +0000 (UTC) (envelope-from trond@fagskolen.gjovik.no) Received: from smtp.fagskolen.gjovik.no (smtp.fagskolen.gjovik.no [IPv6:2001:700:1100:1:200:ff:fe00:b]) by mx1.freebsd.org (Postfix) with ESMTP id 701BCE94 for ; Tue, 22 Jan 2013 12:24:35 +0000 (UTC) Received: from mail.fig.ol.no (localhost [127.0.0.1]) by mail.fig.ol.no (8.14.5/8.14.5) with ESMTP id r0MCNtes050242 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 22 Jan 2013 13:23:55 +0100 (CET) (envelope-from trond@fagskolen.gjovik.no) Received: from localhost (trond@localhost) by mail.fig.ol.no (8.14.5/8.14.5/Submit) with ESMTP id r0MCNt75050239; Tue, 22 Jan 2013 13:23:55 +0100 (CET) (envelope-from trond@fagskolen.gjovik.no) X-Authentication-Warning: mail.fig.ol.no: trond owned process doing -bs Date: Tue, 22 Jan 2013 13:23:55 +0100 (CET) From: =?ISO-8859-1?Q?Trond_Endrest=F8l?= Sender: Trond.Endrestol@fagskolen.gjovik.no To: Daniel Braniss Subject: Re: pmbr: Boot loader too large In-Reply-To: Message-ID: References: User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) Organization: Fagskolen Innlandet OpenPGP: url=http://fig.ol.no/~trond/trond.key MIME-Version: 1.0 Content-Type: MULTIPART/Mixed; BOUNDARY="2055831798-439718327-1358853899=:41917" Content-ID: X-Spam-Status: No, score=-2.9 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham 
version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on mail.fig.ol.no Cc: FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 12:24:35 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --2055831798-439718327-1358853899=:41917 Content-Type: TEXT/PLAIN; CHARSET=ISO-8859-1 Content-Transfer-Encoding: 8BIT Content-ID: On Tue, 22 Jan 2013 12:24+0100, Trond Endrestøl wrote: > On Tue, 22 Jan 2013 13:21+0200, Daniel Braniss wrote: > > > hi, > > this is the output from gpart show: > > => 34 976773101 ada0 GPT (465G) > > 34 2048 1 freebsd-boot (1.0M) > > 2082 4194304 2 freebsd-ufs [bootme] (2.0G) > > 4196386 12582912 3 freebsd-swap (6.0G) > > 16779298 959993837 4 freebsd-zfs (457G) > > > > => 34 976773101 ada1 GPT (465G) > > 34 2048 1 freebsd-boot (1.0M) > > 2082 4194304 2 freebsd-ufs (2.0G) > > 4196386 12582912 3 freebsd-swap (6.0G) > > 16779298 959993837 4 freebsd-zfs (457G) > > > > I also did: > > gpart bootcode -b /boot/pmbr ada0 > > > > I'm trying to boot and get > > Boot loader too large > > > > not matter if I boot from disk or pxe. > > The pmbr is 512 bytes, so what causes it to overshoot? > > I don't know x86 assembler (nor want to :-), but the comment says: > > 545k should be enough > > so what's going on? > > thanks, > > danny > > A freebsd-boot partition must never be larger than 128K, i.e. 65536 > 512B blocks. I was partially right. Unless http://www.freebsd.org/doc/handbook/bsdinstall-partitioning.html is seriously outdated, then the maximum size of an freebsd-boot partition is 512K, i.e. 262144 512B blocks. 
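The size-to-block conversions in this exchange can be cross-checked with a few lines of arithmetic. This is only a sketch, assuming 512-byte sectors as in the gpart output; the helper names are made up for illustration. By this arithmetic 128K is 256 blocks and 512K is 1024 blocks, while 65536 and 262144 blocks would correspond to 32M and 128M.

```python
SECTOR_BYTES = 512  # assumption: 512-byte sectors, as in the gpart listing
K = 1024


def blocks_for(size_bytes):
    """Number of 512-byte blocks needed to hold size_bytes (rounded up)."""
    return -(-size_bytes // SECTOR_BYTES)


def kib_for(blocks):
    """Size in KiB of a run of 512-byte blocks."""
    return blocks * SECTOR_BYTES // K


print(blocks_for(128 * K))  # blocks in 128K
print(blocks_for(512 * K))  # blocks in 512K
print(kib_for(2048))        # KiB in the 2048-block freebsd-boot partition
print(kib_for(65536))       # KiB in 65536 blocks
print(kib_for(262144))      # KiB in 262144 blocks
```

The 2048-block case matches the 1.0M that gpart reports for the freebsd-boot partition.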
``Tip: Proper sector alignment provides the best performance, and making partition sizes even multiples of 4K bytes helps to ensure alignment on drives with either 512-byte or 4K-byte sectors. Generally, using partition sizes that are even multiples of 1M or 1G is the easiest way to make sure every partition starts at an even multiple of 4K. One exception: at present, the freebsd-boot partition should be no larger than 512K due to boot code limitations.'' Perhaps you should shrink the freebsd-boot partition and possibly reapply /boot/gptboot. -- +-------------------------------+------------------------------------+ | Vennlig hilsen, | Best regards, | | Trond Endrestøl, | Trond Endrestøl, | | IT-ansvarlig, | System administrator, | | Fagskolen Innlandet, | Gjøvik Technical College, Norway, | | tlf. mob. 952 62 567, | Cellular...: +47 952 62 567, | | sentralbord 61 14 54 00. | Switchboard: +47 61 14 54 00. | +-------------------------------+------------------------------------+ --2055831798-439718327-1358853899=:41917 Content-Type: TEXT/PLAIN; CHARSET=us-ascii Content-ID: Content-Description: Content-Disposition: INLINE _______________________________________________ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" --2055831798-439718327-1358853899=:41917-- From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 12:41:25 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 618D38F4 for ; Tue, 22 Jan 2013 12:41:25 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id 115996E for ; Tue, 22 Jan 2013 12:41:24 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1TxdAI-0008lP-PR; Tue, 22 
Jan 2013 14:41:22 +0200
X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3
To: =?ISO-8859-1?Q?Trond_Endrest=F8l?=
Subject: Re: pmbr: Boot loader too large
In-reply-to:
References:
Comments: In-reply-to =?ISO-8859-1?Q?Trond_Endrest=F8l?= message dated "Tue, 22 Jan 2013 13:23:55 +0100."
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Tue, 22 Jan 2013 14:41:22 +0200
From: Daniel Braniss
Message-ID:
Cc: FreeBSD Hackers
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jan 2013 12:41:25 -0000

> On Tue, 22 Jan 2013 12:24+0100, Trond Endrestøl wrote:
>
> > On Tue, 22 Jan 2013 13:21+0200, Daniel Braniss wrote:
> >
> > > hi,
> > > this is the output from gpart show:
> > > =>       34  976773101  ada0  GPT  (465G)
> > >          34       2048     1  freebsd-boot  (1.0M)
> > >        2082    4194304     2  freebsd-ufs  [bootme]  (2.0G)
> > >     4196386   12582912     3  freebsd-swap  (6.0G)
> > >    16779298  959993837     4  freebsd-zfs  (457G)
> > >
> > > =>       34  976773101  ada1  GPT  (465G)
> > >          34       2048     1  freebsd-boot  (1.0M)
> > >        2082    4194304     2  freebsd-ufs  (2.0G)
> > >     4196386   12582912     3  freebsd-swap  (6.0G)
> > >    16779298  959993837     4  freebsd-zfs  (457G)
> > >
> > > I also did:
> > > 	gpart bootcode -b /boot/pmbr ada0
> > >
> > > I'm trying to boot and get
> > > 	Boot loader too large
> > >
> > > not matter if I boot from disk or pxe.
> > > The pmbr is 512 bytes, so what causes it to overshoot?
> > > I don't know x86 assembler (nor want to :-), but the comment says:
> > > 	545k should be enough
> > > so what's going on?
> > > thanks,
> > > 	danny
> >
> > A freebsd-boot partition must never be larger than 128K, i.e. 65536
> > 512B blocks.
>
> I was partially right. Unless
> http://www.freebsd.org/doc/handbook/bsdinstall-partitioning.html is
> seriously outdated, then the maximum size of an freebsd-boot partition
> is 512K, i.e. 262144 512B blocks.
>
> ``Tip: Proper sector alignment provides the best performance, and
> making partition sizes even multiples of 4K bytes helps to ensure
> alignment on drives with either 512-byte or 4K-byte sectors.
> Generally, using partition sizes that are even multiples of 1M or 1G
> is the easiest way to make sure every partition starts at an even
> multiple of 4K. One exception: at present, the freebsd-boot partition
> should be no larger than 512K due to boot code limitations.''
>
> Perhaps you should shrink the freebsd-boot partition and possibly
> reapply /boot/gptboot.

I did exactly that, and now all is ok.
what got me going in the wrong direction was the message:
	Loader file too large
it's anything but that, it's the partition size!
I reduced it to 64k and now all is ok.
the source pmbr.s seems to say different - 545K, but since gptboot is 15k ...
someone should mention it in the gpart(8) man page.
cheers, danny From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 17:09:53 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D5C40FFE for ; Tue, 22 Jan 2013 17:09:53 +0000 (UTC) (envelope-from bu7cher@yandex.ru) Received: from forward18.mail.yandex.net (forward18.mail.yandex.net [IPv6:2a02:6b8:0:1402::3]) by mx1.freebsd.org (Postfix) with ESMTP id 8886761E for ; Tue, 22 Jan 2013 17:09:53 +0000 (UTC) Received: from smtp17.mail.yandex.net (smtp17.mail.yandex.net [95.108.252.17]) by forward18.mail.yandex.net (Yandex) with ESMTP id 18ABD17815AD; Tue, 22 Jan 2013 21:09:51 +0400 (MSK) Received: from smtp17.mail.yandex.net (localhost [127.0.0.1]) by smtp17.mail.yandex.net (Yandex) with ESMTP id CCFBC1900135; Tue, 22 Jan 2013 21:09:50 +0400 (MSK) Received: from v10-167-169.yandex.net (v10-167-169.yandex.net [84.201.167.169]) by smtp17.mail.yandex.net (nwsmtp/Yandex) with ESMTP id 9oq4Nm6Q-9oqe73L6; Tue, 22 Jan 2013 21:09:50 +0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail; t=1358874590; bh=Dhgic+PsQ0p5qcqXjEX2kqyPB6Oog8P3XVWh/4/dlwM=; h=Message-ID:Date:From:User-Agent:MIME-Version:To:CC:Subject: References:In-Reply-To:X-Enigmail-Version:Content-Type: Content-Transfer-Encoding; b=TxopVDBDMZEkIrCNsAVmmSgiiScXz4tv14Cwz5nra7jm3kezxSOn89N1MU/Hn670h HKoMhIU5DtufLbhAd5380Ff9+xDEk1AbXRMtuPZm+wcYLHxlPR8NotbCbqNPa/0/oA DqMFQlxce86wpNxsRbmNbIRGKelRUXP89fHWitDs= Message-ID: <50FEC796.8090502@yandex.ru> Date: Tue, 22 Jan 2013 21:08:38 +0400 From: "Andrey V. 
Elsukov" User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/17.0 Thunderbird/17.0 MIME-Version: 1.0 To: Daniel Braniss Subject: Re: pmbr: Boot loader too large References: In-Reply-To: X-Enigmail-Version: 1.4.6 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: FreeBSD Hackers , =?ISO-8859-1?Q?Trond_Endrest=F8l?= X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 17:09:53 -0000 On 22.01.2013 16:41, Daniel Braniss wrote: > the source pmbr.s seems to say different - 545K, but since gptboot is 15k ... > someone should mention it in the gpart(8) man page. It is already documented in the gpart(8) man page, twice. -- WBR, Andrey V. Elsukov From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 17:14:53 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id CC1E1270; Tue, 22 Jan 2013 17:14:53 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id AACA669E; Tue, 22 Jan 2013 17:14:53 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 101AFB94A; Tue, 22 Jan 2013 12:14:53 -0500 (EST) From: John Baldwin To: freebsd-hackers@freebsd.org Subject: Re: libprocstat(3): retrieve process command line args and environment Date: Tue, 22 Jan 2013 12:01:06 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <20130119151253.GB88025@gmail.com> In-Reply-To: <20130119151253.GB88025@gmail.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: 
<201301221201.06290.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 22 Jan 2013 12:14:53 -0500 (EST) Cc: Mikolaj Golub , Stanislav Sedov , Robert Watson X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 17:14:53 -0000 On Saturday, January 19, 2013 10:12:54 am Mikolaj Golub wrote: > Hi, > > Some time ago Stanislav Sedov suggested to me extending libprocstat(3) > with functions to retrieve process command line arguments and > environment variables. > > In the first approach I tried, the newly added functions > procstat_getargv/getenvv allocated a buffer of necessary size, stored > the values and returned to the caller: > > http://people.freebsd.org/~trociny/libprocstat.1.patch > > The problem with this approach was that when I updated procstat(1) to > use this interface, I observed noticeable performance degradation > (about 30% on systems with MALLOC_PRODUCTION off), due to memory > allocation overhead: the original procstat(1) reuses the buffer for > all its retrievals. > > So my second approach was to add internal buffers to struct procstat, > which are used by procstat_getargv/getenvv to store values and reused > on the subsequent call: > > http://people.freebsd.org/~trociny/libprocstat.2.patch > > The drawback of this approach is that a user has to take care and > remember that a subsequent call rewrites argument vector obtained from > the previous call. On the other hand this is ok for typical use cases > while does not add allocation overhead, so I like this approach more. > > I would like to commit this second patch, if there are no objections > or suggestions how to improve the things. How is this different from kvm_getargv()? It seems to be a direct copy. 
-- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 17:14:56 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 468273C5 for ; Tue, 22 Jan 2013 17:14:55 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id BB39F6A0 for ; Tue, 22 Jan 2013 17:14:55 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 242EEB95B; Tue, 22 Jan 2013 12:14:55 -0500 (EST) From: John Baldwin To: freebsd-hackers@freebsd.org Subject: Re: solved: pmbr: Boot loader too large Date: Tue, 22 Jan 2013 12:06:58 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201301221206.58460.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 22 Jan 2013 12:14:55 -0500 (EST) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 17:14:56 -0000 On Tuesday, January 22, 2013 6:42:22 am Daniel Braniss wrote: > > hi, > > this is the output from gpart show: > > => 34 976773101 ada0 GPT (465G) > > 34 2048 1 freebsd-boot (1.0M) > > 2082 4194304 2 freebsd-ufs [bootme] (2.0G) > > 4196386 12582912 3 freebsd-swap (6.0G) > > 16779298 959993837 4 freebsd-zfs (457G) > > > > => 34 976773101 ada1 GPT (465G) > > 34 2048 1 freebsd-boot (1.0M) > > 2082 4194304 2 freebsd-ufs (2.0G) > > 4196386 12582912 3 freebsd-swap (6.0G) > > 16779298 959993837 4 freebsd-zfs 
(457G) > > > > I also did: > > gpart bootcode -b /boot/pmbr ada0 > > > > I'm trying to boot and get > > Boot loader too large > > > > not matter if I boot from disk or pxe. > > The pmbr is 512 bytes, so what causes it to overshoot? > > I don't know x86 assembler (nor want to :-), but the comment says: > > 545k should be enough > > so what's going on? > > never underestimate the human stupidity (mine in this case) nor of the boot. > pmbr will load the whole partition, which was 1M, instead of the size of > gptboot :-( > > reducing the size of the slice/partition fixed the issue. pmbr doesn't have room to be but so smart. It can't parse a filesystem, so it just loads a raw partition assuming that the partition is the boot loader. The 545k bit has to do with where it is loaded. The boot loader has to live in the lower 640k, but it starts at 0x7c00 (the address that the BIOS always loads boot loaders). The 545k limit comes from 640k - 0x7c00. This is a fundamental limit of the x86 BIOS architecture. Compared to the 15.5k that UFS leaves for boot2 it is worlds of space. 
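The arithmetic behind the limit can be checked with a short sketch. Taken literally, 640k - 0x7c00 comes to 609K; 545K is exactly what remains between 0x7c00 and 0x90000 (576K). The 0x90000 ceiling is an assumption inferred here from the 545K figure, not something stated in this thread, and the helper names are made up for illustration.

```python
SECTOR_BYTES = 512   # assumption: 512-byte sectors, as in the gpart listing
LOAD_ADDR = 0x7C00   # load address named in the message above
K = 1024


def room_below(ceiling):
    """Bytes available between the load address and a memory ceiling."""
    return ceiling - LOAD_ADDR


def fits(sectors, ceiling):
    """True if a partition of this many sectors loads below the ceiling."""
    return sectors * SECTOR_BYTES <= room_below(ceiling)


print(room_below(640 * K) // K)   # room under the full 640K, in KiB
print(room_below(0x90000) // K)   # room under the assumed 576K ceiling, in KiB

# Partition sizes mentioned in the thread, tested against the assumed ceiling.
for label, sectors in (("1.0M", 2048), ("512K", 1024), ("64K", 128)):
    print(label, fits(sectors, 0x90000))
```

Under that assumption the original 1.0M partition does not fit, while the 512K and 64K sizes discussed earlier both do.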
-- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 17:16:49 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E5CDE90B; Tue, 22 Jan 2013 17:16:49 +0000 (UTC) (envelope-from sodynet1@gmail.com) Received: from mail-ie0-f175.google.com (mail-ie0-f175.google.com [209.85.223.175]) by mx1.freebsd.org (Postfix) with ESMTP id A34FC715; Tue, 22 Jan 2013 17:16:49 +0000 (UTC) Received: by mail-ie0-f175.google.com with SMTP id qd14so11819204ieb.6 for ; Tue, 22 Jan 2013 09:16:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=aSyed+y1c4mVcrZvvftQaRGopwBXw/6drdgqXoETXbY=; b=0vZXbgTnJkacr6yQVLS4mjMPTnG5l9Ox+MYvT+GWcATb/2u+OUh73bHYp3/0SjMMJ9 rxhAtCA3oZP2jgPt5+yxkFvC/GpdT1hBW52QsPhjuNB8bJ7AvQ0lOcdPkGTLt4ol1kxb J74bf1QFqtNWND+4Sys38ac0KF3L7S1Sg24IAYUL4vPmQs3aQaxtIpcO92IvEq6jXMtM luEWDrNIg/XdAeZxnTbPcCj7n5JMAn7JqvZkV7E6qNqdeSwcbDOTxTHUG88vz6Jxegfm 9MAZilV4U8QMuRn4964M7Uk+oDO4uIPeLhFz0xdHWYLS5bXVUqZFJrCmT7Byz6+wLb3C YMkw== MIME-Version: 1.0 X-Received: by 10.50.222.226 with SMTP id qp2mr12745969igc.103.1358875009202; Tue, 22 Jan 2013 09:16:49 -0800 (PST) Received: by 10.64.51.98 with HTTP; Tue, 22 Jan 2013 09:16:48 -0800 (PST) Received: by 10.64.51.98 with HTTP; Tue, 22 Jan 2013 09:16:48 -0800 (PST) In-Reply-To: <50FBFA2C.2010504@digiware.nl> References: <201301161513.27016.jhb@freebsd.org> <1358392725.32417.179.camel@revolution.hippie.lan> <50FBFA2C.2010504@digiware.nl> Date: Tue, 22 Jan 2013 19:16:48 +0200 Message-ID: Subject: Re: Failsafe on kernel panic From: Sami Halabi To: Willem Jan Withagen Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-hackers@freebsd.org, freebsd-stable@freebsd.org, Ian 
Lepore
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jan 2013 17:16:50 -0000

I started investigating IPMI; so far I can configure the IPMI IP address
from FreeBSD.
My question is how to access it. Can it be done in-band, attached to one of
the IBM NICs on the board, or only out of band?
In the case of OOB, does anyone know if the iLO plug is pure RJ45 in IBM
servers (especially x3250/3550)?

Thanks in advance
Sami

On 20 Jan 2013 16:07, "Willem Jan Withagen" wrote:

> On 17-1-2013 4:18, Ian Lepore wrote:
> > On Wed, 2013-01-16 at 23:27 +0200, Sami Halabi wrote:
> >> Thank you for your response, very helpful.
> >> one question - how do i configure auto-reboot once kernel panic occurs?
> >>
> >> Sami
> >>
> >
> > From src/sys/conf/NOTES, this may be what you're looking for...
> >
> > #
> > # Don't enter the debugger for a panic. Intended for unattended operation
> > # where you may want to enter the debugger from the console, but still want
> > # the machine to recover from a panic.
> > #
> > options KDB_UNATTENDED
> >
> > But I think it only has meaning if you have option KDB in effect,
> > otherwise it should just reboot itself after a 15 second pause.
>
> Well it is not the magical fix-all solution.
>
> Last night I had to drive to the colo (lucky for me a 5 min drive.)
> because I could not get a system to reboot/recover from a crash.
>
> Upon arrival the system was crashed and halted on the message:
> rebooting in 15 sec.
>
> But by then those 15 secs had been over for about 10-20 minutes.
> Physically rebooting or resetting ended up in the same position:
> rebooting in 15 sec.
> Without ever getting to actually rebooting.
> > So if I (you) have servers 2 hours away, I usually try to work on > upgrading/rebooting during business hours. And remote hands can get me > out of trouble.... > > IPMI is another nice way of getting at the server in these cases. But > that requires a lot more infra and tinkering. > > --WjW > > > > From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 18:24:00 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 1CA4F43B; Tue, 22 Jan 2013 18:24:00 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from kib.kiev.ua (kib.kiev.ua [IPv6:2001:470:d5e7:1::1]) by mx1.freebsd.org (Postfix) with ESMTP id 844AEB0A; Tue, 22 Jan 2013 18:23:56 +0000 (UTC) Received: from tom.home (kostik@localhost [127.0.0.1]) by kib.kiev.ua (8.14.6/8.14.6) with ESMTP id r0MINmdD036579; Tue, 22 Jan 2013 20:23:48 +0200 (EET) (envelope-from kostikbel@gmail.com) DKIM-Filter: OpenDKIM Filter v2.7.4 kib.kiev.ua r0MINmdD036579 Received: (from kostik@localhost) by tom.home (8.14.6/8.14.6/Submit) id r0MINmlf036578; Tue, 22 Jan 2013 20:23:48 +0200 (EET) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: tom.home: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 22 Jan 2013 20:23:48 +0200 From: Konstantin Belousov To: Oliver Fromme Subject: Re: Processes' FIBs Message-ID: <20130122182348.GA36551@kib.kiev.ua> References: <201201171221.q0HCLRsh034506@lurza.secnetix.de> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="tKW2IUtsqtDRztdT" Content-Disposition: inline In-Reply-To: <201201171221.q0HCLRsh034506@lurza.secnetix.de> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Status: No, score=-2.0 required=5.0 tests=ALL_TRUSTED,BAYES_00, DKIM_ADSP_CUSTOM_MED,FREEMAIL_FROM,NML_ADSP_CUSTOM_MED autolearn=no version=3.3.2 X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on tom.home Cc: 
freebsd-net@FreeBSD.ORG, gibbs@freebsd.org, Oliver Fromme , freebsd-hackers@FreeBSD.ORG
X-BeenThere: freebsd-hackers@freebsd.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Technical Discussions relating to FreeBSD
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jan 2013 18:24:00 -0000

--tKW2IUtsqtDRztdT
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jan 17, 2012 at 01:21:27PM +0100, Oliver Fromme wrote:
> Kostik Belousov wrote:
> > The patch misses compat32 bits and breaks compat32 ps/top.
>
> Right, thank you for pointing it out! I missed it because
> I only have i386 for testing.
>
> I've created new patch sets for releng8 and current. These
> include compat32 support and an entry for the manual page.
>
> Would someone with amd64 please test the compat32 part?
> I've been using this code on i386 for a few days without
> any problems.
>
> I've attached the patch for current below. Both patch sets
> are also available from this URL:
> http://www.secnetix.de/olli/tmp/ki_fibnum/
>
> Testing is easy: Apply the patch, rebuild bin/ps and kernel.
> Make sure that your kernel config has "options ROUTETABLES=16"
> so multiple FIBs are supported. Reboot. Open a shell with
> setfib, e.g. "setfib 3 /bin/sh" (no root required), type
> "ps -ax -o user,pid,fib,command" or something similar, and
> verify that the shell process and its children are listed
> with the correct FIB. When testing on amd64, use both the
> native ps and an i386 binary.
>
> Thank you very much!
>
> Best regards
> Oliver
>
> --- sys/sys/user.h.orig 2011-11-07 22:13:19.000000000 +0100
> +++ sys/sys/user.h 2012-01-17 11:33:59.000000000 +0100
> @@ -83,7 +83,7 @@
>   * it in two places: function fill_kinfo_proc in sys/kern/kern_proc.c and
>   * function kvm_proclist in lib/libkvm/kvm_proc.c .
>   */
> -#define KI_NSPARE_INT 9
> +#define KI_NSPARE_INT 8
>  #define KI_NSPARE_LONG 12
>  #define KI_NSPARE_PTR 6
>
> @@ -186,6 +186,7 @@
>   */
>   char ki_sparestrings[50]; /* spare string space */
>   int ki_spareints[KI_NSPARE_INT]; /* spare room for growth */
> + int ki_fibnum; /* Default FIB number */
>   u_int ki_cr_flags; /* Credential flags */
>   int ki_jid; /* Process jail ID */
>   int ki_numthreads; /* XXXKSE number of threads in total */
> --- sys/kern/kern_proc.c.orig 2012-01-15 19:47:24.000000000 +0100
> +++ sys/kern/kern_proc.c 2012-01-17 12:52:36.000000000 +0100
> @@ -836,6 +836,7 @@
>   kp->ki_swtime = (ticks - p->p_swtick) / hz;
>   kp->ki_pid = p->p_pid;
>   kp->ki_nice = p->p_nice;
> + kp->ki_fibnum = p->p_fibnum;
>   kp->ki_start = p->p_stats->p_start;
>   timevaladd(&kp->ki_start, &boottime);
>   PROC_SLOCK(p);
> @@ -1121,6 +1122,7 @@
>   bcopy(ki->ki_comm, ki32->ki_comm, COMMLEN + 1);
>   bcopy(ki->ki_emul, ki32->ki_emul, KI_EMULNAMELEN + 1);
>   bcopy(ki->ki_loginclass, ki32->ki_loginclass, LOGINCLASSLEN + 1);
> + CP(*ki, *ki32, ki_fibnum);
>   CP(*ki, *ki32, ki_cr_flags);
>   CP(*ki, *ki32, ki_jid);
>   CP(*ki, *ki32, ki_numthreads);
> --- sys/compat/freebsd32/freebsd32.h.orig 2011-11-11 08:17:00.000000000 +0100
> +++ sys/compat/freebsd32/freebsd32.h 2012-01-17 11:34:00.000000000 +0100
> @@ -319,6 +319,7 @@
>   char ki_loginclass[LOGINCLASSLEN+1];
>   char ki_sparestrings[50];
>   int ki_spareints[KI_NSPARE_INT];
> + int ki_fibnum;
>   u_int ki_cr_flags;
>   int ki_jid;
>   int ki_numthreads;
> --- bin/ps/keyword.c.orig 2011-09-29 08:31:42.000000000 +0200
> +++ bin/ps/keyword.c 2012-01-17 12:54:49.000000000 +0100
> @@ -85,6 +85,7 @@
>   {"etimes", "ELAPSED", NULL, USER, elapseds, 0, CHAR, NULL, 0},
>   {"euid", "", "uid", 0, NULL, 0, CHAR, NULL, 0},
>   {"f", "F", NULL, 0, kvar, KOFF(ki_flag), INT, "x", 0},
> + {"fib", "FIB", NULL, 0, kvar, NULL, 2, KOFF(ki_fibnum), INT, "d", 0},
>   {"flags", "", "f", 0, NULL, 0, CHAR, NULL, 0},
>   {"gid", "GID", NULL, 0, kvar, KOFF(ki_groups),
UINT, UIDFMT, 0}, > {"group", "GROUP", NULL, LJUST, egroupname, 0, CHAR, NULL, 0}, > --- bin/ps/ps.1.orig 2011-11-22 22:53:06.000000000 +0100 > +++ bin/ps/ps.1 2012-01-17 12:56:17.000000000 +0100 > @@ -29,7 +29,7 @@ > .\" @(#)ps.1 8.3 (Berkeley) 4/18/94 > .\" $FreeBSD: src/bin/ps/ps.1,v 1.112 2011/11/22 21:53:06 trociny Exp $ > .\" > -.Dd November 22, 2011 > +.Dd January 17, 2012 > .Dt PS 1 > .Os > .Sh NAME > @@ -506,6 +506,9 @@ > minutes:seconds. > .It Cm etimes > elapsed running time, in decimal integer seconds > +.It Cm fib > +default FIB number, see > +.Xr setfib 1 > .It Cm flags > the process flags, in hexadecimal (alias > .Cm f ) Just reviving the recent thread after the ping. The patch looks fine to me, and is still not committed. --tKW2IUtsqtDRztdT Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (FreeBSD) iQIcBAEBAgAGBQJQ/tkzAAoJEJDCuSvBvK1Bi80P/28DJbyfmQzf9WL8B7bvqkW2 Pw8rsP2jSi23RbOkczbDI1g2CuKUeb6eLJfq7ubAuu+nwnvR4Ue3lY09ntGGGK2G sKbBTH9hMjaAX41Z7zkRDQg/058S00Zd3znaQBQMJJncty3MRg76OhttFP83OENX XEm3HWid9YoNc1zOlMw3fTCDPntuCpZQ3RFUr2Ps0nwEeolYLMZ3+aWYxlDy9Mtj Aw2/wI8rFArTVd9ZF0vd3bdqYkMqmarJr/8VAyIf6nIbaAR0FD5mqkQxzqwLY62k WTBP0tHhY5ZpHjY84S5FFPVE+ecShjWs3FmS1aTVFyk+90UtOyBMSZb3th0AZmCQ PclVx3jgp9ViZNgqgav/nC26GeTKqMxcv8bQYFrJswIxWTZZ1Djm0DcOrglDuPB1 blYiZTmhTHtEa4nfnlbOEXISiWS0iYQoczRZNfZOzX+1+d2L2cc+skVkun7hVUDx SjHyTqIq+BHl9SJ5gFpZldqO7geAc6mDOlUd/L7VSt97FPGp+CY6ErLzC2I14NM3 9sBrfEUmx9qUcI1udTf/b9SMeoPmSPE/DM0ma+bejeg9Yu+xsaDhMe4zE7vaToAG U9lZmRIqGdRR3WeZkDz5pEpk81c96TVkOqA7UBwdzlVi0uEh2JYmJ+piLgyzgYAR vew7p39yNKtXWuXUiNdJ =dxsd -----END PGP SIGNATURE----- --tKW2IUtsqtDRztdT-- From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 18:19:00 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 7E3C9FA7 for ; Tue, 22 Jan 2013 18:19:00 +0000 (UTC) (envelope-from matthew.ahrens@delphix.com) Received: from 
mail-lb0-f173.google.com (mail-lb0-f173.google.com [209.85.217.173]) by mx1.freebsd.org (Postfix) with ESMTP id 07B33ABE for ; Tue, 22 Jan 2013 18:18:59 +0000 (UTC) Received: by mail-lb0-f173.google.com with SMTP id gf7so4659099lbb.18 for ; Tue, 22 Jan 2013 10:18:52 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=delphix.com; s=google; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=qIMylib9uTp3ioe2QABsre9UeXFqZcBxYifHZU4Sgqk=; b=GQGa23gi7gjOeKmCQC/jTv4yvJvdl8ZCqTyFW7y+1xaaDHMHwcXznPz8BWc6TCC8a8 cjzMxA2yYe9blMXNKxmwlWF6/1iPEF+01TozCVu3G77ztY9N5FHNX2dGA8XZRE2iW0E9 cgWMAovrbVXWvhR5CepPwxTXVU/C/qqc0coGg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type:x-gm-message-state; bh=qIMylib9uTp3ioe2QABsre9UeXFqZcBxYifHZU4Sgqk=; b=B2LU/JohtZOOGVp5FiFiYpigJYmhVWUU7YFggZrJ33iqvtzhYTXM/pP/jkxfFMz68c LoxoOfUgYRkh0Pv2nOs6khUgAbFXnJMQAYFtqldGscMHBuTj6YYSZi9u1p7+B2nqJjf8 niJQKO9OU1FkyVOlB5lZFTwwyAiSostXvgONCzp3o4ZGN4FLC8sLnUJwxevGl3W/99tE hniUgAssARPZbr/nTMHsw54em/+LgWgXQw0atOhZYf4uX1ZAoJuJrEPMRn9lMNF5gUd0 /11209QGJmouMFnKFRpzcqhrDurwod6lo8vovRtDWBeJIPWcAcpDxB+6Dd3H0TCC5rzt JOPA== MIME-Version: 1.0 X-Received: by 10.112.88.105 with SMTP id bf9mr9704257lbb.43.1358878732612; Tue, 22 Jan 2013 10:18:52 -0800 (PST) Received: by 10.114.63.100 with HTTP; Tue, 22 Jan 2013 10:18:52 -0800 (PST) In-Reply-To: <20130122073641.GH30633@server.rulingia.com> References: <20130122073641.GH30633@server.rulingia.com> Date: Tue, 22 Jan 2013 10:18:52 -0800 Message-ID: Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
From: Matthew Ahrens To: Peter Jeremy X-Gm-Message-State: ALoCoQlOUch9HKcnvBP7fjAontjXsBpFkOFxjFGE4T44f/sBOrXXYkfP6swk32rAs3vlHO7ui7oG X-Mailman-Approved-At: Tue, 22 Jan 2013 18:31:27 +0000 Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs , Wojciech Puchar , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 18:19:00 -0000 On Mon, Jan 21, 2013 at 11:36 PM, Peter Jeremy wrote: > On 2013-Jan-21 12:12:45 +0100, Wojciech Puchar < wojtek@wojtek.tensor.gdynia.pl> wrote: >>While RAID-Z is already a king of bad performance, > > I don't believe RAID-Z is any worse than RAID5. Do you have any actual > measurements to back up your claim? Leaving aside anecdotal evidence (or actual measurements), RAID-Z is fundamentally slower than RAID4/5 *for random reads*. This is because RAID-Z spreads each block out over all disks, whereas RAID5 (as it is typically configured) puts each block on only one disk. So to read a block from RAID-Z, all data disks must be involved, vs. for RAID5 only one disk needs to have its head moved. For other workloads (especially streaming reads/writes), there is no fundamental difference, though of course implementation quality may vary. >> Even better - use UFS. To each their own. As a ZFS developer, it should come as no surprise that in my opinion and experience, the benefits of ZFS almost always outweigh this downside. 
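The random-read argument above can be put into a rough first-order model. This is only a sketch under loud assumptions: every read misses all caches, every disk sustains the same number of random I/Os per second, each RAID-Z block is wide enough to span all data disks, and RAID5 reads are smaller than one stripe unit. The function names are made up for illustration and are not part of ZFS or any RAID implementation.

```c
/*
 * First-order random-read model (illustrative only, not measured data).
 * ndisks    - number of disks in the group
 * disk_iops - random reads per second one disk can sustain (assumed equal)
 */

/*
 * RAID-Z: one logical block is spread over all data disks, so every
 * random read keeps all of them busy at once. The group as a whole
 * therefore completes roughly as many reads per second as one disk.
 */
unsigned
raidz_random_read_iops(unsigned ndisks, unsigned disk_iops)
{
	(void)ndisks;		/* more disks do not add random-read IOPS here */
	return (disk_iops);
}

/*
 * RAID5 (as typically configured): a small block lives on one disk, so
 * independent random reads can be serviced by different disks in
 * parallel. With rotating parity, all disks hold data.
 */
unsigned
raid5_random_read_iops(unsigned ndisks, unsigned disk_iops)
{
	return (ndisks * disk_iops);
}
```

Under these assumptions a 6-disk group of 150-IOPS disks gives about 150 random reads per second for RAID-Z and about 900 for RAID5. For streaming workloads the model does not apply, which matches the point that there is no fundamental difference there.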
--matt From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 19:41:39 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C99FE5A5 for ; Tue, 22 Jan 2013 19:41:39 +0000 (UTC) (envelope-from yuri@rawbw.com) Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45]) by mx1.freebsd.org (Postfix) with ESMTP id B22ECFB8 for ; Tue, 22 Jan 2013 19:41:39 +0000 (UTC) Received: from eagle.yuri.org (stunnel@localhost [127.0.0.1]) (authenticated bits=0) by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id r0MJfWgx035013 for ; Tue, 22 Jan 2013 11:41:33 -0800 (PST) (envelope-from yuri@rawbw.com) Message-ID: <50FEEB6C.7090303@rawbw.com> Date: Tue, 22 Jan 2013 11:41:32 -0800 From: Yuri User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130112 Thunderbird/17.0.2 MIME-Version: 1.0 To: hackers@freebsd.org Subject: Why DTrace sensor is listed but not called? Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 19:41:39 -0000 I tried to create my own DTrace sensors (for debugging purposes) through adding of the simple function like this: static u_int xxx_my_trace(int arg) { return 1; } It is listed in dtrace -l with its entry and return sensors. 8143 fbt kernel xxx_my_trace entry 8144 fbt kernel xxx_my_trace return This function is called, I know for sure because it is called from another procedure which does get traced by DTrace. 
However, these sensors are never triggered when run through dtrace(1M) #!/usr/sbin/dtrace -s ::xxx_my_trace:entry { printf("xxx_my_trace"); } It does print the following, but nothing else: dtrace: script './dt.d' matched 1 probe Adding __attribute__((noinline)) doesn't help. What is the problem? Why dtrace sensors aren't invoked? Yuri From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 21:17:49 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 612CEC5; Tue, 22 Jan 2013 21:17:49 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-lb0-f178.google.com (mail-lb0-f178.google.com [209.85.217.178]) by mx1.freebsd.org (Postfix) with ESMTP id 3535C3FC; Tue, 22 Jan 2013 21:17:47 +0000 (UTC) Received: by mail-lb0-f178.google.com with SMTP id n1so3155206lba.23 for ; Tue, 22 Jan 2013 13:17:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:sender:date:from:to:cc:subject:message-id:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=ZWKmXY5te34djI3HqJfIxtH7Y7f5LY3sDZnbmdZY5f8=; b=pddXKAAkU1WWqBj848flMsMJ71r1p7q2QPS1A3GG+Dk6yDnXtxqnl07glJHWUk6mVH gc19Q/ocRS1oQdvGq9L8tSy29RlypfJhBpzGXOYQNdZQOgHibp6Nt9oEqmdO4CGUZHgP Kzn6BRMY/qsYktO46ONRneaoXiSYWTSz3vXmYfwhNFO+78QcQeZg3nhP5B/hF0QTEFJ5 uVHgIl160dzyqzOHziyYwjHiSZfYK0x9OqL5Ep3O2WAxn35FaK3I4WooOJCx/bN2QHXq epLEBmWhMhsdc+nC4BQXX0WVwhVkDntUJ4OHoChLE8e9pL1OZqlTZkIQbS5YUBiBRNvR yBcg== X-Received: by 10.152.109.210 with SMTP id hu18mr21532324lab.12.1358889467031; Tue, 22 Jan 2013 13:17:47 -0800 (PST) Received: from localhost ([178.150.115.244]) by mx.google.com with ESMTPS id n7sm7512827lbg.3.2013.01.22.13.17.45 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Tue, 22 Jan 2013 13:17:46 -0800 (PST) Sender: Mikolaj Golub Date: Tue, 22 Jan 2013 23:17:43 +0200 From: Mikolaj Golub To: John Baldwin Subject: Re: libprocstat(3): 
retrieve process command line args and environment Message-ID: <20130122211743.GA4490@gmail.com> References: <20130119151253.GB88025@gmail.com> <201301221201.06290.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201301221201.06290.jhb@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Stanislav Sedov , freebsd-hackers@freebsd.org, Robert Watson X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 21:17:49 -0000

On Tue, Jan 22, 2013 at 12:01:06PM -0500, John Baldwin wrote:
> How is this different from kvm_getargv()? It seems to be a direct copy.

libprocstat(3) is a frontend for the sysctl(3) and kvm(3) interfaces, so it is good to extend it to cover "getarg/env" functionality.

Yes, the functions look similar to kvm_getargv(), but I couldn't implement them just as wrappers around kvm_getargv(): I would like the libprocstat functions to be thread safe, while kvm_getargv() uses static variables for its internal buffers.

It looks like I could fix kvm_getargv() to use fields of the kvm structure instead of static variables to store pointers to the buffers, and then use it in libprocstat(3). Do you think it is worth doing?

BTW, struct __kvm already contains some pointers which look like they are currently unused:

char **argv; /* (dynamic) storage for argv pointers */
int argc; /* length of above (not actual # present) */
char *argbuf; /* (dynamic) temporary storage */

But even if I got kvm_getargv() to behave as I wanted, there would still be an issue with using it in libprocstat(): to get the kvm structure you need to initialize procstat using procstat_open_kvm(). You are supposed to call procstat_open_kvm() when you want to read from kernel memory, while kvm_getargv() uses sysctl.
So from a user's point of view it would be a little confusing if she had to call procstat_open_kvm() to get runtime args and env. If she wanted e.g. to get both runtime args and file info (via sysctl) she would have to do procstat_open_kvm() for args and procstat_open_sysctl() for files. -- Mikolaj Golub From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 21:57:38 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 1B136C78; Tue, 22 Jan 2013 21:57:38 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id EC0678D7; Tue, 22 Jan 2013 21:57:37 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 4EF31B96E; Tue, 22 Jan 2013 16:57:37 -0500 (EST) From: John Baldwin To: Mikolaj Golub Subject: Re: libprocstat(3): retrieve process command line args and environment Date: Tue, 22 Jan 2013 16:48:50 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <20130119151253.GB88025@gmail.com> <201301221201.06290.jhb@freebsd.org> <20130122211743.GA4490@gmail.com> In-Reply-To: <20130122211743.GA4490@gmail.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201301221648.50747.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Tue, 22 Jan 2013 16:57:37 -0500 (EST) Cc: Stanislav Sedov , freebsd-hackers@freebsd.org, Robert Watson X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 21:57:38 -0000 On Tuesday, January 22, 2013
4:17:43 pm Mikolaj Golub wrote: > On Tue, Jan 22, 2013 at 12:01:06PM -0500, John Baldwin wrote: > > > How is this different from kvm_getargv()? It seems to be a direct copy. > > libprocstat(3) is a frontend for sysctl(3) and kvm(3) interfaces, so > it is good to extend it to cover "getarg/env" functionality. > > Yes the functions look similar to kvm_getargv() but I couldn't > implement them just as wrappers around kvm_getargv(): I would like to > have libprocstat functions thread safe, while kvm_getargv() uses > static variables for its internal buffers. > > It looks like I could fix kvm_getargv() to use fields of kvm structure > instead of static variables to store pointers to the buffers, and then > use it in libprocstat(3). Do you think it is worth doing? > > BTW, struct __kvm already contains some pointers, which looks like are > unused currently: > > char **argv; /* (dynamic) storage for argv pointers */ > int argc; /* length of above (not actual # present) */ > char *argbuf; /* (dynamic) temporary storage */ > > But if I even had kvm_getargv() to behave as I wanted, there is still > an issue with using it in libprocstat(): to get kvm structure you need > to initialize procstat using procstat_open_kvm(). It is supposed to > call procstat_open_kvm() when you want to read from kernel memory, > while kvm_getargv() uses sysctl. So from a user point of you it would > be a litle confusing if she had to call procstat_open_kvm() to get > runtime args and env. If she wanted e.g. to get both runtime args and > file info (via sysctl) she would have to do procstat_open_kvm() for > args and procstat_open_sysctl() for files. Well, you could make procstat open a kvm handle in both cases (open a "live" handle in the procstat_open_sysctl() case). It just seems rather silly to be duplicating code in the two interfaces. More a question for Robert: does libprocstat intentionally duplicate the code in libkvm for other things as well in the live case? 
(Like fetching the list of processes?) -- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 22:18:00 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 074993C6; Tue, 22 Jan 2013 22:18:00 +0000 (UTC) (envelope-from stas@freebsd.org) Received: from mx0.deglitch.com (backbone.deglitch.com [78.110.53.255]) by mx1.freebsd.org (Postfix) with ESMTP id B4AE99AD; Tue, 22 Jan 2013 22:17:58 +0000 (UTC) Received: from [10.0.1.4] (c-98-234-104-237.hsd1.ca.comcast.net [98.234.104.237]) by mx0.deglitch.com (Postfix) with ESMTPSA id 5DE008FC2B; Wed, 23 Jan 2013 02:17:45 +0400 (MSK) Content-Type: text/plain; charset=iso-8859-1 Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Subject: Re: libprocstat(3): retrieve process command line args and environment From: Stanislav Sedov In-Reply-To: <201301221648.50747.jhb@freebsd.org> Date: Tue, 22 Jan 2013 14:17:39 -0800 Content-Transfer-Encoding: quoted-printable Message-Id: <9679EEE4-BE52-493E-9188-CAECEE5E63D3@freebsd.org> References: <20130119151253.GB88025@gmail.com> <201301221201.06290.jhb@freebsd.org> <20130122211743.GA4490@gmail.com> <201301221648.50747.jhb@freebsd.org> To: John Baldwin X-Mailer: Apple Mail (2.1499) Cc: Mikolaj Golub , freebsd-hackers@freebsd.org, Robert Watson X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 22:18:00 -0000 On Jan 22, 2013, at 1:48 PM, John Baldwin wrote: > > Well, you could make procstat open a kvm handle in both cases (open a "live" > handle in the procstat_open_sysctl() case). It just seems rather silly to be > duplicating code in the two interfaces.
More a question for Robert: does > libprocstat intentionally duplicate the code in libkvm for other things as > well in the live case? (Like fetching the list of processes?) > It does not actually have duplicate code; the code for fetching the list of processes via sysctl is different from the KVM case. The open file descriptors processing is different as well. Because libprocstat implements almost the same functionality for both the sysctl and kvm backends, it can be used to analyze both the live system and kernel crash dumps. The code Mikolaj proposed only implements the sysctl backend currently, so it does not seem to have any relation to KVM, and it would be a bit weird to make it open a KVM handle though it does not use it. -- ST4096-RIPE From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 22:41:03 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E4065C8A for ; Tue, 22 Jan 2013 22:41:03 +0000 (UTC) (envelope-from sushanth_rai@yahoo.com) Received: from nm17.access.bullet.mail.mud.yahoo.com (nm17.access.bullet.mail.mud.yahoo.com [66.94.237.218]) by mx1.freebsd.org (Postfix) with ESMTP id C4747A98 for ; Tue, 22 Jan 2013 22:41:01 +0000 (UTC) Received: from [66.94.237.199] by nm17.access.bullet.mail.mud.yahoo.com with NNFMP; 22 Jan 2013 22:40:55 -0000 Received: from [66.94.237.101] by tm10.access.bullet.mail.mud.yahoo.com with NNFMP; 22 Jan 2013 22:40:55 -0000 Received: from [127.0.0.1] by omp1006.access.mail.mud.yahoo.com with NNFMP; 22 Jan 2013 22:40:55 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 873610.18290.bm@omp1006.access.mail.mud.yahoo.com Received: (qmail 24250 invoked by uid 60001); 22 Jan 2013 22:40:55 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1358894455; bh=x3+TqvrUY7oz6WaNj8ehUYbUZLf1F0GDICyxqFgdb1s=;
h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:MIME-Version:Content-Type; b=Q2EgJCPtN4XVg79g14zHnAnYrGJ+7iODJTR5IrRbEDkD4939jw1///UWG/EDWqwDurLJO99xpqLBFLSGwwvTtcTZZ19qzfGHe57HRTurCNONrxnFeqbFVEYdfumq00kAyycGaGjp7S0gN+gDuo4VubjakNWLd2Km3tP1XzCJXcM= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:X-Rocket-MIMEInfo:X-Mailer:Message-ID:Date:From:Subject:To:MIME-Version:Content-Type; b=z1PWUZOZT2asoqXl9nVdIFYswcbJ+OSnm2K0KKkUTVJ6nzcALH1+l4vuE/WkKhU/Y07dVhq8jJp+bu0yTWeQizFFVZBUSmGkrIWqA5KfZyi0DaSwC1ESv0HOgN5Ir1kQXDD4UFw01UF1wSMQfYC99kpFdHTtkvlnvlR5JLKfH2U=; X-YMail-OSG: 3FGC4uoVM1mFo281iFjpOKKraDbmDmF6bR0MNuwXWECT8tm uaseyzn2iJA3v8arT0oyB4Ltc9R2CXoUtjn8CVEK0Tpv_ZAQDIa4k8bGRL0. UNhTWSLW0RKdR_yZE6QZujgJ_cPaBnih5rw6cE.YQll5kbdJIaOMigLvuTjq PCSZzNZGut5qE6yaUN74W6SEF9rdFBTVa9hSm0Yl4SW2VKlvgGkhOSSQQTrf dZFO0uoFj.biNlwA2eOhib8ulGUe1bouHAkLn9UlSgQyumVWlLIQrquuX4fR bMEA86i3rxMZYlupqrCegzBmVPYi1otMhY29CS3xoE.o84yLlvwl19zQMVs0 55umje6oOgdJJhzHcHIMMYfOx7LsuLa_YAGM7q_YWxXrSTZ3LPOHJkI__Nkm DGwOFaJYr78lt7EDE.OCv0hI9XV.XnuJwCDIGVdypwGxqRKkYidCIkuIVAt0 Ge_UarpEF0O5ZG4cM69Qx1y_9zqNv_VAvUfnMMJ8WucntNe8WFexBf4ICK6V w Received: from [66.2.48.195] by web181706.mail.ne1.yahoo.com via HTTP; Tue, 22 Jan 2013 14:40:55 PST X-Rocket-MIMEInfo: 001.001, SGksCgpEb2VzIGZyZWVic2QgaGF2ZSBzb21lIGZ1bmN0aW9uYWxpdHkgc2ltaWxhciB0byAgTGludXgncyBOTUkgd2F0Y2hkb2cgPyBJJ20gYXdhcmUgb2YgaWNod2QgZHJpdmVyLCBidXQgdGhhdCBkZXBlbmRzIHRvIFdEVCB0byBiZSBhdmFpbGFibGUgaW4gdGhlIGhhcmR3YXJlLiBFdmVuIHdoZW4gaXQgaXMgYXZhaWxhYmxlLCBCSU9TIG5lZWRzIHRvIHN1cHBvcnQgYSBtZWNoYW5pc20gdG8gdHJpZ2dlciBhIE9TIGxldmVsIHJlY292ZXJ5IHRvIGdldCBhbnkgdXNlZnVsIGluZm9ybWF0aW9uIHdoZW4gc3kBMAEBAQE- X-Mailer: YahooMailClassic/15.1.2 YahooMailWebService/0.8.130.496 Message-ID: <1358894455.17521.YahooMailClassic@web181706.mail.ne1.yahoo.com> Date: Tue, 22 Jan 2013 14:40:55 -0800 (PST) From: Sushanth Rai Subject: NMI watchdog functionality on Freebsd To: 
freebsd-hackers@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 22:41:04 -0000 Hi, Does FreeBSD have some functionality similar to Linux's NMI watchdog? I'm aware of the ichwd driver, but that depends on a WDT being available in the hardware. Even when it is available, the BIOS needs to support a mechanism to trigger an OS-level recovery to get any useful information when the system is really wedged (with interrupts disabled). With Linux's NMI watchdog, the APIC is programmed to periodically generate an NMI, and the OS NMI handler can check some counters and invoke panic if the counters have not been updated for a while. Thanks, Sushanth From owner-freebsd-hackers@FreeBSD.ORG Tue Jan 22 23:22:43 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 7ABCC6EC; Tue, 22 Jan 2013 23:22:43 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-vc0-f173.google.com (mail-vc0-f173.google.com [209.85.220.173]) by mx1.freebsd.org (Postfix) with ESMTP id D1DE9CB7; Tue, 22 Jan 2013 23:22:42 +0000 (UTC) Received: by mail-vc0-f173.google.com with SMTP id fy7so4018112vcb.4 for ; Tue, 22 Jan 2013 15:22:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=QVYpoF9mNN/yWXsdDowaIcwb0yfm0+A5bqqZZTqm0Ng=; b=tkfxzg9/ps834gRnfNd9iCycmE1JYhvgRyMDRgvyIZVOY7b/dK/DdWeT51agJx0njM 8e3y2P3NaPjyqtAWL2/ATuAZndg04tubHMoo8wSMYd+IvibVVwCfhctdz3zGR0sjdzjO Wtz1MmaDVlnBaO2EsfBXO0ai5FRLnWBw8uxgz3Vcr0Cx0vSw/HtjPXVMfWr9RWjDdyVO
p4cZCUiarMrVK9oxO2MWkGvS80GLAn4eUsqwNFdt/32z7J60jMZJ+1V6alPazR7vsGcU gne0ZfVMTrqxcyKMBUGlzq93eZUfMmzDH9yLl7SVZ217y+/b8amVt05yT1XV4EQcbPkc SK5g== MIME-Version: 1.0 X-Received: by 10.52.67.45 with SMTP id k13mr23456165vdt.9.1358896956102; Tue, 22 Jan 2013 15:22:36 -0800 (PST) Sender: artemb@gmail.com Received: by 10.220.122.196 with HTTP; Tue, 22 Jan 2013 15:22:35 -0800 (PST) In-Reply-To: <20130121210645.GC1341@garage.freebsd.pl> References: <50F96A67.9080203@freebsd.org> <20130121210645.GC1341@garage.freebsd.pl> Date: Tue, 22 Jan 2013 15:22:35 -0800 X-Google-Sender-Auth: Wy68ekTm74emE4euEUDfVS6eJ2s Message-ID: Subject: Re: kmem_map auto-sizing and size dependencies From: Artem Belevich To: Pawel Jakub Dawidek Content-Type: text/plain; charset=ISO-8859-1 Cc: Matthew Fleming , FreeBSD Current , Andre Oppermann , freebsd-hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jan 2013 23:22:43 -0000 On Mon, Jan 21, 2013 at 1:06 PM, Pawel Jakub Dawidek wrote: > On Fri, Jan 18, 2013 at 08:26:04AM -0800, mdf@FreeBSD.org wrote: >> > Should it be set to a larger initial value based on min(physical,KVM) space >> > available? >> >> It needs to be smaller than the physical space, [...] > > Or larger, as the address space can get fragmented and you might not be > able to allocate memory even if you have physical pages available. +1 for relaxing upper limit. I routinely patch all my systems that use ZFS to allow kmem_map size to be larger than physical memory. Otherwise on a system where most of RAM goes towards ZFS ARC I used to eventually run into dreaded kmem_map too small panic. 
--Artem From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 00:03:24 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 390FACCA for ; Wed, 23 Jan 2013 00:03:24 +0000 (UTC) (envelope-from rysto32@gmail.com) Received: from mail-ob0-f171.google.com (mail-ob0-f171.google.com [209.85.214.171]) by mx1.freebsd.org (Postfix) with ESMTP id 08657E14 for ; Wed, 23 Jan 2013 00:03:23 +0000 (UTC) Received: by mail-ob0-f171.google.com with SMTP id lz20so2133851obb.2 for ; Tue, 22 Jan 2013 16:03:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=yj3koOrup0X2nVpWcTvq8mj5JbKLk9vCBffsWhGvRDU=; b=S/L9cJznI98Im4hMQi3oxkTGcE1HVTAmouDjBnRwV+Eb7fF1OmgBgAVDXOluxgZxbi 67OJJ2agP6GmpEt4HXSLJoBxtNKQmsLRzqt8sBkyb4KjPXVq9SvasmUFiEphP/rtkfBC LOC79elK0QGLCIvNY91oqWKsUyZydsvsgYY2dxZ06LQrVG1WBkPcrpj89bcRQdC3abmE yVQ9LIjaDeZi4htt4eLPaaQx/XMNNwxvF7Dskc/vUyQO6LAb8jxuFx303tWnBIOJDhSa zAjRozfVOS6TtUg2S86+namMsRCPLmA8HwPzuz5CmW3/ijF6/6GLxEbByW+OhtltDGTN x2KA== MIME-Version: 1.0 X-Received: by 10.182.194.2 with SMTP id hs2mr18208707obc.97.1358899396836; Tue, 22 Jan 2013 16:03:16 -0800 (PST) Received: by 10.76.128.68 with HTTP; Tue, 22 Jan 2013 16:03:16 -0800 (PST) In-Reply-To: <50FEEB6C.7090303@rawbw.com> References: <50FEEB6C.7090303@rawbw.com> Date: Tue, 22 Jan 2013 19:03:16 -0500 Message-ID: Subject: Re: Why DTrace sensor is listed but not called? 
From: Ryan Stone To: Yuri Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 00:03:24 -0000

On Tue, Jan 22, 2013 at 2:41 PM, Yuri wrote:
> I tried to create my own DTrace sensors (for debugging purposes) through
> adding of the simple function like this:
> static u_int
> xxx_my_trace(int arg) {
> return 1;
> }
>
> It is listed in dtrace -l with its entry and return sensors.
> 8143 fbt kernel xxx_my_trace entry
> 8144 fbt kernel xxx_my_trace return
> This function is called, I know for sure because it is called from another
> procedure which does get traced by DTrace.
> However, these sensors are never triggered when run through dtrace(1M)
> #!/usr/sbin/dtrace -s
> ::xxx_my_trace:entry
> {
> printf("xxx_my_trace");
> }
> It does print the following, but nothing else:
> dtrace: script './dt.d' matched 1 probe
>
> Adding __attribute__((noinline)) doesn't help.
>
> What is the problem? Why dtrace sensors aren't invoked?
>

Offhand, I can't think of why this isn't working. However, there is already a way to add new DTrace probes to the kernel, and it's quite simple, so you could try it:

/* The headers that you need to include. */
#include "opt_kdtrace.h"
#include
#include

/* Declare a DTrace provider */
SDT_PROVIDER_DEFINE(your_provider);

/*
 * Declare the DTrace probe your_provider:your_module:your_function:your_probe. You may
 * leave your_module or your_function blank. Most Solaris probes do, like sched:::dequeue.
 * We declare this probe to take 1 argument (DEFINE1) of type int.
 *
 * probe_uniquifier can be chosen arbitrarily if you like, but the convention is to make it the same
 * as your_probe. The exception is if your_probe contains a character that is not valid in a C
 * identifier (like a -, as in sched:::on-cpu). In that case the invalid character is usually
 * replaced with an underscore.
 */
SDT_PROBE_DEFINE1(your_provider, your_module, your_function, probe_uniquifier, your_probe, "int");

To call a probe:

SDT_PROBE1(your_provider, your_module, your_function, probe_uniquifier, my_int_arg);

(There is a wiki page on this, but it is out of date. I will clean it up.)

Your D script would look like:

#!/usr/sbin/dtrace -s
your_provider:your_module:your_function:your_probe
{
    printf("xxx_my_trace");
}

From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 07:25:07 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 674D9EAC; Wed, 23 Jan 2013 07:25:07 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-lb0-f175.google.com (mail-lb0-f175.google.com [209.85.217.175]) by mx1.freebsd.org (Postfix) with ESMTP id 3A585E43; Wed, 23 Jan 2013 07:25:06 +0000 (UTC) Received: by mail-lb0-f175.google.com with SMTP id n3so5540581lbo.20 for ; Tue, 22 Jan 2013 23:25:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:sender:date:from:to:cc:subject:message-id:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=ybyZzFGuaKPOBID0PX9vedcH4wYQTytvtYxmmxEFCsY=; b=fxhbVM090Te9RwD6uXu43Md2UBMIy7GWWeKiWaBUh2TN8mimCBsHaXl4QzE83ge++q LWJGc+twzuiMQyNPXvdIgFqfHozTc3HJR/daXWDRGEan6zVaM2jKb7K7c4N65BWXE1gJ 94NHlyQzkgujNkBD3UnAPybzJC8FKlGoT7VajPj1lK1ENCUPrXoXdEubika/k+a1WcDV +eRij7cjyNfr22ceEFaoibiVHifq3KoH6wFWAfKY5tCqSqZqdPKuT8MV7hDyX6eZ5W+E tVYtf2Tav+Ssz97ee9xHOvCGOXzXxYWtfYm4RCL34OxfPkfXRE4ua+GReOydLOJ579s6 fvHQ== X-Received: by 10.112.13.133 with SMTP id h5mr280205lbc.99.1358925905135; Tue, 22 Jan 2013 23:25:05 -0800 (PST) Received: from localhost ([188.230.122.226]) by
mx.google.com with ESMTPS id ox6sm7821781lab.16.2013.01.22.23.25.03 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Tue, 22 Jan 2013 23:25:04 -0800 (PST) Sender: Mikolaj Golub Date: Wed, 23 Jan 2013 09:25:00 +0200 From: Mikolaj Golub To: Stanislav Sedov Subject: Re: libprocstat(3): retrieve process command line args and environment Message-ID: <20130123072459.GA48402@gmail.com> References: <20130119151253.GB88025@gmail.com> <201301221201.06290.jhb@freebsd.org> <20130122211743.GA4490@gmail.com> <201301221648.50747.jhb@freebsd.org> <9679EEE4-BE52-493E-9188-CAECEE5E63D3@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <9679EEE4-BE52-493E-9188-CAECEE5E63D3@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-hackers@freebsd.org, Robert Watson X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 07:25:07 -0000 On Tue, Jan 22, 2013 at 02:17:39PM -0800, Stanislav Sedov wrote: > > On Jan 22, 2013, at 1:48 PM, John Baldwin wrote: > > > > Well, you could make procstat open a kvm handle in both cases (open a "live" > > handle in the procstat_open_sysctl() case). It just seems rather silly to be > > duplicating code in the two interfaces. In this particular case I prefer code duplication to opening a kvm handle in procstat_open_sysctl(), as it looks a bit confusing. But I can do this way if the agreement is reached. > > More a question for Robert: does > > libprocstat intentionally duplicate the code in libkvm for other things as > > well in the live case? (Like fetching the list of processes?) > > > It does not actually has a duplicate code, the code for fetching the list of > processes via sysctl is different from the KVM case. The open file descriptors > processing is different as well. 
Because libprocstat implements almost the > same functionality both for sysctl and kvm backends, it can be used to analyze > both the live system and the kernel crash dumps. The code Mikolaj proposed > only implements the sysctl backend currently, so it does not seem to have > any relation to KVM, so it will be a bit weird to make it open a KVM handle > though it does not use it. IMHO, after adding procstat_getargv and procstat_getenvv, the usage of kvm_getargv() and kvm_getenvv() (at least in new code) may be deprecated. As stated in the BUGS section of the man page, "these routines do not belong in the kvm interface". I suppose they are part of libkvm because there was no better place for them. procstat(1) prefers direct sysctl to them (so, again, code duplication, which I am going to remove by adding procstat_getargv/envv). -- Mikolaj Golub From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 08:10:45 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 767FD6C8 for ; Wed, 23 Jan 2013 08:10:45 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id C6E66FDA for ; Wed, 23 Jan 2013 08:10:44 +0000 (UTC) Received: (qmail 23874 invoked from network); 23 Jan 2013 09:32:03 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 23 Jan 2013 09:32:03 -0000 Message-ID: <50FF9AFE.9000406@freebsd.org> Date: Wed, 23 Jan 2013 09:10:38 +0100 From: Andre Oppermann User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Artem Belevich Subject: Re: kmem_map auto-sizing and size dependencies References: <50F96A67.9080203@freebsd.org> <20130121210645.GC1341@garage.freebsd.pl> In-Reply-To: Content-Type: text/plain;
charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Matthew Fleming , FreeBSD Current , freebsd-hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 08:10:45 -0000 On 23.01.2013 00:22, Artem Belevich wrote: > On Mon, Jan 21, 2013 at 1:06 PM, Pawel Jakub Dawidek wrote: >> On Fri, Jan 18, 2013 at 08:26:04AM -0800, mdf@FreeBSD.org wrote: >>>> Should it be set to a larger initial value based on min(physical,KVM) space >>>> available? >>> >>> It needs to be smaller than the physical space, [...] >> >> Or larger, as the address space can get fragmented and you might not be >> able to allocate memory even if you have physical pages available. > > +1 for relaxing upper limit. > > I routinely patch all my systems that use ZFS to allow kmem_map size > to be larger than physical memory. Otherwise on a system where most of > RAM goes towards ZFS ARC I used to eventually run into dreaded > kmem_map too small panic. During startup and VM initialization the following kernel VM maps are created: kernel_map (parent), specifying the entire kernel virtual address space. It is 512GB on amd64 currently. Out of the kernel_map a number of sub-maps are created: clean_map, which isn't referenced anywhere else; buffer_map, used in vfs_bio.c for i/o buffers; pager_map, used in vm_page.c for paging; exec_map, used in kern/kern_exec.c and other places for program startup; pipe_map, used in kern/sys_pipe.c for pipe buffering; kmem_map, used in kern/kern_malloc.c and vm/uma_core.c among other places, which provides all kernel malloc and UMA zone memory allocations. Having the kernel occupy all of physical RAM eventually isn't pretty. So the problem you're describing is that even though enough kernel_map space is still available it is too fragmented to find a sufficiently large chunk.
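A toy model of the fragmentation problem just described, as a sketch only: the page bitmap, its size, and the helper names are made up for illustration and say nothing about how the kernel's vm_map code really tracks free space. It shows that the total amount of free space and the largest contiguous free chunk are two different numbers.

```python
def largest_free_run(bitmap):
    """Return the longest run of free (False) pages in an allocation bitmap."""
    best = run = 0
    for used in bitmap:
        run = 0 if used else run + 1
        best = max(best, run)
    return best


def can_allocate(bitmap, npages):
    """A contiguous allocation needs one free run of at least npages."""
    return largest_free_run(bitmap) >= npages


# Hypothetical 16-page map in which every other page is in use.
pages = [i % 2 == 0 for i in range(16)]
free_pages = pages.count(False)       # half of the map is free in total
largest = largest_free_run(pages)     # but no two free pages are adjacent

print(free_pages, largest, can_allocate(pages, 4))
```

In this model a 4-page request fails although 8 pages are free, which is the same shape of failure as a "kmem_map too small" panic on a map that still has free space.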
If the kmem_map is larger than the available physical memory another mechanism has to track and limit its physical memory consumption. This may become an SMP bottleneck due to synchronization issues. I haven't looked at how the maps are managed internally. Maybe there is a natural hook to attach such a mechanism and to allow the sub-maps to be larger in KVM space than physical memory. Maybe ZFS could then have its own sub-map for ARC too. -- Andre From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 09:45:13 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A2D5CB1C; Wed, 23 Jan 2013 09:45:13 +0000 (UTC) (envelope-from danny@cs.huji.ac.il) Received: from kabab.cs.huji.ac.il (kabab.cs.huji.ac.il [132.65.16.84]) by mx1.freebsd.org (Postfix) with ESMTP id 5BA55402; Wed, 23 Jan 2013 09:45:13 +0000 (UTC) Received: from pampa.cs.huji.ac.il ([132.65.80.32]) by kabab.cs.huji.ac.il with esmtp id 1TxwtL-0004Zi-Vl; Wed, 23 Jan 2013 11:45:12 +0200 X-Mailer: exmh version 2.7.2 01/07/2005 with nmh-1.3 To: John Baldwin Subject: Re: solved: pmbr: Boot loader too large In-reply-to: <201301221206.58460.jhb@freebsd.org> References: <201301221206.58460.jhb@freebsd.org> Comments: In-reply-to John Baldwin message dated "Tue, 22 Jan 2013 12:06:58 -0500." Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Wed, 23 Jan 2013 11:45:11 +0200 From: Daniel Braniss Message-ID: Cc: freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 09:45:13 -0000 > > never underestimate the human stupidity (mine in this case) nor of the boot.
> > pmbr will load the whole partition, which was 1M, instead of the size of > > gptboot :-( > > > > reducing the size of the slice/partition fixed the issue. > > pmbr doesn't have room to be but so smart. It can't parse a filesystem, so it > just loads a raw partition assuming that the partition is the boot loader. > The 545k bit has to do with where it is loaded. The boot loader has to live > in the lower 640k, but it starts at 0x7c00 (the address that the BIOS always > loads boot loaders). The 545k limit comes from 640k - 0x7c00. This is a > fundamental limit of the x86 BIOS architecture. Compared to the 15.5k that > UFS leaves for boot2 it is worlds of space. thanks for the info. If the error message was clearer might have saved some time :-) Partition size too big instead of Boot loader too large btw, thanks to grep -r I was to find it came from pmbr.s From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 15:43:05 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 002F151D for ; Wed, 23 Jan 2013 15:43:04 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id CE92CB2E for ; Wed, 23 Jan 2013 15:43:04 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 2292AB94A; Wed, 23 Jan 2013 10:43:04 -0500 (EST) From: John Baldwin To: freebsd-hackers@freebsd.org Subject: Re: NMI watchdog functionality on Freebsd Date: Wed, 23 Jan 2013 10:25:41 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <1358894455.17521.YahooMailClassic@web181706.mail.ne1.yahoo.com> In-Reply-To: <1358894455.17521.YahooMailClassic@web181706.mail.ne1.yahoo.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" 
Content-Transfer-Encoding: 7bit Message-Id: <201301231025.41118.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 23 Jan 2013 10:43:04 -0500 (EST) Cc: Sushanth Rai X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 15:43:05 -0000 On Tuesday, January 22, 2013 5:40:55 pm Sushanth Rai wrote: > Hi, > > Does freebsd have some functionality similar to Linux's NMI watchdog ? I'm aware of ichwd driver, but that depends to WDT to be available in the hardware. Even when it is available, BIOS needs to support a mechanism to trigger a OS level recovery to get any useful information when system is really wedged (with interrupt disabled). > > With Linux's NMI, APIC is programmed to periodically generate NMI and the OS NMI handler can check for some counters and invoke panic if the counters are not updated for a while. We currently use the local APIC timer as a timer with a normal interrupt. There's no reason you couldn't add a mode to make the local APIC timer operate in this fashion however. 
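The scheme in the quoted question (a periodic NMI whose handler checks a counter and panics if it has stopped advancing) can be sketched as a small simulation. This is not FreeBSD or Linux code: the class name, the heartbeat counter, and the threshold of three stale ticks are assumptions chosen only to show the check itself.

```python
class NmiWatchdog:
    """Toy model of a periodic NMI handler that watches a heartbeat counter."""

    def __init__(self, max_stale_ticks):
        self.max_stale_ticks = max_stale_ticks  # assumed threshold
        self.last_seen = 0
        self.stale_ticks = 0

    def nmi_tick(self, heartbeat):
        """Called on every periodic NMI; returns True if the system looks hung."""
        if heartbeat != self.last_seen:
            # The counter moved since the last NMI, so normal processing is alive.
            self.last_seen = heartbeat
            self.stale_ticks = 0
            return False
        # The counter did not move; count how long it has been stuck.
        self.stale_ticks += 1
        return self.stale_ticks >= self.max_stale_ticks


wd = NmiWatchdog(max_stale_ticks=3)
# The heartbeat advances twice, then stops (for example, interrupts wedged).
results = [wd.nmi_tick(h) for h in [1, 2, 2, 2, 2]]
print(results)  # [False, False, False, False, True]
```

In a real kernel the heartbeat would be something bumped from the normal timer interrupt path, and the True case is where the handler would call panic() or drop into the debugger.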
-- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 16:32:41 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id EF728A7B; Wed, 23 Jan 2013 16:32:41 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id BF751EEC; Wed, 23 Jan 2013 16:32:41 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 20D7CB911; Wed, 23 Jan 2013 11:32:41 -0500 (EST) From: John Baldwin To: Mikolaj Golub Subject: Re: libprocstat(3): retrieve process command line args and environment Date: Wed, 23 Jan 2013 11:31:43 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <20130119151253.GB88025@gmail.com> <9679EEE4-BE52-493E-9188-CAECEE5E63D3@freebsd.org> <20130123072459.GA48402@gmail.com> In-Reply-To: <20130123072459.GA48402@gmail.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201301231131.43972.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Wed, 23 Jan 2013 11:32:41 -0500 (EST) Cc: Stanislav Sedov , freebsd-hackers@freebsd.org, Robert Watson X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 16:32:42 -0000 On Wednesday, January 23, 2013 2:25:00 am Mikolaj Golub wrote: > On Tue, Jan 22, 2013 at 02:17:39PM -0800, Stanislav Sedov wrote: > > > > On Jan 22, 2013, at 1:48 PM, John Baldwin wrote: > > > > > > Well, you could make procstat open a kvm handle in both cases (open a "live" > > > handle in the 
procstat_open_sysctl() case). It just seems rather silly to be > > > duplicating code in the two interfaces. > > In this particular case I prefer code duplication to opening a kvm > handle in procstat_open_sysctl(), as it looks a bit confusing. But I > can do it this way if agreement is reached. > > > > More a question for Robert: does > > > libprocstat intentionally duplicate the code in libkvm for other things as > > > well in the live case? (Like fetching the list of processes?) > > > > > It does not actually have duplicate code; the code for fetching the list of > > processes via sysctl is different from the KVM case. The open file descriptors > > processing is different as well. Because libprocstat implements almost the > > same functionality both for sysctl and kvm backends, it can be used to analyze > > both the live system and the kernel crash dumps. The code Mikolaj proposed > > only implements the sysctl backend currently, so it does not seem to have > > any relation to KVM, so it will be a bit weird to make it open a KVM handle > > though it does not use it. > > IMHO, after adding procstat_getargv and procstat_getenvv, the usage of > kvm_getargv() and kvm_getenvv() (at least in the new code) may be > deprecated. As stated in the BUGS section of the man page, "these > routines do not belong in the kvm interface". I suppose they are part > of libkvm because there was no better place for them. procstat(1) > prefers direct sysctl to them (so, again, code duplication, which I am > going to remove by adding procstat_getargv/envv). Hmm, are you going to rewrite ps(1) to use libprocstat? Or rather, is that a goal someday? That is one current consumer of kvm_getargv/envv. That might be fine if we want to make more tools use libprocstat instead of using libkvm directly.
-- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 16:47:52 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id B6AEF6A5 for ; Wed, 23 Jan 2013 16:47:52 +0000 (UTC) (envelope-from mjacob@freebsd.org) Received: from ns1.feral.com (ns1.feral.com [192.67.166.1]) by mx1.freebsd.org (Postfix) with ESMTP id 764AF75 for ; Wed, 23 Jan 2013 16:47:52 +0000 (UTC) Received: from [192.168.135.7] (quaver.net [76.14.49.207]) (authenticated bits=0) by ns1.feral.com (8.14.5/8.14.4) with ESMTP id r0NGlkJR032793 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 23 Jan 2013 08:47:46 -0800 (PST) (envelope-from mjacob@freebsd.org) Message-ID: <5100142D.7040904@freebsd.org> Date: Wed, 23 Jan 2013 08:47:41 -0800 From: Matthew Jacob Organization: FreeBSD User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: freebsd-hackers@freebsd.org Subject: Re: NMI watchdog functionality on Freebsd References: <1358894455.17521.YahooMailClassic@web181706.mail.ne1.yahoo.com> <201301231025.41118.jhb@freebsd.org> In-Reply-To: <201301231025.41118.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (ns1.feral.com [192.67.166.1]); Wed, 23 Jan 2013 08:47:46 -0800 (PST) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list Reply-To: mjacob@freebsd.org List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 16:47:52 -0000 On 1/23/2013 7:25 AM, John Baldwin wrote: > On Tuesday, January 22, 2013 5:40:55 pm Sushanth Rai wrote: >> Hi, >> >> Does freebsd have some functionality similar to Linux's NMI watchdog ? 
I'm > aware of ichwd driver, but that depends to WDT to be available in the > hardware. Even when it is available, BIOS needs to support a mechanism to > trigger a OS level recovery to get any useful information when system is > really wedged (with interrupt disabled) The principle purpose of a watchdog is to keep the system from hanging. Information is secondary. The ichwd driver can use the LPC part of ICH hardware that's been there since ICH version 4. I implemented this more fully at Panasas. The first importance is to keep the system from being hung. The next piece of information is to detect, on reboot, that a watchdog event occurred. Finally, trying to isolate why is good. This is equivalent to the tco_WDT stuff on Linux. It's not interrupt driven (it drives the reset line on the processor). From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 16:57:45 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D7D2FC53; Wed, 23 Jan 2013 16:57:45 +0000 (UTC) (envelope-from ian@FreeBSD.org) Received: from duck.symmetricom.us (duck.symmetricom.us [206.168.13.214]) by mx1.freebsd.org (Postfix) with ESMTP id 2C15C102; Wed, 23 Jan 2013 16:57:44 +0000 (UTC) Received: from damnhippie.dyndns.org (daffy.symmetricom.us [206.168.13.218]) by duck.symmetricom.us (8.14.6/8.14.6) with ESMTP id r0NGvbr0052719; Wed, 23 Jan 2013 09:57:37 -0700 (MST) (envelope-from ian@FreeBSD.org) Received: from [172.22.42.240] (revolution.hippie.lan [172.22.42.240]) by damnhippie.dyndns.org (8.14.3/8.14.3) with ESMTP id r0NGvYp9014257; Wed, 23 Jan 2013 09:57:34 -0700 (MST) (envelope-from ian@FreeBSD.org) Subject: Re: NMI watchdog functionality on Freebsd From: Ian Lepore To: mjacob@FreeBSD.org In-Reply-To: <5100142D.7040904@freebsd.org> References: <1358894455.17521.YahooMailClassic@web181706.mail.ne1.yahoo.com> <201301231025.41118.jhb@freebsd.org> 
<5100142D.7040904@freebsd.org> Content-Type: text/plain; charset="us-ascii" Date: Wed, 23 Jan 2013 09:57:33 -0700 Message-ID: <1358960253.32417.467.camel@revolution.hippie.lan> Mime-Version: 1.0 X-Mailer: Evolution 2.32.1 FreeBSD GNOME Team Port Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@FreeBSD.org, Sushanth Rai X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 16:57:45 -0000 On Wed, 2013-01-23 at 08:47 -0800, Matthew Jacob wrote: > On 1/23/2013 7:25 AM, John Baldwin wrote: > > On Tuesday, January 22, 2013 5:40:55 pm Sushanth Rai wrote: > >> Hi, > >> > >> Does freebsd have some functionality similar to Linux's NMI watchdog ? I'm > > aware of ichwd driver, but that depends to WDT to be available in the > > hardware. Even when it is available, BIOS needs to support a mechanism to > > trigger a OS level recovery to get any useful information when system is > > really wedged (with interrupt disabled) > The principle purpose of a watchdog is to keep the system from hanging. > Information is secondary. The ichwd driver can use the LPC part of ICH > hardware that's been there since ICH version 4. I implemented this more > fully at Panasas. The first importance is to keep the system from being > hung. The next piece of information is to detect, on reboot, that a > watchdog event occurred. Finally, trying to isolate why is good. > > This is equivalent to the tco_WDT stuff on Linux. It's not interrupt > driven (it drives the reset line on the processor). > I think there's value in the NMI watchdog idea, but unless you back it up with a real hardware watchdog you don't really have full watchdog functionality. 
If the NMI can get the OS to produce some extra info, that's great, and using an NMI gives you a good chance of doing that even if it is normal interrupt processing that has wedged the machine. But calling panic() invokes plenty of processing that can get wedged in other ways, so even an NMI-based watchdog isn't g'teed to get the machine running again. But adding a real hardware watchdog that fires on a slightly longer timeout than the NMI watchdog gives you the best of everything: you get information if it's possible to produce it, and you get a real hardware reset shortly thereafter if producing the info fails. -- Ian From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 20:14:51 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id F3BEB687; Wed, 23 Jan 2013 20:14:50 +0000 (UTC) (envelope-from carpeddiem@gmail.com) Received: from mail-oa0-f48.google.com (mail-oa0-f48.google.com [209.85.219.48]) by mx1.freebsd.org (Postfix) with ESMTP id A68A0C28; Wed, 23 Jan 2013 20:14:50 +0000 (UTC) Received: by mail-oa0-f48.google.com with SMTP id h2so8916266oag.21 for ; Wed, 23 Jan 2013 12:14:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=tD5eXihJNRkvfmzDh6+MSohTZo1h6Tf99obdWgrLxII=; b=soECJEEeCnDyBwJekV543XV+FVKnb8XJJiYuVnTTIopAsVb/4W1bYuy1ddSh3Bb7NM Ku4RCW8OXB5QJXRteWOAEaOOc5AEbp3ZLtv2yVBLhY+4gBvW/1jIS92h0Qykexd16SaE N/XW9m7W4Hys2CF3EhaFkCcgdBQvi5rZ5UyxoMhMsvyFhbVsOVyNzrtTLbvfFwP9IVFv fWqdDbycHTx/34knemKEwZfW+AM1Tg03teER5p7xgl4ZrqiT4Wwoy/plPzAc3qgYUONv ubjlrStXwbtgWbqeZNIiq+iXPYTzxE5C1jbTIQnt/8hqJEzgF0dJP2l0ngI8/w8l1BeI sMcw== MIME-Version: 1.0 X-Received: by 10.60.1.132 with SMTP id 4mr2038646oem.140.1358972084208; Wed, 23 Jan 2013 12:14:44 -0800 (PST) Sender: 
carpeddiem@gmail.com Received: by 10.60.36.234 with HTTP; Wed, 23 Jan 2013 12:14:44 -0800 (PST) In-Reply-To: <1358960253.32417.467.camel@revolution.hippie.lan> References: <1358894455.17521.YahooMailClassic@web181706.mail.ne1.yahoo.com> <201301231025.41118.jhb@freebsd.org> <5100142D.7040904@freebsd.org> <1358960253.32417.467.camel@revolution.hippie.lan> Date: Wed, 23 Jan 2013 15:14:44 -0500 X-Google-Sender-Auth: cQv7l8XMJc1H8fhrn8I7PU5U2Ko Message-ID: Subject: Re: NMI watchdog functionality on Freebsd From: Ed Maste To: Ian Lepore Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-hackers@freebsd.org, Sushanth Rai , mjacob@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 20:14:51 -0000 On 23 January 2013 11:57, Ian Lepore wrote: > > But adding a real hardware watchdog that fires on a slightly longer > timeout than the NMI watchdog gives you the best of everything: you get > information if it's possible to produce it, and you get a real hardware > reset shortly thereafter if producing the info fails. Yes, this is a great option if supported by hardware. Some Supermicro motherboards (like the X8STi) have two independent watchdogs, and one of them has a jumper to choose between NMI and reset upon expiry. In addition, the wbwd(4) driver has a sysctl to override the normal timeout to make the dual-stage watchdog possible: dev.wbwd.0.timeout_override This variable allows to program the timer to a value independent on the one provided by the watchdog(4) framework while still relying on the regular updates from e.g. watchdogd(8). This is particularly useful if your system provides multiple watchdogs and you want them to fire in a special sequence to trigger an NMI after a shorter period than the reset timeout for example. 
The value set must not be lower than the sleep time of watchdogd(8). A value of 0 disables this feature and the timeout value provided by watchdog(4) will be used. I hope this capability moves into the watchdog infrastructure rather than existing as a driver-specific kluge. From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 20:23:02 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 86CD17DF; Wed, 23 Jan 2013 20:23:02 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id EA022CDE; Wed, 23 Jan 2013 20:23:01 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0NKMrEl001682; Wed, 23 Jan 2013 21:22:53 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0NKMqI3001679; Wed, 23 Jan 2013 21:22:53 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Wed, 23 Jan 2013 21:22:52 +0100 (CET) From: Wojciech Puchar To: Peter Jeremy Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
In-Reply-To: <20130122073641.GH30633@server.rulingia.com> Message-ID: References: <20130122073641.GH30633@server.rulingia.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Wed, 23 Jan 2013 21:22:53 +0100 (CET) Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 20:23:02 -0000 >> While RAID-Z is already a king of bad performance, > > I don't believe RAID-Z is any worse than RAID5. Do you have any actual > measurements to back up your claim? it is clearly described even in ZFS papers. Both on reads and writes it gives single drive random I/O performance. From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 20:24:19 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9BC9A93F; Wed, 23 Jan 2013 20:24:19 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id 0A006CF5; Wed, 23 Jan 2013 20:24:18 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0NKOGVX001691; Wed, 23 Jan 2013 21:24:16 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0NKOFFl001688; Wed, 23 Jan 2013 21:24:16 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Wed, 23 Jan 2013 21:24:15 +0100 (CET) From: Wojciech Puchar To: Matthew 
Ahrens Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. In-Reply-To: Message-ID: References: <20130122073641.GH30633@server.rulingia.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Wed, 23 Jan 2013 21:24:16 +0100 (CET) Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 20:24:19 -0000 > This is because RAID-Z spreads each block out over all disks, whereas RAID5 > (as it is typically configured) puts each block on only one disk. So to > read a block from RAID-Z, all data disks must be involved, vs. for RAID5 > only one disk needs to have its head moved. > > For other workloads (especially streaming reads/writes), there is no > fundamental difference, though of course implementation quality may vary. streaming workload generally is always good. random I/O is what is important. 
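The layout argument quoted above can be turned into a back-of-the-envelope model. It is a deliberately idealized sketch: the function name and the figure of 100 IOPS per disk are assumptions, and caching, queueing, and record-size effects are ignored.

```python
def random_read_iops(layout, ndisks, disk_iops):
    """Rough upper bound on small random-read IOPS for one array or vdev."""
    if layout == "raid5":
        # Each small block lives on a single disk, so different disks can
        # serve different reads at the same time.
        return ndisks * disk_iops
    if layout == "raidz":
        # Each block is spread over all data disks, so every read keeps the
        # whole vdev busy and the vdev behaves like one disk.
        return disk_iops
    raise ValueError(f"unknown layout: {layout}")


DISK_IOPS = 100  # assumed figure for a single spinning disk

raid5 = random_read_iops("raid5", 5, DISK_IOPS)
raidz = random_read_iops("raidz", 5, DISK_IOPS)
print(raid5, raidz)  # 500 100
```

The model only covers small random reads; for streaming reads and writes both layouts keep all data disks busy, which is why no fundamental difference is expected there.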
From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 20:26:44 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 35989CF5; Wed, 23 Jan 2013 20:26:44 +0000 (UTC) (envelope-from utisoft@gmail.com) Received: from mail-ie0-f178.google.com (mail-ie0-f178.google.com [209.85.223.178]) by mx1.freebsd.org (Postfix) with ESMTP id 00D61D55; Wed, 23 Jan 2013 20:26:43 +0000 (UTC) Received: by mail-ie0-f178.google.com with SMTP id c12so14695291ieb.9 for ; Wed, 23 Jan 2013 12:26:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=9XrBORLOe2IsLQljrp9MmYuhcFI8/4m7lInRuUOVuew=; b=v1vzwwfD+9DwZKuxlIRR1ok9A8sExeD+7auzbZ6SJvT2wWKx4GuhcBV87L8f5KREPX rO+qf0XJpnt8VpRZaKLJpOPHm9VOaoXIZpJQG1PyQPBgOveZyb5ENDxm8pqnAksICrkY rRYEgijFaQwGrFSZDASME7CLMIFFsDZJYoedTP7uHlB/cA3NGaKb+MrmOvgKjlb6uftw b6IhABdJbtmelaKRBPzjxvRJKIF71wXv31M+OC7Z5s+G5tjhGU0DiUaFsrKqK3Bms+Av v2B5BJTqzUylhN7rYSayRgKkI0WWXS5OBaxcz8BQgDDO4N9MxTgzEXRslB6sDhbYE+2s E8+g== MIME-Version: 1.0 X-Received: by 10.43.114.4 with SMTP id ey4mr1995217icc.27.1358972803501; Wed, 23 Jan 2013 12:26:43 -0800 (PST) Received: by 10.64.16.73 with HTTP; Wed, 23 Jan 2013 12:26:43 -0800 (PST) Received: by 10.64.16.73 with HTTP; Wed, 23 Jan 2013 12:26:43 -0800 (PST) In-Reply-To: References: <20130122073641.GH30633@server.rulingia.com> Date: Wed, 23 Jan 2013 20:26:43 +0000 Message-ID: Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
From: Chris Rees To: Wojciech Puchar Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 20:26:44 -0000 On 23 Jan 2013 20:23, "Wojciech Puchar" wrote: >>> >>> While RAID-Z is already a king of bad performance, >> >> >> I don't believe RAID-Z is any worse than RAID5. Do you have any actual >> measurements to back up your claim? > > > it is clearly described even in ZFS papers. Both on reads and writes it gives single drive random I/O performance. So we have to take your word for it? Provide a link if you're going to make assertions, or they're no more than your own opinion. Chris From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:09:36 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 40A15A73; Wed, 23 Jan 2013 21:09:36 +0000 (UTC) (envelope-from feld@feld.me) Received: from feld.me (unknown [IPv6:2607:f4e0:100:300::2]) by mx1.freebsd.org (Postfix) with ESMTP id E88DDF12; Wed, 23 Jan 2013 21:09:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=feld.me; s=blargle; h=In-Reply-To:Message-Id:From:Mime-Version:Date:References:Subject:Cc:To:Content-Type; bh=IFkEk7sxzpdO70vPGpC8J/+tzwpriEknEjCNY6nj8CI=; b=gwP/BUB4vMWXgEYRdTkkboCD7L4b2CuuCc2FlM8aFqvUKUSEwzsbtBJCAoRLSg9JdJdGEmNYxREFa3YBvnk2xjhGMr2AgIGniYuIaHdHqGWigjtT8cg+bVyBRAVsg8Ia; Received: from localhost ([127.0.0.1] helo=mwi1.coffeenet.org) by feld.me with esmtp (Exim 4.80.1 (FreeBSD)) (envelope-from ) id 1Ty7ZY-0003UU-Fb; Wed, 23 Jan 2013 15:09:28 -0600 Received: from feld@feld.me by mwi1.coffeenet.org (Archiveopteryx 3.1.4) with 
esmtpsa id 1358975362-13187-64949/5/1; Wed, 23 Jan 2013 21:09:22 +0000 Content-Type: text/plain; format=flowed; delsp=yes To: Wojciech Puchar , Chris Rees Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. References: <20130122073641.GH30633@server.rulingia.com> Date: Wed, 23 Jan 2013 15:09:22 -0600 Mime-Version: 1.0 From: Mark Felder Message-Id: In-Reply-To: User-Agent: Opera Mail/12.12 (FreeBSD) Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:09:36 -0000 On Wed, 23 Jan 2013 14:26:43 -0600, Chris Rees wrote: > > So we have to take your word for it? > Provide a link if you're going to make assertions, or they're no more > than > your own opinion. I've heard this same thing -- every vdev == 1 drive in performance. I've never seen any proof/papers on it though. 
From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:10:55 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4CDF2D00; Wed, 23 Jan 2013 21:10:55 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-vc0-f182.google.com (mail-vc0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id F197DF40; Wed, 23 Jan 2013 21:10:54 +0000 (UTC) Received: by mail-vc0-f182.google.com with SMTP id fl17so4160851vcb.27 for ; Wed, 23 Jan 2013 13:10:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=f/8KJxGm7K3Y6HyhRdjkHuaatBFIhKtuAn2fAteXSO0=; b=IUtkWkHehzDzWfVuxnYY6/vaqlHyLwWsa8ktg0hkWDAOgoZs75m81WrPnCQS2PeaqJ NLiE/6A3B+uhfDuLOm2E/y8DU2douELunwmomb2+rA0UdrbVFrWgTsZqHXzDpm9YbsDL aOm658PmfJwjD1KQfLTuLdOBVJ02BYPDVlQBmpusm9/7Pak7R/aQMTHHYXqiM6a4jb8U WzPB1X1yZ4NMoTQmwfzq7g1H+k6OOAnA4ZnCvfN1X6KJ+1jHIvIRW+JmTz1RVcOW7QEW DdaW+skTNQ74kVQ2GL5ICI1KmbipFGCmG96skC4ntl8VKR2LKLTdKCQDa8nTdL12qXQb G7oA== MIME-Version: 1.0 X-Received: by 10.52.67.45 with SMTP id k13mr2820757vdt.9.1358975454188; Wed, 23 Jan 2013 13:10:54 -0800 (PST) Sender: artemb@gmail.com Received: by 10.220.123.2 with HTTP; Wed, 23 Jan 2013 13:10:54 -0800 (PST) In-Reply-To: References: <20130122073641.GH30633@server.rulingia.com> Date: Wed, 23 Jan 2013 13:10:54 -0800 X-Google-Sender-Auth: 2JIK8ydSsaEj5o2ILc8pnpmjeT8 Message-ID: Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
From: Artem Belevich To: Wojciech Puchar Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:10:55 -0000 On Wed, Jan 23, 2013 at 12:22 PM, Wojciech Puchar wrote: >>> While RAID-Z is already a king of bad performance, >> >> >> I don't believe RAID-Z is any worse than RAID5. Do you have any actual >> measurements to back up your claim? > > > it is clearly described even in ZFS papers. Both on reads and writes it > gives single drive random I/O performance. For reads - true. For writes it probably behaves better than RAID5, as it does not have to go through read-modify-write for partial block updates. Search for RAID-5 write hole. If you need higher performance, build your pool out of multiple RAID-Z vdevs. From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:23:48 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id E024648C; Wed, 23 Jan 2013 21:23:48 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-vb0-f47.google.com (mail-vb0-f47.google.com [209.85.212.47]) by mx1.freebsd.org (Postfix) with ESMTP id 84C6BFCC; Wed, 23 Jan 2013 21:23:48 +0000 (UTC) Received: by mail-vb0-f47.google.com with SMTP id e21so1650535vbm.34 for ; Wed, 23 Jan 2013 13:23:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=24nHO+462d1riarIWZ5ktl944GDji+uBqmN8eJS75Cw=; b=wtDtseVtMbZ3aY9LCKhAvibga+3GBvllFcKrnr5X67xko9tcHej7CHS+5de55u6fPI mSw6UGC/vy7CCECS9XokyHrobwLanuG3gm1QUAbt5y5N6T47panY+5z88Plsbf9aYRwb
BJ+8tcZyJOmfCWQiFx6pewFIAOURP7XVK3wOU5hUrF2n3gqhSZtAWoiOA/NF2vHRXjep XJTePdWwVJlP0KMPB9DQl+xJZ6xvPtA/W4SOFPS3uZFCcWQ3JUVtmSOXikZV/P2wK6Le vygIVFSomCJ6GD/z9Bp+NF/6F6yBB4QOUZnZn8/XbNuhvUyc2GIT26pdjqqP2LMPOjn+ X4hQ== MIME-Version: 1.0 X-Received: by 10.52.76.7 with SMTP id g7mr2694351vdw.95.1358976227679; Wed, 23 Jan 2013 13:23:47 -0800 (PST) Sender: artemb@gmail.com Received: by 10.220.123.2 with HTTP; Wed, 23 Jan 2013 13:23:47 -0800 (PST) In-Reply-To: References: <20130122073641.GH30633@server.rulingia.com> Date: Wed, 23 Jan 2013 13:23:47 -0800 X-Google-Sender-Auth: H1jEo4QrSn9xnvkNQGX15PeFTCk Message-ID: Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. From: Artem Belevich To: Mark Felder Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs , Wojciech Puchar , Chris Rees , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:23:48 -0000 On Wed, Jan 23, 2013 at 1:09 PM, Mark Felder wrote: > On Wed, 23 Jan 2013 14:26:43 -0600, Chris Rees wrote: > >> >> So we have to take your word for it? >> Provide a link if you're going to make assertions, or they're no more than >> your own opinion. > > > I've heard this same thing -- every vdev == 1 drive in performance. I've > never seen any proof/papers on it though. "1 drive in performance" only applies to the number of random I/O operations a vdev can perform. You still get increased throughput. I.e. a 5-drive RAIDZ will have 4x the bandwidth of the individual disks in the vdev, but would deliver only as many IOPS as the slowest drive, as a record would have to be read back from N-1 or N-2 drives in the vdev. It's the same for RAID5. IMHO for identical record/block size RAID5 has no advantage over RAID-Z for reads and does have a disadvantage when it comes to small writes.
Never mind lack of data integrity checks and other bells and whistles ZFS provides. --Artem From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:24:26 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A57855E3; Wed, 23 Jan 2013 21:24:26 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id 1937FFDE; Wed, 23 Jan 2013 21:24:25 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0NLOF1t001978; Wed, 23 Jan 2013 22:24:15 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0NLOFxU001975; Wed, 23 Jan 2013 22:24:15 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Wed, 23 Jan 2013 22:24:15 +0100 (CET) From: Wojciech Puchar To: Mark Felder Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. In-Reply-To: Message-ID: References: <20130122073641.GH30633@server.rulingia.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Wed, 23 Jan 2013 22:24:16 +0100 (CET) Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org, Chris Rees X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:24:26 -0000 > > I've heard this same thing -- every vdev == 1 drive in performance. I've > never seen any proof/papers on it though. read original ZFS papers. 
From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:25:09 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 4CF28740; Wed, 23 Jan 2013 21:25:09 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id B2C7BFF5; Wed, 23 Jan 2013 21:25:08 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0NLP79g001989; Wed, 23 Jan 2013 22:25:07 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0NLP6YK001986; Wed, 23 Jan 2013 22:25:06 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Wed, 23 Jan 2013 22:25:06 +0100 (CET) From: Wojciech Puchar To: Artem Belevich Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. In-Reply-To: Message-ID: References: <20130122073641.GH30633@server.rulingia.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Wed, 23 Jan 2013 22:25:07 +0100 (CET) Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:25:09 -0000 >> gives single drive random I/O performance. > > For reads - true. For writes it's probably behaves better than RAID5 yes, because as with reads it gives single drive performance. small writes on RAID5 gives lower than single disk performance. 
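The RAID5 small-write cost mentioned above can be sketched with the usual textbook model. This is an assumption for illustration, not a measurement from this thread: updating one block inside a stripe means reading the old data and the old parity, then writing the new data and the new parity. The function names and the 200 IOPS per-disk figure are hypothetical.

```python
def raid5_small_write_ios():
    """I/Os needed to update one block that is smaller than a full stripe."""
    reads = 2   # old data block + old parity block
    writes = 2  # new data block + new parity block
    return reads + writes


def raid5_random_write_iops(n_disks, per_disk_iops):
    """Aggregate small-write IOPS under the read-modify-write model.

    Every disk can serve I/Os, but each logical write consumes four of them.
    """
    return n_disks * per_disk_iops / raid5_small_write_ios()


if __name__ == "__main__":
    for disks in (3, 5, 8):
        print(disks, "disks:", raid5_random_write_iops(disks, 200), "write IOPS")
```

Under this model the aggregate figure depends on the number of disks, while each individual write still waits for a read followed by a write. Arrays with write-back caches or full-stripe writes will behave differently.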
> If you need higher performance, build your pool out of multiple RAID-Z vdevs. if you need even normal performance, use gmirror and UFS From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:27:06 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 885A199F; Wed, 23 Jan 2013 21:27:06 +0000 (UTC) (envelope-from utisoft@gmail.com) Received: from mail-ie0-f181.google.com (mail-ie0-f181.google.com [209.85.223.181]) by mx1.freebsd.org (Postfix) with ESMTP id 53DE990; Wed, 23 Jan 2013 21:27:06 +0000 (UTC) Received: by mail-ie0-f181.google.com with SMTP id 16so14553810iea.12 for ; Wed, 23 Jan 2013 13:27:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:mime-version:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=Ivy7C2YjvDfWp3QZS8yC0nQrhMbEcWyVaemhiS/3fa8=; b=zD5E0aIEDSnyWNt0WzLtOfa9WlNyfsbpL1BeP1fFSSuMTjB8U/Xx2KqiEuVtnzcfmo 1TieAnjL4moMybwoRUZoPMLW7FMXC1cEEbfD01pnIvos9EiOisrk8xmjer0wZvJiQFjO CaV9YMQVmxQ6k5xe8VtmoNAYp2tKxJTJs4p+zIIDi3PMVHQLNVJRBNUvs9zHOrhlKxf0 BqeIid6lCH7n0MFMEcimxhv+Wf4oU6/vVBaT31ozBkRC2JR3jOY1Xw6cD58Zk+gnaBF1 LZ6foO2KcXV/xmcV+VDMdTuNVO8djuadNVQbfqX0LONHxku0Yfnee/S9oLDh9129MiyQ wS4Q== X-Received: by 10.50.202.97 with SMTP id kh1mr16748010igc.15.1358976425937; Wed, 23 Jan 2013 13:27:05 -0800 (PST) MIME-Version: 1.0 Received: by 10.64.16.73 with HTTP; Wed, 23 Jan 2013 13:26:35 -0800 (PST) In-Reply-To: References: <20130122073641.GH30633@server.rulingia.com> From: Chris Rees Date: Wed, 23 Jan 2013 21:26:35 +0000 Message-ID: Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again.
To: Wojciech Puchar Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-fs@freebsd.org" , "freebsd-hackers@freebsd.org" , Mark Felder X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:27:06 -0000 On 23 January 2013 21:24, Wojciech Puchar wrote: >> >> I've heard this same thing -- every vdev == 1 drive in performance. I've >> never seen any proof/papers on it though. > > read original ZFS papers. No, you are making the assertion, provide a link. Chris From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:40:44 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 6E2811E6; Wed, 23 Jan 2013 21:40:44 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id BAF5C15F; Wed, 23 Jan 2013 21:40:43 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0NLeesC002071; Wed, 23 Jan 2013 22:40:40 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0NLedY5002068; Wed, 23 Jan 2013 22:40:39 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Wed, 23 Jan 2013 22:40:39 +0100 (CET) From: Wojciech Puchar To: Artem Belevich Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
In-Reply-To: Message-ID: References: <20130122073641.GH30633@server.rulingia.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Wed, 23 Jan 2013 22:40:40 +0100 (CET) Cc: freebsd-fs , FreeBSD Hackers , Mark Felder , Chris Rees X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:40:44 -0000 > "1 drive in performance" only applies to number of random i/o > operations vdev can perform. You still get increased throughput. I.e. > 5-drive RAIDZ will have 4x bandwidth of individual disks in vdev, but unless your work is serving movies it doesn't matter. From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:28:34 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 76C56B06 for ; Wed, 23 Jan 2013 21:28:34 +0000 (UTC) (envelope-from talon@lpthe.jussieu.fr) Received: from shiva.jussieu.fr (shiva.jussieu.fr [134.157.0.129]) by mx1.freebsd.org (Postfix) with ESMTP id E8B92B1 for ; Wed, 23 Jan 2013 21:28:33 +0000 (UTC) Received: from parthe.lpthe.jussieu.fr (parthe.lpthe.jussieu.fr [134.157.10.1]) by shiva.jussieu.fr (8.14.4/jtpda-5.4) with ESMTP id r0NLMopj014617 ; Wed, 23 Jan 2013 22:23:04 +0100 (CET) X-Ids: 168 Received: from [192.168.1.100] (sge91-2-82-227-32-26.fbx.proxad.net [82.227.32.26]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (Client did not present a certificate) by parthe.lpthe.jussieu.fr (Postfix) with ESMTPSA id 9B7082268D; Wed, 23 Jan 2013 22:22:49 +0100 (CET) From: Michel Talon Content-Type: multipart/signed; 
boundary="Apple-Mail=_60D7FFA2-FBE5-4978-BAFE-C56FF486A309"; protocol="application/pkcs7-signature"; micalg=sha1 Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. Date: Wed, 23 Jan 2013 22:22:48 +0100 Message-Id: <101D6382-BF57-43EB-A5FA-A63D4062F5FD@lpthe.jussieu.fr> To: Mark Felder Mime-Version: 1.0 (Apple Message framework v1283) X-Mailer: Apple Mail (2.1283) X-Miltered: at jchkmail.jussieu.fr with ID 510054AB.000 by Joe's j-chkmail (http : // j-chkmail dot ensmp dot fr)! X-j-chkmail-Enveloppe: 510054AB.000/134.157.10.1/parthe.lpthe.jussieu.fr/parthe.lpthe.jussieu.fr/ X-Mailman-Approved-At: Wed, 23 Jan 2013 21:45:15 +0000 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:28:34 -0000 --Apple-Mail=_60D7FFA2-FBE5-4978-BAFE-C56FF486A309 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii On Wed, 23 Jan 2013 14:26:43 -0600, Chris Rees wrote: > > So we have to take your word for it? > Provide a link if you're going to make assertions, or they're no more > than > your own opinion. I've heard this same thing -- every vdev == 1 drive in performance. I've never seen any proof/papers on it though. first google answer from request "raids performance" https://blogs.oracle.com/roch/entry/when_to_and_not_to Effectively, as a first approximation, an N-disk RAID-Z group will behave as a single device in terms of delivered random input IOPS. Thus a 10-disk group of devices each capable of 200-IOPS, will globally act as a 200-IOPS capable RAID-Z group. This is the price to pay to achieve proper data protection without the 2X block overhead associated with mirroring. 
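As a rough sketch of the first-approximation rule quoted above, together with the bandwidth remark made earlier in the thread (a 5-drive RAID-Z has about 4x the bandwidth of one disk), the following assumes every vdev in the pool is identical and that the pool's capacity for random I/O adds up across vdevs. The function names and the 100 MB/s per-disk figure are hypothetical and chosen only for illustration.

```python
def raidz_vdev_estimate(n_disks, parity, disk_iops, disk_mb_s):
    """First-approximation figures for a single RAID-Z vdev.

    Random reads touch all data disks of the vdev, so the vdev delivers
    about one disk's worth of IOPS. Streaming bandwidth scales with the
    number of data (non-parity) disks.
    """
    data_disks = n_disks - parity
    return {
        "random_read_iops": disk_iops,
        "streaming_mb_s": data_disks * disk_mb_s,
    }


def pool_estimate(n_vdevs, n_disks, parity, disk_iops, disk_mb_s):
    """Assume the pool spreads load evenly over n_vdevs identical vdevs."""
    vdev = raidz_vdev_estimate(n_disks, parity, disk_iops, disk_mb_s)
    return {name: n_vdevs * value for name, value in vdev.items()}


if __name__ == "__main__":
    # One 10-disk RAID-Z group of 200-IOPS disks, as in the quoted blog post.
    print(raidz_vdev_estimate(10, 1, 200, 100))
    # The same ten disks arranged as two 5-disk RAID-Z vdevs.
    print(pool_estimate(2, 5, 1, 200, 100))
```

Splitting the same disks into more, narrower vdevs trades some usable capacity and streaming bandwidth for more random IOPS, which is the reasoning behind the advice to build a pool out of multiple RAID-Z vdevs.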
-- Michel Talon talon@lpthe.jussieu.fr --Apple-Mail=_60D7FFA2-FBE5-4978-BAFE-C56FF486A309-- From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:49:38 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 074AB93B; Wed, 23 Jan 2013 21:49:38 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-vb0-f47.google.com (mail-vb0-f47.google.com [209.85.212.47]) by mx1.freebsd.org (Postfix) with ESMTP id AB8B6204; Wed, 23 Jan 2013 21:49:37 +0000 (UTC) Received: by mail-vb0-f47.google.com with SMTP id e21so1664343vbm.6 for ; Wed, 23 Jan 2013 13:49:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=z4/E1kLIId/DXawG3IwyNiUPD7HpgHwVXChoKr62Ehs=; b=otq09/+ZHVb0J5x6vQY6LCtcvyBozJEi7CUmJ9PybWB0PbtqGiYL5Ji9lr+JL9I0Gd NIfIuY17oqebpp93lkjUM0p+nUce+/f18QkKfSus6DExRPptLDWvfcXox/PC5Dsz68Z5 iQ4vHz14I4l8wOHMvyFFpG/K1TL4zh3U6KhGLDJjcVGv1yvq/mWCEgAAFL4SIe+z/tPx nD2EUdxovoIp/PsHJC5Wa4HLw9/+miFcQqOk4waDE7c/BjX6g6T4QXb7NRmIV8Z4H2ki /3/kd6qL5Nujb0G/4y9EYSyFC8wehQ4IjZGf4mqujActecSvqEOGRxwKuon+1nfvKl0o 8fhw== MIME-Version: 1.0 X-Received: by 10.220.156.197 with SMTP id y5mr3146793vcw.17.1358977777180; Wed, 23 Jan 2013 13:49:37 -0800 (PST) Sender: artemb@gmail.com Received: by 10.220.123.2 with HTTP; Wed, 23 Jan 2013 13:49:37 -0800 (PST) In-Reply-To: References: <20130122073641.GH30633@server.rulingia.com> Date: Wed, 23 Jan 2013 13:49:37 -0800 X-Google-Sender-Auth: vIBQ5fJ98BDdbeD0elQH_y-D0JM Message-ID: Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again.
From: Artem Belevich To: Wojciech Puchar Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:49:38 -0000 On Wed, Jan 23, 2013 at 1:25 PM, Wojciech Puchar wrote: >>> gives single drive random I/O performance. >> >> >> For reads - true. For writes it's probably behaves better than RAID5 > > > yes, because as with reads it gives single drive performance. small writes > on RAID5 gives lower than single disk performance. > > >> If you need higher performance, build your pool out of multiple RAID-Z >> vdevs. > > even you need normal performance use gmirror and UFS I've no objection. If it works for you -- go for it. For me personally ZFS performance is good enough, and data integrity verification is something that I'm willing to sacrifice some performance for. ZFS scrub gives me either warm and fuzzy feeling that everything is OK, or explicitly tells me that something bad happened *and* reconstructs the data if it's possible. 
Just my $0.02, --Artem From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:49:56 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 08A65A87; Wed, 23 Jan 2013 21:49:56 +0000 (UTC) (envelope-from to.my.trociny@gmail.com) Received: from mail-la0-f46.google.com (mail-la0-f46.google.com [209.85.215.46]) by mx1.freebsd.org (Postfix) with ESMTP id EAE7020B; Wed, 23 Jan 2013 21:49:54 +0000 (UTC) Received: by mail-la0-f46.google.com with SMTP id fq12so6911485lab.33 for ; Wed, 23 Jan 2013 13:49:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:sender:date:from:to:cc:subject:message-id:references :mime-version:content-type:content-disposition:in-reply-to :user-agent; bh=/lP5uzRRBXQPOa5C3Dk4kV1he7/AuOyhXWxXG83S2Yg=; b=bug4+mu0bPCUHR8nYlKGXGnt3ukejEpT7oze6em4sHcKlJPAQnL/cc+laMGEG0nAQQ u22pLoj/k9tJSJpHgcb63QCLKwWa/c/8tYAmwFn0f1c/aRatJYpsPKh57/8Kf3TQQG4M ICovVJjb36qop6gAiIJNsyTv6QQVRykBkdkoc/bxxArP9swdRMRZyKZgY47cs9F5V+pn 3EWuC4BooXJ1HbeZg0/6TbOmsR3ReE46qoAC6mbPOt0uboeu0hGQiRKhlZR7gav66yT0 aQdc6/ZvakQPZ50h6vC7JljwMy+Yzc3sviBoWzHb04qQl8nCn0tHazhN+DHE9Hus+NSp p4MA== X-Received: by 10.152.121.212 with SMTP id lm20mr2715734lab.42.1358977793621; Wed, 23 Jan 2013 13:49:53 -0800 (PST) Received: from localhost ([178.150.115.244]) by mx.google.com with ESMTPS id er8sm8886435lbb.9.2013.01.23.13.49.51 (version=TLSv1.2 cipher=RC4-SHA bits=128/128); Wed, 23 Jan 2013 13:49:52 -0800 (PST) Sender: Mikolaj Golub Date: Wed, 23 Jan 2013 23:49:50 +0200 From: Mikolaj Golub To: John Baldwin Subject: Re: libprocstat(3): retrieve process command line args and environment Message-ID: <20130123214949.GA3120@gmail.com> References: <20130119151253.GB88025@gmail.com> <9679EEE4-BE52-493E-9188-CAECEE5E63D3@freebsd.org> <20130123072459.GA48402@gmail.com> <201301231131.43972.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: 
text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201301231131.43972.jhb@freebsd.org> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: Stanislav Sedov , freebsd-hackers@freebsd.org, Robert Watson X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:49:56 -0000 On Wed, Jan 23, 2013 at 11:31:43AM -0500, John Baldwin wrote: > On Wednesday, January 23, 2013 2:25:00 am Mikolaj Golub wrote: > > IMHO, after adding procstat_getargv and procstat_getargv, the usage of > > kvm_getargv() and kvm_getenvv() (at least in the new code) may be > > deprecated. As this is stated in the man page, BUGS section, "these > > routines do not belong in the kvm interface". I suppose they are part > > of libkvm because there was no a better place for them. procstat(1) > > prefers direct sysctl to them (so, again, code duplication, which I am > > going to remove adding procstat_getargv/envv). > > Hmm, are you going to rewrite ps(1) to use libprocstat? Or rather, is that a > goal someday? That is one current consumer of kvm_getargv/envv. That might > be fine if we want to make more tools use libprocstat instead of using libkvm > directly. I didn't have any plans for ps(1) :-) That is why I wrote about "new code". But if you think it is good to do I might look at it one day... 
-- Mikolaj Golub From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:52:27 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id C684BC5A; Wed, 23 Jan 2013 21:52:27 +0000 (UTC) (envelope-from ndenev@gmail.com) Received: from mail-bk0-f42.google.com (mail-bk0-f42.google.com [209.85.214.42]) by mx1.freebsd.org (Postfix) with ESMTP id 2D11723D; Wed, 23 Jan 2013 21:52:26 +0000 (UTC) Received: by mail-bk0-f42.google.com with SMTP id ji2so4817147bkc.1 for ; Wed, 23 Jan 2013 13:52:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:subject:mime-version:content-type:from:in-reply-to:date :cc:message-id:references:to:x-mailer; bh=XiQiaGkVwDhhJNAEPPX1sWe7zXw4tcQ5J6NnQz+bXkY=; b=HAU8BQsV7QPkb0WccBWwF+B9+c7yab46DnAnuWig7HgRqHhEYHTtLqNwjyABhl3+oq uhO4PBu3qRQzPJjm/Uxq1yIxs9MgxFIIMBfh+mOtoCLMy+leCzo4xqenJ4PIPuoEWXF5 wIr6RBuuMYj2qQOtxWGB0Q0LjXCpG69sRroCg+gVB6opFEGc+FTy7GZk37l+t7sXnxcn SMKHxNfC+pk3xYKqFHv6VcJisy0W6VAnQ+oD8L5Zq5+swnLKDgmTWSmkKg7jW9BsYqK7 74jct2gjtLuPng0snCnbRPDD4K2Q6D3pAOJHlftbQ/CgTL01QEXSR8q06V3k1oKlOfdu 7wCw== X-Received: by 10.205.122.9 with SMTP id ge9mr1020974bkc.59.1358977940417; Wed, 23 Jan 2013 13:52:20 -0800 (PST) Received: from [10.0.0.3] ([93.152.184.10]) by mx.google.com with ESMTPS id v8sm15533709bku.6.2013.01.23.13.52.18 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 23 Jan 2013 13:52:19 -0800 (PST) Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) From: Nikolay Denev In-Reply-To: Date: Wed, 23 Jan 2013 23:52:21 +0200 Message-Id: <6DBE5200-47E6-4D00-AB25-83CB5250DFC0@gmail.com> References: <20130122073641.GH30633@server.rulingia.com> To: Mark Felder X-Mailer: Apple Mail (2.1499) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org, Wojciech Puchar , Chris Rees , freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:52:27 -0000 On Jan 23, 2013, at 11:09 PM, Mark Felder wrote: > On Wed, 23 Jan 2013 14:26:43 -0600, Chris Rees wrote: > >> >> So we have to take your word for it? >> Provide a link if you're going to make assertions, or they're no more than >> your own opinion. > > I've heard this same thing -- every vdev == 1 drive in performance. I've never seen any proof/papers on it though.
> _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" Here is a blog post that describes why this is true for IOPS: http://constantin.glez.de/blog/2010/04/ten-ways-easily-improve-oracle-solaris-zfs-filesystem-performance From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 21:53:58 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 2989BDBD for ; Wed, 23 Jan 2013 21:53:58 +0000 (UTC) (envelope-from utisoft@gmail.com) Received: from mail-ie0-f179.google.com (mail-ie0-f179.google.com [209.85.223.179]) by mx1.freebsd.org (Postfix) with ESMTP id EE66E25B for ; Wed, 23 Jan 2013 21:53:57 +0000 (UTC) Received: by mail-ie0-f179.google.com with SMTP id k14so14401536iea.38 for ; Wed, 23 Jan 2013 13:53:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:mime-version:in-reply-to:references:from:date:message-id :subject:to:cc:content-type; bh=TCi2TIgpprzb6yn/frdpJfCPM7eAnbwirxbPdf+6fDI=; b=KK2MdJ+jDdgRfOYN4vQ74tvmhQSA8a/tNAFVK2ayEtBi8jNAzeysEdaju3huEa9m/L zmyWHDtBzgZVUQ4x6SCMGAxjlI8y3KEzGmTC3WRTC6XuuIMjppdKTxRfqpMe8NMPC1FA pmE8put5FS+xvqf7u+Ye7iHMfy5XYP5Qak3d9rYmQqgOkQhYEKCPBciE5wEuJDzkZ1AS LA4PbgInCVphXVFHmffMBxR6+ILUMcVUal/f2JkRPfz/6v8jTrxMkcOkyGfLAox1N4mI pbPo8/fQl/QIKwEvbosmja+AGdpy1UiSP0R58rEgf3tPszDF12T3lJW/iVFjr7g3fROo 3WNQ== X-Received: by 10.42.92.72 with SMTP id s8mr2204154icm.0.1358978037559; Wed, 23 Jan 2013 13:53:57 -0800 (PST) MIME-Version: 1.0 Received: by 10.64.16.73 with HTTP; Wed, 23 Jan 2013 13:53:27 -0800 (PST) In-Reply-To: <101D6382-BF57-43EB-A5FA-A63D4062F5FD@lpthe.jussieu.fr> References: <101D6382-BF57-43EB-A5FA-A63D4062F5FD@lpthe.jussieu.fr> From: Chris Rees Date: Wed, 23 Jan 2013 21:53:27 +0000 Message-ID: Subject: Re: ZFS
regimen: scrub, scrub, scrub and scrub again. To: Michel Talon Content-Type: text/plain; charset=ISO-8859-1 Cc: "freebsd-hackers@freebsd.org" , Mark Felder X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 21:53:58 -0000 On 23 Jan 2013 21:45, "Michel Talon" wrote: > > On Wed, 23 Jan 2013 14:26:43 -0600, Chris Rees wrote: > > > > > So we have to take your word for it? > > Provide a link if you're going to make assertions, or they're no more > > than > > your own opinion. > > I've heard this same thing -- every vdev == 1 drive in performance. I've > never seen any proof/papers on it though. > > > first google answer from request "raids performance" > https://blogs.oracle.com/roch/entry/when_to_and_not_to > > Effectively, as a first approximation, an N-disk RAID-Z group will > behave as a single device in terms of delivered random input > IOPS. Thus a 10-disk group of devices each capable of 200-IOPS, will > globally act as a 200-IOPS capable RAID-Z group. This is the price to > pay to achieve proper data protection without the 2X block overhead > associated with mirroring. Thanks for the link, but I could have done that; I am attempting to explain to Wojciech that his habit of making bold assertions and arrogantly refusing to back them up makes for frustrating reading. 
Chris From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 22:21:50 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 3788A571 for ; Wed, 23 Jan 2013 22:21:50 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id 905AA369 for ; Wed, 23 Jan 2013 22:21:49 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0NMLjaq002306; Wed, 23 Jan 2013 23:21:45 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0NMLjC4002303; Wed, 23 Jan 2013 23:21:45 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Wed, 23 Jan 2013 23:21:45 +0100 (CET) From: Wojciech Puchar To: Chris Rees Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. In-Reply-To: Message-ID: References: <101D6382-BF57-43EB-A5FA-A63D4062F5FD@lpthe.jussieu.fr> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Wed, 23 Jan 2013 23:21:45 +0100 (CET) Cc: "freebsd-hackers@freebsd.org" , Mark Felder , Michel Talon X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 22:21:50 -0000 >> associated with mirroring. 
> > Thanks for the link, but I could have done that; I am attempting to > explain to Wojciech that his habit of making bold assertions and as you can see it is not a bold assertion; you just use something without even reading its docs, not to mention doing any more research. From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 22:27:56 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id D61F6757; Wed, 23 Jan 2013 22:27:56 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id EA5E43D5; Wed, 23 Jan 2013 22:27:55 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0NMRs2Y002335; Wed, 23 Jan 2013 23:27:54 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0NMRr3K002332; Wed, 23 Jan 2013 23:27:54 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Wed, 23 Jan 2013 23:27:53 +0100 (CET) From: Wojciech Puchar To: Artem Belevich Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again.
In-Reply-To: Message-ID: References: <20130122073641.GH30633@server.rulingia.com> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Wed, 23 Jan 2013 23:27:54 +0100 (CET) Cc: freebsd-fs , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 22:27:56 -0000 >> >> even you need normal performance use gmirror and UFS > > I've no objection. If it works for you -- go for it. both "works". For todays trend of solving everything by more hardware ZFS may even have "enough" performance. But still it is dangerous for a reasons i explained, as well as it promotes bad setups and layouts like making single filesystem out of large amount of disks. This is bad for no matter what filesystem and RAID setup you use, or even what OS. 
From owner-freebsd-hackers@FreeBSD.ORG Wed Jan 23 22:39:52 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id BC9B3A2A; Wed, 23 Jan 2013 22:39:52 +0000 (UTC) (envelope-from sendtomatt@gmail.com) Received: from mail-pb0-f52.google.com (mail-pb0-f52.google.com [209.85.160.52]) by mx1.freebsd.org (Postfix) with ESMTP id 3884C641; Wed, 23 Jan 2013 22:39:52 +0000 (UTC) Received: by mail-pb0-f52.google.com with SMTP id uo5so39382pbc.11 for ; Wed, 23 Jan 2013 14:39:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:cc :subject:references:in-reply-to:x-enigmail-version:content-type :content-transfer-encoding; bh=bd3sz9qPeMWHGxqmErArzCpY3zda65LV918si2p09y4=; b=K0q2Bt7K/faC6QCw9z6UjxJ3ir4W3AGafxkmOH6FtNsF0PFr1iCWIk9GutCYlBW/Px ZxHaG05+2U38K8cyudKzL2BC2/RIO5vn3lCUq6j1NzIJHXIYgnYGqKd9WerhOHFh3aks De3QDz9E1j8ZBJ3ZhiRiRE7kJ3C7cxYjhQC14Fj+yPjaQpyyLzFmtVlGd9PucKXKqVh1 U5p/Uev3j7U11dubDJ6+NXF4ue3qkR4IUCnJCnohVyC4C/3pE/e3bCRh7mCvjcF/ObwY lPxvhG1Q5OzD2ctbum/Ts9HEW/6MEaB+TnG2QlHsvemEbzqSyDLxth3qNSqU/nmiZZh9 /4lA== X-Received: by 10.68.239.232 with SMTP id vv8mr7292677pbc.53.1358980483013; Wed, 23 Jan 2013 14:34:43 -0800 (PST) Received: from bakeneko.local (108-213-216-134.lightspeed.sntcca.sbcglobal.net. [108.213.216.134]) by mx.google.com with ESMTPS id l5sm12460471pax.10.2013.01.23.14.34.40 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Wed, 23 Jan 2013 14:34:41 -0800 (PST) Message-ID: <51006548.7050807@gmail.com> Date: Wed, 23 Jan 2013 14:33:44 -0800 From: matt User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.11) Gecko/20121203 Thunderbird/10.0.11 MIME-Version: 1.0 To: Wojciech Puchar Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
References: <20130122073641.GH30633@server.rulingia.com> In-Reply-To: X-Enigmail-Version: 1.3.5 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs , Artem Belevich , FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jan 2013 22:39:52 -0000 On 01/23/13 14:27, Wojciech Puchar wrote: >> > > both "works". For todays trend of solving everything by more hardware > ZFS may even have "enough" performance. > > But still it is dangerous for a reasons i explained, as well as it > promotes bad setups and layouts like making single filesystem out of > large amount of disks. This is bad for no matter what filesystem and > RAID setup you use, or even what OS. > > ZFS mirror performance is quite good (both random IO and sequential), and resilvers/scrubs are measured in an hour or less. You can always make pool out of these instead of RAIDZ if you can get away with less total available space. I think RAIDZ vs Gmirror is a bad comparison, you can use a ZFS mirror with all the ZFS features, plus N-way (not sure if gmirror does this). 
Regarding single large filesystems, there is an old saying about not putting all your eggs into one basket, even if it's a great basket :) Matt From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 13:13:04 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id CB9D6CC2; Thu, 24 Jan 2013 13:13:04 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id 7FA9E76C; Thu, 24 Jan 2013 13:13:04 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id AA1DD47E11; Thu, 24 Jan 2013 14:12:56 +0100 (CET) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.4 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.0.2] (unknown [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id 08DB247DE6; Thu, 24 Jan 2013 14:12:56 +0100 (CET) Message-ID: <51013345.8010701@platinum.linux.pl> Date: Thu, 24 Jan 2013 14:12:37 +0100 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: freebsd-fs@freebsd.org Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. References: <20130122073641.GH30633@server.rulingia.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 13:13:04 -0000 On 2013-01-23 21:22, Wojciech Puchar wrote: >>> While RAID-Z is already a king of bad performance, >> >> I don't believe RAID-Z is any worse than RAID5. 
Do you have any actual
>> measurements to back up your claim?
>
> it is clearly described even in ZFS papers. Both on reads and writes it
> gives single drive random I/O performance.

With ZFS and RAID-Z the situation is a bit more complex.

Let's assume a 5-disk raidz1 vdev with ashift=9 (512-byte sectors).

A worst-case scenario could happen if your random I/O workload was reading
random files, each of 2048 bytes. Each file read would require data from 4
disks (the 5th is parity and won't be read unless there are errors). However,
if files were 512 bytes or less, then only one disk would be used. 1024 bytes -
two disks, etc.

So ZFS is probably not the best choice to store millions of small files if
random access to whole files is the primary concern.

But let's look at a different scenario - a PostgreSQL database. Here table
data is split and stored in 1GB files. ZFS splits each file into 128KiB
records (recordsize property). Each record is then split again into 4
columns of 32768 bytes each. A 5th column is generated containing parity. Each
column is then stored on a different disk. You could think of it as a
regular RAID-5 with a stripe size of 32768 bytes.

PostgreSQL uses 8192-byte pages that fit evenly into both the ZFS record size
and the column size. Each page access requires only a single disk read. Random
I/O performance here should be 5 times that of a single disk.

For me the reliability ZFS offers is far more important than pure
performance.
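
The arithmetic in the message above can be checked with a short sketch. This is only the simplified model the message describes (a 5-disk raidz1 with 4 data columns and 512-byte sectors), not how ZFS itself computes its layout; the function names and default parameters are made up for illustration.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return -(-a // b)


def disks_read(size_bytes, data_disks=4, sector=512):
    """Data disks touched when reading one whole block of size_bytes.

    Simplified model: the block is cut into sectors and the sectors are
    spread over the data disks, so small blocks touch fewer disks.
    """
    sectors = max(1, ceil_div(size_bytes, sector))
    return min(sectors, data_disks)


def column_bytes(record=128 * 1024, data_disks=4, sector=512):
    """Bytes of one record stored on each data disk (one 'column')."""
    sectors = ceil_div(record, sector)
    return ceil_div(sectors, data_disks) * sector


def columns_touched(offset, length=8192, record=128 * 1024,
                    data_disks=4, sector=512):
    """Columns (disks) read for `length` bytes at `offset` in a file.

    Assumes the read does not cross a record boundary.
    """
    col = column_bytes(record, data_disks, sector)
    first = (offset % record) // col
    last = (offset % record + length - 1) // col
    return last - first + 1


if __name__ == "__main__":
    for size in (512, 1024, 2048):
        print(size, "bytes ->", disks_read(size), "data disk(s)")
    print("column size:", column_bytes(), "bytes")
    aligned = {columns_touched(off) for off in range(0, 128 * 1024, 8192)}
    print("disks per aligned 8 KiB page:", aligned)
```

Under this model a whole-file read of 2048 bytes touches all four data disks, while every 8192-byte page aligned inside a 128KiB record falls within a single 32768-byte column, which is the difference between the two workloads described above.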
From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 14:24:40 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id D2A0FA70; Thu, 24 Jan 2013 14:24:40 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id 37800A5F; Thu, 24 Jan 2013 14:24:39 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0OEObtO005693; Thu, 24 Jan 2013 15:24:37 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0OEObiI005690; Thu, 24 Jan 2013 15:24:37 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Thu, 24 Jan 2013 15:24:37 +0100 (CET) From: Wojciech Puchar To: Adam Nowacki Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. In-Reply-To: <51013345.8010701@platinum.linux.pl> Message-ID: References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Thu, 24 Jan 2013 15:24:37 +0100 (CET) Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 14:24:40 -0000 > then stored on a different disk. You could think of it as a regular RAID-5 > with stripe size of 32768 bytes. 
> > PostgreSQL uses 8192 byte pages that fit evenly both into ZFS record size and > column size. Each page access requires only a single disk read. Random i/o > performance here should be 5 times that of a single disk. think about writing 8192 byte pages randomly. and then doing linear search over table. > > For me the reliability ZFS offers is far more important than pure > performance. Except it is on paper reliability. From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 14:45:55 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 08BD6567; Thu, 24 Jan 2013 14:45:55 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-lb0-f178.google.com (mail-lb0-f178.google.com [209.85.217.178]) by mx1.freebsd.org (Postfix) with ESMTP id 5F4F9BE9; Thu, 24 Jan 2013 14:45:54 +0000 (UTC) Received: by mail-lb0-f178.google.com with SMTP id n1so4755178lba.23 for ; Thu, 24 Jan 2013 06:45:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=oRSFPzW30F41pK0pa7TGi3bqr8iGepeaoQzU5EOC5QE=; b=LBeEA0fN2+Yjzl4ytf9aDe/LDyMReEDJj2P0dU1tDdC6Nznu7so6zEznfhuQ306/Wx GQaOCHvygKB+RDW2ryYWqKPpjXxEHTeqcFtECaS9Tx9jHu2gihaOWonkG1qJw0S2xVWY pIlgdA1AkmfZslHIdLFgg7oK/vLXNaG8zHh2g9ULD5KB4m3uxs+lhSRVych6Ai495LYR TIGrg49KWkUZozFpnOkfKO8qaq3LfIbU8K73DoDuXnRO9G7oVxIP2vEw2BWtpVAH9KaD jvij6YquNyP1BUg2zhxdHTKIshNQesQP2IgktcbzNCEzZakGISB7YLAj9O0IYZfmfs99 UA3A== MIME-Version: 1.0 X-Received: by 10.112.38.67 with SMTP id e3mr872339lbk.105.1359038753054; Thu, 24 Jan 2013 06:45:53 -0800 (PST) Received: by 10.112.6.38 with HTTP; Thu, 24 Jan 2013 06:45:52 -0800 (PST) In-Reply-To: <51013345.8010701@platinum.linux.pl> References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> Date: Thu, 24 Jan 2013 09:45:52 -0500 Message-ID: 
Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. From: Zaphod Beeblebrox To: Adam Nowacki Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 14:45:55 -0000 Wow!.! OK. It sounds like you (or someone like you) can answer some of my burning questions about ZFS. On Thu, Jan 24, 2013 at 8:12 AM, Adam Nowacki wrote: > Lets assume 5 disk raidz1 vdev with ashift=9 (512 byte sectors). > > A worst case scenario could happen if your random i/o workload was reading > random files each of 2048 bytes. Each file read would require data from 4 > disks (5th is parity and won't be read unless there are errors). However if > files were 512 bytes or less then only one disk would be used. 1024 bytes - > two disks, etc. > > So ZFS is probably not the best choice to store millions of small files if > random access to whole files is the primary concern. > > But lets look at a different scenario - a PostgreSQL database. Here table > data is split and stored in 1GB files. ZFS splits the file into 128KiB > records (recordsize property). This record is then again split into 4 > columns each 32768 bytes. 5th column is generated containing parity. Each > column is then stored on a different disk. You could think of it as a > regular RAID-5 with stripe size of 32768 bytes. > Ok... so my question then would be... what of the small files. If I write several small files at once, does the transaction use a record, or does each file need to use a record? Additionally, if small files use sub-records, when you delete that file, does the sub-record get moved or just wasted (until the record is completely free)? 
I'm considering the difference, say, between cyrus imap (one file per
message on ZFS, database files on a different ZFS filesystem) and dbmail imap
(postgresql on ZFS).

... now I realize that PostgreSQL on ZFS has some special issues (but I
don't have a choice here between ZFS and non-ZFS ... ZFS has already been
chosen), but I'm also figuring that PostgreSQL on ZFS has some waste
compared to cyrus IMAP on ZFS.

So far in my research, Cyrus makes some compelling arguments that the
common use case of most IMAP database files is full scan --- for which its
database files are optimized and SQL-based files are not. I agree that some
operations can be more efficient in a good SQL database, but full scan (as
the most often used query) is not.

Cyrus also makes sense to me as a collection of small files ... for which I
expect ZFS to excel... including the ability to snapshot with impunity...
but I am terribly curious how the files are handled in transactions.

I'm actually (right now) running some filesize statistics (and I'll get
back to the list, if asked), but I'd like to know how ZFS is going to store
the arriving mail... :).
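
One way such file-size statistics could be gathered is sketched below. It walks a directory tree and counts files per power-of-two size bucket. The starting bucket of 512 bytes and the names bucket, histogram and file_sizes are assumptions made for this illustration, not something taken from the message.

```python
import os
import sys
from collections import Counter


def bucket(size, smallest=512):
    """Smallest power-of-two bucket (starting at `smallest`) that holds size."""
    b = smallest
    while size > b:
        b *= 2
    return b


def histogram(sizes, smallest=512):
    """Map each bucket's upper bound to the number of sizes that fall in it."""
    counts = Counter(bucket(s, smallest) for s in sizes)
    return dict(sorted(counts.items()))


def file_sizes(root):
    """Yield the size in bytes of every file found below root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                yield os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue


if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for upper, count in histogram(file_sizes(root)).items():
        print(f"{upper:>12} : {count}")
```

Run against a mail spool, the output shows how many messages would fit in one 512-byte sector, in two, and so on, which is the number that matters for the per-file overhead being asked about.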
From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 14:53:19 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 9E795A2F; Thu, 24 Jan 2013 14:53:19 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id 05EDFCE4; Thu, 24 Jan 2013 14:53:18 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0OErHwx005776; Thu, 24 Jan 2013 15:53:17 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0OErHh0005773; Thu, 24 Jan 2013 15:53:17 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Thu, 24 Jan 2013 15:53:17 +0100 (CET) From: Wojciech Puchar To: Zaphod Beeblebrox Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. In-Reply-To: Message-ID: References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Thu, 24 Jan 2013 15:53:17 +0100 (CET) Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org, Adam Nowacki X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 14:53:19 -0000 > several small files at once, does the transaction use a record, or does > each file need to use a record? 
Additionally, if small files use > sub-records, when you delete that file, does the sub-record get moved or > just wasted (until the record is completely free)? writes of small files are always good with ZFS. From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 14:54:34 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0F4F2B98; Thu, 24 Jan 2013 14:54:34 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id BDC46CFF; Thu, 24 Jan 2013 14:54:33 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id 1BF1147E11; Thu, 24 Jan 2013 15:54:32 +0100 (CET) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.4 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.0.2] (unknown [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id C7BC447DE6; Thu, 24 Jan 2013 15:54:31 +0100 (CET) Message-ID: <51014B28.8070404@platinum.linux.pl> Date: Thu, 24 Jan 2013 15:54:32 +0100 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: Wojciech Puchar Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 14:54:34 -0000 On 2013-01-24 15:24, Wojciech Puchar wrote: >> For me the reliability ZFS offers is far more important than pure >> performance. > Except it is on paper reliability. This "on paper" reliability in practice saved a 20TB pool. See one of my previous emails. Any other filesystem or hardware/software raid without per-disk checksums would have failed. Silent corruption of non-important files would be the best case, complete filesystem death by important metadata corruption as the worst case. I've been using ZFS for 3 years in many systems. Biggest one has 44 disks and 4 ZFS pools - this one survived SAS expander disconnects, a few kernel panics and countless power failures (UPS only holds for a few hours). So far I've not lost a single ZFS pool or any data stored. 
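
The point about checksums can be illustrated with a toy example. This is not ZFS code: it only shows the general idea that a checksum stored apart from the data lets a reader tell a good copy from a silently corrupted one, which a plain mirror cannot do. SHA-256 is used here purely for convenience, and the function names are hypothetical.

```python
import hashlib


def checksum(data: bytes) -> bytes:
    """Checksum recorded when the block is written (SHA-256 for illustration)."""
    return hashlib.sha256(data).digest()


def read_verified(copies, expected):
    """Return (index, data) of the first copy that matches the checksum.

    `copies` holds the same block as read back from each disk. A mirror
    without checksums would have to trust whichever copy it read first.
    """
    for index, data in enumerate(copies):
        if checksum(data) == expected:
            return index, data
    raise IOError("all copies failed checksum verification")


if __name__ == "__main__":
    good = b"important metadata"
    expected = checksum(good)
    # The first disk returns silently corrupted data, the second is intact.
    copies = [b"importent metadata", good]
    index, data = read_verified(copies, expected)
    print("served from copy", index, "->", data)
```

A real implementation would also rewrite the bad copy from the good one; that repair step is left out of this sketch.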
From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 15:37:09 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id D7ACECCB for ; Thu, 24 Jan 2013 15:37:09 +0000 (UTC) (envelope-from nowakpl@platinum.linux.pl) Received: from platinum.linux.pl (platinum.edu.pl [81.161.192.4]) by mx1.freebsd.org (Postfix) with ESMTP id 97ECCFAF for ; Thu, 24 Jan 2013 15:37:09 +0000 (UTC) Received: by platinum.linux.pl (Postfix, from userid 87) id 907D947E11; Thu, 24 Jan 2013 16:37:07 +0100 (CET) X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on platinum.linux.pl X-Spam-Level: X-Spam-Status: No, score=-1.4 required=3.0 tests=ALL_TRUSTED,AWL autolearn=disabled version=3.3.2 Received: from [10.255.0.2] (unknown [83.151.38.73]) by platinum.linux.pl (Postfix) with ESMTPA id AD23147DE6 for ; Thu, 24 Jan 2013 16:37:07 +0100 (CET) Message-ID: <51015523.2060701@platinum.linux.pl> Date: Thu, 24 Jan 2013 16:37:07 +0100 From: Adam Nowacki User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: freebsd-hackers@freebsd.org Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 15:37:09 -0000 On 2013-01-24 15:45, Zaphod Beeblebrox wrote: > Ok... so my question then would be... what of the small files. If I write > several small files at once, does the transaction use a record, or does > each file need to use a record? 
Additionally, if small files use > sub-records, when you delete that file, does the sub-record get moved or > just wasted (until the record is completely free)? Each file is a fully self-contained object (together with full parity) all the way to the physical storage. A 1 byte file on RAID-Z2 pool will always use 3 disks, 3 sectors total for data alone. You can use du to verify - it reports physical size together with parity. Metadata like directory entry or file attributes is stored separately and shared with other files. For small files there may be a lot of "wasted" space. From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 16:38:00 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 62D7251F for ; Thu, 24 Jan 2013 16:38:00 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-lb0-f172.google.com (mail-lb0-f172.google.com [209.85.217.172]) by mx1.freebsd.org (Postfix) with ESMTP id CF0AA3F4 for ; Thu, 24 Jan 2013 16:37:59 +0000 (UTC) Received: by mail-lb0-f172.google.com with SMTP id n8so5792351lbj.17 for ; Thu, 24 Jan 2013 08:37:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=tESeIrVBrTVOSHvnOmI+LtdRwZEiK2KJb5kQoWC2PBg=; b=DiR3OGyvFYRhUXuQ0aYI7LFly4An8xmbywFMYCHnytIB52MFYmE8o5iO3DRHC8hAd1 9gfEdD68xGaR1zxI4enV5TsD8qGEX1nwS6grkbwAKr+dA1rNwYjGVq7RhyAROiqVdFcp YdEAtitEG6ynoDmCQNJIhue0aPl2i/SOETxj/RJgB63oNlGRWl9qRdhqlxY9hRQCibTn YGlhsPli70I+kptsZPM5uj+YRUYDcFxocDcYEl0alXqyUw1W1nxrBC7VwsNQyJHGRiSh QJ0gC9g5kph9y/R63Aif5r51SlxMBORuc9i1y2Ksc9sM3r19/SmM/Baz23bcs+9cFVZ4 bWCA== MIME-Version: 1.0 X-Received: by 10.152.131.168 with SMTP id on8mr2459571lab.38.1359045478464; Thu, 24 Jan 2013 08:37:58 -0800 (PST) Received: by 10.112.6.38 with HTTP; Thu, 24 Jan 2013 08:37:58 -0800 (PST) In-Reply-To: 
<51015523.2060701@platinum.linux.pl> References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> <51015523.2060701@platinum.linux.pl> Date: Thu, 24 Jan 2013 11:37:58 -0500 Message-ID: Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. From: Zaphod Beeblebrox To: Adam Nowacki Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 16:38:00 -0000

Ok... here's the existing data:

There are 3,236,316 files summing to 97,500,008,691 bytes. That puts the
"average" file at 30,127 bytes. But for the full breakdown:

     512 : 7758
    1024 : 139046
    2048 : 1468904
    4096 : 325375
    8192 : 492399
   16384 : 324728
   32768 : 263210
   65536 : 102407
  131072 : 43046
  262144 : 22259
  524288 : 17136
 1048576 : 13788
 2097152 : 8279
 4194304 : 4501
 8388608 : 2317
16777216 : 1045
33554432 : 119
67108864 : 2

I produced that list with the output of ls -R's byte counts, sorted and
then processed with:

(while read num; do count=$[count+1]; if [ $num -gt $size ]; then echo $size : $count;size=$[size*2]; count=0; fi; done)

Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id A13D54D2 for ; Thu, 24 Jan 2013 18:46:04 +0000 (UTC) (envelope-from yuri@rawbw.com) Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45]) by mx1.freebsd.org (Postfix) with ESMTP id 6E685DE1 for ; Thu, 24 Jan 2013 18:46:04 +0000 (UTC) Received: from eagle.yuri.org (localhost [127.0.0.1]) (authenticated bits=0) by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id r0OIjwJM081678; Thu, 24 Jan 2013 10:45:58 -0800 (PST) (envelope-from yuri@rawbw.com) Message-ID: <51018166.4000706@rawbw.com> Date:
Thu, 24 Jan 2013 10:45:58 -0800 From: Yuri User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130112 Thunderbird/17.0.2 MIME-Version: 1.0 To: Ryan Stone Subject: Re: Why DTrace sensor is listed but not called? References: <50FEEB6C.7090303@rawbw.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: FreeBSD Hackers X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 18:46:04 -0000 On 01/22/2013 16:03, Ryan Stone wrote: > Offhand, I can't of why this isn't working. However there is already a way > to add new DTrace probes to the kernel, and it's quite simple, so you could > try it: Thank you for this information, this works. As for my previous approach, there is a bug in gcc that static empty functions with 'noinline' attributes get eliminated by the optimizer. 
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56099 Yuri From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 19:24:41 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 39DFC403; Thu, 24 Jan 2013 19:24:41 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id 9D776F89; Thu, 24 Jan 2013 19:24:40 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0OJOcdw008269; Thu, 24 Jan 2013 20:24:38 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0OJOcx2008266; Thu, 24 Jan 2013 20:24:38 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Thu, 24 Jan 2013 20:24:38 +0100 (CET) From: Wojciech Puchar To: Adam Nowacki Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. In-Reply-To: <51014B28.8070404@platinum.linux.pl> Message-ID: References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> <51014B28.8070404@platinum.linux.pl> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Thu, 24 Jan 2013 20:24:38 +0100 (CET) Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 19:24:41 -0000 > So far I've not lost a single ZFS pool or any data stored. 
so far my house wasn't robbed. From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 19:26:08 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 2F5CF569 for ; Thu, 24 Jan 2013 19:26:08 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id 7F687FA5 for ; Thu, 24 Jan 2013 19:26:07 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0OJQ6rF008284; Thu, 24 Jan 2013 20:26:06 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0OJQ6Cl008281; Thu, 24 Jan 2013 20:26:06 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Thu, 24 Jan 2013 20:26:06 +0100 (CET) From: Wojciech Puchar To: Zaphod Beeblebrox Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. In-Reply-To: Message-ID: References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> <51015523.2060701@platinum.linux.pl> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Thu, 24 Jan 2013 20:26:06 +0100 (CET) Cc: freebsd-hackers@freebsd.org, Adam Nowacki X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 19:26:08 -0000 > There are 3,236,316 files summing to 97,500,008,691 bytes. That puts the > "average" file at 30,127 bytes. 
But for the full breakdown: quite low. what do you store. here is my real world production example of users mail as well as documents. /dev/mirror/home1.eli 2788 1545 1243 55% 1941057 20981181 8% /home From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 19:52:34 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 27A6FC03; Thu, 24 Jan 2013 19:52:34 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 00ABD138; Thu, 24 Jan 2013 19:52:33 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 5343FB94B; Thu, 24 Jan 2013 14:52:33 -0500 (EST) From: John Baldwin To: freebsd-hackers@freebsd.org Subject: Re: NMI watchdog functionality on Freebsd Date: Thu, 24 Jan 2013 11:11:01 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <1358894455.17521.YahooMailClassic@web181706.mail.ne1.yahoo.com> <5100142D.7040904@freebsd.org> <1358960253.32417.467.camel@revolution.hippie.lan> In-Reply-To: <1358960253.32417.467.camel@revolution.hippie.lan> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201301241111.01629.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Thu, 24 Jan 2013 14:52:33 -0500 (EST) Cc: Sushanth Rai , mjacob@freebsd.org, Ian Lepore X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 19:52:34 -0000 On Wednesday, January 23, 2013 11:57:33 am Ian Lepore wrote: > On Wed, 2013-01-23 
at 08:47 -0800, Matthew Jacob wrote: > > On 1/23/2013 7:25 AM, John Baldwin wrote: > > > On Tuesday, January 22, 2013 5:40:55 pm Sushanth Rai wrote: > > >> Hi, > > >> > > >> Does FreeBSD have some functionality similar to Linux's NMI watchdog? I'm > > > aware of the ichwd driver, but that depends on a WDT being available in the > > > hardware. Even when it is available, the BIOS needs to support a mechanism to > > > trigger an OS-level recovery to get any useful information when the system is > > > really wedged (with interrupts disabled). > > The principal purpose of a watchdog is to keep the system from hanging. > > Information is secondary. The ichwd driver can use the LPC part of ICH > > hardware that's been there since ICH version 4. I implemented this more > > fully at Panasas. The first priority is to keep the system from being > > hung. The next is to detect, on reboot, that a > > watchdog event occurred. Finally, trying to isolate why is good. > > > > This is equivalent to the tco_WDT stuff on Linux. It's not interrupt > > driven (it drives the reset line on the processor). > > > > I think there's value in the NMI watchdog idea, but unless you back it > up with a real hardware watchdog you don't really have full watchdog > functionality. If the NMI can get the OS to produce some extra info, > that's great, and using an NMI gives you a good chance of doing that > even if it is normal interrupt processing that has wedged the machine. > But calling panic() invokes plenty of processing that can get wedged in > other ways, so even an NMI-based watchdog isn't guaranteed to get the > machine running again. > > But adding a real hardware watchdog that fires on a slightly longer > timeout than the NMI watchdog gives you the best of everything: you get > information if it's possible to produce it, and you get a real hardware > reset shortly thereafter if producing the info fails.
The IPMI watchdog facility has support for a pre-interrupt that fires before the real watchdog. I have coded up support for it in a branch but haven't found any hardware that supports it that I could use to test them. However, you could use an NMI pre-timer via the local APIC timer as a generic pre-timer for other hardware watchdogs. -- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 19:52:39 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 88C6CC43; Thu, 24 Jan 2013 19:52:39 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id 66EBC13D; Thu, 24 Jan 2013 19:52:39 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id D90B8B94B; Thu, 24 Jan 2013 14:52:38 -0500 (EST) From: John Baldwin To: Mikolaj Golub Subject: Re: libprocstat(3): retrieve process command line args and environment Date: Thu, 24 Jan 2013 11:20:51 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <20130119151253.GB88025@gmail.com> <201301231131.43972.jhb@freebsd.org> <20130123214949.GA3120@gmail.com> In-Reply-To: <20130123214949.GA3120@gmail.com> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201301241120.52054.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Thu, 24 Jan 2013 14:52:38 -0500 (EST) Cc: Stanislav Sedov , freebsd-hackers@freebsd.org, Robert Watson X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Thu, 24 Jan 2013 19:52:39 -0000 On Wednesday, January 23, 2013 4:49:50 pm Mikolaj Golub wrote: > On Wed, Jan 23, 2013 at 11:31:43AM -0500, John Baldwin wrote: > > On Wednesday, January 23, 2013 2:25:00 am Mikolaj Golub wrote: > > > IMHO, after adding procstat_getargv and procstat_getenvv, the usage of > > > kvm_getargv() and kvm_getenvv() (at least in the new code) may be > > > deprecated. As stated in the man page, BUGS section, "these > > > routines do not belong in the kvm interface". I suppose they are part > > > of libkvm because there was no better place for them. procstat(1) > > > prefers direct sysctl to them (so, again, code duplication, which I am > > > going to remove by adding procstat_getargv/envv). > > > > Hmm, are you going to rewrite ps(1) to use libprocstat? Or rather, is that a > > goal someday? That is one current consumer of kvm_getargv/envv. That might > > be fine if we want to make more tools use libprocstat instead of using libkvm > > directly. > > I didn't have any plans for ps(1) :-) That is why I wrote about "new > code". But if you think it is good to do I might look at it one day... I'm mostly hoping Robert chimes in to see if that was his intention for libprocstat. :) If we can ultimately replace all uses of kvm_get*v() with calls to procstat_get*v*() then I'm fine with some code duplication in the interim. 
-- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 21:11:55 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id F2D74FCB for ; Thu, 24 Jan 2013 21:11:55 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-lb0-f173.google.com (mail-lb0-f173.google.com [209.85.217.173]) by mx1.freebsd.org (Postfix) with ESMTP id 4D7E0750 for ; Thu, 24 Jan 2013 21:11:54 +0000 (UTC) Received: by mail-lb0-f173.google.com with SMTP id gf7so7204191lbb.32 for ; Thu, 24 Jan 2013 13:11:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type; bh=xB+SPNBpswWz1K8FVa+9AOacar8srexY2LG4OzhRQ0Y=; b=i3Zf1cvEJI3T5H8xgpulsirjzvln2HBORXqac2qDjWPONuQdg6Rnq26+id9g3UMq9n WLgHwuKR4/jM/D04jXV7HyiqmjYlHcgM1XIp4Y4C3uqRiTAJWkFu/Eq9gEAHT9T6BxrO kUGbZIYh0Mc99Q+MBfUf6rpFoX6DowsydsNTd+qu9/UQJlACDJeGnvJS6nDe+K24Njx1 VPur7k3STezDwqP/526doM8IpEAoi4LxhxA3m0qWbuKGduV/uKaQEtrTKoEw3yQbg0ck yEyK1686v2M3U4eOak+V1q3du9fEczHJOoumOybEwhT/Lli3c4+Rwa4cEK2zMcBle62W S9+Q== MIME-Version: 1.0 X-Received: by 10.112.29.229 with SMTP id n5mr1349802lbh.130.1359061908494; Thu, 24 Jan 2013 13:11:48 -0800 (PST) Received: by 10.112.6.38 with HTTP; Thu, 24 Jan 2013 13:11:48 -0800 (PST) In-Reply-To: References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> <51015523.2060701@platinum.linux.pl> Date: Thu, 24 Jan 2013 16:11:48 -0500 Message-ID: Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
From: Zaphod Beeblebrox To: Wojciech Puchar Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.14 Cc: freebsd-hackers@freebsd.org, Adam Nowacki X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 21:11:56 -0000 On Thu, Jan 24, 2013 at 2:26 PM, Wojciech Puchar < wojtek@wojtek.tensor.gdynia.pl> wrote: > There are 3,236,316 files summing to 97,500,008,691 bytes. That puts the >> "average" file at 30,127 bytes. But for the full breakdown: >> > > quite low. what do you store. > Apparently you're not really following this thread... just trolling? I had said that it was cyrus IMAP data (which, for reference, is one file per email message). > here is my real world production example of users mail as well as > documents. > > > /dev/mirror/home1.eli 2788 1545 1243 55% 1941057 20981181 8% > /home > Not the same data, I imagine. I was dealing with the actual byte counts ... that figure is going to be in whole blocks. 
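The arithmetic behind the two figures being compared can be checked with a short sketch. The file count and byte total are the ones quoted above; the 4096-byte block size and the helper name allocated_bytes are assumptions made only for illustration, since the thread does not state the block size of either filesystem.

```python
# Figures quoted in the thread.
TOTAL_BYTES = 97_500_008_691
FILE_COUNT = 3_236_316

# Assumed block size, for illustration only; the thread does not state one.
ASSUMED_BLOCK_SIZE = 4096


def allocated_bytes(size: int, block_size: int) -> int:
    """Round a byte count up to a whole number of blocks."""
    blocks = -(-size // block_size)  # ceiling division without floats
    return blocks * block_size


average = TOTAL_BYTES / FILE_COUNT
print(f"average file size: {average:.2f} bytes")

# A df-style figure counts whole blocks, so it overstates the byte count.
on_disk = allocated_bytes(round(average), ASSUMED_BLOCK_SIZE)
print(f"same file counted in whole blocks: {on_disk} bytes")
```

With many small files the rounding is a large share of the total, which is why a figure in whole blocks and a sum of actual byte counts are not directly comparable.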
From owner-freebsd-hackers@FreeBSD.ORG Thu Jan 24 23:32:03 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 976951D7; Thu, 24 Jan 2013 23:32:03 +0000 (UTC) (envelope-from ndenev@gmail.com) Received: from mail-bk0-f52.google.com (mail-bk0-f52.google.com [209.85.214.52]) by mx1.freebsd.org (Postfix) with ESMTP id 0719BE1D; Thu, 24 Jan 2013 23:32:02 +0000 (UTC) Received: by mail-bk0-f52.google.com with SMTP id jk13so1054414bkc.11 for ; Thu, 24 Jan 2013 15:31:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:subject:mime-version:content-type:from:in-reply-to:date :cc:content-transfer-encoding:message-id:references:to:x-mailer; bh=hNhdEjjgWuOGeQh4UlpJcQNV1MGOZosoZxTU/uQpSn4=; b=EYInC7OLDDYM/nCBZIHB3cx1mxbNr3NWppIsFmXUUA+W7KSWOjScKtRot4sK2DqGth UojRU0BwAorsoi76bqXqiMZg8e9mELKR4kGCIRBizctW7qdQVAMKiqGoE7kuafh50lyr Hi5PY3ccYYDNFjJckZdrTx5vH2qMDJL0FkVXpQxWJyXsA/MG/s8nbJwOHjv8w6OfFx3J iYMd5EFKwRyJ1IpkN+ALgPwgiDKFLoUtgifFKmAHqbo8WiCLkjaiebOZde2eJKAXQ5oA m1BLb15kAKepBG34hNimXGN8QWFgHNxJKVn5184kIyduPIxytD8VF6B9mbqPEFjhfIgQ 6ZBg== X-Received: by 10.204.128.151 with SMTP id k23mr1333499bks.65.1359070316405; Thu, 24 Jan 2013 15:31:56 -0800 (PST) Received: from [10.0.0.3] ([93.152.184.10]) by mx.google.com with ESMTPS id o9sm18380914bko.15.2013.01.24.15.31.54 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 24 Jan 2013 15:31:55 -0800 (PST) Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. 
Mime-Version: 1.0 (Mac OS X Mail 6.2 \(1499\)) Content-Type: text/plain; charset=us-ascii From: Nikolay Denev In-Reply-To: Date: Fri, 25 Jan 2013 01:31:53 +0200 Content-Transfer-Encoding: quoted-printable Message-Id: <4241CA0A-9AFC-4EB4-89B7-18BC7E645B03@gmail.com> References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> To: Wojciech Puchar X-Mailer: Apple Mail (2.1499) Cc: freebsd-fs@freebsd.org, freebsd-hackers@freebsd.org, Adam Nowacki X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 24 Jan 2013 23:32:03 -0000 On Jan 24, 2013, at 4:24 PM, Wojciech Puchar wrote: >> > Except it is on paper reliability. This "on paper" reliability saved my ass numerous times. For example I had one home NAS server machine with a flaky SATA controller that would not detect one of the four drives from time to time on reboot. This left my pool degraded several times, and even going from a boot with, let's say, disk4 failed to a boot with disk3 failed did not corrupt any data. I don't think this is possible with any other open source FS, let alone hardware RAID that would drop the whole array because of this. I have never ever personally lost any data on ZFS. Yes, the performance is another topic, and you must know what you are doing and what your usage pattern is, but from a reliability standpoint, to me ZFS looks more durable than anything else. P.S.: My home NAS has been running freebsd-CURRENT with ZFS since the first version available. Several drives died, two times the pool was expanded by replacing all drives one by one and resilvering, and not a single byte was lost.
From owner-freebsd-hackers@FreeBSD.ORG Fri Jan 25 09:26:54 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 74FD9C18 for ; Fri, 25 Jan 2013 09:26:54 +0000 (UTC) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from wojtek.tensor.gdynia.pl (wojtek.tensor.gdynia.pl [188.252.31.196]) by mx1.freebsd.org (Postfix) with ESMTP id D81BE984 for ; Fri, 25 Jan 2013 09:26:53 +0000 (UTC) Received: from wojtek.tensor.gdynia.pl (localhost [127.0.0.1]) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5) with ESMTP id r0P9QiCu026014; Fri, 25 Jan 2013 10:26:44 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Received: from localhost (wojtek@localhost) by wojtek.tensor.gdynia.pl (8.14.5/8.14.5/Submit) with ESMTP id r0P9Qhil026011; Fri, 25 Jan 2013 10:26:43 +0100 (CET) (envelope-from wojtek@wojtek.tensor.gdynia.pl) Date: Fri, 25 Jan 2013 10:26:43 +0100 (CET) From: Wojciech Puchar To: Zaphod Beeblebrox Subject: Re: ZFS regimen: scrub, scrub, scrub and scrub again. In-Reply-To: Message-ID: References: <20130122073641.GH30633@server.rulingia.com> <51013345.8010701@platinum.linux.pl> <51015523.2060701@platinum.linux.pl> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="2456600518-866709546-1359106004=:25985" X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.2.7 (wojtek.tensor.gdynia.pl [127.0.0.1]); Fri, 25 Jan 2013 10:26:44 +0100 (CET) Cc: freebsd-hackers@freebsd.org, Adam Nowacki X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Jan 2013 09:26:54 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. 
--2456600518-866709546-1359106004=:25985 Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8BIT > > here is my real world production example of users mail as well as documents. > > > /dev/mirror/home1.eli      2788 1545  1243    55% 1941057 20981181    8%   /home > > > Not the same data, I imagine. A mix. 90% Mailboxes and user data (documents, pictures), rest are some .tar.gz backups. At other places i have similar situation. one or more gmirror sets, 1-3TB each depends on drives. For those who puts 1000 of mailboxes i recommend dovecot with mdbox storage backend >  I was dealing with the actual byte counts ... that figure is going to be in whole blocks. > > --2456600518-866709546-1359106004=:25985-- From owner-freebsd-hackers@FreeBSD.ORG Fri Jan 25 14:16:45 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 69F18FDC; Fri, 25 Jan 2013 14:16:45 +0000 (UTC) (envelope-from fodillemlinkarim@gmail.com) Received: from mail-ie0-x232.google.com (ie-in-x0232.1e100.net [IPv6:2607:f8b0:4001:c03::232]) by mx1.freebsd.org (Postfix) with ESMTP id 14F3ED44; Fri, 25 Jan 2013 14:16:45 +0000 (UTC) Received: by mail-ie0-f178.google.com with SMTP id c12so72686ieb.23 for ; Fri, 25 Jan 2013 06:16:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=x-received:message-id:date:from:user-agent:mime-version:to:cc :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=z18AKoVaTYKjNqDNZdeKOpX9kQA19HoqT5K82QlETQk=; b=R1HAvys9bN3cQgOepJpMz0qAtoLIzoWG5oYHgmcz2oupCKquetkzmbby5QE5RSgHoy OdkB5wuDryIwNZfkPiEPbLLozruJebQJTSWVd8Y8SYwTt5X4F+rLmpMDZC/IGFNCaL9t 0G7s6LBgRXW1ldjkXUXH/EgtEHplB4M1aHTRpRKCAUVIlbX+kyj/3/yFwqJwwXp5eWrC bH2HayiwmVz8IqpvrD4OTxcdL6lF6XN8/nVnWJkjkcv9kKd8g8VL7iMGdpJXYeMsu6XR +M51pqw1qcTbZCEk4uRYGh9uw3FwwtSGFhufhq2MNNbPGAELI6YnJgYvRhO04d0ppWa8 9FNA== X-Received: by 
10.42.39.80 with SMTP id g16mr3609008ice.26.1359123404682; Fri, 25 Jan 2013 06:16:44 -0800 (PST) Received: from [192.168.1.73] ([208.85.112.101]) by mx.google.com with ESMTPS id fa6sm760658igb.2.2013.01.25.06.16.42 (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Fri, 25 Jan 2013 06:16:43 -0800 (PST) Message-ID: <510293C4.3060304@gmail.com> Date: Fri, 25 Jan 2013 09:16:36 -0500 From: Karim Fodil-Lemelin User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20130107 Thunderbird/17.0.2 MIME-Version: 1.0 To: freebsd-hackers@freebsd.org Subject: Re: IBM blade server abysmal disk write performances References: <6C0B86E6-195C-4D35-AE40-3D2F9F6D28FB@yahoo.com> <1358544287.32417.251.camel@revolution.hippie.lan> <50F9CFEB.5060302@feral.com> <50F9DB9A.9050303@gmail.com> In-Reply-To: <50F9DB9A.9050303@gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Mailman-Approved-At: Fri, 25 Jan 2013 14:21:45 +0000 Cc: gibbs@FreeBSD.org, scottl@FreeBSD.org, mjacob@FreeBSD.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Jan 2013 14:16:45 -0000 Hi, Quick follow-up on this. As I mentioned in a previous email we have moved to SATA drives and the SAS drives have been shelved for now. The current project will be using the SATA drives, so further tests on SAS have been postponed to an undefined date. Thanks, Karim. PS: I'll keep the SAS tests in my back pocket so I get a head start when we get around to SAS testing again. On 18/01/2013 6:32 PM, Karim Fodil-Lemelin wrote: > On 18/01/2013 5:42 PM, Matthew Jacob wrote: >> This is all turning into a bikeshed discussion. As far as I can tell, >> the basic original question was why a *SAS* (not a SATA) drive was >> not performing as well as expected based upon experiences with Linux. 
>> I still don't know whether reads or writes were being used for dd. >> >> This morning, I ran a fio test with a single threaded read component >> and a multithreaded write component to see if there were differences. >> All I had connected to my MPT system were ATA drives (Seagate 500GBs) >> and I'm remote now and won't be back until Sunday to put one of my >> 'good' SAS drives (140 GB Seagates, i.e., real SAS 15K RPM drives, >> not "fat SATA" bs drives). >> >> The numbers were pretty much the same for both FreeBSD and Linux. In >> fact, FreeBSD was slightly faster. I won't report the exact numbers >> right now, but only mention this as a piece of information that at >> least in my case the differences between the OS platform involved is >> negligible. This would, at least in my case, rule out issues based >> upon different platform access methods and different drivers. >> >> All of this other discussion, about WCE and what not is nice, but for >> all intents and purposes it serves could be moved to *-advocacy. >> > Thanks for the clarifications! > > I did mention at some point those were write speeds and reads were > just fine and those were either writes to the filesystem or direct > access (only on SAS again). > > Here is what I am planning to do next week when I get the chance: > > 0) I plan on focusing on the SAS driver tests _only_ since SATA is > working as expected so nothing to report there. > 1) Look carefully at how the drives are physically connected. Although > it feels like if the SATA works fine the SAS should also but I'll > check anyway. > 2) Boot verbose with "boot -v" and send the dmesg output. mpt driver > might give us a clue. > 3) Run gstat -abc in a loop for the test duration. Although I would > think ctlstat(8) might be more interesting here so I'll run it too for > good measure :). > > Please note that in all tests write caching was enabled as I think > this is the default with FBSD 9.1 GENERIC but I'll confirm this with > camcontrol(8). 
> > I've also seen quite a lot of 'quirks' for tagged command queuing in > the source code (/sys/cam/scsi/scsi_xpt.c) but a particular one got my > attention (thanks to whoever writes good comments in source code :) : > > /* > * Slow when tagged queueing is enabled. Write performance > * steadily drops off with more and more concurrent > * transactions. Best sequential write performance with > * tagged queueing turned off and write caching turned on. > * > * PR: kern/10398 > * Submitted by: Hideaki Okada > * Drive: DCAS-34330 w/ "S65A" firmware. > * > * The drive with the problem had the "S65A" firmware > * revision, and has also been reported (by Stephen J. > * Roznowski ) for a drive with the "S61A" > * firmware revision. > * > * Although no one has reported problems with the 2 gig > * version of the DCAS drive, the assumption is that it > * has the same problems as the 4 gig version. Therefore > * this quirk entries disables tagged queueing for all > * DCAS drives. > */ > { T_DIRECT, SIP_MEDIA_FIXED, "IBM", "DCAS*", "*" }, > /*quirks*/0, /*mintags*/0, /*maxtags*/0 > > So I looked at the kern/10398 PR and got some feeling of 'deja vu', > although the original problem was on FreeBSD 3.1 so it's most likely > not that, but I thought I would mention it. The issue described is > awfully familiar. Basically the SAS drive (SCSI back then) is slow on > writes but fast on reads with dd. Could be a coincidence or a ghost > from the past, who knows... > > Cheers, > > Karim. 
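The quirk entry quoted above is one row of a table that is searched by vendor, product and revision patterns, and the first matching row supplies the tag limits. A rough sketch of that lookup idea follows. It is a simplified model and not the kernel's code: the Quirk type, the lookup_quirk function, the catch-all row and its values (2 and 255) are hypothetical, and Python's fnmatch stands in for whatever matcher the kernel uses.

```python
from fnmatch import fnmatchcase
from typing import NamedTuple


class Quirk(NamedTuple):
    """Hypothetical, simplified model of one quirk-table row."""
    vendor: str
    product: str
    revision: str
    mintags: int
    maxtags: int


QUIRK_TABLE = [
    # Patterns and tag limits taken from the entry quoted in the thread.
    Quirk("IBM", "DCAS*", "*", mintags=0, maxtags=0),
    # Hypothetical catch-all row; the values are assumed for illustration.
    Quirk("*", "*", "*", mintags=2, maxtags=255),
]


def lookup_quirk(vendor: str, product: str, revision: str) -> Quirk:
    """Return the first row whose three patterns all match the device."""
    for quirk in QUIRK_TABLE:
        if (fnmatchcase(vendor, quirk.vendor)
                and fnmatchcase(product, quirk.product)
                and fnmatchcase(revision, quirk.revision)):
            return quirk
    raise LookupError("no quirk row matched")


# maxtags == 0 models "tagged queueing disabled" for the matched drive.
dcas = lookup_quirk("IBM", "DCAS-34330", "S65A")
print(dcas.maxtags)
```

Because the first match wins, specific rows have to come before the catch-all. A drive that is slow only with tagged queueing enabled would be handled by adding a row like the DCAS one ahead of the default.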
From owner-freebsd-hackers@FreeBSD.ORG Fri Jan 25 18:37:47 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id AD624DAA; Fri, 25 Jan 2013 18:37:47 +0000 (UTC) (envelope-from rwatson@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 73F45FA5; Fri, 25 Jan 2013 18:37:47 +0000 (UTC) Received: from cilantro.sec.cl.cam.ac.uk (cilantro.sec.cl.cam.ac.uk [128.232.18.69]) by cyrus.watson.org (Postfix) with ESMTPSA id AC86846B5B; Fri, 25 Jan 2013 13:37:46 -0500 (EST) Subject: Re: libprocstat(3): retrieve process command line args and environment Mime-Version: 1.0 (Apple Message framework v1283) Content-Type: text/plain; charset=iso-8859-1 From: "Robert N. M. Watson" In-Reply-To: <201301241120.52054.jhb@freebsd.org> Date: Fri, 25 Jan 2013 18:37:45 +0000 Content-Transfer-Encoding: quoted-printable Message-Id: References: <20130119151253.GB88025@gmail.com> <201301231131.43972.jhb@freebsd.org> <20130123214949.GA3120@gmail.com> <201301241120.52054.jhb@freebsd.org> To: John Baldwin X-Mailer: Apple Mail (2.1283) Cc: Mikolaj Golub , Stanislav Sedov , freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Jan 2013 18:37:47 -0000 On 24 Jan 2013, at 16:20, John Baldwin wrote: >>> Hmm, are you going to rewrite ps(1) to use libprocstat? Or rather, is that a >>> goal someday? That is one current consumer of kvm_getargv/envv. That might >>> be fine if we want to make more tools use libprocstat instead of using libkvm >>> directly. >> >> I didn't have any plans for ps(1) :-) That is why I wrote about "new >> code". 
But if you think it is good to do I might look at it one day... > > I'm mostly hoping Robert chimes in to see if that was his intention for > libprocstat. :) If we can ultimately replace all uses of kvm_get*v() with > calls to procstat_get*v*() then I'm fine with some code duplication in the > interim. Originally there was just procstat(1), but it made sense to begin re-encapsulating it in a libprocstat(3) because the code there is potentially extremely reusable. This conflicts a bit with libkvm(3), which mysteriously knows about sysctlbyname(3) despite a name suggesting otherwise. You can imagine various approaches to fixing this, but indeed, making libprocstat(3) the first-class citizen and preferring it for both kvm and sysctl methods sounds like the way to go. I actually want to make libprocstat also support snmp, but I've never actually found the time to investigate doing that. One of my main unmet goals for procstat(1) was to introduce an extremely machine-readable output format for it -- e.g., something XML-based or similar. I'd still love to see that happen. 
Robert From owner-freebsd-hackers@FreeBSD.ORG Fri Jan 25 20:23:00 2013 Return-Path: Delivered-To: hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 8AB71D98 for ; Fri, 25 Jan 2013 20:23:00 +0000 (UTC) (envelope-from yuri@rawbw.com) Received: from shell0.rawbw.com (shell0.rawbw.com [198.144.192.45]) by mx1.freebsd.org (Postfix) with ESMTP id 699C8774 for ; Fri, 25 Jan 2013 20:23:00 +0000 (UTC) Received: from eagle.yuri.org (stunnel@localhost [127.0.0.1]) (authenticated bits=0) by shell0.rawbw.com (8.14.4/8.14.4) with ESMTP id r0PKMo36005811 for ; Fri, 25 Jan 2013 12:22:51 -0800 (PST) (envelope-from yuri@rawbw.com) Message-ID: <5102E99A.2070402@rawbw.com> Date: Fri, 25 Jan 2013 12:22:50 -0800 From: Yuri User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130112 Thunderbird/17.0.2 MIME-Version: 1.0 To: FreeBSD Hackers Subject: Calling ustack(); from DTrace script crashes the user process Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Jan 2013 20:23:00 -0000 I am calling ustack(); from the 'ioctl' handler called by the Xorg process. My intention is to see the user stack. On the first few instances I got this error: dtrace: ERROR: open failed: No such file or directory -- no file name is mentioned, double-space is printed in the message After a while the same exact script began to crash the Xorg process. 
Before crashes occurred I was able to get the truss log, showing that multiple dev-files failed to open: 5191: open("/dev/dtrace/dtrace",O_RDONLY,00) = 3 (0x3) 5191: open("/dev/dtrace/io",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/dtmalloc",O_RDONLY,00) = 4 (0x4) 5191: open("/dev/dtrace/nfscl",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/nfsclient",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/fbt",O_RDONLY,00) = 5 (0x5) 5191: open("/dev/dtrace/lockstat",O_RDONLY,00) = 6 (0x6) 5191: open("/dev/dtrace/priv",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/sched",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/mac",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/mac_framework",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/cbb",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/sctp",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/callout_execute",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/vfs",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/proc",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/syscall",O_RDONLY,00) ERR#2 'No such file or directory' 5191: open("/dev/dtrace/syscall",O_RDONLY,00) ERR#2 'No such file or directory' I satisfied all conditions mentioned in https://wiki.freebsd.org/DTrace on how to run DTrace on 9.0 (I am on 9.1-STABLE). kernel modules are loaded, see below. So: * Why/How ustack kills the user process? (amazing this is even possible) * Why files like /dev/dtrace/io don't exist? Maybe some extra-options are required for ustack() to work? If this is the case this should be mentioned in wiki. 
Yuri # kldstat Id Refs Address Size Name 3 16 0xffffffff81861000 84c0 opensolaris.ko 4 4 0xffffffff8186a000 53a00 linux.ko 10 1 0xffffffff82612000 9e50 linprocfs.ko 12 1 0xffffffff82627000 25b linux_adobe.ko 13 2 0xffffffff82628000 baa dtraceall.ko 14 1 0xffffffff82629000 4eca profile.ko 15 3 0xffffffff8262e000 4005 cyclic.ko 16 12 0xffffffff82633000 23baaf dtrace.ko 17 1 0xffffffff8286f000 fae8 systrace_freebsd32.ko 18 1 0xffffffff8287f000 109a5 systrace.ko 19 1 0xffffffff82890000 45a8 sdt.ko 20 1 0xffffffff82895000 4938 lockstat.ko 21 1 0xffffffff8289a000 be09 fasttrap.ko 22 1 0xffffffff828a6000 65e2 fbt.ko 23 1 0xffffffff828ad000 4ee4 dtnfsclient.ko 24 1 0xffffffff828b2000 1dbeb nfsclient.ko 25 1 0xffffffff828d0000 47da nfs_common.ko 26 1 0xffffffff828d5000 55ec dtnfscl.ko 27 1 0xffffffff828db000 4597 dtmalloc.ko 28 1 0xffffffff828e0000 44fd dtio.ko 29 1 0xffffffff828e5000 2466 dtrace_test.ko From owner-freebsd-hackers@FreeBSD.ORG Fri Jan 25 20:41:07 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 0204837C; Fri, 25 Jan 2013 20:41:07 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from bigwig.baldwin.cx (bigknife-pt.tunnel.tserv9.chi1.ipv6.he.net [IPv6:2001:470:1f10:75::2]) by mx1.freebsd.org (Postfix) with ESMTP id D1A6F81B; Fri, 25 Jan 2013 20:41:06 +0000 (UTC) Received: from pakbsde14.localnet (unknown [38.105.238.108]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 345D5B93E; Fri, 25 Jan 2013 15:41:03 -0500 (EST) From: John Baldwin To: "Robert N. M. 
Watson" Subject: Re: libprocstat(3): retrieve process command line args and environment Date: Fri, 25 Jan 2013 15:31:43 -0500 User-Agent: KMail/1.13.5 (FreeBSD/8.2-CBSD-20110714-p22; KDE/4.5.5; amd64; ; ) References: <20130119151253.GB88025@gmail.com> <201301241120.52054.jhb@freebsd.org> In-Reply-To: MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Message-Id: <201301251531.43540.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (bigwig.baldwin.cx); Fri, 25 Jan 2013 15:41:03 -0500 (EST) Cc: Mikolaj Golub , Stanislav Sedov , freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Jan 2013 20:41:07 -0000 On Friday, January 25, 2013 1:37:45 pm Robert N. M. Watson wrote: > > On 24 Jan 2013, at 16:20, John Baldwin wrote: > > >>> Hmm, are you going to rewrite ps(1) to use libprocstat? Or rather, is that a > >>> goal someday? That is one current consumer of kvm_getargv/envv. That might > >>> be fine if we want to make more tools use libprocstat instead of using libkvm > >>> directly. > >> > >> I didn't have any plans for ps(1) :-) That is why I wrote about "new > >> code". But if you think it is good to do I might look at it one day... > > > > I'm mostly hoping Robert chimes in to see if that was his intention for > > libprocstat. :) If we can ultimately replace all uses of kvm_get*v() with > > calls to procstat_get*v*() then I'm fine with some code duplication in the > > interim. > > > Originally there was just procstat(1), but it made sense to begin re-encapsulating it in a libprocstat(3) because the code there is potentially extremely reusable. This conflicts a bit with libkvm(3), which mysteriously knows about sysctlbyname(3) despite a name suggesting otherwise. 
You can imagine various approaches to fixing this, but indeed, making libprocstat(3) the first-class citizen and preferring it for both kvm and sysctl methods sounds like the way to go. I actually want to make libprocstat also support snmp, but I've never actually found the time to investigate doing that. One of my main unmet goals for procstat(1) was to introduce an extremely machine-readable output format for it -- e.g., something XML-based or similar. I'd still love to see that happen. BTW, one off-ball thought I have is that I would like to have a mode where libprocstat operates on a core file (of a process, not a kernel crash dump), so it could list the threads from a core dump, and possibly file descriptor info (if PR kern/173723 is implemented). We certainly could have a 'raw' mode where it spat out name: value or XML of the entire kinfo_proc perhaps. -- John Baldwin From owner-freebsd-hackers@FreeBSD.ORG Fri Jan 25 21:09:38 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id 0C19ECCA; Fri, 25 Jan 2013 21:09:38 +0000 (UTC) (envelope-from stas@FreeBSD.org) Received: from mx0.deglitch.com (cl-414.sto-01.se.sixxs.net [IPv6:2001:16d8:ff00:19d::2]) by mx1.freebsd.org (Postfix) with ESMTP id B553396F; Fri, 25 Jan 2013 21:09:37 +0000 (UTC) Received: from freebsd.corp.qc (unknown [72.5.114.2]) by mx0.deglitch.com (Postfix) with ESMTPA id C7F2C8FC2B; Sat, 26 Jan 2013 01:09:31 +0400 (MSK) Date: Fri, 25 Jan 2013 13:09:29 -0800 From: Stanislav Sedov To: John Baldwin Subject: Re: libprocstat(3): retrieve process command line args and environment Message-Id: <20130125130929.3803f993e3df877f30653dbf@FreeBSD.org> In-Reply-To: <201301251531.43540.jhb@freebsd.org> References: <20130119151253.GB88025@gmail.com> <201301241120.52054.jhb@freebsd.org> <201301251531.43540.jhb@freebsd.org> Organization: The FreeBSD Project X-Mailer: carrier-pigeon Mime-Version: 1.0 
Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: Mikolaj Golub , Stanislav Sedov , "Robert N. M. Watson" , freebsd-hackers@freebsd.org X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Jan 2013 21:09:38 -0000 On Fri, 25 Jan 2013 15:31:43 -0500 John Baldwin mentioned: > On Friday, January 25, 2013 1:37:45 pm Robert N. M. Watson wrote: > > > > On 24 Jan 2013, at 16:20, John Baldwin wrote: > > > > >>> Hmm, are you going to rewrite ps(1) to use libprocstat? Or rather, is > that a > > >>> goal someday? That is one current consumer of kvm_getargv/envv. That > might > > >>> be fine if we want to make more tools use libprocstat instead of using > libkvm > > >>> directly. > > >> > > >> I didn't have any plans for ps(1) :-) That is why I wrote about "new > > >> code". But if you think it is good to do I might look at it one day... > > > > > > I'm mostly hoping Robert chimes in to see if that was his intention for > > > libprocstat. :) If we can ultimately replace all uses of kvm_get*v() with > > > calls to procstat_get*v*() then I'm fine with some code duplication in the > > > interim. > > > > > > Originally there was just procstat(1), but it made sense to begin re- > encapsulating it in a libprocstat(3) because the code there is potentially > extremely reusable. This conflicts a bit with libkvm(3), which mysteriously > knows about sysctlbyname(3) despite a name suggesting otherwise. You can > imagine various approaches to fixing this, but indeed, making libprocstat(3) > the first-class citizen and preferring it for both kvm and sysctl methods > sounds like the way to go. I actually want to make libprocstat also support > snmp, but I've never actually found the time to investigate doing that. 
One of > my main unmet goals for procstat(1) was to introduce an extremely machine- > readable output format for it -- e.g., something XML-based or similar. I'd > still love to see that happen. > > BTW, one off-ball thought I have is that I would like to have a mode where > libprocstat operates on a core file (of a process, not a kernel crash dump), > so it could list the threads from a core dump, and possibly file descriptor > info (if PR kern/173723 is implemented). > That's actually a good idea. I was thinking about the same for some time. AFAIK Solaris' pfiles can do that, and maybe some other tools as well. I have not had time to look into it yet, though. -- Stanislav Sedov ST4096-RIPE () ascii ribbon campaign - against html e-mail /\ www.asciiribbon.org - against proprietary attachments From owner-freebsd-hackers@FreeBSD.ORG Sat Jan 26 11:52:08 2013 Return-Path: Delivered-To: freebsd-hackers@FreeBSD.org Received: from mx1.freebsd.org (mx1.FreeBSD.org [8.8.178.115]) by hub.freebsd.org (Postfix) with ESMTP id D345B163; Sat, 26 Jan 2013 11:52:08 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id C44DCB4E; Sat, 26 Jan 2013 11:52:07 +0000 (UTC) Received: from porto.starpoint.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id NAA11336; Sat, 26 Jan 2013 13:52:03 +0200 (EET) (envelope-from avg@FreeBSD.org) Received: from localhost ([127.0.0.1]) by porto.starpoint.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1Tz4Il-000Hod-AF; Sat, 26 Jan 2013 13:52:03 +0200 Message-ID: <5103C361.6060605@FreeBSD.org> Date: Sat, 26 Jan 2013 13:52:01 +0200 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:17.0) Gecko/20130121 Thunderbird/17.0.2 MIME-Version: 1.0 To: freebsd-current@FreeBSD.org, freebsd-hackers@FreeBSD.org Subject: some questions on kern_linker and pre-loaded modules X-Enigmail-Version: 1.4.6 
Content-Type: text/plain; charset=X-VIET-VPS Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Jan 2013 11:52:08 -0000

I. It seems that linker_preload checks a module for being a duplicate module only if the module has MDT_VERSION. This is probably designed to allow different versions of the same module to co-exist (for some definition of co-exist)? But, OTOH, this doesn't work well if the module is version-less (no MODULE_VERSION in the code) and is pre-loaded twice (e.g. once in kernel and once in a preloaded file). At present a good example of this is zfsctrl module, which could be present both in kernel (options ZFS) and in zfs.ko. I haven't thought about any linker-level resolution for this issue. I've just tried to plug the ZFS hole for now.

commit ed8b18f2d6c4d1be915bff94cdec0c51a479529f
Author: Andriy Gapon
Date:   Wed Dec 19 23:29:23 2012 +0200

    [bugfix] zfs: add MODULE_VERSION for zfsctrl

    This should allow the kernel linker to easily detect a situation
    when the module is present both in a kernel and in a preloaded
    file (zfs.ko).

diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
index 10d28c2..5721010 100644
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c
@@ -5599,6 +5599,7 @@ static moduledata_t zfs_mod = {
 	0
 };
 DECLARE_MODULE(zfsctrl, zfs_mod, SI_SUB_VFS, SI_ORDER_ANY);
+MODULE_VERSION(zfsctrl, 1);
 MODULE_DEPEND(zfsctrl, opensolaris, 1, 1, 1);
 MODULE_DEPEND(zfsctrl, krpc, 1, 1, 1);
 MODULE_DEPEND(zfsctrl, acl_nfs4, 1, 1, 1);

II. It seems that linker_file_register_modules() for the kernel is called after linker_file_register_modules() is called for all the pre-loaded files.
linker_file_register_modules() for the kernel is called from linker_init_kernel_modules via SYSINIT(SI_SUB_KLD, SI_ORDER_ANY) and that happens after linker_preload() which is executed via SYSINIT(SI_SUB_KLD, SI_ORDER_MIDDLE). Perhaps this is designed to allow modules in the preloaded files to override modules compiled into the kernel? But this doesn't seem to work well. Because modules from the kernel are not registered yet, linker_file_register_modules() would be successful for the duplicate modules in a preloaded file and thus any sysinits present in the file will also be registered. So, if the module is present both in the kernel and in the preloaded file and the module has a module event handler (modeventhand_t), then the handler will be registered and called twice. I cobbled together the following hack, but I am not sure if it is OK or if it violates fundamental architecture/design of this subsystem.

commit 14ebf07633d0f0ea393801c3e4161d6c37393661
Author: Andriy Gapon
Date:   Wed Dec 19 23:27:46 2012 +0200

    [wip][experiment] kernel linker: register kernel modules before preloaded modules...

    Also, skip adding sysinit and sysctl stuff from preloaded modules if
    module registration fails. This should result in much saner behavior
    if a module is present in both the kernel and a preloaded file.
    Perhaps, the original intent was to allow the preloaded files to
    override modules present in kernel, but that was extremely fragile
    because of double sysinit registration.

diff --git a/sys/kern/kern_linker.c b/sys/kern/kern_linker.c
index b3ab4df..be46cdf 100644
--- a/sys/kern/kern_linker.c
+++ b/sys/kern/kern_linker.c
@@ -365,6 +365,7 @@ linker_file_register_modules(linker_file_t lf)
 	return (first_error);
 }
 
+#if 0
 static void
 linker_init_kernel_modules(void)
 {
@@ -374,6 +375,7 @@ linker_init_kernel_modules(void)
 
 SYSINIT(linker_kernel, SI_SUB_KLD, SI_ORDER_ANY, linker_init_kernel_modules,
     0);
+#endif
 
 static int
 linker_load_file(const char *filename, linker_file_t *result)
@@ -1599,7 +1601,11 @@ restart:
 		printf("KLD file %s is missing dependencies\n", lf->filename);
 		linker_file_unload(lf, LINKER_UNLOAD_FORCE);
 	}
-
+#if 1
+	error = linker_file_register_modules(linker_kernel_file);
+	if (error)
+		printf("linker_file_register_modules(linker_kernel_file) failed: %d\n", error);
+#endif
 	/*
 	 * We made it. Finish off the linking in the order we determined.
 	 */
@@ -1642,13 +1648,15 @@ restart:
 		 * Now do relocation etc using the symbol search paths
 		 * established by the dependencies
 		 */
+		error = linker_file_register_modules(lf);
+		if (error)
+			goto fail;
 		error = LINKER_LINK_PRELOAD_FINISH(lf);
 		if (error) {
 			printf("KLD file %s - could not finalize loading\n",
 			    lf->filename);
 			goto fail;
 		}
-		linker_file_register_modules(lf);
 		if (linker_file_lookup_set(lf, "sysinit_set", &si_start,
 		    &si_stop, NULL) == 0)
 			sysinit_add(si_start, si_stop);

-- 
Andriy Gapon

From owner-freebsd-hackers@FreeBSD.ORG Sat Jan 26 23:56:49 2013 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by hub.freebsd.org (Postfix) with ESMTP id 23982C71 for ; Sat, 26 Jan 2013 23:56:49 +0000 (UTC) (envelope-from kientzle@freebsd.org) Received: from monday.kientzle.com (99-115-135-74.uvs.sntcca.sbcglobal.net [99.115.135.74]) by mx1.freebsd.org (Postfix) with ESMTP id CB42E685 for ; Sat, 26 Jan 2013 23:56:48 +0000 (UTC) Received: (from root@localhost) by monday.kientzle.com (8.14.4/8.14.4) id r0QNulvW023887 for
freebsd-hackers@freebsd.org; Sat, 26 Jan 2013 23:56:47 GMT (envelope-from kientzle@freebsd.org) Received: from [192.168.2.143] (CiscoE3000 [192.168.1.65]) by kientzle.com with SMTP id m5y5vauejaub6vn3kp977x6fqi; for freebsd-hackers@freebsd.org; Sat, 26 Jan 2013 23:56:47 +0000 (UTC) (envelope-from kientzle@freebsd.org) From: Tim Kientzle Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Subject: Testing SIOCADDMULTI? Date: Sat, 26 Jan 2013 15:56:47 -0800 Message-Id: To: freebsd-hackers Mime-Version: 1.0 (Apple Message framework v1283) X-Mailer: Apple Mail (2.1283) X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Jan 2013 23:56:49 -0000 My next TODO item for this network driver is to implement the SIOCADDMULTI and SIOCDELMULTI ioctls. I'm not quite sure what they do, though, and have no idea how to test them to see if they are working correctly. Any suggestions? Cheers, Tim