From owner-freebsd-arch@FreeBSD.ORG Sun Dec 16 00:11:03 2007
Date: Sat, 15 Dec 2007 16:11:01 -0800
From: "Kip Macy"
To: "Robert Watson"
Cc: FreeBSD Current, freebsd-arch@freebsd.org
Subject: Re: pending changes for TOE support
In-Reply-To: <20071215214253.N85668@fledge.watson.org>
References: <20071215100351.Q70617@fledge.watson.org> <20071215190252.I85668@fledge.watson.org> <20071215214253.N85668@fledge.watson.org>

> Actually, what I was asking for in the omitted context above was something
> along the lines of the following, adapted for whatever the reality may be:
>
>    Returning a non-zero value will lead to the software stack beginning a
>    disconnect.
>
> Or, say,
>
>    Non-zero return values will be ignored. (*)
>
> This is not intended as a contrarian point.  I'm not looking for a complete
> exposition of the behavior of the stack -- rather, basic information that we
> should be documenting about a KPI, such as what an error being returned will
> do.

I believe that my latest patch makes at least a perfunctory effort to
address all of the points you've raised with the notable exception of
widening the ops vector. Please take a quick look at the same URL as
before.

-Kip

From owner-freebsd-arch@FreeBSD.ORG Sun Dec 16 09:37:47 2007
Message-ID: <4764EDD0.5050101@freebsd.org>
Date: Sun, 16 Dec 2007 20:20:16 +1100
From: Darren Reed
Reply-To: darrenr@freebsd.org
To: Kip Macy
Cc: FreeBSD Current, Robert Watson, freebsd-arch@freebsd.org
Subject: Re: pending changes for TOE support
References: <20071215100351.Q70617@fledge.watson.org>

Kip Macy wrote:
...
>> My initial feeling is that, even if an interface supports TOE, we shouldn't
>> enable the capability in the enabled vector by default, as TOE bypasses
>> firewall behavior, etc, and would certainly be a surprise if an admin swapped
>> a Chelsio card for a non-TOE supporting card.  What's your feeling on this?
>
>
> The current implementation bypasses the firewall. This and likely
> other hardware has extensive filtering support so it isn't
> necessarily intrinsic.

I'm not convinced that we're quite there yet.

There are some important points that need to be addressed here
that basic filtering won't allow:
- reporting on "denied" packets
- interaction with per host/network limits
- interaction with complex pools of addresses

What I would like to see (but I'm not sure if it's possible yet)
is the ability for the TCP MIB to be properly updated with what
happens on the wire - e.g. proper counting of retransmits,
timeouts, packet counts, bytes sent, etc.

>> + * The TOE API assumes that the tcp offload engine can offload
>> + * the entire connection from set up to teardown, with some provision
>> + * being made to allow the software stack to handle time wait. If
>> + * the device does not meet these criteria, it is the driver's responsibility
>> + * to overload the functions that it needs to in tcp_usrreqs and make
>> + * its own calls to tcp_output if it needs to do so.
>>
>> While I'm familiar with TCP, I'm less familiar with the scope of what cards
>> support for TOE.  Do we know of any cards that are less capable than the
>> Chelsio card in this respect, or are they all sort of on-par on that front?
>> I.e., do we think the above eventuality is likely?
>
> I don't have any way of knowing. I think it is probably safe to say
> that any vendors that don't meet those criteria now will in the future
> as transistor density increases.

There are cards (or at least I've heard talk of this) that do partial
TCP offload - that is, the connection setup and teardown are handled by
the operating system and only data transfer is offloaded.  I'm in
the wrong country to chase down details on this ;(

I'm given to believe that TSO (TCP segmentation offload) and RSO
(receive segment offload) are also around.  Both of these should
allow for "packets" of up to 64k to be exchanged with the NIC and
for it to take care of dealing with the MTU sized chunks.

Darren

From owner-freebsd-arch@FreeBSD.ORG Sun Dec 16 11:17:13 2007
Date: Sun, 16 Dec 2007 11:17:12 +0000 (GMT)
From: Robert Watson
To: Kip Macy
Cc: FreeBSD Current, freebsd-arch@freebsd.org
Subject: Re: pending changes for TOE support
Message-ID: <20071216111315.C49036@fledge.watson.org>
References: <20071215100351.Q70617@fledge.watson.org>

On Sat, 15 Dec 2007, Kip Macy wrote:

>> + * + tu_abort
>> + * - closes the connection and sends a RST to peer
>> + * - driver is expected to trigger an RST and detach the toepcb
>>
>> In regular TCP, the pru_abort method is only called on pending connections
>> while still in the listen queues of a listen socket.  Is this true of
>> tu_abort, or is tu_abort a more general method to be used to cancel
>> connections?  If so, probably worth commenting on that.
>
> tu_abort is called in place of tcp_output in pru_abort.

The reason I ask is that tu_abort appears to be the only interface
allowing the stack to request a TOE reset of a connection.  In regular TCP,
soabort/pru_abort/tcp_usr_abort are used only on nascent unaccepted
connections; at least one other path, used by tcpdrop(8), can lead to
connections being reset as well.  Perhaps a more general tu_reset could be
used to address this?  I'm not sure what other direct-to-reset paths exist but
a review for them may be called for.

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-arch@FreeBSD.ORG Sun Dec 16 16:48:13 2007
Date: Sun, 16 Dec 2007 08:48:12 -0800
From: "Kip Macy"
To: "Robert Watson", "Sam Leffler", freebsd-arch@freebsd.org, "FreeBSD Current"
Subject: Re: pending changes for TOE support
In-Reply-To: <20071216111315.C49036@fledge.watson.org>
References: <20071215100351.Q70617@fledge.watson.org> <20071216111315.C49036@fledge.watson.org>

Reset is fine, given that internally it just calls t3_send_reset.

On 12/16/07, Robert Watson wrote:
>
> On Sat, 15 Dec 2007, Kip Macy wrote:
>
> >> + * + tu_abort
> >> + * - closes the connection and sends a RST to peer
> >> + * - driver is expected to trigger an RST and detach the toepcb
> >>
> >> In regular TCP, the pru_abort method is only called on pending connections
> >> while still in the listen queues of a listen socket.
> >> Is this true of tu_abort, or is tu_abort a more general method to be
> >> used to cancel connections?  If so, probably worth commenting on that.
> >
> > tu_abort is called in place of tcp_output in pru_abort.
>
> The reason I ask is that tu_abort appears to be the only interface
> allowing the stack to request a TOE reset of a connection.  In regular TCP,
> soabort/pru_abort/tcp_usr_abort are used only on nascent unaccepted
> connections; at least one other path, used by tcpdrop(8), can lead to
> connections being reset as well.  Perhaps a more general tu_reset could be
> used to address this?  I'm not sure what other direct-to-reset paths exist
> but a review for them may be called for.
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge
>

From owner-freebsd-arch@FreeBSD.ORG Sun Dec 16 18:07:48 2007
Date: Sun, 16 Dec 2007 10:07:47 -0800
From: "Kip Macy"
To: "Robert Watson"
Cc: FreeBSD Current, freebsd-arch@freebsd.org
Subject: Re: pending changes for TOE support
In-Reply-To: <20071216111315.C49036@fledge.watson.org>
References: <20071215100351.Q70617@fledge.watson.org> <20071216111315.C49036@fledge.watson.org>

On Dec 16, 2007 3:17 AM, Robert Watson wrote:
>
> On Sat, 15 Dec 2007, Kip Macy wrote:
>
> >> + * + tu_abort
> >> + * - closes the connection and sends a RST to peer
> >> + * - driver is expected to trigger an RST and detach the toepcb
> >>
> >> In regular TCP, the pru_abort method is only called on pending connections
> >> while still in the listen queues of a listen socket.  Is this true of
> >> tu_abort, or is tu_abort a more general method to be used to cancel
> >> connections?  If so, probably worth commenting on that.
> >
> > tu_abort is called in place of tcp_output in pru_abort.
>
> The reason I ask is that tu_abort appears to be the only interface
> allowing the stack to request a TOE reset of a connection.
> In regular TCP, soabort/pru_abort/tcp_usr_abort are used only on nascent
> unaccepted connections; at least one other path, used by tcpdrop(8), can
> lead to connections being reset as well.  Perhaps a more general tu_reset
> could be used to address this?  I'm not sure what other direct-to-reset
> paths exist but a review for them may be called for.
>

The patch is at the same location as before:
http://www.fsmware.com/freebsd/tcp/tcp_offload.diff

I've made all the changes that I mentioned previously (documenting
calls and what fields the callers are expected to look at, etc.). In
addition I've widened the interface to listen as I know that it only
has start and stop. I've changed _abort to _reset, and I now check the
capenable field instead of flags in connect.

From owner-freebsd-arch@FreeBSD.ORG Mon Dec 17 02:19:22 2007
Date: Sun, 16 Dec 2007 19:16:41 -0700 (MST)
Message-Id: <20071216.191641.-1402074010.imp@bsdimp.com>
To: jan.grant@bristol.ac.uk
From: "M. Warner Losh"
Cc: BearPerson@gmx.net, alfred@freebsd.org, freebsd-arch@freebsd.org
Subject: Re: Code review request: small optimization to localtime.c
In-Reply-To: <20071204085502.N83722@tribble.ilrt.bris.ac.uk>
References: <20071203235929.685d3674@Karsten.Behrmanns.Kasten> <20071204014614.GE76623@elvis.mu.org> <20071204085502.N83722@tribble.ilrt.bris.ac.uk>

In message: <20071204085502.N83722@tribble.ilrt.bris.ac.uk>
            Jan Grant writes:
: On Mon, 3 Dec 2007, Alfred Perlstein wrote:
:
: [on the double-checked locking idiom]
:
: > Karsten, _typically_ (but not always) an "unlock" operation
: > requires that writes prior to the unlock be globally visible.
: >
: > This is why it works almost everywhere.
:
: Perhaps, but if you use it you should probably mark the code with
: /* XXX not guaranteed to be correct by POSIX */
:
: Double-checked locking is broken without an appropriate barrier.
: "Correctness over speed" should surely be our watchword :-)

Actually, the code I posted for review *IS* posixly correct.

It doesn't matter if the write posts or not.  If it doesn't post, then
we know the guard variable will be false still and we take out the
lock, test it, and see that it is true (since nothing would work well if
the lock/unlock pairs didn't force a consistent variable after the
lock is released).  If it is posted, we don't take the branch.

Since these variables are initialized to zero and set exactly once to
true, the above is true.

pthread_once() is more optimal, but a larger code change.
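
As a rough illustration of the pthread_once() alternative, here is a
minimal sketch. It assumes hypothetical names (tz_once, tz_init,
tz_ensure_init, tz_init_count); none of them come from localtime.c, and
the real initializer would do the actual time zone setup rather than
bump a counter.

```c
#include <pthread.h>

/* Hypothetical one-time guard; not taken from localtime.c. */
static pthread_once_t tz_once = PTHREAD_ONCE_INIT;

/* Counts how often the initializer ran, purely for demonstration. */
static int tz_init_calls;

/* Stand-in for the real one-time setup work. */
static void
tz_init(void)
{
	tz_init_calls++;
}

/*
 * Every caller goes through pthread_once(), which guarantees that
 * tz_init() runs exactly once and that its effects are visible to all
 * callers on return.  Returns 0 on success.
 */
int
tz_ensure_init(void)
{
	return (pthread_once(&tz_once, tz_init));
}

/* Accessor used only to observe the behaviour of the sketch. */
int
tz_init_count(void)
{
	return (tz_init_calls);
}
```

Calling tz_ensure_init() from any number of threads leaves the counter
at one; the cost is that every call pays for pthread_once() instead of
a plain variable test.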

Warner

From owner-freebsd-arch@FreeBSD.ORG Mon Dec 17 10:38:08 2007
Message-ID: <47664B4B.4050805@libero.it>
Date: Mon, 17 Dec 2007 11:11:23 +0100
From: Raffaele De Lorenzo
To: John E Hein
Cc: freebsd-arch@freebsd.org, "raffaele.delorenzo", net@freebsd.org, Julian Elischer, security@freebsd.org
Subject: Re: Added native socks support to libc in FreeBSD 7
In-Reply-To: <18273.25559.26231.178154@gromit.timing.com>
References: <4759022A.4020105@libero.it> <47599AE1.6060805@elischer.org> <475D2185.3090405@libero.it> <868x4291ap.fsf@ds4.des.no> <475D417D.5020303@libero.it> <18273.25559.26231.178154@gromit.timing.com>

John E Hein wrote:
> Raffaele De Lorenzo wrote at 14:39 +0100 on Dec 10, 2007:
>  > You can see in the port-tree my project "csocks" and
>  > http://csocks.altervista.org.
>
> Thanks for letting us know about your project.  Here are
> just a few comments.
>
> Why don't you provide the source code in the port?
>
> For an open source, security sensitive project such as this, I think
> that's important for users to gain confidence in it.
>
>
> As far as putting the code in the base FreeBSD, that's a pretty large
> hurdle.  The FreeBSD maintainers tend to put something in base only
> after a significant part of the user base uses it, and it has become
> the [or a] de facto preferred implementation of some industry
> standard.
>
> SOCKS is a standard, but the csocks implementation is not (yet).
> Continue to adhere to RFCs and grow your user base, and perhaps
> inclusion in FreeBSD's base system will happen organically.
>
> For things to go into the base system ...
>
>  1) The software (and its developers) need a proven track record
>     (which you can gain by getting a large user base in ports).
>     Personally, I hadn't heard about your SOCKS implementation until
>     this week.
>
>  2) A significant number of FreeBSD users can't do without it.  Now,
>     this is quite subjective.  In some sense, people can't do without
>     a web browser in this day and age, but there's no browser in the
>     FreeBSD base system.  Of course, comparing firefox to csocks is
>     not fair.  Maybe grep is a better comparison.  Web browsers are
>     monstrous.
>
>  3) There is a significant benefit to having it tightly integrated
>     with the base system (as opposed to a more loose integration in
>     the ports tree).  Wireless LAN is perhaps a good example here (and
>     for #2 for that matter).  Not everyone needs it, but when you do
>     it is good to have it in the base system where it is given
>     system level architecture love and care.
>
>  4) You need someone with commit privs to shepherd this thing along
>     _and_ agreement from lots of other people (including FreeBSD's
>     core).  Hint: the freebsd-arch list is often a good place to
>     discuss additions to the FreeBSD base.
>
>  5) Lots of other criteria (both implied and explicitly documented)
>     that I'll not go into further (everyone together: "Hear, Hear").
>
> Note that the larger the base system becomes, the harder it is to
> maintain it well as a core, well integrated body of work.  And once it
> is in the base, more people are now automatically signed on to
> maintain it (indirectly)... not just you anymore.  When someone makes
> a change to the base tcp implementation, for instance, they have to
> make sure it also doesn't break the shiny new socks code now in the
> base system as well.  This probably won't be a significant burden in
> this particular case, but it's something that people have to consider.
>
>
> As far as your specific patch to add socks support to libc ...
>
> Why not just make a patch that puts it in src/lib/libsocks?  And a
> binary in src/usr.bin/csocks (that does the LD_PRELOAD dance to
> preload libsocks)?  Why does it have to be in libc?
>
> I don't speak for the FreeBSD project, but that's a few of my thoughts
> after looking at your implementation... which I did since it tickled
> my curiosity.  Keep up the good work.
>

Hi,
many thanks for your interest.
Socks is a protocol used (in my experience) a lot in some banks for
security reasons, so it has a large impact on network security.
Recent versions of IBM AIX introduced native socks support. The IBM
socks implementation is inside the AIX libc (AIX 4 already has the
socks5 library in libc.a); in fact, no external socks libraries are
preloaded, and to socksify an application you must insert a socks rule
in a particular configuration file (the default is "/etc/socks5c.conf").
The AIX native socks mode is much appreciated by its users, so my idea
of adding native socks support inside libc in FreeBSD (which I think is
a very good, secure OS!) is motivated by these considerations.
This is a comparison of "AIX SOCKS" vs. "CSOCKS":

The IBM AIX Socks implementation:

1) doesn't support Socks V4
2) doesn't support GSS-API authentication
3) supports IPv6
4) doesn't support Socks V5 user authentication
5) doesn't support Socks over UDP
6) supports simple Socks V5 connect and bind
7) the configuration file doesn't support detailed rules (you cannot
   specify the port and the protocol to socksify... for details you can see
   http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/IBMp690/IBM/usr/share/man/info/en_US/a_doc_lib/files/aixfiles/socks5c.conf.htm)

The CSOCKS Socks implementation:

1) supports Socks V4 connect and bind
2) supports Socks V5 connect and bind
3) supports the Socks V5 simple user authentication method
4) supports Socks V5 over UDP
5) the configuration file supports detailed rules (you can see:
   http://csocks.altervista.org/doc.htm)
6) doesn't support IPv6 (under development)
7) doesn't support GSS-API authentication (under development)

The source code of the "csocks/port version" is practically the same as
the source code for the FreeBSD native support (the link is:
http://csocks.altervista.org/download/FreeBSD_libc.tar.gz).
I have now posted this discussion to the FreeBSD arch mailing list
(thanks for your advice).

Raffaele

From owner-freebsd-arch@FreeBSD.ORG Mon Dec 17 16:25:06 2007
Date: Tue, 18 Dec 2007 03:24:32 +1100 (EST)
From: Bruce Evans
To: "M. Warner Losh"
Cc: BearPerson@gmx.net, alfred@FreeBSD.org, freebsd-arch@FreeBSD.org
Subject: Re: Code review request: small optimization to localtime.c
In-Reply-To: <20071216.191641.-1402074010.imp@bsdimp.com>
Message-ID: <20071218030602.A31080@delplex.bde.org>
References: <20071203235929.685d3674@Karsten.Behrmanns.Kasten> <20071204014614.GE76623@elvis.mu.org> <20071204085502.N83722@tribble.ilrt.bris.ac.uk> <20071216.191641.-1402074010.imp@bsdimp.com>

On Sun, 16 Dec 2007, M. Warner Losh wrote:

> In message: <20071204085502.N83722@tribble.ilrt.bris.ac.uk>
>            Jan Grant writes:
> : On Mon, 3 Dec 2007, Alfred Perlstein wrote:
> :
> : [on the double-checked locking idiom]
> :
> : > Karsten, _typically_ (but not always) an "unlock" operation
> : > requires that writes prior to the unlock be globally visible.
> : >
> : > This is why it works almost everywhere.
> :
> : Perhaps, but if you use it you should probably mark the code with
> : /* XXX not guaranteed to be correct by POSIX */
> :
> : Double-checked locking is broken without an appropriate barrier.
> : "Correctness over speed" should surely be our watchword :-)
>
> Actually, the code I posted for review *IS* posixly correct.

No, bugs in it were pointed out in the review.

> It doesn't matter if the write posts or not.  If it doesn't post, then
> we know the guard variable will be false still and we take out the
> lock, test it see that it is true (since nothing would work well if
> the lock/unlock pairs didn't force a consistent variable after the
> lock is released).  If it is posted, we don't take the branch.

That works for the code that tests the variable, but not for the code
that sets it.

> Since these variables are initialized to zero and set exactly once to
> true, the above is true.

IIRC, the code that sets the variable does something like:

 	initialized = 1;
 	actually_do_the_initialization();

It would be little better to do:

 	actually_do_the_initialization();
 	initialized = 1;

since the code that tests the variable may see the changes in any order,
because it doesn't do any locking in cases where it sees the variable
at 1.  Either order would work if everything used locking, but the point
of the optimization is to not use any locking in most paths.
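
One way to make the ordering concern above concrete is sketched below.
It is only an illustration under stated assumptions: it uses C11
<stdatomic.h>, which was not available when this thread was written, and
the names init_lock, table_value and get_value are hypothetical. Only
initialized and actually_do_the_initialization() are taken from the
message. The release store publishes the initialized data before the
flag, and the acquire load on the lock-free path ensures a reader that
sees the flag also sees the data.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static pthread_mutex_t init_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool initialized;	/* guard: starts false, set once */
static int table_value;		/* hypothetical data being set up */

/* Stand-in for the real one-time initialization work. */
static void
actually_do_the_initialization(void)
{
	table_value = 42;
}

int
get_value(void)
{
	/* Fast path: acquire load pairs with the release store below. */
	if (!atomic_load_explicit(&initialized, memory_order_acquire)) {
		pthread_mutex_lock(&init_lock);
		/* Re-check under the lock; the mutex orders this read. */
		if (!atomic_load_explicit(&initialized,
		    memory_order_relaxed)) {
			/* Do the work first, then publish the flag. */
			actually_do_the_initialization();
			atomic_store_explicit(&initialized, true,
			    memory_order_release);
		}
		pthread_mutex_unlock(&init_lock);
	}
	return (table_value);
}
```

With plain variables and no barrier, neither statement order gives this
guarantee to the unlocked reader, which is the point being made above.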

Bruce

From owner-freebsd-arch@FreeBSD.ORG Mon Dec 17 19:22:10 2007
Date: Mon, 17 Dec 2007 11:22:09 -0800
From: "Kip Macy"
To: freebsd-arch@freebsd.org
Subject: RDMA support on FreeBSD

Chelsio's T3 card supports iWARP (RDMA over TCP). I've ported
OpenFabrics' kernel infrastructure for supporting RDMA to FreeBSD. Do we
think that other RDMA providers (IB or iWARP) will be interested in
supporting FreeBSD? If so, it makes sense to put it under
sys/contrib/rdma; otherwise I'll just add it as another module under
cxgb.

-Kip

From owner-freebsd-arch@FreeBSD.ORG Mon Dec 17 19:38:06 2007
Date: Mon, 17 Dec 2007 13:38:04 -0600
From: Brooks Davis
To: Kip Macy
Message-ID: <20071217193804.GA17357@lor.one-eyed-alien.net>
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Qxx1br4bt0+wmkIi"
Cc: freebsd-arch@freebsd.org
Subject: Re: RDMA support on
FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Dec 2007 19:38:06 -0000 --Qxx1br4bt0+wmkIi Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Mon, Dec 17, 2007 at 11:22:09AM -0800, Kip Macy wrote: > Chelsio's T3 card supports iWARP (RDMA over TCP). I've ported > OpenFabric's kernel infrastructure for supporting RDMA to FreeBSD. Do > we think that other RDMA providers (IB or iWARP) will be interested > in supporting FreeBSD? If so it makes sense to put it under > sys/contrib/rdma, otherwise I'll just add it as another module under > cxgb. It seems unlikely that anyone would bother with IB without RDMA. I can't see any value in mixing it with the cxgb bits. -- Brooks --Qxx1br4bt0+wmkIi Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (FreeBSD) iD8DBQFHZtAbXY6L6fI4GtQRAqXUAKC542EIRRNNz/80Jz0s+t9zSiiRsQCdF8So 7aHA5aSmtfuFCQWoIcnVIUk= =4VAc -----END PGP SIGNATURE----- --Qxx1br4bt0+wmkIi-- From owner-freebsd-arch@FreeBSD.ORG Mon Dec 17 21:32:31 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E324216A41A; Mon, 17 Dec 2007 21:32:31 +0000 (UTC) (envelope-from davidch@broadcom.com) Received: from mms1.broadcom.com (mms1.broadcom.com [216.31.210.17]) by mx1.freebsd.org (Postfix) with ESMTP id B7A7713C45B; Mon, 17 Dec 2007 21:32:31 +0000 (UTC) (envelope-from davidch@broadcom.com) Received: from [10.10.64.154] by mms1.broadcom.com with ESMTP (Broadcom SMTP Relay (Email Firewall v6.3.1)); Mon, 17 Dec 2007 13:02:01 -0800 X-Server-Uuid: 6B5CFB92-F616-4477-B110-55F967A57302 Received: by mail-irva-10.broadcom.com (Postfix, from userid 47) id C4C922AF; Mon, 17 Dec 2007 
13:02:01 -0800 (PST) Received: from mail-irva-8.broadcom.com (mail-irva-8 [10.10.64.221]) by mail-irva-10.broadcom.com (Postfix) with ESMTP id AF4102AE; Mon, 17 Dec 2007 13:02:01 -0800 (PST) Received: from mail-irva-12.broadcom.com (mail-irva-12.broadcom.com [10.10.64.146]) by mail-irva-8.broadcom.com (MOS 3.7.5a-GA) with ESMTP id GCN13681; Mon, 17 Dec 2007 13:02:01 -0800 (PST) Received: from NT-IRVA-0750.brcm.ad.broadcom.com ( nt-irva-0750.brcm.ad.broadcom.com [10.8.194.64]) by mail-irva-12.broadcom.com (Postfix) with ESMTP id 0C09969CA3; Mon, 17 Dec 2007 13:02:01 -0800 (PST) X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Date: Mon, 17 Dec 2007 13:01:59 -0800 Message-ID: <09BFF2FA5EAB4A45B6655E151BBDD903068DBCBF@NT-IRVA-0750.brcm.ad.broadcom.com> In-Reply-To: <4764EDD0.5050101@freebsd.org> Thread-Topic: pending changes for TOE support thread-index: Acg/xR4BZza3skhCQKCeixqNMbs0JgBKVPcA References: <20071215100351.Q70617@fledge.watson.org> <4764EDD0.5050101@freebsd.org> From: "David Christensen" To: darrenr@freebsd.org, "Kip Macy" X-WSS-ID: 6B783C435IG13960771-01-01 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Cc: FreeBSD Current , Robert Watson , freebsd-arch@freebsd.org Subject: RE: pending changes for TOE support X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Dec 2007 21:32:32 -0000 > >> + * The TOE API assumes that the tcp offload engine can offload the > >> + * the entire connection from set up to teardown, with > some provision > >> + * being made to allowing the software stack to handle > time wait.
If > >> + * the device does not meet these criteria, it is the > driver's responsibility > >> + * to overload the functions that it needs to in > tcp_usrreqs and make > >> + * its own calls to tcp_output if it needs to do so. > >> > >> While I'm familiar with TCP, I'm less familiar with the > scope of what cards > >> support for TOE. Do we know of any cards that are less > capable than the > >> chelsio card in this respect, or are they all sort of > on-par on that front? > >> I.e., do we think the above eventuality is likely? > > > > I don't have any way of knowing. I think it is probably safe to say > > that any vendors that don't meet that criteria now will in > the future > > as transistor density increases. > > There are cards (or at least I've heard talk of this) that do partial > TCP offload - that is the connection setup and teardown are handled by > the operating system and that only data transfer is offloaded. I'm in > the wrong country to chase down details on this ;( You are referring to Microsoft Chimney architecture which would be supported by all TOE adapters that operate under Windows (our NetXtreme II controllers included). There are IP issues related to a chimney style implementation that would likely preclude their use under FreeBSD including passing TCP state information between the host OS and the controller among them.
Dave From owner-freebsd-arch@FreeBSD.ORG Mon Dec 17 22:23:39 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9CD1B16A41B for ; Mon, 17 Dec 2007 22:23:39 +0000 (UTC) (envelope-from kip.macy@gmail.com) Received: from nz-out-0506.google.com (nz-out-0506.google.com [64.233.162.227]) by mx1.freebsd.org (Postfix) with ESMTP id 6594613C459 for ; Mon, 17 Dec 2007 22:23:39 +0000 (UTC) (envelope-from kip.macy@gmail.com) Received: by nz-out-0506.google.com with SMTP id l8so945794nzf.13 for ; Mon, 17 Dec 2007 14:23:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=MNvjVO7pECBldW5Gv82OgRTJ1cRBwOHvLlozZUFeIIw=; b=FjBqxjuRrtNnLeWhQaLmqFXsQvUb7OmTBRCo0RMamtYvByc21EVx5CKa55O+8IkHIUZRO+puCuwOuySsoC+aaBv6WosPV1PEA4+SPWLEyG+dYkdNo42Wx67m0cNw3p7njtBxoMIGW1O1orGjs71iM7S3k+kbeYSJF/UH2q8DhC0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=EL6dcL+g5Fee8TcpVUm0VzdBp6GIDAH5jZpMlJTWI18sXo+0BLHlqXH05BkyfgxGtINdk0uo7cgJp8FGM5BLH3hdOXKmLBQLXMSbIhmd0P/73Ul4AfH67LMp/BcL3DcXUFVt7ayuw8a9Xp5/gfDpPTx5fjTwTrGvmXLDN/MbbuQ= Received: by 10.114.61.1 with SMTP id j1mr3378681waa.62.1197930217597; Mon, 17 Dec 2007 14:23:37 -0800 (PST) Received: by 10.114.255.11 with HTTP; Mon, 17 Dec 2007 14:23:37 -0800 (PST) Message-ID: Date: Mon, 17 Dec 2007 14:23:37 -0800 From: "Kip Macy" To: "Brooks Davis" In-Reply-To: <20071217193804.GA17357@lor.one-eyed-alien.net> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: 
<20071217193804.GA17357@lor.one-eyed-alien.net> Cc: freebsd-arch@freebsd.org Subject: Re: RDMA support on FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 17 Dec 2007 22:23:39 -0000 On Dec 17, 2007 11:38 AM, Brooks Davis wrote: > > On Mon, Dec 17, 2007 at 11:22:09AM -0800, Kip Macy wrote: > > Chelsio's T3 card supports iWARP (RDMA over TCP). I've ported > > OpenFabric's kernel infrastructure for supporting RDMA to FreeBSD. Do > > we think that other RDMA providers (IB or iWARP) will be interested > > in supporting FreeBSD? If so it makes sense to put it under > > sys/contrib/rdma, otherwise I'll just add it as another module under > > cxgb. > > It seems unlikely that anyone would bother with IB without RDMA. I > can't see any value in mixing it with the cxgb bits. I'll give a heads up of my intent to check it in to contrib when it's stable enough for wider use. It may be a couple of weeks. I just asked in advance because sometimes these discussions can drag on.
-Kip From owner-freebsd-arch@FreeBSD.ORG Mon Dec 17 22:42:25 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A8A4716A418; Mon, 17 Dec 2007 22:42:25 +0000 (UTC) (envelope-from darrenr@freebsd.org) Received: from out4.smtp.messagingengine.com (out4.smtp.messagingengine.com [66.111.4.28]) by mx1.freebsd.org (Postfix) with ESMTP id 7170913C442; Mon, 17 Dec 2007 22:42:25 +0000 (UTC) (envelope-from darrenr@freebsd.org) Received: from compute1.internal (compute1.internal [10.202.2.41]) by out1.messagingengine.com (Postfix) with ESMTP id 01D2D7F293; Mon, 17 Dec 2007 17:42:25 -0500 (EST) Received: from heartbeat2.messagingengine.com ([10.202.2.161]) by compute1.internal (MEProxy); Mon, 17 Dec 2007 17:42:25 -0500 X-Sasl-enc: kvZEYk3moJN9Mj9k+t2rCiATqFn0kFe0ERmctZA1g/71 1197931344 Received: from [192.168.1.100] (dsl-202-45-110-141-static.VIC.netspace.net.au [202.45.110.141]) by mail.messagingengine.com (Postfix) with ESMTP id 9451F2A8FB; Mon, 17 Dec 2007 17:42:22 -0500 (EST) Message-ID: <4766FB4A.8020506@freebsd.org> Date: Tue, 18 Dec 2007 09:42:18 +1100 From: Darren Reed Organization: FreeBSD User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: David Christensen References: <20071215100351.Q70617@fledge.watson.org> <4764EDD0.5050101@freebsd.org> <09BFF2FA5EAB4A45B6655E151BBDD903068DBCBF@NT-IRVA-0750.brcm.ad.broadcom.com> In-Reply-To: <09BFF2FA5EAB4A45B6655E151BBDD903068DBCBF@NT-IRVA-0750.brcm.ad.broadcom.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Kip Macy , FreeBSD Current , Robert Watson , freebsd-arch@freebsd.org Subject: Re: pending changes for TOE support X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: darrenr@freebsd.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: 
, X-List-Received-Date: Mon, 17 Dec 2007 22:42:25 -0000 David Christensen wrote: >>>> + * The TOE API assumes that the tcp offload engine can offload the >>>> + * the entire connection from set up to teardown, with >> some provision >>>> + * being made to allowing the software stack to handle >> time wait. If >>>> + * the device does not meet these criteria, it is the >> driver's responsibility >>>> + * to overload the functions that it needs to in >> tcp_usrreqs and make >>>> + * its own calls to tcp_output if it needs to do so. >>>> >>>> While I'm familiar with TCP, I'm less familiar with the >> scope of what cards >>>> support for TOE. Do we know of any cards that are less >> capable than the >>>> chelsio card in this respect, or are they all sort of >> on-par on that front? >>>> I.e., do we think the above eventuality is likely? >>> I don't have any way of knowing. I think it is probably safe to say >>> that any vendors that don't meet that criteria now will in >> the future >>> as transistor density increases. >> There are cards (or at least I've heard talk of this) that do partial >> TCP offload - that is the connection setup and teardown are handled by >> the operating system and that only data transfer is offloaded. I'm in >> the wrong country to chase down details on this ;( > > You are referring to Microsoft Chimney architecture which would be > supported by all TOE adapters that operate under Windows (our > NetXtreme II controllers included). There are IP issues related to > a chimney style implementation that would likely preclude their use > under FreeBSD including passing TCP state information between the > host OS and the controller among them. No, I'm not referring to anything Microsoft. They aren't the only operating system vendor that's working in this space. It would be preferable if FreeBSD could just see what the raw hardware is capable of and decide for itself what kind of architecture makes sense. 
Darren From owner-freebsd-arch@FreeBSD.ORG Tue Dec 18 09:37:42 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EA9A216A41A for ; Tue, 18 Dec 2007 09:37:42 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (vlk.vlakno.cz [62.168.28.247]) by mx1.freebsd.org (Postfix) with ESMTP id 9CE0213C458 for ; Tue, 18 Dec 2007 09:37:42 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 1A54D66AB8C for ; Tue, 18 Dec 2007 10:22:24 +0100 (CET) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (vlk.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id SHFHyApYeE2z for ; Tue, 18 Dec 2007 10:22:23 +0100 (CET) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 0144B66AB8A for ; Tue, 18 Dec 2007 10:22:22 +0100 (CET) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.13.8/8.13.8/Submit) id lBI9MM4u009714 for arch@freebsd.org; Tue, 18 Dec 2007 10:22:22 +0100 (CET) (envelope-from rdivacky) Date: Tue, 18 Dec 2007 10:22:22 +0100 From: Roman Divacky To: arch@freebsd.org Message-ID: <20071218092222.GA9695@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: Subject: final decision about *at syscalls X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Dec 2007 09:37:43 -0000 Dear arch@ Over this summer I was working (among other things) on *at family of syscalls kindly sponsored by Google (in their Summer of Code). The resulting patch is almost finished but I need to decide one design question. 
If you are not interested in *at/namei feel free to skip this mail.

The *at syscalls are a thread-oriented extension to the basic file syscalls (think of open(), fstat(), etc.) adding the possibility to specify where the search for a relative path should start. Imagine that we have /tmp/foo/bar and CWD is set to "/tmp/", and the process has opened "foo" as dirfd. With the ordinary open() syscall you have to do either

    chdir("/tmp/foo"); open("./bar");

or

    open("/tmp/foo/bar");

The first approach is problematic because it changes CWD for all threads in the process; the second is prone to race conditions, as some of the components of the path can change in parallel with the "open".

So POSIX introduced a new API, called "Extended API set part 2, ISBN: 1-931624-67-4" (at least this was the latest when I looked last time), which solves that by introducing "*at" syscalls that take an fd of a previously opened directory, which is used instead of CWD for searching a relative path, i.e. the previous example becomes

    dirfd = open("/tmp/foo"); openat(dirfd, "bar");

I implemented the whole API as native FreeBSD syscalls + in the linuxulator emulation layer.

Here's the problem: There are two approaches to the name translation from the "file descriptor" to the "vnode".

1) we can do it in the kern_fooat() syscall and pass namei() the resulting vnode
2) we can pass namei() the file descriptor and do the translation there

PROs of #1:
o namei() does not need to know about the curthread, you can use this *at ability for different purposes, it's cleaner (imho)

PROs of #2:
o raceless implementation
o no code duplication

CONs of #1:
o some very small code duplication (the translation is done in every kern_fooat() function)
o there is a race between the name translation and the actual use of the result of the translation that needs to be handled; the "path_to_file" string is copied to the kernel space twice, hence a race

CONs of #2:
o namei is made thread-dependent

Please tell me what approach you like more.
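For readers unfamiliar with the userland side of the API, here is a minimal sketch of the /tmp/foo/bar example using openat() as specified by POSIX. It is an illustration only and says nothing about which of the two kernel approaches is used underneath; open_relative() and demo_roundtrip() are hypothetical helper names, and the scratch directory under /tmp is an assumption of the demo.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Opens name relative to the directory dirpath without changing the
 * process-wide CWD. Returns a file descriptor, or -1 on error.
 */
int
open_relative(const char *dirpath, const char *name, int flags, mode_t mode)
{
    int dirfd, fd;

    assert(dirpath != NULL && name != NULL);
    dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
    if (dirfd == -1)
        return (-1);
    /* The lookup of name starts at dirfd, not at the CWD. */
    fd = openat(dirfd, name, flags, mode);
    close(dirfd);
    return (fd);
}

/* Round trip in a scratch directory; returns 0 on success, -1 otherwise. */
int
demo_roundtrip(void)
{
    char dir[] = "/tmp/atdemoXXXXXX";
    char path[64];
    char buf[2];
    int fd;

    if (mkdtemp(dir) == NULL)
        return (-1);
    /* Create "bar" inside the scratch directory and write to it. */
    fd = open_relative(dir, "bar", O_CREAT | O_WRONLY, 0600);
    if (fd == -1 || write(fd, "hi", 2) != 2)
        return (-1);
    close(fd);
    /* Reopen it relative to the directory and read the data back. */
    fd = open_relative(dir, "bar", O_RDONLY, 0);
    if (fd == -1 || read(fd, buf, 2) != 2 || memcmp(buf, "hi", 2) != 0)
        return (-1);
    close(fd);
    /* Clean up the scratch file and directory. */
    snprintf(path, sizeof(path), "%s/bar", dir);
    unlink(path);
    rmdir(dir);
    return (0);
}
```

Note the argument order: the directory descriptor comes first, then the relative path, then the usual open() flags. Because the CWD is never touched, other threads in the process are unaffected.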
I personally favour #1 because I don't like namei() being thread-dependent; Kostik Belousov prefers #2. I'd like to change the current patch to whatever you decide is the best (currently I implement #1) and finally ship it for committing. Thank you, Roman Divacky From owner-freebsd-arch@FreeBSD.ORG Tue Dec 18 12:10:47 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E4E6B16A41A; Tue, 18 Dec 2007 12:10:46 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 8C59513C4CC; Tue, 18 Dec 2007 12:10:46 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 4BC9347E89; Tue, 18 Dec 2007 07:10:46 -0500 (EST) Date: Tue, 18 Dec 2007 12:10:46 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: arch@FreeBSD.org Message-ID: <20071218120359.E15521@fledge.watson.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: current@FreeBSD.org Subject: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Dec 2007 12:10:47 -0000 Dear all: I've been hacking on-and-off for a while on a side project to improve our kernel debugging facilities.
Primarily, my concern has been to address three problems: - The complications of employing kernel core dumps for debugging, including the large size of dumps making them unwieldy to distribute or store for any extended period (even with minidumps), the requirement to have relatively synchronized kernel source in order to use the dumps, the need to have a kernel with debugging symbols, and the problems with fsck causing sufficient swap use to invalidate dumps before they can be extracted. - The decreasing likelihood that notebooks will ship with serial ports that can be used for interactive debugging using DDB. Making end-users type in stack traces is cruel, photos are a pain, and X11 rules out both. - The fact that a great many problems are most easily diagnosed using utility routines present in DDB, but not as easily using kgdb for offline analysis. I find that for many bugs I analyze, simply looking at the DDB output is sufficient to identify the source of the problem. An idea I punted around a bit at BSDCan earlier this year (or perhaps it was at EuroBSDCon the previous year) was an idea of a "textdump" -- that is, a new type of kernel dump based on capturing automatically extracted debugging information generated by DDB. The result would be an ASCII text file that could be filed as a bug report, perhaps even automatically. To this end, I have implemented three new facilities for use with DDB: (1) DDB output capture. The output of DDB is stored in a memory buffer, and can be extracted using a sysctl or textdumps (see below). This can be turned on and off, both for use manually ("I'll want this later, but not that") and as part of scripts (see below). (2) DDB scripting. A limited number of named scripts can be defined to run a series of DDB commands. No loops, etc, just simple command lists. These can be caused to run automatically on entering DDB for various scenarios, including WITNESS violations and kernel panics. 
They can also be run by hand in order to save a bit of typing if you use DDB in a repetitive way (as I do).

(3) Textdumps. A new dump type that stores a series of data files containing various pieces of information, including the DDB capture buffer, kernel message buffer, kernel configuration (if compiled into the kernel), panic message, and kernel version string. These are stored in the ustar format inside the dump partition (aligned to the end) so can be easily extended, and savecore(8) requires almost no new logic to deal with them (it just drops numbered tar files in /var/crash). This makes it straightforward to extend the textdump format to include new types of information and avoids the issue of how to safely simultaneously represent information in many different formats in the same file.

These are pretty flexible tools, and you can imagine doing the following sorts of things:

- Setting the kdb.enter.panic script to automatically turn on output capture, do full backtraces of all threads, show open file information, dump UMA stats, and save it all to a textdump and then reboot.

- Setting the kdb.enter.witness script to show lock information, generate a coredump, and reboot. Or, just to automatically do "show alllocks" and drop to the DDB prompt.

- Adding a flag to rc.conf to automatically submit textdumps via e-mail to a specific address, perhaps including GNATS or an automated bug system. These could be unpacked and automatically analyzed, and due to the compact size, kept for long-term trend analysis or to identify when a problem started occurring.
I've produced an initial snapshot of the above, which can be found here: http://www.watson.org/~robert/freebsd/20071218-ddb.tgz This adds three files to DDB, patches quite a few kernel files (to pass more information into KDB about why it's being entered, in order to trigger the right script), enhancements to savecore(8) to know how to extract textdumps, adds a ddb(8) command line tool so that userspace can manage DDB scripts from outside the debugger, extensions to the ddb(4) man page, and a new textdump(4) man page. There are a number of known limitations; I've tried to document them at the top of the pertinent files where I am aware of them. I also regret to say that to date I've been able to test only on i386, and not other platforms. I'd welcome any feedback -- I'd like to get these changes into CVS in the next week or two. Robert N M Watson Computer Laboratory University of Cambridge From owner-freebsd-arch@FreeBSD.ORG Tue Dec 18 20:44:25 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 22AA716A469; Tue, 18 Dec 2007 20:44:25 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id D2BB413C4E1; Tue, 18 Dec 2007 20:44:24 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 921CC4AAE7; Tue, 18 Dec 2007 15:44:24 -0500 (EST) Date: Tue, 18 Dec 2007 20:44:24 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Maxim Sobolev In-Reply-To: <47682ED1.7000702@FreeBSD.org> Message-ID: <20071218204401.E33011@fledge.watson.org> References: <20071218120359.E15521@fledge.watson.org> <47682ED1.7000702@FreeBSD.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@FreeBSD.org, current@FreeBSD.org 
Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Dec 2007 20:44:25 -0000 On Tue, 18 Dec 2007, Maxim Sobolev wrote: > Robert Watson wrote: >> buffer, kernel message buffer, kernel configuration (if compiled into >> the kernel), panic message, and kernel version string. These are > > Just a sidenote - maybe as part of this change it makes sense to make > compiling configuration into a kernel opt-out, not opt-in? We are in 21st > century, nobody really cares about saving few kilobytes of kernel memory > anymore. I'd certainly be fine with it being added to GENERIC on our various architectures. Robert N M Watson Computer Laboratory University of Cambridge From owner-freebsd-arch@FreeBSD.ORG Tue Dec 18 21:08:11 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9046C16A420; Tue, 18 Dec 2007 21:08:11 +0000 (UTC) (envelope-from sobomax@FreeBSD.org) Received: from sippysoft.com (gk.360sip.com [72.236.70.226]) by mx1.freebsd.org (Postfix) with ESMTP id 265DF13C468; Tue, 18 Dec 2007 21:08:11 +0000 (UTC) (envelope-from sobomax@FreeBSD.org) Received: from [192.168.0.3] ([204.244.149.125]) (authenticated bits=0) by sippysoft.com (8.13.8/8.13.8) with ESMTP id lBIKYjS8081433 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 18 Dec 2007 12:34:46 -0800 (PST) (envelope-from sobomax@FreeBSD.org) Message-ID: <47682ED1.7000702@FreeBSD.org> Date: Tue, 18 Dec 2007 12:34:25 -0800 From: Maxim Sobolev Organization: Sippy Software, Inc. 
User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: Robert Watson References: <20071218120359.E15521@fledge.watson.org> In-Reply-To: <20071218120359.E15521@fledge.watson.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@FreeBSD.org, current@FreeBSD.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 18 Dec 2007 21:08:11 -0000 Robert Watson wrote: > buffer, kernel message buffer, kernel configuration (if compiled into > the kernel), panic message, and kernel version string. These are Just a sidenote - maybe as part of this change it makes sense to make compiling configuration into a kernel opt-out, not opt-in? We are in 21st century, nobody really cares about saving few kilobytes of kernel memory anymore. 
-Maxim From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 06:54:13 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C671416A419 for ; Wed, 19 Dec 2007 06:54:13 +0000 (UTC) (envelope-from gnn@neville-neil.com) Received: from mrout2-b.corp.dcn.yahoo.com (mrout2-b.corp.dcn.yahoo.com [216.109.112.28]) by mx1.freebsd.org (Postfix) with ESMTP id 864AF13C458 for ; Wed, 19 Dec 2007 06:54:13 +0000 (UTC) (envelope-from gnn@neville-neil.com) Received: from minion.local.neville-neil.com (proxy7.corp.yahoo.com [216.145.48.98]) by mrout2-b.corp.dcn.yahoo.com (8.13.6/8.13.6/y.out) with ESMTP id lBJ6hdih040018; Tue, 18 Dec 2007 22:43:40 -0800 (PST) Date: Wed, 19 Dec 2007 14:43:35 +0800 Message-ID: From: gnn@freebsd.org To: Brooks Davis In-Reply-To: <20071217193804.GA17357@lor.one-eyed-alien.net> References: <20071217193804.GA17357@lor.one-eyed-alien.net> User-Agent: Wanderlust/2.15.5 (Almost Unreal) SEMI/1.14.6 (Maruoka) FLIM/1.14.8 (=?ISO-8859-4?Q?Shij=F2?=) APEL/10.7 Emacs/22.1.50 (i386-apple-darwin8.10.1) MULE/5.0 (SAKAKI) MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka") Content-Type: text/plain; charset=US-ASCII Cc: Kip Macy , freebsd-arch@freebsd.org Subject: Re: RDMA support on FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 06:54:13 -0000 At Mon, 17 Dec 2007 13:38:04 -0600, Brooks Davis wrote: > > [1 ] > On Mon, Dec 17, 2007 at 11:22:09AM -0800, Kip Macy wrote: > > Chelsio's T3 card supports iWARP (RDMA over TCP). I've ported > > OpenFabric's kernel infrastructure for supporting RDMA to FreeBSD. Do > > we think that other RDMA providers (IB or iWARP) will be interested > > in supporting FreeBSD? 
If so it makes sense to put it under > > sys/contrib/rdma, otherwise I'll just add it as another module under > > cxgb. > > It seems unlikely that anyone would bother with IB without RDMA. I > can't see any value in mixing it with the cxgb bits. > I know at least one group that is looking at IB and yeah, the RDMA should be generally consumable. BrooksACK += 1. Later GEorge From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 07:13:00 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 61DD316A418 for ; Wed, 19 Dec 2007 07:13:00 +0000 (UTC) (envelope-from kip.macy@gmail.com) Received: from wa-out-1112.google.com (wa-out-1112.google.com [209.85.146.182]) by mx1.freebsd.org (Postfix) with ESMTP id 2070D13C447 for ; Wed, 19 Dec 2007 07:13:00 +0000 (UTC) (envelope-from kip.macy@gmail.com) Received: by wa-out-1112.google.com with SMTP id k17so4701677waf.3 for ; Tue, 18 Dec 2007 23:12:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; bh=Kg5KPo3iRQwB2GiFxmRldy5d/h6ff++dVOc86OOlwFE=; b=M4R6Ldl494mOC7gl2q2tnWzzoPnBBRc1JIypQZNq1R6BE9rG1lwtRJZ74tiLW2eqlI9JDBDf9l0C5tWvoPXevWNdatUIBJVO5Lo8LQ0mgOhFaNNYD04Ow5n9XMFEvwtgH9XPeOSZu3hqCEy629aljea25Q+qZMBRmN69WPb44eg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references; b=qKH162LKCZ5zZ92mxsxDY70zRCnAhHZUT194vmFLk+IYbanzWNWhO4KqtQXBn/Fg/w3XnsUV7CZgaNaOOtEHy/JzdqEt7uk2rDR/prUCMu6zLpnUkt9EWkTjrtiRbe9sdUjDcAyd2RzX0Jk82jmsF1EmVP1WhpMdgTX89jC/SLk= Received: by 10.114.107.19 with SMTP id f19mr3052317wac.113.1198048379667; Tue, 18 Dec 2007 23:12:59 -0800 (PST) Received: by 10.114.255.11 
with HTTP; Tue, 18 Dec 2007 23:12:59 -0800 (PST) Message-ID: Date: Tue, 18 Dec 2007 23:12:59 -0800 From: "Kip Macy" To: "Hidetoshi Shimokawa" , "gnn@freebsd.org" , "Brooks Davis" , freebsd-arch@freebsd.org In-Reply-To: <626eb4530712182307o63ed8d0cjc985f4404e143c1b@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <20071217193804.GA17357@lor.one-eyed-alien.net> <626eb4530712182307o63ed8d0cjc985f4404e143c1b@mail.gmail.com> Cc: Subject: Re: RDMA support on FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 07:13:00 -0000 Could you take a look at the openfabric docs and see how well that meshes with IEEE1394? I think the lack of privilege checks might make it a bad fit. However, it *might* work for users who don't care about security. -Kip On 12/18/07, Hidetoshi Shimokawa wrote: > I'm also interested in RDMA support. > IEEE1394 OHCI has a RDMA-like feature. > > On 12/19/07, gnn@freebsd.org wrote: > > At Mon, 17 Dec 2007 13:38:04 -0600, > > Brooks Davis wrote: > > > > > > [1 ] > > > On Mon, Dec 17, 2007 at 11:22:09AM -0800, Kip Macy wrote: > > > > Chelsio's T3 card supports iWARP (RDMA over TCP). I've ported > > > > OpenFabric's kernel infrastructure for supporting RDMA to FreeBSD. Do > > > > we think that other RDMA providers (IB or iWARP) will be interested > > > > in supporting FreeBSD? If so it makes sense to put it under > > > > sys/contrib/rdma, otherwise I'll just add it as another module under > > > > cxgb. > > > > > > It seems unlikely that anyone would bother with IB without RDMA. I > > > can't see any value in mixing it with the cxgb bits. > > > > > > > I know at least one group that is looking at IB and yeah, the RDMA > > should be generally consumable. 
BrooksACK += 1. > > > > Later > > GEorge > > _______________________________________________ > > freebsd-arch@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > > > > > > > -- > /\ Hidetoshi Shimokawa > \/ simokawa@FreeBSD.ORG > From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 07:24:41 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D5A1716A41B for ; Wed, 19 Dec 2007 07:24:41 +0000 (UTC) (envelope-from freebsd@gm.nunu.org) Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.181]) by mx1.freebsd.org (Postfix) with ESMTP id F31BC13C46B for ; Wed, 19 Dec 2007 07:24:40 +0000 (UTC) (envelope-from freebsd@gm.nunu.org) Received: by py-out-1112.google.com with SMTP id u77so5007470pyb.3 for ; Tue, 18 Dec 2007 23:24:40 -0800 (PST) Received: by 10.142.187.2 with SMTP id k2mr481993wff.51.1198049076710; Tue, 18 Dec 2007 23:24:36 -0800 (PST) Received: by 10.142.224.12 with HTTP; Tue, 18 Dec 2007 23:24:36 -0800 (PST) Message-ID: <626eb4530712182324j6137abb1l9d705fe43bd90e29@mail.gmail.com> Date: Wed, 19 Dec 2007 16:24:36 +0900 From: "Hidetoshi Shimokawa" Sender: freebsd@gm.nunu.org To: "Kip Macy" In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <20071217193804.GA17357@lor.one-eyed-alien.net> <626eb4530712182307o63ed8d0cjc985f4404e143c1b@mail.gmail.com> X-Google-Sender-Auth: f1bcf7a795d833e8 Cc: "gnn@freebsd.org" , Brooks Davis , freebsd-arch@freebsd.org Subject: Re: RDMA support on FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 
07:24:41 -0000 I don't think IEEE1394 has enough privilege checks. Anyway, thanks for a pointer of docs. I'll check it later. On 12/19/07, Kip Macy wrote: > Could you take a look at the openfabric docs and see how well that > meshes with IEEE1394? > > I think the lack of privilege checks might make it a bad fit. However, > it *might* work for users who don't care about security. > > -Kip > > > On 12/18/07, Hidetoshi Shimokawa wrote: > > I'm also interested in RDMA support. > > IEEE1394 OHCI has a RDMA-like feature. > > > > On 12/19/07, gnn@freebsd.org wrote: > > > At Mon, 17 Dec 2007 13:38:04 -0600, > > > Brooks Davis wrote: > > > > > > > > [1 ] > > > > On Mon, Dec 17, 2007 at 11:22:09AM -0800, Kip Macy wrote: > > > > > Chelsio's T3 card supports iWARP (RDMA over TCP). I've ported > > > > > OpenFabric's kernel infrastructure for supporting RDMA to FreeBSD. Do > > > > > we think that other RDMA providers (IB or iWARP) will be interested > > > > > in supporting FreeBSD? If so it makes sense to put it under > > > > > sys/contrib/rdma, otherwise I'll just add it as another module under > > > > > cxgb. > > > > > > > > It seems unlikely that anyone would bother with IB without RDMA. I > > > > can't see any value in mixing it with the cxgb bits. > > > > > > > > > > I know at least one group that is looking at IB and yeah, the RDMA > > > should be generally consumable. BrooksACK += 1. 
> > > > > > Later > > > GEorge > > > _______________________________________________ > > > freebsd-arch@freebsd.org mailing list > > > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > > > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > > > > > > > > > > > > -- > > /\ Hidetoshi Shimokawa > > \/ simokawa@FreeBSD.ORG > > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > > -- /\ Hidetoshi Shimokawa \/ simokawa@FreeBSD.ORG From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 07:31:43 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6F93316A41B for ; Wed, 19 Dec 2007 07:31:43 +0000 (UTC) (envelope-from freebsd@gm.nunu.org) Received: from wa-out-1112.google.com (wa-out-1112.google.com [209.85.146.177]) by mx1.freebsd.org (Postfix) with ESMTP id 4654A13C458 for ; Wed, 19 Dec 2007 07:31:43 +0000 (UTC) (envelope-from freebsd@gm.nunu.org) Received: by wa-out-1112.google.com with SMTP id k17so4711523waf.3 for ; Tue, 18 Dec 2007 23:31:43 -0800 (PST) Received: by 10.142.178.13 with SMTP id a13mr2516286wff.24.1198048032633; Tue, 18 Dec 2007 23:07:12 -0800 (PST) Received: by 10.142.224.12 with HTTP; Tue, 18 Dec 2007 23:07:12 -0800 (PST) Message-ID: <626eb4530712182307o63ed8d0cjc985f4404e143c1b@mail.gmail.com> Date: Wed, 19 Dec 2007 16:07:12 +0900 From: "Hidetoshi Shimokawa" Sender: freebsd@gm.nunu.org To: "gnn@freebsd.org" In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <20071217193804.GA17357@lor.one-eyed-alien.net> X-Google-Sender-Auth: 03b0a1b4460f875f Cc: Kip Macy , Brooks Davis , freebsd-arch@freebsd.org Subject: Re: RDMA support on FreeBSD X-BeenThere: 
freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 07:31:43 -0000 I'm also interested in RDMA support. IEEE1394 OHCI has a RDMA-like feature. On 12/19/07, gnn@freebsd.org wrote: > At Mon, 17 Dec 2007 13:38:04 -0600, > Brooks Davis wrote: > > > > [1 ] > > On Mon, Dec 17, 2007 at 11:22:09AM -0800, Kip Macy wrote: > > > Chelsio's T3 card supports iWARP (RDMA over TCP). I've ported > > > OpenFabric's kernel infrastructure for supporting RDMA to FreeBSD. Do > > > we think that other RDMA providers (IB or iWARP) will be interested > > > in supporting FreeBSD? If so it makes sense to put it under > > > sys/contrib/rdma, otherwise I'll just add it as another module under > > > cxgb. > > > > It seems unlikely that anyone would bother with IB without RDMA. I > > can't see any value in mixing it with the cxgb bits. > > > > I know at least one group that is looking at IB and yeah, the RDMA > should be generally consumable. BrooksACK += 1. 
> > Later > GEorge > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > > -- /\ Hidetoshi Shimokawa \/ simokawa@FreeBSD.ORG From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 10:45:56 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DB79616A41B; Wed, 19 Dec 2007 10:45:56 +0000 (UTC) (envelope-from des@des.no) Received: from tim.des.no (tim.des.no [194.63.250.121]) by mx1.freebsd.org (Postfix) with ESMTP id 8705D13C45D; Wed, 19 Dec 2007 10:45:55 +0000 (UTC) (envelope-from des@des.no) Received: from tim.des.no (localhost [127.0.0.1]) by spam.des.no (Postfix) with ESMTP id 63C9920AF; Wed, 19 Dec 2007 11:45:46 +0100 (CET) X-Spam-Tests: AWL X-Spam-Learn: disabled X-Spam-Score: -0.1/3.0 X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on tim.des.no Received: from ds4.des.no (des.no [80.203.243.180]) by smtp.des.no (Postfix) with ESMTP id D133420BB; Wed, 19 Dec 2007 11:45:45 +0100 (CET) Received: by ds4.des.no (Postfix, from userid 1001) id 0224F84499; Wed, 19 Dec 2007 11:45:49 +0100 (CET) From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= To: Robert Watson References: <20071218120359.E15521@fledge.watson.org> <47682ED1.7000702@FreeBSD.org> <20071218204401.E33011@fledge.watson.org> Date: Wed, 19 Dec 2007 11:45:49 +0100 In-Reply-To: <20071218204401.E33011@fledge.watson.org> (Robert Watson's message of "Tue\, 18 Dec 2007 20\:44\:24 +0000 \(GMT\)") Message-ID: <86ir2vklnm.fsf@ds4.des.no> User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/22.1 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: Maxim Sobolev , current@FreeBSD.org, arch@FreeBSD.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: 
freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 10:45:57 -0000

Robert Watson writes:
> I'd certainly be fine with it being added to GENERIC on our various
> architectures.

s/GENERIC/DEFAULTS/

DES
-- 
Dag-Erling Smørgrav - des@des.no

From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 10:59:52 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4404016A419; Wed, 19 Dec 2007 10:59:52 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 193EC13C4CC; Wed, 19 Dec 2007 10:59:51 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 0763047831; Wed, 19 Dec 2007 05:53:59 -0500 (EST) Date: Wed, 19 Dec 2007 10:53:58 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Dag-Erling Smørgrav In-Reply-To: <86ir2vklnm.fsf@ds4.des.no> Message-ID: <20071219105229.T95322@fledge.watson.org> References: <20071218120359.E15521@fledge.watson.org> <47682ED1.7000702@FreeBSD.org> <20071218204401.E33011@fledge.watson.org> <86ir2vklnm.fsf@ds4.des.no> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="621616949-1563880115-1198061638=:95322" Cc: Maxim Sobolev , current@FreeBSD.org, arch@FreeBSD.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 10:59:52 -0000

This message is in MIME format.
The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools.
--621616949-1563880115-1198061638=:95322 Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE

On Wed, 19 Dec 2007, Dag-Erling Smørgrav wrote:

> Robert Watson writes:
>> I'd certainly be fine with it being added to GENERIC on our various
>> architectures.
>
> s/GENERIC/DEFAULTS/

At the risk of creating a bikeshed, I thought we had a consensus that DEFAULTS should never be used :-).

What I'd love to get, as a bikeshed alternative, would be feedback on the usability of DDB scripting, output capture, and textdumps... I know there are a few nits, such as the fact that "continue" at the end of an automatically-run script for a KDB entry event still results in sitting at a DDB prompt, for example.

Robert N M Watson
Computer Laboratory
University of Cambridge
--621616949-1563880115-1198061638=:95322--

From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 13:09:14 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6F12716A420 for ; Wed, 19 Dec 2007 13:09:14 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 27DF113C4F6 for ; Wed, 19 Dec 2007 13:09:14 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 859A247E50; Wed, 19 Dec 2007 08:09:13 -0500 (EST) Date: Wed, 19 Dec 2007 13:09:13 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: net@FreeBSD.org Message-ID: <20071219123305.Y95322@fledge.watson.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: James Healy , arch@FreeBSD.org, Lawrence Stewart Subject:
Coordinating TCP projects X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 13:09:14 -0000

Dear all,

It is rapidly becoming clear that quite a few of us have Big Plans for the TCP implementation over the next 12-18 months. It's important that we get the plans out on the table now so that everyone working on these projects is aware of the larger context. This will encourage collaboration, but also allow us to manage the risks inevitably associated with having several simultaneous projects going on in a very complex software base.

With that in mind, here are the large projects I'm currently aware of:

Project                   Flag Wavers        Status
-------                   -----------        ------
TCP offload               Kip Macy           Moving to CVS and under review
                                             and testing; one supporting
                                             device driver.
TCP congestion control    Sam Leffler,       At least one prototype
                          Rui Paulo,         implementation, to move to p4
                          Andre Oppermann,
                          Kip Macy,
                          Lawrence Stewart,
                          James Healy
TCP overhaul              Andre Oppermann    Glimmer in eye, to move to p4.
TCP lock granularity/     Robert Watson      Glimmer in eye, to occur in
increased parallelism                        p4.
TCP timer unification     Andre Oppermann,   Previously committed, and to
                          Mike Silbersack    be reintroduced via p4.
Monitoring ABI cleanup    Robert Watson      Glimmer in eye, to occur in p4.

Looking at the above, it sounds like a massive amount of work taking place, so we will need to coordinate carefully. I'd like to encourage people to avoid creating unnecessary dependencies between changes, and to be especially careful in coordinating potentially MFCable changes.

There are (at least) two conflicting scheduling desires in play here:

- A desire to merge MFCable changes early, so that they aren't entangled with un-mergeable changes.
  This will simplify merging and also maximize the extent to which testing in HEAD will apply to them once merged to RELENG_7.

- A desire to merge large-scale infrastructural changes early so that they see the greatest exposure, and so that they can be introduced incrementally over a longer period of time to shake each out.

Both of these are valid perspectives, and will need to be balanced.

I have a few questions, then, for people involved in these or other projects:

(0) Is your project in the above list? If not, could you send out a reply talking a bit about the project, who's involved, where it's taking place, etc.?

(1) What is your availability to shepherd the project through its entire cycle, including early prototyping, design review, development, implementation review, testing, and the inevitable long debugging tail that all TCP projects have?

(2) When do you think your implementation will reach a prototype phase appropriate for an expanded circle of reviewers? When do you think it might be ready for commit? Keep in mind that we're now a month or so into the 18-month cycle for 8.0, and that all serious TCP work should be completed at least six months before the end of the cycle.

(3) What potential interactions of note exist between your project and the others being planned? Are there explicit dependencies?

(4) Do you anticipate an MFC cycle for your work to RELENG_7?

I'd like for us to create a wiki page tracking these various projects, and pointing at per-project resources. Once the discussion has settled a bit, I can take responsibility for creating such a page, but will need everyone involved to help maintain it, as well as to maintain pages (on the wiki or elsewhere) regarding the status of the projects. I think it also makes a lot of sense for participants in the projects to send occasional updates and reports to net@/arch@ in order to keep people who can't track things day-to-day in the loop, and to invite review.
At the end of the day, we must be clear: the only way even a fraction of these projects can happen in time for 8.0 is if there is careful planning, coordination, and exceptional care taken in the review and testing of the changes. We cannot have the 8.0 release cycle put at risk the way the 7.0 cycle was due to inadequately reviewed and tested patches entering the tree under the assumption that problems would somehow be magically found and fixed before the release by the relatively small population of -CURRENT users. Experience tells us that changes must be extensively reviewed and tested before they enter the tree.

I'm really looking forward to the 8 development cycle, and the work that's in the pipeline is really very exciting. It will take quite a bit of dedication to make it all happen, but if even only a small part of it happens, it will still be very good news.

Robert N M Watson
Computer Laboratory
University of Cambridge

From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 13:19:23 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D7F7016A418 for ; Wed, 19 Dec 2007 13:19:23 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id B58E913C447 for ; Wed, 19 Dec 2007 13:19:23 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 2377A47828; Wed, 19 Dec 2007 08:19:23 -0500 (EST) Date: Wed, 19 Dec 2007 13:19:23 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Kip Macy In-Reply-To: Message-ID: <20071219131646.P95322@fledge.watson.org> References: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-arch@freebsd.org Subject: Re: RDMA support on FreeBSD X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 13:19:23 -0000 On Mon, 17 Dec 2007, Kip Macy wrote: > Chelsio's T3 card supports iWARP (RDMA over TCP). I've ported OpenFabric's > kernel infrastructure for supporting RDMA to FreeBSD. Do we think that other > RDMA providers (IB or iWARP) will be interested in supporting FreeBSD? If so > it makes sense to put it under sys/contrib/rdma, otherwise I'll just add it > as another module under cxgb. Kip, Could you put up a snapshot of the changes somewhere, along with some documentation as to the architecture, changes to FreeBSD, changes to the openfabric bits, etc, and pointers to any pertinent documentation that may already exist and would answer obvious and less obvious questions? Thanks, Robert N M Watson Computer Laboratory University of Cambridge From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 15:12:58 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0E0FF16A418; Wed, 19 Dec 2007 15:12:58 +0000 (UTC) (envelope-from des@des.no) Received: from tim.des.no (tim.des.no [194.63.250.121]) by mx1.freebsd.org (Postfix) with ESMTP id BDF4813C458; Wed, 19 Dec 2007 15:12:57 +0000 (UTC) (envelope-from des@des.no) Received: from tim.des.no (localhost [127.0.0.1]) by spam.des.no (Postfix) with ESMTP id 13F5020A0; Wed, 19 Dec 2007 16:12:49 +0100 (CET) X-Spam-Tests: AWL X-Spam-Learn: disabled X-Spam-Score: -0.1/3.0 X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on tim.des.no Received: from ds4.des.no (des.no [80.203.243.180]) by smtp.des.no (Postfix) with ESMTP id E9EFA2089; Wed, 19 Dec 2007 16:12:48 +0100 (CET) Received: by ds4.des.no (Postfix, from userid 1001) id 0650C84490; Wed, 19 Dec 2007 16:12:53 +0100 (CET) From: 
Dag-Erling Smørgrav To: Robert Watson References: <20071218120359.E15521@fledge.watson.org> <47682ED1.7000702@FreeBSD.org> <20071218204401.E33011@fledge.watson.org> <86ir2vklnm.fsf@ds4.des.no> <20071219105229.T95322@fledge.watson.org> Date: Wed, 19 Dec 2007 16:12:52 +0100 In-Reply-To: <20071219105229.T95322@fledge.watson.org> (Robert Watson's message of "Wed\, 19 Dec 2007 10\:53\:58 +0000 \(GMT\)") Message-ID: <86zlw6btvv.fsf@ds4.des.no> User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/22.1 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: Maxim Sobolev , current@FreeBSD.org, arch@FreeBSD.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 15:12:58 -0000

Robert Watson writes:
> Dag-Erling Smørgrav writes:
> > Robert Watson writes:
> > > I'd certainly be fine with it being added to GENERIC on our
> > > various architectures.
> > s/GENERIC/DEFAULTS/
> At the risk of creating a bikeshed, I thought we had a consensus that
> DEFAULTS should never be used :-).

What sobomax actually asked was for INCLUDE_CONFIG_FILE to be opt-out; the easiest way to achieve this is to put it in DEFAULTS, so people who don't want it can use "nooption INCLUDE_CONFIG_FILE". I vote in favor. It's one of those things that is so incredibly useful, but that you never remember to turn on until it's too late.

BTW, it would also be useful to include the config file (if available) in the tar-formatted dumps you are working on.
DES
-- 
Dag-Erling Smørgrav - des@des.no

From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 15:20:30 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8145C16A41A; Wed, 19 Dec 2007 15:20:30 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 2BBA413C442; Wed, 19 Dec 2007 15:20:29 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 89DE947518; Wed, 19 Dec 2007 10:20:28 -0500 (EST) Date: Wed, 19 Dec 2007 15:20:28 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Dag-Erling Smørgrav In-Reply-To: <86zlw6btvv.fsf@ds4.des.no> Message-ID: <20071219151957.C754@fledge.watson.org> References: <20071218120359.E15521@fledge.watson.org> <47682ED1.7000702@FreeBSD.org> <20071218204401.E33011@fledge.watson.org> <86ir2vklnm.fsf@ds4.des.no> <20071219105229.T95322@fledge.watson.org> <86zlw6btvv.fsf@ds4.des.no> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="621616949-236892328-1198077628=:754" Cc: Maxim Sobolev , current@FreeBSD.org, arch@FreeBSD.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 15:20:30 -0000

This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools.
--621616949-236892328-1198077628=:754 Content-Type: TEXT/PLAIN; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: QUOTED-PRINTABLE

On Wed, 19 Dec 2007, Dag-Erling Smørgrav wrote:

> What sobomax actually asked was for INCLUDE_CONFIG_FILE to be opt-out; the
> easiest way to achieve this is to put it in DEFAULTS, so people who don't
> want it can use "nooption INCLUDE_CONFIG_FILE". I vote in favor. It's one
> of those things that is so incredibly useful, but that you never remember to
> turn on until it's too late.
>
> BTW, it would be also useful to include the config file (if available) in
> the tar-formatted dumps you are working on.

If you'd looked at the patch, you'd see it was already there.

Robert N M Watson
Computer Laboratory
University of Cambridge
--621616949-236892328-1198077628=:754--

From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 15:54:42 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BB1B716A420 for ; Wed, 19 Dec 2007 15:54:42 +0000 (UTC) (envelope-from qpadla@gmail.com) Received: from fk-out-0910.google.com (fk-out-0910.google.com [209.85.128.189]) by mx1.freebsd.org (Postfix) with ESMTP id 49D1E13C4DD for ; Wed, 19 Dec 2007 15:54:42 +0000 (UTC) (envelope-from qpadla@gmail.com) Received: by fk-out-0910.google.com with SMTP id b27so3644645fka.11 for ; Wed, 19 Dec 2007 07:54:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:reply-to:to:subject:date:user-agent:cc:references:in-reply-to:mime-version:content-type:content-transfer-encoding:message-id; bh=rcThqj2M73Rpdf/sIx98jcLA+Eebu4pi93f0r7JHH0I=; b=NEUmKYngStg0+Wi4Fa5yZBPvnjZp2v1RWWnTArnz/T7rsrN+7VAchShcVfpWy+nLVR7OllWM+C/oEJ7geLw/A5morczOT5WEfwqS/i9IoCnspY5zthP3tkViSM49vFYLjpFGfz3Ye573wtS4qPVxquFdSBL7HgHAvmNtgVJ0qgA= DomainKey-Signature: a=rsa-sha1; c=nofws;
d=gmail.com; s=gamma; h=from:reply-to:to:subject:date:user-agent:cc:references:in-reply-to:mime-version:content-type:content-transfer-encoding:message-id; b=QV/fXtwNdy/ivE6AKMo9yINvcB0bUIni3/jojbwttkC7CMvZw6fmrfO2DkTHOzBV+1Y8uxOedUgePMcLdUfIDkOlWLJJA1F2XM3lEz7CDdpiknVoT8tEcUQPK/ioJvTroksIY5JNVcCCyZq8vxjzfJQ47kjLpC2+JW0P8+fYWv4= Received: by 10.82.184.2 with SMTP id h2mr9151575buf.22.1198079680498; Wed, 19 Dec 2007 07:54:40 -0800 (PST) Received: from orion ( [89.162.141.1]) by mx.google.com with ESMTPS id b30sm3949475ika.2007.12.19.07.54.37 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 19 Dec 2007 07:54:39 -0800 (PST) From: Nikolay Pavlov To: freebsd-arch@freebsd.org Date: Wed, 19 Dec 2007 17:54:44 +0200 User-Agent: KMail/1.9.6 (enterprise 0.20070907.709405) References: <20071218120359.E15521@fledge.watson.org> <20071219105229.T95322@fledge.watson.org> <86zlw6btvv.fsf@ds4.des.no> In-Reply-To: <86zlw6btvv.fsf@ds4.des.no> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart3381363.mIDhAAKYbb"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200712191754.49293.qpadla@gmail.com> Cc: arch@freebsd.org, Dag-Erling Smørgrav , Robert Watson , current@freebsd.org, Maxim Sobolev Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: qpadla@gmail.com List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 15:54:42 -0000
--nextPart3381363.mIDhAAKYbb Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline

On Wednesday 19 December 2007 17:12:52 Dag-Erling Smørgrav wrote:
> What sobomax actually asked was for INCLUDE_CONFIG_FILE to be opt-out;
> the easiest way to achieve this is to put it in DEFAULTS, so people who
> don't want it can use "nooption INCLUDE_CONFIG_FILE". I vote in favor.
> It's one of those things that is so incredibly useful, but that you
> never remember to turn on until it's too late.

Without this feature there would be no way to get really full automatic crash reports from users who are not familiar with debugging or the kernel building process.

-- 
======================================================================
- Best regards, Nikolay Pavlov. <<<-----------------------------------
======================================================================

--nextPart3381363.mIDhAAKYbb Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part.
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQBHaT7J/2R6KvEYGaIRAo0eAJ9iTCA/O5GoLdswgF4mhbEOpKsyQwCeJ/6u 10925E4c3PliOooUrOljgP8= =X5aW -----END PGP SIGNATURE----- --nextPart3381363.mIDhAAKYbb-- From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 15:54:43 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BE7B616A469 for ; Wed, 19 Dec 2007 15:54:43 +0000 (UTC) (envelope-from qpadla@gmail.com) Received: from mu-out-0910.google.com (mu-out-0910.google.com [209.85.134.187]) by mx1.freebsd.org (Postfix) with ESMTP id 391C213C4D1 for ; Wed, 19 Dec 2007 15:54:42 +0000 (UTC) (envelope-from qpadla@gmail.com) Received: by mu-out-0910.google.com with SMTP id w9so4695800mue.6 for ; Wed, 19 Dec 2007 07:54:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:reply-to:to:subject:date:user-agent:cc:references:in-reply-to:mime-version:content-type:content-transfer-encoding:message-id; bh=rcThqj2M73Rpdf/sIx98jcLA+Eebu4pi93f0r7JHH0I=; b=NEUmKYngStg0+Wi4Fa5yZBPvnjZp2v1RWWnTArnz/T7rsrN+7VAchShcVfpWy+nLVR7OllWM+C/oEJ7geLw/A5morczOT5WEfwqS/i9IoCnspY5zthP3tkViSM49vFYLjpFGfz3Ye573wtS4qPVxquFdSBL7HgHAvmNtgVJ0qgA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:reply-to:to:subject:date:user-agent:cc:references:in-reply-to:mime-version:content-type:content-transfer-encoding:message-id; b=QV/fXtwNdy/ivE6AKMo9yINvcB0bUIni3/jojbwttkC7CMvZw6fmrfO2DkTHOzBV+1Y8uxOedUgePMcLdUfIDkOlWLJJA1F2XM3lEz7CDdpiknVoT8tEcUQPK/ioJvTroksIY5JNVcCCyZq8vxjzfJQ47kjLpC2+JW0P8+fYWv4= Received: by 10.82.184.2 with SMTP id h2mr9151575buf.22.1198079680498; Wed, 19 Dec 2007 07:54:40 -0800 (PST) Received: from orion ( [89.162.141.1]) by mx.google.com with ESMTPS id b30sm3949475ika.2007.12.19.07.54.37 (version=TLSv1/SSLv3 cipher=OTHER); Wed, 19 Dec 2007 07:54:39 -0800 (PST) From: 
Nikolay Pavlov To: freebsd-arch@freebsd.org Date: Wed, 19 Dec 2007 17:54:44 +0200 User-Agent: KMail/1.9.6 (enterprise 0.20070907.709405) References: <20071218120359.E15521@fledge.watson.org> <20071219105229.T95322@fledge.watson.org> <86zlw6btvv.fsf@ds4.des.no> In-Reply-To: <86zlw6btvv.fsf@ds4.des.no> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart3381363.mIDhAAKYbb"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200712191754.49293.qpadla@gmail.com> Cc: arch@freebsd.org, Dag-Erling =?utf-8?q?Sm=C3=B8rgrav?= , Robert Watson , current@freebsd.org, Maxim Sobolev Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: qpadla@gmail.com List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 15:54:43 -0000 --nextPart3381363.mIDhAAKYbb Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline On Wednesday 19 December 2007 17:12:52 Dag-Erling Smørgrav wrote: > What sobomax actually asked was for INCLUDE_CONFIG_FILE to be opt-out; > the easiest way to achieve this is to put it in DEFAULTS, so people who > don't want it can use "nooption INCLUDE_CONFIG_FILE". I vote in favor. > It's one of those things that is so incredibly useful, but that you > never remember to turn on until it's too late. Without this feature there would be no way to get truly complete automatic crash reports from users who are not familiar with debugging or the kernel build process. -- ====================================================================== - Best regards, Nikolay Pavlov.
<<<----------------------------------- ====================================================================== --nextPart3381363.mIDhAAKYbb Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQBHaT7J/2R6KvEYGaIRAo0eAJ9iTCA/O5GoLdswgF4mhbEOpKsyQwCeJ/6u 10925E4c3PliOooUrOljgP8= =X5aW -----END PGP SIGNATURE----- --nextPart3381363.mIDhAAKYbb-- From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 16:05:12 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 110A316A421; Wed, 19 Dec 2007 16:05:12 +0000 (UTC) (envelope-from lastewart@swin.edu.au) Received: from lauren.room52.net (lauren.room52.net [210.50.193.198]) by mx1.freebsd.org (Postfix) with ESMTP id 6288913C4F7; Wed, 19 Dec 2007 16:05:11 +0000 (UTC) (envelope-from lastewart@swin.edu.au) Received: from newbox.caia.swin.edu.au (124-168-6-25.dyn.iinet.net.au [124.168.6.25]) (authenticated bits=0) by lauren.room52.net (8.13.8/8.13.8) with ESMTP id lBJFoNSn012477 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 20 Dec 2007 02:50:23 +1100 (EST) (envelope-from lastewart@swin.edu.au) Message-ID: <47693DBD.6050104@swin.edu.au> Date: Thu, 20 Dec 2007 02:50:21 +1100 From: Lawrence Stewart User-Agent: Thunderbird 2.0.0.4 (X11/20070625) MIME-Version: 1.0 To: Robert Watson References: <20071219123305.Y95322@fledge.watson.org> In-Reply-To: <20071219123305.Y95322@fledge.watson.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-1.7 required=5.0 tests=AWL,BAYES_00, RCVD_IN_SORBS_DUL,RDNS_DYNAMIC autolearn=disabled version=3.2.3
X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on lauren.room52.net Cc: James Healy , arch@freebsd.org, net@freebsd.org Subject: Re: Coordinating TCP projects X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 16:05:12 -0000 Hi Robert, Comments inline. Robert Watson wrote: > > Dear all, > > It is rapidly becoming clear that quite a few of us have Big Plans for > the TCP implementation over the next 12-18 months. It's important > that we get the plans out on the table now so that everyone working on > these projects is aware of the larger context. This will encourage > collaboration, but also allow us to manage the risks inevitably > associated with having several simultaneous projects going on in a > very complex software base. With that in mind, here are the large > projects I'm currently aware of:
>
> Project                  Flag Wavers        Status
> -------                  -----------        ------
> TCP offload              Kip Macy           Moving to CVS and under
>                                             review and testing; one
>                                             supporting device driver.
>
> TCP congestion control   Sam Leffler,       At least one prototype
>                          Rui Paulo,         implementation, to move to p4
>                          Andre Oppermann,
>                          Kip Macy,
>                          Lawrence Stewart,
>                          James Healy
>
> TCP overhaul             Andre Oppermann    Glimmer in eye, to move to
>                                             p4.
>
> TCP lock granularity/    Robert Watson      Glimmer in eye, to occur in
> increased parallelism                       p4.
>
> TCP timer unification    Andre Oppermann,   Previously committed, and to
>                          Mike Silbersack    be reintroduced via p4.
>
> Monitoring ABI cleanup   Robert Watson      Glimmer in eye, to occur in
>                                             p4.
>
> Looking at the above, it sounds like a massive amount of work taking > place, so we will need to coordinate carefully. I'd like to encourage > people to avoid creating unnecessary dependencies between changes, and > to be especially careful in coordinating potentially MFCable changes.
> There are (at least) two conflicting scheduling desires in play here: > > - A desire to merge MFCable changes early, so that they aren't > entangled with > un-mergeable changes. This will simplify merging and also maximize the > extent to which testing in HEAD will apply to them once merged to > RELENG_7. > > - A desire to merge large-scale infrastructural changes early so that > they see > the greatest exposure, and so that they can be introduced > incrementally over > a longer period of time to shake each out. > > Both of these are valid perspectives, and will need to be balanced. I > have a few questions, then, for people involved in these or other > projects: > > (0) Is your project in the above list? If not, could you send out a > reply > talking a bit about the project, who's involved, where it's taking > place, > etc. Rui@ recently posted a TCP ECN patch that probably belongs in the list (http://lists.freebsd.org/pipermail/freebsd-net/2007-November/015979.html) unless it has already recently been committed. Jim and I recently discussed the idea of implementing autotuning of the TCP reassembly queue size based on analysis of some experimental work we've been doing. It's a small project, but we feel it would be worth implementing. Details follow... Problem description: Currently, "net.inet.tcp.reass.maxqlen" specifies the maximum number of segments that can be held in the reassembly queue for a TCP connection. The current default value is 48, which equates to approx. 69k of buffer space if MSS = 1448 bytes. This means that if the TCP window grows to be more than 48 segments wide, and a packet is lost, the receiver will buffer the next 48 segments in the reassembly queue and subsequently drop all the remaining segments in the window because the reassembly buffer is full i.e. 1 packet loss in the network can equate to many packet losses at the receiver because of insufficient buffering. 
This obviously has a negative impact on performance in environments where there is non-zero packet loss. With the addition of automatic socket buffer tuning in FreeBSD 7, the ability for the TCP window to grow above 48 segments is going to be even more prevalent than it is now, so this issue will continue to affect connections to FreeBSD based TCP receivers. We observed that the socket receive buffer size provides a good indication of the expected number of bytes in flight for a connection, and can therefore serve as the figure to base the size of the reassembly queue on. Basic project description: - Make the reassembly queue's max length a per-connection variable to appropriately tailor the reassembly queue buffer size for each connection - Piggyback automated reassembly queue sizing with the code that resizes the socket receive buffer - The socket buffer tuning code already has the required infrastructure to cap the max buffer size, so this would implicitly limit the size of the reassembly queue - If the socket buffer sizes were explicitly overridden using sockopts (e.g. to support large windows for particular apps), the reassembly queue would grow to accommodate only connections using the larger than normal receive buffer. - The net.inet.tcp.reass.maxsegments tunable would still be left intact to ensure users can set a hard cap on the max amount of memory allowed for reassembly buffering. > > (1) What is your availability to shepherd the project through its entire > cycle, including early prototyping, design review, development, > implementation review, testing, and the inevitable long debugging > tail > that all TCP projects have. We should be able to run the reassembly queue project full cycle. > > (2) When do you think your implementation will reach a prototype phase > appropriate for an expanded circle of reviewers? When do you > think it > might be ready for commit? 
Keep in mind that we're now a month or > so into > the 18-month cycle for 8.0, and that all serious TCP work should be > completed at least six months before the end of the cycle. To be safe, I'll say we should have a prototype ready by the end of Feb 2008, though I suspect we'll have something ready sooner than that. Commit ready code should follow very shortly after that (few weeks at most), as we anticipate that the patch will be very simple. > > (3) What potential interactions of note exist between your project and > the > others being planned. Are there explicit dependencies? The "TCP Overhaul" project would possibly alter the location of the changes, but shouldn't affect the essence of the changes themselves. It's unlikely any of the other projects would affect this one. > > (4) Do you anticipate an MFC cycle for your work to RELENG_7? Yes. A munged version could also be made available for RELENG_6.... it just wouldn't be based on automatic receive buffer tuning, and would probably be based on a static calculation during connection initialisation. > > I'd like for us to create a wiki page tracking these various projects, > and pointing at per-project resources. Once the discussion has > settled a bit, I can take responsibility for creating such a page, but > will need everyone involved to help maintain it, as well as to > maintain pages (on the wiki or elsewhere) regarding the status of the > projects. I think it also makes a lot of sense for participants in > the projects to send occasional updates and reports to net@/arch@ in > order to keep people who can't track things day-to-date in the loop, > and to invite review. Sounds fair. 
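A rough sketch of the arithmetic behind the problem description above, added as an illustration and not taken from any patch in this thread. The function names, the round-up policy, and the hard_cap parameter are assumptions; the only figures drawn from the discussion are the default of 48 segments and the 1448-byte MSS.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes a reassembly queue of maxqlen segments can hold at a given MSS. */
size_t
reass_capacity_bytes(size_t maxqlen, size_t mss)
{
	return (maxqlen * mss);
}

/*
 * Hypothetical per-connection limit: enough segments to cover the socket
 * receive buffer (rounded up), but never more than hard_cap segments.
 * hard_cap stands in for a global limit; it is an assumption of this sketch.
 */
size_t
reass_qlen_for_rcvbuf(size_t rcvbuf_bytes, size_t mss, size_t hard_cap)
{
	size_t qlen;

	assert(mss > 0);
	qlen = (rcvbuf_bytes + mss - 1) / mss;	/* ceiling division */
	return (qlen < hard_cap ? qlen : hard_cap);
}
```

With the defaults quoted above, reass_capacity_bytes(48, 1448) gives 69504 bytes, which is the "approx. 69k" figure. Under the same assumptions, a 256 KB receive buffer (262144 bytes) would call for 182 segments, well above the fixed limit of 48.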
[snip] Cheers, Jim and Lawrence From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 18:06:07 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4172E16A419 for ; Wed, 19 Dec 2007 18:06:07 +0000 (UTC) (envelope-from SRS0=10c3c91f8c82dd6af1ec65ba89c5ad41e31288e5=554=es.net=oberman@es.net) Received: from postal1.es.net (postal4.es.net [IPv6:2001:400:6000:1::66]) by mx1.freebsd.org (Postfix) with ESMTP id 879A213C457 for ; Wed, 19 Dec 2007 18:06:06 +0000 (UTC) (envelope-from SRS0=10c3c91f8c82dd6af1ec65ba89c5ad41e31288e5=554=es.net=oberman@es.net) Received: from ptavv.es.net (ptavv.es.net [198.128.4.29]) by postal4.es.net (Postal Node 4) with ESMTP (SSL) id YEZ99903; Wed, 19 Dec 2007 10:06:03 -0800 Received: from ptavv.es.net (ptavv.es.net [127.0.0.1]) by ptavv.es.net (Tachyon Server) with ESMTP id C170945014; Wed, 19 Dec 2007 10:06:01 -0800 (PST) To: Kip Macy Mime-Version: 1.0 Content-Type: multipart/signed; boundary="==_Exmh_1198087561_84283P"; micalg=pgp-sha1; protocol="application/pgp-signature" Content-Transfer-Encoding: 7bit Date: Wed, 19 Dec 2007 10:06:01 -0800 From: "Kevin Oberman" Message-Id: <20071219180601.C170945014@ptavv.es.net> X-Sender-IP: 198.128.4.29 X-Sender-Domain: es.net X-Recipent: ;; X-Sender: X-To_Name: Kip Macy X-To_Domain: gmail.com X-To: Kip Macy X-To_Email: kip.macy@gmail.com X-To_Alias: kip.macy Cc: freebsd-arch@freebsd.org Subject: TOE support issues X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 18:06:07 -0000 --==_Exmh_1198087561_84283P Content-Type: text/plain; charset=us-ascii Content-Disposition: inline I have come up with several questions about the supportability of TOE. 1. Packet capture. 
Can I use tcpdump or other libpcap tools with TOE cards? Can the card do pcap in its own microcode? 2. Statistics. What statistics are available with TOE? I know the Chelsio card keeps all kinds of potentially interesting stats as well as the basic packet and error counts. Can these be made available to user code, management tools, and the like? 3. The Chelsio card has some very impressive but, as far as I can tell, undocumented capabilities for things like traffic shaping and policing. Are any of these available? If these issues are handled correctly, this could be a network researcher's dream. -- R. Kevin Oberman, Network Engineer Energy Sciences Network (ESnet) Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab) E-mail: oberman@es.net Phone: +1 510 486-8634 Key fingerprint:059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751 --==_Exmh_1198087561_84283P Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (FreeBSD) Comment: Exmh version 2.5 06/03/2002 iD8DBQFHaV2Jkn3rs5h7N1ERAjxdAJ0a45GF/R7UWajfzSBczZLBDSd5TACgoUiU 9PpYbAYHDiXsdBfblMrEJ0o= =CLGl -----END PGP SIGNATURE----- --==_Exmh_1198087561_84283P-- From owner-freebsd-arch@FreeBSD.ORG Wed Dec 19 18:40:33 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AD8F216A468; Wed, 19 Dec 2007 18:40:33 +0000 (UTC) (envelope-from des@des.no) Received: from tim.des.no (tim.des.no [194.63.250.121]) by mx1.freebsd.org (Postfix) with ESMTP id 6B6EA13C47E; Wed, 19 Dec 2007 18:40:33 +0000 (UTC) (envelope-from des@des.no) Received: from tim.des.no (localhost [127.0.0.1]) by spam.des.no (Postfix) with ESMTP id 2E8AA20BE; Wed, 19 Dec 2007 19:40:25 +0100 (CET) X-Spam-Tests: AWL X-Spam-Learn: disabled X-Spam-Score: -0.1/3.0 X-Spam-Checker-Version: SpamAssassin 3.2.3 (2007-08-08) on tim.des.no Received: from ds4.des.no (des.no [80.203.243.180]) by smtp.des.no (Postfix) with
ESMTP id 204C220B1; Wed, 19 Dec 2007 19:40:25 +0100 (CET) Received: by ds4.des.no (Postfix, from userid 1001) id 322FA84499; Wed, 19 Dec 2007 19:40:29 +0100 (CET) From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= To: Robert Watson References: <20071218120359.E15521@fledge.watson.org> <47682ED1.7000702@FreeBSD.org> <20071218204401.E33011@fledge.watson.org> <86ir2vklnm.fsf@ds4.des.no> <20071219105229.T95322@fledge.watson.org> <86zlw6btvv.fsf@ds4.des.no> <20071219151957.C754@fledge.watson.org> Date: Wed, 19 Dec 2007 19:40:29 +0100 In-Reply-To: <20071219151957.C754@fledge.watson.org> (Robert Watson's message of "Wed\, 19 Dec 2007 15\:20\:28 +0000 \(GMT\)") Message-ID: <86bq8mmste.fsf@ds4.des.no> User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/22.1 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: Maxim Sobolev , current@FreeBSD.org, arch@FreeBSD.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Dec 2007 18:40:33 -0000 Robert Watson writes: > Dag-Erling Smørgrav writes: > > BTW, it would be also useful to include the config file (if > > available) in the tar-formatted dumps you are working on. > If you'd looked at the patch, you'd see it was already there.
Looking at the patch would have violated a long-standing FreeBSD tradition :) DES -- Dag-Erling Smørgrav - des@des.no From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 07:18:37 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 126F816A419 for ; Thu, 20 Dec 2007 07:18:37 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id D028013C465 for ; Thu, 20 Dec 2007 07:18:36 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.107] (cpe-24-94-75-93.hawaii.res.rr.com [24.94.75.93]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id lBK7IVFX098126; Thu, 20 Dec 2007 02:18:32 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Wed, 19 Dec 2007 21:19:32 -1000 (HST) From: Jeff Roberson X-X-Sender: jroberson@desktop To: arch@freebsd.org, brde@optusnet.com.au Message-ID: <20071219211025.T899@desktop> MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="2547152148-1307199678-1198135172=:899" Cc: Subject: Linux compatible setaffinity. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 07:18:37 -0000 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --2547152148-1307199678-1198135172=:899 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed I have implemented a Linux-compatible sched_setaffinity() call which is somewhat crippled. This allows a userspace process to supply a bitmask of the processors on which it will run.
I have copied the Linux interface so that it should be API compatible, because I believe it is a sensible interface and they beat us to it by three years.

My implementation is crippled in that it supports binding curthread only, and to a single CPU only. Neither of the schedulers presently supports binding to multiple CPUs or binding a non-curthread thread. This property is not inherited by forked threads and does not affect other threads in the same process. These two limitations can gradually be relaxed without affecting the syscall API.

The Linux API is:

    int sched_setaffinity(pid_t pid, unsigned int cpusetsize, cpu_set_t *mask);

The cpu_set_t works the same way as an fd_set for select(). The cpusetsize argument is used to determine the size of the array in mask.

I'm mostly interested in feedback on how best to reduce the namespace pollution and avoid pulling the sched.h file into the generated syscall files (sysproto.h, etc.). Anyone who feels this is a terrible interface for such a thing should speak up now. I also feel that in the medium term we will have to deal with machines with more cores than bits in their native word. Using the CPU_SET and CPU_CLR macros is a fine way to deal with this issue.

I also have a primitive 'taskset', although I don't like the name. It allows you to run arbitrary programs bound to a single CPU.
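To make the bitmask interface concrete, here is a minimal userland sketch of an fd_set-style CPU set together with the "exactly one CPU" check that a single-CPU binding implies. It is an illustration only: xcpu_set_t, the XCPU_* macros, and xcpu_single() are hypothetical names chosen to avoid clashing with system headers, and nothing in it calls into the kernel.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for cpu_set_t and the CPU_* macros. */
#define XCPU_SETSIZE	1024U
typedef unsigned long xcpu_mask;
#define XNCPUBITS	(sizeof(xcpu_mask) * CHAR_BIT)	/* bits per word */
#define XHOWMANY(x, y)	(((x) + ((y) - 1)) / (y))

typedef struct {
	xcpu_mask bits[XHOWMANY(XCPU_SETSIZE, XNCPUBITS)];
} xcpu_set_t;

#define XCPU_BIT(n)	((xcpu_mask)1 << ((n) % XNCPUBITS))
#define XCPU_ZERO(p)	memset((p), 0, sizeof(*(p)))
#define XCPU_SET(n, p)	((p)->bits[(n) / XNCPUBITS] |= XCPU_BIT(n))
#define XCPU_CLR(n, p)	((p)->bits[(n) / XNCPUBITS] &= ~XCPU_BIT(n))
#define XCPU_ISSET(n, p) \
	(((p)->bits[(n) / XNCPUBITS] & XCPU_BIT(n)) != 0)

/*
 * Return the index of the only CPU set in mask, or -1 if the mask is
 * empty or names more than one CPU.
 */
int
xcpu_single(const xcpu_set_t *mask)
{
	int found = -1;
	unsigned int i;

	assert(mask != NULL);
	for (i = 0; i < XCPU_SETSIZE; i++) {
		if (!XCPU_ISSET(i, mask))
			continue;
		if (found != -1)
			return (-1);	/* a second CPU was requested */
		found = (int)i;
	}
	return (found);
}
```

Because the set is an array of words, a CPU index such as 70 lands in the second word on a 64-bit machine, which is how this layout copes with more cores than bits in a native word. A caller would zero the set, set one bit, and pass the set and its size to the syscall.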
Thanks, Jeff --2547152148-1307199678-1198135172=:899 Content-Type: TEXT/x-diff; charset=US-ASCII; name=setaffinity.diff Content-Transfer-Encoding: BASE64 Content-ID: <20071219211932.Q899@desktop> Content-Description: Content-Disposition: attachment; filename=setaffinity.diff SW5kZXg6IGtlcm4va2Vybl9yZXNvdXJjZS5jDQo9PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09DQpSQ1MgZmlsZTogL0NWUy9DVlNfSVBTTy9zcmMvc3lzL2tlcm4v a2Vybl9yZXNvdXJjZS5jLHYNCnJldHJpZXZpbmcgcmV2aXNpb24gMS4yLjEw LjINCmRpZmYgLXUgLXIxLjIuMTAuMiBrZXJuX3Jlc291cmNlLmMNCi0tLSBr ZXJuL2tlcm5fcmVzb3VyY2UuYwkxNyBOb3YgMjAwNyAwMTowMTozOSAtMDAw MAkxLjIuMTAuMg0KKysrIGtlcm4va2Vybl9yZXNvdXJjZS5jCTIwIERlYyAy MDA3IDA3OjA5OjExIC0wMDAwDQpAQCAtNTIsNiArNTIsNyBAQA0KICNpbmNs dWRlIDxzeXMvcmVmY291bnQuaD4NCiAjaW5jbHVkZSA8c3lzL3Jlc291cmNl dmFyLmg+DQogI2luY2x1ZGUgPHN5cy9zY2hlZC5oPg0KKyNpbmNsdWRlIDxz eXMvc21wLmg+DQogI2luY2x1ZGUgPHN5cy9zeC5oPg0KICNpbmNsdWRlIDxz eXMvc3lzY2FsbHN1YnIuaD4NCiAjaW5jbHVkZSA8c3lzL3N5c2VudC5oPg0K QEAgLTczMSw2ICs3MzIsNDUgQEANCiAJcmV0dXJuIChlcnJvcik7DQogfQ0K IA0KKyNpZm5kZWYgX1NZU19TWVNQUk9UT19IXw0KK3N0cnVjdCBzY2hlZF9z ZXRhZmZpbml0eV9hcmdzIHsNCisJcGlkX3QgcGlkOw0KKwl1bnNpZ25lZCBp bnQgY3B1c2V0c2l6ZTsNCisJY3B1X3NldF90ICptYXNrOw0KK307DQorI2Vu ZGlmDQorDQoraW50DQorc2NoZWRfc2V0YWZmaW5pdHkoc3RydWN0IHRocmVh ZCAqdGQsIHN0cnVjdCBzY2hlZF9zZXRhZmZpbml0eV9hcmdzICp1YXApDQor ew0KKwljcHVfc2V0X3QgbWFzazsNCisJaW50IGVycm9yOw0KKwlpbnQgY3B1 Ow0KKwlpbnQgaTsNCisNCisJaWYgKHVhcC0+cGlkICE9IDApDQorCQlyZXR1 cm4gKEVQRVJNKTsNCisJaWYgKHVhcC0+Y3B1c2V0c2l6ZSAhPSBDUFVfU0VU U0laRSkNCisJCXJldHVybiAoRUlOVkFMKTsNCisJZXJyb3IgPSBjb3B5aW4o dWFwLT5tYXNrLCAmbWFzaywgc2l6ZW9mKG1hc2spKTsNCisJaWYgKGVycm9y KQ0KKwkJcmV0dXJuIChlcnJvcik7DQorCWZvciAoY3B1ID0gMCwgaSA9IDA7 IGkgPCBDUFVfU0VUU0laRTsgaSsrKSB7DQorCQlpZiAoIUNQVV9JU1NFVChp LCAmbWFzaykpDQorCQkJY29udGludWU7DQorCQlpZiAoY3B1KQ0KKwkJCXJl dHVybiAoRUlOVkFMKTsNCisJCWNwdSA9IGkgKyAxOw0KKwl9DQorCWNwdS0t Ow0KKwlpZiAoQ1BVX0FCU0VOVChjcHUpKQ0KKwkJcmV0dXJuIChFSU5WQUwp 
Ow0KKwl0aHJlYWRfbG9jayhjdXJ0aHJlYWQpOw0KKwlzY2hlZF9iaW5kKGN1 cnRocmVhZCwgY3B1KTsNCisJdGhyZWFkX3VubG9jayhjdXJ0aHJlYWQpOw0K KwlyZXR1cm4gKDApOw0KK30NCisNCiAvKg0KICAqIFRyYW5zZm9ybSB0aGUg cnVubmluZyB0aW1lIGFuZCB0aWNrIGluZm9ybWF0aW9uIGZvciBjaGlsZHJl biBvZiBwcm9jIHANCiAgKiBpbnRvIHVzZXIgYW5kIHN5c3RlbSB0aW1lIHVz YWdlLg0KSW5kZXg6IGtlcm4vbWFrZXN5c2NhbGxzLnNoDQo9PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09DQpSQ1MgZmlsZTogL0NWUy9DVlNfSVBTTy9zcmMvc3lz L2tlcm4vbWFrZXN5c2NhbGxzLnNoLHYNCnJldHJpZXZpbmcgcmV2aXNpb24g MS4xDQpkaWZmIC11IC1yMS4xIG1ha2VzeXNjYWxscy5zaA0KLS0tIGtlcm4v bWFrZXN5c2NhbGxzLnNoCTEwIEZlYiAyMDA2IDAzOjU0OjE4IC0wMDAwCTEu MQ0KKysrIGtlcm4vbWFrZXN5c2NhbGxzLnNoCTIwIERlYyAyMDA3IDA3OjA5 OjExIC0wMDAwDQpAQCAtMTE3LDYgKzExNyw4IEBADQogCQlwcmludGYgIiNk ZWZpbmVcdCVzXG5cbiIsIHN5c3Byb3RvX2ggPiBzeXNhcmcNCiAJCXByaW50 ZiAiI2luY2x1ZGUgPHN5cy9zaWduYWwuaD5cbiIgPiBzeXNhcmcNCiAJCXBy aW50ZiAiI2luY2x1ZGUgPHN5cy9hY2wuaD5cbiIgPiBzeXNhcmcNCisJCXBy aW50ZiAiI2luY2x1ZGUgPHN5cy9wcm9jLmg+XG4iID4gc3lzYXJnDQorCQlw cmludGYgIiNpbmNsdWRlIDxzeXMvc2NoZWQuaD5cbiIgPiBzeXNhcmcNCiAJ CXByaW50ZiAiI2luY2x1ZGUgPHN5cy90aHIuaD5cbiIgPiBzeXNhcmcNCiAJ CXByaW50ZiAiI2luY2x1ZGUgPHN5cy91bXR4Lmg+XG4iID4gc3lzYXJnDQog CQlwcmludGYgIiNpbmNsdWRlIDxwb3NpeDQvX3NlbWFwaG9yZS5oPlxuXG4i ID4gc3lzYXJnDQpJbmRleDoga2Vybi9zY2hlZF80YnNkLmMNCj09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT0NClJDUyBmaWxlOiAvQ1ZTL0NWU19JUFNPL3NyYy9z eXMva2Vybi9zY2hlZF80YnNkLmMsdg0KcmV0cmlldmluZyByZXZpc2lvbiAx LjcuNi4yDQpkaWZmIC11IC1yMS43LjYuMiBzY2hlZF80YnNkLmMNCi0tLSBr ZXJuL3NjaGVkXzRic2QuYwkyOSBOb3YgMjAwNyAwMTo1Mzo1MSAtMDAwMAkx LjcuNi4yDQorKysga2Vybi9zY2hlZF80YnNkLmMJMjAgRGVjIDIwMDcgMDc6 MDk6MTEgLTAwMDANCkBAIC0xNDQyLDYgKzE0NDIsNyBAQA0KIAkJCWNwdV9p ZGxlKCk7DQogCQl9DQogCQltdHhfbG9ja19zcGluKCZzY2hlZF9sb2NrKTsN CisJCVNDSEVEX1NUQVRfSU5DKHN3aXRjaF9pZGxlKTsNCiAJCW1pX3N3aXRj aChTV19WT0wsIE5VTEwpOw0KIAkJbXR4X3VubG9ja19zcGluKCZzY2hlZF9s 
b2NrKTsNCiAJfQ0KSW5kZXg6IGtlcm4vc3lzY2FsbHMubWFzdGVyDQo9PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09DQpSQ1MgZmlsZTogL0NWUy9DVlNfSVBTTy9z cmMvc3lzL2tlcm4vc3lzY2FsbHMubWFzdGVyLHYNCnJldHJpZXZpbmcgcmV2 aXNpb24gMS4yDQpkaWZmIC11IC1yMS4yIHN5c2NhbGxzLm1hc3Rlcg0KLS0t IGtlcm4vc3lzY2FsbHMubWFzdGVyCTIxIEZlYiAyMDA3IDA2OjM0OjMwIC0w MDAwCTEuMg0KKysrIGtlcm4vc3lzY2FsbHMubWFzdGVyCTIwIERlYyAyMDA3 IDA3OjA5OjEyIC0wMDAwDQpAQCAtNzkzLDYgKzc5Myw4IEBADQogCQkJCSAg ICBsb25nIGlkLCB2b2lkICp1YWRkciwgdm9pZCAqdWFkZHIyKTsgfQ0KIDQ1 NQlBVUVfTlVMTAlNU1RECXsgaW50IHRocl9uZXcoc3RydWN0IHRocl9wYXJh bSAqcGFyYW0sIFwNCiAJCQkJICAgIGludCBwYXJhbV9zaXplKTsgfQ0KKzQ1 NglBVUVfTlVMTAlNU1RECXsgaW50IHNjaGVkX3NldGFmZmluaXR5KHBpZF90 IHBpZCwgXA0KKwkJCQkgICAgdW5zaWduZWQgaW50IGNwdXNldHNpemUsIGNw dV9zZXRfdCAqbWFzayk7IH0NCiANCiA7IFBsZWFzZSBjb3B5IGFueSBhZGRp dGlvbnMgYW5kIGNoYW5nZXMgdG8gdGhlIGZvbGxvd2luZyBjb21wYXRhYmls aXR5IHRhYmxlczoNCiA7IHN5cy9jb21wYXQvZnJlZWJzZDMyL3N5c2NhbGxz Lm1hc3Rlcg0KSW5kZXg6IHN5cy9zY2hlZC5oDQo9PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09DQpSQ1MgZmlsZTogL0NWUy9DVlNfSVBTTy9zcmMvc3lzL3N5cy9z Y2hlZC5oLHYNCnJldHJpZXZpbmcgcmV2aXNpb24gMS4yLjEwLjINCmRpZmYg LXUgLXIxLjIuMTAuMiBzY2hlZC5oDQotLS0gc3lzL3NjaGVkLmgJMyBEZWMg MjAwNyAyMTo0NzowOSAtMDAwMAkxLjIuMTAuMg0KKysrIHN5cy9zY2hlZC5o CTIwIERlYyAyMDA3IDA3OjA5OjE4IC0wMDAwDQpAQCAtMTk4LDYgKzE5OCwz NyBAQA0KICAgICAgICAgaW50ICAgICBzY2hlZF9wcmlvcml0eTsNCiB9Ow0K IA0KK3R5cGVkZWYJdW5zaWduZWQgbG9uZwlfX2NwdV9tYXNrOw0KKw0KKyNp Zm5kZWYJQ1BVX1NFVFNJWkUNCisjZGVmaW5lCUNQVV9TRVRTSVpFCTEwMjRV DQorI2VuZGlmDQorDQorI2RlZmluZQlfTkNQVUJJVFMJKHNpemVvZihfX2Nw dV9tYXNrKSAqIDgpCS8qIGJpdHMgcGVyIG1hc2sgKi8NCisNCisjaWZuZGVm IF9ob3dtYW55DQorI2RlZmluZQlfaG93bWFueSh4LCB5KQkoKCh4KSArICgo eSkgLSAxKSkgLyAoeSkpDQorI2VuZGlmDQorDQordHlwZWRlZglzdHJ1Y3Qg Y3B1X3NldCB7DQorCV9fY3B1X21hc2sJX19jcHVzX2JpdHNbX2hvd21hbnko Q1BVX1NFVFNJWkUsIF9OQ1BVQklUUyldOw0KK30gY3B1X3NldF90Ow0KKw0K 
KyNkZWZpbmUJX19jcHVzZXRfbWFzayhuKQkoKF9fY3B1X21hc2spMSA8PCAo KG4pICUgX05DUFVCSVRTKSkNCisjZGVmaW5lCUNQVV9DTFIobiwgcCkJKChw KS0+X19jcHVzX2JpdHNbKG4pL19OQ1BVQklUU10gJj0gfl9fY3B1c2V0X21h c2sobikpDQorI2RlZmluZQlDUFVfQ09QWShmLCB0KQkodm9pZCkoKih0KSA9 ICooZikpDQorI2RlZmluZQlDUFVfSVNTRVQobiwgcCkJKCgocCktPl9fY3B1 c19iaXRzWyhuKS9fTkNQVUJJVFNdICYgX19jcHVzZXRfbWFzayhuKSkgIT0g MCkNCisjZGVmaW5lCUNQVV9TRVQobiwgcCkJKChwKS0+X19jcHVzX2JpdHNb KG4pL19OQ1BVQklUU10gfD0gX19jcHVzZXRfbWFzayhuKSkNCisjZGVmaW5l CUNQVV9aRVJPKHApIGRvIHsJCQkJCVwNCisJY3B1X3NldF90ICpfcDsJCQkJ CVwNCisJX19zaXplX3QgX247CQkJCQlcDQorCQkJCQkJCVwNCisJX3AgPSAo cCk7CQkJCQlcDQorCV9uID0gX2hvd21hbnkoQ1BVX1NFVFNJWkUsIF9OQ1BV QklUUyk7CQlcDQorCXdoaWxlIChfbiA+IDApCQkJCQlcDQorCQlfcC0+X19j cHVzX2JpdHNbLS1fbl0gPSAwOwkJXA0KK30gd2hpbGUgKDApDQorDQogLyoN CiAgKiBQT1NJWCBzY2hlZHVsaW5nIGRlY2xhcmF0aW9ucyBmb3IgdXNlcmxh bmQuDQogICovDQpAQCAtMjEzLDYgKzI0NCw4IEBADQogc3RydWN0IHRpbWVz cGVjOw0KIA0KIF9fQkVHSU5fREVDTFMNCitpbnQJc2NoZWRfc2V0YWZmaW5p dHkocGlkX3QgcGlkLCB1bnNpZ25lZCBpbnQgY3B1c2V0c2l6ZSwgY3B1X3Nl dF90ICptYXNrKTsNCitpbnQJc2NoZWRfZ2V0YWZmaW5pdHkocGlkX3QgcGlk LCB1bnNpZ25lZCBpbnQgY3B1c2V0c2l6ZSwgY3B1X3NldF90ICptYXNrKTsN CiBpbnQgICAgIHNjaGVkX2dldF9wcmlvcml0eV9tYXgoaW50KTsNCiBpbnQg ICAgIHNjaGVkX2dldF9wcmlvcml0eV9taW4oaW50KTsNCiBpbnQgICAgIHNj aGVkX2dldHBhcmFtKHBpZF90LCBzdHJ1Y3Qgc2NoZWRfcGFyYW0gKik7DQo= --2547152148-1307199678-1198135172=:899-- From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 10:37:03 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4D46116A418 for ; Thu, 20 Dec 2007 10:37:03 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 5719013C4E7 for ; Thu, 20 Dec 2007 10:37:01 +0000 (UTC) (envelope-from andre@freebsd.org) Received: (qmail 30324 invoked from network); 20 Dec 2007 10:04:48 -0000 Received: from c00l3r.networx.ch 
(HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 20 Dec 2007 10:04:48 -0000 Message-ID: <476A45D6.6030305@freebsd.org> Date: Thu, 20 Dec 2007 11:37:10 +0100 From: Andre Oppermann User-Agent: Thunderbird 1.5.0.13 (Windows/20070809) MIME-Version: 1.0 To: Lawrence Stewart References: <20071219123305.Y95322@fledge.watson.org> <47693DBD.6050104@swin.edu.au> In-Reply-To: <47693DBD.6050104@swin.edu.au> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: James Healy , arch@freebsd.org, Robert Watson , net@freebsd.org Subject: Re: Coordinating TCP projects X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 10:37:03 -0000 Lawrence Stewart wrote: > Hi Robert, > > Comments inline. > > Robert Watson wrote: >> >> Dear all, >> >> It is rapidly becoming clear that quite a few of us have Big Plans for >> the TCP implementation over the next 12-18 months. It's important >> that we get the plans out on the table now so that everyone working on >> these projects is aware of the larger context. This will encourage >> collaboration, but also allow us to manage the risks inevitably >> associated with having several simultaneous projects going on in a >> very complex software base. With that in mind, here are the large >> projects I'm currently aware of:
>>
>> Project                  Flag Wavers        Status
>> -------                  -----------        ------
>> TCP offload              Kip Macy           Moving to CVS and under
>>                                             review and testing; one
>>                                             supporting device driver.
>>
>> TCP congestion control   Sam Leffler,       At least one prototype
>>                          Rui Paulo,         implementation, to move to p4
>>                          Andre Oppermann,
>>                          Kip Macy,
>>                          Lawrence Stewart,
>>                          James Healy
>>
>> TCP overhaul             Andre Oppermann    Glimmer in eye, to move to
>>                                             p4.
>>
>> TCP lock granularity/    Robert Watson      Glimmer in eye, to occur in
>> increased parallelism                       p4.
>>
>> TCP timer unification    Andre Oppermann,   Previously committed, and to
>>                          Mike Silbersack    be reintroduced via p4.
>>
>> Monitoring ABI cleanup   Robert Watson      Glimmer in eye, to occur in
>>                                             p4.
>>
>> Looking at the above, it sounds like a massive amount of work taking >> place, so we will need to coordinate carefully. I'd like to encourage >> people to avoid creating unnecessary dependencies between changes, and >> to be especially careful in coordinating potentially MFCable changes. >> There are (at least) two conflicting scheduling desires in play here: >> >> - A desire to merge MFCable changes early, so that they aren't >> entangled with >> un-mergeable changes. This will simplify merging and also maximize the >> extent to which testing in HEAD will apply to them once merged to >> RELENG_7. >> >> - A desire to merge large-scale infrastructural changes early so that >> they see >> the greatest exposure, and so that they can be introduced >> incrementally over >> a longer period of time to shake each out. >> >> Both of these are valid perspectives, and will need to be balanced. I >> have a few questions, then, for people involved in these or other >> projects: >> >> (0) Is your project in the above list? If not, could you send out a >> reply >> talking a bit about the project, who's involved, where it's taking >> place, >> etc. > > Rui@ recently posted a TCP ECN patch that probably belongs in the list > (http://lists.freebsd.org/pipermail/freebsd-net/2007-November/015979.html) > unless it has already recently been committed. > > > Jim and I recently discussed the idea of implementing autotuning of the > TCP reassembly queue size based on analysis of some experimental work > we've been doing. It's a small project, but we feel it would be worth > implementing. Details follow...
> > > Problem description: > > Currently, "net.inet.tcp.reass.maxqlen" specifies the maximum number of > segments that can be held in the reassembly queue for a TCP connection. > The current default value is 48, which equates to approx. 69k of buffer > space if MSS = 1448 bytes. This means that if the TCP window grows to be > more than 48 segments wide, and a packet is lost, the receiver will > buffer the next 48 segments in the reassembly queue and subsequently > drop all the remaining segments in the window because the reassembly > buffer is full i.e. 1 packet loss in the network can equate to many > packet losses at the receiver because of insufficient buffering. This > obviously has a negative impact on performance in environments where > there is non-zero packet loss. > > With the addition of automatic socket buffer tuning in FreeBSD 7, the > ability for the TCP window to grow above 48 segments is going to be even > more prevalent than it is now, so this issue will continue to affect > connections to FreeBSD based TCP receivers. > > We observed that the socket receive buffer size provides a good > indication of the expected number of bytes in flight for a connection, > and can therefore serve as the figure to base the size of the reassembly > queue on. I've got a rewritten and much more efficient tcp_reass() function in my local tree. I'll import it into Perforce next week with all the other stuff. You may want to base your auto-sizing work on it. The only part still missing is some statistics gathering.
-- Andre > Basic project description: > > - Make the reassembly queue's max length a per-connection variable to > appropriately tailor the reassembly queue buffer size for each connection > > - Piggyback automated reassembly queue sizing with the code that resizes > the socket receive buffer > > - The socket buffer tuning code already has the required infrastructure > to cap the max buffer size, so this would implicitly limit the size of > the reassembly queue > > - If the socket buffer sizes were explicitly overridden using sockopts > (e.g. to support large windows for particular apps), the reassembly > queue would grow to accommodate only connections using the larger than > normal receive buffer. > > - The net.inet.tcp.reass.maxsegments tunable would still be left intact > to ensure users can set a hard cap on the max amount of memory allowed > for reassembly buffering. > >> >> (1) What is your availability to shepherd the project through its entire >> cycle, including early prototyping, design review, development, >> implementation review, testing, and the inevitable long debugging >> tail >> that all TCP projects have. > > We should be able to run the reassembly queue project full cycle. > >> >> (2) When do you think your implementation will reach a prototype phase >> appropriate for an expanded circle of reviewers? When do you >> think it >> might be ready for commit? Keep in mind that we're now a month or >> so into >> the 18-month cycle for 8.0, and that all serious TCP work should be >> completed at least six months before the end of the cycle. > > To be safe, I'll say we should have a prototype ready by the end of Feb > 2008, though I suspect we'll have something ready sooner than that. > Commit ready code should follow very shortly after that (few weeks at > most), as we anticipate that the patch will be very simple. > >> >> (3) What potential interactions of note exist between your project and >> the >> others being planned. 
Are there explicit dependencies? > > The "TCP Overhaul" project would possibly alter the location of the > changes, but shouldn't affect the essence of the changes themselves. > It's unlikely any of the other projects would affect this one. > >> >> (4) Do you anticipate an MFC cycle for your work to RELENG_7? > > Yes. A munged version could also be made available for RELENG_6.... it > just wouldn't be based on automatic receive buffer tuning, and would > probably be based on a static calculation during connection initialisation. > >> >> I'd like for us to create a wiki page tracking these various projects, >> and pointing at per-project resources. Once the discussion has >> settled a bit, I can take responsibility for creating such a page, but >> will need everyone involved to help maintain it, as well as to >> maintain pages (on the wiki or elsewhere) regarding the status of the >> projects. I think it also makes a lot of sense for participants in >> the projects to send occasional updates and reports to net@/arch@ in >> order to keep people who can't track things day-to-day in the loop, >> and to invite review. > > Sounds fair.
> > [snip] > > Cheers, > Jim and Lawrence > > _______________________________________________ > freebsd-net@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-net > To unsubscribe, send any mail to "freebsd-net-unsubscribe@freebsd.org" > > From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 11:10:59 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C20F116A417 for ; Thu, 20 Dec 2007 11:10:59 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.freebsd.org (Postfix) with ESMTP id 362EA13C448 for ; Thu, 20 Dec 2007 11:10:58 +0000 (UTC) (envelope-from andre@freebsd.org) Received: (qmail 30694 invoked from network); 20 Dec 2007 10:38:46 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 20 Dec 2007 10:38:46 -0000 Message-ID: <476A4DCC.4040206@freebsd.org> Date: Thu, 20 Dec 2007 12:11:08 +0100 From: Andre Oppermann User-Agent: Thunderbird 1.5.0.13 (Windows/20070809) MIME-Version: 1.0 To: Jeff Roberson References: <20071219211025.T899@desktop> In-Reply-To: <20071219211025.T899@desktop> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org Subject: Re: Linux compatible setaffinity. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 11:10:59 -0000 Jeff Roberson wrote: > I have implemented a linux compatible sched_setaffinity() call which is > somewhat crippled. This allows a userspace process to supply a bitmask > of processors which it will run on. 
I have copied the linux interface > such that it should be api compatible because I believe it is a sensible > interface and they beat us to it by 3 years. The Linux (and Solaris) style setaffinity is rather low level and any user of it has to make many assumptions based on incomplete knowledge of the underlying hardware and its architecture (buses, caches, latency between cores, etc). In practical use I'd rather have a function to bind myself to the current CPU or CPU number X, and then to specify that new threads or forked processes should emerge on another, but not this CPU. Pepper that with a few hints like latency and cache affinity (important or not important) the kernel can act on appropriately and it becomes much more powerful and simpler to use. Taking it even further an application may want to specify that it would like to run on a number X of cores that are close (latency/cache) together, be permanently bound to it and to repel any other such requests. This way I can run my database server on socket 1 cores 1-4, and the webserver on socket 2 cores 5-8 more or less automagically. sched_setaffinity requires a lot of operator involvement and architecture knowledge to make that happen. Not that I'm against a Linux compatible sched_setaffinity(), it's just not as practical to use as other constructs. Food for thought. 
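The request "give me X cores that are close together" can be made concrete with a toy model. Everything in this sketch is an assumption made for illustration: the function name reserve_close_cores, the flat core-to-socket table, and the first-fit policy are invented here and do not correspond to any existing or proposed kernel interface. It only shows the kind of decision the kernel could take on the application's behalf if it were handed a hint instead of a raw CPU mask.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model, not a kernel interface: core_socket[i] names the socket
 * (package) that core i sits on, using non-negative ids, and busy[i] is
 * nonzero once core i has been handed to an earlier request.
 *
 * Finds the first socket with at least `want` free cores, marks those
 * cores busy, stores their indices in out[0..want-1] and returns the
 * socket id.  Returns -1 if no single socket can satisfy the request.
 */
int
reserve_close_cores(const int *core_socket, int *busy, size_t ncores,
    size_t want, int *out)
{
	assert(core_socket != NULL && busy != NULL && out != NULL);
	if (want == 0)
		return (-1);

	for (size_t i = 0; i < ncores; i++) {
		int s = core_socket[i];
		size_t nfree = 0, k = 0;

		/* Count the cores still free on this core's socket. */
		for (size_t j = 0; j < ncores; j++)
			if (core_socket[j] == s && !busy[j])
				nfree++;
		if (nfree < want)
			continue;

		/* Reserve the first `want` free cores on that socket. */
		for (size_t j = 0; j < ncores && k < want; j++) {
			if (core_socket[j] == s && !busy[j]) {
				busy[j] = 1;
				out[k++] = (int)j;
			}
		}
		return (s);
	}
	return (-1);
}
```

On a two-socket machine with four cores per socket, two successive requests for four cores would land on different sockets, which is the database/webserver split described above, without the operator naming any core numbers.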
-- Andre From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 11:29:31 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B270716A41A; Thu, 20 Dec 2007 11:29:31 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 0A58B13C51F; Thu, 20 Dec 2007 11:29:29 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 3D3FC46C11; Thu, 20 Dec 2007 06:29:28 -0500 (EST) Date: Thu, 20 Dec 2007 11:29:28 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Andre Oppermann In-Reply-To: <476A4DCC.4040206@freebsd.org> Message-ID: <20071220112116.X67327@fledge.watson.org> References: <20071219211025.T899@desktop> <476A4DCC.4040206@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@freebsd.org Subject: Re: Linux compatible setaffinity. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 11:29:31 -0000 On Thu, 20 Dec 2007, Andre Oppermann wrote: > Not that I'm against a Linux compatible sched_setaffinity(), it's just not > as practical to use as other constructs. Another thought I shared with Jeff earlier today was that we'd like to be able to represent some more systemic ideas, such as: - This jail is able to run on logical CPUs 7 and 8, whereas that jail can use 9, 10, and 11. Scope affinity management in some form to those CPUs. - The tasks on CPUs 4, 5, 6, and 7 should be relocated to CPUs 8, 9, 10, and 11 so that those former CPUs can be taken offline. 
- I don't care what physical die my threads run on, but I want them all to be on the same die. At least the first of these has immediate applicability, and might be represented as an additional per-thread inherited mask. However, it won't help with the second, which implies a level of indirection between the requested medium and the assigned medium. Solaris has a notion of CPU set, which I'm not sure we want verbatim, but might consider whether it has some useful properties. In particular, it would allow you to perform certain types of assignment operations more abstractly, such as "This jail runs on CPU set X; the customer has paid more money, so add another two CPUs to the set and have the jail automatically expand to include them", which isn't well-represented if you have to walk lots and lots of thread and process masks. I'm not opposed to simply getting the Linux API in the tree for now, but I'd rather it weren't MFC'd until we've thought a bit more about whether we need a stronger abstraction. One thing to think about is whether we couldn't layer the Linux API over another abstraction -- i.e., have the API's notion of CPU number be with respect to the current CPU set, so that if the CPU set is migrated, the relative assignments and pinnings still apply, just to different CPUs. Finally, one of the things we will want to start addressing in 8 is affinity for TCP connections. Kip and I have both been waving our hands at some stuff in this area, but we'll want to formalize it. Having a clear idea of our thread/process affinity model will be an important part of understanding what we're going to do for network connections, because presumably we'll want the affinities of kernel objects to be related in some way to the affinities expressed by user threads and processes... I've done a bit of hacking regarding simply allowing user threads to request a specific affinity for a TCP connection, but that only makes sense in the context of the new TCP locking work. 
However, if requests for affinity are with respect to logical CPUs within a CPU set, the expression and maintenance of that affinity for the TCP connection becomes quite different. Robert N M Watson Computer Laboratory University of Cambridge From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 13:32:53 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CDEA316A419 for ; Thu, 20 Dec 2007 13:32:53 +0000 (UTC) (envelope-from qpadla@gmail.com) Received: from nf-out-0910.google.com (nf-out-0910.google.com [64.233.182.184]) by mx1.freebsd.org (Postfix) with ESMTP id EEDE113C468 for ; Thu, 20 Dec 2007 13:32:52 +0000 (UTC) (envelope-from qpadla@gmail.com) Received: by nf-out-0910.google.com with SMTP id b2so1951564nfb.33 for ; Thu, 20 Dec 2007 05:32:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:reply-to:to:subject:date:user-agent:cc:references:in-reply-to:mime-version:content-type:content-transfer-encoding:message-id; bh=6xE7+coxtivywsrg/Du8HyBEPLVYKzkoMposCWiY9Vs=; b=VMWX2YY4Y8nBGIWsX9lcsbV4Fggzf2ENAIRLFlr28FDssx/RMOfFh2gEqyYteiPF0yw7y5VsGaTgl+/OR+SueviBsLUxCm00l/r/3SC5iyXSlUTTNiYXtebYQQykFVD/onznJkE1YD+SRaRX+q8wnnQUsL42GdknGkGbnsLfI5g= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:reply-to:to:subject:date:user-agent:cc:references:in-reply-to:mime-version:content-type:content-transfer-encoding:message-id; b=l6ahJQMcwNG/sGF57GXD9rvkZ7/sEFpCouIn5whXyvo1ttz8MxjzMgRL0ZoDy41iWIQbACnAdeF2UsiiVHEgAXy29ApapY/ASGaUt8Y6nZ9zBgT/h1cKpeDQhmxRmPYYgr9BO7Xb8yIhTtISve8udopboSWnDqdwXwO7weiEDJE= Received: by 10.78.160.4 with SMTP id i4mr10057108hue.35.1198157569269; Thu, 20 Dec 2007 05:32:49 -0800 (PST) Received: from orion ( [89.162.141.1]) by mx.google.com with ESMTPS id 2sm186084nfv.2007.12.20.05.32.45 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 20 Dec 2007 05:32:47 
-0800 (PST) From: Nikolay Pavlov To: freebsd-current@freebsd.org Date: Thu, 20 Dec 2007 15:32:40 +0200 User-Agent: KMail/1.9.6 (enterprise 0.20070907.709405) References: <20071218120359.E15521@fledge.watson.org> In-Reply-To: <20071218120359.E15521@fledge.watson.org> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart1211070.4xZmMiWbSL"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200712201532.41123.qpadla@gmail.com> Cc: arch@freebsd.org, Robert Watson , current@freebsd.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: qpadla@gmail.com List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 13:32:53 -0000 --nextPart1211070.4xZmMiWbSL Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline On Tuesday 18 December 2007 14:10:46 Robert Watson wrote: > Dear all: > > I've been hacking on-and-off for a while on a side project to improve > our kernel debugging facilities. Primarily, my concern has been to > address three problems: > > - The complications of employing kernel core dumps for debugging, > including the large size of dumps making them unwieldy to distribute > or store for any extended period (even with minidumps), the requirement > to have relatively synchronized kernel source in order to use the dumps, > the need to have a kernel with debugging symbols, and the problems with > fsck causing sufficient swap use to invalidate dumps before they can be > extracted. > > - The decreasing likelihood that notebooks will ship with serial ports > that can be used for interactive debugging using DDB. Making > end-users type in stack traces is cruel, photos are a pain, and X11 > rules out both. 
> > - The fact that a great many problems are most easily diagnosed using > utility routines present in DDB, but not as easily using kgdb for > offline analysis. I find that for many bugs I analyze, simply > looking at the DDB output is sufficient to identify the source of the > problem. > > An idea I punted around a bit at BSDCan earlier this year (or perhaps it > was at EuroBSDCon the previous year) was an idea of a "textdump" -- that > is, a new type of kernel dump based on capturing automatically extracted > debugging information generated by DDB. The result would be an ASCII > text file that could be filed as a bug report, perhaps even > automatically. > > To this end, I have implemented three new facilities for use with DDB: > > (1) DDB output capture. The output of DDB is stored in a memory buffer, > and can be extracted using a sysctl or textdumps (see below). This > can be turned on and off, both for use manually ("I'll want this > later, but not that") and as part of scripts (see below). > > (2) DDB scripting. A limited number of named scripts can be defined to > run a series of DDB commands. No loops, etc, just simple command > lists. These can be caused to run automatically on entering DDB > for various scenarios, including WITNESS violations and kernel panics. > They can also be run by hand in order to save a bit of typing if you use > DDB in a repetitive way (as I do). > > (3) Textdumps. A new dump type that stores a series of data files > containing various pieces of information, including the DDB capture > buffer, kernel message buffer, kernel configuration (if compiled > into the kernel), panic message, and kernel version string. These are > stored in the ustar format inside the dump partition (aligned to the > end) so can be easily extended, and savecore(8) requires almost no new > logic to deal with them (it just drops numbered tar files in > /var/crash). 
This makes it straightforward to extend the textdump > format to include new types of information and avoids the issue of how > to safely simultaneously represent information in many different formats > in the same file. > > These are pretty flexible tools, and you can imagine doing the following > sorts of things: > > - Setting the kdb.enter.panic script to automatically turn on output > capture, do full backtraces of all threads, show open file > information, dump UMA stats, and save it all to a textdump and then > reboot. > > - Setting the kdb.enter.witness script to show lock information, > generate a coredump, and reboot. Or, just to automatically do "show > alllocks" and drop to the DDB prompt. > > - Adding a flag to rc.conf to automatically submit textdumps via e-mail > to a specific address, perhaps including GNATS or an automated bug > system. These could be unpacked and automatically analyzed, and due to > the compact size, kept for long-term trend analysis or to identify when > a problem started occurring. > > I've produced an initial snapshot of the above, which can be found here: > > http://www.watson.org/~robert/freebsd/20071218-ddb.tgz > > This adds three files to DDB, patches quite a few kernel files (to pass > more information into KDB about why it's being entered, in order to > trigger the right script), enhances savecore(8) so that it knows how to > extract textdumps, adds a ddb(8) command line tool so that userspace can > manage DDB scripts from outside the debugger, extends the ddb(4) > man page, and adds a new textdump(4) man page. > > There are a number of known limitations; I've tried to document them at > the top of the pertinent files where I am aware of them. I also regret > to say that to date I've been able to test only on i386, and not other > platforms. I'd welcome any feedback -- I'd like to get these changes > into CVS in the next week or two. It looks like some files are not included in the patch.
I have this error:

make -V CFILES -V SYSTEM_CFILES -V GEN_CFILES | MKDEP_CPP="cc -E" CC="cc" xargs mkdep -a -f .newdep -O -pipe -std=c99 -g -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc -I. -I/usr/src/sys -I/usr/src/sys/contrib/altq -I/usr/src/sys/contrib/ipfilter -I/usr/src/sys/contrib/pf -I/usr/src/sys/dev/ath -I/usr/src/sys/contrib/ngatm -I/usr/src/sys/dev/twa -I/usr/src/sys/gnu/fs/xfs/FreeBSD -I/usr/src/sys/gnu/fs/xfs/FreeBSD/support -I/usr/src/sys/gnu/fs/xfs -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000 -mno-align-long-strings -mpreferred-stack-boundary=2 -mno-mmx -mno-3dnow -mno-sse -mno-sse2 -mno-sse3 -ffreestanding
cc: /usr/src/sys/ddb/db_capture.c: No such file or directory
cc: /usr/src/sys/ddb/db_script.c: No such file or directory
cc: /usr/src/sys/ddb/db_textdump.c: No such file or directory
mkdep: compile failed
*** Error code 1

Stop in /usr/obj/usr/src/sys/GENERIC.
*** Error code 1

Stop in /usr/src.
*** Error code 1

Stop in /usr/src.
root@orion-vm:/usr/src# ls -la /usr/src/sys/ddb/
total 424
drwxr-xr-x 2 root wheel 1024 Dec 19 16:49 ./
drwxr-xr-x 53 root wheel 1024 Oct 29 08:24 ../
-rw-r--r-- 1 root wheel 2591 Dec 4 2005 db_access.c
-rw-r--r-- 1 root wheel 1431 Jan 6 2005 db_access.h
-rw-r--r-- 1 root wheel 7737 Jan 6 2005 db_break.c
-rw-r--r-- 1 root wheel 2098 Jan 6 2005 db_break.h
-rw-r--r-- 1 root wheel 16579 Dec 19 16:49 db_command.c
-rw-r--r-- 1 root wheel 15705 Jan 17 2007 db_command.c.orig
-rw-r--r-- 1 root wheel 1633 Dec 19 16:49 db_command.h
-rw-r--r-- 1 root wheel 1588 Jan 6 2005 db_command.h.orig
-rw-r--r-- 1 root wheel 7270 Oct 27 20:19 db_examine.c
-rw-r--r-- 1 root wheel 4811 Jan 6 2005 db_expr.c
-rw-r--r-- 1 root wheel 7981 Dec 19 16:49 db_input.c
-rw-r--r-- 1 root wheel 7931 Jan 6 2005 db_input.c.orig
-rw-r--r-- 1 root wheel 5978 Dec 19 16:49 db_lex.c
-rw-r--r-- 1 root wheel 5304 Jan 6 2005 db_lex.c.orig
-rw-r--r-- 1 root wheel 1951 Dec 19 16:49 db_lex.h
-rw-r--r-- 1 root wheel 1861 Jan 6 2005 db_lex.h.orig
-rw-r--r-- 1 root wheel 5976 Dec 19 16:49 db_main.c
-rw-r--r-- 1 root wheel 5787 Nov 6 2006 db_main.c.orig
-rw-r--r-- 1 root wheel 6889 Dec 19 16:49 db_output.c
-rw-r--r-- 1 root wheel 6639 Oct 10 2006 db_output.c.orig
-rw-r--r-- 1 root wheel 1444 Oct 8 2006 db_output.h
-rw-r--r-- 1 root wheel 2013 Apr 14 2005 db_print.c
-rw-r--r-- 1 root wheel 10956 Nov 13 13:43 db_ps.c
-rw-r--r-- 1 root wheel 8810 Apr 14 2005 db_run.c
-rw-r--r-- 1 root wheel 7815 Jun 16 2006 db_sym.c
-rw-r--r-- 1 root wheel 3591 Jan 6 2005 db_sym.h
-rw-r--r-- 1 root wheel 5143 Jan 17 2007 db_thread.c
-rw-r--r-- 1 root wheel 3402 Jan 6 2005 db_variables.c
-rw-r--r-- 1 root wheel 1854 Jan 6 2005 db_variables.h
-rw-r--r-- 1 root wheel 7218 Nov 17 2006 db_watch.c
-rw-r--r-- 1 root wheel 1509 Jan 6 2005 db_watch.h
-rw-r--r-- 1 root wheel 2264 Jan 6 2005 db_write_cmd.c
-rw-r--r-- 1 root wheel 7467 Dec 19 16:49 ddb.h
-rw-r--r-- 1 root wheel 5776 Jul 12 2006 ddb.h.orig

-- 
======================================================================
- Best regards, Nikolay Pavlov. <<<-----------------------------------
======================================================================

--nextPart1211070.4xZmMiWbSL Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQBHam75/2R6KvEYGaIRAo+GAJ996lCyHp0+dyN5rW7dWbBRugjQFgCg39e1 +bfdaDTErGPe2aJWZCW0BTw= =CE18 -----END PGP SIGNATURE----- --nextPart1211070.4xZmMiWbSL-- From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 13:42:22 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 05EE716A419; Thu, 20 Dec 2007 13:42:22 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id C804A13C478; Thu, 20 Dec 2007 13:42:21 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 1CD33475F9; Thu, 20 Dec 2007 08:42:21 -0500 (EST) Date: Thu, 20 Dec 2007 13:42:21 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Nikolay Pavlov In-Reply-To: <200712201532.41123.qpadla@gmail.com> Message-ID: <20071220134033.T94754@fledge.watson.org> References: <20071218120359.E15521@fledge.watson.org> <200712201532.41123.qpadla@gmail.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc:
arch@freebsd.org, current@freebsd.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 13:42:22 -0000 On Thu, 20 Dec 2007, Nikolay Pavlov wrote: > It looks like some files are not included in the patch. I have this error: Unfortunately, p4 doesn't include a -N argument to its diff2 command, so I included the new files directly in the tarball (including the kernel ones). Could you check and see if they are there? If you untar the tarball in your top-level src directory, it should extract one man page into share/man/man4, a few files into src/sbin/ddb (note: I have not updated the sbin Makefile so you'll need to build it by hand), and three files into src/sys/ddb, along with a patch for you to apply. Robert N M Watson Computer Laboratory University of Cambridge From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 13:56:06 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D947B16A420; Thu, 20 Dec 2007 13:56:06 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 9662713C474; Thu, 20 Dec 2007 13:56:06 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 2AF3946F7A; Thu, 20 Dec 2007 08:56:06 -0500 (EST) Date: Thu, 20 Dec 2007 13:56:06 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: net@FreeBSD.org Message-ID: <20071220135342.O67327@fledge.watson.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@FreeBSD.org Subject: TCP Projects for 8.0 - first cut
wiki page X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 13:56:06 -0000 Per earlier e-mail, I've created a page to track the various on-going projects: http://wiki.freebsd.org/TCPProjects8 Rui has already kindly added the TCP ECN work to the page. Robert N M Watson Computer Laboratory University of Cambridge From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 15:25:34 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5730A16A417; Thu, 20 Dec 2007 15:25:33 +0000 (UTC) (envelope-from brooks@lor.one-eyed-alien.net) Received: from lor.one-eyed-alien.net (cl-162.ewr-01.us.sixxs.net [IPv6:2001:4830:1200:a1::2]) by mx1.freebsd.org (Postfix) with ESMTP id C824913C459; Thu, 20 Dec 2007 15:25:32 +0000 (UTC) (envelope-from brooks@lor.one-eyed-alien.net) Received: from lor.one-eyed-alien.net (localhost [127.0.0.1]) by lor.one-eyed-alien.net (8.14.1/8.13.8) with ESMTP id lBKFPVk6053771; Thu, 20 Dec 2007 09:25:31 -0600 (CST) (envelope-from brooks@lor.one-eyed-alien.net) Received: (from brooks@localhost) by lor.one-eyed-alien.net (8.14.1/8.13.8/Submit) id lBKFPVd7053770; Thu, 20 Dec 2007 09:25:31 -0600 (CST) (envelope-from brooks) Date: Thu, 20 Dec 2007 09:25:31 -0600 From: Brooks Davis To: Robert Watson Message-ID: <20071220152531.GA53327@lor.one-eyed-alien.net> References: <20071219123305.Y95322@fledge.watson.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="LQksG6bCIzRHxTLp" Content-Disposition: inline In-Reply-To: <20071219123305.Y95322@fledge.watson.org> User-Agent: Mutt/1.5.16 (2007-06-09) X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-3.0 (lor.one-eyed-alien.net [127.0.0.1]); Thu, 
20 Dec 2007 09:25:31 -0600 (CST) Cc: James Healy , arch@freebsd.org, Lawrence Stewart , net@freebsd.org Subject: Re: Coordinating TCP projects X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 15:25:34 -0000 --LQksG6bCIzRHxTLp Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Dec 19, 2007 at 01:09:13PM +0000, Robert Watson wrote: > > I'd like for us to create a wiki page tracking these various projects, and > pointing at per-project resources. Once the discussion has settled a bit, > I can take responsibility for creating such a page, but will need everyone > involved to help maintain it, as well as to maintain pages (on the wiki or > elsewhere) regarding the status of the projects. I think it also makes a > lot of sense for participants in the projects to send occasional updates > and reports to net@/arch@ in order to keep people who can't track things > day-to-day in the loop, and to invite review. In addition to the wiki, I think it's important to emphasize that as a matter of principle, if it's not public, then in practice it doesn't exist. This means that if you want people to coordinate their changes with your changes, you need to be working in public. You should ideally be working in perforce, but at a minimum regular posts discussing details are required. Non-public work (existent or not) will not be permitted to delay the inclusion of desirable features that are reviewed and tested.
-- Brooks --LQksG6bCIzRHxTLp Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (FreeBSD) iD8DBQFHaolqXY6L6fI4GtQRAgVSAJ9+Voftec9WDAzSCv9/q+tq192JqwCeMAnN 8jlDdU08+rOjrPz01GcBwWA= =hwMu -----END PGP SIGNATURE----- --LQksG6bCIzRHxTLp-- From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 18:01:32 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DBE2616A46C for ; Thu, 20 Dec 2007 18:01:32 +0000 (UTC) (envelope-from chuckr@chuckr.org) Received: from mail6.sea5.speakeasy.net (mail6.sea5.speakeasy.net [69.17.117.8]) by mx1.freebsd.org (Postfix) with ESMTP id AFD5213C46A for ; Thu, 20 Dec 2007 18:01:32 +0000 (UTC) (envelope-from chuckr@chuckr.org) Received: (qmail 20746 invoked from network); 20 Dec 2007 17:34:51 -0000 Received: from april.chuckr.org (chuckr@[66.92.151.30]) (envelope-sender ) by mail6.sea5.speakeasy.net (qmail-ldap-1.03) with AES256-SHA encrypted SMTP for ; 20 Dec 2007 17:34:51 -0000 Message-ID: <476AA708.1090908@chuckr.org> Date: Thu, 20 Dec 2007 12:31:52 -0500 From: Chuck Robey User-Agent: Thunderbird 2.0.0.6 (X11/20071107) MIME-Version: 1.0 To: Robert Watson References: <20071220135342.O67327@fledge.watson.org> In-Reply-To: <20071220135342.O67327@fledge.watson.org> X-Enigmail-Version: 0.95.5 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: arch@FreeBSD.org, net@FreeBSD.org Subject: Re: TCP Projects for 8.0 - first cut wiki page X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 18:01:33 -0000 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Robert Watson wrote: > > Per earlier e-mail, I've created a page to track the various on-going > projects: > > 
http://wiki.freebsd.org/TCPProjects8 > > Rui has already kindly added the TCP ECN work to the page. > Things like this should definitely be publicized on the FreeBSD main web page, because it's stuff like this I find exciting. > Robert N M Watson > Computer Laboratory > University of Cambridge > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iD8DBQFHaqcIz62J6PPcoOkRApN0AJ9zo/3MzQnDs2FtIuKYyje6L5u9VgCfTRpX aFBUQLFM2Dbx9U5P2Jv1Az8= =FSIC -----END PGP SIGNATURE----- From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 18:17:55 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1B5BF16A41B for ; Thu, 20 Dec 2007 18:17:55 +0000 (UTC) (envelope-from _pppp@mail.ru) Received: from f59.mail.ru (f59.mail.ru [194.67.57.93]) by mx1.freebsd.org (Postfix) with ESMTP id C20F113C457 for ; Thu, 20 Dec 2007 18:17:54 +0000 (UTC) (envelope-from _pppp@mail.ru) Received: from mail by f59.mail.ru with local id 1J5Pxt-0004O8-00; Thu, 20 Dec 2007 21:17:49 +0300 Received: from [89.208.20.114] by koi.mail.ru with HTTP; Thu, 20 Dec 2007 21:17:49 +0300 From: dima <_pppp@mail.ru> To: Robert Watson Mime-Version: 1.0 X-Mailer: mPOP Web-Mail 2.19 X-Originating-IP: [89.208.20.114] Date: Thu, 20 Dec 2007 21:17:49 +0300 In-Reply-To: <20071220135342.O67327@fledge.watson.org> References: <20071220135342.O67327@fledge.watson.org> Content-Type: text/plain; charset=koi8-r Content-Transfer-Encoding: 8bit Message-Id: Cc: arch@FreeBSD.org, net@FreeBSD.org Subject: Re: TCP Projects for 8.0 - first cut wiki page X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: dima 
<_pppp@mail.ru> List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 18:17:55 -0000 > Per earlier e-mail, I've created a page to track the various on-going > projects: > > http://wiki.freebsd.org/TCPProjects8 > > Rui has already kindly added the TCP ECN work to the page. As I know, we have a single swi:net thread in the kernel yet. Are there any plans to make several such threads? If yes, this activity isn't mentioned in wiki. There are 2 ideas: 1. per-core thread 2. per-interface thread I like the second more. Regards, Dmitriy Marchenko. From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 18:18:57 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 996D516A419 for ; Thu, 20 Dec 2007 18:18:57 +0000 (UTC) (envelope-from julian@elischer.org) Received: from outO.internet-mail-service.net (outO.internet-mail-service.net [216.240.47.238]) by mx1.freebsd.org (Postfix) with ESMTP id 74A7213C459 for ; Thu, 20 Dec 2007 18:18:57 +0000 (UTC) (envelope-from julian@elischer.org) Received: from mx0.idiom.com (HELO idiom.com) (216.240.32.160) by out.internet-mail-service.net (qpsmtpd/0.40) with ESMTP; Thu, 20 Dec 2007 10:04:49 -0800 Received: from julian-mac.elischer.org (localhost [127.0.0.1]) by idiom.com (Postfix) with ESMTP id 2F984126FD6; Thu, 20 Dec 2007 10:04:49 -0800 (PST) Message-ID: <476AAEC0.2070008@elischer.org> Date: Thu, 20 Dec 2007 10:04:48 -0800 From: Julian Elischer User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: Jeff Roberson References: <20071219211025.T899@desktop> In-Reply-To: <20071219211025.T899@desktop> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org Subject: Re: Linux compatible setaffinity. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 18:18:57 -0000 Jeff Roberson wrote: > I have implemented a linux compatible sched_setaffinity() call which is > somewhat crippled. This allows a userspace process to supply a bitmask > of processors which it will run on. I have copied the linux interface > such that it should be api compatible because I believe it is a sensible > interface and they beat us to it by 3 years. > > My implementation is crippled in that it supports binding by curthread > only and to a single cpu only. Neither of the schedulers presently > support binding to multiple cpus or binding a non-curthread thread. > This property is not inherited by forked threads and does not effect > other threads in the same process. These two limitations can gradually > be weakened without effecting the syscall api. > > The linux api is: > int sched_setaffinity(pid_t pid, unsigned int cpusetsize, cpu_set_t > *mask); > > The cpu_set_t is the same as a fdset for select. The cpusetsize > argument is used to determine the size of the array in mask. > > I'm mostly interested in feedback on how best to reduce the namespace > pollution and avoid pulling the sched.h file into the generated syscall > files (sysproto.h, etc). Anyone who feels this is a terrible interface > for such a thing should speak up now. > > I also feel that in the medium term we will have to deal with machines > with more cores than bits in their native word. Using these CPU_SET, > CPU_CLR macros is a fine way to deal with this issue. > > I also have a primitive 'taskset', although I don't like the name, it > allows you to run arbitrary programs bound to a single cpu. makes sense to me.. don't forget the man page! 
> > Thanks, > Jeff > > > ------------------------------------------------------------------------ > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 18:26:18 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 28FEB16A417; Thu, 20 Dec 2007 18:26:18 +0000 (UTC) (envelope-from imp@bsdimp.com) Received: from harmony.bsdimp.com (bsdimp.com [199.45.160.85]) by mx1.freebsd.org (Postfix) with ESMTP id BF0C413C45D; Thu, 20 Dec 2007 18:26:17 +0000 (UTC) (envelope-from imp@bsdimp.com) Received: from localhost (localhost [127.0.0.1]) by harmony.bsdimp.com (8.14.1/8.14.1) with ESMTP id lBKILBYO021179; Thu, 20 Dec 2007 11:21:11 -0700 (MST) (envelope-from imp@bsdimp.com) Date: Thu, 20 Dec 2007 11:24:05 -0700 (MST) Message-Id: <20071220.112405.-713486157.imp@bsdimp.com> To: sobomax@freebsd.org From: "M. 
Warner Losh" In-Reply-To: <47682ED1.7000702@FreeBSD.org> References: <20071218120359.E15521@fledge.watson.org> <47682ED1.7000702@FreeBSD.org> X-Mailer: Mew version 5.2 on Emacs 21.3 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org, rwatson@freebsd.org, current@freebsd.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 18:26:18 -0000 In message: <47682ED1.7000702@FreeBSD.org> Maxim Sobolev writes: : Robert Watson wrote: : > buffer, kernel message buffer, kernel configuration (if compiled into : > the kernel), panic message, and kernel version string. These are : : : Just a sidenote - maybe as part of this change it makes sense to make : compiling configuration into a kernel opt-out, not opt-in? We are in : 21st century, nobody really cares about saving few kilobytes of kernel : memory anymore. In the embedded world, it matters. And we already have opt-out. 
'include GENERIC; nodev X, nodev Y, nodev Z' Warner From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 18:27:12 2007 Return-Path: Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B937E16A419 for ; Thu, 20 Dec 2007 18:27:12 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from speedfactory.net (mail6.speedfactory.net [66.23.216.219]) by mx1.freebsd.org (Postfix) with ESMTP id 327A013C45A for ; Thu, 20 Dec 2007 18:27:11 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from server.baldwin.cx (unverified [66.23.211.162]) by speedfactory.net (SurgeMail 3.8q) with ESMTP id 225307900-1834499 for multiple; Thu, 20 Dec 2007 13:25:04 -0500 Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.13.8/8.13.8) with ESMTP id lBKIQThg077000; Thu, 20 Dec 2007 13:26:52 -0500 (EST) (envelope-from jhb@FreeBSD.org) From: John Baldwin To: freebsd-arch@FreeBSD.org Date: Thu, 20 Dec 2007 11:38:55 -0500 User-Agent: KMail/1.9.6 References: <20071218092222.GA9695@freebsd.org> In-Reply-To: <20071218092222.GA9695@freebsd.org> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200712201138.56423.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Thu, 20 Dec 2007 13:26:52 -0500 (EST) X-Virus-Scanned: ClamAV 0.91.2/5192/Thu Dec 20 12:24:15 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: arch@FreeBSD.org, Roman Divacky Subject: Re: final decision about *at syscalls X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related 
to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 18:27:12 -0000 On Tuesday 18 December 2007 04:22:22 am Roman Divacky wrote: > Dear arch@ > > Over this summer I was working (among other things) on *at family of syscalls > kindly sponsored by Google (in their Summer of Code). The resulting patch is > almost finished but I need to decide one design question. If you are not interested > in *at/namei feel free to skip this mail. > > The *at syscalls are a threads-oriented extension to basic file syscalls (think > of open(), fstat(), etc.) adding the possibility to specify from where the search > for relative path should start. > > image that we have /tmp/foo/bar > > and CWD is set to "/tmp/", and the process has opened "foo" as dirfd. with ordinary > open() syscall you have to either > > chdir("/tmp/foo");open("./bar"); > > or > > open("/tmp/foo/bar"); > > The first approach is problematic because it changes CWD for all threads in the process, > the second is prone to race-conditions as some of the components of the path can > change in parallel with the "open". > > So POSIX introduced a new API, called "Extended API set part 2, ISBN: 1-931624-67-4" (at > least this was the latest when I looked last time), which solves that by introducing "*at" > syscalls that supply an fd of previously opened directory which is used instead of CWD > for searching relative path, ie. the previous example becomes > > dirfd = open("/tmp/foo"); openat("foo", dirfd); > > I implemented the whole API as native FreeBSD syscalls + in linuxulator emulation layer. > Here's the problem: > > There are two approaches to the name translation from "filedescriptor" to the "vnode". 
> > 1) we can do it in the kern_fooat() syscall and pass namei() the resulting vnode > 2) we can pass namei() the filedescriptor and do the translation there > > PROs of #1: > > o namei() does not need to know about the curthread, you can use this *at > ability for different purposes, it's cleaner (imho) > > PROs of #2 > > o raceless implementation > o no code duplication > > CONs of #1 > > o some very small code duplication (the translation is done in every > kern_fooat() function) > o there is a race between the name translation and the actual use of the result > of the translation that needs to be handled, the "path_to_file" string is copied > to the kernel space twice hence a race > > CONs of #2 > > o namei is made thread dependant > > Please tell me what approach you like more. I personally favour #1 because I don't like namei() > being thread dependant, Kostik Belousov prefers #2. Considering Robert's paper on security race problems in things like systrace stemming from when you copy parameters out of userland and into the kernel multiple times, I think #2 is definitely the better choice. Also, namei() is already thread aware AFAICT since 'struct componentname' already contains a 'cnp_thread' member (was 'cnp_proc' in 4.x). 
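A minimal userland sketch of the openat() pattern discussed above, for illustration only and not taken from Roman's patch. It assumes the argument order standardized by POSIX, openat(fd, path, oflag), with the directory descriptor first, which is the reverse of the order written in the quoted example. The helper name open_relative() is hypothetical, and the sketch does not support O_CREAT because it passes no mode argument.

```c
#define _POSIX_C_SOURCE 200809L	/* for openat() and O_DIRECTORY */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of any patch in this thread: open `name`
 * relative to the directory `dirpath` without changing the process-wide
 * CWD.  Returns a file descriptor, or -1 with errno set.
 */
int
open_relative(const char *dirpath, const char *name, int flags)
{
	int dirfd, fd, saved_errno;

	assert(dirpath != NULL && name != NULL);

	/* Pin the directory once; later renames of its parents do not matter. */
	dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
	if (dirfd == -1)
		return (-1);

	/* POSIX argument order: directory descriptor first, then the path. */
	fd = openat(dirfd, name, flags);

	saved_errno = errno;	/* close() may overwrite errno */
	close(dirfd);
	errno = saved_errno;
	return (fd);
}
```

With the layout from the example, open_relative("/tmp/foo", "bar", O_RDONLY) would reach /tmp/foo/bar without a chdir() that affects other threads. Which kernel layer turns the descriptor into a vnode, the question under debate here, is invisible at this level.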
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 18:35:52 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 81AD716A481; Thu, 20 Dec 2007 18:35:52 +0000 (UTC) (envelope-from sobomax@FreeBSD.org) Received: from sippysoft.com (gk.360sip.com [72.236.70.226]) by mx1.freebsd.org (Postfix) with ESMTP id 359B413C461; Thu, 20 Dec 2007 18:35:51 +0000 (UTC) (envelope-from sobomax@FreeBSD.org) Received: from [192.168.0.3] ([204.244.149.125]) (authenticated bits=0) by sippysoft.com (8.13.8/8.13.8) with ESMTP id lBKIZlTN007982 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 20 Dec 2007 10:35:48 -0800 (PST) (envelope-from sobomax@FreeBSD.org) Message-ID: <476AB5EC.9060204@FreeBSD.org> Date: Thu, 20 Dec 2007 10:35:24 -0800 From: Maxim Sobolev Organization: Sippy Software, Inc. User-Agent: Thunderbird 2.0.0.9 (Windows/20071031) MIME-Version: 1.0 To: "M. Warner Losh" References: <20071218120359.E15521@fledge.watson.org> <47682ED1.7000702@FreeBSD.org> <20071220.112405.-713486157.imp@bsdimp.com> In-Reply-To: <20071220.112405.-713486157.imp@bsdimp.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@FreeBSD.org, rwatson@FreeBSD.org, current@FreeBSD.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 18:35:52 -0000 M. Warner Losh wrote: > In message: <47682ED1.7000702@FreeBSD.org> > Maxim Sobolev writes: > : Robert Watson wrote: > : > buffer, kernel message buffer, kernel configuration (if compiled into > : > the kernel), panic message, and kernel version string. 
These are > : > : > : Just a sidenote - maybe as part of this change it makes sense to make > : compiling configuration into a kernel opt-out, not opt-in? We are in > : 21st century, nobody really cares about saving few kilobytes of kernel > : memory anymore. > > In the embedded world, it matters. > > And we already have opt-out. 'include GENERIC; nodev X, nodev Y, > nodev Z' So what is your point, exactly? In embedded word nobody runs GENERIC, and you know it better than anybody else. My point that Joe User, who runs GENERIC or slightly modified GENERIC should have kernel config compiled into kernel so that when something happens this information is available for debugging purposes. -Maxim From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 18:38:07 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2D65816A476 for ; Thu, 20 Dec 2007 18:38:07 +0000 (UTC) (envelope-from julian@elischer.org) Received: from outI.internet-mail-service.net (outI.internet-mail-service.net [216.240.47.232]) by mx1.freebsd.org (Postfix) with ESMTP id 0C12413C461 for ; Thu, 20 Dec 2007 18:38:06 +0000 (UTC) (envelope-from julian@elischer.org) Received: from mx0.idiom.com (HELO idiom.com) (216.240.32.160) by out.internet-mail-service.net (qpsmtpd/0.40) with ESMTP; Thu, 20 Dec 2007 10:38:06 -0800 Received: from julian-mac.elischer.org (localhost [127.0.0.1]) by idiom.com (Postfix) with ESMTP id 882B2126D39; Thu, 20 Dec 2007 10:38:05 -0800 (PST) Message-ID: <476AB68C.30201@elischer.org> Date: Thu, 20 Dec 2007 10:38:04 -0800 From: Julian Elischer User-Agent: Thunderbird 2.0.0.9 (Macintosh/20071031) MIME-Version: 1.0 To: dima <_pppp@mail.ru> References: <20071220135342.O67327@fledge.watson.org> In-Reply-To: Content-Type: text/plain; charset=KOI8-R; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@FreeBSD.org, Robert Watson , net@FreeBSD.org Subject: Re: TCP Projects for 8.0 - first 
cut wiki page X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 18:38:07 -0000 dima wrote: >> Per earlier e-mail, I've created a page to track the various on-going >> projects: >> >> http://wiki.freebsd.org/TCPProjects8 >> >> Rui has already kindly added the TCP ECN work to the page. > > As I know, we have a single swi:net thread in the kernel yet. Are there any plans to make several such threads? If yes, this activity isn't mentioned in wiki. > There are 2 ideas: > 1. per-core thread > 2. per-interface thread and for my system with 64 virtual interfaces? > I like the second more. > > Regards, > Dmitriy Marchenko. > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 18:41:21 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F31E116A418; Thu, 20 Dec 2007 18:41:20 +0000 (UTC) (envelope-from imp@bsdimp.com) Received: from harmony.bsdimp.com (bsdimp.com [199.45.160.85]) by mx1.freebsd.org (Postfix) with ESMTP id A975413C467; Thu, 20 Dec 2007 18:41:20 +0000 (UTC) (envelope-from imp@bsdimp.com) Received: from localhost (localhost [127.0.0.1]) by harmony.bsdimp.com (8.14.1/8.14.1) with ESMTP id lBKIc6eX021389; Thu, 20 Dec 2007 11:38:06 -0700 (MST) (envelope-from imp@bsdimp.com) Date: Thu, 20 Dec 2007 11:41:01 -0700 (MST) Message-Id: <20071220.114101.228909062.imp@bsdimp.com> To: sobomax@freebsd.org From: "M. 
Warner Losh" In-Reply-To: <476AB5EC.9060204@FreeBSD.org> References: <47682ED1.7000702@FreeBSD.org> <20071220.112405.-713486157.imp@bsdimp.com> <476AB5EC.9060204@FreeBSD.org> X-Mailer: Mew version 5.2 on Emacs 21.3 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org, rwatson@freebsd.org, current@freebsd.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 18:41:21 -0000 In message: <476AB5EC.9060204@FreeBSD.org> Maxim Sobolev writes: : M. Warner Losh wrote: : > In message: <47682ED1.7000702@FreeBSD.org> : > Maxim Sobolev writes: : > : Robert Watson wrote: : > : > buffer, kernel message buffer, kernel configuration (if compiled into : > : > the kernel), panic message, and kernel version string. These are : > : : > : : > : Just a sidenote - maybe as part of this change it makes sense to make : > : compiling configuration into a kernel opt-out, not opt-in? We are in : > : 21st century, nobody really cares about saving few kilobytes of kernel : > : memory anymore. : > : > In the embedded world, it matters. : > : > And we already have opt-out. 'include GENERIC; nodev X, nodev Y, : > nodev Z' : : So what is your point, exactly? In embedded word nobody runs GENERIC, : and you know it better than anybody else. : : My point that Joe User, who runs GENERIC or slightly modified GENERIC : should have kernel config compiled into kernel so that when something : happens this information is available for debugging purposes. That makes sense. I thought you were talking about something different, so I'm just going to say "I'm in violent agreement." 
Warner From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 20:04:59 2007 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 13D4816A419 for ; Thu, 20 Dec 2007 20:04:59 +0000 (UTC) (envelope-from hselasky@c2i.net) Received: from swip.net (mailfe04.swip.net [212.247.154.97]) by mx1.freebsd.org (Postfix) with ESMTP id 9B81213C461 for ; Thu, 20 Dec 2007 20:04:58 +0000 (UTC) (envelope-from hselasky@c2i.net) X-Cloudmark-Score: 0.000000 [] Received: from [85.19.218.45] (account mc467741@c2i.net [85.19.218.45] verified) by mailfe04.swip.net (CommuniGate Pro SMTP 5.1.13) with ESMTPA id 737923474; Thu, 20 Dec 2007 20:04:55 +0100 From: Hans Petter Selasky To: freebsd-arch@freebsd.org Date: Thu, 20 Dec 2007 20:05:32 +0100 User-Agent: KMail/1.9.7 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200712202005.33263.hselasky@c2i.net> Cc: Poul-Henning Kamp , Alfred Perlstein Subject: More leaves on the device tree ? X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 20:04:59 -0000 Hi, I'm currently working on USB and I have been thinking about a simple way to find what devices an USB device creates, and how to easily present that information to the user. I know there is "devinfo" and I would like to extend this utility to also show which devices under /dev belongs to the device. Implementation: "make_dev" takes an additional "device_t parent_device" argument and creates a child device with some magic flags set. Any comments ?
--HPS From owner-freebsd-arch@FreeBSD.ORG Thu Dec 20 21:45:39 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D9DF516A419; Thu, 20 Dec 2007 21:45:39 +0000 (UTC) (envelope-from vadim_nuclight@mail.ru) Received: from mx28.mail.ru (mx28.mail.ru [194.67.23.67]) by mx1.freebsd.org (Postfix) with ESMTP id 8D3B213C44B; Thu, 20 Dec 2007 21:45:39 +0000 (UTC) (envelope-from vadim_nuclight@mail.ru) Received: from mx40.mail.ru (mx40.mail.ru [194.67.23.36]) by mx28.mail.ru (mPOP.Fallback_MX) with ESMTP id DD96C7889BD; Thu, 20 Dec 2007 21:49:36 +0300 (MSK) Received: from [78.140.3.25] (port=18063 helo=nuclight.avtf.net) by mx40.mail.ru with esmtp id 1J5QSb-000HS6-00; Thu, 20 Dec 2007 21:49:33 +0300 To: "Julian Elischer" , dima <_pppp@mail.ru> References: <20071220135342.O67327@fledge.watson.org> <476AB68C.30201@elischer.org> Message-ID: Date: Fri, 21 Dec 2007 00:49:29 +0600 From: "Vadim Goncharov" Organization: AVTF TPU Hostel Content-Type: text/plain; format=flowed; delsp=yes; charset=koi8-r MIME-Version: 1.0 Content-Transfer-Encoding: 8bit In-Reply-To: <476AB68C.30201@elischer.org> User-Agent: Opera M2/7.54 (Win32, build 3865) Cc: arch@freebsd.org, Robert Watson , net@freebsd.org Subject: Re: TCP Projects for 8.0 - first cut wiki page X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 20 Dec 2007 21:45:40 -0000 21.12.07 @ 00:38 Julian Elischer wrote: >> As I know, we have a single swi:net thread in the kernel yet. Are >> there any plans to make several such threads? If yes, this activity >> isn't mentioned in wiki. >> There are 2 ideas: >> 1. per-core thread >> 2. per-interface thread >> I like the second more. > > and for my system with 64 virtual interfaces? 
Surely, per-core thread is enough - why have too much synchronization overhead?.. A computer is a state machine. Threads are for people who can't program state machines. (c) Alan Cox -- WBR, Vadim Goncharov From owner-freebsd-arch@FreeBSD.ORG Fri Dec 21 01:39:03 2007 Return-Path: Delivered-To: arch@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DA6D416A420 for ; Fri, 21 Dec 2007 01:39:03 +0000 (UTC) (envelope-from davidxu@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id D06D613C448; Fri, 21 Dec 2007 01:39:03 +0000 (UTC) (envelope-from davidxu@FreeBSD.org) Received: from apple.my.domain (root@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.2/8.14.2) with ESMTP id lBL1cwWj080728; Fri, 21 Dec 2007 01:39:00 GMT (envelope-from davidxu@freebsd.org) Message-ID: <476B1973.6070902@freebsd.org> Date: Fri, 21 Dec 2007 09:40:03 +0800 From: David Xu User-Agent: Thunderbird 2.0.0.9 (X11/20071211) MIME-Version: 1.0 To: Jeff Roberson References: <20071219211025.T899@desktop> In-Reply-To: <20071219211025.T899@desktop> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@FreeBSD.org Subject: Re: Linux compatible setaffinity. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Dec 2007 01:39:03 -0000 Jeff Roberson wrote: > I have implemented a linux compatible sched_setaffinity() call which is > somewhat crippled. This allows a userspace process to supply a bitmask > of processors which it will run on. I have copied the linux interface > such that it should be api compatible because I believe it is a sensible > interface and they beat us to it by 3 years. 
> > My implementation is crippled in that it supports binding by curthread > only and to a single cpu only. Neither of the schedulers presently > support binding to multiple cpus or binding a non-curthread thread. > This property is not inherited by forked threads and does not effect > other threads in the same process. These two limitations can gradually > be weakened without effecting the syscall api. > > The linux api is: > int sched_setaffinity(pid_t pid, unsigned int cpusetsize, cpu_set_t > *mask); > > The cpu_set_t is the same as a fdset for select. The cpusetsize > argument is used to determine the size of the array in mask. > > I'm mostly interested in feedback on how best to reduce the namespace > pollution and avoid pulling the sched.h file into the generated syscall > files (sysproto.h, etc). Anyone who feels this is a terrible interface > for such a thing should speak up now. > > I also feel that in the medium term we will have to deal with machines > with more cores than bits in their native word. Using these CPU_SET, > CPU_CLR macros is a fine way to deal with this issue. > > I also have a primitive 'taskset', although I don't like the name, it > allows you to run arbitrary programs bound to a single cpu. > > Thanks, > Jeff > I don't say no to these interfaces, but there is a need to tell user which cpus are sharing cache, or memory distance is closest enough, and which cpus are servicing interrupts, e.g, network interrupt and disks etc, etc, otherwise, blindly setting cpu affinity mask only can shoot itself in the foot. 
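A minimal sketch of the single-CPU binding described in the quoted proposal, written against the Linux/glibc sched_setaffinity() interface that the proposal copies. It assumes glibc with _GNU_SOURCE; whether the FreeBSD implementation accepts exactly the same calls is not claimed here. The helper names first_allowed_cpu() and bind_self_to_cpu() are hypothetical.

```c
#define _GNU_SOURCE	/* glibc: exposes cpu_set_t and the CPU_* macros */
#include <assert.h>
#include <sched.h>

/* Return the lowest-numbered CPU the calling thread may run on, or -1. */
int
first_allowed_cpu(void)
{
	cpu_set_t mask;
	int cpu;

	if (sched_getaffinity(0, sizeof(mask), &mask) == -1)
		return (-1);
	for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
		if (CPU_ISSET(cpu, &mask))
			return (cpu);
	return (-1);
}

/* Bind the calling thread to one CPU; 0 on success, -1 with errno set. */
int
bind_self_to_cpu(int cpu)
{
	cpu_set_t mask;

	assert(cpu >= 0 && cpu < CPU_SETSIZE);

	/* The mask is manipulated like an fd_set: clear it, then set one bit. */
	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);

	/* pid 0 means "the calling thread" in the Linux interface. */
	return (sched_setaffinity(0, sizeof(mask), &mask));
}
```

Passing sizeof(mask) as cpusetsize is what lets the mask grow beyond one machine word later without changing the call. The sketch picks a CPU from the current affinity mask only; it knows nothing about cache sharing or interrupt placement, which is the gap raised above.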
Regards, David Xu From owner-freebsd-arch@FreeBSD.ORG Fri Dec 21 02:33:10 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3F80B16A41B for ; Fri, 21 Dec 2007 02:33:10 +0000 (UTC) (envelope-from qpadla@gmail.com) Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.169]) by mx1.freebsd.org (Postfix) with ESMTP id BA40F13C46E for ; Fri, 21 Dec 2007 02:33:04 +0000 (UTC) (envelope-from qpadla@gmail.com) Received: by ug-out-1314.google.com with SMTP id y2so764129uge.37 for ; Thu, 20 Dec 2007 18:33:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:from:reply-to:to:subject:date:user-agent:cc:references:in-reply-to:mime-version:content-type:content-transfer-encoding:message-id; bh=LJYFj1tMylmBJD4Qu08kZ1OktaDcd/nzxvgD65D+3Co=; b=TMbUFzSkYt3KqiserKdNQTx24qCI/UFx+UjiKb/5Zbtb8lFE04Ojzk3QcBtZeRR/3cwa9I6rxYpvYzyebnkPLlCHA2LBuls8TtiejBuOKwKd956dOzo3i2L/oG1EXkSWPxdheCZwsM/Nw3BVNhpFFkPXyTNAafP2b4D0DFQ6F5g= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=from:reply-to:to:subject:date:user-agent:cc:references:in-reply-to:mime-version:content-type:content-transfer-encoding:message-id; b=TjdSH/pGiMJ4aS3cDr+Up1CG10nmb5x3fxfhhNxTQLXY8uRzH19kQLY7R+rJiYc6WUpF6MAjAFU/2/eLkjJZT2C0fNtLQLad7y008UFHHzUm3jpi0I7zuspBGomXsZjEDYSKMW5dN28gCcG72AhTkyC/x9L3G57XboHtSWLLRtQ= Received: by 10.66.221.5 with SMTP id t5mr4148781ugg.83.1198204383185; Thu, 20 Dec 2007 18:33:03 -0800 (PST) Received: from orion ( [89.162.141.1]) by mx.google.com with ESMTPS id i39sm9705372ugd.32.2007.12.20.18.33.01 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 20 Dec 2007 18:33:02 -0800 (PST) From: Nikolay Pavlov To: Robert Watson Date: Fri, 21 Dec 2007 04:32:54 +0200 User-Agent: KMail/1.9.6 (enterprise 0.20070907.709405) References: <20071218120359.E15521@fledge.watson.org> 
<200712201532.41123.qpadla@gmail.com> <20071220134033.T94754@fledge.watson.org> In-Reply-To: <20071220134033.T94754@fledge.watson.org> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart1383799.v4Zq5TN8JT"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200712210432.58751.qpadla@gmail.com> Cc: arch@freebsd.org, current@freebsd.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: qpadla@gmail.com List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Dec 2007 02:33:10 -0000 --nextPart1383799.v4Zq5TN8JT Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline

On Thursday 20 December 2007 15:42:21 Robert Watson wrote:
> On Thu, 20 Dec 2007, Nikolay Pavlov wrote:
> > It looks like some files are not included in the patch. I have this
> > error:
>
> Unfortunately, p4 doesn't include a -N argument to its diff2 command, so
> I attached the new files directly in the tarball (including the kernel
> ones).  Could you check and see if they are there?  If you untar the
> tarball in your top-level src directory, it should extract one man page
> into share/man/man4, a few files into src/sbin/ddb (note: I have not
> updated the sbin Makefile so you'll need to build it by hand), three
> files into src/sys/ddb, and then a patch which you apply.
>

Thanks. I've figured this out, but I still see some problems. I created a
panic during a manual kernel debugging session and, after reboot, found a
textdump file in the /var/crash directory, but unexpectedly I cannot untar
it:

root@orion-vm:/var/crash# bsdtar -xf textdump.tar.0
bsdtar: Unrecognized archive format: Inappropriate file type or format
bsdtar: Error exit delayed from previous errors.
root@orion-vm:/var/crash# file textdump.tar.0
textdump.tar.0: data
root@orion-vm:/var/crash# ls -la textdump.tar.0
-rw------- 1 root wheel 35483648 Dec 21 02:17 textdump.tar.0

--
======================================================================
- Best regards, Nikolay Pavlov. <<<-----------------------------------
======================================================================

--nextPart1383799.v4Zq5TN8JT
Content-Type: application/pgp-signature; name=signature.asc
Content-Description: This is a digitally signed message part.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQBHayXa/2R6KvEYGaIRAv0CAKCPH0mqRKc3Bk0IhhHvHcNJ9/8owQCg7RCv
U91/esJ6Ag+SritrJRvntJo=
=k/Ci
-----END PGP SIGNATURE-----

--nextPart1383799.v4Zq5TN8JT--

From owner-freebsd-arch@FreeBSD.ORG Fri Dec 21 09:42:27 2007 Return-Path: Delivered-To: arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 73B9A16A41A; Fri, 21 Dec 2007 09:42:27 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 4D3F613C46E; Fri, 21 Dec 2007 09:42:27 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id BD2D948AC4; Fri, 21 Dec 2007 04:42:26 -0500 (EST) Date: Fri, 21 Dec 2007 09:42:26 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Nikolay Pavlov In-Reply-To: <200712210432.58751.qpadla@gmail.com> Message-ID: <20071221093923.W67327@fledge.watson.org> References: 
<20071218120359.E15521@fledge.watson.org> <200712201532.41123.qpadla@gmail.com> <20071220134033.T94754@fledge.watson.org> <200712210432.58751.qpadla@gmail.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@freebsd.org, current@freebsd.org Subject: Re: DDB scripting, output capture, and textdumps X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 21 Dec 2007 09:42:27 -0000

On Fri, 21 Dec 2007, Nikolay Pavlov wrote:

> On Thursday 20 December 2007 15:42:21 Robert Watson wrote:
>> On Thu, 20 Dec 2007, Nikolay Pavlov wrote:
>>> It looks like some files are not included in the patch. I have this error:
>>
>> Unfortunately, p4 doesn't include a -N argument to its diff2 command, so I
>> attached the new files directly in the tarball (including the kernel ones).
>> Could you check and see if they are there?  If you untar the tarball in
>> your top-level src directory, it should extract one man page into
>> share/man/man4, a few files into src/sbin/ddb (note: I have not updated the
>> sbin Makefile so you'll need to build it by hand), three files into
>> src/sys/ddb, and then a patch which you apply.
>
> Thanks. I've figured this out, but I still see some problems. I created a
> panic during a manual kernel debugging session and, after reboot, found a
> textdump file in the /var/crash directory, but unexpectedly I cannot untar
> it:

Sounds like a problem.  Could you send me, via private e-mail, the output of
uname -a, the approximate script you ran, and a copy of the textdump (gzip'd
and attached, or via a URL or such)?  The dump file looks quite large, larger
than I'd really expect, so it could be there's an extraction problem.

Robert N M Watson
Computer Laboratory
University of Cambridge

>
> root@orion-vm:/var/crash# bsdtar -xf textdump.tar.0
> bsdtar: Unrecognized archive format: Inappropriate file type or format
> bsdtar: Error exit delayed from previous errors.
> root@orion-vm:/var/crash# file textdump.tar.0
> textdump.tar.0: data
> root@orion-vm:/var/crash# ls -la textdump.tar.0
> -rw------- 1 root wheel 35483648 Dec 21 02:17 textdump.tar.0
>
>
> --
> ======================================================================
> - Best regards, Nikolay Pavlov. <<<-----------------------------------
> ======================================================================
>
>

From owner-freebsd-arch@FreeBSD.ORG Sat Dec 22 18:37:51 2007 Return-Path: Delivered-To: arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5AB2016A41A; Sat, 22 Dec 2007 18:37:51 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 4B95F13C458; Sat, 22 Dec 2007 18:37:51 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id A621B472D5; Sat, 22 Dec 2007 13:37:50 -0500 (EST) Date: Sat, 22 Dec 2007 18:37:50 +0000 (GMT) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: David Xu In-Reply-To: <476B1973.6070902@freebsd.org> Message-ID: <20071222183700.L5866@fledge.watson.org> References: <20071219211025.T899@desktop> <476B1973.6070902@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@FreeBSD.org Subject: Re: Linux compatible setaffinity. 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 22 Dec 2007 18:37:51 -0000

On Fri, 21 Dec 2007, David Xu wrote:

> I don't object to these interfaces, but users also need a way to find out
> which cpus share a cache, which are closest in terms of memory distance,
> and which are servicing interrupts (e.g. network and disk interrupts).
> Otherwise, blindly setting a cpu affinity mask is an easy way to shoot
> yourself in the foot.

While the Mac OS X API is pretty Mach-specific, it's worth taking a look at
their recently-announced affinity API:

http://developer.apple.com/releasenotes/Performance/RN-AffinityAPI/index.html

Robert N M Watson
Computer Laboratory
University of Cambridge
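
For readers following the setaffinity thread, here is a minimal sketch of how the Linux call quoted above is commonly used from userland. It assumes a Linux system with glibc, whose wrapper takes a size_t cpusetsize and a const cpu_set_t *mask. The helper name pin_to_first_allowed_cpu() is made up for illustration and is not part of Jeff's patch or of any API. It mirrors the restriction described in the thread: it binds only the calling thread, and only to a single cpu.

```c
#define _GNU_SOURCE		/* needed for cpu_set_t and the CPU_* macros */
#include <assert.h>
#include <sched.h>

/*
 * Hypothetical helper: bind the calling thread to the lowest-numbered cpu
 * it is currently allowed to run on.  Returns that cpu number, or -1 on
 * failure (errno is left as set by the failing call).
 */
int
pin_to_first_allowed_cpu(void)
{
	cpu_set_t allowed, mask;
	int cpu;

	/* On Linux, a pid of 0 refers to the calling thread. */
	if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0)
		return (-1);

	for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
		if (!CPU_ISSET(cpu, &allowed))
			continue;

		/* Build a mask with exactly one bit set, fd_set style. */
		CPU_ZERO(&mask);
		CPU_SET(cpu, &mask);
		assert(CPU_COUNT(&mask) == 1);

		/* cpusetsize tells the kernel how large the mask is. */
		if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
			return (-1);
		return (cpu);
	}
	return (-1);		/* no allowed cpu found */
}
```

Starting from the current mask rather than hard-coding cpu 0 keeps the sketch usable where the process is already confined to a subset of cpus. It does nothing about David's concern: choosing a good cpu still needs cache, memory and interrupt topology that this interface does not expose.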