From owner-freebsd-virtualization@FreeBSD.ORG Mon Dec 15 15:17:27 2008 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 285B91065672 for ; Mon, 15 Dec 2008 15:17:27 +0000 (UTC) (envelope-from jason.fines@gmail.com) Received: from an-out-0708.google.com (an-out-0708.google.com [209.85.132.247]) by mx1.freebsd.org (Postfix) with ESMTP id BE3898FC0C for ; Mon, 15 Dec 2008 15:17:26 +0000 (UTC) (envelope-from jason.fines@gmail.com) Received: by an-out-0708.google.com with SMTP id c2so923701anc.13 for ; Mon, 15 Dec 2008 07:17:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from:to :subject:cc:in-reply-to:mime-version:content-type:references; bh=H0LtWARHOlCcYr/KwVLan7xdh4wrAgmUluZDL5Ev+BE=; b=lr+LYEhJEysuSm8yR5tr2wI+HnM9PQa3eETxrkgVZBPgDmhMk2e1vkmCbuB5oRqUOd CYfaCVBePSqdc4AdpVHEjMrfS0lGo4416iK03K1jo5H30EasfEKPlm8m3/n4yM9Te4f1 +/RM6wMzGux29Bv7BIxVIRbQviWISB3A36XIg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:to:subject:cc:in-reply-to:mime-version :content-type:references; b=hl9FMQNjH4GW/ElCUMi94rA/zy+6K2w1W2HXvqkxheyciqhdfCxP5nrhMQoW5qdmeU YzjIoRAUVSsuqk9Jsuwi1Jvzzvo2hZA0qkHEcMzVTLduyLAtCoMQ6tcCMnnYrUW7DQ3O P7v2TGx9BgD67TR0MHd5GdlAVVrTExUDhW0+U= Received: by 10.100.172.17 with SMTP id u17mr4542338ane.105.1229354245496; Mon, 15 Dec 2008 07:17:25 -0800 (PST) Received: by 10.100.9.13 with HTTP; Mon, 15 Dec 2008 07:17:25 -0800 (PST) Message-ID: <5e6025b70812150717h500ab3c4tb1319dee1572f711@mail.gmail.com> Date: Mon, 15 Dec 2008 10:17:25 -0500 From: "Jason Fines" To: "Jorge Sanchez" In-Reply-To: MIME-Version: 1.0 References: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-virtualization@freebsd.org 
Subject: Re: Question About TCP Reassembly Inside VImages X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Dec 2008 15:17:27 -0000 Hi Jorge, Sorry about the delay in my response. I have also been able to set maxqlen, but have been unsuccessful in setting maxsegments. I just recently tried upgrading to 7_RELENG with the vimage package http://imunes.tel.fer.hr/virtnet/vimage_7-20081015.tgz however I am still experiencing the same problems. I even went so far as to set the variable 'V_tcp_reass_maxseg' explicitly to 4096 on lines 111, and 122 of tcp_reass.c in the kernel source code, however, the maxsegments sysctl variable is still zero in my vimages!!! Have you tried anything else? I am also prohibited from using this amazing technology because of this issue. Does anyone else have any information about this? Marko, would upgrading FreeBSD 8.0 make any difference? Any help would be greatly appreciated. Thanks, Jason On Thu, Nov 27, 2008 at 11:57 AM, Jorge Sanchez wrote: > > > > > Hola Jason, > > I also observed a similar phenomenon on my system's vimages. I have several > thousands dropped packets due to "insufficient memory" (the counter you > mention in netstat -m for me is also incremented in the > net.inet.tcp.reass.overflows read-only sysctl variable) and I routinely have > TCP connections dropped within vimages because of it. I think that the TCP > packet reassembly queue length is essentially zero once options VIMAGE is > enabled... which would explain my problems when trying to contact hosts that > are on flaky links or are situated very far away hop-wise. > > I think there is something very wrong with the TCP reassembly when options > VIMAGE is enabled. 
Did you try increasing the net.inet.reass.maxqlen or > net.inet.reass.maxsegments sysctls? I was able to increase maxqlen but > maxsegments must be set in loader.conf and this value is not inherited into > any vimages I create after booting! :-( > > If you come up with a fix, I would appreciate it too since this prevents me > from performing realistic TCP testing within the virtualization framework. > > Adios! > Jorge > > > From: jason.fines@gmail.com > To: > freebsd-virtualization@freebsd.org > Subject: Re: Question About TCP Reassembly Inside VImages > Date: Sat, 22 Nov 2008 08:52:16 -0500 > Thanks Julian, > > I suspect you are correct as nmbclusters is a system wide sysctl variable > set at boot time and although V_tcp_reass_maxseg is set per vimage it is > the > result of a constant operation done on nmbclusters (nmbclusters / 16). > > What I've described is what I suspect is the root of my problem. The > manifestation of this problem is that TCP packets passing through my > vimage(s) are not reassembled when they are out of order and I get an > exceptionally high value reported by netstat -m stating that packets were > dropped due to "insufficient memory". Posts I've found on the net point to > the reassembly queue length, which in the vimages is zero for some reason. > > Perhaps this additional information will help clarify my exact problem. > > Thanks, > Jason > > On Sat, Nov 22, 2008 at 5:12 AM, Julian Elischer >wrote: > > > Jason Fines wrote: > > > >> Hello all, > >> > >> I've got a question about setting the sysctl variable > >> net.inet.tcp.reass.maxsegments to a non-zero value inside my vimages. > I'm > >> currently running the FreeBSD 7 with the VIMAGE package available at > >> http://imunes.tel.fer.hr/virtnet/vimage_7-20081015.tgz. > >> > >> My problem is with TCP reassembly support inside of the vimages, namely > >> with > >> the tcp.reass.maxsegments sysctl variable. 
I've tracked down where in > the > >> code the variable is set to line 122 in tcp_reass_init() of > >> netinet/tcp_reass.c: "V_tcp_reass_maxseg = nmbclusters / 16;". The line > >> clearly reads that maxsegments should be set to "nmbclusters /16", in > the > >> main OS (not in any vimage) the value is correctly set to 1/16 of what > my > >> nmbclusters sysctl variable is set to. However, inside all my vimages > >> nmbclusters is set correctly, while reass.maxsegments is incorrectly set > >> to > >> zero!!! > >> > > > > V_tcp_reass_maxseg is a macro that hides the fact that > > tcp_reass_maxseg is a PER Vimage variable. > > > > Part of the patch > > is to make some sysctls be per-vimage. I do not know exactly > > about that one.. I suspect it is actually a read-only > > whole-system value, and not per vimage. > > > > > > > > > > > >> Is it possible that nmbclusters when read on line 122 of > >> netinet/tcp_reass.c > >> is zero? Has anyone else experienced this problem? Is TCP reassembly > not > >> supported/tested inside vimages? > >> > >> Any help in this area would be greatly appreciated. > >> > >> Thanks, > >> Jason > >> > >> P.S. This technology is phenomenal, and thanks to everyone who is > involved > >> developing it. 
> >> _______________________________________________ > >> freebsd-virtualization at freebsd.org mailing list > >> http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization > >> To unsubscribe, send any mail to " > >> freebsd-virtualization-unsubscribe at freebsd.org" > >> > > > _________________________________________________________________ > > _______________________________________________ > freebsd-virtualization@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization > To unsubscribe, send any mail to " > freebsd-virtualization-unsubscribe@freebsd.org" > From owner-freebsd-virtualization@FreeBSD.ORG Mon Dec 15 17:57:35 2008 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9D2491065676 for ; Mon, 15 Dec 2008 17:57:35 +0000 (UTC) (envelope-from bseklecki@collaborativefusion.com) Received: from collaborativefusion.com (mx01.pub.collaborativefusion.com [206.210.89.201]) by mx1.freebsd.org (Postfix) with ESMTP id 3F7478FC38 for ; Mon, 15 Dec 2008 17:57:35 +0000 (UTC) (envelope-from bseklecki@collaborativefusion.com) Received: from Internal Mail-Server by mx01 (envelope-from bseklecki@collaborativefusion.com) with SMTP; 15 Dec 2008 12:57:34 -0500 From: "Brian A. Seklecki" To: Philipp Wuensche In-Reply-To: <49418BD9.8080105@h3q.com> References: <20081201085229.D80401@maildrop.int.zabbadoz.net> <20081201122937.81475f0zhfsjya4o@webmail.leidinger.net> <6ae50c2d0812021800x791d2cfeh45d590de120f76df@mail.gmail.com> <1228483574.2805.499.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <86skp2l804.fsf@ds4.des.no> <1228507529.2805.539.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com> <49418BD9.8080105@h3q.com> Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-wUWBZhNw6rPpfbPTQaJy" Organization: Collaborative Fusion, Inc. 
Date: Mon, 15 Dec 2008 12:57:34 -0500
Message-Id: <1229363854.1722.39.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com>
Mime-Version: 1.0
X-Mailer: Evolution 2.22.3.1 (2.22.3.1-1.fc9)
X-Mailman-Approved-At: Mon, 15 Dec 2008 19:52:10 +0000
Cc: FreeBSD virtualization mailing list , Alexander Leidinger , alexus , freebsd-current@freebsd.org, "Bjoern A. Zeeb" , freebsd-jail@freebsd.org
Subject: Re: HEADS UP: r185435 multi-IPv4/v6/no-IP jails in HEAD
X-BeenThere: freebsd-virtualization@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: bseklecki@collaborativefusion.com
List-Id: "Discussion of various virtualization techniques FreeBSD supports."
X-List-Received-Date: Mon, 15 Dec 2008 17:57:35 -0000

--=-wUWBZhNw6rPpfbPTQaJy
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

On Thu, 2008-12-11 at 22:53 +0100, Philipp Wuensche wrote:
> Not entirely true, the jls output is totally different than before and
> breaks third-party applications like jailaudit and ezjail.

Right, well, whether they check for VERSION > 70200x or 80000, the
format is likely to change.

Once everything has been sorted out, they can add support now, push out
the updates, and the version in common use will be forward/backward
compatible.

Whatever we have to do to light a fire there -- I just don't want
ezjail-admin compatibility to be a showstopper on this.

>
> It is uneasy to parse too.

--
Brian A. Seklecki
Collaborative Fusion, Inc.
--=-wUWBZhNw6rPpfbPTQaJy Content-Type: application/pgp-signature; name=signature.asc Content-Description: This is a digitally signed message part -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.9 (GNU/Linux) iEYEABECAAYFAklGmo4ACgkQCne6BNDQ+R9zLwCdFLkGOwDe4mjd1TZE/FY8cUUx 6rkAn2+OAkyyAdOo1OufwKBH4Gz4aSF7 =9i5l -----END PGP SIGNATURE----- --=-wUWBZhNw6rPpfbPTQaJy-- From owner-freebsd-virtualization@FreeBSD.ORG Mon Dec 15 19:55:07 2008 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A3AE7106564A; Mon, 15 Dec 2008 19:55:07 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net) Received: from mail.cksoft.de (mail.cksoft.de [62.111.66.27]) by mx1.freebsd.org (Postfix) with ESMTP id 548A38FC16; Mon, 15 Dec 2008 19:55:07 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net) Received: from localhost (amavis.str.cksoft.de [192.168.74.71]) by mail.cksoft.de (Postfix) with ESMTP id B6D7C41C6B4; Mon, 15 Dec 2008 20:55:05 +0100 (CET) X-Virus-Scanned: amavisd-new at cksoft.de Received: from mail.cksoft.de ([62.111.66.27]) by localhost (amavis.str.cksoft.de [192.168.74.71]) (amavisd-new, port 10024) with ESMTP id nHFjzfnNzfS6; Mon, 15 Dec 2008 20:55:05 +0100 (CET) Received: by mail.cksoft.de (Postfix, from userid 66) id 634C541C6A3; Mon, 15 Dec 2008 20:55:05 +0100 (CET) Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net [10.111.66.10]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.int.zabbadoz.net (Postfix) with ESMTP id 1B2194448D5; Mon, 15 Dec 2008 19:50:34 +0000 (UTC) Date: Mon, 15 Dec 2008 19:50:34 +0000 (UTC) From: "Bjoern A. Zeeb" X-X-Sender: bz@maildrop.int.zabbadoz.net To: "Brian A. 
Seklecki"
In-Reply-To: <1229363854.1722.39.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com>
Message-ID: <20081215194716.M97918@maildrop.int.zabbadoz.net>
References: <20081201085229.D80401@maildrop.int.zabbadoz.net>
 <20081201122937.81475f0zhfsjya4o@webmail.leidinger.net>
 <6ae50c2d0812021800x791d2cfeh45d590de120f76df@mail.gmail.com>
 <1228483574.2805.499.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com>
 <86skp2l804.fsf@ds4.des.no>
 <1228507529.2805.539.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com>
 <49418BD9.8080105@h3q.com>
 <1229363854.1722.39.camel@soundwave.ws.pitbpa0.priv.collaborativefusion.com>
X-OpenPGP-Key: 0x14003F198FEFA3E77207EE8D2B58B8F83CCF1842
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
Cc: freebsd-jail@freebsd.org, Philipp Wuensche , FreeBSD virtualization mailing list , freebsd-current@freebsd.org
Subject: Re: HEADS UP: r185435 multi-IPv4/v6/no-IP jails in HEAD
X-BeenThere: freebsd-virtualization@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion of various virtualization techniques FreeBSD supports."
X-List-Received-Date: Mon, 15 Dec 2008 19:55:07 -0000

On Mon, 15 Dec 2008, Brian A. Seklecki wrote:

> On Thu, 2008-12-11 at 22:53 +0100, Philipp Wuensche wrote:
>> Not entirely true, the jls output is totally different than before and
>> breaks third-party applications like jailaudit and ezjail.
>
> Right, well, whether they check for VERSION > 70200x or 80000, the
> format is likely to change.
>
> Once everything has been sorted out, they can add support now, push out
> the updates, and the version in common use will be forward/backward
> compatible.
>
> Whatever we have to do to light a fire there -- I just don't want
> ezjail-admin compatibility to be a showstopper on this.

Two comments: the format as is, is most likely to stay for the lifetime
of the 7.x branch once things are MFCed.
For 8 with vimage and we'll get an entirely new management interface for all this. /bz PS: yes, I know rc.d/jail foo still needs integration. Has anyone tested what was posted? -- Bjoern A. Zeeb The greatest risk is not taking one. From owner-freebsd-virtualization@FreeBSD.ORG Tue Dec 16 07:32:45 2008 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8AD7D1065670 for ; Tue, 16 Dec 2008 07:32:45 +0000 (UTC) (envelope-from wilkinsa@obelix.dsto.defence.gov.au) Received: from digger1.defence.gov.au (digger1.defence.gov.au [203.5.217.4]) by mx1.freebsd.org (Postfix) with ESMTP id F3DFA8FC0C for ; Tue, 16 Dec 2008 07:32:44 +0000 (UTC) (envelope-from wilkinsa@obelix.dsto.defence.gov.au) Received: from ednmsw510.dsto.defence.gov.au (ednmsw510.dsto.defence.gov.au [131.185.68.11]) by digger1.defence.gov.au (DSTO/DSTO) with ESMTP id mBG75YaM007330 for ; Tue, 16 Dec 2008 17:35:35 +1030 (CST) Received: from ednex510.dsto.defence.gov.au (ednex510.dsto.defence.gov.au) by ednmsw510.dsto.defence.gov.au (Clearswift SMTPRS 5.2.9) with ESMTP id for ; Tue, 16 Dec 2008 17:38:36 +1030 Received: from stlex511.dsto.defence.gov.au ([203.6.60.49]) by ednex510.dsto.defence.gov.au with Microsoft SMTPSVC(6.0.3790.3959); Tue, 16 Dec 2008 17:38:36 +1030 Received: from obelix.dsto.defence.gov.au ([203.6.60.208]) by stlex511.dsto.defence.gov.au with Microsoft SMTPSVC(6.0.3790.3959); Tue, 16 Dec 2008 16:08:34 +0900 Received: from obelix.dsto.defence.gov.au (localhost [127.0.0.1]) by obelix.dsto.defence.gov.au (8.14.2/8.14.2) with ESMTP id mBG77Kvw002082 for ; Tue, 16 Dec 2008 16:07:20 +0900 (WST) (envelope-from wilkinsa@obelix.dsto.defence.gov.au) Received: (from wilkinsa@localhost) by obelix.dsto.defence.gov.au (8.14.2/8.14.2/Submit) id mBG77KMa002081 for freebsd-virtualization@freebsd.org; Tue, 16 Dec 2008 16:07:20 +0900 (WST) (envelope-from wilkinsa) Date: Tue, 16 Dec 
2008 16:07:20 +0900 From: "Wilkinson, Alex" To: freebsd-virtualization@freebsd.org Message-ID: <20081216070719.GB1501@stlux503.dsto.defence.gov.au> MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Disposition: inline Organisation: Defence Science Technology Organisation User-Agent: Mutt/1.5.18 (2008-05-17) X-OriginalArrivalTime: 16 Dec 2008 07:08:34.0383 (UTC) FILETIME=[1C02B5F0:01C95F4D] X-TM-AS-Product-Ver: SMEX-7.0.0.1584-5.5.1027-16342.003 X-TM-AS-Result: No-2.120600-0.000000-31 Content-Transfer-Encoding: 7bit Subject: Network Virtualizing X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Dec 2008 07:32:45 -0000 Some of you may find this an interesting read: Crossbow - Network Virtualization Architecture Comes to Life [http://blogs.sun.com/sunay/entry/crossbow_network_virtualization_architecture_comes] -aW IMPORTANT: This email remains the property of the Australian Defence Organisation and is subject to the jurisdiction of section 70 of the CRIMES ACT 1914. If you have received this email in error, you are requested to contact the sender and delete the email. 
From owner-freebsd-virtualization@FreeBSD.ORG Tue Dec 16 08:01:32 2008 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 72CCC1065677 for ; Tue, 16 Dec 2008 08:01:32 +0000 (UTC) (envelope-from prvs=julian=229da16c1@elischer.org) Received: from smtp-outbound.ironport.com (smtp-outbound.ironport.com [63.251.108.112]) by mx1.freebsd.org (Postfix) with ESMTP id 4F44D8FC27 for ; Tue, 16 Dec 2008 08:01:32 +0000 (UTC) (envelope-from prvs=julian=229da16c1@elischer.org) Received: from unknown (HELO julian-mac.elischer.org) ([10.251.60.14]) by smtp-outbound.ironport.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 15 Dec 2008 23:51:26 -0800 Message-ID: <49475DF8.1050608@elischer.org> Date: Mon, 15 Dec 2008 23:51:20 -0800 From: Julian Elischer User-Agent: Thunderbird 2.0.0.18 (Macintosh/20081105) MIME-Version: 1.0 To: "Wilkinson, Alex" References: <20081216070719.GB1501@stlux503.dsto.defence.gov.au> In-Reply-To: <20081216070719.GB1501@stlux503.dsto.defence.gov.au> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-virtualization@freebsd.org Subject: Re: Network Virtualizing X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 16 Dec 2008 08:01:32 -0000 Wilkinson, Alex wrote: > Some of you may find this an interesting read: > > Crossbow - Network Virtualization Architecture Comes to Life > [http://blogs.sun.com/sunay/entry/crossbow_network_virtualization_architecture_comes] Interesting reading.. This is the equivalent of a combination of a number of bits of work done by Marko, Kip, ALC, jhb, and many others.. we aren't too far behind here.. 
Marko has virtual images with cpus assigned to them (or, at least he has work in that direction, and Kip has work on NICs with multiple receive/xmit rings etc. ALC has work on virtualised interfaces with hardware support. Jhb and others are doing multiple MSI interrupts and network polling... Interesting times.. > > -aW > > IMPORTANT: This email remains the property of the Australian Defence Organisation and is subject to the jurisdiction of section 70 of the CRIMES ACT 1914. If you have received this email in error, you are requested to contact the sender and delete the email. > > > _______________________________________________ > freebsd-virtualization@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-virtualization > To unsubscribe, send any mail to "freebsd-virtualization-unsubscribe@freebsd.org" From owner-freebsd-virtualization@FreeBSD.ORG Tue Dec 16 19:37:02 2008 Return-Path: Delivered-To: freebsd-virtualization@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 468791065677; Tue, 16 Dec 2008 19:37:02 +0000 (UTC) (envelope-from prvs=julian=229da16c1@elischer.org) Received: from smtp-outbound.ironport.com (smtp-outbound.ironport.com [63.251.108.112]) by mx1.freebsd.org (Postfix) with ESMTP id 233358FC16; Tue, 16 Dec 2008 19:37:02 +0000 (UTC) (envelope-from prvs=julian=229da16c1@elischer.org) Received: from unknown (HELO julian-mac.elischer.org) ([10.251.60.14]) by smtp-outbound.ironport.com with ESMTP/TLS/DHE-RSA-AES256-SHA; 16 Dec 2008 11:37:01 -0800 Message-ID: <49480357.2000400@elischer.org> Date: Tue, 16 Dec 2008 11:36:55 -0800 From: Julian Elischer User-Agent: Thunderbird 2.0.0.18 (Macintosh/20081105) MIME-Version: 1.0 To: Bruce Simpson References: <4947B5DB.7030502@incunabulum.net> <4947EE59.3090502@elischer.org> <4947F1F4.6020306@incunabulum.net> <4947F935.9070301@elischer.org> <4947FB30.8090905@incunabulum.net> In-Reply-To: 
<4947FB30.8090905@incunabulum.net>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: FreeBSD virtualization mailing list , qingli@FreeBSD.org, zec@freebsd.org, gnn@FreeBSD.org, rwatson@freebsd.org
Subject: Re: Problems with IGMPv3 and VIMAGE
X-BeenThere: freebsd-virtualization@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: "Discussion of various virtualization techniques FreeBSD supports."
X-List-Received-Date: Tue, 16 Dec 2008 19:37:02 -0000

Bruce Simpson wrote:
> Julian Elischer wrote:
>> Virtualisation of netisr at the moment is still in flux a bit.
>> The netisr thread (that's all it really is) becomes associated with
>> a particular vimage as required. (but hey... read the code.. :-)
>>
>> http://perforce.freebsd.org/fileLogView.cgi?FSPC=//depot/projects/vimage/src/sys/net/netisr.c
>>
>>
>>
>>>
>>> Is there a way of virtualizing mutexes? IGMPv3 currently has a global
>>> output queue serviced by the netisr, as that is able to take all the
>>> required locks in the right order to get to ip_output(), and this
>>> queue is covered by a mutex.
>>
>> Mutexes embedded in structures that are virtualised are virtualised..
>> For example, if you have a mutex on the reassembly queue in one stack,
>> that would be a different mutex than the one on another stack, so they
>> could never collide.
>> It's a per-instance decision on the part of the porter.
>
> OK, so I have to move "struct ifqueue igmpoq" into "struct vnet_inet" ?

Not necessarily. The granularity of virtualization follows kernel module
boundaries. By this I mean that if it's separately loadable, it is
probably separate enough to have its own virtualisation structure
containing all its own 'global' variables, including its mutex.
When you register your module with the vimage framework, a new instance
of your structure should be allocated by the constructor method you
registered for it.
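
[Editor's illustration] The per-instance idea above can be sketched in userland C. This is only a sketch under stated assumptions: pthreads stand in for kernel mutexes (kernel code would use struct mtx with mtx_init() and mtx_destroy()), and every name here (igmp_instance, igmp_instance_create, igmp_instance_destroy, igmp_instance_enqueue) is hypothetical and not part of the vimage framework. It only shows that a mutex embedded in a per-instance structure is created by the constructor and torn down by the destructor, so two instances never share a lock.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-instance container: one of these per virtual stack. */
struct igmp_instance {
    pthread_mutex_t oq_mtx;  /* protects oq_len; one mutex per instance */
    int             oq_len;  /* stand-in for the IGMP output queue */
};

/* Constructor: allocates the container and initialises its own mutex. */
struct igmp_instance *igmp_instance_create(void)
{
    struct igmp_instance *ii = calloc(1, sizeof(*ii));

    if (ii == NULL)
        return NULL;
    if (pthread_mutex_init(&ii->oq_mtx, NULL) != 0) {
        free(ii);
        return NULL;
    }
    return ii;
}

/* Destructor: tears down the mutex before freeing the container. */
void igmp_instance_destroy(struct igmp_instance *ii)
{
    if (ii == NULL)
        return;
    pthread_mutex_destroy(&ii->oq_mtx);
    free(ii);
}

/* Takes only this instance's lock, so other instances are unaffected. */
void igmp_instance_enqueue(struct igmp_instance *ii)
{
    pthread_mutex_lock(&ii->oq_mtx);
    ii->oq_len++;
    pthread_mutex_unlock(&ii->oq_mtx);
}
```

Creating two instances and enqueueing on one leaves the other's queue and lock untouched, which is the "could never collide" property described in the quoted text.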
>
> Do I have to explicitly mtx_init() this mutex from within my pr_init
> routine, i.e. igmp_init() ?
>

Yes, as a new one would be created every time we make a new virtual
machine (with its own IGMPv3 instance). Don't forget to make a destructor
function as well that tears down the IGMPv3 stack when the VM is removed.

> Up until now the code has been using the MTX_SYSINIT() macro.

I don't THINK that will work but I could be wrong.. Marko has done some
pretty remarkable things.

>
>>
>> Have you looked at the Vimage porter guide?
>>
>> http://perforce.freebsd.org/fileLogView.cgi?FSPC=//depot/projects/vimage/porting_to_vimage.txt
>>
>
> Yes, but it didn't answer all my questions, so I am bugging everyone :-)

When you understand something new, please send me an update to the doc
that explains it so I can make the doc better.

>
> thanks again
> BMS

From owner-freebsd-virtualization@FreeBSD.ORG Wed Dec 17 10:25:07 2008
Return-Path:
Delivered-To: freebsd-virtualization@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D11FD1065677; Wed, 17 Dec 2008 10:25:07 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net)
Received: from mail.cksoft.de (mail.cksoft.de [62.111.66.27]) by mx1.freebsd.org (Postfix) with ESMTP id 49CCF8FC1B; Wed, 17 Dec 2008 10:25:07 +0000 (UTC) (envelope-from bzeeb-lists@lists.zabbadoz.net)
Received: from localhost (amavis.str.cksoft.de [192.168.74.71]) by mail.cksoft.de (Postfix) with ESMTP id E061F41C6BB; Wed, 17 Dec 2008 11:25:05 +0100 (CET)
X-Virus-Scanned: amavisd-new at cksoft.de
Received: from mail.cksoft.de ([62.111.66.27]) by localhost (amavis.str.cksoft.de [192.168.74.71]) (amavisd-new, port 10024) with ESMTP id jYrK+caOXMY5; Wed, 17 Dec 2008 11:25:05 +0100 (CET)
Received: by mail.cksoft.de (Postfix, from userid 66) id 62C3541C64C; Wed, 17 Dec 2008 11:25:05 +0100 (CET)
Received: from maildrop.int.zabbadoz.net (maildrop.int.zabbadoz.net [10.111.66.10]) (using
TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.int.zabbadoz.net (Postfix) with ESMTP id B2B184448D5; Wed, 17 Dec 2008 10:23:17 +0000 (UTC) Date: Wed, 17 Dec 2008 10:23:16 +0000 (UTC) From: "Bjoern A. Zeeb" X-X-Sender: bz@maildrop.int.zabbadoz.net To: Peter Wemm , FreeBSD virtualization mailing list , FreeBSD current mailing list In-Reply-To: Message-ID: <20081217084354.T97918@maildrop.int.zabbadoz.net> References: <200812132159.mBDLxIQv040799@svn.freebsd.org> X-OpenPGP-Key: 0x14003F198FEFA3E77207EE8D2B58B8F83CCF1842 MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: svn-src-head@freebsd.org, svn-src-all@freebsd.org, src-committers@freebsd.org Subject: HEADS UP: vimage ABI constraints on container structs [was: Re: svn commit: r186057 - head/sys/netinet] X-BeenThere: freebsd-virtualization@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: FreeBSD virtualization mailing list List-Id: "Discussion of various virtualization techniques FreeBSD supports." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Dec 2008 10:25:08 -0000 On Tue, 16 Dec 2008, Peter Wemm wrote: Hi, let me Cc: virtualization and current@ for this reply (to have the explicit HEADS UP) for what Peter pointed out. The first to reply may want to trim the Cc: list again; possibly to only virtualization. > On Sat, Dec 13, 2008 at 1:59 PM, Bjoern A. Zeeb wrote: >> De-virtualize the MD5 context for TCP initial seq number generation >> and make it a function local variable like we do almost everywhere >> inside the kernel. > [..] 
>> --- head/sys/netinet/vinet.h  Sat Dec 13 21:17:46 2008  (r186056)
>> +++ head/sys/netinet/vinet.h  Sat Dec 13 21:59:18 2008  (r186057)
>> @@ -142,7 +142,6 @@ struct vnet_inet {
>>        int     _isn_last_reseed;
>>        u_int32_t _isn_offset;
>>        u_int32_t _isn_offset_old;
>> -      MD5_CTX _isn_ctx;
>>
>>        struct inpcbhead _udb;
>>        struct inpcbinfo _udbinfo;
>
> I'm bitterly unhappy with this. Every time these structs are touched,
> either directly or indirectly, there is a guaranteed ABI breakage with
> kernel modules.
>
> There needs to be a __FreeBSD_version bump (or something similar)
> every time any of these structures change, and any kernel modules
> *must* be prevented from loading. It can't be a >= some version, it
> has to be an exact match.
>
> With the global variable method the linker calculates the offsets at
> load time. With this abomination, the knowledge of the structure
> layout is compiled into the generated code with no chance of a fixup.
> There are no sanity checks. If a module that referenced _isn_ctx is
> loaded the old way, there would be a link error. The new way will
> have it silently trash _udb instead.
>
> There is a whole world of hurt being unleashed here. I suspect that
> it might even be possible to run out of digits in our
> __FreeBSD_version numbering scheme.
>
> I know we've talked about ABI-stable alternatives after the
> infrastructure is done, but it needs to be absolutely clear that
> touching this structure in the current form is a guaranteed ABI break,
> and is currently undetected. If you boot kernel.old, you're hosed.
> (Again, with -current, this might be ok for a while, but this scheme
> won't wash with ports or other 3rd party modules)
...
> In the meantime, I'd like to see some compile-time asserts in there
> to make sure there are no accidental size changes of this structure.
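
[Editor's illustration] The silent breakage described above can be reproduced with two stand-in layouts. This is a sketch only: fake_md5_ctx, vnet_old, vnet_new and stale_access_hits_udb are hypothetical names, the member types are simplified (a void pointer replaces struct inpcbhead), and the size of the stand-in MD5 context is chosen for illustration. It shows that an offset baked into a module built against the old header lands on a different member once the field is removed.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for MD5_CTX; the size is an assumption for illustration. */
struct fake_md5_ctx {
    uint32_t      state[4];
    uint32_t      count[2];
    unsigned char buffer[64];
};

/* Layout a module was compiled against (before the member was removed). */
struct vnet_old {
    int                 isn_last_reseed;
    uint32_t            isn_offset;
    uint32_t            isn_offset_old;
    struct fake_md5_ctx isn_ctx;
    void               *udb;       /* simplified stand-in for _udb */
};

/* Layout the running kernel uses (after the member was removed). */
struct vnet_new {
    int      isn_last_reseed;
    uint32_t isn_offset;
    uint32_t isn_offset_old;
    void    *udb;
};

/*
 * Returns nonzero if a write to the stale isn_ctx offset, as compiled
 * into an old module, would overlap udb in the new layout.
 */
int stale_access_hits_udb(void)
{
    size_t stale_start = offsetof(struct vnet_old, isn_ctx);
    size_t stale_end   = stale_start + sizeof(struct fake_md5_ctx);
    size_t udb_start   = offsetof(struct vnet_new, udb);
    size_t udb_end     = udb_start + sizeof(void *);

    return stale_start < udb_end && stale_end > udb_start;
}
```

Nothing in the compiler or linker flags this, because the offset is a plain constant in the generated code; with separate global variables the same mismatch would surface as an unresolved symbol at load time.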
Ok, there are multiple things here:

Size changes:
----------------------------------------
* I think catching pure size changes alone is not enough; any change to
  the structs matters. Think of removing an int and adding an int (even
  at the same place). Something working on this will trash it.

* The size checks would of course guard against non-obvious, indirect
  struct changes like (just an example) a change to struct inpcbinfo,
  which worry me even more (as they seem to worry you).

More on structure changes:
----------------------------------------
* While this is on HEAD (referring to "Again, with -current, this might
  be ok for a while ..") I expected the following (major) changes coming
  with the continuing integration and testing:

  1) Final pass through the set of virtualized variables.
     That might happen after Step #3 when people can actually test off
     SVN.
  2) During benchmarking - this would be about the same time - there
     might be shuffling.
  3) Before we turn to a STABLE branch. *fear*

* Again it's the indirect changes mentioned above (which are very
  worrying).

ABI problem:
----------------------------------------
* I agree that there are unaddressed challenges here.

* Ideally we want version checking - at least at load time for modules -
  per container struct, as people do not want to change and recompile
  everything if, say, something in gif or ipsec changes which might not
  affect them.
  netgraph has done something like this but my feeling is that that
  would be the wrong way to go, especially wrt. vinet/vinet6.

* I am not sure if padding as we had it before will work for stable
  branches. We need to think about this problem as well.
  To my understanding MAC has another really large structure with
  sufficient padding, but it's a subsystem more or less living on its
  own on the side, not changed between branches and MFCed as heavily as
  the netstack, and it holds mostly (function) pointers there, with
  fewer direct members.

* If you have suggestions or solutions, please share them.

* ..

Misc:
----------------------------------------
* I was aware of the problem and failed on two fronts here:
  1) Doing my commit after Marko and forgetting about it going from the
     p4 branches to SVN.
  2) Forgetting to mention this in the HEADS UP :(

* The http://wiki.freebsd.org/Image/BeginnersGuideFAQ had a (somewhat
  hidden) "A single change would require complete recompilation."
  I factored it out but will need to be more explicit there or refer to
  this thread.

* Thanks a lot for sending this out to committers and making them aware
  of the problem. I Cc:ed current@ and virtualization@ in the reply and
  changed the subject.

* I am happy there is another pair of eyes here now :)
  After losing the initial three other reviewers ...

* I think the current mode (is?/)was "getting the infrastructure in" and
  getting more hands (again) for the remaining parts when inevitably
  confronted with it, once the huge pile of changes is gone and one
  might see more clearly again.

* We may want to always keep in mind that, not for 8 but maybe for 9,
  people may want to virtualize more subsystems, so whatever we do
  should possibly be general enough to work for other parts of the
  kernel as well.

* As a very last resort at the moment (which would be a pity) we can
  still do the rollback, keep the globals (as default) and only have
  virtualization optional. I would expect there to be some "keeping in
  sync" problems but it would be a lot easier to have both working and
  have people adhere to the ``V_rules'' than on the side.
  It would be kind of hard for 3rd party vendors to supply (binary)
  modules with that then though. I really hope we won't need to go there.

/bz

-- 
Bjoern A. Zeeb                      The greatest risk is not taking one.
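
[Editor's illustration] The two safeguards discussed in this thread, compile-time asserts on the container struct and an exact-match check at module load time, might look roughly like the sketch below. Everything in it is an assumption made for illustration: demo_vnet, DEMO_VNET_LAYOUT_VERSION and demo_vnet_module_compatible are hypothetical names, not existing kernel interfaces, and the C11 _Static_assert keyword stands in for the kernel's CTASSERT. As noted above, a size check alone cannot catch a same-size swap of members, so the sketch pairs it with a per-struct layout version that must be bumped by hand on any change.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container struct standing in for something like vnet_inet. */
struct demo_vnet {
    int32_t  last_reseed;
    uint32_t offset;
    uint32_t offset_old;
};

/* Bumped by hand on ANY layout change, even one that keeps the size. */
#define DEMO_VNET_LAYOUT_VERSION 3u

/* Compile-time guards: the build fails if size or a key offset drifts. */
_Static_assert(sizeof(struct demo_vnet) == 12,
    "struct demo_vnet size changed: bump DEMO_VNET_LAYOUT_VERSION");
_Static_assert(offsetof(struct demo_vnet, offset_old) == 8,
    "struct demo_vnet member moved: bump DEMO_VNET_LAYOUT_VERSION");

/*
 * Load-time guard: a module records the version and size it was built
 * against; the loader accepts it only on an exact match, never ">=".
 */
int demo_vnet_module_compatible(unsigned module_version, size_t module_size)
{
    return module_version == DEMO_VNET_LAYOUT_VERSION &&
        module_size == sizeof(struct demo_vnet);
}
```

Keeping one version per container struct, rather than one global number, would let a change in one subsystem's struct reject only the modules that actually use it.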