From owner-freebsd-stable Thu Nov 13 23:26:26 1997
Return-Path:
Received: (from root@localhost)
          by hub.freebsd.org (8.8.7/8.8.7) id XAA06152
          for stable-outgoing; Thu, 13 Nov 1997 23:26:26 -0800 (PST)
          (envelope-from owner-freebsd-stable)
Received: from mail.san.rr.com (san.rr.com [204.210.0.1])
          by hub.freebsd.org (8.8.7/8.8.7) with ESMTP id XAA06144
          for ; Thu, 13 Nov 1997 23:26:24 -0800 (PST)
          (envelope-from studded@san.rr.com)
Received: (from studded@localhost)
          by mail.san.rr.com (8.8.7/8.8.7) id XAA05912;
          Thu, 13 Nov 1997 23:25:53 -0800 (PST)
Message-Id: <199711140725.XAA05912@mail.san.rr.com>
From: "Studded"
To: "FreeBSD Stable List"
Date: Thu, 13 Nov 97 23:25:46 -0800
Reply-To: "Studded"
Priority: Normal
X-Mailer: PMMail 1.95a For OS/2
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Subject: Serious problem with ipfw in 11/10 Snap
Sender: owner-freebsd-stable@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I was going to try and put more detail in this, but I don't have the
time to do any digging right now.

After using a tried and true make world process to upgrade to the
2.2-stable snap from 11/10, I built the kernel with the same options
that were working in an earlier 2.2-stable snap, including ipfw, and
rebooted only to find a dead machine.

What happened was that an error somewhere in the ipfw code caused the
default rule of deny all from any to any to load as rule 00000 instead
of 65535, causing the box to be locked out from the net. The rule
persisted even after a flush, so it was definitely a problem with the
code. The 11/11 snap does not have this problem, and we're happily
running it now.

At the same time, the person who owns the hardware and bandwidth our
server runs on is extremely frustrated with FreeBSD. This is twice now
that we've been bitten in the ass by foolish mistakes in ipfw in what
is supposed to be the STABLE branch.
He suggested that if this happens again, he's going to put up a note
on the server for the 40,000 users we get every day apologizing for the
outage, and explaining that they can blame it on FreeBSD.

IPFW problems are especially bad for us because our 2 servers are in a
colo that goes without people for several days. Therefore, problems
that isolate the machines from the net can cost us days of uptime.

I've two reasons for writing this. The most important is to notify
whoever is working on this part of the code, and hopefully make sure it
doesn't happen again. Secondly, to urge caution when changes are made
to -stable. I know that y'all are volunteers, but so am I. :)

Finally, if anyone can tell me exactly where in the code I can look to
double check this problem in the future, I'd appreciate it. If ever
there was a case for the default rule to be open, I'd say this is it. :-/

Thanks,

Doug