Date: Wed, 19 Apr 2006 11:40:15 GMT
Message-Id: <200604191140.k3JBeFRn049327@freefall.freebsd.org>
To: freebsd-bugs@FreeBSD.org
From: Daniel Hartmeier
Subject: Re: kern/95559: [RELENG_6] write(2) fails with EPERM on TCP socket under certain situations

The following reply was made to PR kern/95559; it has been noted by GNATS.

From: Daniel Hartmeier
To: Xin LI
Cc: Gleb Smirnoff, gnn@FreeBSD.org, Robert Watson, mlaier@FreeBSD.org, Xin LI, FreeBSD-gnats-submit@FreeBSD.org
Subject: Re: kern/95559: [RELENG_6] write(2) fails with EPERM on TCP socket under certain situations
Date: Wed, 19 Apr 2006 13:37:52 +0200

I haven't read all the context yet, but maybe I can tell you how to
check whether it's really pf blocking any packets.

If you create a state entry in pf for a TCP connection, pf must see and
match all packets of that connection against that state entry, otherwise
things will break. For instance, if pf only associates packets flowing
in one direction with the state entry, the state entry will never
advance to 'established', pf can't track TCP windows, and it will sooner
or later start to block packets.

If outgoing packets of a connection are seen (by pf) on interface A, but
incoming packets of the same connection on a different interface B,
things still work, provided you create a floating state (not using
'if-bound'). But the direction of packets matters. If pf sees the
packets of both directions as incoming (on different interfaces), or
both as outgoing, things break.

To check whether either of those things occurs in your setup, establish
one connection, then check the following in pf:

a) pfctl -vvss should show one (or more) states related to the
   connection. Check the "x:y pkts" part on the third line; it shows how
   many packets pf has associated with the state entry so far. x is the
   number of packets in the same direction as the initial packet that
   created the state, y the number in the reverse direction. If either
   one of those is >1 but the other ==0, pf doesn't see replies in the
   opposite direction.

   The right-most string on the first line tells how advanced the state
   entry is. After a successful TCP handshake, while the connection is
   not closed from either side, it should read
   'ESTABLISHED:ESTABLISHED'.

   If there are multiple states related to a single connection, make
   sure each one is created as expected and advances normally. A common
   mistake, for instance, is to create state not on the initial SYN of
   the TCP handshake, but on a subsequent packet. This causes pf to miss
   the TCP window scaling negotiation, and can eventually break
   connections after they appear to have been established fine and have
   progressed to some degree.

b) pfctl -si: check for increasing counters once the problem occurs. pf
   will increase at least one of these counters for every packet it
   blocks, for any reason. If no counter is increasing, pf hasn't
   blocked a packet.

c) pfctl -xm enables debug logging to /var/log/messages. Enable it,
   reproduce the problem, then check the log. If there are any messages
   from pf (like 'BAD state'), those will help the analysis.

One explanation why you'd see EPERM is that in FreeBSD, the pfil wrapper
simply returns pf_test()'s return value. This is either PF_PASS (0) or
PF_DROP (1), and 1 is also the value of EPERM, by coincidence.

On OpenBSD and NetBSD, the return value PF_DROP of pf_test() is mapped
to errno 65 EHOSTUNREACH, since that is an existing errno which most
network-related syscalls that can now additionally fail due to pf
blocking could already return (according to their individual man
pages).

While returning EPERM is somewhat intuitive, it's not an errno an
application must expect to come back from most such syscalls. On the
other hand, people are regularly confused (on Open/Net) when tools (like
ping) fail with 'No route to host' due to pf blocking, when the routing
table is not the problem at all.

Not sure whether this is intentionally different on FreeBSD or an
oversight. From a cross-platform supporter's point of view, it would
make things easier if it were the same on all platforms :)

HTH,
Daniel
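
For check a), the counter comparison could be automated roughly like
this. It is only a sketch: it assumes the state listing contains a field
of the form "x:y pkts" as described above, the function name is made up,
and the sample lines are hypothetical rather than captured from a real
system, so actual pfctl output may need a different pattern.

```python
import re

# Matches the "x:y pkts" field described in check a).
PKTS_RE = re.compile(r"(\d+):(\d+) pkts")


def one_sided(state_line):
    """Return True if packets were counted in only one direction.

    Applies the rule of thumb from a): one counter is >1 while the
    other is still 0. Returns None if no "x:y pkts" field is found.
    """
    m = PKTS_RE.search(state_line)
    if m is None:
        return None
    fwd, rev = int(m.group(1)), int(m.group(2))
    return (fwd > 1 and rev == 0) or (rev > 1 and fwd == 0)


# Hypothetical sample lines (assumed layout, not real output).
print(one_sided("age 00:00:12, expires in 00:00:50, 7:0 pkts, 420:0 bytes"))    # True
print(one_sided("age 00:00:12, expires in 23:59:58, 7:6 pkts, 420:512 bytes"))  # False
```

A state showing 1:0 is not flagged, since a single unanswered initial
packet is normal right after the state has been created.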
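
The errno collision can be illustrated with a small sketch. PF_PASS and
PF_DROP carry the values given above; the two wrapper functions are
hypothetical stand-ins (the real wrappers are kernel C code, not
Python), and the numeric value of EHOSTUNREACH depends on the platform
(65 on the BSDs).

```python
import errno
import os

PF_PASS = 0  # value as stated above
PF_DROP = 1  # value as stated above


def wrapper_passthrough(verdict):
    """Hypothetical FreeBSD-style wrapper: return the pf verdict unchanged."""
    return verdict


def wrapper_mapped(verdict):
    """Hypothetical OpenBSD/NetBSD-style wrapper: map a drop to EHOSTUNREACH."""
    return errno.EHOSTUNREACH if verdict == PF_DROP else 0


# A dropped packet comes back as 1, which happens to be EPERM.
print(wrapper_passthrough(PF_DROP) == errno.EPERM)  # True

# With the mapping, the caller gets the platform's EHOSTUNREACH message.
print(os.strerror(wrapper_mapped(PF_DROP)))
```

Either way the application only sees a generic errno; nothing in the
value itself says that pf was the cause, which is why the checks in a)
to c) are needed.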