Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 13 Nov 2018 23:17:47 +0100
From:      "Kristof Provost" <kp@FreeBSD.org>
To:        "Andreas Longwitz" <longwitz@incore.de>, "Gleb Smirnoff" <glebius@freebsd.org>, kib@freebsd.org
Cc:        freebsd-pf@freebsd.org
Subject:   Re: rdr pass for proto tcp sometimes creates states with expire time zero and so breaking connections
Message-ID:  <9004F62C-D1DC-4CFA-93A1-67E981274831@FreeBSD.org>
In-Reply-To: <5BEB3B9A.9080402@incore.de>
References:  <5BC51424.5000309@incore.de> <C4D1F141-2979-4103-957F-F0314637D978@sigsegv.be> <5BD45882.1000207@incore.de> <D5EEA773-1F0F-4FA0-A39A-486EE323907D@sigsegv.be> <5BEB3B9A.9080402@incore.de>

next in thread | previous in thread | raw e-mail | index | archive | help
On 13 Nov 2018, at 22:01, Andreas Longwitz wrote:
>>
>>     Are there any hints why the counter pf_default_rule->states_cur
>>     could get a negative value ?
>>
>> I’m afraid I have no idea right now.
>>
>
> OK, in the meantime I did some more research and I am now quite sure 
> the
> problem with the bogus pf_default_rule->states_cur counter is not a
> problem in pf. I am convinced it is a problem in counter(9) on i386
> server. The critical code is the machine instruction cmpxchg8b used in
> /sys/i386/include/counter.h.
>
I’m always happy to hear problems aren’t my fault :)

>> From intel instruction set reference manual:
> Zhis instruction can be used with a LOCK prefix allow the instruction 
> to
> be executed atomically.
>
> We have two other sources in kernel using cmpxchg8b:
>   /sys/i386/include/atomic.h   and
>   /sys/cddl/contrib/opensolaris/common/atomic/i386/opensolaris_atomic.S
>
> Both make use of the LOCK feature, in atomic.h a detailed explanation 
> is
> given. Because counter.h lacks the LOCK prefix I propose the following
> patch to get around the leak:
>
> --- counter.h.orig       2015-07-03 16:45:36.000000000 +0200
> +++ counter.h   2018-11-13 16:07:20.329053000 +0100
> @@ -60,6 +60,7 @@
>         "movl   %%edx,%%ecx\n\t"
>         "addl   (%%edi),%%ebx\n\t"
>         "adcl   4(%%edi),%%ecx\n\t"
> +       "lock   \n\t"
>         "cmpxchg8b %%fs:(%%esi)\n\t"
>         "jnz    1b"
>         :
> @@ -76,6 +77,7 @@
>         __asm __volatile(
>         "movl   %%eax,%%ebx\n\t"
>         "movl   %%edx,%%ecx\n\t"
> +       "lock   \n\t"
>         "cmpxchg8b      (%2)"
>         : "=a" (res_lo), "=d"(res_high)
>         : "SD" (p)
> @@ -121,6 +123,7 @@
>         "xorl   %%ebx,%%ebx\n\t"
>         "xorl   %%ecx,%%ecx\n\t"
>  "1:\n\t"
> +       "lock   \n\t"
>         "cmpxchg8b      (%0)\n\t"
>         "jnz    1b"
>         :
>
That looks very plausible. I’m somewhat out of my depth here, so I’d 
like the authors of the counter code to take a look at it.

Best regards,
Kristof
From owner-freebsd-pf@freebsd.org  Tue Nov 13 22:25:45 2018
Return-Path: <owner-freebsd-pf@freebsd.org>
Delivered-To: freebsd-pf@mailman.ysv.freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2610:1c1:1:606c::19:1])
 by mailman.ysv.freebsd.org (Postfix) with ESMTP id 144F9113485A
 for <freebsd-pf@mailman.ysv.freebsd.org>; Tue, 13 Nov 2018 22:25:45 +0000 (UTC)
 (envelope-from glebius@freebsd.org)
Received: from cell.glebi.us (glebi.us [198.45.61.253])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits))
 (Client CN "cell.glebi.us", Issuer "cell.glebi.us" (not verified))
 by mx1.freebsd.org (Postfix) with ESMTPS id 6B09A7B755
 for <freebsd-pf@freebsd.org>; Tue, 13 Nov 2018 22:25:44 +0000 (UTC)
 (envelope-from glebius@freebsd.org)
Received: from cell.glebi.us (localhost [127.0.0.1])
 by cell.glebi.us (8.15.2/8.15.2) with ESMTPS id wADMPanK046049
 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NO);
 Tue, 13 Nov 2018 14:25:36 -0800 (PST)
 (envelope-from glebius@freebsd.org)
Received: (from glebius@localhost)
 by cell.glebi.us (8.15.2/8.15.2/Submit) id wADMPXmb046047;
 Tue, 13 Nov 2018 14:25:33 -0800 (PST)
 (envelope-from glebius@freebsd.org)
X-Authentication-Warning: cell.glebi.us: glebius set sender to
 glebius@freebsd.org using -f
Date: Tue, 13 Nov 2018 14:25:33 -0800
From: Gleb Smirnoff <glebius@freebsd.org>
To: Andreas Longwitz <longwitz@incore.de>
Cc: Kristof Provost <kristof@sigsegv.be>, freebsd-pf@freebsd.org
Subject: Re: rdr pass for proto tcp sometimes creates states with expire time
 zero and so breaking connections
Message-ID: <20181113222533.GJ9744@FreeBSD.org>
References: <5BC51424.5000309@incore.de>
 <C4D1F141-2979-4103-957F-F0314637D978@sigsegv.be>
 <5BD45882.1000207@incore.de>
 <D5EEA773-1F0F-4FA0-A39A-486EE323907D@sigsegv.be>
 <5BEB3B9A.9080402@incore.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <5BEB3B9A.9080402@incore.de>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Rspamd-Queue-Id: 6B09A7B755
X-Spamd-Result: default: False [-103.10 / 200.00]; ARC_NA(0.00)[];
 NEURAL_HAM_MEDIUM(-1.00)[-1.000,0];
 ALLOW_DOMAIN_WHITELIST(-100.00)[freebsd.org];
 FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[3];
 TO_DN_SOME(0.00)[]; NEURAL_HAM_LONG(-1.00)[-1.000,0];
 MIME_GOOD(-0.10)[text/plain]; HAS_XAW(0.00)[];
 DMARC_NA(0.00)[freebsd.org]; R_SPF_SOFTFAIL(0.00)[~all];
 RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[];
 MX_GOOD(-0.01)[cached: mx66.freebsd.org];
 NEURAL_HAM_SHORT(-0.97)[-0.972,0];
 IP_SCORE(-0.02)[country: US(-0.09)]; FROM_EQ_ENVFROM(0.00)[];
 R_DKIM_NA(0.00)[]; RCVD_TLS_LAST(0.00)[];
 ASN(0.00)[asn:2906, ipnet:198.45.48.0/20, country:US];
 MID_RHS_MATCH_FROM(0.00)[]
X-Rspamd-Server: mx1.freebsd.org
X-BeenThere: freebsd-pf@freebsd.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: "Technical discussion and general questions about packet filter
 \(pf\)" <freebsd-pf.freebsd.org>
List-Unsubscribe: <https://lists.freebsd.org/mailman/options/freebsd-pf>,
 <mailto:freebsd-pf-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-pf/>;
List-Post: <mailto:freebsd-pf@freebsd.org>
List-Help: <mailto:freebsd-pf-request@freebsd.org?subject=help>
List-Subscribe: <https://lists.freebsd.org/mailman/listinfo/freebsd-pf>,
 <mailto:freebsd-pf-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Tue, 13 Nov 2018 22:25:45 -0000

On Tue, Nov 13, 2018 at 10:01:14PM +0100, Andreas Longwitz wrote:
A> OK, in the meantime I did some more research and I am now quite sure the
A> problem with the bogus pf_default_rule->states_cur counter is not a
A> problem in pf. I am convinced it is a problem in counter(9) on i386
A> server. The critical code is the machine instruction cmpxchg8b used in
A> /sys/i386/include/counter.h.
A> 
A> From intel instruction set reference manual:
A> Zhis instruction can be used with a LOCK prefix allow the instruction to
A> be executed atomically.
A> 
A> We have two other sources in kernel using cmpxchg8b:
A>   /sys/i386/include/atomic.h   and
A>   /sys/cddl/contrib/opensolaris/common/atomic/i386/opensolaris_atomic.S

A single CPU instruction is atomic by definition, with regards to the CPU.
A preemption can not happen in a middle of instruction. What the "lock"
prefix does is memory locking to avoid unlocked parallel access to the
same address by different CPUs.

What is special about counter(9) is that %fs:%esi always points to a
per-CPU address, because %fs is unique for every CPU and is constant,
so no other CPU may write to this address, so lock prefix isn't needed.

Of course a true SMP i386 isn't a well tested arch, so I won't assert
that counter(9) doesn't have bugs on this arch. However, I don't see
lock prefix necessary here.

-- 
Gleb Smirnoff



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?9004F62C-D1DC-4CFA-93A1-67E981274831>