Date:      Fri, 14 Jun 2019 10:20:18 -0700
From:      David Wolfskill <david@catwhisker.org>
To:        Peter <pmc@citylink.dinoex.sub.org>
Cc:        freebsd-ipfw@freebsd.org
Subject:   Re: ipfw: switching sets does stall the machine
Message-ID:  <20190614172018.GJ1219@albert.catwhisker.org>
In-Reply-To: <20190614153302.GA4503@gate.oper.dinoex.org>
References:  <20190614153302.GA4503@gate.oper.dinoex.org>

On Fri, Jun 14, 2019 at 05:33:02PM +0200, Peter wrote:
>
> Hi,
> I am trying to use two different configurations (production and test)
> loaded into different sets, and switch between them with
>
>    # ipfw set disable ... enable ...
>
> When testing my script, this did work, except once the machine went
> into "swap_pager indefinite wait" and was lost.

IIRC, this message means that a command was sent to a disk controller
and at least 20 seconds have elapsed with no response from that
controller.  That doesn't seem like an "ipfw" issue, per se.
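
One way to see how often the kernel has logged this kind of complaint
is to count the matching lines in the message buffer. The following is
a minimal sketch only: the function name is hypothetical, and the two
patterns are assumptions taken from the messages quoted in this thread
(the exact wording differs between releases and drivers).

```shell
# Count log lines that point at the disk subsystem rather than at ipfw.
# Reads the named files, or standard input when no file is given.
# Assumption: the patterns mirror the two messages quoted in this thread.
disk_trouble_lines() {
    grep -E -i -c 'swap_pager.*indefinite wait|command timeout' "$@"
}
```

Typical use would be "dmesg | disk_trouble_lines" or
"disk_trouble_lines /var/log/messages"; a non-zero count suggests
looking at the storage path first.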

> Then, after reboot (and automatically loading the production rules) I
> tried to load and switch to the test rules, and immediately got ATA
> COMMAND TIMEOUT and the machine was lost.

Again, that's a disk subsystem (apparently) doing Bad Things.

> I repeated this a few times, it is nicely reproducible: within 3-5
> seconds after the new rules are loaded, the machine locks up and is
> lost.

It's at least plausible (I expect) that switching the sets merely
produces a particular disk I/O pattern, and that this pattern is what
actually triggers the failure.

> I analyzed more closely by running "top -HPS" in rtprio, and found
> this:
>  * loading the rules is no problem.
>  * when switching sets, the command returns, but then within few
>    seconds the machine gets unresponsive and stays so until watchdog
>    hits.
>  * The last thing seen in "top" (before it freezes) is this thread
>    eating 85% CPU (and running with high priority):
>    [irq12: uhci0 uhci1]
>
> Is there a known workaround?
> ....

My inclination is for you to check the disk drive(s), cabling, and
controller(s) before much else.
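
A rough sketch of what the software side of such a check might look
like follows. The helper names and the default device "ada0" are
assumptions for illustration; smartctl comes from the smartmontools
port rather than the base system, and none of this replaces
inspecting the cabling by hand.

```shell
# Run a diagnostic only if the tool is installed, so the script still
# works on a machine where, for example, smartmontools is missing.
run_if_present() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $*"
        "$@" 2>&1 || echo "(exit status $?)"
    else
        echo "== $1: not installed, skipping"
    fi
}

# Collect basic disk and interrupt information for one drive.
# Assumption: the drive is named like ada0; pass another name if not.
check_disks() {
    disk="${1:-ada0}"
    run_if_present camcontrol devlist           # devices the CAM layer sees
    run_if_present smartctl -H -A "/dev/$disk"  # SMART health and attributes
    run_if_present vmstat -i                    # interrupt counts per source
}
```

Running "check_disks ada0" as root prints each section in turn. The
"vmstat -i" output is included because the original report mentions
an interrupt thread consuming most of the CPU just before the freeze.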

Peace,
david
-- 
David H. Wolfskill				david@catwhisker.org
Donald Trump advocated for the executions of five factually innocent young men.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.
