Date:      Wed, 19 Feb 2020 13:20:36 +0000
From:      bugzilla-noreply@freebsd.org
To:        virtualization@FreeBSD.org
Subject:   [Bug 235856] FreeBSD freezes on AWS EC2 t3 machines
Message-ID:  <bug-235856-27103-4v5GWi4oyj@https.bugs.freebsd.org/bugzilla/>
In-Reply-To: <bug-235856-27103@https.bugs.freebsd.org/bugzilla/>
References:  <bug-235856-27103@https.bugs.freebsd.org/bugzilla/>

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=235856

--- Comment #23 from mail@rubenvos.com ---
(In reply to Colin Percival from comment #21)

Hmm. The difference of exactly 1 hour doesn't seem to have a relationship with
different timezones...

Today we had another occurrence on one of the machines:

Feb 19 03:24:53 volume3 kernel: nvme1: cpl does not map to outstanding cmd
Feb 19 03:24:53 volume3 kernel: cdw0:00000000 sqhd:000c sqid:0002 cid:0017 p:0 sc:00 sct:0 m:0 dnr:0
Feb 19 03:24:53 volume3 kernel: nvme1: Missing interrupt
Feb 19 03:24:53 volume3 kernel: nvme1: Resetting controller due to a timeout.
Feb 19 03:24:53 volume3 kernel: nvme1: resetting controller
Feb 19 03:24:54 volume3 kernel: nvme1: temperature threshold not supported
Feb 19 03:24:54 volume3 kernel: nvme1: aborting outstanding i/o

Comparing 03:24:53 with the access times of the daily scripts though:

ls -lahtuT /etc/periodic/daily/
total 128
-rwxr-xr-x  1 root  wheel   1.0K Feb 19 12:31:47 2020 450.status-security
-rwxr-xr-x  1 root  wheel   811B Feb 19 05:32:14 2020 999.local
-rwxr-xr-x  1 root  wheel   2.8K Feb 19 05:32:14 2020 800.scrub-zfs
-rwxr-xr-x  1 root  wheel   845B Feb 19 05:32:14 2020 510.status-world-kernel
-rwxr-xr-x  1 root  wheel   737B Feb 19 05:32:14 2020 500.queuerun
-rwxr-xr-x  1 root  wheel   498B Feb 19 05:32:14 2020 480.status-ntpd
-rwxr-xr-x  1 root  wheel   451B Feb 19 05:32:14 2020 480.leapfile-ntpd
-rwxr-xr-x  1 root  wheel   2.0K Feb 19 05:32:14 2020 460.status-mail-rejects
-rwxr-xr-x  1 root  wheel   1.4K Feb 19 03:01:00 2020 440.status-mailq
-rwxr-xr-x  1 root  wheel   705B Feb 19 03:01:00 2020 430.status-uptime
-rwxr-xr-x  1 root  wheel   611B Feb 19 03:01:00 2020 420.status-network
-rwxr-xr-x  1 root  wheel   684B Feb 19 03:01:00 2020 410.status-mfi
-rwxr-xr-x  1 root  wheel   590B Feb 19 03:01:00 2020 409.status-gconcat
-rwxr-xr-x  1 root  wheel   590B Feb 19 03:01:00 2020 408.status-gstripe
-rwxr-xr-x  1 root  wheel   591B Feb 19 03:01:00 2020 407.status-graid3
-rwxr-xr-x  1 root  wheel   596B Feb 19 03:01:00 2020 406.status-gmirror
-rwxr-xr-x  1 root  wheel   807B Feb 19 03:01:00 2020 404.status-zfs
-rwxr-xr-x  1 root  wheel   583B Feb 19 03:01:00 2020 401.status-graid
-rwxr-xr-x  1 root  wheel   773B Feb 19 03:01:00 2020 400.status-disks
-rwxr-xr-x  1 root  wheel   724B Feb 19 03:01:00 2020 330.news
-r-xr-xr-x  1 root  wheel   1.4K Feb 19 03:01:00 2020 310.accounting
-rwxr-xr-x  1 root  wheel   693B Feb 19 03:01:00 2020 300.calendar
-rwxr-xr-x  1 root  wheel   1.0K Feb 19 03:01:00 2020 210.backup-aliases
-rwxr-xr-x  1 root  wheel   1.7K Feb 19 03:01:00 2020 200.backup-passwd
-rwxr-xr-x  1 root  wheel   603B Feb 19 03:01:00 2020 150.clean-hoststat
-rwxr-xr-x  1 root  wheel   1.0K Feb 19 03:01:00 2020 140.clean-rwho
-rwxr-xr-x  1 root  wheel   709B Feb 19 03:01:00 2020 130.clean-msgs
-rwxr-xr-x  1 root  wheel   1.1K Feb 19 03:01:00 2020 120.clean-preserve
-rwxr-xr-x  1 root  wheel   1.5K Feb 19 03:01:00 2020 110.clean-tmps
-rwxr-xr-x  1 root  wheel   1.3K Feb 19 03:01:00 2020 100.clean-disks
drwxr-xr-x  2 root  wheel   1.0K Nov  1 07:06:41 2019 .
drwxr-xr-x  6 root  wheel   512B Nov  1 07:06:41 2019 ..

but if the periodic framework executes the jobs serially I see no link with
440.status-mailq (that does not sound like high io) :S.

I think there definitely is a link between this bug and high disk/network I/O,
so the periodic framework probably qualifies as a nice trigger (especially the
security bits with the find commands)....

We will continue to cross-reference the access times of the daily scripts with
the "Missing interrupt" occurrences and post updates.
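
A minimal sketch of such a cross-reference in Python, for illustration only.
The function names (parse_syslog_time, missing_interrupt_times,
bracketing_access_times) are hypothetical, the year 2020 is an assumption taken
from the ls -T output because syslog lines carry no year, and the sample values
are copied from the log and listing above:

```python
from datetime import datetime

# Assumption: syslog timestamps omit the year, so it is supplied explicitly.
YEAR = 2020


def parse_syslog_time(line, year=YEAR):
    """Parse the leading 'Feb 19 03:24:53' timestamp of a syslog line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def missing_interrupt_times(lines):
    """Return the timestamps of all 'Missing interrupt' log lines."""
    return [parse_syslog_time(l) for l in lines if "Missing interrupt" in l]


def bracketing_access_times(event, access_times):
    """Return (latest access time <= event, earliest access time > event)."""
    before = [t for t in access_times if t <= event]
    after = [t for t in access_times if t > event]
    return (max(before) if before else None, min(after) if after else None)


# Sample data copied from the message above.
log_lines = [
    "Feb 19 03:24:53 volume3 kernel: nvme1: Missing interrupt",
    "Feb 19 03:24:54 volume3 kernel: nvme1: aborting outstanding i/o",
]
atimes = [
    datetime(2020, 2, 19, 3, 1, 0),     # 100.clean-disks .. 440.status-mailq
    datetime(2020, 2, 19, 5, 32, 14),   # 460.status-mail-rejects .. 999.local
    datetime(2020, 2, 19, 12, 31, 47),  # 450.status-security
]

events = missing_interrupt_times(log_lines)
for event in events:
    print(event, bracketing_access_times(event, atimes))
```

With these values the 03:24:53 event falls between the 03:01:00 and 05:32:14
access times. Access times can be updated by any later read of a script, so
this only narrows down a window and does not identify the running job.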

Kind regards,

Ruben

-- 
You are receiving this mail because:
You are the assignee for the bug.


