Date:      Tue, 1 Oct 2019 13:09:01 +0200
From:      Julien Cigar <julien@perdition.city>
To:        Reshad Patuck <reshadpatuck1@gmail.com>
Cc:        FreeBSD FS <freebsd-fs@freebsd.org>
Subject:   Re: [zfs] filesystem reads hanging
Message-ID:  <20191001110901.GL49734@home.lan>
In-Reply-To: <CADaJeD2JNwdd=_6_Pdp1YUgEdD6f0V+5U6bUk+oKzx_MmXs0ow@mail.gmail.com>
References:  <CADaJeD24HV0eW7nQT9jaQwEWp=1f4J2WL3OOLZiv--v1zyepwQ@mail.gmail.com> <20191001082837.GF49734@home.lan> <CADaJeD2JNwdd=_6_Pdp1YUgEdD6f0V+5U6bUk+oKzx_MmXs0ow@mail.gmail.com>

On Tue, Oct 01, 2019 at 03:46:40PM +0530, Reshad Patuck wrote:
> Hi Julien,

Hi Reshad,

>
> I did come across that one an hour or so back, can you let me know if there
> is any way to confirm that it is the same issue I am running up against.
> The command `procstat -kka` does have very similar (and in some cases
> identical) output to the lines in the PR mentioned.
>

I'm confident that it's the same issue.
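
If you want a second check, here is a rough sketch (not something from
the PR) for pulling the interesting threads out of a saved `procstat -kka`
dump, so you can compare them by eye against the traces in PR 236220. The
helper names and the /tmp path are made up for illustration, and matching
on frames that start with "zfs_" is only an assumption about what the hung
threads look like on your box.

```shell
#!/bin/sh
# Hypothetical helpers; both read `procstat -kka` output on stdin.

# Print the header line plus every thread whose COMM column (field 3)
# equals the command name given as $1, e.g. "cat" or "ls".
stacks_for() {
    awk -v comm="$1" 'NR == 1 || $3 == comm'
}

# Count threads whose kernel stack contains a frame starting with "zfs_".
# Assumption: frames are separated by single spaces in the KSTACK column.
zfs_thread_count() {
    awk 'NR > 1 && / zfs_/ { n++ } END { print n + 0 }'
}

# Usage on the affected machine (as root):
#   procstat -kka > /tmp/kstacks.txt
#   stacks_for cat < /tmp/kstacks.txt
#   zfs_thread_count < /tmp/kstacks.txt
```

Saving the dump to a file first means you only run procstat once and can
attach the same file to the PR if the stacks turn out to differ.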

> Unfortunately I need to stick to 12.0 till 12.1 is out, any idea if I can
> merge the same change into 12.0 and compile it?
> I can see the changes in the 12.1 branch, just wondering if I should jump
> to the beta or wait it out if I can't compile it into 12.0.
>

I can speak only for myself, but applying
https://bugs.freebsd.org/bugzilla/attachment.cgi?id=202890&action=diff
fixed the issue for me.
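
Roughly, the steps look like the sketch below. Treat it as an outline
under assumptions rather than a tested recipe: it assumes the 12.0 sources
are in /usr/src, that you run a GENERIC kernel, that the patch paths are
relative to the top of the tree (-p0), and that dropping "&action=diff"
from the URL above gives the raw patch. The function names, the DRY_RUN
switch and the /tmp file name are made up for illustration.

```shell
#!/bin/sh
# Sketch: apply the PR 236220 patch to a source tree and rebuild the kernel.
SRC=${SRC:-/usr/src}                 # assumed location of the 12.0 sources
KERNCONF=${KERNCONF:-GENERIC}        # assumed kernel configuration
PATCH_URL='https://bugs.freebsd.org/bugzilla/attachment.cgi?id=202890'
PATCH_FILE=${PATCH_FILE:-/tmp/pr236220.diff}

# With DRY_RUN=1 the commands are only printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

apply_and_rebuild() {
    run fetch -o "$PATCH_FILE" "$PATCH_URL" || return 1
    # -C only checks that the patch applies cleanly; no file is modified.
    run patch -C -p0 -d "$SRC" -i "$PATCH_FILE" || return 1
    run patch -p0 -d "$SRC" -i "$PATCH_FILE" || return 1
    # zfs.ko is built as a module together with the kernel.
    run make -C "$SRC" buildkernel KERNCONF="$KERNCONF" || return 1
    run make -C "$SRC" installkernel KERNCONF="$KERNCONF" || return 1
}

# Usage: source this file, then preview with
#   DRY_RUN=1 apply_and_rebuild
# and run `apply_and_rebuild` as root once the output looks right.
```

A reboot is needed afterwards to run the new kernel. If the -C check
fails, stop there and try -p1 or inspect the diff by hand.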

> Thanks for your help,
>
> Reshad
>

cheers,
Julien

>
> On Tue, Oct 1, 2019 at 1:58 PM Julien Cigar <julien@perdition.city> wrote:
>
> > On Tue, Oct 01, 2019 at 10:26:32AM +0530, Reshad Patuck wrote:
> > > Hi,
> >
> > Hello,
> >
> > >
> > > I have a FreeBSD 12.0-RELEASE-p9 system running ZFS.
> > > The system runs an application that uses postgres and python (among
> > > other services).
> > >
> > > I have noticed that python suddenly is not able to connect to postgres.
> > > When I try to investigate further, certain files on disk can not be read.
> > > The commands `cat` and `ls -l` hang (no output and I can not ctrl-c or
> > > kill -9 them), ps -aux shows them in a D+ state.
> > > On killing the SSH session these processes continue running as orphans, I
> > > am not able to kill them.
> > >
> > > Someone on IRC suggested running a zfs scrub to check for data
> > > corruption, but running `zpool scrub zroot` has the same effect.
> > > The command does not return, ctrl-c does not kill it and `zpool scrub -s
> > > zroot` says "cannot cancel scrubbing zroot: there is no active scrub".
> > >
> > > This has happened in the past month to two of my production servers and
> > > since the application was critical they were rebooted and the boxes
> > > function as normal after the reboot.
> > > Files that were not cat-able on the production servers were working fine
> > > and a zfs scrub worked fine to show 0 errors and 0 fixes.
> > > One of these boxes needed a hard reboot as it got stuck in the shutting
> > > down stage of a soft reboot.
> > >
> > > I am not sure where to start debugging this or if there are any ways to
> > > get metrics on a box stuck in this state.
> > > Please let me know if you would like me to fetch any metrics or run any
> > > commands, etc. for you.
> > > Any help would be much appreciated.
> >
> > This is a known problem (see PR 236220) and has been fixed by r350894
> > (and MFC-ed into 12-STABLE, so I guess it should be in the upcoming
> > 12.1-RELEASE)
> >
> > >
> > > Best regards,
> > >
> > > Reshad
> > > _______________________________________________
> > > freebsd-fs@freebsd.org mailing list
> > > https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> >
> > --
> > Julien Cigar
> > Belgian Biodiversity Platform (http://www.biodiversity.be)
> > PGP fingerprint: EEF9 F697 4B68 D275 7B11  6A25 B2BB 3710 A204 23C0
> > No trees were killed in the creation of this message.
> > However, many electrons were terribly inconvenienced.
> >

-- 
Julien Cigar
Belgian Biodiversity Platform (http://www.biodiversity.be)
PGP fingerprint: EEF9 F697 4B68 D275 7B11  6A25 B2BB 3710 A204 23C0
No trees were killed in the creation of this message.
However, many electrons were terribly inconvenienced.



