Date: Tue, 1 Oct 2019 17:39:27 +0530
From: Reshad Patuck <reshadpatuck1@gmail.com>
To: Julien Cigar <julien@perdition.city>
Cc: FreeBSD FS <freebsd-fs@freebsd.org>
Subject: Re: [zfs] filesystem reads hanging
Message-ID: <CADaJeD1DgSv%2BSQ86m=2ovUDkv7RWhyguqP-ytmnm13aFndVSSw@mail.gmail.com>
In-Reply-To: <20191001110901.GL49734@home.lan>
References: <CADaJeD24HV0eW7nQT9jaQwEWp=1f4J2WL3OOLZiv--v1zyepwQ@mail.gmail.com>
 <20191001082837.GF49734@home.lan>
 <CADaJeD2JNwdd=_6_Pdp1YUgEdD6f0V%2B5U6bUk%2BoKzx_MmXs0ow@mail.gmail.com>
 <20191001110901.GL49734@home.lan>
Hi Julien,

Thanks, I will give it a shot, and check if it occurs again.

Best,

Reshad

On Tue, Oct 1, 2019 at 4:39 PM Julien Cigar <julien@perdition.city> wrote:

> On Tue, Oct 01, 2019 at 03:46:40PM +0530, Reshad Patuck wrote:
> > Hi Julien,
>
> Hi Reshad,
>
> > I did come across that one an hour or so back. Can you let me know if
> > there is any way to confirm that it is the same issue I am running up
> > against?
> > The command `procstat -kka` does have very similar (and in some cases
> > identical) output to the lines in the PR mentioned.
>
> I'm confident that it's the same issue.
>
> > Unfortunately I need to stick to 12.0 till 12.1 is out. Any idea if I
> > can merge the same change into 12.0 and compile it?
> > I can see the changes in the 12.1 branch, just wondering if I should
> > jump to the beta or wait it out if I can't compile it into 12.0.
>
> I can speak only for myself, but applying
> https://bugs.freebsd.org/bugzilla/attachment.cgi?id=202890&action=diff
> fixed the issue for me.
>
> > Thanks for your help,
> >
> > Reshad
>
> cheers,
> Julien
>
> > On Tue, Oct 1, 2019 at 1:58 PM Julien Cigar <julien@perdition.city> wrote:
> >
> > > On Tue, Oct 01, 2019 at 10:26:32AM +0530, Reshad Patuck wrote:
> > > > Hi,
> > >
> > > Hello,
> > >
> > > > I have a FreeBSD 12.0-RELEASE-p9 system running ZFS.
> > > > The system runs an application that uses postgres and python
> > > > (among other services).
> > > >
> > > > I have noticed that python suddenly is not able to connect to
> > > > postgres.
> > > > When I try to investigate further, certain files on disk cannot be
> > > > read.
> > > > The commands `cat` and `ls -l` hang (no output, and I cannot ctrl-c
> > > > or kill -9 them); ps -aux shows them in a D+ state.
> > > > On killing the SSH session these processes continue running as
> > > > orphans, and I am not able to kill them.
> > > >
> > > > Someone on IRC suggested running a zfs scrub to check for data
> > > > corruption, but running `zpool scrub zroot` has the same effect.
> > > > The command does not return, ctrl-c does not kill it, and `zpool
> > > > scrub -s zroot` says "cannot cancel scrubbing zroot: there is no
> > > > active scrub".
> > > >
> > > > This has happened in the past month to two of my production
> > > > servers, and since the application was critical they were rebooted;
> > > > the boxes function as normal after the reboot.
> > > > Files that were not cat-able on the production servers were working
> > > > fine, and a zfs scrub worked fine to show 0 errors and 0 fixes.
> > > > One of these boxes needed a hard reboot as it got stuck in the
> > > > shutting down stage of a soft reboot.
> > > >
> > > > I am not sure where to start debugging this or if there are any
> > > > ways to get metrics on a box stuck in this state.
> > > > Please let me know if you would like me to fetch any metrics or run
> > > > any commands, etc. for you.
> > > > Any help would be much appreciated.
> > >
> > > This is a known problem (see PR 236220) and has been fixed by r350894
> > > (and MFC-ed into 12-STABLE, so I guess it should be in the upcoming
> > > 12.1-RELEASE)
> > >
> > > > Best regards,
> > > >
> > > > Reshad
> > > > _______________________________________________
> > > > freebsd-fs@freebsd.org mailing list
> > > > https://lists.freebsd.org/mailman/listinfo/freebsd-fs
> > > > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"
> > >
> > > --
> > > Julien Cigar
> > > Belgian Biodiversity Platform (http://www.biodiversity.be)
> > > PGP fingerprint: EEF9 F697 4B68 D275 7B11 6A25 B2BB 3710 A204 23C0
> > > No trees were killed in the creation of this message.
> > > However, many electrons were terribly inconvenienced.
>
> --
> Julien Cigar
> Belgian Biodiversity Platform (http://www.biodiversity.be)
> PGP fingerprint: EEF9 F697 4B68 D275 7B11 6A25 B2BB 3710 A204 23C0
> No trees were killed in the creation of this message.
> However, many electrons were terribly inconvenienced.
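
The original report asks how to gather metrics from a box stuck in this state, and the thread mentions `ps -aux` (processes shown in D+) and `procstat -kka` (kernel stacks compared against the PR). A minimal sketch of automating that comparison is below. It is an illustration only, not something posted in the thread: the function names `stuck_pids`, `kernel_stacks`, and `collect` are hypothetical, and it assumes Python itself can still be started on the affected machine, which may not hold if the interpreter's files are on the hung dataset.

```python
import subprocess


def stuck_pids(ps_output: str) -> list[int]:
    """Return PIDs whose state column starts with 'D' (uninterruptible wait).

    Expects the output of `ps -axo pid,state,comm`; the header row and any
    malformed lines are skipped because their first field is not numeric.
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 2)
        if len(fields) >= 2 and fields[0].isdigit() and fields[1].startswith("D"):
            pids.append(int(fields[0]))
    return pids


def kernel_stacks(pids: list[int]) -> str:
    """Return `procstat -kk` output (kernel stack traces) for the given PIDs."""
    if not pids:
        return ""
    cmd = ["procstat", "-kk", *map(str, pids)]
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout


def collect() -> str:
    """List all processes, keep the ones in state D, and dump their stacks."""
    ps = subprocess.run(
        ["ps", "-axo", "pid,state,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    return kernel_stacks(stuck_pids(ps))
```

Calling `print(collect())` as root on the affected host would print only the kernel stacks of the hung processes, which can then be compared by hand with the traces attached to PR 236220.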