Date:      Thu, 27 Aug 2015 15:22:41 -0500
From:      Karl Denninger <karl@denninger.net>
To:        freebsd-fs@freebsd.org
Subject:   Re: Panic in ZFS during zfs recv (while snapshots being destroyed)
Message-ID:  <55DF7191.2080409@denninger.net>
In-Reply-To: <55CF7926.1030901@denninger.net>
References:  <55BB443E.8040801@denninger.net> <55CF7926.1030901@denninger.net>

On 8/15/2015 12:38, Karl Denninger wrote:
> Update:
>
> This /appears/ to be related to attempting to send or receive a
> /cloned/ snapshot.
>
> I use /beadm/ to manage boot environments and the crashes have all
> come while send/recv-ing the root pool, which is the one where these
> clones get created.  It is /not/ consistent within a given snapshot
> when it crashes and a second attempt (which does a "recovery"
> send/receive) succeeds every time -- I've yet to have it panic twice
> sequentially.
>
> I surmise that the problem comes about when a file in the cloned
> snapshot is modified, but this is a guess at this point.
>
> I'm going to try to force replication of the problem on my test system.
>
> On 7/31/2015 04:47, Karl Denninger wrote:
>> I have an automated script that runs zfs send/recv copies to bring a
>> backup data set into congruence with the running copies nightly.  The
>> source has automated snapshots running on a fairly frequent basis
>> through zfs-auto-snapshot.
>>
>> Recently I have started having a panic show up about once a week during
>> the backup run, but it's inconsistent.  It is in the same place, but I
>> cannot force it to repeat.
>>
>> The trap itself is a page fault in kernel mode in the zfs code at
>> zfs_unmount_snap(); here's the traceback from the kvm (sorry for the
>> image link but I don't have a better option right now.)
>>
>> I'll try to get a dump, this is a production machine with encrypted swap
>> so it's not normally turned on.
>>
>> Note that the pool that appears to be involved (the backup pool) has
>> passed a scrub and thus I would assume the on-disk structure is ok.....
>> but that might be an unfair assumption.  It is always occurring in the
>> same dataset although there are a half-dozen that are sync'd -- if this
>> one (the first one) successfully completes during the run then all the
>> rest will as well (that is, whenever I restart the process it has always
>> failed here.)  The source pool is also clean and passes a scrub.
>>
>> traceback is at http://www.denninger.net/kvmimage.png; apologies for the
>> image traceback but this is coming from a remote KVM.
>>
>> I first saw this on 10.1-STABLE and it is still happening on FreeBSD
>> 10.2-PRERELEASE #9 r285890M, which I updated to in an attempt to see if
>> the problem was something that had been addressed.
>>
>>
>
> -- 
> Karl Denninger
> karl@denninger.net <mailto:karl@denninger.net>
> /The Market Ticker/
> /[S/MIME encrypted email preferred]/

Second update: I have now taken another panic on 10.2-STABLE, same deal,
but without any cloned snapshots in the source image. I had thought that
removing cloned snapshots might eliminate the issue; that theory is now
out the window.

It ONLY happens on this one filesystem (the root one, incidentally),
which was created fairly recently when I moved this machine from spinning
rust to SSDs for the OS and root pool -- and only when it is being
backed up using zfs send | zfs recv (with the receive going to a
different pool in the same machine).  I have yet to be able to provoke
it when using zfs send to copy to a different machine on the same LAN,
but since it cannot be reproduced on demand I can't tell whether it is
timing related (e.g. performance between the two pools in question) or
whether I just haven't hit the unlucky combination.

This looks like some sort of race condition, and I will keep trying to
craft a case that makes it occur "on demand".
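
A minimal sketch of what such a reproduction harness might look like,
assuming hypothetical dataset names (zroot/ROOT/default as the source,
backup/root as the target on a second local pool) and assuming that
snapshot churn alongside the transfer is what opens the window; neither
assumption is confirmed, and this is not the actual backup script.  It
defaults to a dry run that only builds the zfs command lines:

```python
import subprocess

SRC = "zroot/ROOT/default"   # hypothetical source dataset
DST = "backup/root"          # hypothetical target on a second local pool


def send_cmd(snapshot, base=None):
    """argv for `zfs send`, incremental from `base` when one is given."""
    cmd = ["zfs", "send"]
    if base is not None:
        cmd += ["-i", base]
    return cmd + [snapshot]


def recv_cmd(target):
    """argv for `zfs recv -F`, which rolls the target back before receiving."""
    return ["zfs", "recv", "-F", target]


def replicate(snapshot, target, base=None, dry_run=True):
    """Pipe `zfs send` into `zfs recv`; always return the two argv lists."""
    send, recv = send_cmd(snapshot, base), recv_cmd(target)
    if not dry_run:
        sender = subprocess.Popen(send, stdout=subprocess.PIPE)
        receiver = subprocess.Popen(recv, stdin=sender.stdout)
        sender.stdout.close()  # so the sender gets SIGPIPE if recv exits early
        recv_rc = receiver.wait()
        send_rc = sender.wait()
        if recv_rc != 0 or send_rc != 0:
            raise RuntimeError("send/recv pipeline failed")
    return send, recv


def churn_cmds(dataset, n):
    """argv pair that creates and then destroys a throwaway snapshot."""
    name = f"{dataset}@churn-{n}"
    return ["zfs", "snapshot", name], ["zfs", "destroy", name]


if __name__ == "__main__":
    # Dry run: print what would be executed instead of touching any pool.
    for argv in replicate(f"{SRC}@b", DST, base=f"{SRC}@a"):
        print(" ".join(argv))
    for argv in churn_cmds(SRC, 1):
        print(" ".join(argv))
```

Running the churn commands from a second thread or process while
replicate(..., dry_run=False) is in flight would be the actual test;
which side of the transfer has to see the snapshot destroy is still an
open question.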

-- 
Karl Denninger
karl@denninger.net <mailto:karl@denninger.net>
/The Market Ticker/
/[S/MIME encrypted email preferred]/
