Date:      Thu, 15 Dec 2016 11:40:33 +0000
From:      Roger Pau Monné <roger.pau@citrix.com>
To:        <freebsd-arch@freebsd.org>
Subject:   Order of device suspend/resume
Message-ID:  <20161215114033.r33nt3fqhnfi7hqw@dhcp-3-221.uk.xensource.com>

Hello,

I'm currently dealing with a bug in the Xen suspend/resume sequence, and I've 
found that the lack of a way to order devices by priority during suspend/resume 
is proving quite harmful for Xen (and maybe for other systems too). The current 
suspend/resume code simply scans the root bus and suspends/resumes every device 
in the order in which it was attached to its parent. The problem is that 
there's no way to express that some devices should be resumed before others. For 
example, the event timers, timecounters and uarts should definitely be resumed 
before other devices, but that seems to happen mostly by chance.

Currently most time-related devices are attached directly to the nexus, which 
means they get resumed first, but the uart, for example, is currently 
attached to the PCI bus IIRC, which means it gets resumed quite late. On Xen 
systems this is even worse. The Xen PV bus (which contains all Xen-related 
devices) is attached last (because it tends to pick up unused memory 
regions for its own use), and this bus also contains the PV timecounter, which 
should be resumed _before_ other devices. Otherwise timecounting will be 
completely wrong and things can get stuck in indefinitely long loops, because 
the timecounter is implemented on top of the uptime of the host, and that 
differs from host to host.
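
To make the ordering problem concrete, here is a toy userland model of the 
behaviour described above. It is only a sketch: `struct dev`, `dev_attach()`, 
`resume_tree()` and the device names are made up for illustration and are not 
the newbus API. It assumes each device is resumed before its children, and that 
children are visited in attach order.

```c
#include <stddef.h>

#define MAX_CHILDREN 8

/* Toy device node (hypothetical; the real device_t is far richer). */
struct dev {
	const char *name;
	struct dev *children[MAX_CHILDREN];
	size_t nchildren;
};

/* Append child to parent's list, preserving attach order. */
static void
dev_attach(struct dev *parent, struct dev *child)
{
	if (parent->nchildren < MAX_CHILDREN)
		parent->children[parent->nchildren++] = child;
}

/*
 * Resume d, then its children in attach order.  The visiting order is
 * recorded in out[]; returns the number of entries used so far.
 */
static size_t
resume_tree(const struct dev *d, const char **out, size_t n, size_t cap)
{
	if (n < cap)
		out[n++] = d->name;
	for (size_t i = 0; i < d->nchildren; i++)
		n = resume_tree(d->children[i], out, n, cap);
	return (n);
}

/* Build the layout from the text: timer on nexus, uart on PCI, Xen PV last. */
static size_t
example_attach_order_resume(const char **out, size_t cap)
{
	struct dev nexus = { .name = "nexus" };
	struct dev timer = { .name = "timer" };
	struct dev pci = { .name = "pci" };
	struct dev uart = { .name = "uart" };
	struct dev xenpv = { .name = "xenpv" };
	struct dev xen_timer = { .name = "xen_timer" };

	dev_attach(&nexus, &timer);
	dev_attach(&nexus, &pci);
	dev_attach(&pci, &uart);
	dev_attach(&nexus, &xenpv);	/* attached last */
	dev_attach(&xenpv, &xen_timer);

	return (resume_tree(&nexus, out, 0, cap));
}
```

In this model the order comes out as nexus, timer, pci, uart, xenpv, 
xen_timer, so the PV timer is the very last device to come back.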

In order to solve this I could add a hack to the Xen resume process (which is 
already different from the ACPI one), but this looks gross. I could also attach 
the Xen PV timer to the nexus directly (as was done before), but I would 
prefer to keep all Xen-related devices on the same bus for consistency. The last 
option would be to add some kind of suspend/resume priorities to the devices 
and do more than one suspend/resume pass. This is more complex and requires more 
changes, so I would like to know if it would be helpful for other systems, or if 
someone has already attempted it.
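
A rough sketch of what the multi-pass option could look like, again as a toy 
userland model rather than a proposed patch. Everything here is hypothetical: 
the `resume_prio` classes, `struct pdev` and the function names do not exist in 
the tree. One assumption worth calling out is that a device cannot be resumed 
before its parent bus, so an early-priority device pulls its ancestors into the 
same pass.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CHILDREN 8
#define MAX_LOG 16

/* Hypothetical priority classes; lower values resume in earlier passes. */
enum resume_prio { PRIO_TIMER, PRIO_CONSOLE, PRIO_DEFAULT, PRIO_COUNT };

struct pdev {
	const char *name;
	enum resume_prio prio;
	bool resumed;
	struct pdev *parent;
	struct pdev *children[MAX_CHILDREN];
	size_t nchildren;
};

/* Records the order in which devices were resumed. */
struct resume_log {
	const char *names[MAX_LOG];
	size_t count;
};

static void
pdev_attach(struct pdev *parent, struct pdev *child)
{
	if (parent->nchildren < MAX_CHILDREN) {
		child->parent = parent;
		parent->children[parent->nchildren++] = child;
	}
}

/* Resume d once, making sure its ancestors have been resumed first. */
static void
resume_one(struct pdev *d, struct resume_log *log)
{
	if (d->resumed)
		return;
	if (d->parent != NULL)
		resume_one(d->parent, log);
	d->resumed = true;
	if (log->count < MAX_LOG)
		log->names[log->count++] = d->name;
}

/* One pass: walk in attach order, resuming only devices of class `pass`. */
static void
resume_pass(struct pdev *d, enum resume_prio pass, struct resume_log *log)
{
	if (d->prio == pass)
		resume_one(d, log);
	for (size_t i = 0; i < d->nchildren; i++)
		resume_pass(d->children[i], pass, log);
}

/* Run one pass per priority class, earliest class first. */
static void
resume_prioritised(struct pdev *root, struct resume_log *log)
{
	for (int pass = 0; pass < PRIO_COUNT; pass++)
		resume_pass(root, (enum resume_prio)pass, log);
}

/* Example tree: same layout as described above, plus one ordinary PV device. */
static struct resume_log
example_prioritised_resume(void)
{
	struct pdev nexus = { .name = "nexus", .prio = PRIO_DEFAULT };
	struct pdev timer = { .name = "timer", .prio = PRIO_TIMER };
	struct pdev pci = { .name = "pci", .prio = PRIO_DEFAULT };
	struct pdev uart = { .name = "uart", .prio = PRIO_CONSOLE };
	struct pdev xenpv = { .name = "xenpv", .prio = PRIO_DEFAULT };
	struct pdev xen_timer = { .name = "xen_timer", .prio = PRIO_TIMER };
	struct pdev xen_net = { .name = "xen_net", .prio = PRIO_DEFAULT };
	struct resume_log log = { .count = 0 };

	pdev_attach(&nexus, &timer);
	pdev_attach(&nexus, &pci);
	pdev_attach(&pci, &uart);
	pdev_attach(&nexus, &xenpv);
	pdev_attach(&xenpv, &xen_timer);
	pdev_attach(&xenpv, &xen_net);

	resume_prioritised(&nexus, &log);
	return (log);
}
```

With this tree the order becomes nexus, timer, xenpv, xen_timer, pci, uart, 
xen_net: both timers come back before the uart and before any ordinary device, 
even though the Xen PV bus is still attached last. Suspend would presumably run 
the passes in the opposite order, but that is not modelled here.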

Thanks, Roger.


