Date:      Wed, 5 May 2010 13:49:31 -0300
From:      Joey Mingrone <joey@mingrone.org>
To:        freebsd-questions@freebsd.org, freebsd-cluster@freebsd.org
Subject:   saving job state over a power outage
Message-ID:  <i2pf5b896261005050949pbb11df25tb51cee326b0682b4@mail.gmail.com>

Hello,

Our lab has a cluster (Sun Fire X40z master node, Opteron 270 and
generic 2.0 GHz Opteron-based compute nodes) running FreeBSD 8.0-RELEASE.
We've been informed that the power has to be turned off for the
weekend, but one of the lab members has been running some jobs since
December.  I don't know much about the jobs, in particular whether
they can be restarted without losing all the computing work that's
been done so far.

Can anyone suggest a way to save the state of the jobs so they can
continue when the power comes back on?  From experience, ACPI suspend
seems flaky, if it works at all, because of problems with BIOS
implementations.  Also, IIRC, ACPI has issues with SMP kernels.  Is
there something similar to the software suspend found in Linux?  Does
anyone have any other suggestions to accomplish this?

Thanks in advance,

Joey Mingrone


