Date:      Tue, 9 Jan 2007 09:31:28 -0500
From:      John Nielsen <lists@jnielsen.net>
To:        freebsd-hackers@freebsd.org
Cc:        freebsd-scsi@freebsd.org
Subject:   Re: iSCSI disconnects dilema
Message-ID:  <200701090931.28786.lists@jnielsen.net>
In-Reply-To: <E1H4B4I-0001eX-UC@cs1.cs.huji.ac.il>
References:  <E1H4B4I-0001eX-UC@cs1.cs.huji.ac.il>

On Tuesday 09 January 2007 02:06, Danny Braniss wrote:
> Hi,
> While I think I have almost solved the problem of network disconnects,
> a major problem dawned on me:
> When a 'local' disk crashes, the kernel will probably hang/panic/crash.
> if i don't try to recover, then there is no change in the above scenario.
> if i try to recover, then the client does not know that it should
> umount/fsck/mount.
> While all this seems familiar (removing a floppy/disk-on-key while it's
> mounted, where we could always say "you shouldn't have done that!"), with
> a network connection it can happen very often - rebooting the target, a
> network hiccup, etc.
>
> So, any ideas?

I think that an iSCSI network disconnect (if handled properly) is more like a
bad/flaky set of sectors and/or extremely high latency than a total disk
crash. The initiator should stall as long as it can while trying to reconnect
the session, and then send "hardware" timeout errors up the stack. The rest
of the OS should then handle those the same as it would any other timeout
errors--retry a certain number of times and then fail. I don't know how
graceful the failure case is (perhaps not very), but it's an honest
approximation.
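
To make the idea concrete, here is a rough user-space sketch of that policy in
Python. It is only an illustration, not how the FreeBSD initiator or the disk
layer is actually written: the names initiator_io, upper_layer_io and
SessionDown, the stall budget, and the retry count are all made up for the
example, and the clock and sleep functions are passed in so the logic can be
exercised without a real network.

```python
import errno


class SessionDown(Exception):
    """Raised by the (hypothetical) transport when the iSCSI session is lost."""


def initiator_io(send, reconnect, stall_budget, clock, sleep, poll_interval=1.0):
    """Issue one request, stalling and reconnecting for up to stall_budget seconds.

    Returns 0 on success, or errno.ETIMEDOUT once the budget is spent.
    """
    deadline = clock() + stall_budget
    while True:
        try:
            send()
            return 0
        except SessionDown:
            # Also covers a session that flaps: reconnects, then drops again.
            if clock() >= deadline:
                return errno.ETIMEDOUT
        # Session lost: poll for a reconnect until the budget runs out.
        while not reconnect():
            if clock() >= deadline:
                return errno.ETIMEDOUT  # reported upward like a hardware timeout
            sleep(poll_interval)


def upper_layer_io(do_io, max_retries=3):
    """Treat the timeout like any other disk timeout: retry, then fail with EIO."""
    for _ in range(max_retries + 1):
        if do_io() == 0:
            return 0
    return errno.EIO
```

With a short outage, initiator_io just looks like one very slow request. Only
when the stall budget is exhausted does the caller see a timeout, and only
after upper_layer_io runs out of retries does the filesystem see a hard error.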

The above approach is IMO more than adequate for network interruptions lasting
a few seconds (or a bit more). I'm not sure there's anything more you can
realistically do beyond that. Administrators who intentionally reboot a
nonredundant iSCSI target while it has active sessions are asking for
trouble, and if the reboot is accidental they should do one or more of a)
know to run fsck manually, b) get a better UPS, c) get a more
stable/redundant iSCSI target device.

Disclaimer: I know next to nothing about kernel programming, device driver
development, or SCSI in general. I've just been playing with and thinking
about iSCSI on FreeBSD a fair amount lately. Thanks for your continued work 
on this.

JN


