On Feb 18, 2014, at 10:14 AM, Bryan Venteicher <bryanv@freebsd.org> wrote:
> On Tue, Feb 18, 2014 at 10:57 AM, John Nielsen <lists@jnielsen.net> wrote:
>> On Feb 18, 2014, at 3:32 AM, Edward Tomasz Napierała <trasz@freebsd.org> wrote:
>>
>> > Message written by John Nielsen on 17 Feb 2014, at 21:21:
>> >> I run several FreeBSD virtual machines in a Linux KVM environment with a SAN. The VMs use virtio block storage, and the KVM hosts map the virtual volumes to targets on the SAN. Occasionally, failover or other maintenance events on the SAN cause it to be unavailable for 30+ seconds. When this happens, the FreeBSD VMs have hard failures on the vtbd* devices, and thereafter any attempted reads or writes return immediately with an error (even after the SAN is responsive again). The only way to recover a VM once that happens is to hard boot it.
>> >>
>> >> Is there any way to adjust the timeouts or enable some kind of retry for the virtio block devices? It would be nice to be able to recover gracefully after a SAN event without needing to reboot the VMs.
>> >
>> > Use gmountver(8) perhaps?
>>
>> Thanks for the tip (and for writing it :), I haven't encountered that one before. I will experiment with it but I'm not sure it's a fit for this particular scenario (at least not by itself). When a SAN event happens the virtual machine's vtbd0 device doesn't disappear, the underlying hardware just fails to respond for a long-ish time. I suspect that the driver gives up after either a certain length of time or number of errors, but my C driver-fu isn't up to figuring it out exactly. Once it gives up, any I/O requests to the (still "present") device fail immediately, and I can't see a way to get the driver to actually try any (new or old) I/O again.
>
> The vtbd driver has no internal retry mechanism, and pays no attention to errors other than to report them, and never gives up :)
>
> It is not clear to me whether IO is getting turned around in FreeBSD before it reaches the driver, or within the host. Do you continue to see "hard error ..." messages on the console?

Thanks for chiming in. I was in too much of a hurry to get the VM running again last time the issue appeared to capture any useful log messages, and of course none of them were committed to disk so nothing was available following a reboot.

I will see what I can get next time it happens and follow up on this thread again.

JN
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"