On 05/05/10 10:56, Harald Schmalzbauer wrote:
> Harald Schmalzbauer wrote on 05.05.2010 14:41 (localtime):
>> Hello,
>>
>> one drive of my mirror failed today, but 'zpool status' shows it
>> "online".
>> Every process using a ZFS mount hangs. Also 'zpool offline /dev/ad1'
>> hangs infinitely.
> ...
> Sorry, I made an error with zpool create. Somehow the little word
> "mirror" must have been lost. So the pool wasn't a mirror but a
> stripe. Then of course I can't make one vdev offline. Sorry for the
> noise.
> But I took the opportunity to do some tests with that failing drive
> and created a _real_ mirror. That works without failures, but using
> the mirror again leads to:
> ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
> ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
> ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
> ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
> ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
> ad1: TIMEOUT - FLUSHCACHE48 retrying (1 retry left)
> ata3: port is not ready (timeout 10000ms) tfd = 00000080
> ata3: hardware reset timeout
> ad1: FAILURE - device detached
>
> Now zpool reports the vdev ad1 still online although it has been
> detached and 'atacontrol list' doesn't show it anymore:
>
> zpool status
>   pool: URUBAmirrorP1
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error. An
>         attempt was made to correct the error. Applications are
>         unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>         using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: none requested
> config:
>
>         NAME             STATE     READ WRITE CKSUM
>         URUBAmirrorP1    ONLINE       0     0     0
>           mirror         ONLINE       0     0     0
>             ad1          ONLINE       3  302K     0
>             ad2          ONLINE       0     0     0
>
> errors: No known data errors
>
> atacontrol list
> ATA channel 2:
>     Master:  ad0 <TRANSCEND/20090520> SATA revision 1.x
>     Slave:       no device present
> ATA channel 3:
>     Master:      no device present
>     Slave:       no device present
> ATA channel 4:
>     Master:  ad2 <SAMSUNG HD154UI/1AG01118> SATA revision 2.x
>     Slave:       no device present
> ATA channel 5:
>     Master:  ad3 <ST3750640NS/3.AEG> SATA revision 1.x
>     Slave:       no device present
>
> How should such a failure be handled?
> Do I have to manually mark the drive offline for zpool?
>
> Thanks,
>
> -Harry
>

You may want to try newer controller drivers like ahci(4) if possible.
Otherwise, building the kernel with ATA_CAM may accomplish something
similar. I'm not sure, but I'm speculating that the newer ATA/CAM
system may feed the proper notifications back to the ZFS systems.

I use many drives on the siis(4) driver, which is CAM-enabled, and
haven't had any issues. However, I have not had an outright drive
failure. I do recall testing situations where we would yank a working
drive, and I seem to remember it working correctly after the last set
of CAM improvements.

It may not be something you can try on a production system, but if you
can experiment, it's worth a shot. Note that your device names WILL
change to adaX instead of adX. I would definitely recommend you
glabel(8) the disks and create the zpool/vdevs using the glabel devices
instead, to circumvent any future problems associated with device
numbering.

Steve
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
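
The mismatch Harry describes (ad1 still ONLINE in 'zpool status' while
'atacontrol list' no longer shows it) can be checked mechanically. What
follows is only a rough sketch in Python, not anything shipped with
FreeBSD or ZFS. It assumes the two commands' output has already been
captured as text, that leaf vdevs use plain adX/adaX names (glabel names
would need a different pattern), and the function names are made up for
illustration. The sample text is abbreviated from the post above.

```python
import re

# Abbreviated sample output taken from the post above.
ZPOOL_STATUS = """\
  pool: URUBAmirrorP1
 state: ONLINE
config:

        NAME             STATE     READ WRITE CKSUM
        URUBAmirrorP1    ONLINE       0     0     0
          mirror         ONLINE       0     0     0
            ad1          ONLINE       3  302K     0
            ad2          ONLINE       0     0     0

errors: No known data errors
"""

ATACONTROL_LIST = """\
ATA channel 2:
    Master:  ad0 <TRANSCEND/20090520> SATA revision 1.x
    Slave:       no device present
ATA channel 3:
    Master:      no device present
    Slave:       no device present
ATA channel 4:
    Master:  ad2 <SAMSUNG HD154UI/1AG01118> SATA revision 2.x
    Slave:       no device present
ATA channel 5:
    Master:  ad3 <ST3750640NS/3.AEG> SATA revision 1.x
    Slave:       no device present
"""


def online_vdevs(zpool_status: str) -> set[str]:
    """Leaf disks (adX/adaX names only, by assumption) listed as ONLINE."""
    devices = set()
    in_config = False
    for line in zpool_status.splitlines():
        fields = line.split()
        if fields[:2] == ["NAME", "STATE"]:
            in_config = True  # the config table starts after this header
            continue
        if not in_config or len(fields) < 2:
            continue
        if fields[0] == "errors:":
            break  # end of the config table
        name, state = fields[0], fields[1]
        if state == "ONLINE" and re.fullmatch(r"ada?\d+", name):
            devices.add(name)
    return devices


def attached_disks(atacontrol_list: str) -> set[str]:
    """Disks that 'atacontrol list' shows on a Master or Slave slot."""
    pattern = r"^\s*(?:Master|Slave):\s+(ad\d+)\b"
    return set(re.findall(pattern, atacontrol_list, re.MULTILINE))


def stale_vdevs(zpool_status: str, atacontrol_list: str) -> list[str]:
    """Disks ZFS still calls ONLINE that the ATA layer no longer lists."""
    return sorted(online_vdevs(zpool_status) - attached_disks(atacontrol_list))


if __name__ == "__main__":
    print(stale_vdevs(ZPOOL_STATUS, ATACONTROL_LIST))  # prints ['ad1']
```

With the output from the post this flags ad1 only. It merely reports the
disagreement; whether to 'zpool offline' or 'zpool replace' the disk is
still a decision for the admin.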