on 15/04/2014 08:39 Phil Murray said the following:
>
> On 11/04/2014, at 10:36 pm, Andriy Gapon <avg@FreeBSD.org> wrote:
>
>> on 11/04/2014 11:02 Phil Murray said the following:
>>> Hi there,
>>>
>>> I've recently experienced two kernel panics on 8.4-RELEASE (within 2 days
>>> of each other, and both around the same time of day oddly) with ZFS. Sorry
>>> no dump available, but panic below.
>>>
>>> Any ideas where to start solving this? Will upgrading to 9 (or 10) solve it?
>>
>> By chance, could the system be running zfs recv at the times when the panics
>> happened?
>
> I think it might be related to this bug reported on ZFS-on-linux when
> upgrading from v3 -> v5, which is exactly what I've done on this machine:
>
> https://github.com/zfsonlinux/zfs/issues/2025
>
> In my case, the bogus sa.sa_magic value looks like this:
>
> panic:solaris assert: sa.sa_magic == 0x2F505A (0x5112fb3d == 0x2f505a), file:
>
> $ date -r 0x5112fb3d
> Thu Feb 7 13:54:21 NZDT 2013

Great job finding that ZoL bug report! And a very good job was done by the
people who analyzed the problem.

Below is my guess about what could be wrong.

A thread that is changing file attributes could end up calling
zfs_sa_upgrade() to convert the file's bonus from DMU_OT_ZNODE to DMU_OT_SA.
The conversion is achieved in two steps:
- dmu_set_bonustype() to change the bonus type in the dnode
- sa_replace_all_by_template_locked() to re-populate the bonus data

dmu_set_bonustype() calls dnode_setbonus_type(), which does the following:

    dn->dn_bonustype = newtype;
    dn->dn_next_bonustype[tx->tx_txg & TXG_MASK] = dn->dn_bonustype;

Concurrently, the sync thread can run into the dnode if it was dirtied in an
earlier txg. The sync thread calls dmu_objset_userquota_get_ids() via
dnode_sync(). dmu_objset_userquota_get_ids() uses dn_bonustype, which already
has the new value, but the data corresponding to the txg being synced is still
in the old format.
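
To make the interleaving concrete, here is a toy user-space model of it. This is
only a sketch, not the kernel code: every toy_ name is made up for illustration,
and the only things taken from ZFS are the field names dn_bonustype and
dn_next_bonustype and the TXG_SIZE / TXG_MASK ring indexing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TXG_SIZE 4                 /* number of per-txg slots, as in ZFS */
#define TXG_MASK (TXG_SIZE - 1)

/* Hypothetical stand-ins for DMU_OT_NONE / DMU_OT_ZNODE / DMU_OT_SA. */
enum toy_bonustype { TOY_OT_NONE = 0, TOY_OT_ZNODE, TOY_OT_SA };

struct toy_dnode {
	enum toy_bonustype dn_bonustype;                /* live in-core type */
	enum toy_bonustype dn_next_bonustype[TXG_SIZE]; /* pending change per txg */
};

/* Mirrors the two assignments quoted above, made on behalf of the open txg. */
static void
toy_setbonus_type(struct toy_dnode *dn, enum toy_bonustype newtype, uint64_t txg)
{
	assert(dn != NULL);
	dn->dn_bonustype = newtype;
	dn->dn_next_bonustype[txg & TXG_MASK] = dn->dn_bonustype;
}

/*
 * What a reader in sync context gets if it trusts the live field:
 * the type is the same no matter which txg is being synced.
 */
static enum toy_bonustype
toy_type_seen_by_sync(const struct toy_dnode *dn)
{
	assert(dn != NULL);
	return (dn->dn_bonustype);
}
```

If the dnode starts as TOY_OT_ZNODE and toy_setbonus_type(&dn, TOY_OT_SA, 11)
runs while txg 10 is syncing, toy_type_seen_by_sync() already returns TOY_OT_SA,
although only the slot for txg 11 records a pending change and the slot for
txg 10 is still empty.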

As I understand it, dmu_objset_userquota_get_ids() already uses
dmu_objset_userquota_find_data() when before == B_FALSE to find the proper copy
of the data corresponding to the txg being synced.

So, I think that in that case dmu_objset_userquota_get_ids() should also use
the values of dn_bonustype and dn_bonuslen that correspond to that txg.

If I am not mistaken, those values could be deduced from
dn_next_bonustype[tx->tx_txg & TXG_MASK] plus dn_phys->dn_bonustype and
dn_next_bonuslen[tx->tx_txg & TXG_MASK] plus dn_phys->dn_bonuslen.
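
A rough user-space sketch of that deduction follows. It is not a patch: the
toy_ names are hypothetical, and it assumes that a zero slot means "no change
pending in this txg", in which case the on-disk value in dn_phys still applies.
The real code may need a distinct sentinel for a pending length of zero, which
the sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TXG_SIZE 4
#define TXG_MASK (TXG_SIZE - 1)

/* Hypothetical, cut-down stand-ins for dnode_phys_t and dnode_t. */
struct toy_dnode_phys {
	uint8_t  dn_bonustype;
	uint16_t dn_bonuslen;
};

struct toy_dnode {
	struct toy_dnode_phys *dn_phys;      /* last synced, on-disk state */
	uint8_t  dn_bonustype;               /* live value, may be from a later txg */
	uint16_t dn_bonuslen;
	uint8_t  dn_next_bonustype[TXG_SIZE];
	uint16_t dn_next_bonuslen[TXG_SIZE];
};

/* Bonus type as of the txg being synced: pending slot if set, else on-disk. */
static uint8_t
toy_bonustype_for_txg(const struct toy_dnode *dn, uint64_t txg)
{
	uint8_t pending;

	assert(dn != NULL && dn->dn_phys != NULL);
	pending = dn->dn_next_bonustype[txg & TXG_MASK];
	return (pending != 0 ? pending : dn->dn_phys->dn_bonustype);
}

/* Same idea for the bonus length. */
static uint16_t
toy_bonuslen_for_txg(const struct toy_dnode *dn, uint64_t txg)
{
	uint16_t pending;

	assert(dn != NULL && dn->dn_phys != NULL);
	pending = dn->dn_next_bonuslen[txg & TXG_MASK];
	return (pending != 0 ? pending : dn->dn_phys->dn_bonuslen);
}
```

The fallback to dn_phys only makes sense for the txg that is currently syncing,
because changes from earlier txgs have already been folded into dn_phys by then.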

-- 
Andriy Gapon