On Sep 28, 2010, at 5:30 PM, Andriy Gapon wrote:
<< snipped lots of good info here... probably won't have time to look at
it in detail until the weekend >>
>> there seems to be a layering violation in that the buffer cache signals
>> directly to the upper page daemon layer to trigger page reclamation.)
>
> Umm, not sure if that is a fact.
I was referring to the code in vfs_bio.c that used to twiddle
vm_pageout_deficit directly. That seems to have been replaced with a
call to vm_page_grab().
>> The old (ancient) patch I tried previously to help reduce the arc working set
>> and allow it to shrink is here:
>>
>> http://www.wanderview.com/svn/public/misc/zfs/zfs_kmem_limit.diff
>>
>> Unfortunately, there are a couple ideas on fighting fragmentation mixed into
>> that patch. See the part about arc_reclaim_pages(). This patch did seem to
>> allow my arc to stay under the target maximum even when under load that
>> previously caused the system to exceed the maximum. When I update this
>> weekend I'll try a stripped down version of the patch to see if it helps or
>> not with the latest zfs.
>>
>> Thanks for your help in understanding this stuff!
>
> The patch seems good, especially the part about taking into account the kmem
> fragmentation. But it also seems to be heavily tuned towards "tiny ARC" systems
> like yours, so I am not sure yet how suitable it is for "mainstream" systems.
Thanks. Yea, there is a lot of aggressive tuning there. In particular,
the slow growth algorithm is somewhat dubious. What I found, though,
was that the fragmentation jumped whenever the arc was reduced in size,
so it was an attempt to make the size slowly approach peak load without
overshooting.
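
To illustrate the idea only (this is not the code from the patch), here is
a minimal sketch of a "slow growth" step. It assumes the target size moves
a fixed fraction of the remaining gap toward the demanded size on each
call, never passes the configured maximum, and never shrinks. The function
name slow_grow() and the shift parameter are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical sketch, not taken from zfs_kmem_limit.diff.
 * Move the current target 1/2^shift of the way toward the demanded size,
 * capped at max. The target is never reduced here, since shrinking is
 * what was observed to make fragmentation jump.
 */
static uint64_t
slow_grow(uint64_t cur, uint64_t demand, uint64_t max, unsigned shift)
{
	uint64_t goal = demand < max ? demand : max;
	uint64_t step;

	if (cur >= goal)
		return (cur);		/* already at or above the goal */

	step = (goal - cur) >> shift;
	if (step == 0)
		step = 1;		/* always make some progress */
	return (cur + step);		/* step <= goal - cur, so no overshoot */
}
```

Because each step is at most the remaining gap, repeated calls approach
the goal from below and stop exactly at it.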
A better long term solution would probably be to enhance UMA to support
custom slab sizes on a zone-by-zone basis. That way all zfs/arc
allocations can use slabs of 128k (at a memory efficiency penalty of
course). I prototyped this with a dumbed down block pool allocator at
one point and was able to avoid most, if not all, of the fragmentation.
Adding the support to UMA seemed non-trivial, though.
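
For anyone unfamiliar with the approach, a fixed-size block pool can be
sketched roughly as below. This is a userland illustration under my own
assumptions, not the prototype mentioned above and not the UMA API: the
names (block_pool, pool_alloc, and so on) are hypothetical, the backing
memory comes from malloc(), and there is no locking. Every allocation is
one 128k block, so freed space can always be reused by the next request
of the same size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_BLOCK_SIZE	(128 * 1024)	/* assumed uniform block size */

struct pool_block {
	struct pool_block *next;	/* link stored inside the free block */
};

struct block_pool {
	char *base;			/* backing region, allocated once */
	struct pool_block *free;	/* LIFO list of free blocks */
	size_t nblocks;
	size_t nfree;
};

/* Carve one contiguous region into nblocks equal blocks. */
static int
pool_init(struct block_pool *p, size_t nblocks)
{
	p->base = malloc(nblocks * POOL_BLOCK_SIZE);
	if (p->base == NULL)
		return (-1);
	p->free = NULL;
	p->nblocks = nblocks;
	p->nfree = 0;
	for (size_t i = 0; i < nblocks; i++) {
		struct pool_block *b =
		    (struct pool_block *)(p->base + i * POOL_BLOCK_SIZE);
		b->next = p->free;
		p->free = b;
		p->nfree++;
	}
	return (0);
}

/* Return one block, or NULL when the pool is exhausted. */
static void *
pool_alloc(struct block_pool *p)
{
	struct pool_block *b = p->free;

	if (b == NULL)
		return (NULL);
	p->free = b->next;
	p->nfree--;
	return (b);
}

/* Put a block back; it must have come from this pool. */
static void
pool_free(struct block_pool *p, void *ptr)
{
	char *c = ptr;
	struct pool_block *b = ptr;

	assert(c >= p->base && c < p->base + p->nblocks * POOL_BLOCK_SIZE);
	assert((size_t)(c - p->base) % POOL_BLOCK_SIZE == 0);
	b->next = p->free;
	p->free = b;
	p->nfree++;
}

static void
pool_destroy(struct block_pool *p)
{
	free(p->base);
	p->base = NULL;
	p->free = NULL;
	p->nblocks = p->nfree = 0;
}
```

The cost is the efficiency penalty noted above: a request much smaller
than 128k still consumes a whole block.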
Thanks again for the information. I hope to get a chance to look at the
code this weekend.
- Ben