----- Original Message -----
From: "Jeremy Chadwick" <jdc@koitsu.org>
To: "Adrian Chadd" <adrian@freebsd.org>
Cc: "Steven Hartland" <killing@multiplay.co.uk>; "FreeBSD Stable Mailing List" <freebsd-stable@freebsd.org>
Sent: Sunday, July 20, 2014 11:58 PM
Subject: Re: Consistently "high" CPU load on 10.0-STABLE
> On Sun, Jul 20, 2014 at 03:09:55PM -0700, Adrian Chadd wrote:
>> hi,
>>
>> it looks like a whole lot of things are waking up at the same time:
>>
>> * dhcpd
>> * em
>> * usb devices
>>
>> So, do you have some shared interrupts going on here? That seems to be
>> what's causing things to all wake up all at once.
>
> I forget how to get an interrupt mapping from the I/O APIC, but dmesg
> indicates the following. Sorted by IRQ order so that you can tell
> what's associated with what, and also RELENG_9 vs. RELENG_10 (because I
> do have an old dmesg.today from this box running RELENG_9). All the
> IRQs match up:
>
> dev RELENG_10 RELENG_9
> -------- ------------- -------------
> ioapic0 IRQs 0 to 23 IRQs 0 to 23 (same)
> ioapic1 IRQs 24 to 47 IRQs 24 to 47 (same)
> attimer0 IRQ 0 IRQ 0 (same)
> atkbdc0 IRQ 1 IRQ 1 (same)
> atkbd0 IRQ 1 IRQ 1 (same)
> uart1 IRQ 3 IRQ 3 (same)
> uart0 IRQ 4 IRQ 4 (same)
> atrtc0 IRQ 8 IRQ 8 (same)
> em0 IRQ 16 IRQ 16 (same)
> pcib1 IRQ 16 IRQ 16 (same)
> pcib3 IRQ 16 IRQ 16 (same)
> pcib4 IRQ 16 IRQ 16 (same)
> uhci0 IRQ 16 IRQ 16 (same)
> ahci0 IRQ 17 IRQ 17 (same)
> em1 IRQ 17 IRQ 17 (same)
> ichsmb0 IRQ 17 IRQ 17 (same)
> pcib5 IRQ 17 IRQ 17 (same)
> uhci1 IRQ 17 IRQ 17 (same)
> ehci0 IRQ 18 IRQ 18 (same)
> uhci2 IRQ 18 IRQ 18 (same)
> uhci5 IRQ 18 IRQ 18 (same)
> siis0 IRQ 21 IRQ 21 (same)
> uhci4 IRQ 22 IRQ 22 (same)
> ehci1 IRQ 23 IRQ 23 (same)
> uhci3 IRQ 23 IRQ 23 (same)
>
> And the higher-numbered IRQs per vmstat -i. I only have this for
> RELENG_10 however:
>
> irq256: em0 1848856 26
> irq259: ahci0:ch0 273086 3
> irq260: ahci0:ch1 9990 0
> irq261: ahci0:ch2 48514 0
> irq262: ahci0:ch3 48046 0
> irq263: ahci0:ch4 48258 0
> irq264: ahci0:ch5 48052 0
>
> vmstat -i for this is kinda painful (discussed this with jhb@ in the
> past, re: kernel just appending "+" to the string to indicate "many
> things using this IRQ").
>
> I have absolute no USB devices attached to the system (meaning there are
> USB controllers and ports, yeah, but nothing attached to any of them).
> The keyboard is PS/2. All disks are on ahci0 (no disks currently
> attached to siis0).
>
> As for dhcpd: I don't know how that'd be responsible. If I stop the
> process entirely I still see the problem.
>
> I can provide some more ktrdumps, along with turning off as many daemons
> + cron jobs as I can, if you feel that'd be helpful.
>
> Likewise I can provide an ACPI DSDT dump if that would be useful (maybe
> to someone else).
>
> I haven't tried booting the box in single-user and letting it sit there
> to see if anything shows up there.
>
> In the interim I wrote the perl script I mentioned in my mail to Steve.
> When the load shoots up, there is literally no field in "vmstat -s"
> that shows a humongous increase (or decrease) consistently. Meaning
> I'd say 95% of the time when there's a sudden load jump, none of those
> statistics I can correlate with it. It's a pretty "meh" script, but
> it does the job of showing deltas between vmstat -s runs and indicating
> visually when there's a jump in load average (1m avg). It requires a
> VERY wide terminal (about 301 characters):
>
> http://jdc.koitsu.org/freebsd/releng10_perf_issue/load_vmstat.pl
>
> Some example output is here (obviously can't see the red+bold
> highlighting of the line):
>
> http://jdc.koitsu.org/freebsd/releng10_perf_issue/example_data.txt
>
> Load jumps at the following time indexes:
>
> 124.0 (from 0.02 to 0.10, load delta: 0.08)
> 153.0 (from 0.06 to 0.14, load delta: 0.08, time delta: 29.0 sec)
> 178.5 (from 0.10 to 0.17, load delta: 0.07, time delta: 25.5 sec)
> 217.0 (from 0.09 to 0.17, load delta: 0.08, time delta: 38.5 sec)
> 236.0 (from 0.12 to 0.19, load delta: 0.07, time delta: 19.0 sec)
> 244.0 (from 0.17 to 0.24, load delta: 0.07, time delta: 8.0 sec)
> 259.0 (from 0.20 to 0.27, load delta: 0.07, time delta: 15.0 sec)
> 284.5 (from 0.19 to 0.25, load delta: 0.06, time delta: 25.5 sec)
> 310.0 (from 0.18 to 0.25, load delta: 0.07, time delta: 25.5 sec)
> 341.5 (from 0.27 to 0.33, load delta: 0.06, time delta: 31.5 sec)
>
> Some of these could be due to cron jobs I run (though they really aren't
> that intensive on disk, CPU, or memory), but there's a pretty consistent
> pattern going on there load-wise. The reason noted time deltas was
> watching for "periodic tasks", e.g. ZFS txg flush. But this seems to
> have a little bit more variance.
>
> It's just that none of the vmstat -s statistics change rapidly alongside
> the load. But I'm sure there are VM bits that aren't tracked in vmstat.
Not sure if its in stable/10 but there was some talk about making
ZFS use lz4 for some things by default, wonder if that might have
something to do with it?
Regards
Steve
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"