From: killing@multiplay.co.uk ("Steven Hartland")

Board: FB_stable

Subject: Re: Consistently "high" CPU load on 10.0-STABLE

Posted from: NCTU CS FreeBSD Server (Mon Jul 21 08:16:48 2014)

Relay path: ptt!csnews.cs.nctu!news.cednctu!FreeBSD.cs.nctu!.POSTED!freebsd.org!ow

----- Original Message -----
From: "Jeremy Chadwick" <jdc@koitsu.org>
To: "Adrian Chadd" <adrian@freebsd.org>
Cc: "Steven Hartland" <killing@multiplay.co.uk>; "FreeBSD Stable Mailing List" <freebsd-stable@freebsd.org>
Sent: Sunday, July 20, 2014 11:58 PM
Subject: Re: Consistently "high" CPU load on 10.0-STABLE

> On Sun, Jul 20, 2014 at 03:09:55PM -0700, Adrian Chadd wrote:
>> hi,
>>
>> it looks like a whole lot of things are waking up at the same time:
>>
>> * dhcpd
>> * em
>> * usb devices
>>
>> So, do you have some shared interrupts going on here? That seems to be
>> what's causing things to all wake up all at once.
>
> I forget how to get an interrupt mapping from the I/O APIC, but dmesg
> indicates the following. Sorted by IRQ order so that you can tell
> what's associated with what, and also RELENG_9 vs. RELENG_10 (because I
> do have an old dmesg.today from this box running RELENG_9). All the
> IRQs match up:
>
> dev        RELENG_10       RELENG_9
> --------   -------------   -------------
> ioapic0    IRQs 0 to 23    IRQs 0 to 23    (same)
> ioapic1    IRQs 24 to 47   IRQs 24 to 47   (same)
> attimer0   IRQ 0           IRQ 0           (same)
> atkbdc0    IRQ 1           IRQ 1           (same)
> atkbd0     IRQ 1           IRQ 1           (same)
> uart1      IRQ 3           IRQ 3           (same)
> uart0      IRQ 4           IRQ 4           (same)
> atrtc0     IRQ 8           IRQ 8           (same)
> em0        IRQ 16          IRQ 16          (same)
> pcib1      IRQ 16          IRQ 16          (same)
> pcib3      IRQ 16          IRQ 16          (same)
> pcib4      IRQ 16          IRQ 16          (same)
> uhci0      IRQ 16          IRQ 16          (same)
> ahci0      IRQ 17          IRQ 17          (same)
> em1        IRQ 17          IRQ 17          (same)
> ichsmb0    IRQ 17          IRQ 17          (same)
> pcib5      IRQ 17          IRQ 17          (same)
> uhci1      IRQ 17          IRQ 17          (same)
> ehci0      IRQ 18          IRQ 18          (same)
> uhci2      IRQ 18          IRQ 18          (same)
> uhci5      IRQ 18          IRQ 18          (same)
> siis0      IRQ 21          IRQ 21          (same)
> uhci4      IRQ 22          IRQ 22          (same)
> ehci1      IRQ 23          IRQ 23          (same)
> uhci3      IRQ 23          IRQ 23          (same)
>
> And the higher-numbered IRQs per vmstat -i. I only have this for
> RELENG_10 however:
>
> irq256: em0            1848856   26
> irq259: ahci0:ch0       273086    3
> irq260: ahci0:ch1         9990    0
> irq261: ahci0:ch2        48514    0
> irq262: ahci0:ch3        48046    0
> irq263: ahci0:ch4        48258    0
> irq264: ahci0:ch5        48052    0
>
> vmstat -i for this is kinda painful (discussed this with jhb@ in the
> past, re: kernel just appending "+" to the string to indicate "many
> things using this IRQ").
>
> I have absolutely no USB devices attached to the system (meaning there are
> USB controllers and ports, yeah, but nothing attached to any of them).
> The keyboard is PS/2. All disks are on ahci0 (no disks currently
> attached to siis0).
>
> As for dhcpd: I don't know how that'd be responsible. If I stop the
> process entirely I still see the problem.
>
> I can provide some more ktrdumps, along with turning off as many daemons
> + cron jobs as I can, if you feel that'd be helpful.
>
> Likewise I can provide an ACPI DSDT dump if that would be useful (maybe
> to someone else).
>
> I haven't tried booting the box in single-user and letting it sit there
> to see if anything shows up there.
>
> In the interim I wrote the perl script I mentioned in my mail to Steve.
> When the load shoots up, there is literally no field in "vmstat -s"
> that shows a humongous increase (or decrease) consistently. Meaning
> I'd say 95% of the time when there's a sudden load jump, none of those
> statistics I can correlate with it. It's a pretty "meh" script, but
> it does the job of showing deltas between vmstat -s runs and indicating
> visually when there's a jump in load average (1m avg). It requires a
> VERY wide terminal (about 301 characters):
>
> http://jdc.koitsu.org/freebsd/releng10_perf_issue/load_vmstat.pl
>
> Some example output is here (obviously can't see the red+bold
> highlighting of the line):
>
> http://jdc.koitsu.org/freebsd/releng10_perf_issue/example_data.txt
>
> Load jumps at the following time indexes:
>
> 124.0 (from 0.02 to 0.10, load delta: 0.08)
> 153.0 (from 0.06 to 0.14, load delta: 0.08, time delta: 29.0 sec)
> 178.5 (from 0.10 to 0.17, load delta: 0.07, time delta: 25.5 sec)
> 217.0 (from 0.09 to 0.17, load delta: 0.08, time delta: 38.5 sec)
> 236.0 (from 0.12 to 0.19, load delta: 0.07, time delta: 19.0 sec)
> 244.0 (from 0.17 to 0.24, load delta: 0.07, time delta: 8.0 sec)
> 259.0 (from 0.20 to 0.27, load delta: 0.07, time delta: 15.0 sec)
> 284.5 (from 0.19 to 0.25, load delta: 0.06, time delta: 25.5 sec)
> 310.0 (from 0.18 to 0.25, load delta: 0.07, time delta: 25.5 sec)
> 341.5 (from 0.27 to 0.33, load delta: 0.06, time delta: 31.5 sec)
>
> Some of these could be due to cron jobs I run (though they really aren't
> that intensive on disk, CPU, or memory), but there's a pretty consistent
> pattern going on there load-wise. The reason I noted time deltas was
> watching for "periodic tasks", e.g. ZFS txg flush. But this seems to
> have a little bit more variance.
>
> It's just that none of the vmstat -s statistics change rapidly alongside
> the load. But I'm sure there are VM bits that aren't tracked in vmstat.

Not sure if it's in stable/10, but there was some talk about making ZFS
use lz4 for some things by default. I wonder if that might have something
to do with it?

    Regards
    Steve

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
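
For anyone wanting to reproduce the measurement described in the quoted message (deltas between successive vmstat -s runs, plus flagging jumps in the 1-minute load average), here is a minimal Python sketch. It is not Jeremy's load_vmstat.pl. The function names, the 0.05 jump threshold, and the assumption that every vmstat -s line has the form "<count> <description>" are illustrative choices, not taken from the thread.

```python
import os
import subprocess


def parse_vmstat_s(text):
    """Parse `vmstat -s` output into {description: count}.

    Assumes each line looks like "<count> <description>"; lines that
    do not start with an integer are skipped.
    """
    counters = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0].isdigit():
            counters[parts[1].strip()] = int(parts[0])
    return counters


def counter_deltas(prev, cur):
    """Per-counter change between two parsed runs (counters in both)."""
    return {name: cur[name] - prev[name] for name in cur if name in prev}


def find_load_jumps(samples, threshold=0.05):
    """Find rises of at least `threshold` between consecutive samples.

    `samples` is a list of (time_index, load_1m) pairs. Returns a list of
    (time_index, old_load, new_load, seconds_since_previous_jump); the
    last field is None for the first jump. The threshold is hypothetical.
    """
    jumps = []
    last_jump_time = None
    for (_, old), (t, new) in zip(samples, samples[1:]):
        # Round to two places, matching the precision load averages
        # are normally reported with, to avoid float noise at the edge.
        if round(new - old, 2) >= threshold:
            gap = None if last_jump_time is None else t - last_jump_time
            jumps.append((t, old, new, gap))
            last_jump_time = t
    return jumps


def take_sample():
    """One live sample: (1-minute load average, parsed vmstat -s counters)."""
    out = subprocess.run(["vmstat", "-s"], capture_output=True,
                         text=True, check=True).stdout
    return os.getloadavg()[0], parse_vmstat_s(out)
```

Calling take_sample() in a loop with a fixed sleep, keeping the (time, load) pairs for find_load_jumps() and passing consecutive counter dicts to counter_deltas(), gives the same kind of view: which counters moved in the interval where the load rose.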