On 21.07.2010, at 10:33, Andriy Gapon wrote:

> on 21/07/2010 03:57 Markus Gebert said the following:
>> Another thing though: Today I compared verbose boot output from 8-stable and
>> the current box. I saw that the ioapic sets up IRQ routing differently on
>> these two systems although the hardware is the same. This seemed not so
>> interesting at first, but then I noticed that 8-stable sets up two routes (to
>> lapic0 and lapic2, or sometimes lapic3) for IRQ58 (mpt0), while current only
>> uses one route (to lapic0).
>
> My understanding is that it's not "two routes", but re-routing.
> During early boot all interrupts are bound to BSP; later, when APs become
> online, the interrupts are re-distributed among available CPUs.

I guess you're right, misinterpretation on my side. Thanks for clarifying
this. Now being aware of this, it seems to me that in the
machdep.lapic_allclocks=0 case, there might just be more interrupts to be
assigned/routed due to "more clocks being used". If that's true, maybe it's
just "luck" that in this case the mpt interrupt gets assigned to lapic0/cpu0
and the box runs fine. I'm just guessing though, since I have no clue how
interrupts are assigned to lapics exactly (round-robin? some logic?).

>> I used 'cpuset -c -l 0 -x 58' in an attempt to make my 8-stable box behave
>> like the one running current. Indeed, this seems to have changed IRQ58 to be
>> routed to lapic0 only. And the box was running for hours without showing the
>> symptoms.
>>
>> I just checked boot verbose output of my 8-stable box again (booted with
>> machdep.lapic_allclocks=0 as mentioned above). And now it seems to have set
>> up IRQ routes just like the current box (one route for IRQ58 to lapic0).
>
> Not sure how to interpret this properly.
> One possibility is a hardware problem where interrupt message route between
> ioapic2 and CPU to which lapic3 belongs is flaky.
> Perhaps, this might be a FreeBSD problem: it could be that the system somehow
> tells to not set up such routes, but we don't listen. But this is far-fetched.

I'm not sure either. If my "theory" above proved to be true, it would have
been just luck that 6.x and 7.x (and current) run just fine on the X4100M2.
A (short) test on Ubuntu didn't trigger the problem, so the Linux kernel is
either lucky too by selecting an interrupt route that is "not flaky", or
there's indeed some way to figure out not to use some lapics for some
interrupts. Or we didn't test Linux thoroughly enough.

Markus

_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"
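
To illustrate the "luck" guess in the message above: if interrupts were handed out to CPUs round-robin (an assumption, the message itself does not confirm how FreeBSD picks a lapic), then the CPU that IRQ58 lands on would depend only on how many sources get assigned before it. The sketch below is a toy model, not kernel code. The function name `assign_round_robin` and every IRQ number other than 58 are made up for illustration.

```python
from itertools import cycle


def assign_round_robin(irqs, cpus):
    """Map each IRQ to a CPU by cycling through cpus in order.

    Hypothetical policy for illustration only; not taken from the kernel.
    """
    cpu_iter = cycle(cpus)
    return {irq: next(cpu_iter) for irq in irqs}


cpus = [0, 1, 2, 3]

# Made-up IRQ lists; only IRQ 58 (mpt0) comes from the message.
few_sources = [16, 17, 58]
# Two extra made-up sources assigned ahead of IRQ 58,
# standing in for "more clocks being used".
more_sources = [16, 17, 18, 19, 58]

print(assign_round_robin(few_sources, cpus)[58])   # prints 2
print(assign_round_robin(more_sources, cpus)[58])  # prints 0
```

Under this toy model, adding two sources ahead of IRQ58 moves it from CPU 2 to CPU 0, which would be consistent with the behaviour described, but it proves nothing about the real assignment logic.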