Board: DFBSD_kernel
Dan Melomedman wrote:
> Joerg Sonnenberger wrote:
>
>> Actually, this is exactly one of the situations where I don't want
>> automatic, silent restarts. It hides problems, which is in my position
>> even more problematic. "Magic restart" doesn't solve every problem.
>>
>> Joerg
>
> Nothing solves every problem. Supervision solves the 'Oops, something
> crashed, and needs to be restarted' problem. If my nearby nuclear power
> plant's reactor monitoring software running on a Unix box gets killed
> due to a memory leak, I want it restarted immediately, not wait for the
> administrator to find out by the time the reactor melts down.

No you do not.

What you DO want, when *any* fault of that nature occurs, is for a
totally separate system - usually a 'state machine' - or even *gravity*
to take over and 'safe' that plant until the real cause is scrutinized
by a team of experts.

Too much is at stake to blindly restart a daemon OR the OS.

Unix has no more business running nuke power plants than Windows.
That is specialized RT OS ground. Or state machines monitored by
specialized computers. Or both.

> All fault
> tolerant systems have some kind of supervision in software.

All seriously critical ones have hardware / firmware fall-backs and
manual overrides as well.

All failures, be they oil-refinery, chemical plant, power plant, or web
and mail servers, *should* be brought to human attention, examined, and
attended to by folks with brains.

That way we can fix them, not be victims of them.

Bill