--Sig_/ZmBnMA4RRv0oGJlzTHlHs8K
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: quoted-printable
On Tue, 01 Jul 2014 17:57:26 +0200
Willem Jan Withagen <wjw@digiware.nl> wrote:
> On 2014-07-01 17:33, O. Hartmann wrote:
> > On Tue, 01 Jul 2014 17:23:14 +0200
> > Willem Jan Withagen <wjw@digiware.nl> wrote:
> >
> >> On 2014-07-01 16:48, Rang, Anton wrote:
> >>> DOT => DOD
> >>>
> >>> 444F54 => 444F44
> >>>
> >>> That's a single-bit flip. Bad memory, perhaps?
> >>
> >> Very likely, especially if the system does not have ECC.
> >> On rare occasions an alpha particle, a power cycle, or anything else
> >> disruptive damages a memory cell. And it could be that
> >> it requires a special pattern of accesses to actually exhibit the error.
> >>
> >> In the past (199x's) 'make buildworld' used to be a rather good memory
> >> tester. But nowadays look at
> >> http://www.memtest.org/
> >>
> >> This tool has found all of the bad memory in all the systems I have used
> >> and/or built for others.
> >> Note that it might take a few runs and some more heat to actually
> >> trigger the faulty cell, but memtest86 will usually find it.
> >>
> >> Note that on big systems with lots of memory it can take a loooooong
> >> time to run just one full testset to completion.
> >>
> >> --WjW
> >
> > I have already tested with memtest86+ (I had to download the Linux image;
> > the port on FreeBSD is broken on CURRENT). It hasn't found anything
> > strange so far.
> >
> > I will do another test.
> >
> > I noticed that on that specific box the chipset temperature is
> > 81 degrees Celsius. The chipset is an Eaglelake P45, which houses the
> > memory controller on that old platform. dmidecode gives:
> >
> > Manufacturer: ASUSTeK Computer INC.
> > Product Name: P5Q-WS
> > Version: Rev 1.xx
>
Hello Willem,
> Hi Oliver,
>
> I've built several (5+) systems with these boards (from memory they date
> from around 2009??). And if I recall correctly, one of them is still
> functional. The first one broke down within a couple of weeks, and the
> others did not survive over time either.
>
> The auxiliary chips on that board do run hot, but I never realized they
> ran this hot. Is 81C the CPU temp from sysctl, or did you measure the
> heat sink on the motherboard? In the latter case it is probably just too
> hot. But even if it is the temp on the chip itself, I've rarely seen
> temps go this high.
The temperature is shown in the BIOS and by one of those health daemons
found in ports (I forgot the name).
There is no sysctl MIB showing the chipset temperature on that board, as
far as I know.
>
> You may need to run memtest86 for more than 6-10 complete passes with
> all the tests.
Last time I ran memtest86+ it took ~ 1 1/2 days to finish.
>
> If the memtests do not reveal anything broken, then you get into even
> more wizardry, like bad power etc. Especially since it only
> occurs occasionally, it is going to be a nightmare to find the root cause
> of this, other than by replacing hardware piece by piece, which won't be
> easy given the age of the board and parts.
>
> You could go into the BIOS and try to configure RAM access at a slower
> speed and see if the problem goes away. If it does, it could be that you
> are running on the edge of the spec with regard to RAM timing.
>
> But like I said, there are lots of funky details that can interact in
> strange and unexpected ways.
>
> --WjW
I will check the memory again in the next few days.
Regards,
Oliver
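
A side note on the DOT => DOD corruption quoted at the top of this
thread: the single-bit-flip claim can be checked by XOR-ing the two
strings byte by byte and listing the bits that are set in the result.
What follows is only a minimal sketch in Python. The function name
differing_bits is made up for illustration and does not come from any
tool mentioned in this thread.

```python
def differing_bits(a, b):
    """Return (byte_index, bit_index) for every bit that differs.

    bit_index counts from 0 at the least significant bit.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    flips = []
    for i, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y  # set bits mark positions where x and y disagree
        for bit in range(8):
            if diff & (1 << bit):
                flips.append((i, bit))
    return flips


# 'T' is 0x54 and 'D' is 0x44; they differ only in bit 4 (mask 0x10).
print(differing_bits(b"DOT", b"DOD"))  # [(2, 4)]
```

A result with exactly one entry is consistent with a single flipped bit,
which fits faulty RAM or a marginal memory controller, although it does
not prove either.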
--Sig_/ZmBnMA4RRv0oGJlzTHlHs8K--