I am troubleshooting BSODs that my client has had intermittently over the last couple of months - the attached being the first minidump, which I got on 8 July. After analyzing it with WinDbg, I decided to dig deeper using .reload;!analyze -v;r;kv;lmnt. However, I am still none the wiser. With this deeper analysis, I am meant to be able to tell what process, program or device is causing the BSOD, but I can't find it! Can anyone help?
Here is the problem in more detail, and what I've tried. Hope it helps.
The specs of the machine are:
3.00GHz Pentium D
2GB RAM
250GB Hard drive
ConRoe 1333 D667 mboard (ASRock)
Since June, I have had intermittent BSODs. I kept a record of them:
8 July - VISTA_DRIVER_FAULT
10 July - CODE_CORRUPTION
12 July - CODE_CORRUPTION
26 August - IRQL_NOT_LESS_OR_EQUAL
26 August - MEMORY_CORRUPTION (2)
26 August - SPECIAL_POOL_DETECTED_MEMORY_CORRUPTION
11 October - SYSTEM_THREAD_EXCEPTION_NOT_HANDLED
12 October - MEMORY_MANAGEMENT
17 October - PAGE_FAULT_IN_NONPAGED_AREA
27 October - KERNEL_DATA_INPAGE_ERROR
27 October - UNMOUNTABLE_BOOT_VOLUME
(The last two didn't create a minidump for some reason)
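To see how the crashes cluster over time, the dates in the record above can be tallied with a few lines of Python. This is only a rough sketch: the year is an assumption (the record gives none), each listed line is counted as one crash, and the variable names are made up for illustration.

```python
from collections import Counter
from datetime import date

YEAR = 2008  # assumption: the record above gives no year
MONTHS = {"July": 7, "August": 8, "October": 10}

# One entry per line of the record above; only the dates are used.
# The "(2)" on the MEMORY_CORRUPTION line is counted once here.
CRASH_DATES = [
    "8 July", "10 July", "12 July",
    "26 August", "26 August", "26 August",
    "11 October", "12 October", "17 October",
    "27 October", "27 October",
]

def parse(day_month):
    """Turn a string like '26 August' into a date in the assumed year."""
    day, month = day_month.split()
    return date(YEAR, MONTHS[month], int(day))

# Number of crashes recorded on each day
per_day = Counter(parse(d) for d in CRASH_DATES)
days = sorted(per_day)

# Days elapsed between consecutive crash days
gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]

for day in days:
    print(day.isoformat(), "x", per_day[day])
print("gaps between crash days (in days):", gaps)
```

The gaps make the pattern easier to see: a few crashes close together, then weeks of nothing, which fits an intermittent hardware fault better than a single bad driver.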
Things done so far, to no avail:
Memory modules taken out, one by one - alas, TuffTest, MemTest and Windows Memory Diagnostic all say no errors.
New PSU - didn't solve errors
Updating drivers - didn't solve anything
Updating Windows - didn't solve anything
Two OS reinstallations due to crashes and Vista Repairs not working - didn't solve anything
What is very odd is this:
I did a TuffTest disk test (Seek Test and Surface Analysis) on his machine and it came up with F004 errors - Sector not found. Then I put the disk in my test machine, same errors. Then I tried it in his machine again with 1 memory module at a time, same errors. However, after putting the 2 modules back and running the disk test again, the test was error-free!!
Also, in the Event Log, about once an hour, I get "Some processor performance power features have been disabled due to a known bad firmware problem. Please contact the manufacturer." I contacted them last Friday, and still no answer!!
My mate and I (he works with computers as well) both think it's now a motherboard problem, but as it's intermittent, we're struggling!!
A mate of mine who has worked with computers for 30 years looked at it as well - this is strange.
I initially thought it was memory problems - so, my mate put his wife's memory into my client's machine, and that worked fine, after several hours of disk testing, motherboard testing, CPU testing, memory testing. Still with her memory in it, he used a piece of software called Glaze, a graphics card tester, thus testing the motherboard and putting it under stress, plus he ran other apps as well, but the computer didn't cough up once.
He ordered some more memory from Crucial, put it in, and it has worked fine since lunchtime today (Friday), (with Glaze running as well).
So, he thinks that the person who built it must have used cheap and nasty, unbranded memory.
Wow, quite a job you had! Intermittent problems are about the hardest ones to solve, because they need to be going wrong at the right time! Anyway, this is a good example of the trial and error it takes sometimes! Some of that stuff was pretty strange, but adding it all up, the cheap memory would explain most or all of it, so that's a good solution to come to! Shouldn't be that surprising though, cheap parts are a major problem in electronics!
The computer hasn't coughed up a single BSOD since Friday, however, I did a TuffTest this morning, and it came up with F001 - Invalid Function Request - Bad Command, for both the Seek Test and the Surface Analysis (like it did before) (I did that test about 2 hours ago). Just now I did it again, but this time it was fine; what's going on?
I still get the message 'some processor performance power features have been disabled due to a known bad firmware problem', and now my client reports another BSOD, 'kernel_datapage_input_error' - I've never seen that one before, and Windows didn't create a dump file!