60

My computer has been freezing at irregular times for 3 weeks now.

Please note that this question changes with each thing that I try (to add details).

What happens

  • My computer freezes, the video stops. (No graphic glitches, it just stops)
  • Sound stops too.
  • Sometimes, randomly, the screen on my G-15 keyboard flickers and I see characters in the wrong places. This usually lasts about 1-2 seconds and happens shortly before my computer freezes.
  • I have to keep the power button pressed for 4 seconds to shut my computer down.
  • I still hear my hard drives and fans working.
  • Sometimes it works with no problems for a full day; other times it keeps freezing each time I restart my computer and I have to leave it for the rest of the day.
  • Sometimes my mouse freezes for a fraction of a second (like 0.01 to 0.2 seconds) quite randomly, usually before the computer freezes.
  • No errors spotted by the "Action center" unlike when I had problems with my last video card on this system (Driver errors).
  • My G-15 LCD screen also freezes.
  • Sometimes my G-15 LCD screen flickers and characters get shifted around temporarily under heavy load.
  • Now, most of the time, the BIOS hard disk boot order gets reversed for some reason, and I have to set it back to the right order and save each time I boot. (Might be unrelated, not sure, but it first started yesterday)

What I did so far

  • I had similar problems in the past, and back then I changed my hard drive (it was faulty). So I tested my software RAID-0 array, and it was faulty, so I replaced it (I reinstalled Windows 7 at that point). I also tested with my secondary hard drive unplugged.
  • My CPU was running at about 100 degrees Celsius. I removed the dust between the fans and the heat sink, and it's now between 45 and 55.
  • I ran a CPU stress test and it didn't freeze during the tests (using Prime95 on all cores).
  • Ran a memory test (using memtest86+) for a single pass and there were no errors.
  • Ran a GPU stress test with ati-tools and furmark and it didn't freeze during the tests. (No artefacts either)
  • I had trouble with my graphics card when I got it, but I think it was fixed by a driver update.
  • I checked the voltages in my BIOS setup and they all seemed ok (±0.2 I think).
  • I have run Fedora 15 on the computer from an external hard drive without problems (apart from the fact that it couldn't load Gnome 3 and reverted to Gnome 2; I didn't want to install drivers since I use it on multiple computers). I used it to back up my files from the RAID array to my 1TB hard drive for the reinstallation of Windows. (So the crashes only happened on Windows) [The external hard drive is plugged directly into a SATA port]
  • I contacted EVGA (my graphics card vendor) and pointed them to this question; I'm waiting for an answer.
  • Ran sensors on Fedora 15 and got this output: http://pastebin.com/0BHJnAvu
  • Ran 6 different short CPU stress tests on Fedora 15 (I haven't found any complete stress testers for Linux) and it didn't crash.
  • Changed the thermal paste to some Arctic Silver 5 for my CPU and stress tested the CPU. The temperature was 50 at idle, peaked at 64 and slowly went down to 62 during the test.
  • Ran some stress testing with a temporary graphic card and it went ok.
  • Ran a furmark stress test with my original graphics card and it froze again. The GPU temp was 74C, the CPU temp 58C and the mobo temp 40C or 45C (I don't know which SpeedFan reading it is).
  • Ran a furmark stress test and a CPU stress test at the same time, results: http://pastebin.com/2t6PLpdJ
  • I have been using my computer without stressing it for about 2 hours now and no crashes yet. I have also disabled the AMD Cool'n'Quiet function in the BIOS for more regular power to the CPU. When I ran Furmark without C'n'Q my computer didn't freeze, but I got a "Driver Kernel Error" that recovered (and Furmark crashed), all while running a CPU stress test. The computer eventually froze while I was away from it, but this time my screen just went to sleep and I couldn't wake it.
  • Using the stability tester in nTune my computer froze again (In the same manner as before). I noticed that Speedfan gives me a -12V of -16.97V and a -5V of -8.78V.
  • I have swapped my G-15 with a basic USB keyboard (HP) and I have run furmark for about 10 minutes, with a CPU stability test running for 30 seconds every 60 seconds, and my computer hasn't crashed yet.
  • Ran some more extended tests without my G-15 and it froze like it usually does.
  • Removed the nForce Hard disk controller.
  • Disabled command queuing in the NVIDIA nForce SATA Controller for both port 0 and port 1 (Errors from the logs)
  • Used CPUID HwMonitor, here are the voltages: http://pastebin.com/dfM7p4jV
  • Changed some configurations in the motherboard BIOS: Disabled PEG Link Mode, Changed AI Tuning to Standard, Disabled the 1394 Controller, Disabled HD Audio, Disabled JMicron RAID controller and Disabled SATA Raid.
  • "A little hope": my computer froze while watching a YouTube video, but not during a 10-hour straight GPU and CPU test.
  • I have put my BIOS back to defaults and: Disabled PEG Link Mode, Disabled HD Audio, Disabled JMicron RAID Controller, Disabled Serial Port Address, Disabled Parallel Port Address and Disabled Onboard 1394 Controller.
  • I changed the SATA cable for the 750GB hard drive and I also changed the ports into which the drives were plugged (1->2, 2->3, 3->4).
  • Changed the power saving feature of my graphic card from "Adaptive" to "Maximum performance".
  • Ran EVGA OC Scanner and got no freeze and no artifacts.
  • I installed the Logitech drivers for my G-15 keyboard and my G-500 mouse and it started freezing again.
  • I removed the Logitech drivers for my G-15 keyboard and my G-500 mouse and it still freezes.
  • After changing everything except the hard drives, graphics card and power supply, my computer is running very well and I haven't run into any problems (this is with the exact same install of Windows that had problems with my old motherboard). After removing my motherboard I found two bulging capacitors, which might be the source of the problems. Since it was almost certainly a motherboard problem caused by these capacitors, I am going to accept the answer that is most closely related to this solution.

When it happens

  • When I play video games (Mostly)
  • When I play flash games (Second most)
  • When I'm looking at my desktop background (It rarely happens when I have a window open, but it does, sometimes)
  • When my Graphic card and my CPU are stressed.
  • Sometimes when my Graphic card is stressed.
  • Sometimes when my CPU is stressed.

Specs

  • Windows Seven x64 Home Premium
  • Motherboard: M2N-SLI Deluxe
  • Graphic card: EVGA GTX 570 (The non-oc one) [nVidia driver version 275.33 from EVGA's website]
  • CPU: AMD Phenom 9950 x2 @ 2.6GHz
  • Memory: Kingston 4x2GB Dual Channel (Pretty basic memory sticks)
  • Hard drives: Was 2x250GB (Western digital caviar) in raid-0 + 1TB (WD caviar black), I replaced the raid array with a 750GB (WD caviar black) [Yes I removed the array from the raid configurations]
  • 750W Power supply
  • No overclocking. Ever.
  • There were some power outages about 4-5 weeks ago, but the problem didn't start immediately after. (I wasn't home, so my computer got shut down)
  • Event logs (Warnings, errors and critical errors) for the last 24 hours: http://pastebin.com/Bvvk31T7

I would like to thank everyone who has been participating; it's really nice to see so many people ready to help others. There were many great answers that might help other people with similar problems in the future (at least I hope so).

In this situation, how can I successfully pin-point the current hardware problem? (If it's a hardware problem)

maaudet
  • 193

12 Answers

12

First off, +1 for a perfectly documented question. This makes things AMAZINGLY easier for us to help.

You've done a lot of hardware testing to date and most of it has come up with no problems. However, this may still be a CPU overheating problem (been there and it sucks). When you cleaned the dust, did you see any thermal paste between the CPU and the heatsink? If so, was it dried up or old? I recommend buying a small tube of Arctic Silver thermal paste ($7) and applying some to the CPU.

If that's not the issue, then I strongly suggest you start looking at your operating system and whether it has any issues. You said you've run Fedora on it? I would recommend burning a Linux LiveCD and booting from it. Try using that as your OS for a bit, browsing and playing music/videos and stuff. If you don't get a crash there, that means that it's either a Windows issue or an HDD issue (seeing as the HDD isn't used in a live environment). I would (from the LiveCD) run a disk check just to make sure. If everything comes up clean, we can safely say it's Windows.

In that case, you need to determine whether it's your OS that is corrupted or whether you've installed something that runs some kind of service which locks your computer up. Give Windows safe mode a try and use that for a bit (I know, horrible resolution... I'm so sorry). If you don't get a freeze for a couple of days then we can narrow it down to a Windows OS issue, in which case you need to look at backups or re-installation.

If you're re-installing your OS, make sure you back all your stuff up...

http://xkcd.com/612/

EDIT: While running in your LiveCD session, open up a terminal and type sensors. If that program is installed (Linux only), it will give you details about your power supply voltages, CPU temp, mobo temp, and everything else you need to know. Monitoring that while in your LiveCD session should give you strong indicators as to whether this is hardware or software.
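If you want to log the readings instead of watching them by hand, something like the following Python sketch could work. It is only a rough illustration: it assumes the temperature lines of the sensors output look like "temp1:  +52.0°C  (high = +70.0°C)", which varies between chips and drivers, and the function names (read_sensors, parse_temperatures, over_limit) are made up for this example.

```python
import re
import subprocess

# Assumed line shape: "<label>:  +52.0°C  (...)". Voltage and fan lines
# do not end their value in "°C", so they are skipped.
TEMP_LINE = re.compile(r"^(?P<label>[^:]+):\s+(?P<value>[+-]?\d+(?:\.\d+)?)\s*°C")


def read_sensors():
    """Run the `sensors` command (lm-sensors) and return its text output."""
    result = subprocess.run(["sensors"], capture_output=True, text=True, check=True)
    return result.stdout


def parse_temperatures(sensors_output):
    """Map each temperature label to its reading in degrees Celsius.

    Labels are not prefixed with the chip name, so if two chips both
    report "temp1", the later one overwrites the earlier one.
    """
    readings = {}
    for line in sensors_output.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            readings[match.group("label").strip()] = float(match.group("value"))
    return readings


def over_limit(readings, limit_c):
    """Return only the readings above limit_c."""
    return {label: temp for label, temp in readings.items() if temp > limit_c}
```

Calling over_limit(parse_temperatures(read_sensors()), 70.0) every few seconds during a stress test and writing the result to a file would leave a record of the last readings before a freeze. The 70.0 limit is just a placeholder, not a recommended threshold for any particular CPU.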

EDIT 2: Based on what you've said about running Fedora on another HDD via eSATA and not experiencing any crashes I would have to say that this is a software issue, maybe drivers. If you can run Fedora for, let's say a whole week without a crash, it's for sure an issue with Windows or deprecated/wrong drivers. How long did you run Fedora for? Did you try watching movies, playing games etc on it?

Aducci
  • 259
nopcorn
  • 16,982
10

100 deg C is WAY too hot! It's possible that your processor has already incurred some damage. But in the interest of being optimistic, I'd say to run memtest86 for another 2 passes to be sure it's not the memory. Are you sure the timings and speeds are being detected correctly?

Did you check your motherboard for bad/puffy capacitors? If it's not your motherboard, then your PSU is either going bad on you, or is insufficient to power your hardware. That sounds like the most likely cause to me.

Use CoreTemp to measure CPU temps, since it's one of the most accurate programs around. Don't use ATI Tool, as it's incompatible with Windows 7. Try RivaTuner instead.

jAce
  • 1,382
Bigbio2002
  • 3,944
5

The first thought I have is that your power supply may be going bad. Playing games or watching YouTube may kick your graphics card into high gear and increase power draw. Also watch those CPU temps: 100 degrees... yikes.

3

I have a laptop with Windows 7 x64 Ultimate that suffers from the same random crashes. I have noticed they happen mostly at home, when my G15 keyboard is plugged in. I do not remember encountering those crashes when I use the laptop keyboard.

The G15 draws a lot of power from the USB port. Maybe it has something to do with it?

3

Using the stability tester in nTune my computer froze again (In the same manner as before). I noticed that Speedfan gives me a -12V of -16.97V and a -5V of -8.78V.

Replace the power supply. These voltages are far out of spec and could be the cause of your problem.

Edit: The negative rails are rarely used today; however, the HWMonitor temperature is too high for temperature sensor 2 if the system is not under load. This could be a problem with the cooling system, motherboard, or power supply.
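To put numbers on "far out of spec", here is a small Python sketch that compares the two SpeedFan readings quoted above against per-rail tolerances. The tolerance table is an assumption for illustration (±5% on the positive rails, ±10% on the negative ones); check the values against your PSU's documentation. Software readings are also only as good as the motherboard sensor behind them.

```python
# Nominal rail voltage -> allowed fractional deviation.
# These tolerances are assumed for illustration, not read from any datasheet.
TOLERANCE = {12.0: 0.05, 5.0: 0.05, 3.3: 0.05, -12.0: 0.10, -5.0: 0.10}


def deviation(nominal, measured):
    """Fractional deviation of the measured voltage from the nominal one."""
    return (measured - nominal) / nominal


def out_of_spec(readings):
    """Return {nominal: deviation} for every rail outside its tolerance.

    `readings` maps a nominal rail voltage to the measured value.
    """
    bad = {}
    for nominal, measured in readings.items():
        dev = deviation(nominal, measured)
        if abs(dev) > TOLERANCE[nominal]:
            bad[nominal] = dev
    return bad


# The two readings reported in the question.
reported = {-12.0: -16.97, -5.0: -8.78}
for rail, dev in out_of_spec(reported).items():
    print(f"{rail:+.1f} V rail is off by {dev:.0%}")
```

With the readings from the question, the -12V rail is about 41% away from nominal and the -5V rail about 76%, both well beyond the assumed 10%.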

bwDraco
  • 46,683
3

I'll stop at this:

I have to keep the power button pressed for 4 seconds to shut my computer down.

and I can tell it's a power supply issue. To avoid having to leave it to rest before it works again, turn it off, unplug the screen and the power cord, and hold the power button for ~15 seconds.

This will discharge any component still holding an electric charge.

See if you can get a replacement. You may also check for faulty capacitors, as described in "Why won't my computer boot?", and replace them.

2

What video card chipset (brand name doesn't matter)? I've seen these problems with Radeon HD 4xxx and 5xxx series, when they bounce in and out of powersaving mode (that is, the GPU clock speed changes). Youtube in full-screen was the surest way to trigger the problem.

There are some hacks to disable PowerPlay (the clockspeed changing), involving creating an overclocking profile using Catalyst Control Center and then editing all the clock settings to be the same (no actual overclocking is necessary, but the "overclocking" mode has to be enabled for you to override the default power profiles).

One consequence of power saving being involved is that GPU stress tests won't trigger the problem, since they keep the GPU busy and running at its fastest clock rate.

You mentioned ati-tools, so I think this is the most likely culprit.

I don't notice any mention of updating your video card drivers. The newest ones seem to be quite a bit better in this regard (or maybe they continue using the powerplay-neutering profile configured with the earlier version). In any case, upgrading to the latest Catalyst drivers is worth a try.

Ben Voigt
  • 7,346
2

I had the same issue. Prepared for the worst I plugged a USB HDD into one of the USB ports at the back of the unit to perform a system image and since doing that I haven't had any more freeze crashes.

I have other USB devices but they are attached to hubs. I think there is a problem with hubs being attached when no USB device is attached directly to a USB port. This is the second issue of this type I have had since running Win 7 64. The other time I had these obnoxious random freezes occurred only when I had a firewire compact flash card reader attached to the firewire port on the front of the unit. Random freezes sometimes within 2-5 minutes or sometimes after two days. Both down to external devices being attached to the computer. No errors in device manager - no errors in event logs.

JMD
  • 1
1

Try using the Windows 7 Startup Repair on the DVD.

Just boot from your Windows 7 DVD and then there should be an option to repair.

JoshB
  • 121
1

Make sure you are using the latest memtest86+ version. I had memory problems with my PC and dug out an old recovery CD with memtest on it. It ran fine, kind of slowly, but found no errors. After some more research I was almost sure it was the memory, so I downloaded the newest version and burned it on my working laptop. It ran much faster and this time found memory errors.

TJL
  • 1,190
1

My computer is also having freezing problems and I'm so glad I came across this website. This is by far the most detailed post I've ever seen! I have gone through quite a few similar steps, such as physically cleaning the fans and case, and downloading software such as OCCT, Memtest and HW Monitor.

What fixed the problem on my computer was updating the BIOS to the latest version.

After that the computer ran smooth as a tiger. I hope this advice may be useful to some. Take care when updating the BIOS though, because I don't think it's recommended unless your computer is having problems that can't be fixed any other way.

0

I'm going to add a general answer regarding Windows freezing.

To test that it really is a freeze, look at the clock in Windows. It shows minutes, not seconds, so give it a few minutes to make sure the clock has stopped; then you know it's not a keyboard/mouse issue. An alternative test is to plug the keyboard/mouse in again, or into a different socket. If it is a keyboard/mouse issue, it's unlikely to affect both the keyboard and the mouse, so be sure to try both before concluding that it is a freeze. Once you've confirmed one freeze this way, you can be more confident that later ones are real freezes too. A freezing computer can be quite time-consuming to diagnose, so you want to be sure it really is freezing!

If you are getting freezes, one thing to do is to look at Event Viewer for clues.

Here's an image of the window: http://2.bp.blogspot.com/-vTooTxWDEpk/U7wCYWKZm3I/AAAAAAAACRI/Zpwe2sT-hwc/s1600/uptime002-eventlogVista.jpg

In Event Viewer you often want to expand "Windows Logs" and go to "System".

Look for red Xs and particularly any events that come at the same time as the crash.

Also look in Windows Logs > Application (select that on the left-hand side in Event Viewer). Again, look for events that came at the same time as the crash.
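If you export the logs, the "same time as the crash" filtering can be done with a few lines of Python. This is only a sketch under assumptions: it supposes the events have already been loaded into dictionaries with a "time" key holding a datetime, and the function name events_before_freeze and the 10-minute window are made up for the example.

```python
from datetime import datetime, timedelta


def events_before_freeze(events, freeze_time, window_minutes=10):
    """Return the events logged in the window leading up to the freeze, oldest first.

    `events` is a list of dicts that each have a "time" key (datetime).
    Events logged after freeze_time (e.g. on the next boot) are ignored.
    """
    start = freeze_time - timedelta(minutes=window_minutes)
    hits = [event for event in events if start <= event["time"] <= freeze_time]
    return sorted(hits, key=lambda event: event["time"])


# Made-up entries for illustration, not taken from any real log.
example_events = [
    {"time": datetime(2011, 7, 1, 19, 0), "level": "Warning", "source": "ExampleService"},
    {"time": datetime(2011, 7, 1, 20, 12), "level": "Error", "source": "ExampleDriver"},
]

for event in events_before_freeze(example_events, datetime(2011, 7, 1, 20, 15)):
    print(event["time"], event["level"], event["source"])
```

The freeze time itself has to come from you (for instance the stopped clock), since a hard freeze usually can't log its own timestamp.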

Make sure Windows is set to write a memory dump when there's a crash. Not a huge one (not a complete memory dump), but either a "small" or a "kernel" one. Kernel might be better than small, but if kernel turns out to be too big then use small.

There is a setting to tick for that: https://i.sstatic.net/aObZK.png You also see the path to the dump file there. As long as that's set, the next time it freezes it will make a dump file. You can zip it if it's a bit big, and then go here (they accept uploads of up to 40MB): http://www.osronline.com/page.cfm?name=analyze Click the "choose file" and "upload" buttons to upload it and see the results.

Another thing to do is to check the hard disk: A) look at the SMART data; B) run the hard drive manufacturer's tool for checking the drive. The software is often small, often runs in Windows (no need to make a bootable CD/USB) and often has options for quick tests (not needing hours).

Another thing one can try is testing the RAM. The easy way works if you have more than one RAM module, e.g. two. Run the computer with one RAM module and see if it freezes. Then try running it with just the other module. The slots are labelled DIMM0, DIMM1, DIMM2; you'd see the labels with a torch or in the motherboard manual. Just use the first slot, DIMM0, when you have one RAM module in there.

Another way, which is long and perhaps pointless, is running memtest86 overnight for 10 hours or so and seeing if it shows any errors. If it does, the thing is that it doesn't tell you which module is at fault, so you'd have to run it again with each module. And you'd have to make a bootable CD or USB for it.

So for RAM testing, the easier way is better: try running the computer with one module and then the other, rather than using memtest86.

barlop
  • 25,198