About a week ago my PC started exhibiting a weird series of behaviours,
which despite my best efforts I have been unable to pinpoint or explain.

It started with a black screen + freeze during games.
It was almost random...
Some games ran fine, while others got a black screen (my monitor shows a "No Signal" message).
Paradise Lost crashed several times, while Hogwarts Legacy, which is just as taxing graphically, worked fine.

Then, after one of the crashes, Windows stopped booting.
When I tried to log in, I just got the same black screen + freeze.
I entered "safe mode", was able to log in, and saw an error in the GPU drivers.
I downgraded them, and it seemed to help, as I was finally able to load Windows.

But the fix didn't last, as eventually the black screen + freeze happened again.
I figured my GPU drivers must have gotten corrupted.
So I tried a clean re-install, but got a "failed" message instead and ended up stuck without any GPU drivers.

So I figured my Windows installation must have gotten corrupted somehow.
I tried to "fix" it, but got an error message saying it couldn't be fixed.
So I decided it may be time for a clean re-install of Windows.
I tried to do that (from within Windows), but again got an error.
I tried to boot from a Windows install USB stick, but again got a "failed" message.
Eventually I needed to delete the entire partition, disconnect all my other HD, and only then I was able to install Windows from scratch.

I figured my problems were over, as there could not be any more driver/compatibility issues on a clean Windows install with a clean driver install.
However, when I tried to run games, or even put Steam into "Big Picture" mode, I got the same black screen + freeze as before.

So I realized it must be a HW issue.
Probably with the graphics card - after all, it's the one responsible for showing images and running games.
So I tried swapping graphics cards with another PC.
But my card worked perfectly in the other PC, while my PC had the same issue with the other graphics card.

So that's where I'm at right now.
My suspicion is the motherboard,
as it's responsible for everything in the PC, including communication with the graphics card.
Upgrading the BIOS to the latest version doesn't seem to have had any effect.

And it seems kind of weird to me that everything else works fine.
But when I run anything even slightly graphically taxing (even just putting Steam in Big Picture mode), suddenly I get a failure.
The motherboard is supposed to be the nervous system of the PC - it holds the communication bus between the components.
I would expect everything to shut down in case of an issue with the motherboard...

In any case, I would love to hear if anyone has encountered similar behaviour in the past and can help shed some light on what the cause might be...

EDIT:
Sorry if it wasn't clear from my post - this is not a new setup.
It 100% used to work up until a week ago.
The last time I swapped something was a couple of months ago, when I got a new PSU.
And everything worked great (obviously) up until a week ago.
And before that, I swapped my GPU 2 years ago.

If it's important, my setup is:
CPU: Ryzen 5 2600
Mobo: MSI B450-A Pro
RAM: 2 x 8GB G.SKILL Sniper X
GPU: Nvidia RTX 3070 Ti
PSU: CoolerMaster 750W Gold - V2
OS: Windows 10 Home 64-bit
Disk 1: NVMe 1TB
Disk 2: SSD (SATA) - 1TB
Disk 3: HDD (SATA) - 1 TB

The other PC, which I used to test my graphics card, has an Intel CPU, 16GB RAM, a GTX 1070 and an EVGA 750W GQ PSU, and runs Windows 10 Pro.
I put the GTX 1070 into my PC - but got the same result as with my RTX 3070 Ti.

EDIT2:
Removed 1 RAM stick & disconnected CD-ROM and 2 hard drives (left only NVMe one) - everything seems to work fine.
Returned 2nd RAM stick - still fine.
Connected CD-ROM - still fine
Connected one disk - still fine
Connected the other disk - still fine
Ran CPU stress test - all fine
Ran GPU stress test - all fine
Ran memtest86 - It found 0 errors
Ran Hogwarts Legacy - all working fine.

I'm even more baffled than before...

1 month ago*

Since it worked on the other PC, it would be helpful to have a HW and OS list, maybe even DXDIAG of both PCs.

1 month ago

Added HW & OS list to the post

1 month ago

A similar thing happened to me a few years ago. When I checked the task manager, there were some unknown background programs taking too much CPU and hard disk. An old friend who was a software engineer detected crypto malware using my PC's resources for mining. He swapped the HDD and then everything was back to normal. In my experience, these freezes/black screens/blue screens of death are 80% caused by faulty RAM/GPU or by some malware or trojan eating PC resources.

1 month ago

As I mentioned - I reinstalled my Windows from scratch.
So no chance of malware...

1 month ago

I believe there are quite a few types of malware that can survive not only a Windows reinstall, but even a full HDD reformat. Though my gut feeling is that it's much more likely some kind of hardware problem - most likely the HDD, but possibly also RAM, GPU, CPU or yet another component.

1 month ago*

Try another PSU?

1 month ago

I can.
But I must say, the PSU is very new, and from a good company: CoolerMaster 750W Gold - V2

1 month ago

After reading your edit, I would say the PSU is the most likely culprit.

Bad power supplies from good companies are rare, but they do happen. Had it happen to me years ago with a higher-end model from Corsair.

1 month ago

My guess also would be the PSU.

1 month ago

disconnect all my other HD, and only then I was able to install Windows from scratch.

Test all your HDDs and check their SMART values. Freezes like these are typical of defective hard drives.

Free software to check SMART: https://crystalmark.info/en/software/crystaldiskinfo/
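
For anyone who would rather script the check than read the values by eye, here is a minimal Python sketch. It assumes you have already copied the raw values of a few SMART attributes from a tool such as CrystalDiskInfo into a dictionary. The function name `flag_smart_attributes`, the input format and the "any non-zero raw value is suspicious" rule are illustrative assumptions, not vendor thresholds.

```python
# SMART attribute IDs commonly watched on SATA drives (ID -> usual name).
WATCHED_ATTRIBUTES = {
    5: "Reallocated Sectors Count",
    197: "Current Pending Sector Count",
    198: "Uncorrectable Sector Count",
}


def flag_smart_attributes(raw_values):
    """Return warnings for watched attributes whose raw value is non-zero.

    raw_values maps SMART attribute ID -> raw value, typed in by hand
    from whatever SMART tool you use (hypothetical input format).
    """
    warnings = []
    for attr_id, name in WATCHED_ATTRIBUTES.items():
        value = raw_values.get(attr_id, 0)
        if value > 0:
            warnings.append(f"{name} (ID {attr_id}) = {value}")
    return warnings


# Example with made-up numbers: one healthy drive, one suspicious drive.
healthy = {5: 0, 197: 0, 198: 0}
suspicious = {5: 12, 197: 3, 198: 0}
print(flag_smart_attributes(healthy))
print(flag_smart_attributes(suspicious))
```

An empty list only means these three counters are clean; it does not prove the drive is healthy. NVMe drives report a different set of health fields, so this applies to the SATA disks only.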

1 month ago

Firstly, like @orono mentioned, check your disks' SMART data. If that's not it, my second guess would be hardware acceleration. Check if it's enabled for Steam. If so, disable it. Less likely, but it could also be the RAM: either use memtest86+ or try unplugging the sticks one by one to see if one of them is defective.

1 month ago

Just ran memtest86 - It found 0 errors

1 month ago

So only the PSU and motherboard are left to suspect, I guess. Maybe some cables got in the way of the fans and the fans worked them loose.

1 month ago

If you're using an AMD processor, try fixing the processor clock speed and voltage in your BIOS to the base clock speed for your CPU and see if it still happens. That fixed a similar black screen / data corruption problem for me. In my case the voltage needed to be 1.2 V with no boosting, or the data-corrupting crashes would happen. Then run sfc /scannow in a command prompt opened as administrator to fix the data corruption. Before doing this I replaced everything except the CPU, motherboard and power supply, and it was still happening. I only knew to try this because I ran HWiNFO after putting all the old components back and noticed that my processor would boost to 5050 MHz before a crash, when the default boost was supposed to be lower than that.
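
The clock check described here can be sketched in a few lines of Python, assuming you have logged clock readings (for example by exporting them from a monitoring tool) into a plain list of MHz values. The function name, the sample numbers, the 4900 MHz rated boost and the 1% tolerance are all made up for illustration; the real rated boost has to come from your CPU's spec sheet.

```python
def find_overboost(samples_mhz, rated_boost_mhz, tolerance=0.01):
    """Return (index, MHz) pairs for readings above the rated boost clock.

    tolerance is a fractional margin for sensor jitter (an assumption).
    """
    limit = rated_boost_mhz * (1 + tolerance)
    return [(i, mhz) for i, mhz in enumerate(samples_mhz) if mhz > limit]


# Hypothetical log: mostly normal readings, one spike shortly before a crash.
log = [3400, 4650, 4875, 5050, 3400]
print(find_overboost(log, rated_boost_mhz=4900))
```

A spike in the log is only a hint that boosting may be involved. Confirming it still means pinning the clock in the BIOS and seeing whether the crashes stop.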

1 month ago

Not sure how my clock speeds could have been wrong for 6 years, and only started crashing now...

1 month ago

My 5950X worked fine on default settings, without any of that, for about a year or so; then the problems started after a certain Windows update... but rolling back the update and all the other stuff I already talked about didn't help. I run it at 3.9 GHz and 1.2 V now, with no more data-corrupting crashes.

1 month ago

Eventually I needed to delete the entire partition, disconnect all my other HD, and only then I was able to install Windows from scratch.

That is usually a sign of the PSU failing to deliver enough or stable current to the system. As you decrease the load on the PSU by disconnecting parts of the computer (HDDs, GPU, fans,...), the system gets more stable.

It could also be the MB, small things like a failing capacitor can lead to random behaviors like the ones you describe.

I'd rule out the GPU since it crashed on "simple games" but worked fine with intensive-gfx games like Hogwarts Legacy.

If you are comfortable with it, you could try booting from a Linux distro and running some diagnostics. Ubuntu includes memtest86+ as a boot option, so you can start by testing the memory. There are also many stress tools to check the CPU and GPU, as well as the hard disks. It will also give you better logs of what's happening, unless it's a total freeze where the system hangs and becomes unresponsive.

1 month ago

I'd rule out the GPU since it crashed on "simple games" but worked fine with intensive-gfx games like Hogwarts Legacy.

This is not exactly true.
It did work well on simple games (like Space Crew).
"Paradise Lost" specifically is not an AAA game, but it runs on Unreal Engine 4 and is notoriously unoptimized.
So I wouldn't be surprised if it draws more Watts / GPU power than Hogwarts Legacy.

1 month ago

Just ran memtest86 - It found 0 errors

1 month ago

Since you could run Hogwarts without a crash, I'd rule out the GPU. My money is on RAM or disk failure. Try running your PC with only one RAM stick, in different RAM slots, and see if it crashes. Do this with each stick to eliminate this possible point of failure. Check the S.M.A.R.T. values of your storage drives with e.g. HWiNFO64. Run memtest to check RAM reliability. In the motherboard BIOS/UEFI, reset all settings to default. Check that your CPU is seated correctly in its socket and check for overheating. If available, try another PSU.

I hope you understand that I suggest doing these steps one by one, not all at once.
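
The one-change-at-a-time idea can be written down as a tiny Python sketch. Everything in it is hypothetical: the function name, the component labels, and the `is_stable` callback, which in real life is you running a game or stress test by hand after each change.

```python
def first_unstable_step(components, is_stable):
    """Add components one at a time and return the first that breaks stability.

    components: names in the order they are reconnected.
    is_stable: callback that receives the list of currently connected
               components and returns True if the system stayed stable.
    Returns the name of the component whose addition caused the first
    failure, or None if every step stayed stable.
    """
    connected = []
    for part in components:
        connected.append(part)
        if not is_stable(list(connected)):
            return part
    return None


# Hypothetical run in which the second SATA disk triggers the crashes.
order = ["RAM stick 2", "CD-ROM", "SATA SSD", "SATA HDD"]
culprit = first_unstable_step(order, lambda parts: "SATA HDD" not in parts)
print(culprit)
```

If every step passes, the function returns `None`, which matches the case where simply reseating everything made the problem go away.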

Your motherboard could have crapped out, but I have never experienced that myself and from what I know it's rather rare.

Edit: generally it's always helpful to provide us with a short summary of specs. If you're running a 4090 with a 500 Watt PSU we could pinpoint the issue immediately.

1 month ago*

Sorry - added details of the system to the post

1 month ago

Just checked this site https://pangoly.com/en/browse/motherboard/msi for RAM compatibility with your motherboard. "G.SKILL Sniper X" alone isn't enough, though, as the exact model number is necessary. Check for yourself whether your RAM is on the compatibility list for your mobo and CPU (it probably is, but just to make sure).
Edit: if you want to dive deeper into the art of pc failure analysis I can recommend Greg Salazar's vids on YT.

1 month ago*

Checked the list - it's compatible.
Also - this motherboard has been running with this memory for 6 years now.
I think I would have noticed if it wasn't compatible.

1 month ago

First, make sure that you are not overclocking any part of your system before declaring any hardware at fault. Second, this may seem obvious, but it's worth asking to rule it out: are you actually connecting your monitor to the GPU's video ports rather than the motherboard's on-board video port? If you are using the motherboard ports, then switch to the HDMI ports on the GPU itself. If you were using the proper GPU ports, then I would suggest testing the power supply as others have mentioned. If your power supply is insufficient or failing, it would explain why you have problems with different GPUs and why those GPUs work fine in a different PC with a working PSU. Swapping the PSU can be kind of annoying, but pulling the power supply from the working PC and temporarily hooking it up to the one that has issues will allow you to determine whether it is actually the PSU or the motherboard.

1 month ago

  1. No, I never overclock
  2. Connected directly to GPU
  3. I also tried my machine with GTX 1070, which is much weaker than my 3070 Ti - but got the same result.
    (not sure if that's enough to rule out PSU as the problem)
1 month ago

Some hardware comes pre-overclocked by the manufacturer as a 'feature'. Worth double-checking that your GPU and CPU are running at correct speeds.

If your motherboard does have on-board video, give it a shot (with GPU completely unpowered and unplugged) and see if it can run anything on low settings, just to test if the issue persists. If it doesn't occur that doesn't tell you much, but if it does that is another step to ruling out the GPU as the problem.

The 3070 draws significantly more power than the 1070, but consider if the PSU is actually the problem then it may be unable to supply even the smaller amount the 1070 needs to run a game. Swapping the PSU will tell you for sure if that is the problem, though that is a huge hassle so you may want to try all other suggestions in this thread first.
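
To put rough numbers on that, here is a minimal Python sketch of a power budget. The wattages are nominal spec-sheet figures as far as I know them (65 W TDP for the Ryzen 5 2600, about 290 W board power for the RTX 3070 Ti, about 150 W for the GTX 1070), and the 75 W "rest of system" entry is a plain guess, so treat the output as an estimate only. The function name is hypothetical.

```python
def psu_headroom(psu_watts, component_watts):
    """Return (total nominal draw, remaining headroom) in watts."""
    total = sum(component_watts.values())
    return total, psu_watts - total


# Nominal figures; the "rest of system" number is a rough guess.
with_3070_ti = {
    "Ryzen 5 2600 (TDP)": 65,
    "RTX 3070 Ti (board power)": 290,
    "rest of system (guess)": 75,
}
with_1070 = {
    "Ryzen 5 2600 (TDP)": 65,
    "GTX 1070 (board power)": 150,
    "rest of system (guess)": 75,
}

for label, parts in (("3070 Ti", with_3070_ti), ("1070", with_1070)):
    total, headroom = psu_headroom(750, parts)
    print(f"{label}: ~{total} W nominal, ~{headroom} W headroom")
```

On paper both configurations leave plenty of headroom on a healthy 750 W unit, which is why crashes with both cards would point towards a fault (PSU, cabling or board) rather than an undersized PSU. Nominal figures ignore short transient spikes, which can be noticeably higher.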

One thing to check before doing that: are you using the same monitor for all this? If the GPU is actually outputting a signal that your display can't handle, then the monitor might just go blank and display "No Signal" rather than some error message such as "Video not supported". This could be if you are trying to display at a refresh rate or resolution that is not supported. Windows itself can have different resolution & refresh rates than video games, so that could explain why it sometimes works.

1 month ago

It's not a matter of a video signal - as the PC actually freezes and eventually reboots itself.

1 month ago

GTX 1070, which is much weaker

still eats 145 watts, still the most of any single component

everything else works fine.

attempted a full cpu load test?

last time I swapped something was a couple of months ago, when I got a new PSU.

That alone makes it the prime target. Weird symptoms as well. Can always break even if new. Can check voltages but they might only spike erratically. What happened with the previous one?

My rec also is to check with a different PSU.

I was going to be leaving a positive review... until it broke my PC. Twice.

no, stop. Software is incapable of breaking hardware. It can trigger the failure, but hw was broken at that point already.

1 month ago

no, stop. Software is incapable of breaking hardware. It can trigger the failure, but hw was broken at that point already.

Tell that to the programmers of Stuxnet.

1 month ago

not sure if you are serious, joking or trolling, and whether to entertain this further..

Aside from that being an entirely different field anyway:

Stuxnet specifically targets programmable logic controllers (PLCs), which allow the automation of electromechanical processes such as those used to control machinery and industrial processes

It basically reprogrammed hardware controllers, not PCs.

1 month ago

No trolling, basically it's not very different to malware residing in the BIOS chip of a computer which can't be removed by wiping disks and re-installing the OS.

1 month ago

Stuxnet didn't actually do any damage to PCs

malware residing in the BIOS chip

is not damaging hardware physically.

Anyway this is derailing the topic, obviously my statement was in regards to regular programs/games.

1 month ago

Memtest

1 month ago

As someone who spent 6 years repairing this stuff, this looks a LOT like a RAM issue, where most likely one of the sticks is failing. As was already mentioned above, you can use the memtest86 utility to test the memory sticks, either both together or separately, for errors.

1 month ago

Just ran memtest86 - It found 0 errors

1 month ago

I once had something similar, i.e. unexplained crashes without any apparent reason.
Check that your SATA cables are properly plugged in on both ends.

1 month ago

My primary disk is in an NVMe slot, so there are no cables there.

1 month ago

Do you have any other disks connected by SATA?
For me it was a secondary disk that caused the problem as well.

1 month ago

It could be a broken motherboard. It's basically the brains of a computer (together with its CPU), and faulty brains can send all sorts of wrong signals to other body parts, making them stop working, although the fault can also be in a body part itself. I had that happen before (but it was 20 years ago, and I always had my dad's PC to swap parts with), and it was hard to figure out that it was the motherboard.

There is no harm in trying to replace the PSU or the motherboard if you have a spare one nearby. Granted, with old computers that was much easier to do than with new ones. I never had problems fixing things, but I wouldn't have a clue how to disassemble mine now: it's all so built in (prebuilt), my case weighs 25kg, I have no driver's license to get to a repair shop, AND I no longer have replacement parts at hand, so I have to knock on wood there.

1 month ago*

Have you looked at the event viewer after these crashes? There were a couple of infamous instances where GPU timings (or something similar) were the culprit, usually related to specific versions of GPU driver.
Check the event viewer and see what exactly the error/issue is when system is frozen.
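
If clicking through the Event Viewer is tedious, the same lookup can be done from the command line. Below is a minimal Python sketch that only builds the command, assuming the standard Windows `wevtutil` tool is available; the helper name `build_event_query` is hypothetical. Event ID 41 (Kernel-Power) is what Windows typically logs after the system restarts without shutting down cleanly.

```python
def build_event_query(event_id, count=5, log="System"):
    """Build a wevtutil command line for the newest matching events.

    Returns an argument list suitable for subprocess.run on Windows.
    """
    return [
        "wevtutil", "qe", log,
        f"/q:*[System[(EventID={event_id})]]",  # XPath filter on the event ID
        f"/c:{count}",                          # at most this many events
        "/rd:true",                             # newest first
        "/f:text",                              # human-readable output
    ]


# Show the command that would list the five most recent unclean restarts.
print(" ".join(build_event_query(41)))
```

Event 41 by itself only confirms that the machine went down uncleanly. The entries logged just before it (display driver or hardware error events, for example) are usually the more informative ones.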

1 month ago

But he made a clean windows install, unless he used the same latest driver and not a previous one that worked.

1 month ago

When I had this issue, it was across multiple driver versions.
Also, as far as I remember, it's somehow related (I don't know the exact details) to the motherboard's CMOS cell as well.

1 month ago

Sometimes popping the CMOS battery out and back in can resolve issues by resetting the PC's "memory".

1 month ago

Have you checked the RAM? Maybe the problem is there.

"The last time I swapped something was a couple of months ago, when I got a new PSU" - well, there is an obvious suspect.

1 month ago

Yep, check the RAM: swap sticks and boot once with each stick on its own.

1 month ago

See my comment below

1 month ago

Removed 1 RAM stick & disconnected CD-ROM and 2 hard drives (left only NVMe one) - everything seems to work fine.
Returned 2nd RAM stick - still fine.
Connected CD-ROM - still fine
Connected one disk - still fine
Connected the other disk - still fine
Ran CPU stress test - all fine
Ran GPU stress test - all fine
Ran Hogwarts Legacy - all working fine.

I'm even more baffled than before...

1 month ago

Hope it stays that way. Sometimes disassembling/reassembling can fix a bad connection. Clean the GPU's PCIe connector and the RAM contacts with isopropyl alcohol. Look for anything that could cause a short and double-check that all connectors and cables are firmly seated. It wouldn't hurt to run monitoring software in the background to spot issues before they get critical.

1 month ago

As hbarkas said, maybe a bad connection. It could also be some moisture that corroded something somewhere, so it acts up only sometimes.

1 month ago

Long ago I kept a friend's PC in shape whenever it misbehaved. Once we had a similar problem. We tried different approaches, and after several failed attempts we ended up replacing the motherboard, and that was the issue. New motherboard, no issues. We would have returned the new motherboard if it hadn't helped, ofc.

Visually check all capacitors (well, I think that's the name of the cylindrical things on the MB) to see if any are strangely swollen. That's a sign that it's the MB.

That's just one experience, which might or might not be the same as your problem. But if you have a nearby good shop with good prices and good return policy, you can try several things "for free".

Good luck!

1 month ago

Yes, capacitors.

1 month ago

For me it's hard to help others, as my mind almost always goes blank, but when it comes to my own system I always know/understand what the issue could be. I'm not really sure what to suggest. If you had a blue screen it would be simpler, as you could check the "dump", though I think saving it is off by default. It could help diagnose some things, as googling its contents could tell you whether it's a specific driver or a hardware issue. If you want, I could at least write a small guide on how to enable that functionality and where to check whether a blue screen of death "dump" was saved when you get a black screen/freeze...
And also a "fun" story about a problem I had... I had occasional BSODs and didn't really know the reason... I reinstalled Windows 3 times and still had the same issue... Then I decided to check what was up (using BSOD dumps and a program to read them) and it turned out the problem was a DS4Windows driver (I used that program for a DualShock controller, and somehow that driver survived the Windows reinstalls, even with the drive formatted)... I reinstalled that driver and have had no problems since... Except dead RAM, but at least that was an upgrade to 32GB :)

1 month ago*

Loose SATA cable?

1 month ago

(Reading your last edit) Sure sounds like it could have been just a loose connection someplace. I think temperature changes alone can lead to this at times, especially if it wasn't connected too well to start with. Lucky you, if it stays this way :)

1 month ago*

Loose connector or a connection compromised by dust.

1 month ago

"It started when it started to get black screen + freeze during games.
It was almost random...
Some games it ran fine. While others got a black screen. (My monitor shows "No Signal" message)"

I had the same thing and I was NOT able to REALLY pinpoint the source through trial and error with multiple PC parts.

I basically picked apart 3 PCs to test this issue. I checked everything and cleaned my PC and all its parts. I thought it was a graphics card problem, as the card was getting hot in some games, so I put my PC into a bigger case with better airflow and more fans, tested it, and it was fine on other motherboards.

In the end I replaced the motherboard with a different one, and it looks like this fixed the problem.

What is strange is that the motherboard was not broken in any way: it is now sitting in my father's PC working just fine, and since he is not a gamer, the problems I experienced have never happened to him.

I think some motherboard + graphics card combinations have this strange problem communicating with each other or something, BUT I can't be sure.

1 month ago

From a philosophical POV it amazes me that such delicate devices operate as intended most of the time while they have so many points of possible failure. Modern electronics follow the rules of quantum physics where our concept of causality doesn't apply.

1 month ago

From a sociological POV I just wish I could speak to them like to humans and ask them nicely to work and make my life easier.

1 month ago

With AI cores in every new device this is not too far off.

1 month ago

Well, they will tell us to fuck off, before exterminating us.

1 month ago