Welcome to Inkbunny...
LittleSypher

such frustrations


i just need to gripe for a minute

i've been dealing with constant problems with my desktop for what must have been several months at this point. Basically the system randomly closes programs or gives a BSOD (blue screen of death). Random faults each time. For a while it was very spaced out: long delays in between crashes, and then crashing repeatedly.

i'd run Linux out of a thumbdrive and then experienced problems with it not being able to boot *into* Linux in the first place. So naturally i concluded there was a hardware issue. i did a ton of testing and then finally ship of Theseus-ed it. Took out all memory sticks except for one, and tested them individually: still fucked. Took out the graphics card: still fucked. Replaced the motherboard: still fucked. Replaced the CPU.... and suddenly it was behaving.

My windows installation was fucked, and i was pretty sure what was going on was that the crashing caused software corruption. So then i replaced the NVME drive and reinstalled everything from scratch. It worked!!!!


........until it didn't. It worked fine for just over a week i think. Now it behaves mostly like it did before, where it will crash repeatedly, but then work fine for hours until the cycle begins again.

Just today i replaced the PSU and it's still fucked. i *can* boot into Linux from USB, and it doesn't crash there. However that might be because i'm not running anything stressful...

So my next hypothesis is that maybe there's a low level software issue. Most regular programs don't cause BSODs, and crashes will even happen before logging in, so whatever software issue it could be would be something that loads up before login. Maybe it could be some kind of incompatibility, but at this point i have no fucking idea.

i could do yet *another* full reinstallation, then try installing things one by one to see when it starts breaking. However if it takes over a week for problems to begin showing up that's going to be horrendous and render the machine basically unusable. It's still *borderline* unusable, but still. So instead i'm going to start uninstalling things.

this whole business has been consuming nearly the entirety of my attention. In particular it's fucking up my ability to draw. i have a spare tablet, but still.

so anyway.
i just needed to gripe.

i'm open to suggestions, tho it's always tricky because homebuilt systems are pretty idiosyncratic. However: before you suggest switching to Linux 100%, yes: i'm quite aware that's an option. Maybe it's in my future. idk

Viewed: 55 times
Added: 6 days, 19 hrs ago
 
LegendaryLycanthrope
6 days, 19 hrs ago
Have you installed a temperature monitor for your components? Things could be overheating.
LittleSypher
6 days, 18 hrs ago
yep, there's some built-in software that monitors various temperatures, and those were fine last time i checked. wouldn't hurt to double-check. However if i leave the computer off for hours for it to cool down, it can still crash right after startup

one thing that i haven't replaced yet is the liquid cpu cooler, but its display never goes above 45c. Could try reseating it. again x3
Pux
6 days, 18 hrs ago
So I know this could be a silly question...but could it be your power bar, or dirty power?

(Or a ghost even?)

Is the computer on its own circuit, because sometimes a heater kicking on could cause a drop in voltage enough to kick the computer off.

I asked my bf to help me with your conundrum.
LittleSypher
6 days, 18 hrs ago
it's going through a UPS, so seems like that should keep power stabilized
also no heaters have been running x3
Mabibabi
6 days, 17 hrs ago
It could possibly be a RAM problem, you can run a memory test to see if the RAM sticks are still good or not. That could be the reason why it works sometimes then suddenly crashes at a certain point.
LittleSypher
5 days, 1 hr ago
i ran memory checks a couple of times a while back. i did check each stick individually in my most recent round of tests; the probability of all four failing is extremely low. Even still i ran another Memtest86+ pass yesterday and still no failures...
ArielCelestia
6 days, 16 hrs ago
I would use the Windows Event Viewer to see what fails at each crash. Should help you troubleshoot a little.
LittleSypher
5 days, 1 hr ago
i was looking in there a while ago and didn't see anything that seemed useful. The critical errors are all about rebooting without a clean shutdown. Doesn't really say anything about why it shut down tho.


i do see a number of errors related to the indexer, so added a number of directories with high file counts to the exclude groups.
ArielCelestia
4 days, 8 hrs ago
Well, Google spat out this "To find the reason for a Windows shutdown, use the Event Viewer to examine the System log for specific event IDs like 1074 (user-initiated or application-caused shutdowns) or 41 (unexpected shutdown/power loss), as well as 6006 (clean shutdown). Open Event Viewer, navigate to Windows Logs > System, filter the log for these Event IDs, and then read the event details to understand the shutdown's cause. " Maybe that slop will come in handy somehow? Hope you can figure it out hon! <3
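For anyone who would rather pull those events from a script than click through Event Viewer, here is a minimal sketch. It assumes Windows' built-in `wevtutil` tool and only uses the three event IDs from the quote above; the function name and defaults are made up for illustration. Filtering on the ID alone may also match unrelated providers, so the event details still need to be read.

```python
import subprocess
import sys

# Event IDs quoted in the comment above: 41 (unexpected shutdown/power loss),
# 1074 (user- or application-initiated shutdown), 6006 (clean shutdown).
SHUTDOWN_EVENT_IDS = (41, 1074, 6006)


def build_wevtutil_command(event_ids=SHUTDOWN_EVENT_IDS, max_events=20):
    """Return the argument list for a wevtutil query against the System log.

    Hypothetical helper: the name and default values are illustrative only.
    """
    id_filter = " or ".join(f"EventID={event_id}" for event_id in event_ids)
    xpath = f"*[System[({id_filter})]]"
    return [
        "wevtutil", "qe", "System",
        f"/q:{xpath}",        # XPath filter on the event ID
        f"/c:{max_events}",   # cap the number of events returned
        "/rd:true",           # newest events first
        "/f:text",            # readable text instead of XML
    ]


if __name__ == "__main__":
    command = build_wevtutil_command()
    if sys.platform == "win32":
        # Only attempt the query on Windows, where wevtutil exists.
        print(subprocess.run(command, capture_output=True, text=True).stdout)
    else:
        print(" ".join(command))
```

Run from a terminal on the affected machine; an elevated prompt may be needed to read the System log.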
Tampa
6 days, 16 hrs ago
When you say you replaced the parts, do you mean for different parts or the same?

There are known hardware issues with Intel CPUs, with AMD and certain boards, and memory instability from certain memory settings.

If you suspect load is causing it then grab some benchmark tools and give those a whirl. Record a couple crashes and analyze the dumps Windows spits out to see if they have similar causes.

What's the current hardware you have? What bios version? Updated drivers?
LittleSypher
5 days, 1 hr ago
How would i go about "recording" a crash? i have tested the memory pretty extensively, as well as running benchmarks in the Linux environment. The drivers are all up-to-date, including the ones suggested by the motherboard software (beyond what the regular windows update suggests). It seems unlikely to be the BIOS, since when i replaced the motherboard that was updated too. There is another BIOS update which i could try, tho i'm nervous to attempt that in case it crashes at some dangerous point...

Now i *have* replaced the parts with the same ones as before. However it's hard to tell where a potential incompatibility might lie, given cpu/motherboard compatibility doesn't leave many options for variations. It can't be the graphics cards since crashes occur even without them installed. The only thing i haven't replaced are the memory sticks. One thing i want to try is using one from a different computer.

i'm not going to share my pcpartpicker list since it's got my old account name attached to it x3

Tampa
4 days, 20 hrs ago
Windows bluescreens create dump files that contain memory dumps among other information on what exactly it crashed on. These can be analyzed to see if there is some commonality between the crashes. Initially it sounded like cpu degradation of the Intel variety, but if you changed the cpu then it's less likely it is that.

Without having a list of parts it's hard to search for reports on incompatibility. Just copy and paste the names from the pcpartpicker then. There are some known incompatibilities between certain boards and cpu's. Last rig I built with a Gigabyte board would never post, but ran totally fine with an Asus one for example.

Knowing the parts and their firmware versions, have you checked if there are any reports on those breaking stuff? Generally I'd advise, outside of security or degradation issues, leaving hardware on the firmware that worked. Software should be updated yes, but there is also the old saying "never change a running system". A firmware update can fix one thing while adding an incompatibility with another or exposing some problem elsewhere. This has been hit and miss lately so you may have stumbled into that.

The stop codes you posted hint at memory, but they also hint at general system instability, which is usually either a board or cpu problem. With parts exchanged and no change in behavior that speaks more to incompatibility between parts or their firmware. It is possible the boot drive is bad as well, depending on which one you use. There has been some recent issue with a windows update and nvme firmware nuking drives, make sure it isn't that.

Regarding the dump files: there is software to decode these, but they aren't likely to give much information given the variety you see. I think your best option is exchanging board and cpu, perhaps even memory, for different brands or slightly different types that still work together. If that does not help I would normally suspect an issue with the power supply, but if you feel that's unlikely then it would be the boot drive next on the list. Grab a cheap hdd, install Win10 on it (because 11 is still a trainwreck) and see if it stabilizes.
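As a rough starting point for finding those dump files, the sketch below lists any `.dmp` files with their timestamps, newest first, so they can be matched against crash times. The two paths are the usual Windows defaults and are assumptions here; the actual location depends on the "Startup and Recovery" settings. The function name is made up for illustration.

```python
from datetime import datetime
from pathlib import Path

# Assumed default locations; the real paths depend on the system's
# "Startup and Recovery" settings and on where Windows is installed.
DEFAULT_DUMP_LOCATIONS = (
    Path(r"C:\Windows\Minidump"),
    Path(r"C:\Windows\MEMORY.DMP"),
)


def list_dumps(locations=DEFAULT_DUMP_LOCATIONS):
    """Return (modified_time, size_bytes, path) per dump file, newest first."""
    found = []
    for location in locations:
        if location.is_dir():
            candidates = list(location.glob("*.dmp"))
        elif location.is_file():
            candidates = [location]
        else:
            candidates = []  # location missing: skip it silently
        for path in candidates:
            info = path.stat()
            found.append((datetime.fromtimestamp(info.st_mtime), info.st_size, path))
    return sorted(found, key=lambda entry: entry[0], reverse=True)


if __name__ == "__main__":
    for modified, size, path in list_dumps():
        print(f"{modified:%Y-%m-%d %H:%M}  {size:>12,d} bytes  {path}")
```

This only locates the files. Decoding them still needs a debugger, for example opening a dump in WinDbg and running `!analyze -v`.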
MrMyxo
6 days, 13 hrs ago
the error on the blue screen is an indication of what is broken, so try to take a pic of it when it happens. it may be in the event viewer too, with more detail. there should also be a dump file of the crash, which can be useful but is hard to get info from.

most of the time i use the blue-screen error as a first way of seeing which area the problem is in.

it can also be any external device like a mouse or keyboard, or a combination of them.

good luck,
LittleSypher
5 days, 1 hr ago
heh oh man, we have a lot of stopcodes x3
here are some that i was able to catch:

* page fault in non-paged area 0x50
* system service exception 0x3b (with ntfs.sys and fltmgr.sys)
* kmode exception not handled 0x1e
* machine_check_exception 0x9c
* memory management 0x1a

These can be due to just about anything. The crashes have created some file corruption in the past and i'm guessing that's what the ntfs and fltmgr errors are about.

Some of these do generate dumps. Haven't gotten anything meaningful out of the logs. i dunno how to decode a .dmp file, but maybe that's something to try
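One low-effort way to look for a pattern in stop codes like these is to keep a running tally. The sketch below uses only the five codes listed above; the example log entries and the function name are hypothetical, not real crash data.

```python
from collections import Counter

# The five stop codes listed above, keyed by their bug check number.
STOP_CODES = {
    0x50: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x3B: "SYSTEM_SERVICE_EXCEPTION",
    0x1E: "KMODE_EXCEPTION_NOT_HANDLED",
    0x9C: "MACHINE_CHECK_EXCEPTION",
    0x1A: "MEMORY_MANAGEMENT",
}


def tally(crash_log):
    """Count stop codes in a list of (code, note) pairs, most frequent first."""
    counts = Counter(code for code, _note in crash_log)
    return [
        (f"0x{code:02X}", STOP_CODES.get(code, "UNKNOWN"), count)
        for code, count in counts.most_common()
    ]


if __name__ == "__main__":
    # Hypothetical entries for illustration only.
    example_log = [
        (0x50, "crashed before login"),
        (0x3B, "ntfs.sys named on screen"),
        (0x50, "crashed while idle"),
    ]
    for code, name, count in tally(example_log):
        print(f"{code}  {name:<30} x{count}")
```

Adding a note per crash (what was running, which driver was named) makes it easier to see whether one code or one driver keeps coming back.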
MrMyxo
5 days ago
Gotta catch them all. normally it's one bsod that always returns, so i think it's possible to use the fact that you get a lot of different ones.
one thing that hit me: do you have any more hard drives of any type in the pc? have you run the system without them if so?

the general info from all of the errors is: file corruption, memory, and driver/software. which sadly are all hard. a corrupt filesystem is possible from the other errors, since one of them is there to prevent damage to the system.

when you installed a new m.2 with new windows, was any extra antivirus installed? since a new windows with nothing on it should not bsod from software as long as all drivers have been updated. today it's possible to get almost all of them from windows update, depending on how the pc was built.

when testing you should have as few things connected as possible. no extra usb, disks, speakers... but i understand that it's hard to do if it takes a long time to happen.

next is whether you are doing something when it happens. specific loads like more ram use, cpu load and so on.

lolimouse did have an idea about the windows update that creates problems with the controllers of specific m.2 drives. i don't know much about it, other than that it's some specific controllers that cause problems.

hope any of this helps, since it's a hard problem and not an easy one to give help with. i'm just listing what i would check. for me a lot of my diagnostic skill comes from feeling, which i don't have here

which CPU do you have? intel 13th and 14th gen did have some faulty models that get worse over time.
lolimouse
5 days, 23 hrs ago
it sounds like it could just be that a recent windows update introduced some bug that arises with your particular hardware configuration. if you can, see if you can roll it back to a previous version. although im not sure if thats possible on windows 11.. fucking microsoft
LittleSypher
5 days, 1 hr ago
This could be a possibility.
But i don't have a snapshot of a previous state to rollback to :<

i think there are ways to revert, but probably what i would have to do to test this would be a fresh reinstall D:
bullubullu
2 days, 15 hrs ago
I can't help at all lol other than "re-install everything from the BIOS to the OS" , but what I do know is that certain mixes of components can cause issues, so maybe it would be helpful if you listed the specs (probably in a new journal for visibility) ^^