this post was submitted on 11 Jan 2024

Hi everyone,

ever since I switched to Arch about two months ago, most applications have been segfaulting multiple times a day. There doesn't seem to be any pattern to the crashes; sometimes they even happen while the system is basically idle (e.g. while I'm reading a news article).

Things I've tried without any luck so far:

  • Running Firefox in safe mode without any extensions
  • Switching from the regular to the LTS kernel
  • Disabling hardware acceleration in Firefox
  • Changing RAM speed and timings
  • Running Memtest (it passed without errors)
  • Replacing the entire RAM with a new certified kit
  • Using only a single RAM slot
  • Applying Ryzen fixes (iommu=soft, limiting C-states)
  • Using only a single CPU core (maxcpus=1)
  • Downgrading the Nvidia driver to 535xx
  • Using Nouveau instead of the nvidia driver
  • Using Openbox instead of KDE
  • Disabling zswap and THP
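
For anyone retracing these steps: kernel parameters such as iommu=soft and maxcpus=1 only help if they actually reach the running kernel, so it can be worth comparing /proc/cmdline against what was intended. A rough Python sketch follows; the function names and the sample command line are made up for illustration, and values containing quoted spaces are not handled.

```python
from pathlib import Path


def parse_cmdline(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}; bare flags map to True."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else True
    return params


def missing_params(cmdline: str, expected: dict) -> list:
    """Return the expected parameter names that are absent or set differently."""
    active = parse_cmdline(cmdline)
    return [name for name, value in expected.items() if active.get(name) != value]


# Hypothetical command line, NOT taken from the affected machine.
SAMPLE_CMDLINE = "root=/dev/nvme0n1p2 rw iommu=soft"

# The two parameters named in the list above.
EXPECTED = {"iommu": "soft", "maxcpus": "1"}

if __name__ == "__main__":
    proc = Path("/proc/cmdline")
    # Fall back to the sample when not running on Linux.
    cmdline = proc.read_text() if proc.exists() else SAMPLE_CMDLINE
    print("not active:", missing_params(cmdline, EXPECTED))
```

An empty list means both parameters are in effect for the current boot; anything listed was probably lost in the bootloader configuration.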

Here's full journalctl from a day where both Spotify and Firefox crashed at the end, a few seconds after each other:

https://pastebin.com/BH0LMnD9
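
A log like this can also be tallied mechanically to check whether the crashes really have no pattern. The sketch below is only an illustration: it assumes the lines come from journalctl in its default short format (with a "kernel:" prefix) and that the kernel prints its usual "segfault at ..." report, which on recent kernels ends with a "likely on CPU N" hint. The sample lines are invented, not copied from the pastebin.

```python
import re
from collections import Counter

# Kernel segfault report as it appears in journalctl's short output, e.g.
#   host kernel: firefox[2101]: segfault at 0 ip ... sp ... error 4 in libxul.so[...]
# The trailing "likely on CPU N" part is optional (older kernels omit it).
SEGFAULT_RE = re.compile(
    r"kernel: (?P<proc>.+?)\[(?P<pid>\d+)\]: segfault at [0-9a-f]+"
    r".*? error (?P<err>\d+)"
    r"(?: in (?P<obj>[^\s\[]+))?"
    r"(?:.*? likely on CPU (?P<cpu>\d+))?"
)

# Invented sample lines for illustration only.
SAMPLE = [
    "Jan 11 20:15:01 arch kernel: firefox[2101]: segfault at 0 ip 00007f1c2a3b4c5d "
    "sp 00007ffd1a2b3c40 error 4 in libxul.so[7f1c28000000+5e00000] "
    "likely on CPU 3 (core 3, socket 0)",
    "Jan 11 20:15:07 arch kernel: spotify[3388]: segfault at 18 ip 000055d0c0ffee00 "
    "sp 00007ffe0badc0d0 error 4 in spotify[55d0c0000000+2000000] "
    "likely on CPU 3 (core 3, socket 0)",
    "Jan 11 20:16:00 arch systemd[1]: Started an unrelated unit.",
]


def tally_segfaults(lines):
    """Count segfault reports per process name and per reported CPU."""
    by_proc, by_cpu = Counter(), Counter()
    for line in lines:
        m = SEGFAULT_RE.search(line)
        if m is None:
            continue
        by_proc[m["proc"]] += 1
        if m["cpu"] is not None:
            by_cpu[int(m["cpu"])] += 1
    return by_proc, by_cpu


if __name__ == "__main__":
    procs, cpus = tally_segfaults(SAMPLE)
    print("per process:", dict(procs))
    print("per CPU:", dict(cpus))
```

To run it on real data, pass the lines of `journalctl -k -b` to `tally_segfaults` instead of `SAMPLE`. Crashes spread across many unrelated processes point away from a single buggy application; if one CPU number dominated, that core would be worth a closer look.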

Some more info about my system:

  • Ryzen 5 3600X
  • MSI B450M PRO-VDH Max
  • 32GB RAM @ 3200MHz
  • Geforce RTX 2070 SUPER (using nvidia-dkms)
  • Plasma 5.27.10 on X11

I'm pretty sure that it's not hardware related, because I've booted up a Debian 12 live image where everything ran for several hours without a crash. But it seems to be Arch related, as I also booted up a fresh EndeavourOS live image (so basically Arch), where applications also randomly segfaulted. Any idea why everything works fine on Debian but not on Arch? Debian uses the 6.1 kernel, which I already tried, so that's not it.

Let me know if you need any more information that might help solve this issue. Thanks!

Edit [solved]: It looks like disabling PBO in the UEFI/BIOS did the trick. The strange thing is that after enabling it again, the crashes still haven't come back. Someone suspected that the motherboard's default/training settings were faulty, so I guess this was a very rare case. That's probably why it took so long to find a solution. Thanks everyone for helping me out!

[–] mmstick@lemmy.world 2 points 9 months ago* (last edited 9 months ago) (1 children)

It's difficult to say for certain what the issue is without trial and error. I would expect the motherboard's manufacturer to make sure that their board passes all tests at the standard JEDEC spec for DDR4 (2133 MHz).

Since you say that you've tried different RAM kits, another possibility is that the power coming from the power supply isn't clean. Perhaps there is intermittent voltage droop, and you need to experiment with the Load Line Calibration settings to compensate for vdroop between idle and load. Disabling frequency boosting and manually setting the CPU frequency could help check whether it's related to that. PBO curves might be undervolting too much while idle.
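
As a quick software-side check before going into the UEFI, the kernel's own boost switch can be inspected. This is a sketch under assumptions: the sysfs file below is exposed by the acpi-cpufreq driver and may be absent with other cpufreq drivers or kernel versions, and it controls the kernel's boost toggle only, not the PBO setting in the firmware. The function name is made up for illustration.

```python
from pathlib import Path
from typing import Optional

# Assumed location of the global boost switch (present with acpi-cpufreq;
# other drivers may not expose it).
BOOST_PATH = Path("/sys/devices/system/cpu/cpufreq/boost")


def boost_enabled(path: Path = BOOST_PATH) -> Optional[bool]:
    """Return True/False for the boost state, or None if it cannot be read."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None


if __name__ == "__main__":
    state = boost_enabled()
    if state is None:
        print("boost switch not available on this system")
    else:
        print("frequency boost is", "on" if state else "off")
```

Where the file exists, writing 0 to it as root turns boosting off until the next reboot, which makes for an easy test of whether the crashes follow boost behaviour.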

[–] NoisyFlake@lemm.ee 3 points 9 months ago (1 children)

I'm a bit speechless right now. I've disabled PBO and haven't had a single crash since then; everything's been running fine for hours. Just to make sure that this really was the issue, I've enabled PBO again, but I still haven't experienced any crashes in the last few hours. I have no idea how simply disabling and then re-enabling the feature fixed my issue, but for now it seems like all is well.

Do you have any explanation for this weird behavior?

Anyway, thank you very much for your suggestion, looks like this actually did the trick!

[–] mmstick@lemmy.world 3 points 9 months ago* (last edited 9 months ago)

Sounds like voltage droop and/or a motherboard with faulty automatic "training" settings. I don't recall if the Ryzen 3000 had custom PBO curves, but if it does, tweaking them can fix it. Alternatively, slightly raising LLC and the SOC and CPU voltages could help. That said, I've had my most stable overclock by disabling PBO entirely and using a manual CPU multiplier.