PC with Ryzen 7950x shuts down when using Progressive rendering or IPR

  • PC with Ryzen 7950x shuts down when using Progressive rendering or IPR

    Hi,

    We got a brand new PC with a Ryzen 9 7950X, 2x32GB DDR5-5600 RAM, a Gigabyte B650E Aorus Master motherboard and a 1200W Be Quiet Dark Power Pro 12 power supply. The CPU is cooled by an Arctic Liquid Freezer II 360 AIO liquid cooler.

    Unfortunately, we're experiencing PC shutdowns when using IPR or when doing production renders with V-Ray's Progressive renderer. The PC shuts down within a minute of starting IPR or a Progressive production render. It doesn't overheat: I am monitoring all the temperatures with HWiNFO64, and the CPU doesn't reach the 95°C throttle point when the shutdowns happen; it hovers around 87-89°C at most. I tried interactive rendering with the Corona trial and the same shutdowns happen there. What's curious is that I can run Cinebench R23 on loop for 10 minutes without any shutdowns, with the CPU reaching 95°C and staying there for the whole duration of the test. Also, the PC doesn't seem to shut down if I disable Core Performance Boost in the BIOS (this forces the CPU to run only at its base clock, which is 4.5 GHz for the 7950X).

    Any ideas what might be wrong?
    Last edited by Alex_M; 14-12-2022, 01:43 PM.
    Aleksandar Mitov
    www.renarvisuals.com
    office@renarvisuals.com

    3ds Max 2023.2.2 + Vray 7 Hotfix 1
    AMD Ryzen 9 9950X 16-core
    96GB DDR5
    GeForce RTX 3090 24GB + GPU Driver 566.14

  • #2
    It's likely that your system is indeed overheating. Did you assemble the system yourself? It could be that the thermal paste is not properly applied. I had a similar issue with the Threadripper 3990X (128 threads). In the end I had to reinstall Windows, because I think whoever built the system initially did not install the correct AMD system drivers for that particular CPU, and that caused the system to overheat. Until I reinstalled Windows I did not realize this was the issue, so it may be worth checking that the correct system drivers are installed for your specific CPU.

    Turning off the core boost simply means the CPU isn't computing as much as it can, so it doesn't heat up as much, but if you let it run for a long time it may still overheat.
    Dmitry Vinnik
    Silhouette Images Inc.
    ShowReel:
    https://www.youtube.com/watch?v=qxSJlvSwAhA
    https://www.linkedin.com/in/dmitry-v...-identity-name



    • #3
      Hi, Morbid Angel. I assembled the system myself. As I mentioned in my first post, the system does not overheat. I am monitoring the temperatures with HWiNFO64 and they all seem to be within reason. Thanks for the suggestion about the drivers. I will check.
      Aleksandar Mitov


      • #4
        Agreed. I have a 7950X with an Arctic cooler here: 82°C is the average during heavy loads, with short peaks slightly below 90°C. It could also be that the memory EXPO frequency is set too high. With 128GB I am rock stable at 4400MHz and get rare crashes under load at 4800MHz; 64GB was OK at 6000MHz. Try some lower values for comparison.
        Last edited by filip; 14-12-2022, 03:11 PM.



        • #5
          DDR5 RAM, eh? That could be the reason. I would suggest swapping out the RAM to see if it makes things any better. I mean, 90°C is pushing it for this CPU.
          Dmitry Vinnik


          • #6
            Actually, the PC doesn't really shut down, sorry for the confusion. All screens go black and all fans and LEDs keep working, including the PC's power LED. Also, the PC's on/off and reset buttons stop working when this problem happens and I can't turn off the PC. The only way to turn it off or to reboot the computer is to turn the power supply off and on again from its power switch on the back.

            Things I've tried so far without success:
            1. Running the memory at the lowest speed - DDR5 2000 instead of XMP and the default DDR5 4800 speed
            2. Running only 1 memory module (tried both of them)
            3. Resetting BIOS to factory defaults
            4. Reinstalling the AMD chipset drivers
            5. Reinstalling Nvidia display drivers
            6. Running "sfc /scannow", "chkdsk" and DISM windows image repair
            7. Making sure all Windows updates are installed
            8. Running the PC without a graphics card (only with the integrated CPU graphics)
            9. Making sure all PSU cables are properly seated
            Last edited by Alex_M; 14-12-2022, 06:20 PM.
            Aleksandar Mitov


            • #7
              I have the Arctic 420 and you have the 360; there should not be such a big difference, maybe 1°C. In a 30-minute CPU render my temperature is stable at 81-82°C; the very short peaks occur only at the start, probably while the fans are still ramping up. I would try Morbid's suggestion again (thermal paste, or check again that the mounting has the proper offset; in the manual you can see that the CPU cores are not symmetrically in the middle). If the problem still persists, I would try buying a cheap AM5 air cooler; it should work without OC and will give you some values for comparison. I had similar issues a few years back with a 2990WX, and after changing to an air cooler the temperatures were OK, so I have left it there until now.
              Last edited by filip; 15-12-2022, 01:45 AM.



              • #8
                A small update. The screens go black also when I stress the CPU with Prime95 with the "Large FFTs" test. The "Blend" test doesn't cause black screens.

                Also, just to reiterate, I'm not having problems with the CPU temperature. As I already mentioned, when the problem happens the CPU is not even hitting the throttling point, which is 95°C. It hovers in the 80s °C. If the CPU were overheating, the whole system would shut down, but it keeps running; only the screens go black. Besides, these Ryzen 7950X CPUs will often run at 95°C no matter what the cooling solution is, and according to AMD this is by design and not damaging to the CPU - https://www.techpowerup.com/review/a...al-throttling/ The problem is elsewhere, but I don't know what it is.
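For anyone who wants to back up this kind of claim with data: HWiNFO64 can log its sensors to a CSV file, and the last rows written before the screens go black show what the CPU was reading at that moment. Below is a minimal Python sketch of that check. The column name `CPU (Tctl/Tdie) [°C]` and the sample rows are assumptions made up for illustration, so check the header of your own log before using it.

```python
import csv
import io

THROTTLE_C = 95.0  # throttle point quoted in this thread for the 7950X


def tail_temps(csv_text, column, last_n=5):
    """Return the last `last_n` numeric readings from `column` as floats."""
    reader = csv.DictReader(io.StringIO(csv_text))
    temps = []
    for row in reader:
        value = (row.get(column) or "").strip()
        try:
            temps.append(float(value))
        except ValueError:
            continue  # skip blank or non-numeric cells (e.g. a footer row)
    return temps[-last_n:]


def summarize(temps):
    """Report the peak reading and whether it reached the throttle point."""
    if not temps:
        return {"peak": None, "reached_throttle": False}
    peak = max(temps)
    return {"peak": peak, "reached_throttle": peak >= THROTTLE_C}


# Made-up sample data, only to show the assumed shape of the log.
SAMPLE = """Time,CPU (Tctl/Tdie) [°C]
13:40:01,62.5
13:40:03,84.0
13:40:05,88.5
13:40:07,87.0
"""

if __name__ == "__main__":
    readings = tail_temps(SAMPLE, "CPU (Tctl/Tdie) [°C]", last_n=3)
    print(readings, summarize(readings))
```

To use it on a real log, read the file into a string first and pass the column name exactly as it appears in the header row.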
                Last edited by Alex_M; 15-12-2022, 03:34 AM.
                Aleksandar Mitov


                • #9
                  Have you looked in the system logs? Maybe there would be a clue there. Otherwise, I'm not sure what to suggest. It may be anything from the power supply to the video card.
                  Dmitry Vinnik


                  • #10
                    90 and 95 degrees are fine for this platform; that is within AMD specs, as mentioned above.
                    I think you did enough troubleshooting, Alex_M. This oddness has to be memory, or a rare case of a faulty CPU/board.
                    When you use a single memory stick, do you still get the same behavior?
                    Like Filip said, 32 GB DDR5 sticks are problematic. I can barely get mine to run stable at all.
                    You will need to test with other parts to verify, if you have access to them, at least one 16 GB stick for testing. If you get the same issue, I would RMA the board first, then RMA the CPU.
                    I usually avoid Prime95; I run either Cinebench or AIDA64 for at least an hour before I consider the CPU to have passed.

                    Best,
                    Muhammed
                    Muhammed Hamed
                    V-Ray GPU product specialist


                    chaos.com



                    • #11
                      Is there a DEBUG LED on this board?
                      Do you get any codes?
                      Muhammed Hamed


                      • #12
                        Hi, Muhammed_Hamed. Yes, there is a debug LED on the board. It usually reads "76" when the PC works normally, but when the problem occurs, the debug LED goes blank.

                        BTW, I've now also tried a 2nd power supply which I know is working well (it is used for a Threadripper 3970X), a Corsair HX1200i. The issue still occurs. Other things I tried were updating to the latest BIOS and then running an overnight RAM test with MemTest86. I ran it from a bootable flash drive so it could test the full capacity, and after almost 10 hours it completed without any errors, so I know it's not the BIOS or the system memory. This leaves 2 possibilities - faulty motherboard or CPU.
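The elimination reasoning here can be written down as a simple set difference. The sketch below is only an illustration: the list of suspects and the mapping from each check to the component it is assumed to clear are assumptions drawn from the checks described in this thread, not a diagnostic tool.

```python
# Components that could plausibly cause the black screens (illustrative list).
SUSPECTS = {"cpu", "motherboard", "ram", "psu", "gpu", "bios", "os/drivers"}

# Each check from the thread, mapped to the component it is assumed to clear.
CHECKS = {
    "MemTest86 overnight, no errors": "ram",
    "second known-good PSU (Corsair HX1200i)": "psu",
    "ran on integrated graphics only": "gpu",
    "updated to latest BIOS": "bios",
    "reinstalled drivers, sfc/chkdsk/DISM": "os/drivers",
}


def remaining_suspects(suspects, checks):
    """Return suspects not cleared by any check, sorted for stable output."""
    cleared = set(checks.values())
    return sorted(suspects - cleared)


if __name__ == "__main__":
    print(remaining_suspects(SUSPECTS, CHECKS))
```

A passing check does not strictly prove a part is good (a marginal component can pass one test and fail another), so treat the result as a ranking of what to swap next rather than a verdict.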


                        Originally posted by Morbid Angel
                        Have you looked in system logs? maybe there would be a clue there. Otherwise, not sure what to suggest. It may be anything from powersupply to video card.
                        I looked and there's nothing useful in the logs. It just says that the previous shutdown was unexpected. That's when I switch the power supply off in order to shut down the PC.
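For reference, "previous shutdown was unexpected" entries can also be pulled from the command line with Windows' built-in `wevtutil` tool, which makes it easier to compare timestamps across several crashes. A rough Python sketch follows. It assumes the relevant records are Event ID 41 (Kernel-Power) and Event ID 6008 (EventLog) in the System log; the helper names are hypothetical, and the query itself only runs on Windows.

```python
import subprocess
import sys

# Assumed event IDs: 41 = Kernel-Power (rebooted without a clean shutdown),
# 6008 = EventLog (previous shutdown was unexpected).
EVENT_IDS = (41, 6008)


def build_query_command(event_ids=EVENT_IDS, count=10):
    """Build a wevtutil command listing the newest matching System events."""
    id_filter = " or ".join(f"EventID={eid}" for eid in event_ids)
    return [
        "wevtutil", "qe", "System",
        f"/q:*[System[({id_filter})]]",
        f"/c:{count}",   # limit the number of events returned
        "/rd:true",      # read newest events first
        "/f:text",       # plain-text output instead of XML
    ]


def run_query():
    """Run the query and return its text output; only meaningful on Windows."""
    if sys.platform != "win32":
        raise RuntimeError("wevtutil is only available on Windows")
    result = subprocess.run(build_query_command(), capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__" and sys.platform == "win32":
    print(run_query())
```

As in this case, these records usually only confirm that power was cut; they rarely say why, so an empty or unhelpful result is itself a hint that the hang happens below the operating system.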
                        Last edited by Alex_M; 16-12-2022, 06:15 AM.
                        Aleksandar Mitov


                        • #13
                          Originally posted by Alex_M
                          Other things I tried is updating to the latest BIOS and then running an overnight RAM test with MemTest86. I ran it from a bootable flash drive so it can test the full capacity, and after almost 10 hours, it completed without any errors
                          Yup, MemTest will usually detect issues within the first hour, so this rules out the memory.
                          It is either a faulty CPU or motherboard. Let us know how this goes.

                          Did you find anything about the debug LED behavior for this specific board on Google, or in the manual?

                          Best,
                          Muhammed

                          Muhammed Hamed


                          • #14
                            Ok, I will keep you posted. What exactly would you like to know about the debug LED?
                            Aleksandar Mitov


                            • #15
                              When I started building all of our 7950X nodes, I ran into memory test errors until I updated the BIOS. With a new platform, the BIOS updates contain big fixes. You might try that.

                              Then try lowering the RAM frequency, as others have suggested above. Get it stable with the default BIOS settings first (after updating the BIOS). Then you can play with EXPO, but there is no guarantee that the EXPO settings will be stable. That is why you should try lowering the frequency, first down to the stock RAM setting (usually 4800).

                              If you want a faster test than rendering, try Prime95. It was giving me the RAM errors pretty quickly.
                              Last edited by Joelaff; 19-12-2022, 06:31 PM.

