Poor Performance on Dual Epyc system

  • Poor Performance on Dual Epyc system

    Hi,

    I've been in need of some more rendering power, so I've just purchased a dual Epyc 7763 system.
    The intention is to use it pretty much solely for VRay rendering, so my purchase was very much guided by VRay Benchmark scores, where it features twice on the first page of results.

    As expected, when I run the Benchmark, the Epyc system scores considerably higher than my Threadripper 3990x workstation - 80,000 and 55,000 respectively.
    However, when I send any actual projects to it, it seems to render at about half the speed of the 3990x machine.
    Both machines have the same amount of RAM and are running the same OS.

    Am I missing something in how it's set up?
    I've tried it with and without Hyperthreading - Benchmark scores are a bit lower without SMT, render performance is much the same.

    I'm using Vray 6 for 3ds max.

    Any help would be appreciated.

    Cheers,

    John
    Website
    Behance
    Instagram

  • #2
    I've sorted this now, and I thought it may be helpful to add to the thread just in case anyone else comes across similar problems.

    Basically, the problem seems to have been due to the NUMA memory configuration.
    I had 4x 64GB sticks in the server, and no matter how I configured the NUMA domains, performance was poor.
    I then swapped them out for 16x 16GB sticks, and render speed rocketed.
    VRay Benchmark is now up at about 100,000. Corona benchmark numbers tripled.
    But more importantly, the scenes I'm sending to it are rendering at about the speed I'd expect them to.
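
A rough way to see why the stick count matters is to work out the theoretical peak memory bandwidth for each configuration. The sketch below is only back-of-the-envelope arithmetic, not a measurement: it assumes DDR4-3200 (the thread does not state the memory speed in this server), one DIMM per memory channel, and the 8 memory channels per socket that the EPYC 7763 provides. The function name `peak_bandwidth_gbs` is made up for illustration.

```python
# Theoretical peak DDR4 bandwidth as a function of populated memory channels.
# Assumptions (not stated in the thread): DDR4-3200, one DIMM per channel.

BYTES_PER_TRANSFER = 8  # one DDR4 channel has a 64-bit (8-byte) data bus


def peak_bandwidth_gbs(populated_channels: int, mega_transfers_per_s: int = 3200) -> float:
    """Return the theoretical peak bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return populated_channels * mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER / 1e9


CORES = 2 * 64  # two EPYC 7763 CPUs with 64 cores each

for dimms in (4, 16):
    # With one DIMM per channel, the DIMM count equals the populated channels.
    total = peak_bandwidth_gbs(dimms)
    print(f"{dimms:2d} DIMMs: {total:6.1f} GB/s total, {total / CORES:.2f} GB/s per core")
```

Under these assumptions, 4 sticks give 102.4 GB/s of theoretical bandwidth shared by 128 cores, while 16 sticks give 409.6 GB/s. Real-world throughput will be lower in both cases, but the fourfold ratio is the point.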

    Cheers,

    John
    Website
    Behance
    Instagram


    • #3
      Thanks, that's helpful. Which motherboard did you use to build the system? It likely has to do with the board as well.
      Dmitry Vinnik
      Silhouette Images Inc.
      ShowReel:
      https://www.youtube.com/watch?v=qxSJlvSwAhA
      https://www.linkedin.com/in/dmitry-v...-identity-name


      • #4
        Originally posted by Morbid Angel
        Thanks, that's helpful. Which motherboard did you use to build the system? It likely has to do with the board as well.
        Building it would've been far beyond my capabilities!
        It's a Dell PowerEdge R6525.

        Cheers,

        John
        Website
        Behance
        Instagram


        • #5
          I wonder if you would get good performance with 32 GB sticks. Also, for Threadripper you do need high-MHz RAM. Initially I got a 64-core Threadripper with six 32 GB RAM sticks, but they ran at 2200 MHz, and it was not enough. I swapped them for 3200 MHz sticks and performance went up. I bet this may also be related. I read that for Threadripper Pro you should have higher than that: 3600 or 4200 MHz sticks.
          Dmitry Vinnik
          Silhouette Images Inc.
          ShowReel:
          https://www.youtube.com/watch?v=qxSJlvSwAhA
          https://www.linkedin.com/in/dmitry-v...-identity-name


          • #6
            Originally posted by Morbid Angel
            I wonder if you would get good performance with 32 GB sticks. Also, for Threadripper you do need high-MHz RAM. Initially I got a 64-core Threadripper with six 32 GB RAM sticks, but they ran at 2200 MHz, and it was not enough. I swapped them for 3200 MHz sticks and performance went up. I bet this may also be related. I read that for Threadripper Pro you should have higher than that: 3600 or 4200 MHz sticks.
            Yeah, as long as you use enough sticks to feed all memory channels, you will get good performance with 32 GB sticks. For Threadripper Pro the minimum is 8 sticks, and for older Threadrippers like the 3990X the minimum is 4 sticks of memory. For dual EPYC it should be 8 sticks minimum as well.
            Note that the 32 GB sticks have lower frequency and higher latency, which is expected.
            On another note, while high-frequency memory will give better performance, it is not necessary. I would say stick to 3200 MHz if you fill up all memory slots on your motherboard; it is a lot harder to make 3600 stable in my experience.
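
The "feed all memory channels" rule can be written down as a small check. This is a sketch, not a vendor tool: the `Platform` class, the `PLATFORMS` table, and `populated_fraction` are hypothetical names, and it assumes one stick per channel. The channel counts are the published ones for these parts (4 for the 3990X, 8 for Threadripper Pro, 8 per socket for the EPYC 7763). By this count a dual EPYC 7763 system has 16 channels in total, so 8 sticks would populate half of them and 16 sticks all of them, which matches the 16-stick configuration described in post #2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Memory layout of a CPU platform (illustrative, not a vendor API)."""
    name: str
    sockets: int
    channels_per_socket: int

    @property
    def total_channels(self) -> int:
        return self.sockets * self.channels_per_socket


PLATFORMS = {
    "threadripper_3990x": Platform("Threadripper 3990X", sockets=1, channels_per_socket=4),
    "threadripper_pro": Platform("Threadripper Pro", sockets=1, channels_per_socket=8),
    "dual_epyc_7763": Platform("Dual EPYC 7763", sockets=2, channels_per_socket=8),
}


def populated_fraction(platform: Platform, dimms: int) -> float:
    """Fraction of memory channels in use, assuming one DIMM per channel."""
    return min(dimms, platform.total_channels) / platform.total_channels


for dimms in (4, 8, 16):
    epyc = PLATFORMS["dual_epyc_7763"]
    print(f"{epyc.name}, {dimms} sticks: {populated_fraction(epyc, dimms):.0%} of channels populated")
```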

            Best,
            Muhammed
            Muhammed Hamed
            V-Ray GPU product specialist


            chaos.com


            • #7
              I'd like to chime in, as we are also using a similar Epyc processor. On some CPU render jobs I am noticing that it's not using all cores fully, or at least utilization fluctuates a lot. Maybe you can take a quick look at the system specs, Muhammed_Hamed? We are using an ESC4000A-E11 server from ASUS, currently with 4x MEMS DDR4 3200 ECC Reg. 64 GB | Samsung. On some render jobs I'm seeing only half of the cores at a constant 100%, while the other half only peak briefly and then drop to around 20-30%. There are also 2 RTX A6000s installed.
              Thx


              • #8
                Originally posted by LeePuznowski
                I'd like to chime in, as we are also using a similar Epyc processor. On some CPU render jobs I am noticing that it's not using all cores fully, or at least utilization fluctuates a lot. Maybe you can take a quick look at the system specs, Muhammed_Hamed? We are using an ESC4000A-E11 server from ASUS, currently with 4x MEMS DDR4 3200 ECC Reg. 64 GB | Samsung. On some render jobs I'm seeing only half of the cores at a constant 100%, while the other half only peak briefly and then drop to around 20-30%. There are also 2 RTX A6000s installed.
                Thx
                Hello,

                You need at least 8 sticks of memory to get the full performance of your EPYC CPU. This is where you should start your troubleshooting, similar to the original post and the fix he posted.

                Best,
                Muhammed
                Muhammed Hamed
                V-Ray GPU product specialist


                chaos.com


                • #9
                  Hi, thanks for the quick reply. So having all slots full is just as important as the amount and type of overall RAM? Currently we're at 256 GB, but on only 4 sticks. I wasn't aware of that kind of bottleneck. Is this specific to V-Ray's use of the processor, or to the processor in general? I'm not that tech-savvy, but I need to clarify this with our IT first. Thanks for any help.


                  • #10
                    Originally posted by LeePuznowski
                    Hi, thanks for the quick reply. So having all slots full is just as important as the amount and type of overall RAM? Currently we're at 256 GB, but on only 4 sticks. I wasn't aware of that kind of bottleneck. Is this specific to V-Ray's use of the processor, or to the processor in general? I'm not that tech-savvy, but I need to clarify this with our IT first. Thanks for any help.
                    Hello,

                    This is not specific to V-Ray; it is about the architecture of these new CPUs. If you want to point your IT to this, it is called "NUMA configuration for AMD EPYC CPUs".
                    In conclusion, the EPYC machines need to be equipped with at least 8 memory sticks.
                    Muhammed Hamed
                    V-Ray GPU product specialist


                    chaos.com


                    • #11
                      I found that there are some limitations to using EPYC CPUs directly as a traditional system. For example, some software, such as Houdini, is internally limited to about 190 threads and cannot go above that. Managing that many threads can also be an issue for the memory. I spoke with my IT, and they told me that the best use of such a system is to virtualize it: set up multiple virtual machines and split the rendering job across them, or treat them as virtual workstations. This is the main purpose of such a system; as a 2U rack server it is compact in the rack while able to serve 10-20 users easily.
                      Dmitry Vinnik
                      Silhouette Images Inc.
                      ShowReel:
                      https://www.youtube.com/watch?v=qxSJlvSwAhA
                      https://www.linkedin.com/in/dmitry-v...-identity-name


                      • #12
                        Just wanted to update. We've now installed 8 x 64 GB of RAM (512 GB total), and upon initial testing I'm seeing some improvement at the beginning of the render. For the first third of the render all the cores are running at full load, but after that half of them drop off (they are not being utilized). Maybe this is scene-specific? Or could there be a BIOS setting or similar affecting this? Our IT will be checking the temperatures to see if anything is happening there, but the server is properly cooled, so this shouldn't be a factor.
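
When reporting this kind of behaviour, it can help to turn "half of them drop off" into a number. The sketch below is a minimal helper under stated assumptions: `busy_fraction` and the 90% threshold are made up for illustration, and the two sample lists are hypothetical readings shaped like the behaviour described above, not data from this server. Real per-core readings could come from any monitoring tool, for example psutil's `cpu_percent(percpu=True)`.

```python
from typing import Sequence


def busy_fraction(per_core_percent: Sequence[float], busy_threshold: float = 90.0) -> float:
    """Return the fraction of cores at or above busy_threshold percent load."""
    if not per_core_percent:
        raise ValueError("need at least one per-core reading")
    busy = sum(1 for load in per_core_percent if load >= busy_threshold)
    return busy / len(per_core_percent)


# Hypothetical samples for a 128-core machine (illustrative only).
early_in_render = [100.0] * 128              # every core fully loaded
later_in_render = [100.0] * 64 + [25.0] * 64  # half the cores mostly idle

print(f"early: {busy_fraction(early_in_render):.0%} of cores busy")
print(f"later: {busy_fraction(later_in_render):.0%} of cores busy")
```

Logging this value over the course of a render, alongside temperatures and clock speeds, would show whether the drop lines up with a particular render phase or with throttling.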


                        • #13
                          Update: After trying out various NUMA configs, nothing changed. But I switched from Progressive (BF/BF) to Bucket (BF/BF), and now all cores are being used at 100% throughout. The performance increase is quite substantial. It seems buckets work better for this Epyc config. I'm posting this in case anyone else is having similar issues.
