GPU scaling not as expected with Vray vs Redshift

  • GPU scaling not as expected with Vray vs Redshift

    Not sure if this is a current limitation of Vray GPU, but the other day I noticed that the GPUs don't scale as linearly as I would expect with Vray. My workstation has 3x 980 Tis, soon 4x or more. So I set up a very basic scene and compared render times per GPU combination, and also compared against the Redshift renderer. While RS scales almost linearly, which is what I would expect from a GPU renderer, Vray not so much.

    Here are the results (times on the left for Vray):

    [Image: VrayGPU.jpg]
    [Image: RSGPU.jpg]

    Vray
    00:31.5 - 100%
    00:42.5 - ~34.9%
    01:04.2 - ~33.6%

    Redshift
    00:38 - 100%
    00:57 - ~50%
    01:46 - ~46%
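
For reference, the timings above can be turned into speedup and parallel efficiency figures. The sketch below assumes the three rows correspond to 3, 2 and 1 GPUs respectively (the post does not label them); `scaling_report` is a hypothetical helper name, not part of either renderer.

```python
def scaling_report(times_by_gpu_count):
    """Return {gpu_count: (speedup, efficiency)} relative to the 1-GPU time."""
    base = times_by_gpu_count[1]
    report = {}
    for n, t in sorted(times_by_gpu_count.items()):
        speedup = base / t
        report[n] = (speedup, speedup / n)  # efficiency = speedup per GPU
    return report

# Times in seconds from the post above. The GPU counts are an assumption:
# fastest row = 3 GPUs, slowest row = 1 GPU.
vray = {3: 31.5, 2: 42.5, 1: 64.2}
redshift = {3: 38.0, 2: 57.0, 1: 106.0}

for name, times in (("Vray", vray), ("Redshift", redshift)):
    for n, (speedup, eff) in scaling_report(times).items():
        print(f"{name} {n} GPU(s): {speedup:.2f}x speedup, {eff:.0%} efficiency")
```

Under that assumption, 3 GPUs come out at roughly 68% efficiency for Vray and 93% for Redshift on this scene.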

    Is there any reason for this massive difference in scaling?
    I want to get the most out of my GPUs, so the only thing I can think of with Vray is to somehow render one frame of a sequence per GPU, instead of assigning 3 GPUs to a single frame (I don't own Deadline). How would I go about this?
    That said, for still images there's still a big hit in performance.
    Last edited by Moriah; 23-06-2016, 08:14 AM.

  • #2
    Normally V-Ray RT GPU scales very well up to and including 8 GPUs, at least in the tests that we've done.

    It would be best to test a more complicated scene; 30s is not enough - most of that time is probably spent not on rendering, but on other things. A reliable test should be at least 3-5 minutes of pure rendering (not loading scene or textures etc). Also make sure you are using the latest version of V-Ray.

    Best regards,
    Vlado
    I only act like I know everything, Rogers.
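
A rough way to see why short tests understate scaling: if some fixed part of the wall-clock time does not speed up with more GPUs, its share shrinks as the render gets longer. The sketch below is only an illustration; the 15-second overhead is a made-up figure, not a measured V-Ray number, and `measured_speedup` is a hypothetical helper.

```python
def measured_speedup(render_seconds, overhead_seconds, gpus):
    """Wall-clock speedup when only the render part scales with GPU count."""
    one_gpu = overhead_seconds + render_seconds
    many_gpus = overhead_seconds + render_seconds / gpus
    return one_gpu / many_gpus

OVERHEAD = 15.0  # hypothetical fixed cost in seconds (scene and texture loading, etc.)

for render in (45.0, 300.0, 1800.0):
    s = measured_speedup(render, OVERHEAD, gpus=3)
    print(f"{render:6.0f}s of pure rendering -> {s:.2f}x on 3 GPUs")
```

With these assumed numbers, the same hardware looks like 2x on a 45-second render but close to 3x on a 30-minute one, which is why a longer test gives a more reliable picture.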


    • #3
      I've done the tests with yesterday's nightly, and with the most recent RS build too.

      Thing is, I noticed this on a way more complex scene that was taking around 30min on 3 GPUs, while with only 1 GPU it took around 50min. Something didn't add up, so I wanted to do this small test, but yes, I can do it on a more complex one.


      • #4
        Originally posted by Moriah
        "I've done the tests with yesterday's nightly."

        Hmm, ok. We'll run a few tests here too.

        "Thing is I noticed this on a way more complex scene that was taking around 30min on 3 GPUs... and with only 1 GPU it took around 50min. Something didn't add up, so I wanted to do this small test, but yes I can do it on a more complex one."

        It's definitely something that we have not seen, and it might be a bug. Let me ask someone to look into it...

        Best regards,
        Vlado


        • #5
          Hey Moriah,
          As Vlado mentioned, RT GPU should scale pretty linearly. The last time we tested with 8 GPUs it was 7.94 times faster.
          So this definitely seems like a problem, and one that I would like to fix.

          There is a slight chance that this is related to the setup. I know that it is a simple scene, but can you share it anyway?
          A few more questions: which exact version of V-Ray have you tested (there are multiple nightly builds...), and do you use RT GPU as Production or as ActiveShade?
          What are your ray bundle size and rays per pixel settings? As a matter of fact, if you just send the scene over we will figure all this out from it. Also, did you test Redshift with bucket or progressive?

          Thanks,
          Best,
          Blago.
          V-Ray fan.
          Looking busy around GPUs ...
          RTX ON
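
To put the 7.94x figure next to the numbers from the first post, one option is the Karp-Flatt metric, which estimates the fraction of the run that behaves as if it were serial. This is a generic parallel-computing formula, not something V-Ray reports, and the 3-GPU timings used below assume the first post's rows are ordered 3, 2, 1 GPUs.

```python
def serial_fraction(speedup, gpus):
    """Karp-Flatt metric: experimentally determined serial fraction."""
    return (1.0 / speedup - 1.0 / gpus) / (1.0 - 1.0 / gpus)

# 8-GPU figure quoted in this post.
print(f"8 GPUs at 7.94x: {serial_fraction(7.94, 8):.4f}")

# 3x 980 Ti timings from the first post (64.2s on 1 GPU, 31.5s on 3, assumed order).
print(f"3 GPUs at {64.2 / 31.5:.2f}x: {serial_fraction(64.2 / 31.5, 3):.4f}")
```

The first case works out to roughly 0.1% non-scaling time and the second to roughly 24%, which suggests something in the reported setup is not being parallelized.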


          • #6
            Sure, I can share it, but only tomorrow; I'm off work now. (The scene only has a Vray sun, GI, 1 chrome mat and 1 default vraymat.)
            For Vray I installed yesterday's nightly build, so 3.45.01 from June 22nd. Vray GPU as Production. Bundle size and rays per pixel are at default settings.
            I did test RS in bucket mode (progressive is not really advised there), and tried to keep the settings similar to Vray's.

            I can also see if I can send you the scene where I had the initial issue with scaling, from 30min on 3 GPUs to 50min on 1 GPU; I'll see if that's possible.

            I don't know if it's related to hardware but here goes (recent workstation)

            i7 6850K OC to 4.6 GHz
            32 GB DDR4
            3x 980 Ti 6 GB
            ASUS X99-E WS 3.1
            Windows 10
            3dsmax 2015


            • #7
              Great, thank you so much! Having the scene will be very useful.
              I guess that the default settings for production are just not that great. The rays per pixel and ray bundle size are too low for some multi-GPU setups. I guess just increasing them to something like 64 or 128 rays per pixel and a ray bundle size of 256 or 384 will fix the issue (and make the render faster in general). We have a note to bump those defaults for the next SP.
              And btw, we have multiple nightly builds with the same V-Ray core version (we probably have to change this). The build number (it is part of the .zip that you download) helps identify the exact V-Ray build better.

              Best,
              Blago.


              • #8
                Ok, so I tried what you said, 128 rays per pixel and 384 bundle size, and it is 3.5x SLOWER. There also seems to be a big difference in lighting; I think the GI contribution has gone up for some reason. The percentages seem to be the same too...

                Here's the screen:

                [Image: VrayGPU2.jpg]

                You can dl this simple scene here (max 2015): http://s000.tinyupload.com/index.php...64342919048656

                Vray version is 26831

                Edit: Installed version 26832; now it spends a lot of time in "Compiling kernels.." and it takes 53s with 3 GPUs, still almost 2x slower than with default settings.
                Last edited by Moriah; 24-06-2016, 04:38 AM.


                • #9
                  Every new V-Ray version spends time compiling kernels, but this happens only the first time you start it.
                  Subsequent starts skip this phase.

                  We will check the scene and get back to you.
                  V-Ray fan.
                  Looking busy around GPUs ...
                  RTX ON


                  • #10
                    Another test on a more complex scene (evermotion). 192 ray bundle, 32 rays per pixel, 6000 max paths/pixel, 0.01 max noise.

                    [Image: VrayGPU3.jpg]

                    5m52s - 100%
                    8m19s - 41%
                    13m48s - 39%


                    • #11
                      I got the chance to take a look at the scene.
                      For the moment, the calculation phase of the Light Cache does not scale with the number of GPUs. We have plans to improve that.
                      Also, do you happen to use DR? The Light Cache takes a bit longer to compute in DR; it is a known bug that we haven't fixed yet, and it can skew the render times.
                      When I increased the rays per pixel to a higher value, the scaling got better. You are right that it could take longer with a bigger rays per pixel value. However, the reason is not that the render gets slower. We just check the noise threshold every few passes, and when the rays per pixel value is bigger we check less often. Because of that we can end up oversampling, but in the end you get a cleaner render.
                      Also, keep in mind that different render engines compute the noise threshold differently (even we get a different end result depending on the rays per pixel), so you should also compare the noise in the renders yourself.
                      I also noticed an opportunity to optimize our progressive sampler a bit when the additivity is enabled. We will see how this ends up.

                      Thanks a lot for the report !
                      Best,
                      Blago.
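
The oversampling effect described in this post can be illustrated with a toy model. This is not V-Ray's actual sampler: the 1/sqrt(N) noise estimate, the check interval of 4 passes and the 0.04 threshold are all assumptions chosen for the example, and `samples_until_stop` is a hypothetical name.

```python
import math

def samples_until_stop(rays_per_pixel, check_every_passes, noise_target):
    """Samples per pixel taken before a periodic noise check lets the render stop."""
    samples = 0
    passes = 0
    while True:
        samples += rays_per_pixel   # one pass adds rays_per_pixel samples
        passes += 1
        if passes % check_every_passes == 0:
            noise = 1.0 / math.sqrt(samples)  # toy estimate: error ~ 1/sqrt(N)
            if noise <= noise_target:
                return samples

TARGET = 0.04   # hypothetical noise threshold
MINIMUM = 625   # sample count at which 1/sqrt(N) first reaches 0.04
for rpp in (1, 32, 128):
    used = samples_until_stop(rpp, check_every_passes=4, noise_target=TARGET)
    print(f"rays per pixel {rpp:3d}: stops at {used} samples, {used / MINIMUM:.2f}x the minimum")
```

In this model the render can only stop at multiples of rays per pixel times the check interval, so larger rays per pixel values overshoot the target by more, which costs time but leaves a cleaner image.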


                      • #12
                        No, I didn't use DR for these tests.

                        So as far as I understand, the default rays per pixel gives the worst scaling, but if I raise the value the scaling improves while the overall render gets way slower. In the end we're not utilizing the GPUs to their full potential either way... I'm not comparing noise thresholds or clean renders, just the fact that one renderer seems to almost fully utilize all the GPU power while the other doesn't. I'm concerned about this because I've been testing GPU renderers for a while so we can start doing some serious work on GPU and build a farm around it.

                        Would a temporary solution for this issue be to assign one frame per GPU when rendering sequences, instead of multiple GPUs per frame? How would one do this in Vray GPU?
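
As a back-of-the-envelope check on the frame-per-GPU idea, the sketch below compares the two strategies using the timings from the first post (31.5s per frame on 3 GPUs, 64.2s on 1 GPU, assuming that row order). It only does the arithmetic and a round-robin frame split; it does not launch any renderer, the 300-frame sequence is hypothetical, and it ignores effects such as several render processes competing for CPU and disk.

```python
import math

def sequence_time_all_gpus_per_frame(frames, seconds_per_frame_all_gpus):
    """Every frame is rendered by all GPUs together, one frame at a time."""
    return frames * seconds_per_frame_all_gpus

def sequence_time_one_gpu_per_frame(frames, seconds_per_frame_one_gpu, gpus):
    """Each GPU renders whole frames on its own; the busiest GPU sets the total."""
    frames_on_busiest_gpu = math.ceil(frames / gpus)
    return frames_on_busiest_gpu * seconds_per_frame_one_gpu

def assign_frames(frames, gpus):
    """Round-robin frame lists: GPU 0 gets frames 0, 3, 6, ... when gpus == 3."""
    return {gpu: list(range(gpu, frames, gpus)) for gpu in range(gpus)}

FRAMES = 300  # hypothetical sequence length
shared = sequence_time_all_gpus_per_frame(FRAMES, 31.5)
split = sequence_time_one_gpu_per_frame(FRAMES, 64.2, gpus=3)
print(f"3 GPUs per frame: {shared / 60:.1f} min")
print(f"1 GPU per frame:  {split / 60:.1f} min")
```

With these inputs the one-frame-per-GPU split finishes the sequence in about two thirds of the time, but how to pin a render process to a single GPU depends on the renderer and is not covered here.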

                        I noticed too that the noise cleans up very fast, but it's the max paths/pixel that takes a long time, and even with a really high value (20,000+) a simple overbright source like VraySun doesn't clean up properly in reflections and such. Even lowering the max ray intensity to 2 or 3, using subpixel mapping and even clamp output still results in constant "fireflies" on those overbright reflections. Also, in the evermotion scene some objects had random fireflies; not sure if it's due to the same issue, since it also had a VraySun, or if it could be a shading problem with evermotion pre-made scenes.

                        Anyway thanks for the prompt response!


                        • #13
                          It will not be way slower, but it can make a few more passes. That's it. Usually RT GPU users set high values for rays per pixel for production and get very satisfying results.
                          If you take out the time needed for the LC phase, you will see that the scaling is not bad. For example, with the LC out of the equation, for the scene you have shared, with two GPUs and 128 rays per pixel I get 52% of the 1-GPU render time. It turns out that you are right, and it is not ideal for some setups. But we have a few ideas for improving that. One of them is actually in the nightlies; for some reason it does not kick in in these situations, and we will check why.

                          RT GPU does not use sub-pixel mapping, and the max ray intensity only works on secondary rays, because it has to preserve the highlights. We have to look at the fireflies case by case, since they can have different causes.
                          Btw, keep in mind that there could be some extra bugs in the nightly builds. None of what you have reported seems to be one, and there is nothing major that I am aware of, but I just wanted to warn you.

                          Thanks,
                          Best,
                          Blago,
                          Last edited by savage309; 24-06-2016, 11:23 AM.


                          • #14
                            Yes, you are right: taking LC out of the equation does improve the scaling a bit!

                            Thanks for the clarification, can't wait to see RT GPU improving more and more
