3DSMAX / Vray GPU / Empty bucket render time


  • 3DSMAX / Vray GPU / Empty bucket render time

    Hello there,


    This is less a specific bug report than a general issue, so the discussion here is more about optimization than about a technical problem.

    We have been working with Vray for 10 years now, and until last year we were exclusively on VRAY CPU for the studio productions. For a little less than a year, I have been moving entire projects to VRAY GPU.

    Apart from the known bugs, the various known limitations and the future ones we have yet to discover, there is one particular point that has caught my attention and that I cannot explain with my technical knowledge.

    Many of our projects are rendered in multiple layers: a single jewel then a background, a character lit one way then lit another way, etc., for hundreds or thousands of frames, daily.
    This method is quite classic, but it implies a lot of renders that are mostly empty space (100% black alpha), where only a small part of the image contains pixels that require real computation.

    The main concern we have is that the computation time of these "empty" buckets is quite long and not negligible compared to the speed of Vray GPU computations on buckets where there are actually computations to do.
    On the one hand, we save a considerable amount of time on the complex calculations of things that need to be rendered thanks to GPU technology and on the other hand we lose precious time calculating nothing.

    This concern does not exist in Vray CPU. The computational speed of an empty bucket in CPU is unquestionably insignificant, incomparably fast.

    As mentioned earlier, with multiple render layers, multiple options for those render layers and multiple versions of each of those renders before arriving at the validated renders, millions of empty buckets are computed on our end.
    So on some specific projects, the use of VRAY GPU is called into question for this "simple" reason.



    To illustrate what I'm saying, I ran some practical tests with a sample scene simulating the problem: a 3840x2160 render, 10 frames in a row to reduce the margin of error, a 6-face cube, a Vray light and 98% empty space in the image, no HDRI, no backplate, no FX, default VRAY render settings.

    VRAY 5 update 2.2
    VRAY CPU : AMD Threadripper 3975WX
    VRAY GPU : 3090x2 + Nvlink (CUDA versions are only rendered with graphics cards, without adding the processor)

    1/ VRAY GPU RTX BUCKETS : 10 frames = 2m55s // average 1 frame = 17.5s
    2/ VRAY GPU CUDA BUCKETS : 10 frames = 2m43s // average 1 frame = 16.3s
    3/ VRAY GPU RTX PROGRESSIVE : 10 frames = 1m13s // average 1 frame = 7.3s
    4/ VRAY GPU CUDA PROGRESSIVE : 10 frames = 57s // average 1 frame = 5.7s
    5/ VRAY CPU BUCKETS : 10 frames = 21s // average 1 frame = 2.1s
    6/ VRAY CPU PROGRESSIVE : 10 frames = 1m34s // average 1 frame = 9.4s
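
For anyone who wants to redo the arithmetic, the per-frame averages and the slowdown relative to CPU buckets can be recomputed from the six totals listed above. This is only a small sketch of that calculation: the dictionary labels and function names are chosen here for illustration and are not part of any V-Ray tool.

```python
# Total render time for 10 frames, in seconds, copied from the results list.
FRAMES = 10
totals_s = {
    "GPU RTX buckets": 2 * 60 + 55,
    "GPU CUDA buckets": 2 * 60 + 43,
    "GPU RTX progressive": 1 * 60 + 13,
    "GPU CUDA progressive": 57,
    "CPU buckets": 21,
    "CPU progressive": 1 * 60 + 34,
}


def per_frame(total_s, frames=FRAMES):
    """Average time per frame, in seconds."""
    return total_s / frames


def slowdown_vs(baseline, totals=totals_s):
    """Ratio of each mode's total time to the baseline mode's total time."""
    base = totals[baseline]
    return {name: total / base for name, total in totals.items()}


averages = {name: per_frame(t) for name, t in totals_s.items()}
ratios = slowdown_vs("CPU buckets")

for name in totals_s:
    print(f"{name:22s} {averages[name]:5.1f} s/frame   x{ratios[name]:.1f} vs CPU buckets")
```

With these numbers, GPU RTX buckets come out at roughly 8.3 times the CPU buckets time, and even the fastest GPU mode in this test (CUDA progressive) is about 2.7 times slower than CPU buckets on this mostly empty frame.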

    In these tests, we can clearly see a difference between GPU BUCKETS rendering and GPU PROGRESSIVE rendering, on top of the difference with CPU rendering.
    I am setting aside the difference between CUDA and RTX, which I consider negligible in these results.
    But when we compare our classic pipeline (CPU BUCKETS) to the new pipeline we are trying to implement (GPU RTX BUCKETS) we end up with a difference of about 8 times more computing time, for " emptiness ".
    I know it's a bit tricky to compare render times across different devices and different technologies, but I consider that this GPU hardware should be superior to this CPU hardware (even if it's nearly impossible to compare).

    We have not succeeded in integrating the GPU PROGRESSIVE version in our rendering pipeline, the calculation times being much more variable in this mode compared to the BUCKET mode, even though I had already read on this forum that the PROGRESSIVE mode was faster than the BUCKETS mode.

    We may have pipeline issues and may be approaching VRAY GPU the wrong way. If this is the case, any advice is welcome, and I thank you in advance for your time.

    I don't question at all the benefits of GPU rendering over CPU rendering. During classic calculations, the comparison is not to be made. The idea is to mention this particular case of "empty" buckets.


    Is this a problem that we are the only ones to encounter, or even to have noticed (which would surprise me)?
    If it is a known problem, are there any existing solutions that could help us?
    And finally, are there solutions already in development?



    Thank you for your answers and your time,
    Nicolas
    Last edited by nicolas_fuminier; 24-03-2022, 09:38 AM.

  • #2
    interesting.
    Region render maybe?
    what is the issue of GPU Progressive sampler integration?
    Marcin Piotrowski
    youtube



    • #3
      Hello Marcin,

      - Region rendering is one of the solutions we already use, but it's a difficult strategy to maintain simply and stably over an entire production.
      On one render pass, the region has to be at a given place and size. On another, the object and the camera move, so the render region does not stay the same size. On yet another, we need a piece of background, so we need a different size and position for the render region. In short, there are many complex traps when you have to produce 5 or 10 iterations of the same shot with the same pass/render region settings during 4 months of production.

      This is a solution, but it is limited and challenging to supervise.
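
One way to make the region less of a manual setting would be to derive it per pass from a cheap alpha preview instead of typing it in by hand. The sketch below is plain Python and does not call any V-Ray or 3ds Max API: `alpha_bounding_region` and `union_region` are hypothetical helpers, the alpha is assumed to be available as a list of rows of floats, and how the resulting rectangle is handed to the renderer is left out.

```python
def alpha_bounding_region(alpha, threshold=0.0, padding=0, align=1):
    """Bounding box of pixels whose alpha exceeds `threshold`.

    Returns (x_min, y_min, x_max, y_max) with the max values exclusive,
    or None when the whole image is empty.
    """
    height = len(alpha)
    width = len(alpha[0]) if height else 0
    xs, ys = [], []
    for y, row in enumerate(alpha):
        for x, value in enumerate(row):
            if value > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None

    # Grow the box by a safety margin, clamped to the image.
    x0 = max(min(xs) - padding, 0)
    y0 = max(min(ys) - padding, 0)
    x1 = min(max(xs) + 1 + padding, width)
    y1 = min(max(ys) + 1 + padding, height)

    # Snap outward to multiples of `align` (for example a bucket size).
    x0 -= x0 % align
    y0 -= y0 % align
    x1 = min(-(-x1 // align) * align, width)
    y1 = min(-(-y1 // align) * align, height)
    return x0, y0, x1, y1


def union_region(regions):
    """Smallest box covering every non-empty region (e.g. all frames of a shot)."""
    boxes = [r for r in regions if r is not None]
    if not boxes:
        return None
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )
```

Taking the union over all frames of a pass gives a single region that stays valid when the object or camera moves, at the cost of rendering somewhat more empty space than a per-frame region would.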



      - For the progressive GPU rendering, it's mainly because we haven't tested this option enough that we don't use it in production.
      Most probably it is also because of our CPU pipeline habit, where we always use the buckets option and never the progressive option.
      Finally, it was only a partial answer to our rendering speed problem.

      That said, following your message, I have started to push my tests on progressive rendering further, to see what we gain and lose.

      I'll update this discussion as soon as I've gathered a bit more data.



      Do some people use progressive rendering in production?

      Thanks





      • #4
        100% progressive with VRayGPU. single 3090. bucket would only make sense if you render longer frames (few minutes+) on more than one pc I guess. BF+BF gi can also render faster in your case.
        Marcin Piotrowski
        youtube



        • #5
          Thanks again Marcin for your answers and expertise,

          We don't render images split across different computers, only full frames on different computers (so you are totally right, progressive is fully compatible in the production phase).
          We use two different types of computer, some with a single 3090 and some with dual 3090 + NVlink.

          An early test this morning with the progressive renderer shows some small render time differences vs bucket (both ways), but not as much as I was afraid of, and no difference at all in quality. I must try different scenes and different setups to be sure, but I will probably try a full week in progressive render and see what happens.

          We use BF GI in some cases (full jewelry / full product shot for example) but we keep LC GI for more complex scenes with environment, vegetation, set etc.

          Do you render with RTX or with CUDA ?



          That being said, progressive GPU rendering is still behind CPU bucket when it comes to empty pixel calculation.

          More test to follow,
          Thanks again,
          Nicolas
          Last edited by nicolas_fuminier; 28-03-2022, 05:03 AM.



          • #6
            Hi Nicolas,

            Thanks for the feedback, it is very helpful to us.


            Originally posted by nicolas_fuminier
            The main concern we have is that the computation time of these "empty" buckets is quite long and not negligible compared to the speed of Vray GPU computations on buckets where there are actually computations to do.
            We are aware of this behavior, and Bucket mode optimizations are at the top of our current priority list. We discuss this on a regular basis and are actively looking for solutions. It is not a trivial task, but we will get there eventually.

            Originally posted by nicolas_fuminier
            and on the other hand we lose precious time calculating nothing.
            V-Ray GPU loads the scene data (geometry and textures) on each frame; this is expected behavior for now. You may find that rendering the frame takes one minute and loading the frame into GPU memory takes 5 minutes. This is not easy to solve, but I think our Out Of Core research will help us find a solution. I'm aware of how important this is for animation rendering, and it is near the top of our priority list as well.

            Originally posted by nicolas_fuminier
            This concern does not exist in Vray CPU. The computational speed of an empty bucket in CPU is unquestionably insignificant, incomparably fast.
            Correct, V-Ray's bucket mode is very efficient and is a big part of why the standard V-Ray renderer is so fast. It has many optimizations for speed and memory usage, as bucket mode has been around for a long time.
            Keep in mind that V-Ray GPU is a separate render engine, and Bucket mode was only introduced there with V-Ray Next. We are looking closely at the performance of the GPU buckets, and I will update the thread when I can share more information about the topic.

            We recommend GPU progressive mode. It should be faster than buckets, and it uses less GPU memory. In my previous 2 studios I used GPU progressive mode with a time limit per frame for rendering animations, and it gets the job done.


            Originally posted by nicolas_fuminier
            But when we compare our classic pipeline (CPU BUCKETS) to the new pipeline we are trying to implement (GPU RTX BUCKETS) we end up with a difference of about 8 times more computing time, for " emptiness ".
            Some tips from my side for animation rendering
            -Use RTX mode
            -Use Progressive sampler
            -Use a time limit per frame, in Print I would do some test runs to see roughly how much time is needed for a clean image. Then I use this time limit for all frames, the denoiser helps with the rest.
            -Render one frame per GPU, it is not efficient to use 2 GPUs on the same frame(unless you are working with massive resolutions)
            There is an option in Deadline for rendering a frame per device, it should be helpful.
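
As a rough illustration of what "one frame per GPU" combined with a fixed time limit means for scheduling, here is a minimal sketch. It is not Deadline's API or a V-Ray feature: `assign_frames` and `estimated_wall_time` are hypothetical helpers, and the estimate ignores the per-frame scene loading time mentioned earlier in this post.

```python
def assign_frames(frames, device_ids):
    """Distribute frames round-robin so each GPU renders whole frames."""
    if not device_ids:
        raise ValueError("at least one device is required")
    plan = {device: [] for device in device_ids}
    for index, frame in enumerate(frames):
        device = device_ids[index % len(device_ids)]
        plan[device].append(frame)
    return plan


def estimated_wall_time(plan, seconds_per_frame):
    """Wall-clock estimate when every frame uses the same time limit.

    The busiest device determines when the whole job finishes.
    """
    return max(len(frames) for frames in plan.values()) * seconds_per_frame


# Example: 10 frames on a machine with two GPUs, 60 s time limit per frame.
plan = assign_frames(range(1, 11), [0, 1])
print(plan)
print(estimated_wall_time(plan, 60), "seconds")
```

With a fixed limit per frame the total is predictable in advance, which is part of the appeal of this setup for farm planning.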

            Something else that should help is using as few Render Elements as possible, as they can affect performance a lot (we are aware of that and are working on a solution).


            Originally posted by nicolas_fuminier
            I know it's a bit tricky to compare render times across different devices and different technologies, but I consider that this GPU hardware should be superior to this CPU hardware (even if it's nearly impossible to compare).
            Correct, V-Ray GPU is very fast; with proper setup one 3090 should be at least twice as fast as a 3970X 32-core/64 Threads.
            Speed and expandability have always been advantages of GPU rendering.


            Originally posted by nicolas_fuminier
            We have not succeeded in integrating the GPU PROGRESSIVE version in our rendering pipeline, the calculation times being much more variable in this mode compared to the BUCKET mode, even though I had already read on this forum that the PROGRESSIVE mode was faster than the BUCKETS mode.
            This is not expected behavior; progressive should always be faster unless you hit an issue. I hope my advice above will be helpful. Please test again and let me know.

            Originally posted by nicolas_fuminier
            I don't question at all the benefits of GPU rendering over CPU rendering. During classic calculations, the comparison is not to be made. The idea is to mention this particular case of "empty" buckets.
            Thanks for the feedback. Please upload a basic example of your empty-pixels use case, just to be sure we cover it. I will test it and add it to our system, and any upcoming improvements will be tested with your example as well.

            Originally posted by nicolas_fuminier
            Do some people use progressive rendering in production?
            Plenty of people do use GPU progressive rendering. My scenes for GTC were rendered with progressive mode here and here, as were many animations I did for Carlex and Renntech when I worked there.

            Best,
            Muhammed
            Muhammed Hamed
            V-Ray GPU product specialist


            chaos.com



            • #7
              Thanks for the help as well Marcin
              Muhammed Hamed
              V-Ray GPU product specialist


              chaos.com



              • #8
                Hello Muhammed,

                Thank you again for your comprehensive answers and your time.
                They perfectly match Marcin's advice and other posts I have read on the forum.

                We are going to switch our rendering to progressive mode today and we will review the results at the end of the week. My fears about GPU progressive rendering were not backed up by concrete tests, and I realize they were mostly due to our CPU buckets habit.

                Thanks again for all your advice, which we will implement while waiting for possible updates of the renderer (which we follow assiduously).

                For the time limit advice, sadly we can't use it in production: we are fully committed to animation rendering (no still frames / print), and a time limit per frame is never recommended for animation, as it does not keep the noise level consistent from one frame to the next.


                I have a very small question about this suggestion:
                "Render one frame per GPU, it is not efficient to use 2 GPUs on the same frame(unless you are working with massive resolutions)"
                Does rendering one frame per GPU keep the NVlink running for RAM, or if I render one frame per GPU, do I go back down to the RAM value of one GPU? The latter seems more plausible to me: the two frames rendered on the two graphics cards may have different RAM requirements, so I guess it is not possible to pool RAM for two different frames.
                This is often the reason why we use our pair of 3090s at the same time: for RAM reasons and not for computation speed (even if that is a factor).

                Best,
                Nicolas
                Last edited by nicolas_fuminier; 28-03-2022, 05:27 AM.



                • #9
                  Originally posted by nicolas_fuminier
                  For the time limit advice, sadly we can't use it in production: we are fully committed to animation rendering (no still frames / print), and a time limit per frame is never recommended for animation, as it does not keep the noise level consistent from one frame to the next.
                  I have used this reliably for rendering animations, and the workflow is not unusual for GPU rendering. The plan is usually to get the render time to one minute per frame, and the denoiser does the rest. It is worth a try in my view.

                  Originally posted by nicolas_fuminier
                  Does rendering one frame per GPU keep the NVlink running for RAM or if I render one frame per GPU
                  No, you will not be able to share the memory between the cards anymore. In your case it is fine to stick to NVlink and multiple GPUs per frame
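
The trade-off described here can be written down as a small decision rule. This is only a simplified sketch: `choose_gpu_layout` is a hypothetical helper, pooled NVLink memory is treated as the plain sum of the cards' memory (an optimistic upper bound), and the scene's memory footprint is assumed to be known in advance.

```python
def choose_gpu_layout(scene_gb, vram_per_gpu_gb, gpu_count, nvlink):
    """Pick a layout based on whether the scene fits in one card's memory."""
    if scene_gb <= vram_per_gpu_gb:
        # Each card can hold the whole scene, so cards can work independently.
        return "one frame per GPU"
    if nvlink and gpu_count > 1 and scene_gb <= vram_per_gpu_gb * gpu_count:
        # Memory must be shared, which requires all cards on the same frame.
        return "all GPUs on one frame (NVLink)"
    return "does not fit in GPU memory"


# Example with two 24 GB cards (the RTX 3090 has 24 GB of memory).
print(choose_gpu_layout(18, 24, 2, nvlink=True))
print(choose_gpu_layout(30, 24, 2, nvlink=True))
```

In other words, one frame per GPU is only an option for scenes that fit on a single card; heavier scenes keep the multi-GPU, NVLink setup.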

                  We will add this information to the GPU documentation soon
                  Please let me know how your testing goes, and if you need help with something. Have a great day!

                  Best,
                  Muhammed
                  Muhammed Hamed
                  V-Ray GPU product specialist


                  chaos.com



                  • #10
                    Originally posted by nicolas_fuminier
                    Do you render with RTX or with CUDA ?
                    depends on the scene. raytracing vs shading ratio. quite often I just test a frame, and more often than I would like I'm surprised by the result. it's never a vast speed difference in my case though.
                    Marcin Piotrowski
                    youtube
