  • Bucket and Progressive Samplers' Analysis

    Intro and Goals:

    The Bucket and the Progressive sampler each have distinct pros and cons.
    This post aims to analyse them and clarify their use cases, along with providing some tips to avoid common mistakes.

    The Bucket Sampler:

    Pros:
    1. The Bucket Sampler is kinder on memory usage, allocating only the amount of data needed for the active buckets. This includes the image data for the VFB and Render Elements (also called AoVs).
    2. The Bucket Sampler is ideal to render very high resolution images, thanks to its ability to check for noise threshold only in and around the current buckets.
    3. The Bucket Sampler can render without a frame buffer, writing the image directly to disk, bucket by bucket.
    4. The Bucket Sampler is preferred when using proxies and tiled textures, as it will allow for their graceful loading and unloading (note: provided the Light Cache is not used).
    5. The Bucket Sampler is more efficient with distributed rendering, providing ideal network traffic and optimal CPU usage.
    6. The Bucket Sampler doesn't suffer a big penalty when many Render Elements (or AoVs) are used.
    7. The Bucket Sampler allows for the use of the CryptoMatte RE, thereby enabling more efficient masking of scene elements.
    8. The Bucket Sampler allows for unfiltered Render Elements (e.g. Z-depth, or any other data element which needs the unfiltered pixel value) and Deep EXRs.
    Cons:
    1. The Bucket Sampler does not provide realtime feedback on render quality, offering either the complete bucket or nothing at all.
    2. The Bucket Sampler does not provide as good realtime feedback on post effects.
    3. The Bucket Sampler cannot use realtime denoising.
    4. The Bucket Sampler has limited use with resumable rendering, only offering the option to complete an unfinished job, and exclusively at the quality settings it was started with.
    5. The Bucket Sampler only allows rendering to a quality budget, not to a time budget.
    The Progressive Sampler:

    Pros:
    1. The Progressive Sampler provides realtime feedback on the render's quality, which improves over time.
    2. The Progressive Sampler provides very good, and fairly accurate, realtime feedback on post effects.
    3. The Progressive Sampler is ideal for realtime denoising, making it very useful for Lighting and Look Development.
    4. The Progressive Sampler is well suited to resumable rendering, offering the option to increase the sampling quality of a previously stopped render (e.g. turning a preview render into a final).
    5. The Progressive Sampler allows rendering stills and animation with a fixed time (instead of quality) budget per frame.
    Cons:
    1. The Progressive Sampler needs to allocate the memory for the whole frame's data at once. This includes the image data for the VFB and Render Elements (also called AoVs).
    2. The Progressive Sampler is not ideal for rendering very high resolution images, as checking the noise threshold across the whole screen incurs a noticeable performance penalty at higher resolutions, while the needed memory may become sizable.
    3. The Progressive Sampler cannot use the bufferless rendering mode, which would allow writing the image to disk while rendering it, saving on RAM use.
    4. The Progressive Sampler isn't well suited to the use of proxies and tiled textures, as it forces the loading of all the visible (directly and indirectly) ones from the early passes onward, and makes their unloading from memory less likely.
    5. The Progressive Sampler is by its nature less efficient with distributed rendering, resulting in sub-optimal network traffic and CPU usage.
    6. The Progressive Sampler suffers a performance penalty when many Render Elements (or AoVs) are used.
    7. The Progressive Sampler cannot produce unfiltered Render Elements or Deep EXRs.
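As a rough illustration of cons 1 and 2 above, here is a back-of-the-envelope estimate of full-frame framebuffer memory. The channel count, float precision, and layer count below are illustrative assumptions, not V-Ray internals:

```python
def framebuffer_bytes(width, height, channels=4, bytes_per_channel=4, layers=1):
    """Rough full-frame framebuffer size: a beauty pass plus render elements,
    assuming float32 RGBA storage per layer (an illustrative assumption)."""
    return width * height * channels * bytes_per_channel * layers

# A 9000 x 6000 render with a beauty pass plus 9 render elements:
total = framebuffer_bytes(9000, 6000, layers=10)
print(f"{total / 2**30:.1f} GiB")  # roughly 8 GiB held in RAM for the whole frame
```

The Bucket Sampler, by contrast, only needs this image data for the buckets currently in flight.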
    Common Pitfalls:

    Problem:

    Switching samplers can lead to much longer render times when passing from bucket to progressive, or to quicker but noisier renders when moving from progressive to bucket.
    This is the case, for example, if one were to switch while leaving all options at their defaults.

    Solution:

    Make sure to match the Max AA Subdivs and the Noise Threshold between sampler types, as the defaults currently are not identical between the two.

    For example, the Max AA Subdivs for the Bucket Sampler defaults to 24 (or 24^2 samples, 576), while for the Progressive Sampler it defaults to 100 (or 100^2 samples, 10000: some seventeen times more!).

    In the same way, the Noise Threshold for the Bucket Sampler defaults to 0.01, while for the Progressive Sampler it defaults to 0.005.
    Since noise decreases with the square root of the sample count, halving the threshold alone would lead to roughly four times longer renders for the Progressive Sampler as compared to the Bucket one.
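The two defaults can be compared with a couple of lines of arithmetic (a sketch of the relationships described above, not V-Ray code):

```python
def max_samples(max_aa_subdivs):
    # The Max AA Subdivs value is squared to get the per-pixel sample cap
    return max_aa_subdivs ** 2

def relative_cost(old_threshold, new_threshold):
    # Noise falls with the square root of the sample count, so halving the
    # threshold roughly quadruples the samples (and time) needed
    return (old_threshold / new_threshold) ** 2

print(max_samples(24))             # 576   (Bucket default cap)
print(max_samples(100))            # 10000 (Progressive default cap, ~17x more)
print(relative_cost(0.01, 0.005))  # 4.0   (Progressive's tighter threshold)
```

Matching both values between the samplers removes these two multipliers from any comparison.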

    Features Reference Table

    Legend for the table below:
    “Y” means a feature is available with the specified sampler, “N” means it’s not.
    “+” means a feature is more efficient with the given sampler, “-” means it’s less so.
    Always refer to the points above for a more comprehensive explanation.

    Feature                         | Bucket              | Progressive
    --------------------------------|---------------------|------------------------
    VFB/REs Memory Use              | +                   | -
    High Resolutions                | +                   | -
    Direct-to-Disk Rendering       | Y                   | N
    Proxies and Tiled Textures      | +                   | -
    Distributed Rendering           | +                   | -
    Many Render Elements (AoVs)     | +                   | -
    CryptoMatte                     | Y                   | Y
    Unfiltered REs / Deep EXRs      | Y                   | N
    Realtime Render Quality Control | N                   | Y
    Realtime Post Effects Feedback  | -                   | +
    Realtime Denoising              | N                   | Y
    Resumable Rendering             | Only to completion. | With improving quality.
    Render to Time Budget           | N                   | Y
    Parting Words:

    While both samplers already prove useful at their preferred tasks (final, high-resolution rendering versus quick, iterative look development, for example), work is actively being done on them with the goals of expanding their use cases and improving their weaker sides.

    Last but not least, remember that these are guidelines only, and your mileage may vary depending on scene type, size and complexity, on the available hardware, and on whether any of Earth's magnetic-field-aligned coronal mass ejections from the Sun happen to be intercepting our planet.
    You'd better not be rendering then!
    Last edited by ^Lele^; 02-11-2022, 06:00 AM.
    Lele
    Trouble Stirrer in RnD @ Chaos
    ----------------------
    emanuele.lecchi@chaos.com

    Disclaimer:
    The views and opinions expressed here are my own and do not represent those of Chaos Group, unless otherwise stated.

  • #2
    Always a pleasure to have this kind of thread.

    You clarified the situation concisely (at least from my point of view).
    Thank you very much.

    • #3
      Excellent, thanks all for this.
      Sorry to be pedantic but you have the 'i' and 'y' the wrong way around
      https://www.behance.net/bartgelin

      • #4
        Thanks Lele! It is nice to have all this info in one place finally
        -------------------------------------------------------------
        Simply, I love to put pixels together! Sounds easy right : ))
        Sketchbook-1 /Sketchbook-2 / Behance / Facebook

        • #5
          Originally posted by fixeighted View Post
          Excellent, thanks all for this.
          Sorry to be pedantic but you have the 'i' and 'y' the wrong way around
          Gah! Thanks for pointing it out, I'll ask the Admins to fix it (I can't myself, as it's a sticky now).


          • #6
            I'm surprised Grammarly didn't pick it up!
            JOKE! Definitely a joke!

            • #7
              Thanks for all the information, Lele. So I assume the performance hit in the scene I sent to you is to be expected in progressive mode?
              Which results did you get while benchmarking the scene?
              Last edited by kosso_olli; 06-03-2019, 04:16 PM.
              https://www.behance.net/Oliver_Kossatz

              • #8
                I can most definitely confirm the DR issues. I tested it pretty thoroughly, and with about 10 machines on 1 image with DR, efficiency from the slave nodes was down to about 50-60%.

                • #9
                  Originally posted by kosso_olli View Post
                  Thanks for all the information, Lele. So I assume the performance hit in the scene I sent to you is to be expected in progressive mode?
                  Which results did you get while benchmarking the scene?
                  It's STILL profiling, incredibly (some renders are taking 18-20 hrs apiece), but initial results seem to indicate that no, besides the resolution growing (i.e. the 9x6k image), performance is well in line: the quarter-resolution and half-resolution renders seem to indicate the slack is within bounds (that 10% I spoke of).
                  The scene you sent had the issue of Max AA and Noise Threshold being different between samplers, at the expense of the progressive (cf. the pitfalls section above).
                  Once adjusted, I got 4195 seconds for bucket and 4289 for progressive at quarter resolution (still an abundant 2k image), and 16558 versus 18955 at half res (4.5k x 3k pixels or so).
                  It's only render number five of 12 completed, so it's early still; maybe with REs and the rest it'll worsen, we'll see.
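Expressed as plain arithmetic on the timings quoted above, the progressive overhead works out as:

```python
# Progressive-vs-bucket overhead from the timings above (seconds)
benchmarks = {
    "quarter res (~2k)": (4195, 4289),
    "half res (~4.5k x 3k)": (16558, 18955),
}
for label, (bucket_s, progressive_s) in benchmarks.items():
    overhead = (progressive_s / bucket_s - 1.0) * 100.0
    print(f"{label}: progressive is +{overhead:.1f}% slower")
```

That is about 2.2% at quarter resolution and 14.5% at half resolution.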

                  • #10
                    Originally posted by ^Lele^ View Post
                    It's STILL profiling, incredibly (some renders are taking 18-20 hrs apiece), but initial results seem to indicate that no, besides the resolution growing (i.e. the 9x6k image), performance is well in line: the quarter-resolution and half-resolution renders seem to indicate the slack is within bounds (that 10% I spoke of).
                    The scene you sent had the issue of Max AA and Noise Threshold being different between samplers, at the expense of the progressive (cf. the pitfalls section above).
                    Once adjusted, I got 4195 seconds for bucket and 4289 for progressive at quarter resolution (still an abundant 2k image), and 16558 versus 18955 at half res (4.5k x 3k pixels or so).
                    It's only render number five of 12 completed, so it's early still; maybe with REs and the rest it'll worsen, we'll see.
                    Thanks for the info! Looking forward to improvements.
                    One additional thing that came to my mind: the Bucket sampler may sometimes render black buckets with heavy motion blur or DOF, depending on bucket size and Min AA. This doesn't happen with progressive.

                    • #11
                      Could we have some thoughts on how this analysis carries over to GPU rendering?

                      For example, does the bucket renderer on GPU have the same memory-saving attributes as the CPU renderer?

                      Does it do on-demand loading/unloading of proxies and textures, per-bucket displacement calculations, etc.?

                      I've seen it discussed as being better for GPU DR; however, given the limited GPU RAM available, the other advantages provided when CPU rendering would be most welcome on GPU too.

                      • #12
                        I think (and stand ready to be corrected) it's really still too early for the GPU implementation of the bucket sampler.
                        Someone on the GPU team will know best, however.
                        Last edited by ^Lele^; 08-03-2019, 04:41 AM.

                        • #13
                          Hey super gnu
                          On GPU, unfortunately, there are no such memory optimizations just yet. The proxies and displacement calculations are all done beforehand, and they stay in GPU memory throughout the render.
                          The on-demand textures continue to work in bucket mode as they do in progressive, so you do get this memory optimization on texture sizes. The GPU Bucket sampler will continue to evolve though, so things are going to improve. Indeed, the main goal of the GPU bucket sampler right now is to enable better scaling of GPU rendering in DR and, through it, to add support for Cryptomatte.
                          Alexander Soklev | Team Lead | V-Ray GPU

                          • #14
                            Thanks for the feedback. Yes, I suspected this was the case, as in my tests I saw no benefit GPU-RAM-wise.

                            Is it theoretically possible to do on-demand proxies and displacement for buckets on GPU and keep the speed? I'd imagine the bandwidth to the cards would be a limiting factor...

                            • #15
                              It is definitely possible to do it, but I fear the constant memory transfers are going to have a very severe impact on performance. We are doing R&D on how to save more GPU RAM, but it might just become irrelevant soon enough, with cards such as the Quadro RTX 8000 offering 48GB of VRAM onboard and NVLink capabilities.
