
Theoretical DR slave limit (performance wise)


  • Theoretical DR slave limit (performance wise)

    Hi,

    I have the option of throwing virtually 100 slaves at a DR render, but I suspect this may not be very efficient. Is there a theoretical limit beyond which an additional slave no longer decreases the total render time of a frame?

    Let's assume I use DR on something that takes 20 hours to render on one single machine.
    Best Regards,
    Fredrik

  • #2
    How long is a piece of string?
    It depends on a whole bunch of things: how big the textures are, how fast your network is, how big the buckets are, etc.
    Kind Regards,
    Morne

    • #3
      It was perhaps wrong of me to write "theoretical" when I'm actually looking for something practical to work with.

      What I'm wondering is how many slaves a master (managing the DR slaves) can handle before managing the slaves, and possibly the network speed, becomes a bottleneck. There must be some general ballpark figure for how many DR nodes is too many.

      I'm on a gigabit network, with about 500 MB of textures, a 700 MB Maya scene file and 600 MB of vrmeshes. Bucket size is 64x64, the final frame size is A4 (3508 x 2480), and the scene takes about 23 hours to render due to a large number of lights (no RAM issues). This is on a dual 2.93 GHz machine with 12 cores, running Maya 2012 and V-Ray 2.2.
      Best Regards,
      Fredrik
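
One thing the figures in this post allow us to estimate is how many buckets the frame contains, and therefore how many buckets each core would get on a large farm. The sketch below is only a rough illustration: it assumes every slave has 12 cores like the machine described above and that each core works on one bucket at a time. Both are assumptions, not something stated in the thread.

```python
import math

def bucket_count(width, height, bucket=64):
    """Number of square buckets needed to tile a frame (edge buckets are partial)."""
    return math.ceil(width / bucket) * math.ceil(height / bucket)

# Frame size and bucket size taken from the post.
buckets = bucket_count(3508, 2480, 64)

# Assumption: every slave matches the 12-core machine described in the post.
cores_per_node = 12

for nodes in (10, 50, 100):
    cores = nodes * cores_per_node
    print(f"{nodes:>3} nodes, {cores:>4} cores: {buckets / cores:.1f} buckets per core")
```

With these numbers the frame has 2145 buckets, so 100 such nodes would leave fewer than two buckets per core. Under these assumptions, a few slow buckets at the end of the frame could keep most of the farm idle, independently of any network limit.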

      • #4
        Heya

        Well, you have a 1 Gb network, which gives roughly 100 MB/s of usable transfer speed. You have around 2 GB per scene to send, so you're looking at 20-25 seconds to upload the scene to one node. After that it's all rendering.

        If the uploads happen one after another, it will take around 40 minutes to upload your scene to 100 slaves.
        CGI - Freelancer - Available for work

        www.dariuszmakowski.com - come and look
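
The estimate in this post can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the master sends the full payload to each slave in turn over a single shared link, with no multicast and no caching on the slaves. The function name `upload_seconds` and the 100 MB/s usable rate are illustrative assumptions; the payload sizes come from post #3.

```python
def upload_seconds(payload_mb, link_mb_per_s, nodes=1):
    """Time to push the same payload to `nodes` machines, one after another,
    over a single shared link (no multicast, no caching on the slaves)."""
    return nodes * payload_mb / link_mb_per_s

# Textures + Maya scene + vrmeshes, as listed in post #3.
payload_mb = 500 + 700 + 600

# Assumed usable rate on gigabit Ethernet; 125 MB/s is the raw ceiling.
link_mb_per_s = 100.0

one_node = upload_seconds(payload_mb, link_mb_per_s)
all_nodes = upload_seconds(payload_mb, link_mb_per_s, nodes=100)
print(f"1 node: {one_node:.0f} s, 100 nodes: {all_nodes / 60:.0f} min")
```

With the exact 1.8 GB payload this gives 18 seconds per node and 30 minutes for 100 nodes, in the same range as the rounded 2 GB figures quoted in the post.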

        • #5
          Yeah, that's why we just upgraded to a 10 Gb switch.
          The server sustains 98% upload utilisation on the NIC at the moment, pushing 1200 MB/s to 50 nodes, which is "only" about 24 MB/s per node...
          We are thinking of teaming the 10 Gb NICs, but the bottleneck is now the RAID storage rack, which is only connected to the server through two 6 Gb links (12 Gb limit).

          I don't think there is any limit on the number of slaves; V-Ray handles it so well... I would say the limitation would be everywhere but on V-Ray's side.

          Stan
          3LP Team
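
The per-node figure in this post follows from dividing the slowest link in the chain by the number of nodes. Here is a minimal sketch of that reasoning, assuming raw line rates, an even split between nodes and no protocol overhead. The helper `per_node_mb_per_s` and the 20 Gb teamed-NIC figure are hypothetical, introduced only for illustration.

```python
def per_node_mb_per_s(nodes, *links_gbit, efficiency=1.0):
    """Per-node throughput in MB/s when `nodes` machines share the slowest link.

    `links_gbit` lists every link the data crosses, in Gbit/s (raw line rate).
    """
    slowest_gbit = min(links_gbit)
    total_mb_per_s = slowest_gbit * 1000 / 8 * efficiency  # Gbit/s to MB/s
    return total_mb_per_s / nodes

# 10 Gb NIC at 98% utilisation shared by 50 nodes, as in the post.
print(round(per_node_mb_per_s(50, 10, efficiency=0.98), 1))

# Hypothetical teamed NICs (20 Gb) behind the two 6 Gb storage links (12 Gb).
print(round(per_node_mb_per_s(50, 20, 12), 1))
```

In the second case the storage links cap the total at 12 Gb, so teaming the NICs alone would raise the per-node rate only to about 30 MB/s under these assumptions.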

          • #6
            Originally posted by 3LP:
            Yeah, that's why we just upgraded to a 10 Gb switch.
            The server sustains 98% upload utilisation on the NIC at the moment, pushing 1200 MB/s to 50 nodes, which is "only" about 24 MB/s per node...
            We are thinking of teaming the 10 Gb NICs, but the bottleneck is now the RAID storage rack, which is only connected to the server through two 6 Gb links (12 Gb limit).

            I don't think there is any limit on the number of slaves; V-Ray handles it so well... I would say the limitation would be everywhere but on V-Ray's side.

            Stan
            So at the moment, when you send data it goes node by node, right? So the 2 GB is being uploaded 50 times. Maybe you should buy a managed switch and configure it yourself? Then you could set it up to mirror the data and send the 2 GB only once, to all 50 nodes at the same time... Not sure how to do it; I only heard another IT tech guy talking about it a while back.

            • #7
              DADAL,

              Just so that I really understand how this works...

              So the master starts by uploading the scene to each slave.
              Question 1: Does rendering on a slave start only once the complete scene has been uploaded to it? (That would make the upload phase the bottleneck with a high number of slaves, right?)
              Question 2: If the master is not included in the slave list, will it only manage the render, or will it also render? (I have noticed its CPU is pretty much always at 100%, no matter how many slaves are rendering.)
              Best Regards,
              Fredrik

              • #8
                Heya

                The way I think Maya works, from my experience (I think it differs between DBR and RT), is that when you press render, Maya exports your scene to a V-Ray scene file. Once it has finished saving the scene, it processes it and starts rendering, and that V-Ray scene file is sent to all your slaves. Not sure if that happens one by one or all at once. At this point you are sending only the V-Ray scene, so it does NOT include textures and (I think) proxies. That is the first bottleneck on your connection. The second bottleneck kicks in once a slave needs a texture, at which point it goes to the server to fetch it.

                If the master is not included in the list, it only means V-Ray won't start another service on your workstation to use for rendering. You should never add your master workstation to the list of DBR machines. As far as I know, the master is responsible for collecting buckets from the slaves and assigning new ones. You can stop the master from rendering by unticking the checkbox next to the DBR settings; I think it is called "use local machine" or similar. At that point your master will NOT render, but it will still distribute work, collect buckets and save images to disk.

                • #9
                  Originally posted by DADAL:
                  At this point you are sending only the V-Ray scene, so it does NOT include textures and (I think) proxies. That is the first bottleneck on your connection. The second bottleneck kicks in once a slave needs a texture, at which point it goes to the server to fetch it.
                  Luckily, our solution for that is almost completed.

                  Best regards,
                  Vlado
                  I only act like I know everything, Rogers.

                  • #10
                    Originally posted by DADAL:
                    So at the moment, when you send data it goes node by node, right? So the 2 GB is being uploaded 50 times. Maybe you should buy a managed switch and configure it yourself? Then you could set it up to mirror the data and send the 2 GB only once, to all 50 nodes at the same time... Not sure how to do it; I only heard another IT tech guy talking about it a while back.

                    This is called multicasting, and that's exactly why I asked 4 months ago whether V-Ray supported multicasting:
                    http://www.chaosgroup.com/forums/vbu...t=multicasting

                    V-Ray doesn't support it (for the moment?), but they are developing another way of transferring files, and I think that's what Vlado just mentioned.


                    Originally posted by Fredrik Averpil:
                    DADAL,

                    Just so that I really understand how this works...

                    So the master starts by uploading the scene to each slave.
                    Question 1: Does rendering on a slave start only once the complete scene has been uploaded to it? (That would make the upload phase the bottleneck with a high number of slaves, right?)
                    Question 2: If the master is not included in the slave list, will it only manage the render, or will it also render? (I have noticed its CPU is pretty much always at 100%, no matter how many slaves are rendering.)

                    Well, you need to distinguish between the master and the server that hosts the files.
                    They can be the same computer in a small setup, but in a larger setup the server is independent of the artist's computer and the nodes.
                    We have a whole bunch of nodes, plus a few dedicated nodes that actually render through BB and spawning. That means a job can use the whole farm, or part of it, without using the artist's computer.


                    Knowing that, to answer your questions:
                    1: Yes, rendering on a slave starts only once the files (master scene, xrefs, VRayProxies, textures, etc.) have been completely sent; only then can it load the scene and start to render.
                    2: With 3ds Max (I don't know how it works in Maya) the master always renders, and it manages the slaves at the same time. The master needs to open the file first, load the scene and send the render to the nodes, so it starts rendering before the others.


                    Originally posted by DADAL:
                    Heya

                    The way I think Maya works, from my experience (I think it differs between DBR and RT), is that when you press render, Maya exports your scene to a V-Ray scene file. Once it has finished saving the scene, it processes it and starts rendering, and that V-Ray scene file is sent to all your slaves. Not sure if that happens one by one or all at once. At this point you are sending only the V-Ray scene, so it does NOT include textures and (I think) proxies. That is the first bottleneck on your connection. The second bottleneck kicks in once a slave needs a texture, at which point it goes to the server to fetch it.

                    If the master is not included in the list, it only means V-Ray won't start another service on your workstation to use for rendering. You should never add your master workstation to the list of DBR machines. As far as I know, the master is responsible for collecting buckets from the slaves and assigning new ones. You can stop the master from rendering by unticking the checkbox next to the DBR settings; I think it is called "use local machine" or similar. At that point your master will NOT render, but it will still distribute work, collect buckets and save images to disk.

                    Yeah, Max doesn't export a V-Ray scene at each render; it saves out a .max file and sends that. What you're describing seems pretty close to what V-Ray does with RT, though. And there you do have the option not to use the local host for rendering, in which case it will, like you said, just manage the slaves.


                    Overall, this underlines the fact that the more slaves you use, the more network bandwidth goes into sending the files to the nodes. That's why Chaos Group is working on a solution to this problem. For example, a few months back, when we were still on a 1 Gb switch, it was quicker for small/test renders to use only a few nodes rather than the whole farm. Otherwise the nodes sometimes took 2-3 minutes just to receive the files before starting to render, and by then the master had already rendered 50% or more of the image. With only a few computers, the nodes joined the render sooner.


                    Depending on the size of the master file, it could be a bottleneck like DADAL explained. That's why xrefing as much as you can is key: you shift the network load from the master to the server.


                    RT still has a big issue with this, so we are thinking of putting a 10 Gb NIC in the artists' computers so they aren't waiting for ages. I'm happy to hear Vlado mention that all this is almost completed.


                    Hope this helps; please correct me if I'm wrong.
                    Stan
                    3LP Team
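
The test-render observation in this post (a few nodes finishing sooner than the whole farm) can be illustrated with a toy model. This is only a sketch under loud assumptions that are not from the thread: all machines are equally fast, the master starts rendering immediately, the slaves share the upload bandwidth so they all become ready at the same moment, and work is balanced perfectly afterwards. The function names and the 18-second per-slave upload time are illustrative.

```python
def frame_time(work_s, slaves, upload_s):
    """Wall-clock seconds for one frame in a toy DR model.

    work_s   -- render time on a single machine
    slaves   -- number of slaves besides the master
    upload_s -- time to send the scene files to one slave on an idle link
    """
    # Assumption: bandwidth is shared, so every slave is ready at the same moment.
    ready = slaves * upload_s
    if work_s <= ready:
        return work_s  # the master finished alone before any slave could join
    # Until `ready` only the master renders; after that all machines share the rest.
    return ready + (work_s - ready) / (slaves + 1)

def best_slave_count(work_s, upload_s, max_slaves):
    """Slave count in 0..max_slaves that minimises frame_time in this model."""
    return min(range(max_slaves + 1), key=lambda n: frame_time(work_s, n, upload_s))

upload_s = 18  # assumed: about 1.8 GB at 100 MB/s

for label, work_s, farm in (("23 h frame", 23 * 3600, 100), ("10 min test", 600, 50)):
    n = best_slave_count(work_s, upload_s, farm)
    print(f"{label}: best with {n} slaves ({frame_time(work_s, n, upload_s) / 60:.1f} min), "
          f"whole farm of {farm}: {frame_time(work_s, farm, upload_s) / 60:.1f} min")
```

In this model the 23-hour frame stops benefiting at around 67 slaves, and a 10-minute test render is fastest with about 5, while the whole farm gives no speed-up at all for the test render. Real figures will differ, since V-Ray's actual transfer behaviour and texture fetching are not modelled here.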

                    • #11
                      Originally posted by 3LP:
                      Hope this helps; please correct me if I'm wrong.
                      Correct on all counts.

                      Best regards,
                      Vlado

                      • #12
                        Omg, I'm in the Max forum, argh >.< Sorry... yeah, my previous post was a tad misleading... It's as the guys corrected me.

                        I've got to pay more attention to where I post >.<

                        Thanks, bye.