Distributed rendering: Added 96 cores, pre-passes now slower!!


  • #1

    Hi,

    I have a small render farm, previously 3 machines, each with two dual-core, hyper-threaded Xeons, so 8 cores per machine (E5240 @ 2.5 GHz). That gave me an extra 24 cores for distributed rendering. My local machine has two 6-core, hyper-threaded Xeons, giving me 24 cores locally (E5649 @ 2.53 GHz). I did a test render on a scene using that configuration of machines, and the rendering took 16 minutes 22 seconds.

    I have also just added 4 new machines. Each has two 6-core, hyper-threaded Xeons, so 24 cores per machine (E5-2630 v2 @ 2.60 GHz). That should add 96 cores to the distribution pool. My thought was that it should dramatically speed up the distributed rendering. Not so! For exactly the same image as benchmarked above, the rendering took 13 minutes and 13 seconds.
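    For reference, here is a back-of-the-envelope estimate of the expected speedup (a sketch only; it assumes render time scales perfectly with the number of logical cores, which real scenes rarely achieve):

    ```maxscript
    -- Hypothetical ideal-scaling estimate, in MaxScript.
    oldCores = 24 + 24        -- local machine + 3 original nodes, logical cores
    newCores = oldCores + 96  -- after adding the 4 new machines
    oldTime = 16 * 60 + 22    -- 16 min 22 s, in seconds
    format "ideal new time: % s\n" ((oldTime as float) * oldCores / newCores)
    ```

    Ideal scaling would predict roughly 5.5 minutes, so the observed 13:13 suggests a bottleneck in the pre-passes rather than in raw compute.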

    To me there is something very wrong. When observing the rendering process during the pre-passes, the nodes behave quite strangely. I have only noticed this since the new nodes were added, but during the pre-passes they are allocated 24 buckets. As each bucket finishes rendering, a new bucket is not allocated to that node; only once the last bucket has finished is a complete new set of buckets handed out. Why is this?

    I really need help with this as we've just spent a small fortune on the extra capacity, and the benefit seems to be minimal.

    Regards,

    Bill

  • #2
    Which V-Ray version are you using? What kind of network connection do you have between the nodes? Also, have you tried other scenes? What about brute-force rendering?

    Best regards,
    Vlado
    I only act like I know everything, Rogers.



    • #3
      Hi Vlado,

      Thanks for the quick response.

      We are using V-Ray 2.40.04. The network connection is 1.0 Gbps for all machines. I have not tried other scenes yet. Here are my render settings for this particular scene (provided as I am not sure what you mean by "brute-force rendering"):
      [Attachment: V-ray_settings_Fulham_13-12-17.jpg, 515.2 KB]

      Please let me know if there is anything else I can provide.

      One quick question on the same topic: I was looking around to make sure that I have installed V-Ray correctly on the render nodes, and I noted the comment that "1. You will need to run at least one Backburner job where each render server participates, before you can use them for distributed rendering with V-Ray." I have not done that on any of the render nodes, but they still render. Is that causing a problem?

      Thanks again,

      Bill



      • #4
        Originally posted by LQ2 View Post
        We are using V-Ray 2.40.04. The network connection is a 1.0 Gbps for all machines. I have not tried other scenes yet. Here are my render settings for the particular scene (provided as I am not sure what you mean by "brute-force rendering")
        I mean brute-force GI as the primary GI engine.

        One quick question on the same topic: I was looking around to make sure that I have installed V-Ray correctly on the render nodes, and I noted the comment that "1. You will need to run at least one Backburner job where each render server participates, before you can use them for distributed rendering with V-Ray." I have not done that on any of the render nodes, but they still render. Is that causing a problem?
        If it works, then it's probably ok as it is.

        Also, do you run the V-Ray DR spawner as a service or manually? Is there a difference if you do it the other way?

        Best regards,
        Vlado



        • #5
          Originally posted by vlado View Post
          I mean brute-force GI as the primary GI engine.
          I have not used brute-force as the primary GI engine before. I will try this later in the week as I have a deadline today.

          Originally posted by vlado View Post
          do you run the V-Ray DR spawner as a service or manually? Is there a difference if you do it the other way?
          I have only run the DR spawner manually. I did try to use the "Register V-Ray spawner service" but it didn't seem to do anything. I might have had the spawner running already though, so I'll try this also.


          I have also been trying different bucket sizes. 128x128 seems to be the optimum for this image, as it reduces the render time to below 10 min, but I still see the waiting time for the buckets to be allocated. It just seems like, while the pre-pass is being rendered, the master machine is too busy to be bothered giving out single buckets to the slaves. When a node reports that it has nothing to do, a full complement of buckets is allocated.

          EDIT: I can't run the V-Ray spawner as a service! Nothing happens. It is worth noting that I have followed the "Target" of the "Register V-Ray spawner service" shortcut, and it points to "vrayspawner2013" (with -service at the end of the path). Is that the correct application? Surely it should be 2014?

          regards,

          Bill
          Last edited by LQ2; 18-12-2013, 08:58 AM.



          • #6
            Originally posted by LQ2 View Post
            It just seems like, while the pre-pass is being rendered, the master machine is too busy to be bothered giving out single buckets to the slaves. When a node reports that it has nothing to do, a full complement of buckets is allocated.
            Hmm, in that case, open the MaxScript editor on the client machine and type
            Code:
            renderers.current.system_numThreads=22
            and see if it helps. This will cause the client machine to use one core less for rendering so that the other can take care of DR stuff. (Note that the slaves will still use all cores).

            [EDIT] I put 22 as you mentioned that you have 24 logical cores on your main machine, so 22 leaves 2 logical cores free.
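            A variant that avoids hard-coding the number (a sketch only; sysInfo.cpucount is the MaxScript property that reports the number of logical cores):

            ```maxscript
            -- Reserve two logical cores on the client machine for DR bookkeeping.
            -- Assumes the current renderer is V-Ray, which exposes system_numThreads.
            renderers.current.system_numThreads = sysInfo.cpucount - 2
            ```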

            Best regards,
            Vlado



            • #7
              Vlado,

              That made a big difference to the progress of the pre-pass! You're a star. I did an experiment with 23 cores instead of 22, and the problem was back again. I assume this is because there are only 12 physical cores, but 24 logical ones. I'm now testing the difference between 64x64 and 128x128 buckets with the 22-core setting.

              Does that setting now stay with V-Ray, or do I have to set it per scene, or every time I start Max?

              [EDIT] I have also tried sending the same render as a backburner job to one of the render nodes, also with distributed rendering enabled. The same problem was present. That was before I ran your script to reduce the number of cores used on the master. Does that script also affect the master node using backburner?

              Kind regards,

              Bill
              Last edited by LQ2; 18-12-2013, 10:07 AM. Reason: added extra info about some tests that I have been doing.



              • #8
                Originally posted by LQ2 View Post
                Does that setting now stay with V-Ray, or do I have to set it per scene, or every time I start Max?
                You have to set it per scene and it stays in the scene; it's not very convenient, so in V-Ray 3.0 we also added an environment variable for that.

                Best regards,
                Vlado



                • #9
                  Thanks so much for this, it made a big difference.

                  I have done some more experiments, and using my machine + the three original DR machines, with the 22 core setup on my machine, actually renders quicker anyway. Setting aside the two logical cores for DR admin took a whole minute off a 16 min render. When I applied that to the full set up, the 16 min render reduced to 7 min, which is what I thought should happen when first adding the new render nodes.

                  I have also saved that line of script into a MaxScript file so I can add it to the pre-render scripts in the common render settings. I realise it's saved in each file, but as there's a team of us here and we can't be sure the setting has been applied, it's easier to add it to a pre-render script.
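                  Such a pre-render script could be as simple as the following (a sketch; the try/catch guard is just a hypothetical safety net for scenes where a non-V-Ray renderer is active):

                  ```maxscript
                  -- Pre-render script: reserve two logical cores for DR administration.
                  try (
                      renderers.current.system_numThreads = sysInfo.cpucount - 2
                  ) catch (
                      format "system_numThreads not available for this renderer\n"
                  )
                  ```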


                  What I have to solve now is why the render comes out with different-coloured buckets when I send it to Backburner with distributed rendering, but renders fine when rendered locally with distributed rendering. I also tried with the "check for missing maps" option on, and the render fails on the DR machines, saying that there are missing maps. With the option off, they render absolutely fine.



                  • #10
                    Originally posted by LQ2 View Post
                    What I have to solve now is why the render comes out with different-coloured buckets when I send it to Backburner with distributed rendering, but renders fine when rendered locally with distributed rendering. I also tried with the "check for missing maps" option on, and the render fails on the DR machines, saying that there are missing maps. With the option off, they render absolutely fine.
                    That's why we added the automatic asset transfer in V-Ray 3.0. But usually it's network permissions on the render servers...

                    Best regards,
                    Vlado



                    • #11
                      I already have local admin privileges on my machine, the user account on the render nodes has local admin rights, and I have full access to the files I am currently working with.

                      I've spoken with my IT guy and there is nothing else that we can think of in terms of permissions that we can change. Do you have anything more specific?

                      Cheers,

                      Bill



                      • #12
                        Originally posted by LQ2 View Post
                        I've spoken with my IT guy and there is nothing else that we can think of in terms of permissions that we can change. Do you have anything more specific?
                        Try using a VRayHDRI map and see if there are any errors or warnings in the log file on the render servers. If yes, can you post the exact error message?

                        Best regards,
                        Vlado



                        • #13
                          This is an old thread, but regarding the setting for making the host machine (when rendering with DR) use one or two fewer cores: Vlado gave the following MaxScript code and then noted it became an environment variable in V-Ray 3.0. Where is that environment variable? Thanks, Matt.

                          Originally posted by vlado View Post
                          Hmm, in that case, open the MaxScript editor on the client machine and type
                          Code:
                          renderers.current.system_numThreads=22
                          and see if it helps. This will cause the client machine to use one core less for rendering so that the other can take care of DR stuff. (Note that the slaves will still use all cores).

                          [EDIT] I put 22 as you mentioned that you have 24 logical cores on your main machine, so 22 leaves 2 logical cores free.

                          Best regards,
                          Vlado
                          Originally posted by vlado View Post
                          You have to set it per scene and it stays in the scene; it's not very convenient and in V-Ray 3.0 we also added an environment variable for that.

                          Best regards,
                          Vlado



                          • #14
                            The environment variable is called VRAY_NUM_THREADS and can be set from the Control Panel (let me know if you need help with this).
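                            For reference, the same thing can be done from a Windows command prompt instead of the Control Panel GUI (a sketch; setx writes the variable to the user environment, and 3ds Max must be restarted to pick it up):

                            ```shell
                            :: Persist VRAY_NUM_THREADS for the current user; new processes will see it.
                            setx VRAY_NUM_THREADS 22
                            ```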

                            Best regards,
                            Vlado



                            • #15
                              If you set it from the Control Panel (and yes, I would need a screenshot or directions for doing this), does the setting apply to the host computer only when you use distributed rendering, or whenever it renders? There are many times when I don't use the slaves, and then I want all the cores of the host machine used for rendering and for calculating the light cache (which I did not think you could do over DR). If I had to go to the Control Panel each time to reset VRAY_NUM_THREADS, it seems easier to simply use the MaxScript editor.

                              Matt

                              Comment
