Resolving DR servers

  • Resolving DR servers

    Can anyone at Chaos explain exactly what the correct workflow is for using multiple machines to render into the VFB with distributed rendering? In particular I'm interested in trying to understand how and when remote servers are "active", since (based on observation and practical experience) this seems to be rather random and arbitrary.

    It seems to be necessary to constantly relaunch the Vray render server on remote machines AND constantly resolve the servers within the DR settings window on the master machine to prevent remotes from closing the DR session.

    Can anyone elaborate on exactly why this is necessary? I'd have thought that remote machines should only need resolving once from the master and would stay active until the render server window is closed.

    I've also had instances of remote machines apparently picking up the render job (based on line feedback showing processing occurring) only to have active procs then NOT show up in the master VFB. When I've gone back to look at the output on the remote machines, the DR session has again closed itself.

    It's a great feature, but I'm finding it's taking up a lot of time trying to keep remote machines working reliably, and having to wait until I see active buckets from those boxes before I walk away from a render.

  • #2
    What kind of DR setup do you have currently? Our DR works flawlessly. You are supposed to load the servers into the list once, start the spawner on the servers, and render. They should free up once the render job is complete. We start the spawner from the command line via Deadline, so once DR has started, those particular servers can't contribute to any other farm rendering.
    Dmitry Vinnik
    Silhouette Images Inc.
    ShowReel:
    https://www.youtube.com/watch?v=qxSJlvSwAhA
    https://www.linkedin.com/in/dmitry-v...-identity-name

    • #3
      Sorry, I should have been clearer. I am NOT using any sort of render management software -- this is a very simple setup with (currently) three machines which are loaded directly into the DR settings section of Vray's render globals on the master machine. Mostly, as you say, it does work fine -- and once machines pick up they generally will continue to work fine until the render is finished or killed (though occasionally one will close its session mid-render, generating a "server x is not responding" error in Maya's feedback line).

      The main recurring problem is getting machines to actually pick up to begin with. Sometimes this works as expected; other times I have to keep restarting the server service on the remote machines. The DR window also seems to revert to NOT displaying the servers' IP addresses between every render session, so I have to hit the resolve button every single time just to confirm that they are still online. Having to do this is a particularly irksome quirk of the way DR seems to function.
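
      As a rough way to confirm that the remotes are online before starting a render, independent of the resolve button, something like the sketch below could be run on the master. It only checks that each host resolves and accepts a TCP connection; it does not talk to the render server itself. The port number 20207 is an assumption about the default render server port and should be checked against your own setup, and the host addresses in the usage comment are made up.

```python
import socket

# Assumption: 20207 is the port the render server listens on. Verify this
# against your own installation before relying on the result.
DR_PORT = 20207


def check_server(host, port=DR_PORT, timeout=2.0):
    """Return (resolved_ip, reachable) for one render server.

    resolved_ip is None when the host name cannot be resolved.
    reachable is True only when a TCP connection to the port succeeds.
    """
    try:
        ip = socket.gethostbyname(host)
    except OSError:
        return None, False
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        return ip, False


def check_all(hosts, port=DR_PORT, timeout=2.0):
    """Check every host and return {host: (resolved_ip, reachable)}."""
    return {host: check_server(host, port, timeout) for host in hosts}


# Example usage (hypothetical addresses):
#   for host, (ip, ok) in check_all(["192.168.1.11", "192.168.1.12"]).items():
#       print(host, ip, "reachable" if ok else "NOT reachable")
```

      A successful connection only shows that something is listening on that port, so a server that is listening but stuck in a stale session would still pass this check.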

      • #4
        It's going to be hard to determine what the issues are, to be honest. I had a few issues, such as an improper V-Ray installation (or different V-Ray versions): we were installing from the command line and for some reason it didn't install entirely. That was under 3ds Max, though. We then switched to using deployment versions and it works quite well. The other issue was the firewall. Internally we have all Windows firewalls disabled, as we are using a hardware/software router firewall.
        What I can think of is that if a machine picks up a job and then exits mid-render, it probably crashed. That is probably not related to DR, but rather to some render exception. If your machines render fine, let's say, 90% of the time and stop responding the other 10% for whatever reason (this is common and difficult to troubleshoot), then in my experience it is easier to restart the DR spawner remotely from the command line than to try to figure out why it failed. For us, 100% of our network renders in Deadline go through just fine, so often the DR instability is caused by something else.
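
        To rule out the mismatched-version case mentioned above, a small comparison like the following sketch can help once you have collected the installed version string from each machine. How you collect the strings is up to you; the host names and version strings shown in the usage comment are purely hypothetical.

```python
def group_hosts_by_version(versions):
    """Group hosts by reported version string.

    versions maps host name -> version string (collected by hand or by
    whatever means you have). Surrounding whitespace is ignored.
    """
    groups = {}
    for host, version in versions.items():
        groups.setdefault(version.strip(), []).append(host)
    return {version: sorted(hosts) for version, hosts in groups.items()}


def versions_match(versions):
    """Return True when every host reports the same version string."""
    return len(group_hosts_by_version(versions)) <= 1


# Example usage (hypothetical hosts and version strings):
#   reported = {"master": "build A", "node01": "build A", "node02": "build B"}
#   if not versions_match(reported):
#       print(group_hosts_by_version(reported))
```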

        For the DR list window, I could be assuming things, but it seems the DR window likes to resolve to server names rather than their IP addresses. Are you using static or dynamic IPs? The instability might be related to dynamic IPs. We are using all static IPs, just FYI.
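
        If dynamic addresses are in play, one way to see whether they are actually changing between sessions is to record what each server name resolves to and compare it with an earlier record. This is only a sketch using the standard `socket` module; the host names and addresses in the usage comment are hypothetical, and where you store the earlier record is left open.

```python
import socket


def resolve_all(hosts):
    """Map each host name to its current IPv4 address, or None if lookup fails."""
    result = {}
    for host in hosts:
        try:
            result[host] = socket.gethostbyname(host)
        except OSError:
            result[host] = None
    return result


def changed_addresses(previous, current):
    """Return {host: (old_ip, new_ip)} for hosts whose address differs.

    Hosts missing from the previous record are reported with old_ip None.
    """
    return {
        host: (previous.get(host), ip)
        for host, ip in current.items()
        if previous.get(host) != ip
    }


# Example usage (hypothetical names; 'earlier' would come from a previous run):
#   earlier = {"node01": "192.168.1.11", "node02": "192.168.1.12"}
#   print(changed_addresses(earlier, resolve_all(["node01", "node02"])))
```

        An empty result across several sessions would suggest that address changes are not what is knocking the servers out of the list.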
        Dmitry Vinnik
        Silhouette Images Inc.
        ShowReel:
        https://www.youtube.com/watch?v=qxSJlvSwAhA
        https://www.linkedin.com/in/dmitry-v...-identity-name

        • #5
          Forgot to mention: what I would do to test stability is submit the same job to your mini farm via bb or another render manager, at low resolution and low settings, and get the machines to render something like 50-100 frames. If all of them render 100% of the frames with no issues using mayabatch.exe, then in theory there is no reason DR should not work the same way. Additionally, custom plugins may also affect this: mismatched or missing plugins can cause crashes. For example, I recently had an issue where having Arnold present in the scene (Arnold nodes) caused a crash 100% of the time on the farm on some jobs. Removing the Arnold plugin, the Arnold nodes from the scene, etc. resolved that issue. This was particularly odd because it worked on some jobs, and had worked without issues in the past, but suddenly it started to crash. I also had some issues where, during a V-Ray render in Deadline, Arnold would throw an exception and crash the render. Go figure.
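
          To check whether a test batch really rendered 100% of its frames, a quick script along these lines can list the gaps in the output folder. It is a sketch that assumes the outputs are named like `name.0001.ext` (a frame number between two dots); adjust the pattern if your naming differs. The file names in the usage comment are made up.

```python
import re

# Assumption: frame number sits between the last two dots, e.g. "test.0042.exr".
FRAME_RE = re.compile(r"\.(\d+)\.[A-Za-z0-9]+$")


def missing_frames(filenames, first, last):
    """Return the frame numbers in [first, last] with no matching output file."""
    rendered = set()
    for name in filenames:
        match = FRAME_RE.search(name)
        if match:
            rendered.add(int(match.group(1)))
    return [frame for frame in range(first, last + 1) if frame not in rendered]


# Example usage (hypothetical file names; os.listdir(output_dir) would be
# the usual source of the list):
#   missing_frames(["test.0001.exr", "test.0002.exr", "test.0004.exr"], 1, 5)
```

          If a particular machine keeps accounting for the missing frames, that points at the machine (install, plugins, firewall) rather than at DR itself.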
          Dmitry Vinnik
          Silhouette Images Inc.
          ShowReel:
          https://www.youtube.com/watch?v=qxSJlvSwAhA
          https://www.linkedin.com/in/dmitry-v...-identity-name
