Network Rendering with VRay Material Library

  • Network Rendering with VRay Material Library

    Hello all,

    I'm looking for some assistance in troubleshooting an intermittent issue we are experiencing.

    At our studio, we have 13 render nodes and 7 workstations. All machines are running identical versions of Max 2022 and VRay 5.2.3.

    We have installed the VRay Material Library on a shared drive, and each machine reaches that drive through a UNC path. The UNC path for the VRay Material Library is listed under User Paths > External Files (Image 1).

    Sometimes, when a machine is loading a file to render - either a frame of an animation OR as a Distributed Render (VRay Spawner) - the VRay Material paths fail to load properly and the render fails. The second image I've attached shows the Max log file entries of such an event.

    The solution I've found through trial and error is to open Max on the "problem" machine, load the VRay Material browser, and then close Max. When the "problem" machine is instructed to try the distributed job again, the paths are no longer missing and the render is able to initiate properly.

    So, obviously, this isn't a real solution because it keeps happening at seemingly random times. There is no pattern as far as I can tell. It does not happen on the same machine all the time, it does not matter which file is being rendered, and it does not matter which Illustrator is actually sending the file to the render manager (we use Pulze Render Manager).

    To give you an idea of frequency: if we are rendering an animation over the course of a few days (so, let's say 72 hours of constant rendering), 1-3 machines will fail to load the VRay assets at some point, requiring manual intervention.


    My only hunch right now is that our file server is unable to respond to all the requests from all the render nodes trying to access the same assets at the same time. The theory here is that VRay (or Max?) has a simple timeout timer running when attempting to access assets on a drive, and if the server does not respond quickly enough, Max assumes the files are missing and throws the error.

    I have this hunch because we have experienced network latency issues when trying to play video, load sequences into After Effects, or save large Photoshop files. But again, it's just a hunch.
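One way to gather evidence for or against this hunch is to time repeated accesses to the library folder from a render node while a job is running. The sketch below is only an illustration: `probe_share`, the `slow_after` threshold, and the UNC path are hypothetical, and it does not reproduce how Max or V-Ray actually resolve assets (whether either uses a timeout is unconfirmed).

```python
import os
import time


def probe_share(path, attempts=5, slow_after=1.0):
    """Time repeated directory listings of `path`.

    Returns a list of (elapsed_seconds, status) tuples, where status is
    "ok", "slow" (listing took longer than `slow_after` seconds), or
    "error" (the listing raised OSError, e.g. path not reachable).
    """
    results = []
    for _ in range(attempts):
        start = time.perf_counter()
        try:
            os.listdir(path)
            status = "ok"
        except OSError:
            status = "error"
        elapsed = time.perf_counter() - start
        if status == "ok" and elapsed > slow_after:
            status = "slow"
        results.append((elapsed, status))
    return results


if __name__ == "__main__":
    # Hypothetical UNC path; substitute the real library location.
    library = r"\\fileserver\share\V-Ray Material Library"
    for elapsed, status in probe_share(library):
        print(f"{status:5s} {elapsed * 1000:.1f} ms")
```

Logging these results on several nodes over a long render would show whether slow or failed listings line up with the failed jobs.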

    The file server is a Dell R740xd with a RAID-10 array of SSDs. Our LAN speed is 1 Gbps, and all machines are one switch away from the file server. The switch is a fairly new UniFi switch.

    Has anyone else experienced network rendering issues like what I've described? The file paths for the material library are clearly correct, but the accessing of those files is not stable for some reason.

    Thank you in advance for any advice or ideas on how to solve this!

  • #2
    I am not sure whether it is a Max issue. If it is a file server issue, you could try disabling oplocks in Samba, either completely or for the file types in question.

    HOWEVER, given that it can't find *any* assets, I don't think that will help you. It seems more like it can't access an entire directory or server.

    How many render nodes are hitting the server? TCP should back off if packets are indeed getting dropped.

    I would definitely bump up to 10 or 25Gbit for the LAN if you have more than a few clients. It makes a huge difference overall, and that machine you describe can handle over 10Gbit. However, I am not sure this is your problem.

    Is the server at least connected to the switch at 10Gbit, or using some trunking or LACP with multiple 1Gbps links? If not, then you are indeed trying to pull everything through a single 1Gbps link.
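As a rough illustration of what a single shared link means per client, the arithmetic can be sketched as follows. The function name is made up for this example, the node count of 13 comes from the original post, and the figures are theoretical ceilings that ignore protocol overhead and uneven load.

```python
def per_node_mb_per_s(link_gbps, nodes):
    """Theoretical fair share of one link, in megabytes per second."""
    link_mb_per_s = link_gbps * 1000 / 8  # 1 Gbps = 125 MB/s before overhead
    return link_mb_per_s / nodes


print(round(per_node_mb_per_s(1, 13), 1))   # 13 nodes on 1 Gbps  -> 9.6
print(round(per_node_mb_per_s(10, 13), 1))  # 13 nodes on 10 Gbps -> 96.2
```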

    We use mapped drive letters and have no troubles. Some people swear by mapped drives, and others swear by UNC.

    When you reconnect manually... when you first get to the machine can you see the server and access it from the render node?

    You say you have to launch Max and open the VRayMatLib in order to clear the issue. We actually copy assets into our individual scene directories so they get archived properly. So I don't have experience with the Mat Lib beyond finding initial materials and copying them over. It almost sounds like it is somehow negatively caching the server.

    Hopefully somebody else has better info for you.


    • #3
      Originally posted by Joelaff
      I am not sure whether it is a Max issue. If it is a file server issue, you could try disabling oplocks in Samba, either completely or for the file types in question.

      HOWEVER, given that it can't find *any* assets, I don't think that will help you. It seems more like it can't access an entire directory or server.

      How many render nodes are hitting the server? TCP should back off if packets are indeed getting dropped.

      I would definitely bump up to 10 or 25Gbit for the LAN if you have more than a few clients. It makes a huge difference overall, and that machine you describe can handle over 10Gbit. However, I am not sure this is your problem.

      Is the server at least connected to the switch at 10Gbit, or using some trunking or LACP with multiple 1Gbps links? If not, then you are indeed trying to pull everything through a single 1Gbps link.

      We use mapped drive letters and have no troubles. Some people swear by mapped drives, and others swear by UNC.

      Hey, thanks for your thoughtful response!

      Yes, the file server is connected to the switch through a single 1 Gbps LAN connection, so I'm sure it's bottlenecked to that speed, unfortunately. I would love to upgrade to a 2.5 Gbps or 10 Gbps connection for our entire office, but it is not in the cards right now.

      Our "farm" consists of 20 clients... they are either workstations or render nodes. At any given time, there are at least 13 nodes available for rendering.

      I have not tried using drive letters instead of UNC paths for the VRay Material library. That sounds like a worthwhile test, however. UNC paths are sometimes the only way to go, but other times a drive letter works better. I have no strong opinion one way or the other, I just use whatever works best!


      Originally posted by Joelaff
      When you reconnect manually... when you first get to the machine can you see the server and access it from the render node?

      You say you have to launch Max and open the VRayMatLib in order to clear the issue. We actually copy assets into our individual scene directories so they get archived properly. So I don't have experience with the Mat Lib beyond finding initial materials and copying them over. It almost sounds like it is somehow negatively caching the server.

      Hopefully somebody else has better info for you.
      When I connect to the problem machine via RDP, there isn't any indication that anything is wrong with its connection to the file server. The drive is mapped and available, and all assets are showing like they should.

      As far as asset collection is concerned, we do the same as you with project-specific files. However, it usually doesn't make sense to do that with the VRay Material Library materials because we're usually just using those as generic materials, like generic concrete, metal, etc. It's a waste of time and storage to replicate those files all over the place.

      Anyway, thanks again for your response. I'll dig a bit deeper and see if I can't find any more clues. In the meantime, if anyone else has any ideas, let me know! Thanks!


      • #4
        FWIW, to anyone who is experiencing the same issue with VRay Library Materials... the behavior I've described does NOT happen when the material paths are literal paths instead of the typical "\assets\V-Ray Material Library\...".

        So, this is definitely an issue with the way Max and/or V-Ray handle external file paths. Of course, there might also be a network speed component, but that is not the primary cause as far as I can tell.

        To fix the problem, I stripped all the "V-Ray Material Library" material paths (i.e. any bitmap paths like "\assets\brick_something_texture.tx"), and re-pathed those materials to the same exact spot on our network drive, but with a drive letter path instead. The issue does not happen anymore.
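The path mapping behind that fix can be sketched roughly as below. This only shows the string transformation: `to_literal_path` and the `Z:\MaterialLibrary` root are hypothetical names for illustration, and the actual re-pathing inside Max (done by hand or with a script) is not shown here.

```python
from pathlib import PureWindowsPath


def to_literal_path(asset_path, library_root):
    """Turn a root-relative asset path into a full drive-letter path.

    Paths that already carry a drive letter or UNC share are returned as-is.
    """
    p = PureWindowsPath(asset_path)
    if p.drive:
        return str(p)
    # Drop the leading backslash so the join keeps library_root's drive.
    relative = PureWindowsPath(*p.parts[1:]) if p.root else p
    return str(PureWindowsPath(library_root) / relative)


# Hypothetical library root on a mapped drive.
print(to_literal_path(r"\assets\brick_something_texture.tx", r"Z:\MaterialLibrary"))
```

`PureWindowsPath` is used so the sketch handles Windows-style paths the same way on any operating system and never touches the disk.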
