
distributed rendering problems


  • distributed rendering problems

    I'm using distributed rendering to process a few thousand files, and I'm getting occasional failures on the slave processors.

    Every so often, every slave renderer will display a message reading "Error receiving DR scene (0), closing DR session". At this point, all I can do is shut down all instances of DRSpawner and restart them.

    But this has the further complication that if I try to restart DRSpawner while rendering, DRSpawner displays "Receiving DR scene from xxx.xxx.xxx.xxx" and immediately allocates 2 GB of memory (as seen by monitoring the "VM Size" column in Task Manager). DRSpawner then crashes a few seconds later.

    So what I end up having to do is to shut down my rendering and restart all the DRSpawners. This happens several times each day.

    I have verified that all my slave renderers are using the same version of V-Ray as I'm using in Rhino.

    Anyone have any ideas what's happening, and what I can do about it?

    - Rich Wells

  • #2
    Re: distributed rendering problems

    I would guess you're using a 32-bit OS, right? A 64-bit OS could be a solution, since it gives the spawners more RAM. In my experience, the master needs more RAM than the spawners, so I'm surprised that the spawner crashes and not the master.

    If it's a different failure, then try looking at the process window of the spawners; maybe you'll find a trace there.
    www.simulacrum.de - visualization for designer and architects



    • #3
      Re: distributed rendering problems

      Originally posted by Micha
      I would guess you're using a 32-bit OS, right? A 64-bit OS could be a solution, since it gives the spawners more RAM. In my experience, the master needs more RAM than the spawners, so I'm surprised that the spawner crashes and not the master.
      I don't think RAM is the issue. When working properly, the slaves use about 500 MB of memory. It's only when they get the "Error receiving DR scene" message that they go nuts and try (unsuccessfully) to allocate 2 GB.

      I have yet to render more than 7 images before this problem occurs. Having to shut down my automated process, restart the slaves, and restart the automation every 5-10 images is not going to work. I have about 6000 images to render over the course of the next few months.
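Until the root cause is found, one way to make each forced restart cheaper is to have the automation skip images that already exist on disk. The following is a minimal sketch of that idea, not part of V-Ray or Rhino: `pending_jobs`, the output directory, and the `.png` suffix are all hypothetical names chosen for illustration.

```python
from pathlib import Path


def pending_jobs(scene_files, output_dir, suffix=".png"):
    """Return the scene files that have no rendered image yet.

    Assumption: each scene file "name.ext" renders to
    "<output_dir>/name<suffix>". Adjust to your own naming scheme.
    """
    out = Path(output_dir)
    return [
        scene
        for scene in scene_files
        if not (out / (Path(scene).stem + suffix)).exists()
    ]
```

After restarting the spawners, the automation would call `pending_jobs(...)` again and render only what it returns, so completed images are not redone.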

      If it's a different failure, then try looking at the process window of the spawners; maybe you'll find a trace there.
      That's where I'm seeing the "Error receiving DR scene" message. There's nothing unusual in the output before that point.

      - Rich Wells



      • #4
        Re: distributed rendering problems

        Maybe it's a problem with the textures used and this bug:

        http://forum.asgvis.com/index.php?topic=6319.0
        www.simulacrum.de - visualization for designer and architects



        • #5
          Re: distributed rendering problems

          Originally posted by Micha
          Maybe it's a problem with the textures used and this bug:

          http://forum.asgvis.com/index.php?topic=6319.0
          I appreciate the response, but I'm quite sure that has nothing to do with my problem. When the render does work, I get the right results. I just can't get the distributed processing to work.

          Here's what I've tried:

          Fewer rendering slaves (4 instead of 9) - no success.
          Lower resolution (1536x1536 instead of 2800x2800) - no success.

          Anyone have any other ideas?

          - Rich Wells



          • #6
            Re: distributed rendering problems

            Is it possible the "Error receiving DR scene" is due to a time-out in DRSpawner? I had Rhino running on a virtual machine that was getting bogged down somehow and getting little CPU time, so it was taking an inordinately long time to transfer the scene data to the slaves.

            If that's the case, is there a way to increase the time-out in DRSpawner?
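To test the slow-connection theory independently of DRSpawner, a rough check is to time how long a plain TCP connection from the host to each slave takes. This is only a sketch under assumptions: `connect_time` and `check_slaves` are hypothetical helpers, and the port must be whatever your spawner actually listens on (it is not given in this thread).

```python
import socket
import time


def connect_time(host, port, timeout=5.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.perf_counter()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.perf_counter() - start
    except OSError:
        # Covers refused connections, unreachable hosts, and time-outs.
        return None


def check_slaves(slaves, port, timeout=5.0):
    """Map each slave address to its connect time (None if unreachable)."""
    return {host: connect_time(host, port, timeout) for host in slaves}
```

Running `check_slaves([...], port)` from the virtual machine while it is bogged down, and again when it is idle, would show whether connection latency changes noticeably. It measures connection setup only, not the scene transfer itself.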

            - Rich Wells



            • #7
              Re: distributed rendering problems

              I think you're on the right track with that assessment, Rich. After making sure the spawners are updated and running the correct version, the other common cause of issues is the "in between" transfer process, which is highly dependent on your network setup, firewalls, etc.

              I don't know exactly what the time-out is on the spawners, but I do know that it's not that hard to exceed it. The issue with extending the time-out is that the render process as a whole will suffer significantly (either you wait longer before getting no contribution, or each contribution takes so long that the benefits of doing the work on the other machine are diminished). If timing out does seem to be at the root of the issue, then I would try to see if there's a way to get faster communication between the host and slave computers. I would say that the setup on the host computer probably plays the biggest role, with the network contributing significantly as well. The first thing I would check on the host computer is that Low Thread Priority is not enabled. That would definitely put transferring info to the spawners on hold if any other processor-intensive process comes up.
              Damien Alomar
              Generally Cool Dude



              • #8
                Re: distributed rendering problems

                Hi, last week I had the same problem.

                Try changing your graphics output folder: give it a simple name and put it in another location. On my machines, the problem was that I used a name like >client + year< ("xxx09") plus a subfolder "clientGrafik 09".
                In this case, I had to use two project folders (one for the .3dm files and one for the graphics) with no subfolder, and it now works fine with no further problems.
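The post does not say exactly which characters caused trouble, so the following is only a sketch that flags path components containing anything other than letters, digits, `_`, `-`, and `.`. Treating spaces and other punctuation as "not simple" is an assumption, and `awkward_parts` is a hypothetical helper.

```python
import re
from pathlib import PureWindowsPath

# Assumption: a "simple" name uses only letters, digits, "_", "-" and ".".
SIMPLE_NAME = re.compile(r"^[A-Za-z0-9_.-]+$")


def awkward_parts(path):
    """Return the folder/file names in a Windows path that are not 'simple'."""
    p = PureWindowsPath(path)
    # Skip the drive/root component (e.g. "C:\\") when one is present.
    parts = p.parts[1:] if p.anchor else p.parts
    return [part for part in parts if not SIMPLE_NAME.match(part)]
```

An empty result means every component passes the check; any names returned are candidates for renaming before the next render.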

                Marcel
