Help! DR Client loses communication with DR Slave systems


  • Help! DR Client loses communication with DR Slave systems

    I am using V-Ray with Rhino 4, and I am using the distributed rendering (DR) feature with two slave systems that are running Vista Home Premium. The client DR system is running WinXP SP2.

    My problem:
    1) If I run small renderings, everything works great: building the light cache and rendering both get distributed across all 10 CPUs.
    2) If I render larger pictures that take any significant amount of time to build the light cache (like 30 minutes or more), building the light cache works for a while, but then the CPU usage on the slave systems goes to zero. Eventually, the rendering log reports that it can no longer communicate with the slave systems. Once communication with a slave is lost, it is not restored, and the rest of the light cache building and rendering is done using only the client CPUs.

    I thought this was a Vista power issue (perhaps it was shutting down in the middle), but I now have all power modes turned off. My network connection seems to be fine.

    Questions:
    1) Could Vista be causing the problem?
    2) What would cause the slave systems to stop communicating? Is there a log that would help resolve this?

    Help! My distributed rendering is broken and I have a huge deadline!!!

  • #2
    Re: Help! DR Client loses communication with DR Slave systems

    I think Vista could definitely be contributing to the problem. I'm not sure exactly why or how, but I wouldn't put it past Vista. It may be Vista blocking the port after a certain amount of inactivity (which would explain why the smaller renderings work fine), or it may just be a Vista-to-XP issue.

    Question: are the DR spawners crashing on the slaves, or do they just lose communication? I've seen the spawner crash a bit, which would cause the issue that you're talking about, but to simply lose communication is a bit odder.
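One rough way to tell a blocked port from a crashed spawner is to probe the spawner's TCP port from the client machine at intervals. The sketch below uses only Python's standard `socket` module. The function names are made up for illustration, and the port number in the usage comment is an assumption: substitute whatever port your spawner actually listens on. It only tests reachability, and how the spawner reacts to a bare probe connection is not something this sketch knows, so trying it between renders first is the cautious option.

```python
import socket
import time


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def watch_port(host, port, checks=5, interval=60.0):
    """Probe host:port `checks` times and return a list of (timestamp, is_open)."""
    results = []
    for i in range(checks):
        results.append((time.time(), port_is_open(host, port)))
        if i < checks - 1:
            time.sleep(interval)
    return results


# Example usage (not run here). The address is a placeholder and the port
# is an ASSUMPTION, not a value confirmed anywhere in this thread:
#
#   for stamp, ok in watch_port("192.168.0.10", 20204, checks=30, interval=60.0):
#       print(time.strftime("%H:%M:%S", time.localtime(stamp)),
#             "open" if ok else "unreachable")
```

If the port stays reachable while the slave's CPU usage drops to zero, a firewall timeout looks less likely and the spawner itself becomes the more interesting suspect.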

    As far as helping you out with your pressing deadline, LC solutions are resolution independent. What that means is that you can calculate your LC pass at your small resolution, which seems to be working, save it, then load it up for the final render. You can do this with IR too if you'd like.
    Damien Alomar
    Generally Cool Dude



    • #3
      Re: Help! DR Client loses communication with DR Slave systems

      I am rebuilding both machines with XP SP2 and will try that. I will definitely try saving the light cache pass to a file and then doing the final rendering from the file. That's a great idea...

      Thanks! I'll let you know how it goes.



      • #4
        Re: Help! DR Client loses communication with DR Slave systems

        No luck with Windows XP. It does the same thing...
        It runs for a while on the slave systems. When everything is running well, the last output on the spawner is the jpg files it loaded for my bitmaps...

        At the point of the failure, the output is as follows:
        Threads completed
        Merging light cache passes.
        Merging light cache passes...
        Light contains 34100 samples.
        Light cache samples collected; compiling lookup tree.
        Prefiltering light cache.
        Prefiltering light cache...
        Clearing global light manager.
        Clearing direct light manager.
        Clearing ray server.
        Clearing geometry.
        Clearing camera image sample.
        Clearing camera sampler.
        Clearing QMC Sampler.
        Clearing Path sampler.
        Clearing color mapper.

        Both slaves are sitting at this point with no CPU activity. The client system is still running and compiling the light cache.
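If the spawner output can be captured to a text file, a small check like the one below can flag a slave that has already reached the "Clearing ..." cleanup lines while the client is still working. This is only a sketch: the helper names are hypothetical, and it relies on nothing more than the line prefixes visible in the output pasted above.

```python
def _nonempty_lines(log_text):
    """Split log text into stripped, non-empty lines."""
    return [line.strip() for line in log_text.splitlines() if line.strip()]


def teardown_started(log_text):
    """Return True if the log contains any 'Clearing ...' cleanup line."""
    return any(line.startswith("Clearing ") for line in _nonempty_lines(log_text))


def last_stage(log_text):
    """Return the last non-empty log line, or None for an empty log."""
    lines = _nonempty_lines(log_text)
    return lines[-1] if lines else None


# A few lines copied from the spawner output in this post.
sample = """Prefiltering light cache...
Clearing global light manager.
Clearing color mapper.
"""
print(teardown_started(sample))  # True
print(last_stage(sample))        # Clearing color mapper.
```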

        Help!

        Also, are there instructions on how to save the light cache pass to a file and render from the file? I found online docs, but they did not match the V-Ray for Rhino menus....



        • #5
          Re: Help! DR Client loses communication with DR Slave systems

          Here is the output on the client system right before and after the failure:

          Server 123.123.123.104: Starting Frame 0
          Server 123.123.123.103: Starting Frame 0

          Slaves chugging along for 10 minutes or so.....

          Right at the point where the slave systems lose communication, the client system outputs:

          Light Cache Time Interval is (0.0000000, 0.000000)
          Light Cache Time Interval is (0.0000000, 0.000000)



          • #6
            Re: Help! DR Client loses communication with DR Slave systems

            Hmmm...this is even crazier, because the whole "clearing this and that" sequence in the log is the process done at the end of a render. Basically the whole system (host included) looks like it's going through the LC calculation, prefiltering it, and then just thinking it's done. The prefiltering and merging of the LC solution is a single-threaded operation, meaning that the slaves won't be able to contribute to it, but I still don't think those things should be reported to the log (hopefully I'm wrong). Also, the larger the LC calculation and the more phases it has, the longer the merging/prefiltering will take.

            As far as the interval thing, that's something that gets reported to the log when the LC pass is assigned to a certain thread, not when it's finished. Question: do you have the number of phases in the LC parameters set to match the number of available threads? You should do that, because it may be that there aren't enough phases in the LC calc to even be sent to the slaves (e.g. if you had the phases set to 2 and you had a dual-core machine, the slaves wouldn't have anything to work on).
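The phases-versus-threads point can be illustrated with a toy model. This is a sketch under a stated assumption: phases are handed out one per thread, host first and then each slave in order. That ordering follows the explanation in this post, not V-Ray's actual scheduler, and the 2 + 4 + 4 thread split is a guess at how the 10 CPUs mentioned earlier in the thread are divided.

```python
def phases_per_machine(phases, host_threads, slave_threads):
    """Hand out light cache phases one per thread, host first, then slaves.

    ASSUMPTION: host-first, one-phase-per-thread assignment. This is a
    simplified model for illustration, not V-Ray's real scheduling logic.
    Returns a list: [host_share, slave_1_share, slave_2_share, ...].
    """
    remaining = phases
    assigned = []
    for threads in [host_threads] + list(slave_threads):
        take = min(threads, remaining)
        assigned.append(take)
        remaining -= take
    return assigned


# 2 phases with a dual-core host: nothing is left for the slaves.
print(phases_per_machine(2, 2, [4, 4]))   # [2, 0, 0]

# 10 phases for 10 threads (assumed split 2 + 4 + 4): every thread gets one.
print(phases_per_machine(10, 2, [4, 4]))  # [2, 4, 4]
```

Under this model, any phase count below the total thread count leaves some slave threads idle during the light cache pass.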

            To save the LC solution, either set it up to auto save or click the save button. After you've made the file, enable the From File mode and set the file path there.
            Damien Alomar
            Generally Cool Dude
