Render Farm... Losing connection

  • Render Farm... Losing connection

    Hello.

    I'm having some trouble with an 8-node render farm I built.
    I have 8 nodes running XP Home, a P4 fileserver running XP Pro, and a workstation where I currently run the Backburner manager and monitor.
    I can render small scenes on all 8 nodes, but as the scenes (and the bitmap files they use) get a little larger, I can only render on 3 or 4.

    That's strange, because the paths are all right, I don't think the XP Pro connection limit is being reached, and there isn't much network traffic either!
    3dstudio just hangs in the "loading bitmaps" phase with the CPU at 0% (the only process at 99% is System Idle), and I can't access the fileserver from that node. Pretty soon all the nodes hang there.

    I notice that after this happens the fileserver shows a lot of connections (10, actually) in the CLOSE_WAIT state, even after closing 3dstudio, and I need to restart the server connection to access it again or wait a long time... maybe the 240 seconds.

    Does anyone have any idea what may be wrong?
    Is it something in the network settings, or is this some OS issue?

    Thanks a lot!

  • #2
    I notice that after this happens the fileserver shows a lot of connections (10, actually) in the CLOSE_WAIT state

    XP Pro only allows up to 10 connections. XP Home has a lot of network functionality stripped out. I would not run XP Home on your rendernodes.


    • #3
      Hello Dynedain
      The fact is that the 10 CLOSE_WAIT connections only appear "after" losing the connection. I lose the connection before reaching the connection limit.
      I don't know much about networks, but does "netstat -b" on the command line show all active connections?
      About XP Home: I thought the network functionality for the nodes could be very basic!
      Can you tell me more about this stripped-out functionality? Is there something in it that is really necessary?
      Thanks


      • #4
        it takes some time to close tcp connections - that's why you are seeing the CLOSE_WAIT state in netstat
        so you probably have more than 10 open connections, which is causing the problems

        try moving your maps onto a windows server machine or a linux machine running samba
        network attached storage (NAS) might also work, as most of them run special linux versions and samba

        one thing that doesn't work in xp home is network shares, at least not like in xp pro

        ps: you might want to try tcpview from sysinternals - graphical tool for looking at network connections
        http://www.sysinternals.com/Utilities/TcpView.html
        you can even use it to manually close connections
        tdimon does a similar thing in realtime
        http://www.sysinternals.com/Utilities/TdiMon.html
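
To see how many connections the fileserver actually holds in each state without a GUI tool, a minimal Python sketch like the one below could tally the text printed by `netstat -an`. The sample output and addresses are made up for illustration, the function name is hypothetical, and the parser assumes the Windows column layout (Proto, Local Address, Foreign Address, State).

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connections per state in the text printed by `netstat -an`."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # A TCP row has at least four columns: Proto, Local, Foreign, State.
        # UDP rows have no State column, and header lines do not start with TCP.
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


# Made-up sample; on the fileserver, feed in the real netstat output instead.
sample = """
  Proto  Local Address          Foreign Address        State
  TCP    192.168.0.1:445        192.168.0.11:1034      ESTABLISHED
  TCP    192.168.0.1:445        192.168.0.12:1040      ESTABLISHED
  TCP    192.168.0.1:445        192.168.0.13:1051      CLOSE_WAIT
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  UDP    0.0.0.0:445            *:*
"""

for state, count in count_tcp_states(sample).items():
    print(state, count)
```

On a live machine the input could come from `subprocess.run(["netstat", "-an"], capture_output=True, text=True).stdout`. Running it once with 4 nodes rendering and again right after adding a 5th would show whether the ESTABLISHED plus CLOSE_WAIT total really stays below 10.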


        • #5
          Before jumping to conclusions - try not to use the machine with your textures on it as a renderer and see if this problem still occurs. Then write back and we can go from there.
          LunarStudio Architectural Renderings
          HDRSource HDR & sIBL Libraries
          Lunarlog - LunarStudio and HDRSource Blog


          • #6
            Hi! Thanks again for the responses.
            Let me stress again that I don't reach the 10 connection limit.
            If I have 4 nodes rendering, the fileserver has only 4 established connections (occasionally a 5th, connecting to the workstation), but when I add another node to the render it loses connection.
            XP Home is on the nodes only. The fileserver has XP Pro.
            I also created a TcpTimedWaitDelay=0 secs entry in the registry, although it didn't help much.
            I'm trying TCPView right now... I'm guessing that only ESTABLISHED (and, unfortunately, CLOSE_WAIT) connections count towards the limit.
            By the way... the machine with the textures (the fileserver) does not render anything.
            About Linux... any advice on which version is good and simple, and where to download it?
            Well...any more ideas are welcome.
            Thanks
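
On the TcpTimedWaitDelay point: as far as I know, that registry value controls how long a socket lingers in TIME_WAIT, which is a different state from CLOSE_WAIT. A socket sits in CLOSE_WAIT when the remote side has already closed and the local program has not yet closed its own end, so a timer setting would not be expected to clear it. The loopback sketch below illustrates that sequence in Python; the function name is hypothetical and nothing here touches the fileserver.

```python
import socket


def peer_closed_demo() -> bytes:
    """Open a loopback connection, close the client end, then read on the server end."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    client.close()                  # the remote end hangs up
    data = server_side.recv(1024)   # an empty result means the peer has closed

    # Between the empty recv() and the close() below, the server-side socket
    # is the one that netstat would report as CLOSE_WAIT.
    server_side.close()
    listener.close()
    return data


print(repr(peer_closed_demo()))
```

If that reading is right, the 10 CLOSE_WAIT entries on the fileserver would be a symptom of the stalled sessions rather than their cause.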


            • #7
              For Linux, I set up Fedora Core 4 as a file server a few days ago and it didn't take much trouble. There are also good tutorials on the net on setting up Samba in different scenarios (mine was in a W2K03 domain).

              http://www.howtoforge.com has some good tuts.

              Best regards,
              Daniel Santana
              4+Arquitectos, lda
              Daniel Santana | Co-Founder / Technical Director
              You can do it! VFX
              Lisbon/Porto - Portugal
              http://www.ycdivfx.com


              • #8
                This may be a very obvious thing, but make very sure ALL the Home and Pro licenses in use have unique serial numbers.


                • #9
                  Before going crazy - check my suggestion above. Network outages can result from overloads when Max is running on the same computer as the texture drive. You'll have to install a low-priority script on that computer.
                  LunarStudio Architectural Renderings
                  HDRSource HDR & sIBL Libraries
                  Lunarlog - LunarStudio and HDRSource Blog


                  • #10
                    try using 2-3 machines to render and see if they render ok?

                    8 machines will push you over the 10 connection limit occasionally, so you should be running windows server 2003 (5 CAL licences are enough for infinite connections, don't ask me why)

                    The other problem I've had is referencing textures...

                    make sure each machine has identical mapped network drives. Make sure you tick "include textures" when sending the net render job, etc. It sounds like each node is trying to load textures individually?

                    Can you lower the resolution of the bitmaps? (just for something to try)

                    could be your switch? although this would report errors in the manager

                    Make sure each machine has permissions (try using the same user for all machines)

                    Make sure each machine has the same max plugins (a lot of people forget things like egz glass and wonder why it renders darker on some machines than others)

                    All obvious things, but you never know....maybe something above helps? GL!
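
The drive-mapping and permission checks above can be automated with a small script run on each node under the render account. This is only a sketch: the function name and the `M:\maps\...` paths are hypothetical, and it assumes Python is available on the nodes.

```python
import os
from typing import Iterable, List


def unreachable_paths(paths: Iterable[str]) -> List[str]:
    """Return the paths that are missing or not readable by the current user."""
    bad = []
    for path in paths:
        # isfile() fails for missing files and unmapped drives;
        # os.access() checks read permission for the current user.
        if not (os.path.isfile(path) and os.access(path, os.R_OK)):
            bad.append(path)
    return bad


if __name__ == "__main__":
    # Hypothetical texture paths; replace with the bitmaps the scene references.
    textures = [r"M:\maps\brick_diffuse.jpg", r"M:\maps\wood_bump.jpg"]
    for path in unreachable_paths(textures):
        print("cannot read:", path)
```

A node that prints nothing can see every listed file; a node that lists files the others can read points to a mapping or permission difference on that machine.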


                    • #11
                      I would definitely go the Linux route... we're running Fedora here and it has been flawless.
                      -----Dwayne D. Ellis-----
