Hi everyone.
After heavy testing I recently solved my problems with the DR spawner. Now I have a solution that works nearly 100% of the time.
I had problems with DR all the time, ending up with crashed nodes and unfinished renderings where I had to restart all the nodes manually. That was a real pain in the ass, so I tried to figure out a reliable way. And I made it. But it's a somewhat crazy workaround, so tell me what you think about it.
So this is my solution:
1. Register the spawners as a service
Register the spawners as a service and restart the machines. After that they are present all the time, with "vrayspawner90.exe" and max running in the background.
(You have to restart the machines because the service is registered but not started. Alternatively, you can start the service manually.)
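As a minimal sketch of the manual alternative: assuming the service is registered under the name "VRaySpawner 90" (the name used in my restart batch; check the exact name on your machine), something like this in a command prompt on the node should start it without a reboot.

```bat
rem Start the spawner service on the local machine without rebooting.
rem "VRaySpawner 90" is assumed to be the service name on this node.
net start "VRaySpawner 90"

rem Optional check with PsService: the output lists the current service state.
psservice query "VRaySpawner 90"
```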
2. Give the spawner service access rights to the network drives
If necessary, modify the service in Computer Management under "Services" (my OS is German, so the English names may differ slightly) so that it starts under a specific username with the appropriate password. You have to restart the service afterwards.
(This was necessary for me because by default the spawner and the max session it launches run under the user "SYSTEM". I store my files on a Linux NAS server, where there is no user "SYSTEM", so the spawner and max have no access to that drive and therefore find no maps or XRefs. Once I provide a user with a password for the "VRaySpawner 90" service, max renders fine. I think this might be a solution for many users who report that the render nodes do not seem to find maps and XRefs. I have very often heard the advice to store all maps and XRefs locally on each machine, but this may address the problem in many cases without doing that.)
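If you prefer the command line to the Services dialog, the same change can probably be made with the Windows "sc" tool. This is only a sketch under assumptions: "RenderUser" and "MyPassword" are placeholders for an account your file server accepts, and "sc" expects the service's key name, which may differ from the display name (sc getkeyname "VRaySpawner 90" shows it). Unlike the dialog, "sc" does not grant the account the "Log on as a service" right, so that may have to be set separately.

```bat
rem Run in a command prompt with administrator rights on the node.
rem RenderUser / MyPassword are placeholders, not real credentials.
rem The space after "obj=" and "password=" is required by sc.
sc config "VRaySpawner 90" obj= ".\RenderUser" password= "MyPassword"

rem Restart the service so that it runs under the new account.
net stop "VRaySpawner 90"
net start "VRaySpawner 90"
```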
3. Batch-starting, -stopping and -restarting the nodes
Provide *.bat files that start, stop and restart the spawner services on all nodes with psservice. See the PSTools website.
(This way you can start, stop or restart all render nodes at once. So if any node crashes, simply restart them all. Many thanks to Clifton Santiago for the tip.)
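Instead of three separate files, one batch file that takes the action as a parameter would also do. A minimal sketch, assuming the file is called "spawners.bat" (a made-up name) and that the node names and credentials are replaced with your own:

```bat
@echo off
rem spawners.bat (hypothetical name). Usage: spawners.bat start|stop|restart
rem Node01..Node03 and MyUsername/MyPassword are placeholders.
if "%~1"=="" (
    echo Usage: %~nx0 start^|stop^|restart
    exit /b 1
)

rem Send the requested action to the spawner service on every node.
for %%N in (Node01 Node02 Node03) do (
    psservice \\%%N -u MyUsername -p MyPassword %~1 "VRaySpawner 90"
)
```

Calling "spawners.bat restart" then does the same as a dedicated restart file, and "spawners.bat stop" shuts all spawners down.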
4. Very important: Everything in your max file has to be UNC!
All maps, XRefs, *.ies files - every externally referenced file must use UNC path names.
(If not, the nodes simply crash, do nothing and have to be restarted. This is not the case if the spawners - that is to say the file "vrayspawner90.exe" in the max root folder - are started manually rather than as a service. Started manually, the nodes load files from mounted network drives properly.)
---------
So far so good. Now everything is in principle prepared for DR rendering. For me it does work in principle - but...
...it's not stable. The nodes always crash after a certain number of renderings (5 to 15, it's never the same). It is much more unstable if Win XP Home machines are involved in the rendering task (that's the case for me). When one or more slaves crash you can now use your *.bat files to restart the nodes, but this is not very handy.
Now the crazy stuff begins:
I decided to automate the render-node restart after every rendering.
----------
5. Restart all nodes via MAXScript after every rendering
So I wrote a MAXScript that launches the restart .bat file automatically after every render. The script checks whether VRay is the current renderer and whether "Distributed rendering" is checked (it makes no sense to restart the nodes if you are not rendering with them). It has to be placed in the startup folder for scripts (restart max afterwards). The result is that when you render with DR, all the slaves start rendering, and when the rendering is finished all slaves are restarted automatically, so that they are clean and fresh for the next rendering.
Advantage: DR rendering succeeds for me nearly 100% of the time. Sometimes a node quits rendering immediately, but in most cases this does not seem to disturb the other nodes. So the rendering is finished anyway, but maybe with one node missing. After it finishes, all nodes are restarted and fresh again for the next rendering.
Disadvantage: Restarting the nodes takes some time. I use 5 nodes and it takes about 30 seconds to restart them, so I have to wait 30 seconds before I can start the next rendering. If you start too early, maybe none of your nodes is ready and only the workstation renders. Maybe only some of them are ready; I think it can confuse DR when another node signs up while the job is being processed, but as far as I can tell the nodes that are ready take part and the others do not. In most cases I do not think this is a real disadvantage, because after a rendering finishes you go back to working on your scene, and not many tweaks get done in 30 seconds.
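If you do not want to guess when the 30 seconds are over, a small batch file could poll the services until they all report RUNNING. This is only a sketch: the file name, node names and credentials are placeholders, and a running service does not guarantee that max has finished loading on that node, so treat the result as a rough indication.

```bat
@echo off
rem wait-for-spawners.bat (hypothetical name)
rem Node01..Node03 and MyUsername/MyPassword are placeholders.
for %%N in (Node01 Node02 Node03) do call :wait %%N
echo All spawner services report RUNNING.
exit /b 0

:wait
rem Query the service state on node %1 and look for RUNNING in the output.
rem Note: this loops forever if the node never comes up (stop it with Ctrl+C).
psservice \\%1 -u MyUsername -p MyPassword query "VRaySpawner 90" | find "RUNNING" >nul
if errorlevel 1 (
    rem Pause roughly 2 seconds; ping is used because Win XP has no timeout command.
    ping -n 3 127.0.0.1 >nul
    goto wait
)
exit /b 0
```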
------------
So all in all I am very, very happy with this solution, because now I can use DR - without this kind of reliability it wasn't usable at all.
If you want to try this workaround: This is the "restart VRay-spawners.bat" file I work with:
Code:
psservice \\Node01 -u MyUsername -p MyPassword restart "VRaySpawner 90"
psservice \\Node02 -u MyUsername -p MyPassword restart "VRaySpawner 90"
psservice \\Node03 -u MyUsername -p MyPassword restart "VRaySpawner 90"
psservice \\Node04 -u MyUsername -p MyPassword restart "VRaySpawner 90"
psservice \\Node05 -u MyUsername -p MyPassword restart "VRaySpawner 90"
psservice \\Node06 -u MyUsername -p MyPassword restart "VRaySpawner 90"
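Tip: psservice.exe needs to be placed in the Windows "System32" folder. You can download it, along with usage information, from the PSTools website.
...and this is the MAXScript I use in the startup folder (you have to adapt both snippets to your setup):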
Code:
/*
damRestartVRaySpawners
(c) Sascha Selent, 2008
mailbox@deepartmend.de
Launches .bat-file to restart VRay Spawners via psservice
*/
fn damRestartVRaySpawners =
(
    ShellLaunch "C:\\restart VRay-spawners.bat" ""
)
callbacks.addscript #postrender "if (classof renderers.current == V_Ray_Adv_1_50_SP2) and (renderers.current.system_distributedRender == true) do damRestartVRaySpawners()" id:#damRestartVRaySpawners
----------
So, maybe this can help some people get DR to work, and maybe it can give the people at ChaosGroup some ideas for doing things better, but...
...I think DR is completely out once RealtimeRendering is in, eh? But who knows? Maybe some of these issues apply to RealtimeRendering as well?
Best regards.
Sascha
EDIT: And just to mention the benefit of DR: My cute little test scene, rendered with DR = 4 min 1 sec, without DR = 11 min 6 sec