I'm certain many have thought of this before, and likely it's been discussed here. I haven't found it in the 10 most recent pages though and am not too keen on poring back through all 93, so I apologize if this is a redundant request.
I suppose its feasibility may hinge on deeper changes to the host program than a renderer plugin has access to, but I think it's worth throwing out there.
I'm unaware of an established term for it, so I'll just call it "Frame Endlapping" or "Cascading" for now; the idea is to maximize CPU use across animated sequences. The following explanation's pretty elementary and I'm sure everyone's familiar with it, but I'm a fan of being thorough/explicit, so I apologize if it's a bit of a belaboured read.
So I'm sure everyone's noticed that most (if not all) bucketed renderers will, near the end of a frame, let CPU cores sit idle once they've finished their last bucket while they wait for the other logical cores to finish. In some cases a few regions of a given frame take far longer than others due to local geometric/ray/material/etc. complexity, extending the idle time for "finished" cores while those last buckets are handled. Once the frame is finished and saved to disk, the renderer prepares the scene/geometry for the next frame, which appears to be limited to one core and, while fairly quick in most cases, leaves every other CPU resource idle. Next, depending on your GI solution of choice, the renderer goes through its prepass stages, which have their own intermittent idle periods. Understanding that the start of one prepass is contingent on the previous one completing, I'm less concerned about most of those in this request.
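To make the behaviour I'm describing concrete, here's a toy model of the status quo in Python. Everything in it (the worker loop, the queue, the bucket and frame counts) is my own invention for illustration, not a claim about how V-Ray is actually written:

```python
# Toy model of the status quo: a hard barrier at every frame boundary.
# Pure illustration of the scheduling -- nothing to do with V-Ray's real code.
import queue
import threading

NUM_WORKERS = 6          # logical cores
BUCKETS_PER_FRAME = 48   # arbitrary

def render_bucket(bucket):
    pass                 # stand-in for the actual bucket render

def render_frame(bucket_queue):
    def worker():
        while True:
            try:
                bucket = bucket_queue.get_nowait()
            except queue.Empty:
                return   # this core now idles until the slowest one finishes
            render_bucket(bucket)
    threads = [threading.Thread(target=worker) for _ in range(NUM_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()         # <- the barrier: every core waits on the last bucket

for frame in range(1000):
    q = queue.Queue()    # single-threaded scene prep fills the queue, serially
    for b in range(BUCKETS_PER_FRAME):
        q.put(b)
    render_frame(q)      # frame N+1's prep can't even start until this returns
```

The join() is where the early finishers go to sleep, and the next frame's single-threaded prep can't begin until it returns.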
If you watch the time/usage graph in Resource Monitor, this shows up as one or more troughs between frames as cores step down to low usage and only one is used to prepare the next frame. While I'm sure our CPUs enjoy the short chance to breathe and cool off a degree or two, we'd rather keep them fully loaded throughout the whole sequence. Thus, what I'm wishlisting is a way for cores that have finished their final buckets to get a jump-start on the next frame (sketched below).
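Here's the same toy model with the cascade added: the first worker to run out of buckets prepares the next frame and starts on it, rather than idling. Again, the scheduler class and its rules are purely hypothetical, just to pin down what I'm wishing for:

```python
# Toy model of the "cascading" idea: the first worker to run out of buckets
# prepares the next frame instead of idling. My own invention for
# illustration -- not a claim about how V-Ray is structured internally.
import queue
import threading

NUM_WORKERS = 6

def render_bucket(frame, bucket):
    pass  # stand-in for the actual bucket render

class CascadingScheduler:
    def __init__(self, num_frames, buckets_per_frame):
        self.lock = threading.Lock()
        self.num_frames = num_frames
        self.buckets_per_frame = buckets_per_frame
        self.next_frame = 0
        self.queues = []
        self._prepare_next_frame()

    def _prepare_next_frame(self):
        # Stand-in for scene/geometry prep and loading the frame's saved
        # vrmap. In a real renderer, frame N's data would still be resident
        # while its last buckets render -- hence the roughly 2x RAM cost.
        q = queue.Queue()
        for b in range(self.buckets_per_frame):
            q.put((self.next_frame, b))
        self.queues.append(q)
        self.next_frame += 1

    def get_bucket(self):
        with self.lock:
            while self.queues:
                try:
                    return self.queues[0].get_nowait()
                except queue.Empty:
                    self.queues.pop(0)   # this frame is fully dispatched
            if self.next_frame < self.num_frames:
                # The cascade: the first idle core preps frame N+1 while the
                # other cores are still rendering frame N's last buckets.
                self._prepare_next_frame()
                return self.queues[0].get_nowait()
            return None                  # whole sequence dispatched

def worker(sched):
    while True:
        job = sched.get_bucket()
        if job is None:
            return
        render_bucket(*job)

sched = CascadingScheduler(num_frames=1000, buckets_per_frame=48)
threads = [threading.Thread(target=worker, args=(sched,)) for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The only structural change from the first sketch is that the frame boundary becomes a soft hand-off instead of a hard join().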
For example: my current scene uses an IR/LC solution with precalculated vrmaps (LC disabled after the prepass) and a great many PFlow particles in flight at once. Roughly counting out the interval between core 1 and core 6 finishing their final buckets, it seems to range between 4 and 12 seconds depending on the point in the sequence, and it shows up in that time/usage trough. It appears that, even in the case of the shortest idle time, the first core past the finish line on each frame could have the scene prepared and the saved vrmaps for the next frame loaded, and, along with cores 2-5 (on my six-core Thuban), have the first buckets of the render-time prepass started by the time core 6 finishes its final bucket on the outgoing frame. Accounting for the overhead of this cascading feature and staying conservative in my estimate, I figure that shaving even 3.6 seconds per frame by keeping all cores busier (if not at 99%+) would cut an hour from a 1000-frame sequence (3.6 s x 1000 frames = 3600 s). I reckon the same implementation in the IR animation prepass would save a good deal more time.
Obviously this only applies to animated sequences, and the time reduction is completely dependent on the scene and hardware, but I think, in my limited experience, the cumulative gain in render speed across the whole user base would make it a worthwhile investment.
I understand that this would basically require twice the RAM to hold two frames during these overlap periods, but I suspect many of us already have a hard time saturating system capacity with a large portion of our scenes. Personally, I usually find myself with plenty of my 16GB to spare. I think a feature like this would help us make better use of our hardware investment.
So the other obstacle I wonder about is whether a renderer can handle this on its own, or whether a host program like Max requires more fundamental changes to enable these cascading, triggered instances. What would it mean for the render dialogue window? Does V-Ray's integrated frame buffer make this more feasible than it is for other renderers?
I'd love to hear others' and Chaos Group's thoughts on this idea.