Only certain operations can be efficiently run in multiple threads. In the case of your simulation, the next frame is dependent on the previous frame, so even if you gave each frame its own thread, each thread would sit blocked until the frame before it had finished simulating.
For example, the (crude) timing diagram below shows how the calculations are distributed. F1, F2, F3, etc. mark the points at which a given frame has finished simulating. The lines show when a CPU is high (calculating) or low (idling).
You can immediately see the problem: each successive processor waits on the result of the previous one. So 1 would be the initial frame, 2 would depend on 1, 3 on 2, 4 on 3 ... n on n - 1. Once n is greater than the number of cores you have, the work wraps back around to the first core.
Code:
                F1       F2       F3  ...  Fn
        ________
CPU 1: |        |_____________________
                 ________
CPU 2: _________|        |____________
                          ________
CPU 3: __________________|        |___
  :
CPU N
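If it helps, here is the same dependency chain as a rough Python sketch. This is my own illustration, not your simulation code: `simulate_frame` and `run_chained` are made-up names, and the "state" is just a counter. Each frame is handed to a thread pool, but every task first has to wait for the previous frame's result, so however many workers you give it, only one of them is ever calculating at a time.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def simulate_frame(previous_state):
    """Hypothetical stand-in for one simulation step."""
    return previous_state + 1


def run_chained(frames, workers=4):
    """Submit one task per frame (frames >= 1); each task waits on the one before it."""
    active = 0  # workers currently calculating
    peak = 0    # highest value 'active' ever reached
    lock = threading.Lock()

    def task(prev_future):
        nonlocal active, peak
        # Blocks here until the previous frame is done: the "low" stretch
        # of the timing diagram.
        prev = prev_future.result() if prev_future is not None else 0
        with lock:
            active += 1
            peak = max(peak, active)
        state = simulate_frame(prev)
        with lock:
            active -= 1
        return state

    with ThreadPoolExecutor(max_workers=workers) as pool:
        future = None
        for _ in range(frames):
            future = pool.submit(task, future)
        return future.result(), peak


final_state, peak_busy = run_chained(frames=10, workers=4)
print(final_state, peak_busy)
```

With four workers and ten frames, `peak_busy` still comes out as 1, which is the diagram in numbers: the other workers exist, but they spend their time waiting.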
I hope this makes sense.
Last edited by NextDesign; 26-06-2012 at 11:29 PM.