What you're looking for is a problem known as "depth from monocular cues": recovering depth from a single image. Computer scientists, neurologists, and artists have been working on this problem for years, and we're still nowhere close to a general solution, even today.
If you wanted to attempt this anyway, you would need a lot of information about the scene and the camera: focal length, film back size, and the other intrinsic and extrinsic camera parameters. Some of that comes from common knowledge about the camera, and the rest from complicated calibration. In short, there is no "simple" way.
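To make the camera parameters a bit more concrete, here is a minimal sketch of a pinhole projection, assuming an ideal lens with no distortion and a point already expressed in camera coordinates (the extrinsics would be applied before this step). The camera values and the helper names `focal_length_px` and `project` are made up for illustration. It shows how focal length and film back turn into pixel units, and why a single image is ambiguous: two points on the same ray at different depths land on the same pixel.

```python
def focal_length_px(focal_mm, film_back_mm, image_size_px):
    """Convert a focal length in millimetres to pixels along one image axis."""
    return focal_mm * image_size_px / film_back_mm


def project(point_cam, fx, fy, cx, cy):
    """Project a 3D point (X, Y, Z) in camera coordinates to pixel coordinates."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical camera: 35 mm lens, 36 x 24 mm film back, 1920 x 1280 image.
fx = focal_length_px(35.0, 36.0, 1920)
fy = focal_length_px(35.0, 24.0, 1280)
cx, cy = 1920 / 2, 1280 / 2  # assume the principal point is the image centre

near = (0.5, 0.25, 2.0)
far = tuple(4.0 * c for c in near)  # same viewing ray, four times farther away

pixel_near = project(near, fx, fy, cx, cy)
pixel_far = project(far, fx, fy, cx, cy)

print(pixel_near, pixel_far)  # both points project to the same pixel
```

Since the depth Z divides out of the projection, the pixel alone cannot tell you which of the two points you are looking at. That missing information is what the extra scene knowledge and calibration have to supply.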
Last edited by NextDesign; 12-03-2012 at 08:57 PM.