The existing ray_cast and closest_point_on_mesh functions on bpy.types.Scene and bpy.types.Object are fine for casting a few rays, but they become very slow when called many times. Although the internal BVHTreeFromMesh methods cache the actual BVHTree instance, they still allocate extra data on every construction (for poly index mapping) and also rely on the DerivedMesh not being updated. In addition, the RNA system does not allow creation of temporary non-DNA PyObjects.
The solution is the introduction of a new mathutils.bvhtree submodule (similar to mathutils.kdtree). A BVHTree instance can then be created in advance, used for many ray casts or closest-element lookups, and then discarded:

N = large_number

# previously
for i in range(N):
    ob.ray_cast(start, end)  # very slow due to constant overhead

# new
from mathutils import bvhtree
tree = bvhtree.DerivedMeshBVHTree(ob)  # BMesh variant is also available
for i in range(N):
    tree.ray_cast(start, end)  # much faster
del tree  # not strictly necessary, just explicit cleanup

Two variants of the BVHTree class are available at this point, one for object-based meshes (using DerivedMesh internally) and one for BMesh data.
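
To illustrate why building the tree once and reusing it pays off, here is a minimal pure-Python sketch of the same build-once, query-many pattern. It is not the mathutils.bvhtree implementation: the SimpleBVH class, its leaf_size parameter and the (distance, triangle_index) return value are hypothetical names chosen for this example, and the median split is just one simple way to build such a tree.

```python
# Hypothetical illustration only; not the mathutils.bvhtree implementation.

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _ray_triangle(origin, direction, tri, eps=1e-9):
    """Moller-Trumbore test: distance along the ray to the hit, or None."""
    v0, v1, v2 = tri
    e1, e2 = _sub(v1, v0), _sub(v2, v0)
    p = _cross(direction, e2)
    det = _dot(e1, p)
    if abs(det) < eps:  # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    t_vec = _sub(origin, v0)
    u = _dot(t_vec, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = _cross(t_vec, e1)
    v = _dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = _dot(e2, q) * inv
    return t if t >= 0.0 else None

def _ray_hits_box(origin, direction, lo, hi):
    """Slab test of a ray against an axis-aligned bounding box."""
    t_near, t_far = 0.0, float("inf")
    for axis in range(3):
        d = direction[axis]
        if abs(d) < 1e-12:  # ray parallel to this pair of slabs
            if origin[axis] < lo[axis] or origin[axis] > hi[axis]:
                return False
            continue
        t0 = (lo[axis] - origin[axis]) / d
        t1 = (hi[axis] - origin[axis]) / d
        if t0 > t1:
            t0, t1 = t1, t0
        t_near, t_far = max(t_near, t0), min(t_far, t1)
        if t_near > t_far:
            return False
    return True

class SimpleBVH:
    """Bounding volume hierarchy over triangles, built once in __init__."""

    def __init__(self, triangles, leaf_size=2):
        self.triangles = list(triangles)
        indices = list(range(len(self.triangles)))
        self.root = self._build(indices, leaf_size) if indices else None

    def _build(self, indices, leaf_size):
        pts = [v for i in indices for v in self.triangles[i]]
        lo = tuple(min(p[a] for p in pts) for a in range(3))
        hi = tuple(max(p[a] for p in pts) for a in range(3))
        if len(indices) <= leaf_size:
            return (lo, hi, indices, None, None)  # leaf node
        # Split at the median along the longest axis of the bounding box.
        axis = max(range(3), key=lambda a: hi[a] - lo[a])
        indices.sort(key=lambda i: sum(v[axis] for v in self.triangles[i]))
        mid = len(indices) // 2
        return (lo, hi, None,
                self._build(indices[:mid], leaf_size),
                self._build(indices[mid:], leaf_size))

    def ray_cast(self, origin, direction):
        """Return (distance, triangle_index) of the nearest hit, or None."""
        best = None
        stack = [self.root] if self.root is not None else []
        while stack:
            lo, hi, leaf, left, right = stack.pop()
            if not _ray_hits_box(origin, direction, lo, hi):
                continue  # skip the whole subtree
            if leaf is None:
                stack.extend((left, right))
                continue
            for i in leaf:
                t = _ray_triangle(origin, direction, self.triangles[i])
                if t is not None and (best is None or t < best[0]):
                    best = (t, i)
        return best

# Build once, then reuse the same tree for every ray.
triangles = [((0, 0, 0), (1, 0, 0), (0, 1, 0)),
             ((0, 0, 2), (1, 0, 2), (0, 1, 2))]
tree = SimpleBVH(triangles, leaf_size=1)
hit = tree.ray_cast((0.25, 0.25, 5.0), (0.0, 0.0, -1.0))
```

All construction cost (bounding boxes, sorting, node allocation) is paid in the constructor, so each ray_cast call only traverses the existing nodes. That is the same trade-off the new submodule makes available to scripts.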