Using this paper: Sven Woop, Watertight Ray/Triangle Intersection
http://jcgt.org/published/0002/01/05/paper.pdf
The main purpose of this change is to reduce the memory footprint by getting
rid of the pre-computed triangle storage. Memory usage drops by about 25% in
the BMV scene, about 14% in the secret agent walk cycle and about 10% with the
sheep test files.
Unfortunately, it's currently about 10% slower than the previous solution with
pre-computed triangle plane equations, but maybe with some smart tweaks to the
code (reshuffling the tests, making better use of SIMD and so on) we can avoid
the speed regression.
But perhaps the smartest thing to do here would be to replace the single
triangle / ray intersection with a multiple triangles / ray intersection.
That's how Embree does it, and its watertight single ray intersection is not
any faster than this one.
Currently only the triangle intersection is modified according to the paper;
in the future we would also want to modify the node / ray intersection.
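
For reference, here is a minimal standalone sketch of the test described in the
paper. It is not the code from this change: Vec3, RayShear, Hit,
precompute_shear() and intersect_triangle() are hypothetical names picked for
the illustration, and backface culling is assumed to be off. It works on the
raw vertices only, so nothing has to be stored per triangle.

```cpp
#include <array>
#include <cmath>
#include <utility>

using Vec3 = std::array<float, 3>;

// Per-ray constants: axis permutation and shear that map the ray direction
// onto the +Z axis (hypothetical layout, computed once per ray).
struct RayShear { int kx, ky, kz; float sx, sy, sz; };

struct Hit { float t, u, v, w; };

RayShear precompute_shear(const Vec3& dir) {
    // kz is the axis along which the direction is largest in magnitude.
    int kz = 0;
    if (std::fabs(dir[1]) > std::fabs(dir[kz])) kz = 1;
    if (std::fabs(dir[2]) > std::fabs(dir[kz])) kz = 2;
    int kx = (kz + 1) % 3;
    int ky = (kx + 1) % 3;
    // Swap kx and ky to preserve the winding of the transformed triangle.
    if (dir[kz] < 0.0f) std::swap(kx, ky);
    return {kx, ky, kz, dir[kx] / dir[kz], dir[ky] / dir[kz], 1.0f / dir[kz]};
}

bool intersect_triangle(const Vec3& org, const RayShear& rs, float tmax,
                        const Vec3& v0, const Vec3& v1, const Vec3& v2,
                        Hit& hit) {
    // Vertices relative to the ray origin.
    const Vec3 A = {v0[0] - org[0], v0[1] - org[1], v0[2] - org[2]};
    const Vec3 B = {v1[0] - org[0], v1[1] - org[1], v1[2] - org[2]};
    const Vec3 C = {v2[0] - org[0], v2[1] - org[1], v2[2] - org[2]};

    // Shear the x and y coordinates so that the ray runs along +Z.
    const float Ax = A[rs.kx] - rs.sx * A[rs.kz];
    const float Ay = A[rs.ky] - rs.sy * A[rs.kz];
    const float Bx = B[rs.kx] - rs.sx * B[rs.kz];
    const float By = B[rs.ky] - rs.sy * B[rs.kz];
    const float Cx = C[rs.kx] - rs.sx * C[rs.kz];
    const float Cy = C[rs.ky] - rs.sy * C[rs.kz];

    // Scaled barycentric coordinates (2D edge functions).
    float U = Cx * By - Cy * Bx;
    float V = Ax * Cy - Ay * Cx;
    float W = Bx * Ay - By * Ax;

    // Fall back to double precision when the ray lies exactly on an edge.
    if (U == 0.0f || V == 0.0f || W == 0.0f) {
        U = float(double(Cx) * double(By) - double(Cy) * double(Bx));
        V = float(double(Ax) * double(Cy) - double(Ay) * double(Cx));
        W = float(double(Bx) * double(Ay) - double(By) * double(Ax));
    }

    // Without backface culling: miss if the edge functions have mixed signs.
    if ((U < 0.0f || V < 0.0f || W < 0.0f) &&
        (U > 0.0f || V > 0.0f || W > 0.0f)) return false;

    const float det = U + V + W;
    if (det == 0.0f) return false;

    // Scaled hit distance from the sheared z coordinates.
    const float Az = rs.sz * A[rs.kz];
    const float Bz = rs.sz * B[rs.kz];
    const float Cz = rs.sz * C[rs.kz];
    const float T = U * Az + V * Bz + W * Cz;

    // Range check against [0, tmax] before doing the division.
    const float sign = det < 0.0f ? -1.0f : 1.0f;
    if (T * sign < 0.0f || T * sign > tmax * det * sign) return false;

    const float rcp_det = 1.0f / det;
    hit = {T * rcp_det, U * rcp_det, V * rcp_det, W * rcp_det};
    return true;
}
```

In this sketch precompute_shear() runs once per ray, and u, v, w in Hit are the
barycentric weights of v0, v1 and v2 respectively.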