This patch adds an AVX implementation of Perlin noise in Cycles.
An avxi utility type was also added, based on the corresponding
type in Intel Embree.
Only 3D and 4D noise were implemented, since there is no benefit
to using AVX for 1D and 2D noise. The existing SSE trilinear
interpolation function is reused in the AVX implementation because
there is no benefit to using AVX when interpolating the last three
dimensions.
I couldn't measure any actual performance gains on a Zen1 CPU.
It could be that the extra setup cost cancels out any gains.
Any pointers as to why this is the case would be appreciated.
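
For reference, the interpolation structure described above can be sketched in scalar C++. This is only an illustration under assumptions and not the code in this patch: the names lerp_f, trilerp and quadlerp and the corner ordering (index = x + 2*y + 4*z + 8*w) are hypothetical. It shows that 4D noise has exactly one step with eight independent lerps, which is the step that maps onto an 8-wide AVX register. The remaining trilinear step needs only 4, then 2, then 1 lerps, so it never fills more than four lanes.

```cpp
#include <array>

// Linear interpolation between a and b by factor t.
inline float lerp_f(float t, float a, float b)
{
  return a + t * (b - a);
}

// Trilinear interpolation of 8 corner values.
// Hypothetical corner layout: index = x + 2*y + 4*z.
float trilerp(const std::array<float, 8> &c, float u, float v, float s)
{
  // Along x: 4 independent lerps, which fit in one 4-wide SSE register.
  const float x00 = lerp_f(u, c[0], c[1]);
  const float x10 = lerp_f(u, c[2], c[3]);
  const float x01 = lerp_f(u, c[4], c[5]);
  const float x11 = lerp_f(u, c[6], c[7]);
  // Along y: 2 lerps.
  const float y0 = lerp_f(v, x00, x10);
  const float y1 = lerp_f(v, x01, x11);
  // Along z: 1 lerp.
  return lerp_f(s, y0, y1);
}

// Quadrilinear interpolation of 16 corner values.
// Hypothetical corner layout: index = x + 2*y + 4*z + 8*w.
float quadlerp(const std::array<float, 16> &c, float u, float v, float s, float t)
{
  std::array<float, 8> r;
  // Along w: 8 independent lerps. This is the only interpolation step
  // wide enough to occupy all 8 lanes of an AVX register.
  for (int i = 0; i < 8; ++i) {
    r[i] = lerp_f(t, c[i], c[i + 8]);
  }
  // The last three dimensions reduce to the 4-wide trilinear case.
  return trilerp(r, u, v, s);
}
```

In 3D the eight corner values are the input to trilerp directly, so under this layout any 8-wide work would be in computing those corner values rather than in the interpolation.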