Problem Statement
==============================================
The goal of Poisson disk sampling is to generate a set of random points where no two points are closer than a certain minimum distance. The radius of a Poisson disk set is half the distance between the two closest points of the set.
The current implementation samples the points in 2D (parallel to the XY plane) and then projects them along the Z axis onto the mesh. This creates the unexpected (and in most cases undesirable?) artefact that "steep" faces receive fewer points.
This patch introduces Poisson disk sampling directly on the input mesh surface. However, it is still very slow, which is why the original developers decided to do the sampling in 2D as a first step. Below I will explain why sampling directly on the mesh is slow and what we can do about it. Before that, I will first explain the underlying algorithm and the current 2D implementation in a bit more depth.
Currently implemented, 2D sampling + Z-projection:
{F9548872}
This patch, sampling directly on mesh surface:
{F9548873}
Yuksel's algorithm (weighted sample elimination)
=============================================
Both implementations are based on the algorithm described in: [[ http://www.cemyuksel.com/research/sampleelimination/sampleelimination.pdf | Sample Elimination for Generating Poisson Disk Sample Sets (Yuksel 2015)]]. The main idea of Yuksel's algorithm is to first generate 5x too many random points, then calculate a weight for each point that captures how crowded its neighborhood is, and finally keep removing the point with the highest weight until the desired amount remains. In pseudocode the algorithm goes something like this:
```
points = rng.get_random_points(5 * desired_points_amount)
r_max = ...
kdtree = KDTree(points)
weights = []

// Initial weight assignment
foreach(point in points) {
    weight = 0.0
    neighbors = kdtree.find_neighbors_within_radius(point, 2 * r_max)
    foreach(neighbor in neighbors) {
        distance = distance(point, neighbor)
        weight += calculate_weight(distance, r_max)
    }
    weights.append(weight)
}

// Point elimination
heap = Heap(points, weights)
while(heap.size() > desired_points_amount) {
    // Remove the most "crowded" point (highest weight).
    point = heap.pop()
    // Its neighbors lose the weight contribution of the removed point.
    neighbors = kdtree.find_neighbors_within_radius(point, 2 * r_max)
    foreach(neighbor in neighbors) {
        distance = distance(point, neighbor)
        weight = calculate_weight(distance, r_max)
        heap.decrease_priority(neighbor, weight)
    }
}

remaining_points = heap.items()
return remaining_points
```
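For reference, the `calculate_weight` above is the paper's weight function. As far as I understand it (I'm writing the constants down from memory, so they should be double-checked against the paper), it boils down to something like this in C++:
```
#include <algorithm>
#include <cmath>

/* Weight contribution of one neighbor at the given distance. As I read the
 * paper: w = (1 - d_hat / (2 * r_max))^alpha, where d_hat clamps the distance
 * to at most 2 * r_max and alpha defaults to 8. Neighbors further away than
 * 2 * r_max contribute nothing, which is why the queries use that radius. */
static double calculate_weight(double distance, double r_max, double alpha = 8.0)
{
  const double d_hat = std::min(distance, 2.0 * r_max);
  return std::pow(1.0 - d_hat / (2.0 * r_max), alpha);
}
```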
Current implementation: sampling in 2D + projection
===================================================
The current (2D) implementation uses Yuksel's algorithm to generate 5000 points in a square. The size of this square depends only on the minimum distance specified by the user. When points need to be distributed over a larger area than the square, the square is simply tiled in the X and Y directions to cover the larger area. Then these tiled points are projected onto the input mesh.
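To make the tiling step concrete, here is a minimal sketch (my own illustration in C++, not the actual patch code; the names are made up) of repeating one pre-generated square of points over the XY bounding box of the mesh:
```
#include <vector>

struct float3 {
  float x, y, z;
};

/* Illustration only: repeat one pre-generated tile of 2D Poisson disk points
 * in X and Y until the target bounding box is covered. The result would then
 * be projected onto the mesh along the Z axis (not shown). */
static std::vector<float3> tile_points_over_bounds(const std::vector<float3> &tile,
                                                   float tile_size,
                                                   float min_x, float min_y,
                                                   float max_x, float max_y)
{
  std::vector<float3> result;
  for (float offset_x = min_x; offset_x < max_x; offset_x += tile_size) {
    for (float offset_y = min_y; offset_y < max_y; offset_y += tile_size) {
      for (const float3 &p : tile) {
        result.push_back({p.x + offset_x, p.y + offset_y, p.z});
      }
    }
  }
  return result;
}
```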
Ok so how does this method handle the density input? To answer that we need to look at //progressive sample reordering//. Progressive sample reordering sorts the set of generated Poisson disk points such that when points are added in that order, every prefix of the list is itself a Poisson disk set (but with a larger radius). After ordering the generated points in that way, points are dropped from the end of the list until the desired density is reached.
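A tiny sketch of how the density input then works once the points are in progressive order (again only an illustration): the density effectively just chooses how long a prefix of the ordered list is kept.
```
#include <algorithm>
#include <cstddef>
#include <vector>

/* Illustration only: because of progressive sample reordering, any prefix of
 * the ordered list is itself a Poisson disk set (with a larger radius), so a
 * lower density simply means keeping a shorter prefix. */
template<typename Point>
static std::vector<Point> take_progressive_prefix(const std::vector<Point> &ordered_points,
                                                  size_t amount)
{
  amount = std::min(amount, ordered_points.size());
  return std::vector<Point>(ordered_points.begin(), ordered_points.begin() + amount);
}
```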
What's nice about this approach is that it's very fast and produces stable results. For a given radius and random seed, it produces the same tiling of the XY plane, and the points are sorted in the same way each time. When the user changes the density, we just take a smaller or larger part of the list. If the user resizes the mesh to have a larger XY bounding box, more tiles are simply added, without changing the existing tiles.
However for small radius + large mesh XY bounding box, tiling artefacts appear:
{F9557037}
I also want to note the following about the current 2D implementation: it does not guarantee the user-specified minimum distance at high densities. Below is an example of scattering spheres of radius 0.5 using a minimum distance of 0.5; the spheres overlap in several places.
{F9557046}
I mention this because Sebastian stated this as the reason why having a `Minimum Distance` input is desirable for users:
> The idea with these parameters is that you have models/assets that have a certain predetermined radius. So you plug in the min radius of the assets in the Minimum Distance and tweak the max density to get as sparse/tightly packed distribution as you like <without worrying about overlap between the assets>.
(I added that last part for clarification.)
Sampling directly on mesh
==========================
Generating a set of Poisson disk points directly on the mesh is conceptually simple. First generate 5x too many random points on the mesh (which was already implemented). Then just run Yuksel's algorithm with those points.
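For completeness, here is a rough sketch (my own, not the existing Blender code) of the usual way to generate that initial oversampled set of random points on a triangle mesh: pick triangles with probability proportional to their area, then pick a uniform random point inside each chosen triangle via barycentric coordinates.
```
#include <array>
#include <cmath>
#include <random>
#include <vector>

struct float3 {
  float x, y, z;
};

static float3 operator-(float3 a, float3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static float3 operator+(float3 a, float3 b) { return {a.x + b.x, a.y + b.y, a.z + b.z}; }
static float3 operator*(float s, float3 a) { return {s * a.x, s * a.y, s * a.z}; }

static float triangle_area(float3 a, float3 b, float3 c)
{
  const float3 u = b - a;
  const float3 v = c - a;
  const float cx = u.y * v.z - u.z * v.y;
  const float cy = u.z * v.x - u.x * v.z;
  const float cz = u.x * v.y - u.y * v.x;
  return 0.5f * std::sqrt(cx * cx + cy * cy + cz * cz);
}

/* Sketch: area-weighted uniform random sampling of points on a triangle mesh. */
static std::vector<float3> random_points_on_mesh(const std::vector<std::array<float3, 3>> &triangles,
                                                 int amount,
                                                 unsigned int seed)
{
  std::vector<float> areas;
  for (const std::array<float3, 3> &tri : triangles) {
    areas.push_back(triangle_area(tri[0], tri[1], tri[2]));
  }

  std::mt19937 rng(seed);
  std::discrete_distribution<int> pick_triangle(areas.begin(), areas.end());
  std::uniform_real_distribution<float> uniform(0.0f, 1.0f);

  std::vector<float3> points;
  for (int i = 0; i < amount; i++) {
    const std::array<float3, 3> &tri = triangles[pick_triangle(rng)];
    /* Square-root trick for a uniformly distributed barycentric coordinate. */
    const float r1 = std::sqrt(uniform(rng));
    const float r2 = uniform(rng);
    points.push_back((1.0f - r1) * tri[0] + (r1 * (1.0f - r2)) * tri[1] + (r1 * r2) * tri[2]);
  }
  return points;
}
```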
However, the table below shows the problem:
| **Amount of points** | **2400** | **4800** | **9600** | **19200** |
| Time to construct KD Tree | 4 ms | 6 ms | 12 ms | 21 ms |
| Initial weight assignment | 158 ms | 623 ms | 2395 ms | 9532 ms |
| Point elimination | 116 ms | 423 ms | 1679 ms | 6391 ms |
First, the algorithm takes more than a second to generate a set of fewer than 5000 points. But more importantly, generating 2x the amount of points requires 4x the time! This quadratic scaling behavior destroys all hope of scaling this algorithm to larger sets of points (e.g. those that would be required for grass or hair).

What's to blame? From the table we can see that constructing the KDTree is surprisingly fast. The bulk of the time is spent assigning initial weights and eliminating points, both of which involve a lot of KDTree neighborhood queries: for N desired output points there are 5N queries during the initial weight assignment (one per generated point) and 4N during elimination (one per eliminated point), so 9N in total. The speed of these neighborhood queries depends on how large the radius of the neighborhood is. In Yuksel's algorithm neighborhoods have size `2 * r_max`. The `r_max` parameter seems innocuous, but it is crucial for both the quality and speed of generating the Poisson disk set.
Possible optimizations
--------------------------------
All optimizations in this section revolve around KDTree neighborhood queries.
### Better approximation of `r_max` for meshes (speeding up KDTree neighborhood queries)
`r_max` is used throughout Yuksel's algorithm. It determines the size of neighborhoods and influences the weights assigned to points. Yuksel mentions `r_max` is "the maximum possible distance for N samples to cover the sampling domain". This maximum distance is achievable with hexagonal packing:
{F9557084}
Yuksel gives formulas to calculate `r_max` for 2D and 3D:
{F9557086}
But this left me puzzled: which formula do I use for a 2D mesh surface in 3D space? `r_max` is such a crucial parameter, yet Yuksel doesn't even mention how to set it for meshes. What I decided to do is use the 3D formula with the bounding box volume of the mesh, as this seemed like the safer option of the two. This will always give a conservative estimate of `r_max`, which means that the `r_max` value will always be too large. In Yuksel's source code I found this note:
> The dimensions parameter specifies the dimensionality of the sampling domain. This parameter would typically be equal to the dimensionality of the sampling domain (specified by DIMENSIONS). However, smaller values can be used when sampling a low-dimensional manifold in a high-dimensional space, such as a surface in 3D.
But there is no further guidance on how to find better approximations of `r_max` for meshes. I was thinking that maybe we can calculate the total surface area of the mesh and then use the formula for 2D packing. This would however lead to underestimation of `r_max`, because a set of points that fits on the flat 2D surface of the mesh might no longer fit once the mesh takes its true 3D shape. I'm not sure whether this would lead to problems.
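To make the two candidate estimates concrete, this is roughly what they look like in code. The constants are the ones I remember from the paper's packing formulas (hexagonal packing for 2D, sphere packing for 3D), so please verify them against the paper before taking them at face value:
```
#include <cmath>

/* What this patch currently does: apply the paper's 3D formula to the volume
 * of the mesh bounding box. The constant 4 * sqrt(2) comes from the densest
 * sphere packing (as I recall from the paper; double-check). */
static double r_max_from_bounding_box_volume(double bbox_volume, int num_points)
{
  return std::cbrt(bbox_volume / (4.0 * std::sqrt(2.0) * num_points));
}

/* The alternative discussed above: apply the 2D formula to the total surface
 * area of the mesh. The constant 2 * sqrt(3) comes from hexagonal packing. */
static double r_max_from_surface_area(double surface_area, int num_points)
{
  return std::sqrt(surface_area / (2.0 * std::sqrt(3.0) * num_points));
}
```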
### Parallelizing KDTree neighborhood queries
Multi-threading can also be used to speed up the algorithm. For the initial weight assignment there are no dependencies on other points' weights, so that loop can be fully parallelized. The second part is trickier: for each point that is removed, the points in its vicinity need to have their weights updated. Multi-threading this would require clever partitioning of the points (threads should not be able to eliminate points whose weights are being updated by other threads). (Suggested by Sebastian Parborg.)
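A rough sketch of what the fully parallel initial weight assignment could look like, reusing the hypothetical names from the pseudocode above (`find_neighbors_within_radius`, `calculate_weight`, `distance`) and using OpenMP purely for illustration; this assumes the KD-tree can safely be queried from multiple threads:
```
#include <vector>

/* Sketch: every iteration only reads from the KD-tree and writes its own
 * weight, so the loop can be split over threads without any locking. */
template<typename KDTreeT, typename Point>
static std::vector<double> assign_initial_weights_parallel(const KDTreeT &kdtree,
                                                           const std::vector<Point> &points,
                                                           double r_max)
{
  std::vector<double> weights(points.size(), 0.0);
#pragma omp parallel for
  for (int i = 0; i < (int)points.size(); i++) {
    double weight = 0.0;
    for (const Point &neighbor : kdtree.find_neighbors_within_radius(points[i], 2.0 * r_max)) {
      weight += calculate_weight(distance(points[i], neighbor), r_max);
    }
    weights[i] = weight;
  }
  return weights;
}
```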
### Neighborhood indexing (reduce the amount of KDTree neighborhood queries needed)
Another optimization proposed by Jacques Lucke:
> We could preprocess the nearest neighbour indices for all points. That could be done in parallel. Then no kdtree is necessary while the main algorithm runs. It requires a bit more memory, but if we make sure that the total number of neighbours within the minimum distance is always low (e.g. <10), then it should be fine.
> The preprocessing can happen in parallel.
This idea has a lot of potential. In this form, it could greatly speed up the point elimination phase, which could make the algorithm as a whole about 40% faster. However, the preprocessing would still require 5N slow (but parallelizable) neighborhood lookups in the KDTree.
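A sketch of what that preprocessing step could look like (names hypothetical): one parallel pass of KD-tree queries fills a per-point list of neighbor indices, and the elimination phase afterwards only reads these lists instead of querying the KD-tree again.
```
#include <vector>

/* Sketch of the suggested preprocessing: build a per-point neighbor index
 * table once, in parallel. This still costs 5N KD-tree queries, but the point
 * elimination loop can then update weights by walking these small lists. */
template<typename KDTreeT, typename Point>
static std::vector<std::vector<int>> build_neighbor_table(const KDTreeT &kdtree,
                                                          const std::vector<Point> &points,
                                                          double r_max)
{
  std::vector<std::vector<int>> neighbors(points.size());
#pragma omp parallel for
  for (int i = 0; i < (int)points.size(); i++) {
    /* Hypothetical query returning indices instead of points. */
    neighbors[i] = kdtree.find_neighbor_indices_within_radius(points[i], 2.0 * r_max);
  }
  return neighbors;
}
```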
I wonder if we can take this idea further: what if we ditch the KDTree altogether? KDTrees support querying with an arbitrary range value, but we're not interested in that. All we want is a "neighborhood lookup table" for a fixed neighborhood size. I feel that by taking inspiration from how KDTrees are built, we can construct this "neighborhood lookup table" much faster than building the KDTree + doing 5N lookups. I'll do some research on this.
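One concrete direction along these lines (just a sketch of the idea, not something I have implemented or benchmarked): because the query radius is fixed at `2 * r_max`, a uniform grid with that cell size can answer every neighborhood lookup by scanning only the 27 surrounding cells, and building it is a single cheap pass over the points.
```
#include <cmath>
#include <unordered_map>
#include <vector>

struct float3 {
  float x, y, z;
};

/* Sketch: bucket each point into a uniform grid whose cell size equals the
 * fixed query radius. The neighbor candidates of a point are then the points
 * in its own cell plus the 26 surrounding cells; callers still need to filter
 * the candidates by actual distance. */
class NeighborGrid {
 public:
  NeighborGrid(const std::vector<float3> &points, float cell_size) : cell_size_(cell_size)
  {
    for (int i = 0; i < (int)points.size(); i++) {
      cells_[key_of(points[i])].push_back(i);
    }
  }

  std::vector<int> candidate_neighbors(const float3 &p) const
  {
    std::vector<int> result;
    const long long cx = cell_coord(p.x);
    const long long cy = cell_coord(p.y);
    const long long cz = cell_coord(p.z);
    for (long long dx = -1; dx <= 1; dx++) {
      for (long long dy = -1; dy <= 1; dy++) {
        for (long long dz = -1; dz <= 1; dz++) {
          const auto it = cells_.find(pack(cx + dx, cy + dy, cz + dz));
          if (it != cells_.end()) {
            result.insert(result.end(), it->second.begin(), it->second.end());
          }
        }
      }
    }
    return result;
  }

 private:
  float cell_size_;
  std::unordered_map<long long, std::vector<int>> cells_;

  long long cell_coord(float value) const
  {
    return (long long)std::floor(value / cell_size_);
  }

  /* Simple (collision-prone, illustration only) packing of a 3D cell index
   * into one hash key. */
  static long long pack(long long x, long long y, long long z)
  {
    return (x * 73856093LL) ^ (y * 19349663LL) ^ (z * 83492791LL);
  }

  long long key_of(const float3 &p) const
  {
    return pack(cell_coord(p.x), cell_coord(p.y), cell_coord(p.z));
  }
};
```
Whether something like this actually beats building the KDTree plus doing 5N lookups would of course need profiling.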