
Geometry Nodes: Yuksel's sample elimination algorithm
AbandonedPublic

Authored by Victor-Louis De Gusseme (victorlouis) on Dec 29 2020, 11:52 AM.

Details

Summary

Update: Jacques added a great Poisson disk sampling method that also samples directly onto the mesh; it will be incorporated into 2.92 and satisfies most use cases. His implementation is similar to what the literature calls "dart throwing", but instead of progressively adding samples, which would require many KDTree insertions (and rebalancing), he adds all samples upfront and then removes the samples that violate the minimum distance. This is actually a very elegant approach because it circumvents the issue of having to update a growing data structure for neighborhood lookups.
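As an illustration of that "add all upfront, then remove violators" idea (not Blender's actual code; the names are mine and a brute-force scan stands in for the real KDTree lookup), a minimal Python sketch:

```python
import math
import random

def remove_violators(points, min_dist):
    """Keep a point only if it is not closer than min_dist to any
    previously kept point. Brute-force stand-in for the KDTree query
    in the real implementation."""
    kept = []
    for p in points:
        if all(math.dist(p, q) >= min_dist for q in kept):
            kept.append(p)
    return kept

random.seed(7)
samples = [(random.random(), random.random()) for _ in range(500)]
kept = remove_violators(samples, 0.1)

# Every pair of kept points respects the minimum distance.
assert all(math.dist(p, q) >= 0.1
           for i, p in enumerate(kept) for q in kept[:i])
```

Note that which points survive depends on the iteration order, but no datastructure ever needs to grow during the scan, which is the elegance mentioned above.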

The method in this patch (and the old implementation) is based on Yuksel's sample elimination. While Yuksel's algorithm is also quite easy to implement, it is harder to understand; e.g. setting a good value of r_max is surprisingly difficult (where "good" means that the resulting distribution is nice and that the algorithm runs quickly). So for the moment the code in this patch is put in the freezer. We might decide to revive Yuksel's method for two main reasons:

  1. It could produce Poisson disk sets of higher quality, i.e. with denser packing for a given Poisson disk radius, than dart throwing (although probably slower). For some use cases this slightly denser packing with slightly smaller gaps might be desirable, so this alone might justify implementing this patch for 2.93. However, I find the second reason even more appealing:
  2. Yuksel's algorithm can be used to generate different distributions based on the weight function. For example, in his paper he shows how you can easily get magenta noise. By cleverly parametrizing the weight function, we could expose a slider to the user called "Clustering" or something similar. At 0 clustering you would get a Poisson disk point set, at slightly higher values you would get magenta noise, and at even higher values the points would completely separate into several "islands" of clustered points.
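To make the second point concrete, here is one hypothetical parametrization (a sketch of the idea, not anything in the patch or the paper): blend Yuksel's default weight with its inverse, so that at 0 the elimination behaves normally and near 1 crowded points are the ones that survive.

```python
def weight(d, r_max, alpha=8.0):
    # Yuksel's default weight: nearby neighbors contribute more, so
    # crowded points get eliminated first (blue-noise behavior).
    d = min(d, 2.0 * r_max)
    return (1.0 - d / (2.0 * r_max)) ** alpha

def clustered_weight(d, r_max, clustering, alpha=8.0):
    # Hypothetical "Clustering" parametrization: at clustering = 0 this
    # is the plain Poisson disk weight; near 1 points with close
    # neighbors get a *low* elimination priority, so tight groups
    # survive and the set separates into islands.
    base = weight(d, r_max, alpha)
    return (1.0 - clustering) * base + clustering * (1.0 - base)
```

This exact blend is just an illustration; the actual slider mapping would need experimentation to hit magenta noise at intermediate values.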

Below this point, the content of this post is outdated.


Parts of this post:

  • Problem statement
  • Yuksel's algorithm (weighted sample elimination)
  • Current implementation: sampling in 2D + projection
  • Sampling directly on the mesh
  • Patch status

Problem Statement

The goal of Poisson disk sampling is to generate a set of random points where no two points are closer than a certain minimum distance. The radius of a Poisson disk set is half the distance between the two closest points of the set.
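That radius definition is a one-liner (trivial sketch; `poisson_disk_radius` is just an illustrative helper name):

```python
import itertools
import math

def poisson_disk_radius(points):
    # Radius of the set: half the smallest pairwise distance.
    return min(math.dist(p, q)
               for p, q in itertools.combinations(points, 2)) / 2.0

pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 3.0)]
assert poisson_disk_radius(pts) == 0.5  # closest pair is (0,0)-(1,0)
```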

The current implementation samples the points in 2D (parallel to the XY plane) and then projects them along the Z axis onto the mesh. This creates the unexpected (and in most cases undesirable?) artefact that "steep" faces receive fewer points.

This patch introduces Poisson disk sampling directly on the input mesh surface. However, it is still very slow, which is why the original developers decided to do the sampling in 2D as a first step. Below I will explain why sampling directly on the mesh is slow and what we can do about it. Before that, I will first explain the underlying algorithm and the current 2D implementation in a bit more depth.

Currently implemented, 2D sampling + Z-projection:

This patch, sampling directly on mesh surface:

Yuksel's algorithm (weighted sample elimination)

Both implementations are based on the algorithm described in Sample Elimination for Generating Poisson Disk Sample Sets (Yuksel, 2015). The main idea of Yuksel's algorithm is to first generate 5x too many random points, then calculate a weight for each point that captures how crowded its neighborhood is, and finally remove the points with the highest weights until you're happy with the result. In pseudocode the algorithm goes something like this:


points = rng.get_random_points(5 * desired_points_amount)
r_max = ...
weights = []

// Initial weight assignment
foreach(point in points) {
  weight = 0.0
  neighbors = find_neighbors_within_radius(point, r_max)
  foreach(neighbor in neighbors) {
    distance = distance(point, neighbor)
    weight += calculate_weight(distance, r_max)
  }
  weights.append(weight)
}

// Point elimination
heap = Heap(points, weights)
while(heap.size() > desired_points_amount) {
  point = heap.pop()  // removes the point with the highest weight
  neighbors = find_neighbors_within_radius(point, r_max)
  foreach(neighbor in neighbors) {
    distance = distance(point, neighbor)
    weight = calculate_weight(distance, r_max)
    heap.decrease_priority(neighbor, weight)
  }
}
remaining_points = heap.items()
return remaining_points
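To make the pseudocode concrete, here is a deliberately naive runnable sketch: no KDTree, grid, or heap, just a weight table and a linear max-scan, so the logic stays visible. It uses Yuksel's default weight (1 - d/(2*r_max))^alpha with alpha = 8; the helper names are mine, not the patch's.

```python
import math
import random

def calculate_weight(d, r_max, alpha=8.0):
    # Closer neighbors contribute larger weights.
    d = min(d, 2.0 * r_max)
    return (1.0 - d / (2.0 * r_max)) ** alpha

def sample_elimination(points, desired, r_max):
    """O(n^2) sketch of weighted sample elimination."""
    n = len(points)
    alive = set(range(n))
    # Initial weight assignment.
    weights = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                d = math.dist(points[i], points[j])
                if d < 2.0 * r_max:
                    weights[i] += calculate_weight(d, r_max)
    # Point elimination: repeatedly remove the most crowded point and
    # lower its neighbors' weights by the contribution it made.
    while len(alive) > desired:
        i = max(alive, key=lambda k: weights[k])
        alive.discard(i)
        for j in alive:
            d = math.dist(points[i], points[j])
            if d < 2.0 * r_max:
                weights[j] -= calculate_weight(d, r_max)
    return [points[i] for i in sorted(alive)]

random.seed(1)
pts = [(random.random(), random.random()) for _ in range(250)]
kept = sample_elimination(pts, 50, r_max=0.08)
assert len(kept) == 50
```

The real implementation replaces the inner scans with a spatial datastructure and the max-scan with a heap that supports priority updates.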

Current implementation: sampling in 2D + projection

The current (2D) implementation uses Yuksel's algorithm to generate 5000 points in a square. The size of this square depends only on the minimum distance specified by the user. When points need to be distributed over a larger area than the square, the square is simply tiled in the X and Y directions to cover the larger area. Then these tiled points are projected onto the input mesh.

Ok, so how does this method handle the Density input? To understand that, we need to understand progressive sample reordering. Progressive sample reordering sorts the set of generated Poisson disk points such that when points are added in that order, each subset also looks like a Poisson disk set (but with a larger radius). After ordering the generated points in that way, points at the end of the list are filtered out to achieve the desired density.
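A sketch of that reordering idea, using a simplified "most crowded = smallest nearest-neighbor distance" rule in place of the real weight heap (all helper names here are illustrative):

```python
import math
import random

def most_crowded_index(points):
    # Stand-in for "highest weight": the point whose nearest neighbor
    # is closest, i.e. an endpoint of the current closest pair.
    def nn_dist(i):
        return min(math.dist(points[i], points[j])
                   for j in range(len(points)) if j != i)
    return min(range(len(points)), key=nn_dist)

def progressive_order(points):
    # Eliminate points one by one, recording the removal order.
    # Reversed, every prefix of the result is itself a Poisson-disk-like
    # set (with a larger radius), so density control becomes slicing.
    remaining = list(points)
    removed = []
    while len(remaining) > 1:
        removed.append(remaining.pop(most_crowded_index(remaining)))
    removed.append(remaining.pop())
    removed.reverse()  # last survivor first
    return removed

random.seed(5)
pts = [(random.random(), random.random()) for _ in range(40)]
ordered = progressive_order(pts)

def min_dist(ps):
    return min(math.dist(p, q) for i, p in enumerate(ps) for q in ps[:i])

# A smaller prefix has a larger (or equal) Poisson disk radius.
assert min_dist(ordered[:10]) >= min_dist(pts)
```

Filtering for density is then just `ordered[:k]`, which is why changing the density slider is so cheap in the 2D implementation.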

What's nice about this approach is that it's very fast and produces stable results. For a given radius and random seed, it produces the same tiling of the XY-plane. These points are then sorted in the same way each time. When the user changes the density, we just take a smaller/larger part of the list. If the user resizes the mesh to have a larger XY bounding box, simply more tiles are added, without changing the existing tiles.

However for small radius + large mesh XY bounding box, tiling artefacts appear:

I also want to note the following about the current 2D implementation: it does not guarantee the user-specified minimum distance at high density. Below is an example of scattering spheres with radius 0.5 using minimum distance 0.5; the spheres overlap in several places.


I mention this because Sebastian stated this as the reason why having a Minimum Distance input is desirable for users:

The idea with these parameters is that you have models/assets that have a certain predetermined radius. So you plug in the min radius of the assets in the Minimum Distance and tweak the max density to get as sparse/tightly packed a distribution as you like <without worrying about overlap between the assets>.

(I added that last part for clarification.)

Sampling directly on the mesh

Generating a set of Poisson disk points directly on the mesh is conceptually simple. First generate 5x too many random points on the mesh (which was already implemented), then just run Yuksel's algorithm with those points. Yuksel proposes using a KDTree for the neighborhood lookups. I tried it with Blender's KDTree implementation, but this was much too slow (e.g. 20k points took 15s). Several optimizations were implemented that together achieved a 40x speedup.

Optimizations

All optimizations in this section revolve around the neighborhood queries, as these were the most expensive part of the algorithm.

Better approximation of r_max for meshes

r_max is used throughout Yuksel's algorithm. It determines the size of neighborhoods and influences the weights assigned to points. Yuksel mentions r_max is "the maximum possible distance for N samples to cover the sampling domain". This maximum distance is achievable with hexagonal packing.

Yuksel gives formulas to calculate r_max for 2D and 3D:
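(The formulas, as I transcribe them from the paper: r_max_2D = sqrt(A / (2*sqrt(3)*N)) and r_max_3D = cbrt(V / (4*sqrt(2)*N)), where A is the area, V the volume and N the sample count. They follow from hexagonal disk packing in 2D and FCC sphere packing in 3D, where each disk/sphere of radius r claims 2*sqrt(3)*r^2 of area / 4*sqrt(2)*r^3 of volume.)

```python
import math

def r_max_2d(area, n):
    # Hexagonal packing: A = N * 2*sqrt(3) * r^2, solved for r.
    return math.sqrt(area / (2.0 * math.sqrt(3.0) * n))

def r_max_3d(volume, n):
    # FCC packing: V = N * 4*sqrt(2) * r^3, solved for r.
    return (volume / (4.0 * math.sqrt(2.0) * n)) ** (1.0 / 3.0)

# Sanity check: one disk/sphere exactly filling its packing cell has r = 1.
assert abs(r_max_2d(2.0 * math.sqrt(3.0), 1) - 1.0) < 1e-12
assert abs(r_max_3d(4.0 * math.sqrt(2.0), 1) - 1.0) < 1e-12
```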


But this left me puzzled: which formula do I use for a 2D mesh surface in 3D space? On his website (section 7.) he states:

[...] since we are sampling a surface in 3D, the sampling domain should be considered 2D.

I find this a bit weird, because using the 2D formula with the surface area of the mesh will generally underestimate the achievable Poisson disk radius, and I thought you ought to be conservative and always use an overestimation. However, lower values are faster (smaller neighborhoods to search) and the underestimation doesn't seem to cause any problems, so I decided to use it as well. I actually went a step further and used min(r_max_2D, r_max_3D), where r_max_3D is calculated with the AABB of the mesh.

Grid-based neighborhood search

The KDTree neighborhood queries turned out to be a huge bottleneck, so the KDTree was replaced by a spatial grid with cell size = 2 * r_max. This guarantees that all neighbors of a point in grid cell (i, j, k) are in that cell or the neighboring cells (e.g. (i+1, j, k), (i+1, j+1, k), ...). So in total 27 cells have to be searched for each point. This might seem inefficient, but it is surprisingly fast: compared to the KDTree this alone made the algorithm about 15x faster, with better scaling.
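The grid lookup can be sketched like this (illustrative Python, not the patch's C++; `cell` plays the role of 2 * r_max):

```python
import math
import random
from collections import defaultdict

def cell_key(p, cell):
    return tuple(math.floor(c / cell) for c in p)

def build_grid(points, cell):
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        grid[cell_key(p, cell)].append(idx)
    return grid

def neighbors_within(points, grid, cell, i, radius):
    # With cell size >= the search radius, every point within `radius`
    # of points[i] lies in its own cell or one of the 26 adjacent cells.
    cx, cy, cz = cell_key(points[i], cell)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy, cz + dz), []):
                    if j != i and math.dist(points[i], points[j]) <= radius:
                        found.append(j)
    return found

random.seed(2)
pts = [(random.random(), random.random(), random.random())
       for _ in range(300)]
cell = 0.2
grid = build_grid(pts, cell)

# The grid query must agree with a brute-force scan.
brute = [j for j in range(len(pts))
         if j != 0 and math.dist(pts[0], pts[j]) <= cell]
assert sorted(neighbors_within(pts, grid, cell, 0, cell)) == sorted(brute)
```

Unlike a KDTree there is nothing to balance, and building the grid is a single linear pass.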

Additionally, for each point its neighbors' IDs are stored together with the weight contribution of each neighbor. This way we don't have to search for the neighbors again or recalculate the weights in the point elimination stage of the algorithm.

Parallelizing the neighborhood queries

The algorithm consists of two expensive parts:

  1. Initial weight calculation
  2. Point elimination

For the initial weight assignment there are no dependencies on other points' weights, so that loop was fully parallelized.

The second part is trickier: for each removed point, the points in its vicinity need their weight values updated. Multi-threading this would require clever sectioning of the points (threads should not be able to eliminate points whose weights are being updated by other threads).

Patch Status

In my opinion, the new Poisson disk sampling on the mesh is certainly fast enough to be usable (generating 100k points takes about 1.8 seconds, but that's already a lot of points, about the same as the average number of hairs on a human head). In some cases I even found it to be faster than the 2D implementation. However, with the same level of optimization the 2D approach will always be faster, because you can't beat the tiling :p


In the dropdown of the Point Distribute node, I've used the name "Poisson Disk" for the method from this patch, and added an option "Poisson Disk 2D" for the other method:

Note that the new method currently only has a Density slider and no Minimum Distance like the 2D version. The reason is just that it was simpler implementation-wise, and in some ways I also find it more user friendly. When using the 2D method I often find myself going back and forth between the Minimum Distance slider and the Density slider and getting frustrated, so I really think this design should be reconsidered (sorry for bringing this up again). Maybe a solution is to 1) also use Density as the main driver for the 2D node, and 2) have an "Ensure minimum distance" checkbox for when you're scattering things with a known fixed radius that really should not overlap?

Weight painting:

Planned changes before patch is finished:

  • Currently I'm calculating the mesh bounding box and surface area just by looping over the vertices/triangles. This is actually quite fast, and even parallelizable if necessary, so it's not really a problem, but I was wondering whether there's a more standard way to do this in Blender. For meshes with very high vertex/triangle counts this could also become non-negligible.
    • ...

Known limitations:

  • The generated points are not stable, i.e. changing the density causes the points to "jump around". Note however that the points are stable during weight painting: points smoothly pop into and out of existence without the other points on the mesh moving around.

Future work:

  • Parallelize the Point Elimination part of the algorithm. This part is still single-threaded, so when scattering a lot of points it becomes the bottleneck. I've talked about this with Sebastian; he proposes splitting up space into parts that can be processed in parallel without interference. However, combining the results and handling the borders seems quite difficult, so I don't have any concrete plans to implement this.
  • Think about how to speed up the algorithm if the user only weight paints small areas. Currently, points are always scattered over the entire mesh, then filtered.

Testing/reviews are very welcome!

Diff Detail

Repository
rB Blender
Branch
geometry-nodes-poisson-alternative (branched from master)
Build Status
Buildable 11957
Build 11957: arc lint + arc unit

Event Timeline

Victor-Louis De Gusseme (victorlouis) retitled this revision from Nodes: add geometry socket type to Geometry Nodes: change Poisson Disk to sample on the surface of the input mesh. Dec 29 2020, 11:54 AM
Victor-Louis De Gusseme (victorlouis) edited the summary of this revision. (Show Details)
Victor-Louis De Gusseme (victorlouis) retitled this revision from Geometry Nodes: change Poisson Disk to sample on the surface of the input mesh to Geometry Nodes: change Poisson disk to sample on the surface of the input mesh. Dec 29 2020, 12:21 PM
Victor-Louis De Gusseme (victorlouis) edited the summary of this revision. (Show Details)

Thanks for implementing this. The current implementation also creates a symmetrical distribution between the poles, most visible along the equator on a sphere.

Victor-Louis De Gusseme (victorlouis) edited the summary of this revision. (Show Details)
  • Poisson disk on mesh via sample elimination
Victor-Louis De Gusseme (victorlouis) retitled this revision from Geometry Nodes: change Poisson disk to sample on the surface of the input mesh to Geometry Nodes: add method for Poisson Disk sampling on mesh. Jan 9 2021, 2:46 PM
Victor-Louis De Gusseme (victorlouis) edited the summary of this revision. (Show Details)
Victor-Louis De Gusseme (victorlouis) edited the summary of this revision. (Show Details)
Victor-Louis De Gusseme (victorlouis) edited the summary of this revision. (Show Details)
  • Grid for neighborhood lookup + parallelized weight calculation
  • Added distribute method in dropdown and a little cleanup
Victor-Louis De Gusseme (victorlouis) edited the summary of this revision. (Show Details)
  • Filtering based on vertex weights
Hans Goudey (HooglyBoogly) requested changes to this revision. Jan 11 2021, 2:18 AM

It will be great to have some poisson distribution on surfaces, and nice job with the speedups! Just some quick comments on the code, nothing on the general algorithm or the big picture really.

I'm wondering about the organization of the files, maybe this new code should be in the poisson point distribution file?
Actually, maybe we should have a separate file for each distribution method and a utils file for shared code.

Also, index variables: I was under the impression that the default here was int, and in cases where we need the possibility of very large numbers, int64_t. Even though the current code isn't really consistent in that respect, I'd rather see that applied here as long as it's in the style guide.

source/blender/makesrna/intern/rna_nodetree.c
465 ↗(On Diff #32589)

Typo. Anyway though, this could probably be more specific about what "some space" means, even if it makes the tool-tip a bit on the longer side.

source/blender/nodes/geometry/nodes/node_geo_point_distribute.cc
178

SCOPED_TIMER in BLI_timeit.hh should be a quicker way to do this; anyway, this stuff should be removed.

302

Personally I'm not fond of auto, and I think we generally try to avoid it.

306–307

What about float3::distance_squared()?

334

The comment style the style guide recommends is /* style comments, always ending with a period.

343

Personally I would rather see float3(FLT_MAX) here, it's easier to read IMO. But maybe Jacques has a different opinion.

343–348

This is a lot of repetition. I would suggest using float3 and adding whatever methods are necessary to that class. (min and max)

Alternatively, just using minmax_v3v3_v3 would simplify this.

393–395

Again, using float3 for the min and max will reduce the repetition here, and below in this function.

399–400

We usually give an r_ prefix to return arguments, so this is a bit confusing : P I would avoid using this naming here. It's not immediately clear what "r_" is supposed to mean anyway.

433

Comments like this should generally use the imperative. -- Find the neighbors of all points and calculate weights

In general though, I'd like to see more focus on why in comments rather than what. For example, just above it's easy to see that points are being inserted in the grid, but other things like how the indexing is done or what the point of the grid is are much less apparent. Those are just two examples, and maybe bad ones, but you can always count on someone eventually having to read this code and make sense of it without all the context we have now.

This applies to comments elsewhere in the patch.

441

TBB is an optional dependency, so there has to be a fallback here. parallel_for in BLI_task.hh has a helper for this.

463

Just declare this two lines down, inside the loop

464

I would prefer to see a for loop here, it doesn't look like this needs to be a while loop.

This revision now requires changes to proceed. Jan 11 2021, 2:18 AM
  • Added TBB in CMakeLists.txt, cleanup and documentation.
source/blender/makesrna/intern/rna_nodetree.c
465 ↗(On Diff #32589)

What do you think about this tooltip?

source/blender/nodes/geometry/nodes/node_geo_point_distribute.cc
306–307

Will change this when float3::distance_squared() is patched.

Victor-Louis De Gusseme (victorlouis) retitled this revision from Geometry Nodes: add method for Poisson Disk sampling on mesh to Geometry Nodes: Yuksel's sample elimination algorithm.

@Victor-Louis De Gusseme (victorlouis) Are you still interested in working on this? I still think there is room for different algorithms in the point distribute node, but a fair amount has changed since this patch was written.

If not, I think this should be closed, otherwise it will just sit around. The patch is always here if we want to look back on it either way.

kursad k (kursadk) added a comment. Edited Aug 12 2021, 8:28 PM

The only shortcoming of this is that it only works with faces. Are there plans for supporting edges-only distribution, or objects without faces (like a mesh object that has only vertices and edges but no faces)?

The only shortcoming of this is that it only works with faces. Are there plans for supporting edges-only distribution, or objects without faces (like a mesh object that has only vertices and edges but no faces)?

This limitation was intentional at the time, as the other distribution methods we had only worked on faces.

The sample elimination should work on anything that you can scatter random points on.

source/blender/nodes/geometry/nodes/node_geo_point_distribute.cc
326

Why are you doing this instead of using BLI_bvhtree_get_bounding_box?

Is this patch still active?
The last question was asked a year ago and is still unanswered...