Page MenuHome

GHash - code reorganization, performance enhancements, add a few missing utils to API.
ClosedPublic

Authored by Bastien Montagne (mont29) on Mar 13 2015, 1:14 PM.

Details

Summary

This patch is the root of the GHash rework, all other diff will be based on it:

Reduce average load from 3.0 to 0.75

This is the big performance booster part, e.g. makes tracing a dyntopo stroke between 25% and 30% faster.

Not much to say about it, aside that it obviously increase memory footprint (about 25% - 30% too).

Add optional shrinking

I.e. ghashes/gsets can now shrink their buckets array when you remove enough entries. This remains optional and OFF by default.

Add code to use masking instead of modulo

Buckets indices are obtained from hashes by “reducing” the hash value into the valid bucket range. This can be done either by bit-masking, or using modulo operation.
The former is quicker, but requires real hashes, while the later is slower (average 10% impact on ghash operations) but can also be used as a 'fake' hashing on raw values, like e.g. indices.

In Blender currently not all ghash usages actually hash their keys, so we stick to modulo for now (masking is ifdef’ed out), we may however investigate the benefits of switching to masking with systematic very basic hashing later…

Add various missing API helpers

I.e. a way to deep-copy a ghash/gset, a way to (re-)reserve entries (i.e. manually grow or shrink the ghash after its creation), and an 'add' function to ghash matching gset one (to have a way to 'add or nop' for ghashes as well).

Various code refactoring

  • Get rid of the 'hack' regarding ghash size when used as gset (it’s simpler and safer to have two structs defined here, and cast pointers as needed).
  • Various re-shuffle and factorization in low-level internal code.
  • Some work on hashing helpers, introducing some murmur2a-based hashing too.

Diff Detail

Repository
rB Blender

Event Timeline

Bastien Montagne (mont29) retitled this revision from to GHash - code reorganization, performance enhancements, add a few missing utils to API..
Bastien Montagne (mont29) updated this object.
Bastien Montagne (mont29) set the repository for this revision to rB Blender.
source/blender/blenlib/intern/BLI_ghash.c
246

It could be more clear to just have whole loop in both branches. A bit of code duplication but would make it easier to follow i think.

Campbell Barton (campbellbarton) requested changes to this revision.Mar 18 2015, 3:55 PM
Campbell Barton (campbellbarton) edited edge metadata.

In general, good... but commenting on some things which seem unnecessary.

source/blender/blenlib/BLI_ghash.h
258–263

These could be _ex functions? - sometimes you just want to know the quality.

source/blender/blenlib/intern/BLI_ghash.c
107

Why is the gset flag moved to a bool?

186

Why wouldn't there be a buckets_old ?

322

hash arg could be removed then.

341

hash arg could be removed then

This revision now requires changes to proceed.Mar 18 2015, 3:55 PM
source/blender/blenlib/BLI_ghash.h
258–263

Wasn't sure it was worth it, but yes, easy to add. Will do.

source/blender/blenlib/intern/BLI_ghash.c
107

Hmm… org idea was, checking whether a bool is true/false is quicker than checking a bitflag… Since size is not an issue here and this is checked pretty often…

Not an expert in those low-level stuff though, if you think a mere bitflag is better will switch it back to bitflag. :)

186

Because ghash_resize_buckets() is (indirectly) called from ghash_buckets_reset() (also used by ghash_new()), where gh->buckets == NULL (to pre-reserve a given amount of entries).

246

Yes, agree. Will do

322

Kept it for sake of consistency, but yes, we can remove it. Will do.

source/blender/blenlib/intern/BLI_ghash.c
107

The speed difference between checking a flag or a boolean is pretty much unmeasurable. (from my own tests).

Its not horrible to add a bool. just a bit odd to store some settings in bools, some in flags. The nice thing about flags is you can add more generally without worrying about padding or struct size.

Bastien Montagne (mont29) edited edge metadata.
Bastien Montagne (mont29) updated this object.

Addressed coments from review.

Campbell Barton (campbellbarton) requested changes to this revision.Mar 19 2015, 2:23 PM
Campbell Barton (campbellbarton) edited edge metadata.
Campbell Barton (campbellbarton) added inline comments.
source/blender/blenlib/BLI_ghash.h
73

Is this really needed for the patch, would rather check on alternatives to insert as a separate patch.

source/blender/blenlib/intern/BLI_ghash.c
372–373

Would rather args are left as is, pass const unsigned int entry_size, rather then is_gset, at least not sure what this really improves.

1766

Could just make the arg an unsigned short too?

1894–1908

This seems like an internal detail which shouldn't leak into the API.

This revision now requires changes to proceed.Mar 19 2015, 2:23 PM

Note: re BLI_ghash_buckets_size - if its useful to expose some real low level internals for GHash testing - we could have some #ifdef GHASH_INTERNAL_API defines that only tests use.

Answers inlined…

source/blender/blenlib/BLI_ghash.h
73

This is to make it consistent with GHash… Nothing mandatory, if you really do not like it we can keep it for later…

source/blender/blenlib/intern/BLI_ghash.c
372–373

No, think it’s much better to deffer dirty ghash/gset management to the lowest possible level of internal functions. Since anyway we need that info (ghash and gset now use different structs for entries), it’s better to only get actual size here, than in callers. And avoids an additional parameter, too.

1766

Nah, actually, since I reverted GHash->flag to integer, those conversions should be removed, sorry about that.

1894–1908

Those are in the "debug and instrospection" section - nice info to have in gtests eg.

Bastien Montagne (mont29) edited edge metadata.
  • Address coments from review.
This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.

(cursing phab…)

Bastien Montagne (mont29) edited edge metadata.
  • Squashed commit of temp-ghash-experiments, minus the 'hash storage' part.
  • Remove gtests.
  • Remove set ops (union etc.).
  • Address coments from review.
  • Add comment about load factor, 'hide' stats ghash functions behind an ifdef in header.
  • Remove BLI_ghash_add for now.
  • Merge branch 'master' into temp-ghash-basis
  • Tweak GHASH_INTERNAL_API use
  • Remove dummy redefinitions of 'ghash internal API' functions, not needed anymore.
  • minor edits
  • Split expand/shrink into own functions
  • remove unneeded ghash strink calls & dont force expand/shrink to be inline
  • Shrink: Early return in case we are not allowed to shrink.
  • rename ghash_buckets_shrink -> ghash_buckets_contract
  • minor edits
  • pass flag directly to internal ghash_new function
  • fix for error clearning a gset, was removing the gset flag
  • Make ghash_copy do actual exact copy (same number of buckets).
  • rename bucket_hash -> bucket_index
This revision was automatically updated to reflect the committed changes.