
GHash enhancements.
Abandoned · Public

Authored by Bastien Montagne (mont29) on Mar 3 2015, 10:02 AM.

Details

Summary

This patch adds several enhancements to our current GHash code:

  • Set-like checks (disjoint, equality, subset/superset).
  • Set-like operations (union, intersection, difference, 'symmetric difference').
  • Some murmur2a-based hashing helpers.
  • Better performance:
    • Reduce the load factor from 3.0 to 0.75.
    • Store hashes in entries.
    • Add masking (commented out for now) as an alternative to modulo for mapping raw uint hashes to bucket indices.
  • Tests and performance tests.

For performance statistics and analysis, see the matching task T43766.

Notes/open issues:

  • Masking instead of modulo is disabled for now, because with non-hashed keys modulo tends to give better results. This should be investigated further, but I would keep the masking code either way; it only affects a few low-level functions.
  • The proposed patch significantly increases the memory footprint (60% for GHash, 86% for GSet), but I think the performance gains are worth it.
  • murmur2a hashing did not show much benefit here, even with strings (though most hashed strings in the tests are rather short, mere words). Not sure we want to keep it for now?
  • Hashing helpers in general could use more cleanup, and could probably be moved to their own header/file. I would keep that for later though.
  • Not quite sure about the 'symmetric difference' name; our function actually does more than that, but I could not find a canonical name for the operation...

Diff Detail

Repository
rB Blender
Branch
temp-ghash-experiments

Event Timeline

I would actually be quite skeptical about accepting 2x memory usage. To my knowledge, GHash can be used for quite heavy meshes, in which case it's really problematic to increase memory consumption. It also feels a bit weird to have the hash in the entry itself. It feels like, after all the changes you did, there should not be so many hash/value comparisons; if it's still an issue, I guess we need to look into the particular hash/comparison functions and optimize them instead. If that's not possible and storing the hash will really help, could we store the hash in the user key for those particular GHashes?

There is also some rather nice and useful stuff here, so I guess what we should do is:

  • Commit the set-like utilities (after the git is open!)
  • Commit the gtests (as a separate commit)
  • All the murmur2 and modulo changes could also go in, I think
  • But again, we should be careful about the memory-increasing changes. Try to get rid of the hash stored in the key (since it has a negative effect on cache coherency as well). That seems like it would avoid most of the memory usage increase?

But all the details I'll leave to @Campbell Barton (campbellbarton) :)

Feedback:

Comments

  • GHASH_FLAG_ALLOW_SHRINK, good!
  • Utility functions (didn't read over in detail, but they seem generally good); would prefer this be a separate patch.
  • mm2a: no strong opinion; if it gives better results, I don't see why not.

Concerns

  • re: storing the hash: not really keen on this. While it's a nice speedup, there are areas which still store many elements (dyntopo), and we run into out-of-memory issues a reasonable amount. Even if this is a lot faster in a micro-benchmark, it would be interesting to see if it gives a noticeable speedup in a real-world use case. Is dyntopo sculpting noticeably faster, for example?
  • re: modulo hash: this means we *must* have hash functions giving a good bit distribution. While hash functions should do this anyway, not all do (some currently use simple pointer/int masks which won't necessarily give such evenly distributed bits). Unless this gives a really good, measurable speedup, I would rather use the existing modulo.
  • re: BLI_ghash_add: while it's not 100% exactly the same, it looks like the places it's used could also use BLI_ghash_reinsert. Note: we could have a lower-level insertion function which inserts (if needed) and returns a pointer to the value (**value), so the caller can choose how to handle it. I'm not totally against this, but I think we should re-think how to handle the different insertion cases, since there are already quite a few functions and it's getting harder to keep track of the subtle differences between them.
tests/gtests/blenlib/BLI_ghash_performance_test.cc:319

Does it leak memory here?

Also, the tests here need to be split up by iteration count/size etc., so you'll have IntGHash12000 and IntGHash10000000 tests. The same applies to the other tests in this file.

Bastien Montagne (mont29) edited edge metadata.
  • Merge branch 'master' into temp-ghash-experiments
  • Factorize gset/ghash add/reinsert functions.
  • GTests GHash performance cleanup (strictly one test per GTest).
  • Cleanup (reduce passing entry sizes around a bit, by storing the size in the GHash itself).
  • Making hash storage optional.
  • Merge branch 'master' into temp-ghash-experiments

> I would actually be quite skeptical about accepting 2x memory usage. To my knowledge, GHash can be used for quite heavy meshes, in which case it's really problematic to increase memory consumption. It also feels a bit weird to have the hash in the entry itself. It feels like, after all the changes you did, there should not be so many hash/value comparisons; if it's still an issue, I guess we need to look into the particular hash/comparison functions and optimize them instead. If that's not possible and storing the hash will really help, could we store the hash in the user key for those particular GHashes?

> re: storing the hash: not really keen on this. While it's a nice speedup, there are areas which still store many elements (dyntopo), and we run into out-of-memory issues a reasonable amount. Even if this is a lot faster in a micro-benchmark, it would be interesting to see if it gives a noticeable speedup in a real-world use case. Is dyntopo sculpting noticeably faster, for example?

Storing the hash is in fact a huge benefit for expensive hashing/comparison funcs (typical example: strings), while it gains nearly nothing (or even loses a few percent of performance) with very fast hashing/comparison (typical examples: pointers, integers…).

So in this new patch I made storing the hash optional. This is not so easy, since it forces us to get rid of the dirty hack used for GSet and define four real entry structs, but I think it's worth it.

PS: with the curve stroke in this test file, 'drawing' it with dyntopo enabled takes about 36 seconds on master and 26.5 seconds on the branch, for me (more than a 25% speedup, and that's measuring the whole 'draw stroke' call, which involves other things than GHash operations ;) ). This gain is almost purely due to the load factor being reduced from 3.0 to 0.75.

Regarding memory usage, note that storing the hash and reducing the load factor have about the same impact here (about 8 more bytes per entry each).

> re: modulo hash: this means we *must* have hash functions giving a good bit distribution. While hash functions should do this anyway, not all do (some currently use simple pointer/int masks which won't necessarily give such evenly distributed bits). Unless this gives a really good, measurable speedup, I would rather use the existing modulo.

Well, right now the modulo code is used. In the 'draw dyntopo stroke' test above, enabling masking gains about one more second (so something like 3 or 4% better than modulo); pure hashing tests show about a 10% gain. So it's not mandatory, but I would still like to keep it; we can enable it later once we have checked that distribution issue. (Note however that even with a horrible distribution, a quality factor of around 4 generated by masking specially-crafted 'bad' integers, the lookup speed drop was not that terrible. I think it would be worth investigating always doing a very basic, simple hashing in all cases, but that is more of a detail for later.)

> re: BLI_ghash_add: while it's not 100% exactly the same, it looks like the places it's used could also use BLI_ghash_reinsert. Note: we could have a lower-level insertion function which inserts (if needed) and returns a pointer to the value (**value), so the caller can choose how to handle it. I'm not totally against this, but I think we should re-think how to handle the different insertion cases, since there are already quite a few functions and it's getting harder to keep track of the subtle differences between them.

Factorized the code from GHash/GSet add/reinsert (I would keep the naming though, or change it for both GSet and GHash, maybe to something like ..._insert_safe(GHash *gh, void *key, (void *val), bool update, etc.)? But this forces us to pass NULL funcs when you just want to add a key if it's not yet present. :/

tests/gtests/blenlib/BLI_ghash_performance_test.cc:319

No, ghash is freed by helper funcs (int_ghash_tests() in this case).

Have split tests as required.

What is the speedup (if any) for dynamic topology if you disable use_store_hash?

As said, the gain on dyntopo is only due to the load reduction; storing the hash has absolutely no (measurable) effect here.

In fact, storing the hash only benefits expensive hashing/comparison (i.e. string keys). We talked a bit about it yesterday with Campbell on IRC; the issue is that in our current string GHash usages, the GHash never seems to have a huge impact on performance (e.g. with RNA prop lookup, the Python overhead is so huge that even a twice-quicker GHash lookup is unmeasurable).

I could try to check other string cases (like pose bones), but I have already spent quite a bit of time on this and would not expect much different results. I think we'll probably rather remove that feature, at least for now. :/

Patch has been subdivided in D1178, D1179, D1180 and D1181.