Page MenuHome

Fix: Realloc copies too much from previous allocation
Needs RevisionPublic

Authored by Hans Goudey (HooglyBoogly) on Jun 3 2022, 7:05 PM.

Details

Summary

For small alignments (less than four), realloc copied too many bytes
from the previous reallocation when growing a struct. This was caused
by the allocator storing the requested size aligned to the next 4 bytes,
and using that to decide how many bytes to copy to the new location.

This patch fixes this by removing the aligment 4 bytes for the requested
allocation size. The consensus was that this alignment is not necessary,
and conflicts/is redundant with other alignments based on the type size.

Here is a simplified version of the code that lead to the error:

/* Any type smaller than four bytes will cause a problem. */
int8_t *data = (int8_t *)MEM_calloc_arrayN(54, sizeof(int8_t), __func__);
memset(data, 0, sizeof(int8_t) * 54);
data = (int8_t* )MEM_recallocN(data, sizeof(int8_t) * 72);
for (int i = 0; i < 72; i++) {
  BLI_assert(data[i] == 0);
}

The change is co-authored by Jacques Lucke.

Diff Detail

Repository
rB Blender

Event Timeline

Hans Goudey (HooglyBoogly) requested review of this revision.Jun 3 2022, 7:05 PM
Hans Goudey (HooglyBoogly) created this revision.
Ray Molenkamp (LazyDodo) added inline comments.
intern/guardedalloc/intern/mallocn_lockfree_impl.c
40

that's a hard no for MSVC , if you really need bit 63, you'll have to move to a #define as msvc refuses to go over 32 bit in C

40
  • for enums

[hit submit too soon]

intern/guardedalloc/intern/mallocn_lockfree_impl.c
40

We need a bit so large that it will never actually be used. But I suppose this won't work if size_t is only 32 bits anyway, otherwise a define would do it.

intern/guardedalloc/intern/mallocn_lockfree_impl.c
40

this won't help you, but since the answer is unexpected, I'll answer it anyhow, but I'll be upfront the answer is super architecture specific :

surprisingly the answer is 48 on x64, you'll never see a pointer use bits 48..63 on x64 ever , you could even technically store data there directly in the pointer (given you sign extend the pointer before reading from it) however.....best not to :)

Hans Goudey (HooglyBoogly) edited the summary of this revision. (Show Details)

Use define, choose bit depending on sizeof(size_t)

intern/guardedalloc/intern/mallocn_lockfree_impl.c
40

Right, agreed, but I'm talking about a bit inside of MemHead::len, not storing information in a pointer.

intern/guardedalloc/intern/mallocn_lockfree_impl.c
40

I'll admit i'm splitting hairs here, so feel free to ignore (unless sergey wants to join me in being uneasy with this) linux has a 3/1 user/kernel memory split, so it is possible to successfully do a (single) > 2GB allocation making that upper bit less safe on x86 than it would be on x64.

it seems like an unlikely event (the address space is likely too fragmented to accommodate it), but the odds are not 0

What does this fix exactly? Is there a unit test which will show the issue (possibly when running with ASAN)? Or is there a few-liner code which will demonstrate the issue when using valgrind (in the case ASAN is not strict enough)?

I also wonder whether we still need to have such allocation alignment.

intern/guardedalloc/intern/mallocn_guarded_impl.c
436

actual_len -> allocation_len/allocation_size.

I don't think we have officially dropped support for all CPUs that require alignment for SIMD, but it also wouldn't surprise me if things are already broken somewhere due to lack of testing. There can also still be performance benefits by aligning data but not sure how much it affects us in practice.

I'm not super worried about > 2 GB allocations on 32 bit, but we could use uint64_t in MemHead there.

Related to that, the size of MemHeadAligned could be reduced by packing the alignment into len. We allow alignment up to 1024, but really the only values that make sense are 16 (SSE), 32 (AVX) and 64 (AVX512). All allocations are automatically aligned to 8 already.

intern/guardedalloc/intern/mallocn_guarded_impl.c
881

This is true no matter what now, so can be left out.

Brecht, sorry. Should have been more clear: I was talking about aligning allocation length to 4 bytes, not about alignment of pointer of allocated memory.

The changes in the guarded allocator might be inevitable due to the MemTail, but the lockfree allocator I think we can keep simple and not go the bit packing route by "simply" using malloc() of the given length (without aligning the length).

Hans Goudey (HooglyBoogly) edited the summary of this revision. (Show Details)
Hans Goudey (HooglyBoogly) edited the summary of this revision. (Show Details)
Hans Goudey (HooglyBoogly) edited the summary of this revision. (Show Details)

Remove alignment to the nearest 4

Hans Goudey (HooglyBoogly) marked 6 inline comments as done.Jun 8 2022, 12:06 PM

Hi @Brecht Van Lommel (brecht) and @Sergey Sharybin (sergey). I believe I've made the changes we discussed above to the patch. The crash mentioned above with resizing customdata layers still causes problems, so it would be nice with commit this.

This revision is now accepted and ready to land.Jul 21 2022, 8:17 PM
Sergey Sharybin (sergey) requested changes to this revision.Jul 22 2022, 9:52 AM

guardedalloc regression tests fails with this change applied on my mac. Didn't have time to investigate details, but a closer look is needed before this can land. Also trigerred a build of this change in buildbot to see if it is something macOS specific: https://builder.blender.org/admin/#/builders/18/builds/544

On a code side the direction of change is what I was envisioning. Suggestion would be to add an assert to allocation function to ensure that the high bit is available.

This revision now requires changes to proceed.Jul 22 2022, 9:52 AM