Page MenuHome

Fix on-demand font loading/unloading in VSE
Needs ReviewPublic

Authored by Richard Antalik (ISS) on May 1 2020, 2:36 PM.

Details

Summary

Text strip data were freed each time during depsgraph update and BLF_load() had to run always before rendering.
BLF_load() was executed from non-main thread by render job, which shouldn't happen.

This patch loads fonts once and keeps them in font table until they are removed by user or file is closed.

There were attempts to load font on both original and COW scene, and various oddities in user counting.
I tried to rectify those - I don't want to change usercount during depsgraph update or during undo. I don't know how to check if I am freeing data because of undo properly.
I used hack to get around this problem:

// XXX This is only way I can tell if I should actually touch font usercount.
do_user = (scene->ed->cache) != NULL;
sh.free(seq, do_user);

Looking back at https://developer.blender.org/D3621?id=12926#inline-35582 where this scenario was discussed.
Proposed solution was to use BLF_load_unique. Technically I would need to account for 3 fonts per strip - for preview, prefetch and render. That is quite limiting.

I would probably want to clarify what is thread unsafe here:

  • VFont is used only because there is existing UI template for it. Only usage in code is reading VFont->name and usercount handling.
    • I am not quite sure if it is ok to use it like this without any further checks. Seems quite safe to me.
  • Unloading font may (likely) cause crash during rendering and prefetching.
    • Probably not hard to fix, Perhaps we could use FontBLF->glyph_cache_mutex here. But this is different problem.
  • I am not sure if using non-unique font is problematic either, because sequencer rendering is locked by mutex. But ok, there may be conflict between VSE and another code using same font.
    • If this is valid problem, perhaps I could replace global font table with ghash? or perhaps have sequencer font cache + interface to use fonts from local caches?

Diff Detail

Repository
rB Blender
Branch
T75822-2 (branched from master)
Build Status
Buildable 7932
Build 7932: arc lint + arc unit

Event Timeline

Richard Antalik (ISS) requested review of this revision.May 1 2020, 2:36 PM
Richard Antalik (ISS) created this revision.
Sergey Sharybin (sergey) added inline comments.
source/blender/blenkernel/intern/seqeffects.c
3850–3851

This seems weird to call BKE_sequencer_text_font_load() with do_id_user=true and then effectively cancel if out doing id_us_min.

3858–3870

Would say that BKE_sequencer_text_font_load() is to be avoided if doing copy for depsgraph.

source/blender/blenkernel/intern/sequencer.c
250–251

It is very weird to override do_user behavior here.

If it is to prevent CoW versions of strips from freeing vfont it can happen more explicitly by storing an ownership flag which will indicate whether strip shares the font with someone else or whether it owns it.

510

Unrelated.

source/blender/makesrna/intern/rna_sequencer.c
622

Generally, no heavy actions should happen in set().

You still be to be able to render sequencer right after opening a file, so I don't see a need to force load here.

source/blender/blenkernel/intern/seqeffects.c
3858–3870

I need to copy data from original strip, but not increase usercount when doing copy for depsgraph.
But in case of doing real copy by duplicating strips I have to increase usercounts.

source/blender/blenkernel/intern/sequencer.c
250–251

I don't have problem with VFont - that is fine to handle.
Problem is FontBLF, because you have to manage it's usercount, and also it is global, so if you don't do it correctly, they will be unremovable.

The issue here is, that I can't tell what is happening here. Is this CoW update / undo / loading new .blend file?
Only in case of CoW update I shouldn't modify usercount. Ideally I would say you should be able to check for these events?

I should double check this code because there BKE_sequencer_cache_destruct(scene); always runs before this.
But my point stands mostly I think.

source/blender/makesrna/intern/rna_sequencer.c
622

Issue here is, that BLF_load() was executed from non-main thread by render job, which shouldn't happen.

As complete alternative solution may be to make these BLF functions thread safe, that is still an option.
Otherwise it must be guaranteed, that BLF_load() will run from main thread.

source/blender/makesrna/intern/rna_sequencer.c
622

But if you setup VSE scene, save and reload, this code isn't run. So somewhere BLF_load() is to happen. So where is this case handled?

I'm not sure why this bug is marked as high priority since it's an assert failing where we have not seen an actual bug for users. The probability of a UI thread and sequencer thread loading/unloading fonts at the same time is not that high so I wouldn't expect that to be causing much trouble right now, to be considered a high priority bug for 2.83.

The issue is fundamentally that with BLF, it is not safe to load, unload or draw the same font from multiple threads. Adding mutexes to make load and unload safe is not that hard. Basically we would add a global mutex for the entire body of those BLF functions and it should be fine. As long as a thread or data structure is holding a reference to the font it will not be freed by another thread.

The bigger problems are in the drawing. One is that drawing loads glyphs on demand, and the implementation of that is not thread safe. This can be fixed similarly with mutexes. It's more performance sensitive but we can probably implement this without significant overhead. We can avoid mutex locking whenever the glyph is already cached.

The other is that the current drawing state (position, size, blur, ..) is stored in the font. Instead there should be a font drawing state per-thread and shared font. The font drawing state does not need to be stored persistently anywhere, it would be something you set up in a function that draws text and then discard.

So, this all requires some refactoring of the BLF module, and more than would be safe for 2.83 at this point. It could be backported to LTS if it's proven to be stable in master.

I'm not sure why this bug is marked as high priority since it's an assert failing where we have not seen an actual bug for users. The probability of a UI thread and sequencer thread loading/unloading fonts at the same time is not that high so I wouldn't expect that to be causing much trouble right now, to be considered a high priority bug for 2.83.

I somehow thought this will crash even release for some reason. I have re-checked - it doesn't, so I lowered priority.

The bigger problems are in the drawing. One is that drawing loads glyphs on demand, and the implementation of that is not thread safe. This can be fixed similarly with mutexes. It's more performance sensitive but we can probably implement this without significant overhead. We can avoid mutex locking whenever the glyph is already cached.

This should be safe actually (rBSa960dc451930) in a sense that it shouldn't crash, but as you said, another thread may change font size before drawing happens.

The other is that the current drawing state (position, size, blur, ..) is stored in the font. Instead there should be a font drawing state per-thread and shared font. The font drawing state does not need to be stored persistently anywhere, it would be something you set up in a function that draws text and then discard.

So, this all requires some refactoring of the BLF module, and more than would be safe for 2.83 at this point. It could be backported to LTS if it's proven to be stable in master.

Yes, I think it is more sensible to fix these issues in BLF. But font loading and unloading we have now is not correctly implemented either.

So TODO here is:

  • Make font loading/unloading thread safe
  • Extract font parameters from FontBLF and pass them explicitly to drawing function
  • Use (fix) on-demand font loading/unloading in VSE
source/blender/makesrna/intern/rna_sequencer.c
622

This will happen in BKE_sequence_base_dupli_recursive() during DEG_evaluate_on_refresh() which will happen on load. It shouldn't happen there, but it does now.

I should probably bring back SEQ_FONT_NOT_LOADED flag and just don't unload/load fonts during CoW update. So we will still trigger the assert, but fonts will be loaded on demand.

  • Revert changes in SEQ_FONT_NOT_LOADED usage
  • Load FontBLF on-demand, explicitly
  • Unload FontBLF explicitly in BKE_sequence_free_ex
  • Split and disambiguate Vfont and FontBLF loading/unloading
  • Fix bug effect copy bug in seq_dupli - seq_dst and seq_src was swapped
Richard Antalik (ISS) marked 2 inline comments as done.
  • Fix null deref when doing copy-paste
Richard Antalik (ISS) marked 2 inline comments as done.
  • Make change in BKE_sequence_free_ex more readable

Sorry for these small updates, some changes are better visible in side by side view. Should do that offline...

Richard Antalik (ISS) marked 2 inline comments as done.May 6 2020, 11:12 AM

Fix bug effect copy bug in seq_dupli - seq_dst and seq_src was swapped

Can this get its own patch? Will make it faster to review and get committed. Will also make this change smaller and easier.

Split and disambiguate Vfont and FontBLF loading/unloading

Not immediately clear to me why is this needed?

Can you also summarize what is the overall idea of the fix?

source/blender/blenkernel/intern/seqeffects.c
3802–3803

If data can indeed be NULL then it should be if (data == NULL || data->text_blf_id < 0). Otherwise you have NULL-pointer dereference few lines below.

Is also weird to have checks < 0 in some cases and comparison with SEQ_FONT_NOT_LOADED in others.

Richard Antalik (ISS) marked an inline comment as done.
  • Fix precondition, add comment to clarify case
Brecht Van Lommel (brecht) requested changes to this revision.May 6 2020, 11:51 PM
Brecht Van Lommel (brecht) added inline comments.
source/blender/blenkernel/intern/seqeffects.c
3802–3803

I guess having a #define SEQ_FONT_DEFAULT -1 could help here?

Then the test would be (data == NULL || ELEME(data->text_blf_id, SEQ_FONT_NOT_LOADED, SEQ_FONT_DEFAULT)

3806

I would expect data->text_blf_id = SEQ_FONT_NOT_LOADED; here.

3812–3813

This should be (data == NULL || data->text_blf_id != SEQ_FONT_NOT_LOADED)

3827–3845

I think these vfont load/unload functions should be removed, the name is really confusing since it does not actually load, unload.

Instead it should just do id_us_min/plus wherever these are called now.

source/blender/blenkernel/intern/sequencer.c
250–251

I don't understand why ID user counting is involved here.

What I would expect is that every sequence strip that has a text_blf_id holds a reference. If it is copied the reference count is increased. If it is freed the reference count is decreased. It shouldn't matter if that sequence strip is original, CoW or anything else?

source/blender/makesrna/intern/rna_sequencer.c
620–621

I think it should not be modifying BLF here at all, if it's just about vfont?

Because if there is a font loaded and it's just being cleared here, BLF font reference counting will be wrong.

This revision now requires changes to proceed.May 6 2020, 11:51 PM
Richard Antalik (ISS) marked 6 inline comments as done.
  • Fix precondition, add comment to clarify case

Oops wrong message - Only adressed inlines.

source/blender/blenkernel/intern/sequencer.c
250–251

I will increase usercount of FontBLF only if it is already loaded when copying strip and remove CoW exception.

Correct me if i'm wrong, but seems it is main thread which is supposed to load font before it can be used during render?

At this point, isn't it easier to "just" cover BLF functions with mutex? Shouldn't be introducing measurable slowdown for the interface, but will allow to render text strips form any initial state of the sequencer.

Correct me if i'm wrong, but seems it is main thread which is supposed to load font before it can be used during render?

At this point, isn't it easier to "just" cover BLF functions with mutex? Shouldn't be introducing measurable slowdown for the interface, but will allow to render text strips form any initial state of the sequencer.

Putting locks in individual BLF functions will not work, because there are separate function calls to set position, color, etc. So that state needs to become per thread. Or each thread needs to put a lock around setting up that state and drawing, but at that point I think we might as well go with the proper solution.

@Brecht Van Lommel (brecht), ah, you're right. So the proper fix is really to go and refactor BLF in a way that position/color is not stored in the font.
Regarding to this patch, I'm still confused. Does it fix the issue or silences the assert? Not sure how all that is making BLF usage more thread safe.

Richard Antalik (ISS) retitled this revision from Fix T75822: Assert fails when rendering text strip with non-default font to Fix on-demand font loading/unloading in VSE.Jul 7 2020, 10:03 AM

@Brecht Van Lommel (brecht), ah, you're right. So the proper fix is really to go and refactor BLF in a way that position/color is not stored in the font.
Regarding to this patch, I'm still confused. Does it fix the issue or silences the assert? Not sure how all that is making BLF usage more thread safe.

Originally I tried to silence assert with this patch, now honestly I don't remember if it will silence it or not. I would consider this a fix/cleanup for not-so-optimal existing code.