Page MenuHome

BLF: Remove Fixed Pitch Code
Needs ReviewPublic

Authored by Harley Acheson (harley) on Dec 19 2021, 2:25 AM.

Details

Summary

Simplification of code by removing redundant functions that are
specific to the drawing of fixed-pitch (monospaced) text.


We have some unnecessary duplicated font functions. We have BLF_draw and we also have BLF_draw_mono with the latter used only for fixed-pitched use, while the former works for either fixed or proportional fonts. The same goes for blf_font_draw and blf_font_draw_mono. You might think there is some optimization with monospace text, but there really is not. In order to print any character we need to load a GlyphBLF and it contains an integer advance that works for fixed or monospaced and is no slower to access than reading this from somewhere else.

To illustrate this confusion we currently call BLF_draw_mono four times. Yet we call BLF_draw with the *"blf_mono_font"* 19 times. So we are far more likely to draw mono text using the regular call, probably because the "mono" version was not noticed. But they work the same anyway. It is nice to remove this complication and extra code.

As part of this I also removed text_font_draw_character_utf8 as this was a duplication of text_font_draw_character, but specific to single codepoints specified by a UTF-8 sequence and with slightly different arguments. This patch just makes text_font_draw_character work with any single character or sequence.

Summary showing how much code is removed:

source/blender/blenfont/BLF_api.h             |  1 -
source/blender/blenfont/intern/blf.c          | 18 -----------
source/blender/blenfont/intern/blf_font.c     | 37 ---------------------
source/blender/blenfont/intern/blf_internal.h |  4 ---
source/blender/editors/space_info/textview.c  |  4 +--
source/blender/editors/space_text/text_draw.c | 46 +++++++++------------------
6 files changed, 17 insertions(+), 93 deletions(-)

Diff Detail

Repository
rB Blender

Event Timeline

Updated to current state of master.

Updated to the current state of master.

Mono-spaced drawing uses BLI_wcwidth unlike drawing regular fonts.
This is needed so drawing matches other font width calculations that call BLI_wcwidth internally (e.g. BLI_str_utf8_char_width_safe).

From a quick check this causes selection & cursor not to match the text displayed (some text in https://www.w3.org/2001/06/utf-8-test/UTF-8-demo.html for e.g).

Also the patch causes this text below to render cramped (as well as not matching the cursor position).

Chinese: 我能吞下玻璃而不伤身体。
Chinese (Traditional): 我能吞下玻璃而不傷身

So I think it's best the functionality is left working as-is.


Requesting changes although any refactoring to this area would probably be better submitted as a separate patch.

Campbell Barton (campbellbarton) requested changes to this revision.Feb 4 2022, 3:42 AM
This revision now requires changes to proceed.Feb 4 2022, 3:42 AM
Harley Acheson (harley) planned changes to this revision.Feb 4 2022, 7:27 PM

campbellbarton
Mono-spaced drawing uses BLI_wcwidth unlike drawing regular fonts.
...
So I think it's best the functionality is left working as-is.

Good catch. Although quite sure these problems are worked out in later plans. When I've tested with fallback fonts along with variable glyph widths I had nice mixing of Latin letters, Chinese characters, and symbols in Text Editor with proper selection, etc.

Lets ignore this for now and maybe circle back in six months or a year or whatever. Setting to "Plan Changes" so I don't lose track.

Just adding some notes to clarify what is meant by "Mono-spaced drawing uses BLI_wcwidth unlike drawing regular fonts. This is needed so drawing matches other font width calculations that call BLI_wcwidth internally (e.g. BLI_str_utf8_char_width_safe)."

The following capture is from Text editor with Chinese characters taking up twice the width of monospaced Latin characters:

So this would have to be properly supported by the code for variable fonts and for fallback stack before this code line can be removed.

@Campbell Barton (campbellbarton)

This version gets rid of the duplicate "mono" code, but also supports BLI_wcwidth (fixed-width columns of variable width). It does it in a way that fits with my (hopefully) future changes. This means that the Chinese characters could come from a different font entirely or have widths that differ from multiples of the Latin fixed width. So if the Latin text is drawing at a fixed-pitch width of 20 pixels and the same size Chinese (separate) font has a width of 35 it will be centered, or if 45 will be slightly resized to 40.

There is still a bug where the cursor and displayed fonts don't line up: see: https://www.w3.org/2001/06/utf-8-test/UTF-8-demo.html

While I see there can be some benefits in this change (we might even want to include the additional flag), I don't see much/any benefit in removing BLF_draw_mono.

The mental model is simple, drawing is done in a grid, the mono-spaced drawing function returns how many columns were used, which can then be multiplied by the fixed width.

That this patch causes the cursor to get out of sync with drawing (while it can be fixed) is an indication that merging mono-spaced drawing into regular font drawing is causing unnecessary complexity.

It seems a bit like this patch is attempting to solve a problem we don't have, while introducing code-paths that are likely to diverge causing bugs which are difficult to solve.


All that said, it would be possible to keep BLF_draw_mono as-is, while sharing draw logic internally, this would allow word wrap and other features to be used with mono-spaced fonts, so it's not necessarily all or nothing.

source/blender/editors/space_text/text_draw.c
86

This removed the c_len argument (which is still used by callers).

Campbell Barton (campbellbarton) requested changes to this revision.Feb 10 2022, 1:29 AM
This revision now requires changes to proceed.Feb 10 2022, 1:29 AM

@Campbell Barton (campbellbarton) -It seems a bit like this patch is attempting to solve a problem we don't have, while introducing code-paths that are likely to diverge causing bugs which are difficult to solve...

Yeah, I'm not too in love with this one. But working on this today helped me fix a bug in another patch (D12622) so it isn't all a loss. LOL

Harley Acheson (harley) added a project: Core.

Updated to the current state of master. Many changes in my earlier version are now in master, so this now is a nice reduction in code and complexity.

@Campbell Barton (campbellbarton) - There is still a bug where the cursor and displayed fonts don't line up: see: https://www.w3.org/2001/06/utf-8-test/UTF-8-demo.html

Prior changes to master fixed the issues with double-width monospaced characters.

This removed the c_len argument (which is still used by callers).

Fixed.

Updated to the current state of master.

Harley Acheson (harley) edited the summary of this revision. (Show Details)Jul 31 2022, 4:17 AM

Updated to the current state of master.

@Campbell Barton (campbellbarton) - How are you feeling about this one? You had objections earlier but that was when it was a lot messier. With subsequent cleanup in BLF this patch is now dead-simple and is a nice code removal.

I think we'll have a hard time getting to bidirectional text and complex shaping without consolidation of our print output code, as that is going to be very disruptive.

Checking on this patch again and discovering more issues, while are some benefits to reducing code duplication, having an explicit mono-spaced font drawing function ensures the characters remain in a grid layout, even for low quality fonts or some situations (where they might not be *exactly* mono-spaced). I don't see this as a problem we necessarily need to avoid.

Now if there are unexpected widths in a mono-spaced font, the problem of ensuring fixed with grid-drawing is moved to the caller (Blender's code) where we are forced to step by each character and draw to ensure expected spacing.
Even though we can always claim there is an error in the font not being properly mono-spaced, from the users perspective it will seem like a bug in Blender whenever the cursor or selection gets out of sync with the text.

With this patch applied Python's console isn't displaying text properly for some characters. This screenshot shows before (left) / after (right). Notice how the text runs off the screen (on the right hand side).


Since quite some effort has gone into this patch and there are still issues... and it's not actually fixing a pressing problem, I'd rather keep existing code as-is.

@Campbell Barton (campbellbarton) - Python's console isn't displaying text properly for some characters....

Actually, your screenshots illustrate master and the patch BOTH failing that test document, just in different ways. Those sample items are supposed to look like this: à=à; é=é; î=î, õ=õ; ū=ū, ő=ő; č=č ç=ç; ḅ=ḅ; ḏ=ḏ, ḙ=ḙ ǖ=ǖ=ǖ, ǡ=ǡ=ǡ ǭ=ǭ=ǭ, ó̷=ǿ=ǿ and every single one fails in master (and in older versions of Blender too).

I'd much rather fix that than simply favor the existing failing result.

In master you see that variable-width text output handles those strings correctly, while monospaced does not:

And the reason why you are seeing a different result in Console versus Text Editor is that the former uses blf_font_draw_mono while the latter uses blf_font_draw with the mono font. This is exactly why these output functions should be consolidated.

Harley Acheson (harley) updated this revision to Diff 54589.EditedAug 10 2022, 10:36 PM

Now correctly applying Unicode combining characters, unlike current code (as illustrated earlier in thread). Following shows the sample text displaying correctly in Text Editor (left) and in Console (right):

The problem mentioned previously remains for wide characters

.

More generally, I would like to see you testing your own patches more thoroughly, each time I look at this patch it's not taking me long to discover regressions. Of course some oversights can be hard to avoid - but modifying the internals of text rendering is a significant enough change it warrants thorough testing before proposing it to be included.
This also points to the need for font rendering tests since validating correct output should be automated.

My opinion of the this patch hasn't changed since my last reply, however it seems your update to blf_glyph.c might be investigated as a fix separate from this patch.

@Campbell Barton (campbellbarton)

Monday evening I found an issue with the cache subsystem that I didn't want to leave in for days since I was going the next day to Siggraph. But that revert broke font fallback in that it could not find a glyph in a font that did not have its face - was missing an blf_ensure_face. I just committed that one-line fix.

The problem mentioned previously remains for wide characters

.

I don't see any problems that were previously mentioned about wide characters.

You have mentioned support for double-width characters with proper selection and that remains working:

You sent the link to that UTF8 test page, but all of that appears to be perfect. One section:

Another:

And that test with combing characters works. It would be difficult to anticipate needing to support a feature that we've never had working correctly.

With this last set, I'm not certain what you are wanting, but this is what I see. We are missing some of these glyphs, but they appear to be gothic characters in U+10330 to U+1034F range. Are you wanting us to support these? Would be easy enough. One of the characters in your sample text is an "aegean weight fourth subunit" (U+1013B)

After adding a gothic font:

it's not taking me long to discover regressions

It would be nice to have a list of features that we are meant to support here. Otherwise hard to guess we support double-wide, combining characters (but not really), or that you want "aegean weight fourth subunit". I'm guessing that you know the features we support because you wrote them, but I can only guess about things I cannot see.

it seems your update to blf_glyph.c might be investigated as a fix separate from this patch.

Yes, for sure. I'll make a separate patch for that.

Quick follow up on unicode_mono_issues.txt as you weren't able to redo the different display, unfortunately with these characters I was able to get different display before/after the patch, but in simplifying the example it removed the differences.

This file, isn't simplified & is some generated random unicode characters.

Running this in the python console shows them: open('unicode_mono_issues_v3.txt').read()

This is only to show that this patch displays text differently, with this example I'm not sure the current monospaced drawing is working properly either, so some investigation is needed to what is changing and if that's useful.

If the current code isn't working properly though, I'd rather first fix it before considering this patch though.

@Campbell Barton (campbellbarton) - This file, isn't simplified & is some generated random unicode characters.

Not that random. Almost all the characters are Arabic and the majority of the page contains a single character that is pretty special...

U+FDFD is a part of the Arabic Presentation Forms-A. It is actually an entire phrase turned into a single glyph. Because it is actually multiple words you will see it quite differently in different fonts. Some will show it as a single wide character, others over two lines, and some a three lines of text inside a single square

This is that character as show in whatever font is now showing this text: ﷽

Here it is in a font that draws this single character as if it is three lines of text in a box:

This is how I see that same single character in Windows notepad:

This is only to show that this patch displays text differently

Yes, I want it to display differently. This is how that document looks in Windows Notepad:

This is how it looks in Blender 2.93

This is how I can get it looking in Blender 3.4

Above is with D15659: BLF: Editing Text with Combining Characters and with a change to BLI_wcwidth so that 0xfdfd returns a width of 3. And fixing a bug in the fallback code that is not looking in our supplied Arabic font for Arabic Presentation Forms-A characters.

However, it is currently impossible to get to anywhere close to perfect with current code and Arabic, so this test cannot pass completely with small measures.

Arabic is right to left. And it is complex and shaped. So in order to do that correctly we have to get to a point where we allow a difference between the logical characters that are in our text buffer and the visual glyphs that are displayed.

For RTL text this means that they are stored in our buffers as LTR but are shown RTL. And a word might start with a particular codepoint in our buffer but will use a different one when shown because of the position of that character in the word. That same codepoint inside a word will use yet another one to display. Some language also swap order mid-word depending on context too.

Our current text loops are very simple. We grab bytes at a time from the input string so we can process a UTF32 character at a time. We assume that is always a codepoint to display, get the glyph index for it, show that glyph. repeat.

But do something really simple, like just supporting ligatures. In English many fonts have a single glyph ligature for the combination of letters "fi", because in some fonts the hood of the "f" interferes with the the dot of the "i". This means we cannot process a single character at a time. We might have "fit" in our buffer, so {ox66, 0x69, 0x74} but will actually print out the codepoints 0xFB01 and 0x74.

To do all this right we have to change our processing to work with the entire string at a time, not per character.

We change the utf8 string into an array of UTF32 codepoints. We give that FriBiDi and it gives us back another array of codepoints that are in the correct visual order (note that we can have a single string that contains languages in different orders like "Hello اَلْعَرَبِيَّةُ". Then we have to break this array us into runs of characters that have the same base direction and script ("Hello " separate from "اَلْعَرَبِيَّةُ" in this example.) We give each run to Harfbuzz to "shape", so ligatures are inserted, codepoints are changed depending on location in word and context. It gives us back an array of glyph index (for a single font) and positioning information that we use to output correctly.

This all seems difficult but doable to do in most of our UI cases. But what looks extremely difficult will be properly handling text selection, text caret insertion, and text editing like backspace and delete. As a simple taste, just try selecting different parts of the following: "Hello اَلْعَرَبِيَّةُ Hello"

These big changes to how we deal with text is why I keep trying to remove redundancy in our code like this patch and in D15462: BLF: Consolidate Text Output / Processing. Rather than make these changes all over the place I'd rather do this in less places.