
UTF-8 support for text editor
Closed, Archived | Public | PATCH

Description

This patch adds UTF-8 support for space_text. Tested under Windows and Linux. No changes to the Python API or GHOST.

It adds a do_versions block, because Blender currently handles not only ASCII characters but also ISO/IEC 8859-1 characters (http://en.wikipedia.org/wiki/ISO/IEC_8859-1), which should be converted to two-byte UTF-8.
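The conversion described above can be sketched roughly as follows. This is not the code from the patch: latin1_to_utf8 is a hypothetical helper name, and the sketch assumes a NUL-terminated input and plain malloc rather than Blender's allocator. Every ISO/IEC 8859-1 byte at or above 0x80 maps to the Unicode code point of the same value, which encodes to exactly two UTF-8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not taken from the patch): converts a NUL-terminated
 * ISO/IEC 8859-1 string into a newly allocated UTF-8 string.
 * The caller owns the result and must free() it. Returns NULL if
 * the allocation fails. */
static char *latin1_to_utf8(const char *src)
{
    assert(src != NULL);

    size_t len = strlen(src);
    /* Worst case: every byte expands to two bytes, plus the terminator. */
    char *dst = malloc(len * 2 + 1);
    if (dst == NULL) {
        return NULL;
    }

    char *out = dst;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        if (c < 0x80) {
            /* ASCII is unchanged in UTF-8. */
            *out++ = (char)c;
        }
        else {
            /* U+0080..U+00FF: lead byte 110xxxxx, then continuation 10xxxxxx. */
            *out++ = (char)(0xC0 | (c >> 6));
            *out++ = (char)(0x80 | (c & 0x3F));
        }
    }
    *out = '\0';
    return dst;
}
```

For example, the ISO/IEC 8859-1 byte 0xE8 ("è") becomes the two bytes 0xC3 0xA8.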

This patch also fixes 2 problems:
1) If the cursor is placed after the third character in a line, txt_move_left always moves the cursor straight to the beginning of the line.
2) If a line contains 2 spaces and the cursor is located at the end of the line, txt_move_left moves the cursor to the beginning of the line, but ctrl+z moves the cursor right over only 1 character.

How to read this patch:
In most of the space_text functions, the int variables can be separated into 2 groups: so-called "memory" variables (byte offsets into the text) and "view" variables (character or column positions on screen). E.g. in text_draw_wrapped you can see:
int basex, i, a, start, end, max, lines; /* view */
int mi, ma, mstart, mend; /* mem */

These "memory" variables cannot simply be incremented or decremented, nor can they be compared with or assigned to "view" variables. It's like assigning a length to a mass. A "memory" variable can be incremented with BLI_str_utf8_size, decremented with BLI_str_prev_char_utf8, and compared with or assigned to another "memory" variable. If this rule is followed, everything should be OK.
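A minimal sketch of the memory/view rule, assuming valid, NUL-terminated UTF-8 input. The helpers utf8_size and utf8_prev are simplified stand-ins written for this example, not the real BLI_str_utf8_size and BLI_str_prev_char_utf8, whose signatures may differ. mem_to_view and view_to_mem are hypothetical names, and "view" here counts characters only (tab expansion and wrapping are ignored).

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for BLI_str_utf8_size (assumption): byte length of the UTF-8
 * sequence starting at p, derived from its lead byte. Returns 1 for an
 * unexpected byte so that callers always advance. */
static int utf8_size(const char *p)
{
    unsigned char c = (unsigned char)*p;
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Stand-in for BLI_str_prev_char_utf8 (assumption): steps back from p to
 * the first byte of the previous character by skipping continuation bytes. */
static const char *utf8_prev(const char *start, const char *p)
{
    assert(p > start);
    do {
        p--;
    } while (p > start && ((unsigned char)*p & 0xC0) == 0x80);
    return p;
}

/* "memory" -> "view": number of characters that precede byte offset mem_offset. */
static int mem_to_view(const char *line, int mem_offset)
{
    int mem = 0;  /* mem: byte offset, advanced only by utf8_size */
    int view = 0; /* view: character count, advanced by 1 */
    while (mem < mem_offset && line[mem] != '\0') {
        mem += utf8_size(line + mem);
        view++;
    }
    return view;
}

/* "view" -> "memory": byte offset of the character at column view_col. */
static int view_to_mem(const char *line, int view_col)
{
    int mem = 0;
    while (view_col > 0 && line[mem] != '\0') {
        mem += utf8_size(line + mem);
        view_col--;
    }
    return mem;
}
```

With the line "#§èà" (7 bytes, 4 characters), the end of the line is memory offset 7 but view column 4, which is why the two kinds of variables must never be mixed directly.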

There are some changes to the undo stack too. There is no reason to allocate 4 bytes for every ASCII character in the undo stack, so I created 4 versions of the insert, backspace and delete opcodes: for ASCII, 2-byte UTF-8, 3-byte UTF-8, and 4-byte Unicode. This can save a lot of memory when only BMP Unicode characters are used.
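The idea of width-specific opcodes can be illustrated with the sketch below. The opcode names and values, and the record layout (one opcode byte followed by the payload), are assumptions made for this example and are not taken from the patch. Only the insert variants are shown; backspace and delete would follow the same pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcode values, one per payload width in bytes. */
enum {
    HYP_UNDO_INSERT_1 = 1, /* ASCII, 1 byte */
    HYP_UNDO_INSERT_2 = 2, /* 2-byte UTF-8 */
    HYP_UNDO_INSERT_3 = 3, /* 3-byte UTF-8, covers the rest of the BMP */
    HYP_UNDO_INSERT_4 = 4  /* 4 bytes, code points above U+FFFF */
};

/* Picks the narrowest insert opcode able to hold code point cp. */
static int undo_insert_opcode(unsigned int cp)
{
    assert(cp <= 0x10FFFF); /* highest valid Unicode code point */
    if (cp < 0x80) return HYP_UNDO_INSERT_1;
    if (cp < 0x800) return HYP_UNDO_INSERT_2;
    if (cp < 0x10000) return HYP_UNDO_INSERT_3;
    return HYP_UNDO_INSERT_4;
}

/* Size of one undo record under the assumed layout:
 * one opcode byte plus a payload whose width equals the opcode value. */
static size_t undo_record_size(unsigned int cp)
{
    return 1 + (size_t)undo_insert_opcode(cp);
}
```

Under these assumptions, an ASCII character costs 2 bytes per record instead of 5, and any BMP character costs at most 4.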

Event Timeline

I did some testing on Mac and the patch seems to be working fine. I found one issue:

* Open text editor, enable syntax highlighting, paste "#§èà", type enter => Memoryblock SyntaxFormat: end corrupt in console
* Same steps as above, on quit => Error: Not freed memory blocks: 1, strdup len: 8 0x1041b17a8

Nice catch, I missed one early return statement that comes before my call to MEM_freeN. I also completely forgot to change some lines in the syntax highlighter.

I fixed it in the second version of the patch. And special thanks for testing on Mac: I was afraid that copy & paste would not work (as was the case with X11 and Win32 GHOST).

I only had a quick glance at the code. I can name one thing which remains unsolved: Blender's monospace font doesn't support many languages. Maybe if International Fonts is enabled in userprefs, the monospace font could also be changed to something with better Unicode coverage; or maybe just configurable fonts...

Well, more tests and comments about code tomorrow :)

Uploaded the patch to codereview for easier review/discussion: http://codereview.appspot.com/5541053/
Comments about things I'm not sure about are there.

And some reports. To reproduce them, you'll need a line indented with a real tab character, with the cursor placed at the beginning of this line (better to use the arrow keys for navigation).

- The right arrow doesn't skip the tab character fully; 4 presses of the right arrow key are needed to move the cursor.
- When you try to place the cursor in the middle of that line using the mouse, you get a deadlock.

Codereview is now at http://codereview.appspot.com/5541055/ , so now I have permission to upload new patch sets.

Both of the problems are fixed there now.

Patch applied on rev. 43427.

Sv. Lockal (lockal) changed the task status from Unknown Status to Unknown Status. Jan 16 2012, 5:25 PM