Page MenuHome

Better support for loading OCIO configuration from a path with non-ASCII chars on Windows
Needs RevisionPublic

Authored by Tamito Kajiyama (kjym3) on Jun 18 2014, 7:04 AM.

Details

Summary

Better support for loading OCIO configuration from a path with non-ASCII chars on Windows.

The present "workaround" based on BLI_get_short_name() may fail depending on
inputs because BLI_get_short_name() does not guarantee that the output string is
free of non-ASCII chars.

The proposed solution is to convert the given input file name (in UTF-8) into the
Windows default ANSI code page (CP_ACP as in Windows.h).

Diff Detail

Event Timeline

Overall seems reasonable changes which removes legacy hack to deal with non-ascii paths. Couple of minor notes tho.

source/blender/blenlib/intern/fileops.c
245

Need function prototype in header file.

260

I would use stack variable in the callee and pass maximal size of buffer to here instead of using malloc.

Would make function smaller as well.

This revision now requires changes to proceed.Jan 14 2015, 4:40 PM
source/blender/blenlib/intern/fileops.c
245

Eh oops. Must have had some glitch, didn't see the prototype in the first review ,but see it now.

So false alarm i think.

Campbell Barton (campbellbarton) requested changes to this revision.Jan 14 2015, 4:46 PM
Campbell Barton (campbellbarton) edited edge metadata.

General comments...

Windows path handling with non ascii characers is quite an issue and we already have ufopen, uaccess... etc.

A concern I have with this is we have multiple solutions for the same thing, for eg, we want to access a file on windows. we could use BLI_fopen() or and pass in a FILE * if the external API supports that, or we could use BLI_get_filename_acp.

We can do both of course, and it seems like maybe its needed, but its just not something I like to do unless we really have to.


Code maintainability

Everywhere BLI_get_filename_acp is used will need an ifdef WIN32,... so this API call isn't really for general use, its for interfacing external libs which only take char * paths as input.
So I think maybe we sould make this part of utf_winfuncs.h.

Then API interfaces that need it can include the low level library.

Possible Bugs?

One more thing, after the conversion strcmp() on the paths wont match up, so if we use if for more areas, we cant search for existing paths, this means I rather not scatter this through our code (in BLI_) but restrict usage to opening files via external libs.

@Campbell Barton (campbellbarton), OCIO does not support FILE* so you either need to pass proper ascii file path or use std::istream. First one is what this patch does, second one i'm not sure how you're gonna to do assuming we're pure C in there.

We can probably try constructing istream from a proper ascii file path or from FILE*, but that i'm not sure is possible.

We can also probably just to do this ascii part in ocio_capi, which would keep blender code a bit more clear.

Conclusion can be: we accept this and add to utf_winfuncs.h, with comment to use only when API's dont provide alternative access methods, and take care where strcmp of filepaths is involved (such as storing paths in DNA)

Hi @Campbell Barton (campbellbarton) and @Sergey Sharybin (sergey), thanks for the code review. Fine to me to move BLI_get_filename_acp() to intern/utfconv/utf_winfunc.h and leave notes there for restricted use of the function. I will duly update the patch.

source/blender/blenlib/intern/fileops.c
260

@Sergey Sharybin (sergey),
The most reliable way here is to let WideCharToMultiByte() calculate the required output buffer size. The number of output bytes is arbitrary and it is not safe to assume a fixed number here.

(For instance I am reasonably familiar with Shift JIS, that is the default ANSI code page in Japanese editions of Windows, so I could use a fixed-size buffer here for code brevity or some other reason. But it is difficult at least for me to check out all other ANSI code pages and make sure the fixed-size buffer is always long enough.)

Another option could be to turn BLI_get_filename_acp() into a thin wrapper of WideCharToMultiByte() and have the caller calculate the buffer size as follow:

int bufsize = BLI_get_filename_acp(filename_utf, NULL, 0); /* 0 for calculating buffer size */
char *filename_acp = MEM_mallocN(bufsize + 1, "temporary file name buffer");
BLI_get_filename_acp(filename_utf, filename_acp, bufsize);

But I guess this is not what you meant.

@Tamito Kajiyama (kjym3), for BLI we should stick to whatever is native DNA. And it uses FILE_MAX which is 1024. You can't do any operation with files if they don't fit in there.

And really currently i don't see reason for this function in BLI, it's mainly fitting to utf_winfuncs.h. So if i was working on this patch i would:

  • Move the function you wrote to utf_winfuncs.h. It could allocate memory so it's really portable etc.
  • Make OCIO C-API converting path using this function.
  • Keep blender's side really clean and not worrying about this windows path codename thingie issues.