MultiFormatWriter: fix encoding binary std::string #599
markusfisch wants to merge 1 commit into zxing-cpp:master
While `MultiFormatWriter::encode(std::wstring)` can encode binary data
just fine, `MultiFormatWriter::encode(std::string)` corrupts the data
by trying to interpret it as UTF-8.
This may also break immediately because not all byte combinations are
allowed in UTF-8.
So when `_encoding` is set to `CharacterSet::BINARY`, the `std::string`
needs to be converted to a `std::wstring` of unsigned values, because
`TextEncoder::GetBytes(std::wstring)` calls `ToUtf8()` which cannot
handle negative values.
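The conversion described above can be sketched roughly as follows. This is only an illustration, not the code from this commit: `widenBytes` is a hypothetical helper name that does not exist in zxing-cpp. It assumes the goal is simply to map every byte to a wide character in the range 0..255, which is why each `char` goes through `unsigned char` first. On platforms where plain `char` is signed, a direct `char` to `wchar_t` conversion would sign-extend bytes from `0x80` upwards.

```cpp
#include <string>

// Hypothetical helper (not part of zxing-cpp): widen every byte of a binary
// std::string to a wide character in the range 0..255.
std::wstring widenBytes(const std::string& bytes)
{
    std::wstring out;
    out.reserve(bytes.size());
    for (char c : bytes) {
        // Go through unsigned char first so bytes >= 0x80 are not
        // sign-extended where plain char is a signed type.
        out.push_back(static_cast<wchar_t>(static_cast<unsigned char>(c)));
    }
    return out;
}
```

With this sketch, `widenBytes(std::string("\x7e\x7f\x80\x81"))` yields four wide characters with the values `0x7e`, `0x7f`, `0x80` and `0x81`.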
For example, without this commit, try:

    auto writer = MultiFormatWriter(BarcodeFormat::QRCode)
        .setEncoding(CharacterSet::BINARY);
    auto bitmap = writer.encode(std::string("\x7e\x7f\x80\x81"), 200, 200);

Which will result in: "ValueError: Unexpected charcode".
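To see why these particular bytes cannot pass through a UTF-8 decoder unchanged, here is a minimal structural check. It is a sketch under stated assumptions, not zxing-cpp code: `looksLikeUtf8` is a hypothetical name, and the check only verifies that each lead byte is followed by the right number of continuation bytes. It does not reject overlong encodings, surrogates, or code points above U+10FFFF.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of zxing-cpp): structural UTF-8 check only.
bool looksLikeUtf8(const std::string& s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra = 0; // number of continuation bytes expected
        if (c < 0x80)
            extra = 0; // single-byte (ASCII) sequence
        else if ((c & 0xE0) == 0xC0)
            extra = 1; // 110xxxxx
        else if ((c & 0xF0) == 0xE0)
            extra = 2; // 1110xxxx
        else if ((c & 0xF8) == 0xF0)
            extra = 3; // 11110xxx
        else
            return false; // stray continuation byte or invalid lead byte

        if (extra > s.size() - i - 1)
            return false; // sequence is cut off at the end of the input

        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(s[i + k]);
            if ((cont & 0xC0) != 0x80)
                return false; // expected a 10xxxxxx continuation byte
        }
        i += extra + 1;
    }
    return true;
}
```

For the input above, `0x7e` and `0x7f` are plain ASCII, but `0x80` and `0x81` are continuation bytes without a preceding lead byte, so the check returns `false`.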
Please note ZXingWriter is NOT affected by this!
This works just fine:

    $ printf "\\x7e\\x7f\\x80\\x81" > file
    $ example/ZXingWriter -binary QRCode file out.png

Because:
1) it's calling `MultiFormatWriter::encode(std::wstring)` and
2) builds the `std::wstring` from `uint8_t`

Your fix/workaround is exactly what I did in

Okay, I look forward to the new API then 😉

@axxel DataMatrix seems to be especially problematic here, probably because of this transformation. What surprised me a little is that even UTF-8 sequences apparently cannot be encoded at the moment (at least with the stock
Here's a little sample for trying
On the other hand,
So before taking action I thought it would be wise to check your opinion on that.

Exactly. This has indeed come up already earlier this year (see here).
Very wise ;). I have to admit that my progress on the new Writer API has stalled recently. I have an early prototype-ish hack lying around, but there was a lack of time and there are still quite a few open questions regarding the types used in the API (as mentioned in #332). If you need this 'now' I see two options:

Thanks for the quick reply! I think I'll try to hack something together then as it is a bit urgent, but of course, I would always happily be the first to alpha-test your solution! 😉 If you maybe already have a concrete idea, please tell me so 😉
For a QRCode, `MultiFormatWriter::encode(std::string("\x7e\x7f\x80\x81"))` will run:
`ToUtf8(str)`
`str` will now be `0x7e, 0x7f, 0xff, 0xbf, 0xbe, 0x80, 0xff, 0xbf, 0xbe, 0x81`