JSC fix utf8 string creation #140
Pull request overview
This pull request attempts to fix a range error caused by invalid byte sequences in UTF-8 string creation by replacing the deprecated std::wstring_convert and std::codecvt_utf8_utf16 approach with a simpler implementation using JSStringCreateWithUTF8CString directly. The change removes the UTF-8 to UTF-16 conversion path and instead creates a null-terminated copy of the input string.
Changes:
- Removed <locale> and <codecvt> includes
- Replaced UTF-8 to UTF-16 conversion with direct UTF-8 string creation
- Added comments explaining the rationale for avoiding deprecated APIs
The new implementation will incorrectly truncate strings that contain embedded null bytes. When a length is specified, the std::string constructor correctly includes all bytes (including embedded nulls), but JSStringCreateWithUTF8CString treats the input as null-terminated and will stop at the first null byte it encounters.
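The truncation can be reproduced without JavaScriptCore. The sketch below uses a hypothetical helper, `bytesSeenByCStringApi`, which is not part of the PR; it assumes only that a null-terminated API stops reading at the first null byte, the same way `std::strlen` does.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper (not from the PR): how many bytes a null-terminated
// API would read from `s`. std::strlen stops at the first '\0', which is the
// assumption made here about any C-string based entry point.
std::size_t bytesSeenByCStringApi(const std::string& s) {
    return std::strlen(s.c_str());
}
```

For `std::string s("ab\0cd", 5)`, `s.size()` is 5 but `bytesSeenByCStringApi(s)` is 2, so everything after the embedded null would be dropped.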
The previous implementation converted to UTF-16 using JSStringCreateWithCharacters which accepts a length parameter and can handle embedded nulls correctly. While the std::wstring_convert approach is deprecated, a better fix would be to implement manual UTF-8 to UTF-16 conversion (similar to the utf16toUTF8 function in V8InspectorUtils.cpp) and continue using JSStringCreateWithCharacters, or use a non-deprecated UTF-8 to UTF-16 conversion library.
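A manual conversion along the lines suggested above might look like the following. This is a minimal sketch, not the code from the PR or from V8InspectorUtils.cpp: the name `utf8ToUtf16` is hypothetical, and replacing malformed input with U+FFFD (rather than throwing a range error) is an assumed policy.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: decode `length` bytes of UTF-8 into UTF-16.
// Embedded nulls are preserved because the loop is driven by `length`.
// Malformed, overlong, or truncated sequences become U+FFFD (assumed policy).
std::u16string utf8ToUtf16(const char* data, std::size_t length) {
    constexpr char16_t kReplacement = 0xFFFD;
    std::u16string out;
    out.reserve(length);
    std::size_t i = 0;
    while (i < length) {
        const auto lead = static_cast<unsigned char>(data[i]);
        char32_t cp = 0;
        std::size_t extra = 0;   // number of continuation bytes expected
        char32_t minValue = 0;   // smallest code point allowed for this length
        if (lead < 0x80) { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; minValue = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; minValue = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; minValue = 0x10000; }
        else { out.push_back(kReplacement); ++i; continue; }

        bool valid = i + extra < length;  // all continuation bytes must be in range
        for (std::size_t k = 1; valid && k <= extra; ++k) {
            const auto c = static_cast<unsigned char>(data[i + k]);
            if ((c & 0xC0) != 0x80) valid = false;
            else cp = (cp << 6) | (c & 0x3F);
        }
        // Reject overlong forms, values above U+10FFFF, and encoded surrogates.
        if (valid && (cp < minValue || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)))
            valid = false;
        if (!valid) { out.push_back(kReplacement); ++i; continue; }

        i += extra + 1;
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;  // split into a surrogate pair
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

The resulting buffer and its `size()` could then be handed to JSStringCreateWithCharacters, which takes a pointer and a character count. That call likely needs a cast from `char16_t*` to `JSChar*`, which is an assumption about the platform's `JSChar` typedef and should be checked against the JavaScriptCore headers in use.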
Fix a range error caused by a byte sequence that is considered invalid. Also, this API has been deprecated since C++17.