clore::support::ensure_utf8
Owner: Module support
Declaration: support/logging.cppm:75
Definition: support/logging.cppm:405
Declaration: Namespace clore::support
Implementation
cpp
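auto ensure_utf8(std::string_view text) -> std::string {
    std::string normalized;
    normalized.reserve(text.size());
    for(std::size_t offset = 0; offset < text.size();) {
        auto sequence_length = valid_utf8_sequence_length(text, offset);
        if(sequence_length == 0) {
            // Invalid byte: emit the replacement and resynchronize one byte later.
            normalized += kUtf8Replacement;
            ++offset;
            continue;
        }
        // Valid sequence: copy it through unchanged.
        normalized.append(text.substr(offset, sequence_length));
        offset += sequence_length;
    }
    return normalized;
}
The function clore::support::ensure_utf8 walks the input string by byte offset. Its control flow hinges on valid_utf8_sequence_length, which reports whether the bytes starting at the current offset form a valid UTF-8 sequence. When that call returns a non-zero length, the corresponding substring is appended to the result unchanged and the offset advances by that length. When it returns zero (an invalid lead byte or a malformed sequence), the predefined replacement kUtf8Replacement is appended and the offset advances by exactly one byte, so decoding resumes at the next byte and tolerates the illegal sequence. The result buffer is reserved up front at the input's size, which avoids reallocation when the input is already valid; if the replacement is longer than one byte, inputs with invalid bytes can still grow past that reservation. The implementation depends on no external state: every decision is made locally from the bytes at the current offset.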
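The extracted code does not show valid_utf8_sequence_length or the value of kUtf8Replacement, so the sketch below fills both in with assumptions to make the loop testable in isolation. The names sequence_length_sketch, ensure_utf8_sketch and kReplacementSketch are hypothetical, the validator follows the standard well-formed UTF-8 byte ranges rather than the project's actual helper, and the replacement is assumed to be U+FFFD encoded as three bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Assumption: the replacement is U+FFFD encoded as UTF-8 (three bytes).
inline constexpr std::string_view kReplacementSketch = "\xEF\xBF\xBD";

// Hypothetical stand-in for valid_utf8_sequence_length: returns the byte
// length (1 to 4) of the well-formed sequence at `offset`, or 0 if there is none.
inline auto sequence_length_sketch(std::string_view text, std::size_t offset) -> std::size_t {
    assert(offset < text.size());
    auto byte = [&](std::size_t index) { return static_cast<unsigned char>(text[index]); };
    const unsigned char lead = byte(offset);
    if(lead < 0x80) {
        return 1;  // ASCII
    }

    std::size_t length = 0;
    unsigned char low = 0x80;   // allowed range for the second byte
    unsigned char high = 0xBF;
    if(lead >= 0xC2 && lead <= 0xDF) {
        length = 2;
    } else if(lead >= 0xE0 && lead <= 0xEF) {
        length = 3;
        if(lead == 0xE0) low = 0xA0;   // rejects overlong encodings
        if(lead == 0xED) high = 0x9F;  // rejects UTF-16 surrogates
    } else if(lead >= 0xF0 && lead <= 0xF4) {
        length = 4;
        if(lead == 0xF0) low = 0x90;   // rejects overlong encodings
        if(lead == 0xF4) high = 0x8F;  // rejects values above U+10FFFF
    } else {
        return 0;  // stray continuation byte or invalid lead byte
    }

    if(text.size() - offset < length) {
        return 0;  // sequence is cut off by the end of the input
    }
    if(byte(offset + 1) < low || byte(offset + 1) > high) {
        return 0;
    }
    for(std::size_t index = 2; index < length; ++index) {
        if((byte(offset + index) & 0xC0) != 0x80) {
            return 0;  // not a continuation byte
        }
    }
    return length;
}

// Same loop shape as the documented function, wired to the stand-ins above.
inline auto ensure_utf8_sketch(std::string_view text) -> std::string {
    std::string normalized;
    normalized.reserve(text.size());
    for(std::size_t offset = 0; offset < text.size();) {
        const auto length = sequence_length_sketch(text, offset);
        if(length == 0) {
            normalized += kReplacementSketch;
            ++offset;
            continue;
        }
        normalized.append(text.substr(offset, length));
        offset += length;
    }
    return normalized;
}
```

Under these assumptions, valid input passes through byte for byte, while each invalid byte is replaced individually: a truncated two-byte tail yields two replacements, not one. Advancing by a single byte after a failure is what lets the loop resynchronize on the next valid lead byte.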
Side Effects
No observable side effects are evident from the extracted code.
Reads From
- parameter text
- function valid_utf8_sequence_length
- constant kUtf8Replacement
Writes To
- returned std::string
Usage Patterns
- called by write_utf8_text_file to ensure valid UTF-8 before writing
- called by truncate_utf8 to sanitize input before truncation
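Sanitizing before truncation matters because a byte-budget cut is only simple when every sequence is known to be well formed. The sketch below is a hypothetical helper, not the project's truncate_utf8, whose signature does not appear in the extracted code. It assumes its input is already valid UTF-8, for example the output of ensure_utf8, and backs the cut up so it never splits a multi-byte sequence.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper for illustration only. Precondition (assumed): `text`
// is well-formed UTF-8, so every continuation byte follows a lead byte.
inline auto truncate_at_boundary_sketch(std::string_view text, std::size_t max_bytes)
    -> std::string_view {
    if(text.size() <= max_bytes) {
        return text;  // already within budget
    }
    std::size_t cut = max_bytes;
    // Continuation bytes have the form 10xxxxxx. If the first dropped byte is
    // one, the cut sits inside a sequence, so step back to that sequence's start.
    while(cut > 0 && (static_cast<unsigned char>(text[cut]) & 0xC0) == 0x80) {
        --cut;
    }
    return text.substr(0, cut);
}
```

For the six-byte input "h\xC3\xA9llo", a budget of 2 would land between the two bytes of the accented letter, so this sketch returns just "h"; a budget of 3 keeps the whole letter. On unsanitized input the back-up loop could walk over arbitrary stray continuation bytes, which is one reason to run the sanitizer first.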
Calls
- function clore::support::(anonymous namespace)::valid_utf8_sequence_length
Called By
- function clore::support::truncate_utf8
- function clore::support::write_utf8_text_file