Why does the length() method, when called on a string in C++, count each Cyrillic character as two characters? It looks as if length() counts bytes.
#include <iostream>
#include <string>
using namespace std;
int main() {
    string name;
    cin >> name;
    cout << name.length() << '\n';
    return 0;
}
Answer 1, Authority 100%
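Cyrillic characters occupy 2 bytes each in the UTF-8 encoding, and length() returns the number of bytes (char elements) in the string, not the number of characters.
To count characters, the answer here suggests using the mbstowcs function: https://stackoverflow.com/questions/5117393/utf-8-strings-length-in-linux-c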
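As a rough sketch of the mbstowcs approach (not taken from the linked answer): the helper name mb_length is hypothetical, and the code assumes a platform where passing a null destination makes mbstowcs only count the wide characters, as POSIX specifies. The result depends on the current C locale, and on systems with a 16-bit wchar_t it would count UTF-16 units rather than characters.

```cpp
#include <clocale>
#include <cstdlib>
#include <string>

// Hypothetical helper: number of characters in a multibyte string as decoded
// by the current C locale. Returns static_cast<std::size_t>(-1) if the bytes
// are not a valid sequence in that locale.
std::size_t mb_length(const std::string& s) {
    // With a null destination nothing is written; the call only counts
    // (POSIX behaviour, assumed here).
    return std::mbstowcs(nullptr, s.c_str(), 0);
}
```

For UTF-8 input to be decoded, the program would need to select a UTF-8 locale first, for example by calling std::setlocale(LC_ALL, "") once at startup in an environment whose locale is something like en_US.UTF-8. In the default "C" locale, non-ASCII bytes are likely to be rejected.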
Answer 2
Using utf8-cpp:
#include "utf8.h"
size_t lengthOfUtf8String(const std::string& utf8_string) {
    return utf8::unchecked::distance(utf8_string.begin(), utf8_string.end());
}
std::cout << lengthOfUtf8String(name) << std::endl;
Answer 3
Who said it's two? In UTF-16 it is two bytes, in UTF-32 it is four, and in UTF-8 the size is variable …
A simple string class using C++11 / Boost is here: https://github.com/sitev/cjcore/blob/master/src/object.cpp
Look for the String class there; it is simple enough to understand …
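To illustrate the variable width of UTF-8, here is a minimal sketch that counts code points without any library. It assumes the input is valid UTF-8 and simply skips continuation bytes (those of the form 10xxxxxx); the function name count_code_points is made up for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts Unicode code points in a string that is
// assumed to hold valid UTF-8. No validation is performed.
std::size_t count_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        // Every code point has exactly one byte that is NOT a continuation
        // byte (continuation bytes match the bit pattern 10xxxxxx).
        if ((c & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For the word "Привет" stored as UTF-8, length() reports 12 while this function reports 6. Note that it counts code points, so a letter followed by a combining accent would still be counted as two.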