How do I make Python work with UTF-8 encoding?
A file.pyc is generated (or maybe it isn't; I couldn't figure that out myself) with win1251 headers from the UTF-8 file.py.
Here is the actual source file, in case it helps.
The file is UTF-8 encoded without a BOM:
    #!/usr/bin/python
    # -*- coding: utf-8 -*-
    import string
    import sys

    print('''Content-type: text/html
    ''')
    print('''<html><head><title>Check Python</title></head><body>Russian text - Nerusskii text<br><ul>''')
In the browser I get a page with these headers:
    Content-Type: text/html; charset=UTF-8
and this content (the browser interprets it as UTF-8, but the file is win1251):
    <html><head><title>Python</title></head><body> - Nerusskii text<br><ul>
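The symptom above can be reproduced in isolation. This is a minimal sketch, assuming the body reaches the browser as win1251 (cp1251) bytes while the header announces UTF-8; the sample string is illustrative and is not the original page text.

```python
# Sample body with Cyrillic and Latin text (illustrative only).
body = "Русский текст - Nerusskii text"

# Assumption: the server sends the body as win1251 (cp1251) bytes...
sent = body.encode("cp1251")

# ...while the browser, trusting "charset=UTF-8", decodes them as UTF-8.
# Byte sequences that are invalid in UTF-8 become U+FFFD replacement characters.
seen = sent.decode("utf-8", errors="replace")

# ascii() keeps the demo printable on consoles that cannot show U+FFFD.
print(ascii(seen))
```

The Latin part survives because ASCII bytes mean the same thing in both encodings; only the Cyrillic part is lost.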
UPD (many years later)
The problem turned out to be in Apache:
    AddDefaultCharset UTF-8
    SetEnv PYTHONIOENCODING utf8
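To see what the PYTHONIOENCODING part changes, the effect can be checked outside Apache. This is a rough sketch that assumes Python 3 and starts a child interpreter with the variable set, much as `SetEnv` would set it for a CGI script; `child_stdout_bytes` and `CHILD_CODE` are hypothetical names used only for this illustration.

```python
import os
import subprocess
import sys

# Child script kept ASCII-only: the \u escapes spell the Russian word "Привет".
CHILD_CODE = r"print('\u041f\u0440\u0438\u0432\u0435\u0442')"


def child_stdout_bytes(io_encoding):
    """Run a child Python with PYTHONIOENCODING set and return its raw stdout."""
    env = dict(os.environ, PYTHONIOENCODING=io_encoding)
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        env=env,
        stdout=subprocess.PIPE,
        check=True,
    )
    return result.stdout.strip()  # drop the trailing newline


utf8_bytes = child_stdout_bytes("utf8")
cp1251_bytes = child_stdout_bytes("cp1251")

# The same print() call produces different bytes depending on the variable.
print(utf8_bytes)
print(cp1251_bytes)
```

The script itself is identical in both runs; only the environment differs, which is consistent with the fix living in the server configuration rather than in file.py.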
Answer 1, authority 100%
Try adding this line to the header of your .py file:
    # -*- coding: utf-8 -*-
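It may help to see what this declaration actually controls. As far as I know, it only tells the interpreter how to decode the bytes of the source file; it does not choose the encoding that print uses for output. Below is a small sketch, assuming Python 3, that compiles the same UTF-8 bytes under two different declarations; `run_source` is a hypothetical helper for this demo.

```python
# The same source text, saved as UTF-8 bytes, with two different declarations.
text = "Привет"
utf8_source = ('# -*- coding: utf-8 -*-\ns = "%s"\n' % text).encode("utf-8")
wrong_source = ('# -*- coding: cp1251 -*-\ns = "%s"\n' % text).encode("utf-8")


def run_source(source_bytes):
    """Compile raw source bytes (honouring the coding line) and return s."""
    namespace = {}
    exec(compile(source_bytes, "<demo>", "exec"), namespace)
    return namespace["s"]


declared_right = run_source(utf8_source)  # literal decoded correctly
declared_wrong = run_source(wrong_source)  # UTF-8 bytes misread as cp1251

print(declared_right == text)
print(declared_wrong == text)
```

With the matching declaration the string literal comes through intact; with the mismatched one it turns into mojibake before any output is written.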
Answer 2, authority 60%
- Which program was the code written in?
- Have you tried experimenting, for example by specifying win1251 as the encoding in the headers? It is also possible that the font used in the browser has no support for Russian letters in UTF. (I remember that on Windows XP, and likewise on Vista, I had to hard-code the system font because of this, since many applications knew nothing about Russian or about many other languages.)
- It is also worth trying to tell the print calls explicitly which encoding the string uses.
- It is also not entirely clear where print sends its output; whatever receives it may not know about Russian letters. When a "system" that does not know about Russian letters receives such a letter as a sequence of bytes, it splits it in two (think of SMS, where a message holds more English characters than Russian ones), so the output contains 2 independent characters, and all further processing treats them as 2 different characters rather than as one Russian letter. In your case it does not look quite like that, since the number of characters is not doubled, but I think the principle is clear.
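The last two points can be sketched in code. This is only an illustration, assuming Python 3 and a CGI-style script: the text is encoded once, explicitly, and written as raw bytes so that the default encoding of print is not involved. `build_response` and `send` are hypothetical helpers, not part of any library.

```python
import sys


def build_response(body, encoding="utf-8"):
    """Return a complete CGI response as bytes, encoded once and explicitly."""
    header = "Content-Type: text/html; charset=%s\r\n\r\n" % encoding
    return (header + body).encode(encoding)


def send(response_bytes, stream=None):
    """Write raw bytes, bypassing the text layer and its default encoding."""
    if stream is None:
        stream = sys.stdout.buffer
    stream.write(response_bytes)
    stream.flush()


page = "<html><body>Русский текст - Nerusskii text</body></html>"
utf8_response = build_response(page, "utf-8")

# Each Cyrillic letter takes 2 bytes in UTF-8 and 1 byte in cp1251,
# so the 12 Cyrillic letters in the page account for the difference.
extra_bytes = len(page.encode("utf-8")) - len(page.encode("cp1251"))
print(extra_bytes)
```

In a real script, `send(build_response(page))` would emit the page. Because the declared charset and the bytes come from the same `encoding` argument, the header and the body cannot disagree.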