Please suggest a regular expression that strips every character except letters (only Cyrillic and Latin should remain) and the comma. I put together a draft that seems to work, but I don't know how to keep the comma.

$key_words = mb_strtolower($key_words);
$key_words = preg_replace("/[^a-zа-я\s]/iu", "", $key_words);

Answer 1, Authority 100%

If only letters (Cyrillic + Latin) and the comma should remain, the code should look like this:

preg_replace("/[^а-яёa-z,]/iu", "", $target);


Note that the regular expression above matches only part of Cyrillic, namely the Russian alphabet. In general, Cyrillic is considerably broader: it includes characters used not only in the modern Russian alphabet but also in a number of other Slavic languages. The Unicode code charts list the full set of Cyrillic characters.

The same applies to Latin. Formally, besides the letters of the English alphabet (/[a-z]/i), there are many other characters, for example ñ from the Spanish alphabet. The regular expression above matches only the letters of the English alphabet, not all Latin characters.
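To make the limitation concrete, here is a small sketch. The sample string is made up for illustration, and the pattern is assumed to be the range-based one (Russian letters plus ё, English letters, and the comma). It shows that letters outside those ranges, such as the Ukrainian ї and the Spanish ñ, are removed along with everything else.

```php
<?php
// Hypothetical input: a Russian, a Ukrainian, a Spanish, and an English word.
// The source file is assumed to be saved as UTF-8.
$target = "привет,їжак,señor,hello";

// Keep only Russian letters (а-я plus ё), English letters, and the comma.
// "u" treats pattern and subject as UTF-8; "i" makes the ranges case-insensitive.
$result = preg_replace("/[^а-яёa-z,]/iu", "", $target);

echo $result, PHP_EOL; // привет,жак,seor,hello
```

The ї in "їжак" and the ñ in "señor" disappear because neither falls inside а-я or a-z.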


If you really want to keep every character that formally belongs to Latin or Cyrillic, you can use the following code:

preg_replace("/[^,\p{Cyrillic}\p{Latin}]/ui", "", $target);

This regular expression keeps a very large set of characters (for example, every Cyrillic character defined in Unicode). I do not think it is worth using in most cases, but it may be of interest to those who take the terms Latin and Cyrillic in their formal sense.
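As a rough illustration of the script-property approach, the sketch below runs it on a made-up string. It assumes the accented letters are precomposed (ñ as a single code point). Letters from any Cyrillic or Latin alphabet survive, while spaces, digits, and punctuation other than the comma are dropped.

```php
<?php
// Hypothetical input mixing Ukrainian, Spanish, digits, and punctuation.
// The source file is assumed to be saved as UTF-8.
$target = "Їжак, señor, 42!";

// Keep the comma plus any character whose Unicode script is Cyrillic or Latin.
$result = preg_replace("/[^,\p{Cyrillic}\p{Latin}]/u", "", $target);

echo $result, PHP_EOL; // Їжак,señor,
```

Note that spaces are removed too, because the character class does not include \s.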

Answer 2, Authority 36%

I think that if you are working with Unicode, you should take the character classes from there as well:

$key_words = preg_replace("/[^\p{L},\s]/u", "", $key_words);

Here we keep \p{L} (all letter characters), the comma, and \s (whitespace characters; if only the space is needed, replace \s with a literal space).
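A minimal sketch of how this behaves, using invented sample strings: \p{L} keeps letters of any script (Greek included here), and the choice between \s and a literal space decides whether tabs and newlines survive.

```php
<?php
// Hypothetical input; the source file is assumed to be saved as UTF-8.
$key_words = mb_strtolower("Привет, Wörld! αβγ", "UTF-8");

// Keep letters of any script, commas, and all whitespace.
$key_words = preg_replace("/[^\p{L},\s]/u", "", $key_words);
echo $key_words, PHP_EOL; // привет, wörld αβγ

// Variant with a literal space instead of \s: the tab is removed as well.
$with_space_only = preg_replace("/[^\p{L}, ]/u", "", "a\tb, c");
echo $with_space_only, PHP_EOL; // ab, c
```

Unlike the script-specific patterns, this one does not restrict the result to Cyrillic and Latin, so it fits only when any alphabet is acceptable.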

Test example https://regex101.com/r/wf4gz2/1

