I am writing a regular expression for preg_split() so that a string is split at whitespace and at all possible punctuation marks. So far it looks like this:
$pattern = "/[\s,;.\!\-\?\:\(\)]+/";
- How do I add an em dash, all possible quotes, and so on to the expression? (Maybe I have forgotten some other punctuation marks as well?)
- preg_split() with this expression adds an empty element to the end of the array. Obviously it can be removed afterwards, but I do not like that solution. How can I correct the expression so that no empty element is produced?
Thank you!
Answer 1, authority 100%
- First decide exactly what result you need. If you want to split only on specific punctuation marks, you have to enumerate every desired character in the character class yourself, including the em dash, all possible quotes, and so on; there is no generic character class that describes such a hand-picked set.
- If you want to split on all characters that are neither whitespace nor letters, use the character class
  [:punct:]
  In a regular expression it looks like this:
  /[\s[:punct:]]+/
- To exclude empty elements, just pass the PREG_SPLIT_NO_EMPTY flag to preg_split():
  preg_split("/[\s[:punct:]]+/", $text, -1, PREG_SPLIT_NO_EMPTY)
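
The points above can be tried out with a short sketch. The sample strings and variable names are made up for illustration. The second pattern is not part of the answer: it assumes that PCRE's Unicode property class \p{P} together with the u modifier is acceptable as a broader way to catch the em dash and typographic quotes, and it relies on the "\u{...}" string escapes available from PHP 7.

```php
<?php
// Sample input, invented for this example.
$text = "Hello, world! (How are you?)";

// Pattern from the answer: runs of whitespace and POSIX punctuation.
$pattern = '/[\s[:punct:]]+/';

// Without a flag, the trailing "?)" leaves an empty last element.
$withEmpty = preg_split($pattern, $text);
// ["Hello", "world", "How", "are", "you", ""]

// With PREG_SPLIT_NO_EMPTY the empty pieces are dropped.
$words = preg_split($pattern, $text, -1, PREG_SPLIT_NO_EMPTY);
// ["Hello", "world", "How", "are", "you"]

// Assumed alternative (not from the answer): Unicode punctuation via \p{P}.
// The "u" modifier makes PCRE treat pattern and subject as UTF-8.
$unicodePattern = '/[\s\p{P}]+/u';

// Curly quotes, an em dash and an ellipsis, written as escapes
// so the example does not depend on the file encoding.
$unicodeText = "\u{201C}Yes\u{201D} \u{2014} he said: \u{2018}ok\u{2019}\u{2026}";

$unicodeWords = preg_split($unicodePattern, $unicodeText, -1, PREG_SPLIT_NO_EMPTY);
// ["Yes", "he", "said", "ok"]

print_r($words);
print_r($unicodeWords);
```

Note that \p{P} covers punctuation only. Symbols such as + or $ belong to the \p{S} categories, so they would have to be added to the class if they should act as separators too.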