php - How to split a string into array elements depending on the PHP character type

Please help me write a universal regular expression to split a string into numbers, a combination of letters in a row and a space.

There is a line corresponding to the United Kingdom postal code. It is necessary to split it into parts and form a common array from them in the same order in which they are written into the string. Each part must match:

a letter or several letters in a row, without numbers and other characters, in particular a space, up to the first NOT matching character
one or more consecutive digits (number) up to the first NOT matching character
another character (in my example this is only one likely space, I represent it as not a letter or a number)

By convention, there can be an indefinite number of array elements, but they are limited by the kingdom postal code template. Here is a picture that perfectly describes the template:

^{Borrowed from www.getthedata.com}

where:

L – letter
N – digit
A – number or letter
SPACE – space
A Optional mark above a symbol means that it will not necessarily be present.
1. Example 1:
  
  The code SW1A 1AA should after transformation become an array with
  with the following set of elements:
```
['SW', '1', 'A', '', '1', 'AA']
```
2. Example 2:
  
  A piece of code SW1A should become an array with the following set
  elements:
```
['SW', '1', 'A']
```
3. Example 3:
  
  A piece of code SW22 3 should become an array with the following set
  elements:
```
['SW', '22', '', '3']
```

Postscript : Particularly difficult for me is the fact that the whole postal code can be broken into parts, as well as a separate part of it, down to an empty line. Not being an expert in regular expressions, this task was too much for me: (Perhaps there is a simpler universal solution than using the preg_split function?

Answer 1, authority 100%

You can tokenize a string like this (see online demo ):

$ strs = ["SW1A 1AA", "SW1A", "SW22 3"];
foreach ($ strs as $ s) {
 if (preg_match_all ('~ [a-zA-Z] + | \ d + | [^ \ da-zA-Z] + ~', $ s, $ chunks)) {
  print_r ($ chunks [0]);
 }
}
// - Array ([0] = & gt; SW [1] = & gt; 1 [2] = & gt; A [3] = & gt; [4] = & gt; 1 [5] = & gt; AA)
// - Array ([0] = & gt; SW [1] = & gt; 1 [2] = & gt; A)
// - Array ([0] = & gt; SW [1] = & gt; 22 [2] = & gt; [3] = & gt; 3)

Regular expression here (see demo )

[a-zA-Z] + | \ d + | [^ \ da-zA-Z] +

It finds

[a-zA-Z] + – one or more Latin letters
| – or
\ d + – one or more digits
| – or
[^ \ da-zA-Z] + – – one or more characters other than numbers and Latin letters

If you need Unicode support, use '~ \ p {L} + | \ p {N} + | [^ \ p {N} \ {L}] + ~ u' .

How to split a string into array elements depending on the PHP character type

Answer 1, authority 100%

Programmers, Start Your Engines!

Recent questions

yandex cards disappear labels with zoom

Embarcadero C++ Builder 10.3 does not give prompts by code

Found input variables with inconsistent numbers of samples error

Return to previous page

Lua C++ error handling