Home php How to split a string into array elements depending on the PHP...

How to split a string into array elements depending on the PHP character type

Author

Date

Category

Please help me write a universal regular expression to split a string into numbers, a combination of letters in a row and a space.

There is a line corresponding to the United Kingdom postal code. It is necessary to split it into parts and form a common array from them in the same order in which they are written into the string. Each part must match:

  • a letter or several letters in a row, without numbers and other characters, in particular a space, up to the first NOT matching character

  • one or more consecutive digits (number) up to the first NOT matching character

  • another character (in my example this is only one likely space, I represent it as not a letter or a number)

By convention, there can be an indefinite number of array elements, but they are limited by the kingdom postal code template. Here is a picture that perfectly describes the template:

 UK Postal Code Template

Borrowed from www.getthedata.com

where:

  • L – letter
  • N – digit
  • A – number or letter
  • SPACE – space
  • A Optional mark above a symbol means that it will not necessarily be present.


    1. Example 1:

      The code SW1A 1AA should after transformation become an array with
      with the following set of elements:

      ['SW', '1', 'A', '', '1', 'AA']
      
    2. Example 2:

      A piece of code SW1A should become an array with the following set
      elements:

      ['SW', '1', 'A']
      
    3. Example 3:

      A piece of code SW22 3 should become an array with the following set
      elements:

      ['SW', '22', '', '3']
      

Postscript : Particularly difficult for me is the fact that the whole postal code can be broken into parts, as well as a separate part of it, down to an empty line. Not being an expert in regular expressions, this task was too much for me: (Perhaps there is a simpler universal solution than using the preg_split function?


Answer 1, authority 100%

You can tokenize a string like this (see online demo ):

$ strs = ["SW1A 1AA", "SW1A", "SW22 3"];
foreach ($ strs as $ s) {
 if (preg_match_all ('~ [a-zA-Z] + | \ d + | [^ \ da-zA-Z] + ~', $ s, $ chunks)) {
  print_r ($ chunks [0]);
 }
}
// - Array ([0] = & gt; SW [1] = & gt; 1 [2] = & gt; A [3] = & gt; [4] = & gt; 1 [5] = & gt; AA)
// - Array ([0] = & gt; SW [1] = & gt; 1 [2] = & gt; A)
// - Array ([0] = & gt; SW [1] = & gt; 22 [2] = & gt; [3] = & gt; 3)

Regular expression here (see demo )

[a-zA-Z] + | \ d + | [^ \ da-zA-Z] +

It finds

  • [a-zA-Z] + – one or more Latin letters
  • | – or
  • \ d + – one or more digits
  • | – or
  • [^ \ da-zA-Z] + – – one or more characters other than numbers and Latin letters

If you need Unicode support, use '~ \ p {L} + | \ p {N} + | [^ \ p {N} \ {L}] + ~ u' .

Programmers, Start Your Engines!

Why spend time searching for the correct question and then entering your answer when you can find it in a second? That's what CompuTicket is all about! Here you'll find thousands of questions and answers from hundreds of computer languages.

Recent questions