Validating phone numbers with regular expressions

Regular expressions are powerful tools for describing patterns in text, including phone numbers. You can use this function to validate whether a collected phone number is well-formed.

In this tutorial, we'll look at how to build up a regular expression step-by-step to validate collected phone numbers.

Understanding the domain

Before writing a regular expression, it's important to understand the expected format of the text. In this example, we'll look at validating mobile phone numbers from Democratic Republic of the Congo (DRC).

In the DRC, the numbering scheme for mobile phones is as follows:

  • Vodacom: 81X XXX XXX or 82X XXX XXX
  • Supercell: 80X XXX XXX
  • Congo-Chine Telecom: 84X XXX XXX
  • Celltell: 97X XXX XXX, 98X XXX XXX. or 99X XXX XXX

Where "X" is any digit, 0-9.

We can use the patterns above to write a validation rule that tests for a valid mobile number.

A simple regular expression

Let's start by matching the Vodacom numbers. Each Vodacom mobile number starts with "81" or "82" and another digit, and then two groups of three digits:

REGEXMATCH(PHONE_NUMBER, "8[12][0-9] [0-9][0-9][0-9] [0-9][0-9][0-9]")

Here we use the character class notation to specific a range of characters:

  • [12] matches either "1" or "2"
  • [0-9] matches any character between "0" and "9"

We can shorten this expression by using repeating character classes:

REGEXMATCH(PHONE_NUMBER, "8[12][0-9] [0-9]{3} [0-9]{3}")

The {3} indicates that the character before must be matched exactly three times.

Matching any of the mobile providers

The big difference between the different providers is the first group of digits:

Provider Pattern of first group
Vodacom 8[12][0-9]
Supercell 80[0-9]
Congo-Chine 84[0-9]
Celltell 9[789][0-9]

We can match any of the providers above by grouping them together using parentheses, and using the pipe operator (|) to specify alternatives:

REGEXMATCH(PHONE_NUMBER, "((8[12][0-9])|(80[0-9])|(84[0-9])|(9[789][0-9])) [0-9]{3} [0-9]{3}")

Actually, the only difference between the first three providers is the second digit, so we could further simply to:

REGEXMATCH(PHONE_NUMBER, "((8[0124][0-9])|(9[789][0-9])) [0-9]{3} [0-9]{3}")

However, this will match a phone number wherever it occurs in the text. So "801 343 933" will match, but so will "TEL: 801 343 933". We'd prefer that only the phone number is entered into the text field, so we can use ^ symbol to require matching at the beginning of the field, and $ to require matching at the end:

REGEXMATCH(PHONE_NUMBER, "^((8[0124][0-9])|(9[789][0-9])) [0-9]{3} [0-9]{3}$")

Calculating the name of the mobile providers

We can also use the regular expression to calculate the name of the mobile provider from the phone number using the IF function together with the REGEXMATCH function:

IF(REGEXMATCH(PHONE_NUMBER, "^8[12][0-9]"), "Vodacom",
 IF(REGEXMATCH(PHONE_NUMBER, "^80[0-9]"), "Supercell",
IF(REGEXMATCH(PHONE_NUMBER, "^84[0-9]"), "Congo-Chine",
     IF(REGEXMATCH(PHONE_NUMBER, "^9[789][0-9]"), "Celltell", 

Building the form

Now that we have our two formulas, we can add a data collection form in ActivityInfo.

Add the form

  • Navigate to your database.
  • Click "Add Form"
  • Enter the form name "Registration"

Add the phone number field

  • Add a text field with the name "Phone Number" and the code PHONE_NUMBER.
  • Check "Set Validation Rules"
  • Paste the validation formula above into the validation rules box, or open the formula editor

Screenshot of phone number field
Screenshot of phone number field

Add the mobile provider field

  • Add a calculated field called "Mobile Provider"
  • In the formula editor, paste the IF-statement from above.

Screenshot of mobile provider field
Screenshot of mobile provider field

Testing the form

When you open the form in data entry, typing a phone number that doesn't match the format for DRC should result in a validation error:

Screenshot of form with validation error
Screenshot of form with validation error

When the input does match, the field is validated, and the calculated field shows the correct mobile provider:

Screenshot of valid form
Screenshot of valid form

You can try the form yourself.

Further reading

Next item
Validating writing scripts