To extract the first number before a text string, use a formula based on the VALUE and REGEXEXTRACT functions.
This tutorial is a part of our definitive guide to Excel formulas.
How to extract the first number before a text string
Steps to extract the first numeric value from the beginning of a text:
- Select cell.
- Type =VALUE(REGEXEXTRACT(B3, “\d+”))
- Press Enter.
- The formula extracts the first numeric value from the text in the specified cell.
Example
In this example, we have a dataset in the range B3:B7 containing various text strings, some of which include numerical values. The goal is to extract the first numeric value from each text string and display it in column D.
Formula:
=VALUE(REGEXEXTRACT(B3, “\d+”))
The formula in cell D3 uses the REGEXEXTRACT function to locate and isolate the first occurrence of numeric characters within the text, converting it to a numerical format using the VALUE function. This method ensures that the first number in each string is correctly identified and converted into a number for further calculations.
Explanation
Let us evaluate the formula from the inside out.
REGEXEXTRACT(B3, “\d+”): The REGEXEXTRACT function extracts the first match of a text pattern from the referenced cell (B3 in this case). The pattern “\d+” is a regular expression where: “\d” matches any numeric digit (0-9) and “+” ensures it matches one or more consecutive digits. Together, \d+ extracts the first group of one or more numbers from the text in B3. In the text “45 apples, 23 bananas, and 5 peaches”, the first group of digits is 45.
VALUE(REGEXEXTRACT(…)) converts the extracted text (from REGEXEXTRACT) into a numeric value. If the extracted value is already numeric (e.g., “45”), it simply converts it from text format to number format. This step is important because Excel reflects any data extracted by REGEXEXTRACT as text by default.
Extract the first numeric value with TEXTSPLIT and TEXTJOIN
If your Excel does not support regex functions, you can use a workaround.
Formula:
=TEXTBEFORE(TEXTJOIN(“,”, TRUE, IF(ISNUMBER(VALUE(SUBSTITUTE(TEXTSPLIT(B3, {“,”,” “,”:”}, TRUE), “$”, “”))), SUBSTITUTE(TEXTSPLIT(B3, {“,”,” “,”:”}, TRUE), “$”, “”), “”)), “,”)
In this formula, TEXTSPLIT(B3, {“,”,” “,”:”}, TRUE) breaks the text in cell B3 into multiple components based on delimiters (comma, space, and colon). This step isolates individual parts of the text string for evaluation.
SUBSTITUTE(TEXTSPLIT(…), “$”, “”) removes the dollar sign ($) from any split text components. This ensures numeric values with $ symbols (like monetary amounts) are correctly processed as numbers. If the text is: “The bill was $12”, this transforms $12 into 12.
VALUE(SUBSTITUTE(…)) converts the cleaned-up text into numeric values (if possible) to determine which components are valid numbers. Non-numeric text will result in errors. For example, “45” becomes 45 (a number), while “apples” remains non-numeric.
ISNUMBER(VALUE(…)) checks if each element is a number to create a logical array (TRUE/FALSE) identifying numeric values in the text.
IF(ISNUMBER(…), SUBSTITUTE(…), “”) filters the array, retaining only numeric values (as text) and discarding non-numeric parts to separate just the numbers while ignoring everything else.
TEXTJOIN(“,”, TRUE, IF(…)) combines all retained numeric values into a single string, separated by commas to simplify the output into one manageable string for further processing.
TEXTBEFORE(…, “,”) extracts the text portion before the first comma to return only the first numeric value in the original text string.
Related formulas
- Extract the last number from a string
- Get a number from any position in a string
- Extract the nth number from a text string
- Extract all numeric values from a text