3 Complex Excel Extraction Problems Solved and Explained
Many people struggle with extracting information from complex cells in Microsoft Excel. The many comments and questions in response to my article on How to Extract a Number or Text from Excel with this Function proves it. Apparently, it’s not always clear how to isolate the desired data from an Excel sheet.
We’ve chosen a few of the questions from that article to walk through here, so you can see how the solutions work. Learning Excel quickly isn’t easy, but using real-world problems like these helps a lot.
1. Extraction Using a Separator
Reader Adrie asked the following question:
I would like to extract the first set of numbers from a list, i.e. (122,90,84,118.4,128.9)
Any ideas on what formula I can use?
The fact that the numbers to be extracted are different lengths makes this a bit more complicated than just using the MID function, but by combining a few different functions, we can get rid of the letters on the left as well as the “X” and the numbers on the right, leaving only the desired numbers in a new cell.
Here’s the formula we’ll use to solve this problem:
Let’s start in the middle and work our way out to see how it works. First, we’ll start with the FIND function. In this case, we’re using FIND(“X”,A1). This function looks through the text in cell A1 for the letter X. When it finds an X, it returns the position that it’s in. For the first entry, COIL112X2.5, for example, it returns 8. For the second entry, it returns 7.
Next, let’s look at the LEN function. This simply returns the length of the string in the cell minus 4. Combining that with the RIGHT function, we get the string from the cell minus the first four characters, which removes “COIL” from the beginning of every cell. The formula for this part looks like this: RIGHT(A1,(LEN(A1)-4)).
The next level out is the LEFT function. Now that we’ve determined the position of the X with the FIND function and gotten rid of “COIL” with the RIGHT function, the LEFT function returns what’s left. Let’s break this down a little further. Here’s what happens when we simplify those two arguments:
The LEFT function returns the three leftmost characters of the string (the “-5” at the end of the argument ensures that the right number of characters are eliminated by accounting for the first four characters in the string).
Finally, the VALUE function ensures that the number returned is formatted as a number, instead of as text. This example isn’t as fun as building a working game of Tetris in Excel , but it does serve as a great example of how to combine a few functions to solve a problem.
2. Pulling Numbers Out of Mixed Strings
A similar question was asked by reader Yadhu Nandan:
I want to know how to separate the numerical and alphabets.
Example is – 4552dfsdg6652sdfsdfd5654
I want 4552 6652 5654 in different cells
Another reader, Tim K SW, provided the perfect solution to this particular problem:
Assuming that 4552dfsdg6652sdfsdfd5654 is in A1 and you want these numbers extracted into 3 different cells. 4452 in say cell B1 and 6652 in C1 and 5654 in D1 you’d use these.
The formulas are fairly simple, but if you’re not familiar with the MID function, you might not understand how it works. MID takes three arguments: a cell, the character number where the result starts, and the number of characters that should be pulled. MID(A1,10,4), for example, tells Excel to take four characters starting at the tenth character in the string.
Using the three different MID functions in three cells pulls the numbers out of this long string of numbers and letters. Obviously, this method only works if you always have the same number of characters — both letters and numbers — in each cell. If the number of each varies, you’ll need a much more complicated formula.
3. Getting a Number from a Mixed String with Spaces
One of the most difficult requests posted on the article was this one, from reader Daryl:
Hi, I just want to ask how to separate very unformatted entry like:
Rp. 487.500 (Nett 50% OFF)
Rp 256.500 Nett 40% OFF
Rp 51.000/orang Nett (50% Off)
I spent some time working on this one, and wasn’t able to come up with a solution myself, so I took to Reddit. User UnretiredGymnast provided this formula which is long and very complex:
As you can see, decoding this formula takes quite a while, and requires a pretty solid knowledge of a lot of Excel functions . To help figure out exactly what’s going on here, we’ll need to look at a few functions you might not be familiar with. The first is IFERROR, which catches an error (usually represented with a pound sign, like #N/A or #DIV/0) and replaces it with something else. In the above, you’ll see IFERROR(SEARCH(“%”,A1)-4,LEN(A1)). Let’s break this down.
IFERROR looks at the first argument, which is SEARCH(“%”,A1)-4. So, if “%” appears in cell A1, its location is returned and four is subtracted from it. If “%” does not appear in the string, Excel will create an error, and the length of A1 is returned instead.
The other functions you might not be familiar with are a bit simpler; SUMPRODUCT, for example, multiplies and then adds elements of arrays. LARGE returns the largest number in a range. ROW, combined with a dollar sign, returns an absolute reference to a row.
Seeing how all of these functions work together isn’t easy, but if you start in the middle of the formula, and work outwards, you’ll start to see what it’s doing. It’ll take a while, but if you’re interested in seeing exactly how it works, I recommend putting it into an Excel sheet and playing around with it. That’s the best way to get an idea of what you’re working with.
(Also, the best way to avoid a problem like this is to import your data more neatly — it’s not always an option, but it should be your first choice when it is.)
One Step at a Time
As you can see, the best way to solve any Excel problem is one step at a time: start with what you know, see what it gets you, and go from there. Sometimes you’ll end up with an elegant solution, and sometimes you’ll get something that’s really long, messy, and complex. But as long as it works, you’ve succeeded!
And don’t forget to ask for help . There are some extremely talented Excel users out there, and their help can be invaluable in solving a tough problem.
Do you have any suggestions on solving tough extraction problems? Do you know of any other solutions to the problems we tackled above? Share them and any thoughts you have in the comments below!