Written by Debra Dalgleish from Contextures Blog
How can you count specific words in a text string, using an Excel formula? For example, a cell has this text – air, hair brush ,hot air, hair. The word “air” should be counted once. It shouldn’t be counted if it’s part of other words or phrases.
Cells with Text Strings
Here is the sample data that I used for this problem. To get the sample file, with the sample data and the Excel formulas, go to the Count Specific Text in Cell page on my Contextures site.
There are four cells with the following comma-separated words and phrases. There are extra spaces in some of the items, before or after the commas.
- air, hair brush ,hot air, hair
- house, dig, air , hair , air, dig
- house ,air flow,hair, air
At the top of the sheet, you can enter a search word or phrase in cell B2.
In column C, formulas count the number of times that exact word or phrase is found between the commas.
Rules for Counting Words
For this challenge, we don’t want to count the search word/phrase if it’s part of another word or phrase. We only want to count it when it’s on its own.
For example, there’s only 1 “air” in row 6. We won’t count the “air” in hair brush, hot air, or hair.
NOTE: If you DO want to count each occurrence, use the LEN/SUBSTITUTE formula shown on my site.
Google Sheets SPLIT Function
Before we look at Excel formulas, let’s take a quick look at how easy this solution is in Google Sheets.
Unlike Excel, Google Sheets has a SPLIT function.
Combine SPLIT with SUBSTITUTE, to remove the spaces, and COUNTIF, to get the count:
- =COUNTIF(SPLIT(SUBSTITUTE(B5,” “,””),”,”),$B$2)
Excel Formulas to Count Words
Unfortunately, Excel doesn’t have a SPLIT function, so we need to use a longer formula to get a count of the words.
- Excel 2013 or later: Use the FILTERXML formula, shown below
- Earlier versions: Use LEN and SUBSTITUTE. Go to my website for two solutions using these functions.
- Use a really long, all-in-one formula
- Or, to make the formula easier to understand, do some of the work in a helper column.
Separate Text with FILTERXML
While looking for ways to split text in Excel, I found a really creative way to do that – use the FILTERXML function. The first place I saw this trick was in a comment on Chandoo’s forum.
Splitting text isn’t what FILTERXML was designed for! Usually, it’s used with the WEBSERVICE function, to to extract specific items from text that’s already in XML format. You can see an example of that type of FILTERXML formula on the Microsoft site.
Anyway, some clever person realized that you could build the XML on the fly, from comma-separated text. Here’s what it would look like, all tidied up. The angle brackets mark the beginning and end of each item.
XML Data for FILTERXML Function
Here’s how we’ll use FILTERXML to split the text, so we can get a count of matching items.
The FILTERXML function has 2 arguments – XML and XPath.
- For the XML argument, we’ll use SUBSTITUTE to change comma-separated text into an XML structure.
- In the sample file, there are XML marker strings at the top of the sheet, in cell E1, F1 and G1.
This formula creates our XML code for the FILTERXML function:
- $E$1 & SUBSTITUTE(B6,”,”,$F$1) & $G$1
It’s not pretty, but the FILTERXML function will recognize those angle bracket start and end markers.
XPath for FILTERXML Function
The tricky part of this formula was the XPath argument. After many hours of digging through the internet, and experimenting with XPath variations, I finally created an XPath that did what I need.
The XPath is created with a formula in cell H1, at the top of the worksheet.
- =”//i[normalize-space()='” & B2 & “‘]/@n”
This XPath will return the “n” value for the items that match our search word (in cell B2).
There are more details on XPath further down this page, if you’re interested.
The FILTERXML Formula
Now that we have the XML and the XPath, here is the FILTERXML part of final formula.
The XML is in bold red, and the XPath is in bold italic blue.
- FILTERXML($E$1 &SUBSTITUTE(B5,”,”,$F$1) &$G$1,$H$1)
The result will be a list of 1s, or a zero, so next, we need to get a total of those numbers.
Get the Count from FILTERXML
To get the count, we can use the SUM function. Each matching item returns a 1, so 3 items will give us a sum of 3 – which is equal to the count that we need.
Finally, the formula is wrapped with IFERROR. It will return a zero, if no matches are found.
Here is the completed formula:
- =IFERROR(SUM(FILTERXML($E$1 &SUBSTITUTE(B5,”,”,$F$1) &$G$1,$H$1)),0)
I’ve highlighted the matching items in the comma-separated text, and the counts are all correct.
Experiments with FILTERXML XPath
- NOTE: You don’t need to read any further, unless you want to hear about the long and winding journey I took , while building this formula. Maybe you’ve been on similar trips!
Creating the XML data for the FILTERXML function was the easy part. The challenge was building an XPath that did what I needed.
The first FILTERXML example I found, in Chandoo’s forum, used a simple XPath – //i – to create a list of items.
In my first experiment, I used FILTERXML in a worksheet cell, with that simple XPath in cell E4
With this formula in cell E6, it created a list of all the items from cell B6:
- =FILTERXML($E$1 & SUBSTITUTE(B6,”,”,$F$1) & $G$1,E4)
To count the matching items, I put this formula in cell E5:
That worked nicely, and showed the correct count of 1.
COUNTIF and FILTERXML
Next, I needed a single formula, so I combined the 2 steps:
- =COUNTIF(FILTERXML($E$1 & SUBSTITUTE(B6,”,”,$F$1) & $G$1,E3),$B$2)
Excel was not impressed! Right – COUNTIF doesn’t work with formula arrays.
SUMPRODUCT and FILTERXML
The SUMPRODUCT function is more forgiving, so I tried it next, using this formula in cell E5:
That worked correctly, so I combined it with the FILTERXML formula:
- =SUMPRODUCT((FILTERXML($E$1 & SUBSTITUTE(B6,”,”,$F$1) & $G$1,E4)=B2)
* (FILTERXML($E$1 & SUBSTITUTE(B6,”,”,$F$1) & $G$1,E4)<>””))
The combined formula worked correctly, but wow, that’s a long formula!
COUNT and FILTERXML
Another test showed that COUNTA works with formula-created arrays. However, I’d need an XPath that filters out the items that match the search word.
Believe me when I tell you that it’s hard to find XPath examples for that on the internet.
Finally, I found this example – //i[.=’air’] – which worked, but I had to add another SUBSTITUTE, to remove spaces.
- =FILTERXML($E$1 & SUBSTITUTE(SUBSTITUTE(B7,” “,””),”,”,$F$1) & $G$1,E4)
Also, that formula created a #VALUE! error, if the search word wasn’t found.
And it counted the error! That’s no good.
And, if I wrapped the formula with IFERROR, to return an empty string, it counted the empty string.
Return a Number
What if the XPath could return a number, instead of the text item? If it could, I’d return a zero in IFERROR, and use SUM to total the results!
On this new mission, I went deeper and deeper into the XPath rabbit hole, but couldn’t find what I needed.
Finally, I remembered seeing an XPath example that filtered based on an attribute. Maybe I could add an attribute of “1” to each item, and return that attribute for the filtered items.
So, I changed the data/item text strings at the top of the sheet, in cells E1 and F1.
Now, each item has an “n” attribute with a value of “1”.
Change the XPath
More Googling found an XPath setting that would return those attributes. The XPath ended with the AT symbol, and the attribute name:
Also, I found an XPath function to trim the leading, trailing, and duplicate spaces:
And here is the final XPath string that I used for this project, after days of wandering through the internet:
And this is the final formula, that counts the matching items:
=IFERROR(SUM(FILTERXML($E$1 &SUBSTITUTE(B5,”,”,$F$1) &$G$1,$H$1)),0)
It was an exhausting journey through the FILTERXML and XPath rabbit holes, but rewarding in the end!
Get the Sample File
To get the sample file, go to the Count Specific Text in Cell page on my Contextures site.
The zipped file is in xlsx format, and does not contain any macros.
There is also a link to the Google Sheets version of the sample data and formulas. That file is VIEW ONLY, but you can create a copy to edit.
How to Count Specific Words in Excel Text String
How to Count Specific Words in Excel Text String is a post from Contextures Blog and is not allowed to be copied to other sites
Original source: https://contexturesblog.com/archives/2020/12/10/how-to-count-specific-words-in-excel-text-string/?utm_source=rss&utm_medium=rss&utm_campaign=how-to-count-specific-words-in-excel-text-string