Is there a way to count specific types of characters in a text cell?

I am trying to enrich a dataset containing product names and descriptions, and I would like to extract the number of words, uppercase letters, lowercase letters, and digits in certain columns.
Is there any way to do this easily?
Best Answer
Hi Vincent,
One way to do it is to use a Custom Python Script in Analyze. You can easily implement your logic this way. For example, if you want to count specific kinds of characters in a string, you could do the following:
import json

def process(row):
    # Initialize counters
    _uppers = 0
    _lowers = 0
    _commas = 0
    _digits = 0
    # Fall back to an empty string so a missing or empty cell does not raise an error
    for character in (row['name'] or ''):
        if character.isupper():  # check for uppercase letters
            _uppers += 1
        if character.islower():  # check for lowercase letters
            _lowers += 1
        if character == ',':  # check for commas
            _commas += 1
        if character.isdigit():  # check for digits
            _digits += 1
    # Return all counts as a single JSON string
    return json.dumps({
        'count_uppercase_values': _uppers,
        'count_lowercase_values': _lowers,
        'count_commas': _commas,
        'count_digits': _digits,
    })

The cool thing is that you can output as many counts as you want and pass the result to a Flatten JSON processor to create your columns.
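Since the question also asked about words, here is a rough sketch of how the same idea could be extended with a word count and tried out locally before pasting it into the processor. The helper name `count_characters`, the key `count_words`, and the sample row are made up for illustration, and it assumes that `row` behaves like a Python dict and that a "word" is anything separated by whitespace.

```python
import json

def count_characters(text):
    """Count words, uppercase letters, lowercase letters, digits and commas in a string."""
    text = text or ''  # treat a missing or empty cell as an empty string
    return {
        'count_words': len(text.split()),  # assumption: words are whitespace-separated
        'count_uppercase_values': sum(1 for c in text if c.isupper()),
        'count_lowercase_values': sum(1 for c in text if c.islower()),
        'count_digits': sum(1 for c in text if c.isdigit()),
        'count_commas': text.count(','),
    }

def process(row):
    # 'name' is the column used in the example above; replace it with your own column
    return json.dumps(count_characters(row.get('name')))

# Quick local check with a made-up row
sample_row = {'name': 'Blue Widget, 2 Pack'}
print(process(sample_row))
```

For the sample row this gives 4 words, 3 uppercase letters, 11 lowercase letters, 1 digit and 1 comma. Keeping the counting in its own function makes it easy to reuse for a second column such as the description.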