
alphanumeric string identification

PARA
Level 1

How can I write code to identify alphanumeric values in a column using a visual recipe?

3 Replies
arnaudde
Dataiker

Hello Para,
You can use the "Extract with regular expression" processor to check whether a column contains only alphanumeric characters.

In a prepare recipe you can click on "Add a new step", search for regex, and click on "Extract with regular expression".

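If you want to check whether the column contains only alphanumeric content, you can:
- put "^[a-zA-Z0-9_]*$" in the regular expression field (note that this pattern also accepts underscores and empty cells; remove the "_" if you want letters and digits only)
- and check 'create a special found column'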
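
For anyone who wants to test the pattern outside DSS first, here is a minimal Python sketch of the same check. It only mirrors the logic of the step above: the function name is_alphanumeric and the sample values are made up for illustration, and DSS evaluates the pattern with its own regex engine, so treat this as an approximation.

```python
import re

# Same pattern as in the prepare step: letters, digits and underscore only.
ALNUM_ONLY = re.compile(r"^[a-zA-Z0-9_]*$")


def is_alphanumeric(value: str) -> bool:
    """Return True when the whole cell matches the pattern (hypothetical helper)."""
    return ALNUM_ONLY.match(value) is not None


# Illustrative sample values, not taken from a real dataset.
values = ["ABC000006498287", "ABC 123", "abc_1", ""]
found = [is_alphanumeric(v) for v in values]
print(found)
```

Note that the underscore and the empty string both pass, because "_" is part of the character class and "*" allows zero characters.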


If you want to extract all alphanumeric occurrences:
- you can put "([a-zA-Z0-9_]+)" in the regular expression field
- and check "extract all occurrences"
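
As a rough illustration of what extracting all occurrences means, the same pattern can be tried in Python with re.findall. This is a sketch under assumptions: extract_alphanumeric and the sample cell are hypothetical, and the exact output format in DSS may differ.

```python
import re
from typing import List

# Same pattern as in the prepare step: one or more letters, digits or underscores.
ALNUM_RUN = re.compile(r"([a-zA-Z0-9_]+)")


def extract_alphanumeric(value: str) -> List[str]:
    """Return every alphanumeric run found in the cell (hypothetical helper)."""
    return ALNUM_RUN.findall(value)


# Illustrative cell containing separators between alphanumeric runs.
matches = extract_alphanumeric("ABC-000 64/98_287")
print(matches)
```

Each run of allowed characters becomes one match, and anything else (hyphens, spaces, slashes) acts as a separator.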

Hope it helps,
Arnaud

PARA
Level 1
Author

This is not working: the new column says false for this value: "ABC000006498287"

arnaudde
Dataiker

Hello Para, 
I can't reproduce your issue. As you can see in the screenshot, the regex works well for a cell containing only "ABC000006498287".
However, if the cell contains any other character, the check will fail. For instance, I added a trailing space on line 2.

Screen Shot 2021-01-22 at 09.24.13.png
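
To make the trailing-space case concrete, here is a small Python sketch of the same comparison. It assumes the problem is surrounding whitespace, which this thread does not confirm for your data; the variable names are only illustrative.

```python
import re

# Same pattern as in the prepare step.
ALNUM_ONLY = re.compile(r"^[a-zA-Z0-9_]*$")

clean = "ABC000006498287"
with_space = "ABC000006498287 "  # same value with one trailing space

clean_ok = bool(ALNUM_ONLY.match(clean))
space_ok = bool(ALNUM_ONLY.match(with_space))
# Assumption: trimming whitespace before the check is acceptable for your data.
trimmed_ok = bool(ALNUM_ONLY.match(with_space.strip()))

# repr() makes invisible characters such as the trailing space visible.
print(repr(with_space), clean_ok, space_ok, trimmed_ok)
```

If the value only passes after trimming, the cell likely contains leading or trailing whitespace, and trimming the column before the regex step may be worth trying.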

Hope it helps
