Does "Simplify text" in the Prepare recipe have a bug?

jp1
Level 2
Hello community, 

Does the sorting option in the "Simplify text" step of the Prepare recipe have a bug? Sorting is not giving the result I expect.
Take the example below:

live poultry ducks breeding ducklings

After applying the "stem words", "clear stop words", and "sort words alphabetically" options, I get this output:

breed duckl duck live poultri

But this is not the expected result. The expected result is:

breed duck duckl live poultri (duck should sort first, followed by duckl)

Can anyone help me understand how this sorting happens? Are "stem words", "clear stop words", and "sort words alphabetically" applied sequentially? If so, why is the result not as expected? Or is the data sorted first, with stemming and stop-word removal happening afterwards?

Could anyone answer this as soon as possible?

Thanks in advance for investing time on this!!

1 Reply
SarinaS
Dataiker

Hi @jp1,

Thank you for reporting the sorting issue you found when using the "Simplify text" processor with "sort words alphabetically" enabled. I can indeed reproduce it, and I will pass this along to our engineering team for further investigation.

In the meantime, adding a secondary "Simplify text" processor step with "sort words alphabetically" seems to lead to expected results, so I would suggest adding the processor twice for your use case:

Screenshot 2023-12-15 at 12.31.35 PM.png
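For anyone who wants to see why a second sorting pass helps, here is a minimal Python sketch. It is not the processor's actual implementation, and the cause has not been confirmed. It only shows that the reported output is what you would get if words were sorted before being stemmed. The `STEMS` lookup table and the function names are hypothetical, built from the stems shown in the original post. Stop-word removal is left out because the example contains no stop words.

```python
# Toy stem table built from the stems shown in the post (hypothetical;
# this stands in for a real stemmer and covers only the example words).
STEMS = {
    "live": "live",
    "poultry": "poultri",
    "ducks": "duck",
    "breeding": "breed",
    "ducklings": "duckl",
}


def stem(word):
    """Return the stem from the toy table, or the word unchanged."""
    return STEMS.get(word, word)


def sort_then_stem(text):
    """Sort the raw words first, then stem each one (order is not re-checked)."""
    return " ".join(stem(w) for w in sorted(text.split()))


def stem_then_sort(text):
    """Stem each word first, then sort the stems."""
    return " ".join(sorted(stem(w) for w in text.split()))


def sort_only(text):
    """A second pass that only sorts, like adding a second processor step."""
    return " ".join(sorted(text.split()))


text = "live poultry ducks breeding ducklings"

print(sort_then_stem(text))             # breed duckl duck live poultri
print(stem_then_sort(text))             # breed duck duckl live poultri
print(sort_only(sort_then_stem(text)))  # breed duck duckl live poultri
```

In the raw words, "ducklings" sorts before "ducks", so stemming afterwards leaves "duckl" ahead of "duck". Sorting the already-stemmed words a second time puts them in the expected order, which is consistent with the workaround above.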


Thanks,
Sarina 

 
