Compare multiple columns and return the most frequent word

OllieCramer · August 2019

Hi,

I'd like to write a formula that compares multiple columns and returns the most frequent word. I am aiming to aggregate the predictions of multiple machine learning models to see if it improves accuracy. As an example, based on the image below, the 1st row would return "Handling", as this is the most common word across the SVC, RandForest and LogisticRegression columns. The 2nd row would return "Handling - Operations". Ignore TicketRootCause, as this is the real answer.

I have done this in Excel with the formula below but can't find the equivalent functions in DSS. Any ideas on how I could do this, either by converting the Excel formula below into DSS or by another method?

=INDEX(F2:N2,MODE(MATCH(F2:N2,F2:N2,0)))

Thanks,

Ollie

Alan_Fusté · August 2019

Hi Ollie,

You probably have to add a "Python function" step with "Add a new cell for each row".

Then you can find the most common word with Python code. There are several options at https://stackoverflow.com/questions/48606406/find-most-frequent-value-in-python-dictionary-value-with-maximum-count (for me, the best answer is the one using collections.Counter).
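
As a rough sketch of what that step could look like, assuming the Python function step calls a function named process(row) with the row as a dict of column name to value, and that the three model columns are named exactly as in the question (SVC, RandForest, LogisticRegression). The helper most_frequent is a made-up name for illustration only.

```python
from collections import Counter

# Column names taken from the question; adjust them to the real dataset.
MODEL_COLUMNS = ["SVC", "RandForest", "LogisticRegression"]


def most_frequent(values):
    """Return the most common non-empty value, or None if there is none."""
    cleaned = [v for v in values if v not in (None, "")]
    if not cleaned:
        return None
    # most_common(1) returns a list with one (value, count) pair.
    return Counter(cleaned).most_common(1)[0][0]


def process(row):
    # Assumption: `row` behaves like a dict of column name -> cell value.
    # TicketRootCause is deliberately left out, as it is the real answer.
    return most_frequent(row.get(col) for col in MODEL_COLUMNS)
```

If two values are tied, Counter.most_common keeps the one it saw first (Python 3.7+), so the order of MODEL_COLUMNS decides which model wins a tie.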
