Alex,
Traditional clustering, e.g. the unsupervised learning provided by the DSS clustering recipes, is very different from the kind of fuzzy matching being done here. The problems you describe above, though, are exactly why we need to be able to do this programmatically. The link that Sam posted, http://www.padjo.org/tutorials/open-refine/clustering/ , shows the OpenRefine text facet clustering, which appears to be the capability that DSS leverages within prepare recipes.
So the question is simply whether it is possible to make calls from Python to the OpenRefine server that we believe is running with DSS (along the lines of what this shows: https://doomicile.de/story/simple-text-analysis-using-python-identifying-named-entities-tagging-fuzzy-string-matching-and ), or whether we need to install our own OpenRefine server or seek a different programmatic solution.
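To make the "different programmatic solution" option concrete, here is a rough sketch of key-collision clustering in plain Python, modeled on the fingerprint method that the OpenRefine tutorial describes. It does not call OpenRefine or any DSS API; the function names `fingerprint` and `cluster_by_fingerprint` are my own, and the normalization steps are an approximation of OpenRefine's behavior rather than an exact port.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Build a normalized key, loosely following OpenRefine's fingerprint method."""
    # Trim and lowercase.
    text = value.strip().lower()
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Remove ASCII punctuation.
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Split on whitespace, de-duplicate and sort the tokens, then rejoin.
    return " ".join(sorted(set(text.split())))


def cluster_by_fingerprint(values):
    """Group values that share a fingerprint; keep groups with more than one distinct spelling."""
    groups = defaultdict(list)
    for v in values:
        groups[fingerprint(v)].append(v)
    return [g for g in groups.values() if len(set(g)) > 1]


if __name__ == "__main__":
    names = ["Acme Inc.", "acme inc", "Inc Acme", "Other Co"]
    print(cluster_by_fingerprint(names))
```

This only catches variants that differ in case, punctuation, accents, or word order. Misspellings would need something like an edit-distance (nearest neighbor) pass on top, which is the other family of methods in the tutorial.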
Thank you for your time and help.
Best,
John