The object functions of the formula language offer some more advanced capabilities; see https://doc.dataiku.com/dss/latest/advanced/formula.html?highlight=parsehtml#object-functions
To go further, you will need to use code. The easiest option is Python, where the BeautifulSoup package will help you.
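As a rough illustration of the Python route, here is a minimal sketch that pulls the visible text and the link targets out of an HTML string with BeautifulSoup. The function name `extract_text_and_links` and the sample HTML are made up for the example, and it assumes the `beautifulsoup4` package is installed in your code environment.

```python
from bs4 import BeautifulSoup


def extract_text_and_links(html):
    """Return (visible text, list of href values) for an HTML string.

    Hypothetical helper for illustration; adapt it to what you need to extract.
    """
    # "html.parser" is Python's built-in parser, so no extra dependency is needed.
    soup = BeautifulSoup(html or "", "html.parser")
    # Join all text fragments with a space and strip surrounding whitespace.
    text = soup.get_text(separator=" ", strip=True)
    # Collect the target of every <a> tag that actually has an href attribute.
    links = [a["href"] for a in soup.find_all("a", href=True)]
    return text, links


# Made-up sample input
sample = '<div><p>Hello <b>world</b></p><a href="https://example.com">link</a></div>'
text, links = extract_text_and_links(sample)
print(text)
print(links)
```

In a Python recipe you could probably apply such a function to a column of the pandas DataFrame, for example with `df["html"].apply(...)`, where `html` is a placeholder for your own column name.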