Extract tables from PDF

art271 Registered Posts: 3

Hello community, I want to extract tables from PDFs for a RAG pipeline. I would like to do this with Dataiku plugins, but the quality of the results is not what I expected. Do you know of other methods to do this? Thanks!

Answers

  • Marlan
    Marlan Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant, Neuron 2023 Posts: 330 Neuron

    We use Azure Document Intelligence to convert PDFs, and it is working quite well. It converts tables to HTML. We haven't specifically evaluated how well the table conversion works, but overall the conversion seems to be quite accurate.
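
    Since the converted tables come back as HTML, a RAG pipeline usually needs them turned into rows or plain text before chunking. Here is a minimal sketch of that step using only the Python standard library. It assumes the converter's output uses ordinary `<table>`, `<tr>`, `<th>` and `<td>` markup. The names `TableExtractor`, `html_tables_to_rows` and `rows_to_markdown` are made up for illustration and are not part of Dataiku or any Azure SDK. It ignores `rowspan`/`colspan` and nested tables, so merged cells would need extra handling.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in an HTML string as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished tables
        self._table = None  # rows of the table currently being read
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._table = []
        elif tag == "tr" and self._table is not None:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self._table.append(self._row)
            self._row = None
            self._cell = None
        elif tag == "table" and self._table is not None:
            self.tables.append(self._table)
            self._table = None
            self._row = None
            self._cell = None


def html_tables_to_rows(html):
    """Return all tables found in `html`, each as a list of rows (lists of str)."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


def rows_to_markdown(rows):
    """Render rows as a pipe table, treating the first row as the header."""
    if not rows:
        return ""
    width = max(len(row) for row in rows)
    padded = [row + [""] * (width - len(row)) for row in rows]
    lines = [
        "| " + " | ".join(padded[0]) + " |",
        "| " + " | ".join(["---"] * width) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in padded[1:]]
    return "\n".join(lines)


# Hypothetical converter output, for illustration only.
sample_html = (
    "<p>Intro</p>"
    "<table><tr><th>Item</th><th>Qty</th></tr>"
    "<tr><td>Widget</td><td>3</td></tr></table>"
)
sample_tables = html_tables_to_rows(sample_html)
sample_markdown = rows_to_markdown(sample_tables[0])
```

    Keeping each table as its own chunk (as rows or as a pipe table) tends to be easier for retrieval than leaving raw HTML in the text, but which representation works best is worth testing on your own documents. Cell text that itself contains a `|` character would need escaping before rendering it this way.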