Connect to big MS Access file

Romana Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 15 ✭✭✭✭


we are trying to pull data from MS Access files using UCanAccess 4.0.4. This works for most of the tables.

However we have a problem when extracting even small tables (12 MB) from bigger MS Access files (bigger than 500 MB). Connection crashes with no specific error. Is there any way how to access big MS Access files? For example by setting up some parameters in the connection? Especially when we are only interested in 1 relatively small table from the MS Access database that contains many others?

Thank you for any suggestion!



  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron

    As I was thinking about your problem. A work around that you may have considered came to mind. For now, you might go into ms access and just export the table as .csv or ,XLSX then import the table to your dataiku project. This will help make sure your data is useful to your project and will help to confirm that you are not dealing with a corruption in the ms access file.

  • Liev
    Liev Dataiker Alumni Posts: 176 ✭✭✭✭✭✭✭✭

    Hi @Romana

    Thank you for your question. Support for MS access is achieved using the driver you mentioned and it's indeed the latest tested that seems work.

    Re parameters, you could check in their docs whether some optimisation might be achieved, but this would be outside the realm of DSS.

    The suggestion by @tgb417
    is not bad at all if you only need a small table, then perhaps extracting it and querying it separately might be best. If you're unable to do this due to the MS access database being in use and potentially changing, then you may consider changing your design in order to either:

    - switch from MS access (to MS Sql for example) or

    - splitting into two MS access databases and linking them. with this approach DSS would not be querying one monolithic database, but smaller ones.

    Good luck!

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭

    Hey @Romana
    sounds like you may have already seen this. However, just incase here is a Knowledge Base article that you may find helpful: .

    Your in good hands regardless and best of luck to you!

  • Romana
    Romana Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Registered Posts: 15 ✭✭✭✭

    Thank you all for your answers, however, I can't modify the MS Access DB before loading it into Dataiku. Neither can I export tables as csv/xlsx directly from MS Access as we run on linux.

    Luckily, there is a python code that exports all tables from .accdb/.mdb file into .csv using UCanAccess 4.0.4. This is the only workaround so far that can be incorporated in automated scenario in Dataiku.

Setup Info
      Help me…