File format conversion

Telman
Telman Registered Posts: 7 ✭✭✭

Hey Dataiku users,

I would like to know how I can convert a very large binary data file into a human-readable format such as XML or CSV, or anything else that lets me see the decoded data.

Thank you!


Operating system used: Windows

Best Answer

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,024 Neuron
    Answer ✓

    Well, you need to ask whoever produces these files which binary format they use. Then look for Python libraries that support reading that format.
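
    If the producer of the file cannot be reached, looking at the first bytes sometimes gives a hint, because many formats start with a fixed signature. The following is only a minimal sketch: the signature table is a small illustrative sample, the helper names are made up for this example, and a proprietary format from an acquisition board may well match nothing.

```python
# Illustrative subset of well-known file signatures ("magic numbers").
# This table is an assumption for the example and is far from exhaustive.
KNOWN_SIGNATURES = {
    b"PK\x03\x04": "ZIP container (also .xlsx, .docx, .jar)",
    b"\x1f\x8b": "gzip",
    b"\x89HDF\r\n\x1a\n": "HDF5",
    b"PAR1": "Apache Parquet",
    b"SQLite format 3\x00": "SQLite database",
}


def read_header(path, size=64):
    """Read only the first `size` bytes, so very large files are not a problem."""
    with open(path, "rb") as f:
        return f.read(size)


def guess_format(header):
    """Return the name of the first signature that matches, else 'unknown'."""
    for magic, name in KNOWN_SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


def hexdump(data, width=16):
    """Format bytes as offset, hex values and printable ASCII, one row per line."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {text_part}")
    return "\n".join(lines)


# Example usage (the path is a placeholder):
# header = read_header("recording.dmb")
# print(guess_format(header))
# print(hexdump(header))
```

    Readable text in the right-hand column of the dump (a vendor name, a version string, column names) is often the most useful clue to pass on to whoever produced the file.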

Answers

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron

    @TelmanM,

    Welcome to the Dataiku community. We are so pleased to have you join us.

    Regarding ingesting binary files into Dataiku: it will really depend on what type of binary file you are trying to ingest and what type of data it represents (images, text, tabular, audio…). There are a bunch of built-in formats and connections that Dataiku can import:

    https://doc.dataiku.com/dss/latest/connecting/index.html

    This list is extended by plugins

    https://www.dataiku.com/product/plugins/

    And finally, if you can program just a little and can find a Python or R library that reads the type of file you are working with, you can import your data through a code recipe:

    https://knowledge.dataiku.com/latest/code/getting-started/concept-code-recipes.html
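
    As a rough idea of what the decoding part of such a recipe could look like when no ready-made library exists, here is a sketch that uses only the Python standard library. The record layout (a 32-bit counter followed by three 32-bit floats, little-endian) and the column names are purely hypothetical; the real layout has to come from whoever produces the file. In a Dataiku recipe the input stream would typically come from a managed folder and the rows would go to an output dataset; that wiring is left out here.

```python
import csv
import io
import struct

# HYPOTHETICAL record layout, for illustration only:
# little-endian uint32 sample counter + three float32 channels (16 bytes).
RECORD = struct.Struct("<Ifff")
COLUMNS = ["sample", "ch1", "ch2", "ch3"]


def iter_records(stream, records_per_chunk=4096):
    """Yield decoded tuples chunk by chunk, so a very large file is never
    held in memory all at once."""
    chunk_size = RECORD.size * records_per_chunk
    leftover = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        data = leftover + chunk
        # Decode only whole records; carry the remainder into the next chunk.
        usable = len(data) - (len(data) % RECORD.size)
        if usable:
            yield from RECORD.iter_unpack(data[:usable])
        leftover = data[usable:]
    # Any trailing partial record left in `leftover` is ignored.


def binary_to_csv(src, dst):
    """Write all records from binary stream `src` as CSV text to `dst`.
    Returns the number of records written."""
    writer = csv.writer(dst)
    writer.writerow(COLUMNS)
    count = 0
    for row in iter_records(src):
        writer.writerow(row)
        count += 1
    return count


# Usage with in-memory data standing in for the real file:
sample = RECORD.pack(1, 0.5, 1.5, 2.5) + RECORD.pack(2, 1.0, 2.0, 3.0)
out = io.StringIO()
binary_to_csv(io.BytesIO(sample), out)
print(out.getvalue())
```

    For a real file, `src` would be something like `open(path, "rb")` and `dst` a file opened in text mode with `newline=""`, as the `csv` module documentation recommends.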

    My guess is that with one of those three methods you can import almost any kind of binary file.

    That said, if you are comfortable sharing a bit more about your use case, such as the type or types of files you are trying to import and the nature of the stored data, someone here in the community may have experience with it.

    Have a great day and welcome to the community.

  • Telman

    @tgb417

    Thank you for your prompt response.

    Basically, I do not know the type of binary file, but I know that it has a *.dmb extension and, based on my very limited knowledge, that it contains very high-frequency data recorded by an electrical board. There is a way to read the file in MATLAB using a *.m script, but I am trying to do it from the Dataiku platform by uploading the file. When I uploaded it into Dataiku, I got the error "Missing format type". I assume it is tabular data with multiple binary-coded features.

    Many thanks,

  • Turribeach

    Are you sure it's *.dmb and not *.mdb or *.dmp?

  • Telman

    The spelling is right.

  • tgb417

    @TelmanM

    After a quick look on the internet, I see that these files may be some kind of game file.

    https://file.org/extension/dmb
