File format conversion

TelmanM

Hey Dataiku users,

I just wanted to know how I can convert a very big binary data file to a human-readable format like XML or CSV, or anything else that lets me see the decoded data.

Thank you!


Operating system used: Windows

1 Solution
Turribeach

Well you need to ask whoever is producing these files to tell you what binary format they have. Then look for Python libraries that support reading these files. 

6 Replies
tgb417

@TelmanM ,

 Welcome to the Dataiku community.  We are so pleased to have you join us.

How you ingest a binary file into Dataiku really depends on what type of binary file it is and what kind of data it holds (images, text, tabular, audio…). Dataiku can import a number of built-in formats and connections.

https://doc.dataiku.com/dss/latest/connecting/index.html

This list is extended by plugins

https://www.dataiku.com/product/plugins/

And finally, if you can program just a little bit and can find a Python or R library that reads the type of file you are working with, you can import your data through a code recipe.

 https://knowledge.dataiku.com/latest/code/getting-started/concept-code-recipes.html

My guess is that with one of those three methods you can import almost any kind of binary file.
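To make the code-recipe route a bit more concrete, here is a minimal sketch of what the decoding step could look like in Python once the record layout is known. Everything about the layout is an assumption for illustration only (a little-endian unsigned 32-bit sample index followed by two 32-bit floats), and the names `RECORD`, `COLUMNS`, `iter_records` and `binary_to_csv` are made up. The real layout has to come from whoever produces the file. The sketch reads the file in chunks so a very big file does not need to fit in memory.

```python
import csv
import struct

# Hypothetical record layout (an assumption, NOT a known file format):
# little-endian uint32 sample index followed by two float32 channel values.
RECORD = struct.Struct("<Iff")
COLUMNS = ["sample", "channel_a", "channel_b"]


def iter_records(stream, chunk_records=4096):
    """Yield decoded tuples from a binary stream, reading it chunk by chunk."""
    chunk_size = RECORD.size * chunk_records
    leftover = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        data = leftover + chunk
        # Only decode whole records; carry any partial record to the next read.
        usable = len(data) - (len(data) % RECORD.size)
        yield from RECORD.iter_unpack(data[:usable])
        leftover = data[usable:]
    # Any bytes still in `leftover` form an incomplete record and are ignored.


def binary_to_csv(src, dst):
    """Decode `src` (binary stream) into CSV rows on `dst` (text stream).

    Returns the number of records written.
    """
    writer = csv.writer(dst)
    writer.writerow(COLUMNS)
    count = 0
    for row in iter_records(src):
        writer.writerow(row)
        count += 1
    return count
```

With local files this would be called with something like `open(path, "rb")` for the source and `open(out_path, "w", newline="")` for the destination. In a Dataiku Python recipe the streams would come from the recipe's inputs and outputs instead; that wiring is left out here.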

That said, if you are comfortable sharing a bit more about your use case, such as the type or types of files you are trying to import and something about the nature of the stored data, there may be someone here in the community with experience in that use case.

 

Have a great day and welcome to the community.  

--Tom
TelmanM

@tgb417 

Thank you for your prompt response.

Basically, I do not know the type of the binary file, but I know that it comes with a *.dmb extension and, based on my very limited knowledge, it contains very high-frequency data recorded by an electrical board. The file can be read in MATLAB using a *.m file, but I am trying to do it from the Dataiku platform by uploading the file. When I uploaded it to Dataiku I got this error: "Missing format type". I assume it is tabular data with multiple binary-coded features.

Many thanks,

Turribeach

Are you sure it's *.dmb and not  *.mdb or  *.dmp?
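
One way to check this without relying on the extension is to look at the first bytes of the file. Below is a rough sketch in Python, assuming the commonly documented signatures for Access databases and Windows dump files; the names `SIGNATURES`, `describe_header` and `sniff_file` are just illustrative. If nothing matches, it prints the leading bytes in hex, which is useful to pass on to whoever produces the files.

```python
# (offset, signature bytes, description) for the two look-alike extensions.
# Signatures are the commonly documented ones; treat them as a hint, not proof.
SIGNATURES = [
    (4, b"Standard Jet DB", "Microsoft Access database (.mdb)"),
    (0, b"MDMP", "Windows minidump (.dmp)"),
    (0, b"PAGE", "Windows kernel memory dump (.dmp)"),
]


def describe_header(header: bytes) -> str:
    """Return a best-guess description for the leading bytes of a file."""
    for offset, magic, label in SIGNATURES:
        if header[offset:offset + len(magic)] == magic:
            return label
    # No known signature: show the first bytes so a human can inspect them.
    return "unknown format; first bytes: " + header[:16].hex(" ")


def sniff_file(path: str, length: int = 32) -> str:
    """Read only the first `length` bytes of `path` and describe them."""
    with open(path, "rb") as handle:
        return describe_header(handle.read(length))
```

Calling `sniff_file` on the uploaded file only reads a few bytes, so it is cheap even for a very large file.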

TelmanM

The spelling is right.

 

tgb417

@TelmanM 

After a quick look on the internet, I see that these files may be some kind of game file.

https://file.org/extension/dmb

 

--Tom