Re: How to run Dataiku flow parallel for multiple different parameters
The design I proposed was a sample. You can break down the flow in many different ways. For instance, you could break it by season, so you would end up with two branches. Also, you seem to have ignored m…
Re: How to save a Pyspark DataFrame to a managed folder
A PySpark DataFrame by definition exists only on your PySpark engine, so in order to save it in Dataiku you first need to bring it into memory. You can do that by calling the toPandas()…
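As a rough sketch of that idea (not necessarily what the original reply went on to describe), the helper below collects the Spark DataFrame with toPandas(), serializes it to CSV in memory, and uploads the bytes. It assumes `folder` is a `dataiku.Folder` handle exposing `upload_data(path, data)`; the function name, the folder id, and the file path are hypothetical.

```python
def save_spark_df_to_folder(spark_df, folder, path="output.csv"):
    """Collect a PySpark DataFrame to the driver and upload it as a CSV file.

    Assumption: `folder` behaves like dataiku.Folder, i.e. it exposes
    upload_data(path, data) accepting raw bytes.
    """
    # toPandas() pulls every row into driver memory, so this is only
    # suitable for data that comfortably fits in RAM.
    pandas_df = spark_df.toPandas()

    # Serialize to CSV in memory instead of writing a temporary local file.
    csv_bytes = pandas_df.to_csv(index=False).encode("utf-8")

    # Upload the bytes under the given path inside the managed folder.
    folder.upload_data(path, csv_bytes)
    return len(csv_bytes)


# Hypothetical usage inside a Dataiku PySpark recipe:
#   import dataiku
#   folder = dataiku.Folder("my_folder_id")   # placeholder folder id
#   save_spark_df_to_folder(df, folder, "exports/result.csv")
```

Because toPandas() materializes the whole dataset on the driver, this approach is best kept for small or pre-aggregated results.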
Re: File format conversion
Are you sure it's *.dmb and not *.mdb or *.dmp?
Re: File format conversion
Well, you need to ask whoever is producing these files to tell you what binary format they use. Then look for Python libraries that support reading that format.
Re: How to run Dataiku flow parallel for multiple different parameters
You can't run your flow concurrently; it is not supported. To achieve that, you can use Dataiku Applications: https://knowledge.dataiku.com/latest/mlops-o16n/dataiku-applications/concept-da…