How to save a Pyspark DataFrame to a managed folder
stephl
Hello, Community.
How can I use a PySpark recipe to save my PySpark DataFrame as a CSV file to an output managed folder?
I have searched the community, but most posts cover pandas DataFrames only.
Best Answer
Turribeach
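A PySpark DataFrame is by definition a DataFrame that only exists on your PySpark engine, so in order to save it in Dataiku you first need to bring it into memory. You can do that by calling the toPandas() method, which returns a pandas DataFrame. Keep in mind that toPandas() collects every row on the Spark driver, so the data has to fit in memory there.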
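As a rough illustration of that approach, here is a minimal sketch, not a definitive implementation. The helper name save_spark_df_as_csv, the folder name "my_output_folder" and the file name "result.csv" are placeholders I made up. It assumes the folder object is a dataiku.Folder handle and uses its upload_data() method to write the bytes.

```python
def save_spark_df_as_csv(spark_df, folder, path="result.csv"):
    """Collect a PySpark DataFrame and upload it as a single CSV file.

    spark_df: a PySpark DataFrame (anything exposing toPandas()).
    folder:   a managed folder handle exposing upload_data(path, data),
              assumed here to be a dataiku.Folder.
    path:     hypothetical file name inside the managed folder.
    """
    # toPandas() pulls all rows into driver memory as a pandas DataFrame.
    pandas_df = spark_df.toPandas()
    # to_csv() without a target path returns the CSV content as a string.
    csv_bytes = pandas_df.to_csv(index=False).encode("utf-8")
    # Write the bytes to `path` inside the managed folder.
    folder.upload_data(path, csv_bytes)
    return len(csv_bytes)


# Possible usage inside a PySpark recipe (names are placeholders):
#   import dataiku
#   out_folder = dataiku.Folder("my_output_folder")
#   save_spark_df_as_csv(my_spark_df, out_folder, "result.csv")
```

Because everything goes through the driver, this only suits results that comfortably fit in memory; for larger outputs, writing to a regular output dataset from Spark is likely the safer route.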