How to save a PySpark DataFrame to a managed folder

stephl
  • Hello, Community.

How can I use a PySpark recipe to save my PySpark DataFrame as a CSV file in an output managed folder?

I have searched the community, but most of the posts cover only pandas DataFrames.

Best Answer

  • Turribeach

    A PySpark DataFrame is by definition a DataFrame that exists only on your PySpark engine, so in order to save it in Dataiku you first need to bring it into memory. You can do that by calling the toPandas() method.
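
    For illustration, here is a minimal sketch of that approach. It is not taken from the original answer. It assumes the output is a Dataiku managed folder object exposing upload_data(path, data), and that the data is small enough to fit in driver memory once toPandas() has collected it. The helper name save_spark_df_as_csv, the dataset name "my_input_dataset", the folder name "my_output_folder" and the file name "result.csv" are all hypothetical placeholders.

```python
def save_spark_df_as_csv(spark_df, folder, path="output.csv"):
    """Collect a PySpark DataFrame and upload it as one CSV file.

    spark_df: a PySpark DataFrame (anything exposing toPandas()).
    folder:   a managed folder handle exposing upload_data(path, data),
              for example dataiku.Folder("my_output_folder").
    path:     file name to create inside the folder.
    Returns the number of bytes uploaded.
    """
    # toPandas() pulls every row to the driver, so the data must fit in memory.
    pandas_df = spark_df.toPandas()

    # With no target path, to_csv() returns the CSV content as a string.
    csv_bytes = pandas_df.to_csv(index=False).encode("utf-8")

    # Write the bytes to the given path inside the managed folder.
    folder.upload_data(path, csv_bytes)
    return len(csv_bytes)


# Hypothetical usage inside a PySpark recipe (all names are placeholders):
#
#   import dataiku
#   import dataiku.spark as dkuspark
#   from pyspark import SparkContext
#   from pyspark.sql import SQLContext
#
#   sql_context = SQLContext(SparkContext.getOrCreate())
#   spark_df = dkuspark.get_dataframe(sql_context, dataiku.Dataset("my_input_dataset"))
#   save_spark_df_as_csv(spark_df, dataiku.Folder("my_output_folder"), "result.csv")
```

    Because everything is collected to the driver first, this is probably only suitable for results that are already small, for example after aggregation or filtering.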
