LEFT JOIN - how to change the target file

davidhernandez Registered Posts: 19 ✭✭✭✭

I am attempting a LEFT JOIN between two datasets uploaded from Excel/CSV files (to combine both files into one). I know that I have to change the target to filesystem folders to store the data in tables, but how do I change the target file type?

Answers

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    Hi,

    In the Join recipe, go to the Input/Output tab and change the output dataset. The "Store into" setting lets you select a different connection. See my example below.

    I hope this helps.

    Best regards,

    Manuel

  • davidhernandez
    davidhernandez Registered Posts: 19 ✭✭✭✭

    Yes, @Manuel, thank you! I tried that, but now the output file says: "Root path of the dataset does not exist. This error is typically caused by either (1) A dataset configuration issue. You need to modify the affected settings to fix the issue. (2) The dataset needs to be rebuilt, if it is a target in the Flow."

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    Did you re-run the recipe?

  • davidhernandez
    davidhernandez Registered Posts: 19 ✭✭✭✭

    @Manuel Yes, I did. I uploaded two different Excel spreadsheets, then clicked on JOIN WITH in an attempt to merge them. I had this problem before, and my DSS Admin told me I need to be working with CSV/Excel files as the target file. So in the output settings I just entered a new name for the output file and selected "filesystem folder"...

  • davidhernandez
    davidhernandez Registered Posts: 19 ✭✭✭✭

    @Manuel This is the full error:

    ERR_FSPROVIDER_ROOT_PATH_DOES_NOT_EXIST: Root path of the dataset or folder does not exist. DSS is trying to access data that is not there.

    Remediation: Check the settings of the dataset or managed folder. The error message details what path was not found in the connection. You may need the assistance of someone with write access to that storage to create the folder for you. In some cases, the configuration issue can be at the connection level, in which case it must be fixed by your DSS Administrator.
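The remediation above comes down to confirming that the root directory exists on the storage behind the connection. A minimal sketch of that check in plain Python follows. It assumes a local filesystem connection and access to the machine hosting it; the `managed_datasets/joined_output` path and the `check_root_path` helper are hypothetical names for illustration, not part of DSS.

```python
from pathlib import Path
import tempfile


def check_root_path(root: str, create: bool = False) -> bool:
    """Return True if `root` exists as a directory; optionally create it first."""
    path = Path(root)
    if path.is_dir():
        return True
    if create:
        # Needs write access to the parent location, as the remediation notes.
        path.mkdir(parents=True, exist_ok=True)
        return path.is_dir()
    return False


# A temporary directory stands in for the connection root (hypothetical layout).
with tempfile.TemporaryDirectory() as base:
    candidate = str(Path(base) / "managed_datasets" / "joined_output")
    missing_before = check_root_path(candidate)              # nothing there yet
    exists_after = check_root_path(candidate, create=True)   # created on demand
    print(missing_before, exists_after)
```

If the directory is missing and you cannot create it, that points to the connection-level case described in the remediation, which a DSS Administrator has to fix.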

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    Hi, I noticed that you had a similar thread in the community around the same issues.

    Can you confirm which engine you are using? Please try running the recipe with the DSS engine.

    The fact that both inputs are CSV files does not determine the connection for the output dataset. If you have access to another connection (like a database), try changing the output to that.

    In summary, the flow that you describe is not complex and it always works for me, so there must be something else at play.

    I hope this helps.
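To sanity-check what the LEFT JOIN discussed in this thread should produce, independently of any DSS connection or engine, the join can be sketched in plain Python. This is only an illustration under assumed data: the column names (`customer_id`, `name`, `amount`), the sample rows, and the `left_join` helper are made up for the example and do not reflect how the Join recipe is implemented.

```python
import csv
import io


def left_join(left_rows, right_rows, key):
    """Keep every left row; add matching right columns, or empty strings if no match.

    Non-key columns that share a name on both sides are overwritten by the right side.
    """
    # Index the right side by key so each left row is matched in one lookup.
    right_index = {}
    for row in right_rows:
        right_index.setdefault(row[key], []).append(row)
    right_cols = [c for c in (right_rows[0].keys() if right_rows else []) if c != key]

    joined = []
    for row in left_rows:
        matches = right_index.get(row[key])
        if not matches:
            # Unmatched left rows are kept, with blanks for the right columns.
            joined.append({**row, **{c: "" for c in right_cols}})
        else:
            for match in matches:
                joined.append({**row, **{c: match[c] for c in right_cols}})
    return joined


# Hypothetical sample data standing in for the two uploaded files.
customers_csv = "customer_id,name\n1,Ana\n2,Ben\n3,Cy\n"
orders_csv = "customer_id,amount\n1,10\n1,25\n3,7\n"

customers = list(csv.DictReader(io.StringIO(customers_csv)))
orders = list(csv.DictReader(io.StringIO(orders_csv)))

result = left_join(customers, orders, "customer_id")
for row in result:
    print(row)
```

Every row from the left file appears at least once in the result, which is the behaviour to expect from the recipe's output regardless of which connection it is stored in.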
