Merging two spreadsheets into one
I am trying to merge two spreadsheets so that I end up with a single file. I click "New Join Recipe" and select the existing dataset in the project. But how do I add the second spreadsheet, which is not in the project? It seems my only options are to add things that are already part of the project.
Best Answer
-
Ignacio_Toledo
Hi @davidhernandez,
I think the problem you are seeing is related to your DSS installation or configuration, and it will require further investigation or support from the Dataiku team.
For that, you'll need to share the log with the full error message so people can help you. A support ticket might be needed.
Are you using your own DSS installation? Or an instance provided by your institution/organization/company?
EDIT: I just saw you created a new post. @tgb417's answer there gives you all the details you need to find a solution. I think it would be safe to close this post so there is no duplication.
Answers
-
Ignacio_Toledo
Hi @davidhernandez,
You are correct: you can only join datasets that are in the same project. One solution, if you don't want to duplicate data, is to share the dataset from the other project with the project where you are making the join.
Hope this helps and answers your question!
-
I created my very first project. I uploaded one Excel spreadsheet and named it, then uploaded another Excel spreadsheet and named it. I made sure that at least one column was the same in both spreadsheets, so that Dataiku has a shared column to join on. I selected "Join With" to merge the two datasets, selected the Input Datasets and Output Dataset, then clicked on Create Recipe. I am not sure what else I need to do, because I keep receiving an error when I try to retrieve the output file:
"Oops: an unexpected error occurred. Job aborted due to stage failure: Task 0 in stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in stage 4.0 (TID 25...." etc...
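As a quick sanity check outside DSS, the join described above can be sketched in pandas to confirm that the shared column really lines up in both spreadsheets. This is only a rough illustration, not how the DSS Join recipe is implemented: the column names, the sample rows, and the helper `join_on_shared_column` are all hypothetical, and the left join is an assumption about the kind of merge wanted.

```python
import pandas as pd

# Hypothetical stand-ins for the two uploaded spreadsheets. With real files,
# pd.read_excel("<your file>.xlsx") would load them instead.
left = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
right = pd.DataFrame({"customer_id": [2, 3, 4], "total": [20.0, 35.5, 12.0]})


def join_on_shared_column(left, right, key):
    """Left-join two tables on a shared column (hypothetical helper).

    Raises KeyError if the key column is missing from either table.
    """
    for label, df in (("left", left), ("right", right)):
        if key not in df.columns:
            raise KeyError(f"column {key!r} is missing from the {label} table")
    # indicator=True adds a "_merge" column showing which rows found a match.
    return left.merge(right, on=key, how="left", indicator=True)


joined = join_on_shared_column(left, right, "customer_id")
print(joined)
```

Rows marked `left_only` in the `_merge` column had no match in the second spreadsheet, which usually points to a mismatch in the key values (extra spaces, different types) rather than a problem with the join itself.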