Dataiku Recipe that can copy specific columns from dataset

GlaiTana Registered Posts: 2

Hi Dataiku Community!

I am a new user and would like to ask whether there is a specific recipe I can use to copy only certain columns from a dataset. The table I am currently working on has more than 200 fields, and I do not need all of them to proceed with the project. I believe this is also causing slowness when I run sync and filter.

Hope to receive your feedback soon, thanks in advance.

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,727 Neuron
    Hi, the most commonly used recipe, the Prepare recipe, lets you easily delete unwanted columns, reorder them, and rename them, among many other transformations. If you switch to the columns view (next to Display, top right), you can select all columns by ticking the top check box and then click Action to delete them all. You can then restore the columns you want by deleting the relevant delete steps. This is quicker than deleting columns one by one if you only want to keep a few.
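
For anyone who prefers code over the visual Prepare steps, the same column trimming can be sketched in a Python recipe. This is only a rough illustration, not the approach described above: the dataset names and the column list are placeholders, `pick_columns` and `copy_columns` are hypothetical helpers, and the `dataiku` package is assumed to be available only when the code runs inside DSS.

```python
def pick_columns(all_columns, wanted):
    """Check that every wanted column exists and return them in the requested order."""
    missing = [c for c in wanted if c not in all_columns]
    if missing:
        raise KeyError("Columns not found in input dataset: %s" % missing)
    return list(wanted)


def copy_columns(input_name, output_name, wanted):
    """Read only the wanted columns from one dataset and write them to another."""
    import dataiku  # assumption: running inside a DSS Python recipe or notebook

    source = dataiku.Dataset(input_name)
    # The schema is a list of column descriptions, each with a "name" entry.
    all_columns = [col["name"] for col in source.read_schema()]
    keep = pick_columns(all_columns, wanted)

    # Loading only the needed columns avoids pulling all 200+ fields into memory.
    df = source.get_dataframe(columns=keep)
    dataiku.Dataset(output_name).write_with_schema(df)


# Hypothetical usage inside a recipe (names are placeholders):
# copy_columns("customers_raw", "customers_trimmed", ["customer_id", "country"])
```

The validation step is kept separate from the DSS calls so that a misspelled column name fails early with a clear message rather than partway through the read.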

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron
    @GlaiTana

    Welcome to the Dataiku community. We are glad to have you join us.

    I agree with @Turribeach that the Prepare recipe is very helpful for trimming columns.

    When it comes to overall performance, I would also suggest looking at the type of storage you are using for your data. Although the built-in file-based storage is quick and easy to start with, I have found that moving to a database helps with scaling. Even on the smallest Dataiku instances I have set up, I tend to use a PostgreSQL server. I have also found that the amount of RAM matters: more is better.
