Ignoring input subfolder

KevinHart
KevinHart Registered Posts: 6 ✭✭✭

Hi All,

We are trying to load a dataset that is updated daily, and we are using partitioning to do this.

However, the input folder contains a subfolder, _delta_log, which we want to ignore. The required data is in the date subfolders (see screenshot).

In the advanced options tab I see it is possible to exclude files but I cannot find any documentation on this subject. Can someone help me with an expression to ignore the delta log subfolder and all its contents?


Operating system used: Windows


Best Answer

  • Ignacio_Toledo
    edited July 17 Answer ✓

    I think for the expression, you just need

    _delta_log

    without the single quotes I used in my earlier comment (they were only there to mark where the expression starts and ends).

    You could also try:

    _delta_log/*

    Hope this helps
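
For readers wondering why the two patterns might behave differently, here is a minimal sketch using Python's fnmatch module. It is only an illustration of general glob behaviour: the file names are hypothetical, and it assumes the pattern is tested against each file's full relative path, which the thread does not confirm for DSS.

```python
from fnmatch import fnmatchcase

# Hypothetical relative paths, modelled on the folder layout in the question.
paths = [
    "_delta_log/00000000000000000000.json",
    "2023-07-16/part-0000.parquet",
    "2023-07-17/part-0000.parquet",
]

def kept(pattern):
    """Return the paths that the glob pattern does NOT exclude."""
    # fnmatchcase requires the whole path to match the pattern.
    return [p for p in paths if not fnmatchcase(p, pattern)]

# The bare folder name matches no file path, so nothing is excluded.
bare = kept("_delta_log")

# With the trailing wildcard, every file under _delta_log/ is excluded.
with_wildcard = kept("_delta_log/*")

print(bare)
print(with_wildcard)
```

Under that assumption, the bare folder name never matches a file inside the folder, while the wildcard version does.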

Answers

  • Ignacio_Toledo

    Hi @KevinHart

    I believe you are talking about this advanced option?

    [Screenshot: exclude.png]

    It means you have to provide either a glob or a regex expression to set up the rule. I'm not an expert on these kinds of expressions (I usually work them out by trial and error, or with an online tool), but if the only folder you need to skip is _delta_log, then select Glob and add '_delta_log'.

    Hope that helps.

  • KevinHart
    KevinHart Registered Posts: 6 ✭✭✭

    Hi @Ignacio_Toledo,

    Many thanks for your reply!

    Unfortunately, this did not help. The sample file is still a file from the _delta_log subfolder (see screenshot).

    Perhaps a regex would work better in this case indeed.
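
As a rough idea of what a regex for this could look like, here is a sketch in Python. The pattern and the file names are assumptions made for illustration; DSS may use a different regex flavour, and whether its paths carry a leading slash is not stated in the thread, so the pattern tolerates both.

```python
import re

# Assumed pattern: any file under a top-level _delta_log folder,
# with or without a leading slash on the path.
EXCLUDE = re.compile(r"^/?_delta_log/.*")

# Hypothetical paths, modelled on the layout described in the question.
paths = [
    "_delta_log/00000000000000000000.json",
    "/_delta_log/00000000000000000001.json",
    "2023-07-16/part-0000.parquet",
    "2023-07-17/part-0000.parquet",
]

# Keep only the paths that the exclusion pattern does not fully match.
kept = [p for p in paths if not EXCLUDE.fullmatch(p)]
print(kept)
```

Testing a candidate pattern this way against a handful of real paths is a quick check before entering it in the exclusion rule.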

  • KevinHart
    KevinHart Registered Posts: 6 ✭✭✭
    edited July 17

    Hi @Ignacio_Toledo,

    Awesome!

    _delta_log/*

    Did the trick for me!
