
Ignoring input subfolder

Solved!
KevinHart

Hi All, 

We are trying to load a dataset that is updated daily, and we are using partitioning to do this.

However, the input folder contains a _delta_log subfolder that we want to ignore. The required data is in the date subfolders (see screenshot).

In the advanced options tab I see that it is possible to exclude files, but I cannot find any documentation on this. Can someone help me with an expression that ignores the _delta_log subfolder and all its contents?


Operating system used: Windows

Ignacio_Toledo

Hi @KevinHart 

I believe you are talking about this advanced option?

[Screenshot: exclude.png]

It means you have to give either a glob or a regex expression to set up the rule. I'm not an expert on these kinds of expressions (I usually work by trial and error, or use some kind of online tool), but if the only folder you need to avoid is _delta_log, then select Glob and add '_delta_log'.

Hope that helps.

KevinHart

Hi @Ignacio_Toledo ,

Many thanks for your reply!

Unfortunately, this did not help 😞 The sample file still comes from the _delta_log subfolder (see screenshot).

Perhaps a regex would work better in this case indeed.

Ignacio_Toledo

I think for the expression, you just need

_delta_log

dropping the single quotes that I used in my earlier comment (they were only there to set off the expression).

You could also try:

_delta_log/*

Hope this helps

KevinHart

Hi @Ignacio_Toledo,

Awesome! 

_delta_log/*

Did the trick for me!
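
For anyone who lands here later and wonders why the bare folder name was not enough: a likely explanation is that the exclusion rule is tested against each file's path, not against folder names, so a pattern has to cover the files inside the folder. The sketch below illustrates that idea with Python's standard fnmatch module. It is only an illustration: the file names are made up, and Dataiku's own glob matcher may behave differently in details (for example, a leading slash in paths, or whether * is allowed to cross a / separator).

```python
import fnmatch

# Hypothetical relative paths, loosely modeled on the layout described
# in this thread (a _delta_log subfolder next to date subfolders).
paths = [
    "_delta_log/00000000000000000000.json",
    "2023-01-01/part-0000.parquet",
    "2023-01-02/part-0000.parquet",
]


def kept(paths, exclude_glob):
    """Return the paths that do NOT match the exclusion glob."""
    # fnmatchcase compares the whole path against the pattern,
    # case-sensitively and independent of the operating system.
    return [p for p in paths if not fnmatch.fnmatchcase(p, exclude_glob)]


# The bare folder name never equals a full file path, so nothing is excluded.
bare = kept(paths, "_delta_log")

# With the trailing /*, every file under _delta_log matches and is excluded.
with_star = kept(paths, "_delta_log/*")

print(bare)
print(with_star)
```

In this sketch the first call keeps all three paths and the second keeps only the two date-folder files, which lines up with what was observed above.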