Folder connected to S3 bucket is enumerating all files by default - memory leak

Dataiku DSS Core Designer, Registered Posts: 3 ✭✭✭

Hi,

I have a Folder in my pipeline that is connected to an S3 bucket containing millions of files.

I noticed an odd behaviour while running a Python recipe: whenever this Folder is an input of a recipe, all of its files are enumerated by default (even if I never create the Folder object in code). Since there are millions of files, the build takes ages and eventually runs into what looks like a memory leak!

Does anyone know how to suppress this default behaviour for S3 buckets?
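
For comparison, this is roughly the lazy behaviour I would expect instead. It is only a sketch, assuming the recipe could talk to S3 directly through a boto3 client rather than through the Folder input; the helper names `iter_keys` and `first_keys`, the bucket name and the prefix are hypothetical, and it does not change how DSS itself handles the Folder.

```python
from itertools import islice
from typing import Any, Iterator, List


def iter_keys(s3_client: Any, bucket: str, prefix: str = "") -> Iterator[str]:
    """Yield object keys under `prefix`, fetching one page of results at a time."""
    # list_objects_v2 returns at most 1000 keys per page; the paginator
    # requests the next page only when the loop asks for it.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent when a page holds no objects.
        for obj in page.get("Contents", []):
            yield obj["Key"]


def first_keys(s3_client: Any, bucket: str, prefix: str = "", limit: int = 100) -> List[str]:
    """Return at most `limit` keys without walking the rest of the bucket."""
    return list(islice(iter_keys(s3_client, bucket, prefix), limit))


# Example usage (assumes boto3 is installed and credentials are configured;
# "my-bucket" and "2024/01/" are placeholder values):
#   import boto3
#   client = boto3.client("s3")
#   print(first_keys(client, "my-bucket", "2024/01/", limit=10))
```

Because `iter_keys` is a generator, memory use stays bounded by one page of results, and restricting the prefix keeps the listing away from the rest of the bucket.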

read_bucket_ng.png

Best regards,
Talb27


Operating system used: Windows 10

