File existence check

MSL
Level 2

Hi folks, I am trying to check whether a file exists at a particular path using Python code in Dataiku.

I can access the file manually, but when I check for its existence in code I do not get the expected result. The code should return -1 when the file exists and 0 when it does not.

Thanks in advance for the inputs.

 

Turribeach

Hi, there might be a better way to achieve what you want. What exactly is your goal? Do you want to refresh a flow when a file arrives or changes in a Dataiku Managed Folder? Also, please post your code snippet using a code block (the </> icon in the toolbar).

MSL
Level 2
Author

Hi, this is the code I am using. It checks whether each file exists and updates the value accordingly:

import dataiku
import pandas as pd
import os

# Load the dataset containing file paths
df = dataiku.Dataset("formula_25").get_dataframe()

# List of column names containing file paths
file_columns = ['file1', 'file2', 'file3', 'file4']

# Function to check file existence: -1 if the path exists, 0 otherwise
def check_file_existence(file_path):
    return -1 if os.path.exists(file_path) else 0

# Iterate over each file column
for col in file_columns:
    # Create a new column to store existence status
    existence_col = col + "_exists"
    # Check file existence for each row in the column
    df[existence_col] = df[col].apply(check_file_existence)

# Save the updated DataFrame back to the result dataset
dataiku.Dataset("file_exist").write_with_schema(df)

Turribeach

Please post your code snippet using a code block (the </> icon in the toolbar). If you don't, the indentation is lost and the code can't be executed when it is copied and pasted, as Python is strict about indentation.

With regard to your issue, you can't access the file system directly; you need to use a Dataiku Managed Folder:

https://knowledge.dataiku.com/latest/code/managed-folders/concept-managed-folders.html
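
As a rough illustration of the managed-folder approach, here is a minimal sketch, not a tested solution for this exact project. It assumes the files live in a managed folder (the name "my_folder" is hypothetical) and that the file columns hold paths relative to the folder root; neither is confirmed in the thread. It lists the folder contents once with `list_paths_in_partition()` and then checks each path against that set instead of calling `os.path.exists`. The helper names are made up for this example.

```python
def list_folder_paths(folder_id):
    """Return the set of file paths in a managed folder (works only inside DSS)."""
    import dataiku  # imported lazily because it is only available inside Dataiku

    folder = dataiku.Folder(folder_id)
    # Paths are relative to the folder root and start with "/"
    return set(folder.list_paths_in_partition())


def existence_flag(path, known_paths):
    """Return -1 if the path is in known_paths, 0 otherwise."""
    # Treat missing values (None or NaN) as "file does not exist"
    if path is None or path != path:
        return 0
    # Normalize to the leading-slash form used by managed folders
    normalized = "/" + str(path).lstrip("/")
    return -1 if normalized in known_paths else 0


def run_in_dss(folder_id="my_folder"):
    """Hypothetical recipe body; "my_folder" is an assumed folder name."""
    import dataiku

    known_paths = list_folder_paths(folder_id)
    df = dataiku.Dataset("formula_25").get_dataframe()
    for col in ["file1", "file2", "file3", "file4"]:
        df[col + "_exists"] = df[col].apply(
            lambda p: existence_flag(p, known_paths)
        )
    dataiku.Dataset("file_exist").write_with_schema(df)
```

Inside a Python recipe you would call `run_in_dss()` with your own folder name or ID. Listing the folder once and testing membership in a set avoids one storage call per row, which matters if the folder is backed by remote storage.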

 
