Hi folks, I am trying to check whether a file exists at a particular path using Python code in Dataiku.
I can access the file manually, but when I check for its existence in code I don't get the expected result. The code should return -1 when the file exists and 0 when it does not.
Thanks in advance for the inputs.
Hi, there might be a better way to achieve what you want. What exactly is your goal? Do you want to refresh a flow when a file arrives or changes in a Dataiku Managed Folder? Also, please post your code snippet using a code block (the </> icon in the toolbar).
Hi, this is the code I am using. It checks whether each file exists and updates the value accordingly:
import dataiku
import pandas as pd
import os

# Load the dataset containing file paths
df = dataiku.Dataset("formula_25").get_dataframe()

# List of column names containing file paths
file_columns = ['file1', 'file2', 'file3', 'file4']

# Function to check file existence: -1 if the path exists, 0 otherwise
def check_file_existence(file_path):
    return -1 if os.path.exists(file_path) else 0

# Iterate over each file column
for col in file_columns:
    # Create a new column to store existence status
    existence_col = col + "_exists"
    # Check file existence for each row in the column
    df[existence_col] = df[col].apply(check_file_existence)

# Save the updated DataFrame back to the result dataset
dataiku.Dataset("file_exist").write_with_schema(df)
Please post your code snippet using a code block (the </> icon in the toolbar). Otherwise the indentation is lost and the code can't be executed after a copy/paste, because Python is strict about indentation.
With regard to your issue: you can't access the file system directly; you need to use a Dataiku Managed Folder:
https://knowledge.dataiku.com/latest/code/managed-folders/concept-managed-folders.html
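As a rough illustration of the managed-folder approach, here is a minimal sketch. It assumes the files live in a managed folder (the name "my_folder" is hypothetical) and that the values in the file columns are paths relative to the folder root; neither is confirmed in the thread. The helper names are also made up for this example. It lists the folder contents once with `list_paths_in_partition()` and then looks each value up in that list, instead of calling `os.path.exists`.

```python
def to_folder_path(path):
    """Managed-folder paths start with '/', so normalize values to that form."""
    return "/" + str(path).lstrip("/")


def existence_flag(path, known_paths):
    """Return -1 if `path` is among the folder's paths, 0 otherwise."""
    # Treat missing values (None or NaN) as "file does not exist".
    if path is None or path != path:
        return 0
    return -1 if to_folder_path(path) in known_paths else 0


def flag_files_in_managed_folder():
    """Runs only inside Dataiku DSS, where the `dataiku` package is available."""
    import dataiku

    # Hypothetical folder name; replace with your own managed folder.
    folder = dataiku.Folder("my_folder")
    # Paths of all files in the folder, each starting with '/'.
    known_paths = set(folder.list_paths_in_partition())

    df = dataiku.Dataset("formula_25").get_dataframe()
    for col in ['file1', 'file2', 'file3', 'file4']:
        df[col + "_exists"] = df[col].apply(
            lambda p: existence_flag(p, known_paths)
        )

    dataiku.Dataset("file_exist").write_with_schema(df)
```

Calling `flag_files_in_managed_folder()` from a Python recipe should then write the same `_exists` columns as the original code. If the dataset stores absolute server paths rather than folder-relative ones, the values would need to be mapped to folder paths first.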