File existence check
Hi folks, I am trying to check whether a file exists at a particular path using Python code in Dataiku.
I can access the file manually, but when I check for its existence in code I do not get the expected result. The code should return -1 when the file exists and 0 when it does not.
Thanks in advance for the inputs.
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,160 Neuron
Hi, there might be a better way to achieve what you want. What exactly is your goal? Do you want to refresh a flow when a file arrives or changes in a Dataiku Managed Folder? Also, please post your code snippet using a code block (the </> icon in the toolbar).
-
Hi, this is the code I am using. It checks whether each file exists and sets the value accordingly:
import dataiku
import os
import pandas as pd

# Load the dataset containing file paths
df = dataiku.Dataset("formula_25").get_dataframe()

# List of column names containing file paths
file_columns = ['file1', 'file2', 'file3', 'file4']

# Function to check file existence
def check_file_existence(file_path):
    return -1 if os.path.exists(file_path) else 0

# Iterate over each file column
for col in file_columns:
    # Create a new column to store existence status
    existence_col = col + "_exists"
    # Check file existence for each row in the column
    df[existence_col] = df[col].apply(check_file_existence)

# Save the updated DataFrame back to the result dataset
dataiku.Dataset("file_exist").write_with_schema(df)
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,160 Neuron
Please post your code snippet using a code block (the </> icon in the toolbar). If you don't, the indentation is lost and the code can't be executed when copied and pasted, since Python is strict about indentation.
With regard to your issue: you can't access the file system directly. You need to use a Dataiku Managed Folder:
https://knowledge.dataiku.com/latest/code/managed-folders/concept-managed-folders.html
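As a rough sketch of what that could look like: instead of calling os.path.exists, list the paths the managed folder contains and check each value against that list. This assumes the files live in a managed folder (the name "my_folder" below is hypothetical) and that the dataset columns hold paths relative to the folder root. The helper names build_index, existence_flag and run_in_dss are made up for illustration. As far as I know, list_paths_in_partition() returns paths with a leading slash, so the helper normalizes both sides before comparing.

```python
def normalize(path):
    """Return the path with exactly one leading slash, e.g. 'a/b.csv' -> '/a/b.csv'."""
    return "/" + path.strip().lstrip("/")


def build_index(folder_paths):
    """Turn the folder listing into a set for fast membership checks."""
    return {normalize(p) for p in folder_paths}


def existence_flag(path, known_paths):
    """Return -1 if the path is in the folder listing, 0 otherwise.

    Non-string values (None, NaN from empty cells) count as missing.
    """
    if not isinstance(path, str):
        return 0
    return -1 if normalize(path) in known_paths else 0


def run_in_dss():
    """Recipe body; only runs inside Dataiku DSS, where the dataiku package exists."""
    import dataiku

    # Hypothetical folder name: replace with your own managed folder.
    folder = dataiku.Folder("my_folder")
    known_paths = build_index(folder.list_paths_in_partition())

    df = dataiku.Dataset("formula_25").get_dataframe()
    for col in ["file1", "file2", "file3", "file4"]:
        df[col + "_exists"] = df[col].apply(
            lambda p: existence_flag(p, known_paths)
        )
    dataiku.Dataset("file_exist").write_with_schema(df)
```

Calling run_in_dss() at the end of a Python recipe should then produce the same file1_exists ... file4_exists columns as the original code. Listing the folder once and checking against a set also avoids one storage call per row.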