DSS Managed Folder Issues
I get the following error when I run the Python script below in DSS:
Error: Job failed: Error in Python process: At line 14: <class 'AttributeError'>: 'DSSManagedFolder' object has no attribute 'list_objects'
See actual Python script below
import dataiku
import pandas as pd

client = dataiku.api_client()
project = client.get_project('PREDICTIVE_ANALYTICS_OF_RAW_MATERIALS_1')

source_folder_name = "qQNv9CBS"
target_folder_name = "HMAn88LX"

source_folder = project.get_managed_folder(source_folder_name)
target_folder = project.get_managed_folder(target_folder_name)

# List all paths (files and folders) in the source folder
source_paths = source_folder.list_objects()

# Part numbers to filter
part_numbers = ['100061', '6000303', '6004910', '6000662', '6002238', '6002963', '6002965', '6004488']

# Iterate through the paths in the source folder
for source_path in source_paths:
    # Check if the path is a file
    if source_path['type'] == 'File':
        # Use the dataiku.Dataset class to read the content of the file
        source_dataset = dataiku.Dataset(source_folder.get_path() + '/' + source_path['name'])
        df = source_dataset.get_dataframe()

        # Process the data as needed
        df['Material'] = df['Material'].astype(str)

        # Check if 'Material' column exists in the DataFrame
        if 'Material' in df.columns:
            # Filter rows with specific part numbers in 'Material' column
            filtered_df = df[df['Material'].isin(part_numbers)]

            # Check if any rows match the criteria
            if not filtered_df.empty:
                # Define the target path
                target_path = target_folder.get_path() + '/' + source_path['name']

                # Write the filtered DataFrame to the target folder
                target_dataset = dataiku.Dataset(target_path)
                target_dataset.write_with_schema(filtered_df, dropAndCreate=True)

# Copy the entire folder to the output folder
future = source_folder.copy_to(target_folder)
future.wait_for_result()
Operating system used: Windows 10
Answers
-
Turribeach
Let me guess, ChatGPT wrote that code snippet. You get "object has no attribute 'list_objects'" because there is no method called list_objects(). Your GenAI is having a hallucination. So what exactly are you trying to do?
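For reference, a minimal sketch of listing the files in a managed folder. It assumes, going by the public Dataiku Python API, that the handle returned by project.get_managed_folder() offers list_contents() (a dict whose "items" entries each carry a "path"), while dataiku.Folder offers list_paths_in_partition(). The helper name list_file_paths is hypothetical and not part of any Dataiku library.

```python
def list_file_paths(folder):
    """Return the sorted file paths held by a managed folder handle.

    Checks which listing method the handle has instead of assuming one.
    """
    if hasattr(folder, "list_paths_in_partition"):
        # Assumed: dataiku.Folder, the handle used inside recipes and notebooks
        return sorted(folder.list_paths_in_partition())
    # Assumed: DSSManagedFolder obtained through dataiku.api_client()
    return sorted(item["path"] for item in folder.list_contents()["items"])


# Usage inside DSS (not run here):
#   import dataiku
#   print(list_file_paths(dataiku.Folder("qQNv9CBS")))
```

Either branch returns plain path strings, so the rest of a script does not need to know which kind of handle it was given.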
-
ChatGPT wrote part of it.
I'm trying to read the files in the source folder.
-
Turribeach
Please do not paste GenAI code without explicitly warning people that your code was generated by a GenAI bot. It's one thing to try to understand and fix someone else's code; it's a completely different thing to do the same with GenAI code. And if you are posting GenAI code, please include the prompt you used to generate it, so people looking at the code can try to understand what it was asked to do.
In this example you can see real Dataiku Python API code in which files are read from one folder and copied to another one:
If you need further help, please explain your requirement clearly. "Read the files in the source folder" doesn't say much, and your GenAI code seems to be doing much more than that.
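As a rough illustration of what the posted script appears to be aiming for, here is a sketch that reads each file from one folder, keeps the rows whose Material value is in a list of part numbers, and writes the result to a second folder. It assumes the files are UTF-8 CSVs (the thread does not say), and that both handles are dataiku.Folder objects exposing list_paths_in_partition(), get_download_stream() and upload_data(). The function name copy_filtered_csvs is hypothetical, and the standard csv module is used only to keep the sketch dependency-free.

```python
import csv
import io


def copy_filtered_csvs(source, target, part_numbers, column="Material"):
    """Write the matching rows of each source CSV to a same-named target file.

    Returns the list of paths that were written.
    """
    wanted = {str(p) for p in part_numbers}
    written = []
    for path in source.list_paths_in_partition():
        if not path.lower().endswith(".csv"):
            continue  # assumption: only CSV files matter
        with source.get_download_stream(path) as stream:
            text = stream.read().decode("utf-8")  # assumption: UTF-8 content
        reader = csv.DictReader(io.StringIO(text, newline=""))
        if column not in (reader.fieldnames or []):
            continue  # skip files without the filter column
        rows = [row for row in reader if row[column] in wanted]
        if not rows:
            continue  # nothing matched, so write nothing
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=reader.fieldnames,
                                extrasaction="ignore")
        writer.writeheader()
        writer.writerows(rows)
        target.upload_data(path, out.getvalue().encode("utf-8"))
        written.append(path)
    return written


# Usage inside a DSS Python recipe (not run here); both folders would need
# to be declared as the recipe's input and output:
#   import dataiku
#   copy_filtered_csvs(dataiku.Folder("qQNv9CBS"), dataiku.Folder("HMAn88LX"),
#                      ["100061", "6000303"])
```

Values are compared as strings, which mirrors the astype(str) step in the original script and avoids part numbers being parsed as integers.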
-
CH007
Hi there, I think you’re stuck trying to connect to a managed folder, list the file contents of the managed folder and then upload a particular file into your Notebook instance and perform some custom filters, etc. Below I've included some sample code that you may have to tweak for your actual project, but it'll point you in the direction to resolve the error you have with connecting to your managed folder.
<class 'AttributeError'>: object has no attribute - Whenever you see "object has no attribute", it means you tried to access an attribute or method that doesn't exist on that object. When working with managed folders in Dataiku, you have to retrieve the folder's handle first, and then you can manipulate the object.
Your code is missing the handle to manipulate the file; instead you are just leveraging the source folder name. You also need to specify the partitions of the folder to list the contents.
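One generic way to check what a handle really offers before calling it is plain Python introspection. The sketch below uses only built-ins; the helper name available_methods is hypothetical and not a Dataiku function.

```python
def available_methods(obj, keyword=""):
    """Return the sorted public method names of obj that contain keyword."""
    return sorted(
        name
        for name in dir(obj)
        if not name.startswith("_")
        and keyword in name
        and callable(getattr(obj, name, None))
    )


# Usage inside DSS (not run here):
#   print(available_methods(source_folder, "list"))
```

If the name you intended to call is missing from the returned list, calling it raises the same AttributeError shown in the question.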
Additional documentation can be found below for handling managed folders
https://doc.dataiku.com/dss/latest/connecting/managed_folders.html