DSS Managed Folder Issues

Kay
Level 1

I get the following error when I run the Python script below in DSS:

Error: Job failed: Error in Python process: At line 14: <class 'AttributeError'>: 'DSSManagedFolder' object has no attribute 'list_objects'

See the actual Python script below:

import dataiku
import pandas as pd

client = dataiku.api_client()
project = client.get_project('PREDICTIVE_ANALYTICS_OF_RAW_MATERIALS_1')

source_folder_name = "qQNv9CBS"
target_folder_name = "HMAn88LX"

source_folder = project.get_managed_folder(source_folder_name)
target_folder = project.get_managed_folder(target_folder_name)

# List all paths (files and folders) in the source folder
source_paths = source_folder.list_objects()

# Part numbers to filter
part_numbers = ['100061', '6000303', '6004910', '6000662', '6002238', '6002963', '6002965', '6004488']

# Iterate through the paths in the source folder
for source_path in source_paths:
    # Check if the path is a file
    if source_path['type'] == 'File':
        # Use the dataiku.Dataset class to read the content of the file
        source_dataset = dataiku.Dataset(source_folder.get_path() + '/' + source_path['name'])
        df = source_dataset.get_dataframe()

        # Process the data as needed
        df['Material'] = df['Material'].astype(str)

        # Check if 'Material' column exists in the DataFrame
        if 'Material' in df.columns:
            # Filter rows with specific part numbers in 'Material' column
            filtered_df = df[df['Material'].isin(part_numbers)]

            # Check if any rows match the criteria
            if not filtered_df.empty:
                # Define the target path
                target_path = target_folder.get_path() + '/' + source_path['name']

                # Write the filtered DataFrame to the target folder
                target_dataset = dataiku.Dataset(target_path)
                target_dataset.write_with_schema(filtered_df, dropAndCreate=True)

# Copy the entire folder to the output folder
future = source_folder.copy_to(target_folder)
future.wait_for_result()

 


Operating system used: Windows 10

4 Replies
Turribeach

Let me guess, ChatGPT wrote that code snippet. You get "object has no attribute 'list_objects'" because there is no method called list_objects(). Your GenAI is having a hallucination. So what exactly are you trying to do?
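
For reference, here is a minimal sketch of listing files through the API-client handle. It assumes the handle returned by project.get_managed_folder() exposes list_contents(), and that the result is a dict whose "items" entries carry a "path" key; please check both against the API reference for your DSS version. The helper name list_file_paths is hypothetical.

```python
def list_file_paths(folder):
    """Return the file paths found in a managed folder handle."""
    # Assumption: list_contents() returns {"items": [{"path": ...}, ...]}.
    contents = folder.list_contents()
    return [item["path"] for item in contents.get("items", [])]


# Inside DSS this would be called roughly as follows (IDs taken from the question):
#   import dataiku
#   project = dataiku.api_client().get_project("PREDICTIVE_ANALYTICS_OF_RAW_MATERIALS_1")
#   print(list_file_paths(project.get_managed_folder("qQNv9CBS")))
```

The helper only needs an object with a list_contents() method, so it can be tried outside DSS with a stand-in object before running it in a recipe.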

Kay
Level 1
Author

ChatGPT wrote part of it.
I'm trying to read the files in the source folder.

Turribeach

Please do not paste GenAI code without explicitly warning people that your code was generated by a GenAI bot. It's one thing to try to understand and fix someone else's code, and a completely different thing to do the same with GenAI code. And if you are posting GenAI code, please include the prompt you used to generate it, so people looking at the code can try to understand what it was asked to do.

In this example you can see real Dataiku Python API code in which files are read from one folder and copied to another one:

https://community.dataiku.com/t5/Using-Dataiku/Listing-and-Reading-all-the-files-in-a-Managed-Folder...

If you need further help, please explain your requirement clearly. "Read the files in the source folder" doesn't say much, and your GenAI code seems to be doing much more than that.

CH007
Level 2

Hi there, I think you're stuck trying to connect to a managed folder, list its contents, load a particular file into your notebook and then apply some custom filters. Below I've included some sample code that you may have to tweak for your actual project, but it should point you in the right direction to resolve the error you get when connecting to your managed folder.

<class 'AttributeError'>: object has no attribute. Whenever you see "object has no attribute", it means you tried to access an attribute or method that doesn't exist on that object. When working with managed folders in Dataiku, you have to retrieve the folder's handle first, and then you can manipulate the object.
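
To illustrate the error itself, here is a small sketch in plain Python. FolderHandle is a made-up stand-in class, not the real Dataiku one; the point is only that calling a method the class does not define raises AttributeError, and that dir() shows which names the object really has.

```python
class FolderHandle:
    """Stand-in for a folder handle; this is NOT the real Dataiku class."""

    def list_contents(self):
        return {"items": []}


handle = FolderHandle()

try:
    handle.list_objects()  # this method is not defined on the class
except AttributeError as err:
    message = str(err)
    print(message)

# List the public names the object really has before guessing a method name.
public_methods = [name for name in dir(handle) if not name.startswith("_")]
print(public_methods)
```

Running the same dir() check on the real folder handle in a notebook is a quick way to see which methods are available in your DSS version.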

Your code is missing the handle needed to manipulate the files; instead, you are just using the source folder name. To list the contents, you also need to specify the partition of the folder.
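
In case the screenshot does not come through, here is a rough sketch of the read, filter and write steps using folder handles. It assumes the handles behave like dataiku.Folder, with list_paths_in_partition() (called here with its default partition), get_download_stream(path) and upload_data(path, data). It also assumes the files are UTF-8 CSVs with a header row, which the original post does not state. The function name copy_filtered_csvs is hypothetical.

```python
import csv
import io


def copy_filtered_csvs(source, target, part_numbers, column="Material"):
    """Write the rows whose `column` value is in part_numbers from each CSV in
    `source` to a same-named file in `target`. Returns the paths written."""
    wanted = set(part_numbers)
    written = []
    for path in source.list_paths_in_partition():
        if not path.lower().endswith(".csv"):
            continue  # assumption: only CSV files are of interest
        with source.get_download_stream(path) as stream:
            text = stream.read().decode("utf-8")
        reader = csv.DictReader(io.StringIO(text))
        if column not in (reader.fieldnames or []):
            continue  # skip files without the expected column
        rows = [row for row in reader if row[column] in wanted]
        if not rows:
            continue  # nothing matched, so nothing is written
        out = io.StringIO()
        writer = csv.DictWriter(out, fieldnames=reader.fieldnames)
        writer.writeheader()
        writer.writerows(rows)
        target.upload_data(path, out.getvalue().encode("utf-8"))
        written.append(path)
    return written
```

Inside DSS the call would look roughly like copy_filtered_csvs(dataiku.Folder("qQNv9CBS"), dataiku.Folder("HMAn88LX"), part_numbers), after import dataiku. The values are compared as strings, which matches the astype(str) step in the original script.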

code_2.PNG

 

Additional documentation on handling managed folders can be found here:

https://doc.dataiku.com/dss/latest/connecting/managed_folders.html

 
