Azure Blob

Sourabh Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 8 ✭✭✭

Hi, I have created an append recipe in Dataiku, and I am storing its output dataset in Azure Blob Storage.

I would like to know where this Azure Blob Storage location is, and whether it can be accessed from a Jupyter notebook. If so, what is the process?


Operating system used: Windows



Answers

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 293 Dataiker
    edited July 17

    Hi @SourabhJ,

    When you set up a connection, for instance to an Azure Blob container, you can add a default path, which is where files written by DSS are stored. Unfortunately, you cannot read directly from your Azure Blob container with the DSS Python APIs. As a workaround, you can create a Managed Folder that points to your Azure Blob container and use the DSS Python APIs to list its contents.

    From within DSS:

    import dataiku

    # Optional: API client and project handle (not required by dataiku.Folder below).
    # Use the Project Key from the Project Home URL; it must be all uppercase.
    client = dataiku.api_client()
    project = client.get_project("PROJECT_KEY")

    # Open the managed folder by its ID or name
    folder = dataiku.Folder("insert_folder_ID_or_name")

    # List the paths of the files in the folder
    contents = folder.list_paths_in_partition()
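
Once the paths are listed, a file can be read through the same folder handle. The following is only a minimal sketch, assuming the folder contains a UTF-8 CSV file; `read_csv_rows` is a hypothetical helper name, and it relies only on the folder object exposing `get_download_stream(path)`, as `dataiku.Folder` does.

```python
import csv
import io


def read_csv_rows(folder, path, encoding="utf-8"):
    """Return the rows of a CSV file stored in a managed folder as dicts.

    `folder` is assumed to behave like dataiku.Folder, i.e. to expose
    get_download_stream(path) returning a binary file-like object.
    """
    # Read the whole file into memory, then close the stream
    with folder.get_download_stream(path) as stream:
        raw = stream.read()
    # Decode the bytes and parse them as CSV with a header row
    return list(csv.DictReader(io.StringIO(raw.decode(encoding))))
```

In a notebook this could be called as `read_csv_rows(dataiku.Folder("insert_folder_ID_or_name"), contents[0])`. For large files, streaming the content instead of reading it all at once would likely be a better fit.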

    Outside DSS:

    import dataikuapi
    
    # Set Dataiku URL and API Key
    host = "https://your_dss_URL"
    apiKey = "paste your API key here"
    
    # Create API client
    client = dataikuapi.DSSClient(host, apiKey)
    
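    # Skip SSL certificate verification, which may fail without access to the root CA certs.
    # This relies on a private attribute of the client and disables a security check.
    client._session.verify = False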
    
    # Get a handle to the Dataiku project, must use Project Key, take it from Project Home URL, must be all in uppercase
    project = client.get_project("PROJECT_KEY")
    
    # Get a handle to the managed folder, must use Folder ID, take it from URL when browsing the folder in Dataiku. Case sensitive!
    managedfolder = project.get_managed_folder("folder_id")
    
    # List the contents of the folder
    contents = managedfolder.list_contents()
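
The listing returned above can then be filtered for the files of interest. This is a rough sketch, assuming `list_contents()` returns a dict with an `"items"` list whose entries each carry a `"path"` key; please check this against the output on your instance. `paths_with_suffix` is a hypothetical helper name.

```python
def paths_with_suffix(managed_folder, suffix):
    """Return the paths in a managed folder that end with `suffix`.

    `managed_folder` is assumed to behave like the handle returned by
    project.get_managed_folder(), i.e. to expose list_contents().
    """
    listing = managed_folder.list_contents()
    # Assumed shape: {"items": [{"path": "/file.csv", ...}, ...]}
    return [
        item["path"]
        for item in listing.get("items", [])
        if item["path"].endswith(suffix)
    ]
```

For example, `paths_with_suffix(managedfolder, ".csv")` would keep only the CSV files written to the folder.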

    I hope this helps!

    Thanks,

    Jordan
