Advanced Designer Data Pipeline Build Mode

MikeAries
MikeAries Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 3

Hello, I am currently working through the Build Modes tutorial in the Data Pipelines course of the Advanced Designer learning path. I made it to the "Build a Flow zone" section. However, when I reach step 3, "Click Preview", I am shown the following message:

Source dataset 'TUT_VISUAL_RECIPES.tx_joined' is not ready, can't build, caused by: DataStoreIOException: Root path of the dataset tx_joined does not exist
More information might be available in full log (Actions > View full job log)

Can you explain why the preview is not working, and how I can solve this problem?

Answers

  • Sean
    Sean Dataiker, Alpha Tester, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer Posts: 168 Dataiker

    Hi @MikeAries, I wasn't able to reproduce your error, so I'd start with the error message: the root path of tx_joined does not exist. I'd open this dataset to see whether its data is there, and then move backward up the pipeline to find where the problem begins.

    Was the previous build successful? The "Build upstream" section also requires tx_joined, so I'm not sure how that build could have succeeded if tx_joined had a problem.

    If I had to guess, you may have made an error in the initial step of clearing datasets.
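The advice above (check tx_joined, then move backward up the pipeline) can be sketched in plain Python. This is only an illustration and does not use any Dataiku API: the function name, the parent map, and the set of "ready" datasets are all hypothetical, and the edges shown are a guess at the tutorial Flow rather than its confirmed structure.

```python
from typing import Callable, Dict, List


def upstream_roots_of_failure(
    target: str,
    parents: Dict[str, List[str]],
    is_ready: Callable[[str], bool],
) -> List[str]:
    """Return the not-ready datasets furthest upstream of `target`.

    A dataset is reported when it is not ready but all of its parents
    (if any) are ready, i.e. it is where the problem begins.
    """
    roots: List[str] = []
    seen = set()
    stack = [target]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if is_ready(name):
            continue  # nothing to investigate above a ready dataset
        broken_parents = [p for p in parents.get(name, []) if not is_ready(p)]
        if broken_parents:
            stack.extend(broken_parents)  # keep walking upstream
        else:
            roots.append(name)  # first broken dataset on this branch
    return sorted(roots)


# Hypothetical Flow edges (dataset -> its input datasets), for illustration only.
HYPOTHETICAL_PARENTS = {
    "tx_windows": ["tx_joined"],
    "tx_joined": ["tx_prepared", "cards_prepared", "merchants"],
}
# Hypothetical state: these datasets currently contain data.
HYPOTHETICAL_READY = {"tx_prepared", "cards_prepared"}

print(upstream_roots_of_failure(
    "tx_windows", HYPOTHETICAL_PARENTS, HYPOTHETICAL_READY.__contains__
))
```

Under these assumed inputs, the walk stops at merchants, which would be the dataset to open and inspect first.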

  • MikeAries
    MikeAries Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 3

    Hi @SeanA. When I build tx_windows (NP) with Build upstream, preview the job, and then run it, I get the following error message:

    Illegal URL
    Failed to enumerate source { "useGlobalProxy": true, "providerType": "URL", "params": { "path": "https://downloads.dataiku.com/public/website-additional-assets/data/merchant_info .csv.zip", "consider404AsEmpty": true, "fallbackHeadToGet": true, "trustAnySSLCertificate": false } }, caused by: CodedIOException: in act.download_to_merchants_NP: URI is invalid, caused by: UnknownHostException: downloads.dataiku.com : Name or service not known
    This error is typically caused by a configuration issue. You need to modify the affected settings to fix the issue.
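The UnknownHostException in this message means the hostname downloads.dataiku.com could not be resolved. A minimal sketch for checking name resolution with the Python standard library follows; it assumes it is run on the same machine as the DSS instance (otherwise it says nothing about that server's DNS or proxy settings), and the helper name can_resolve is made up for this example.

```python
import socket


def can_resolve(hostname: str, port: int = 443) -> bool:
    """Return True if `hostname` resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, port)) > 0
    except socket.gaierror:
        # Raised when the name cannot be resolved
        # (for example "Name or service not known").
        return False
```

Calling can_resolve("downloads.dataiku.com") on the DSS host and getting False would point to a DNS or network configuration issue rather than a problem with the tutorial project itself.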

  • Sean
    Sean Dataiker, Alpha Tester, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer Posts: 168 Dataiker

    Hi @MikeAries, when you open the merchants dataset, is there data in it? I think you may have cleared it. If that's the case, you'll need to recreate the project (+New Project > DSS tutorials > Advanced Designer > Build Modes).

  • MikeAries
    MikeAries Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 3

    Hi @SeanA. I found the Build Modes project to download, and I just noticed something. When I do the Build upstream part of the tutorial and preview the job, the download_to_merchants activity is listed in the job preview, for a reason I don't understand, alongside compute_cards_prepared, compute_tx_joined, compute_tx_prepared, and compute_tx_windows.

  • Sean
    Sean Dataiker, Alpha Tester, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer Posts: 168 Dataiker

    Hi @MikeAries, I think what's happening here is a bit subtle, but it doesn't matter much in the end. I also see download_to_merchants as an activity in the job preview for an upstream build. You'll notice that on the right it says "This activity is required because it was never executed". When I run the job, though, that activity takes 0 seconds, so it isn't really recomputing this recipe. Then, when I clear the downstream datasets in the Flow and preview another upstream build, download_to_merchants is no longer listed as an activity. Does that make sense? We'll try to make this a bit clearer.
