Get complete jobs history

mallisundaresan
mallisundaresan Dataiku DSS Core Designer, Registered Posts: 6
edited July 16 in Using Dataiku

Hi, I read at https://doc.dataiku.com/dss/latest/operations/disk-usage.html that job logs are not garbage collected and are retained in DSS for an arbitrary amount of time. But when I use the following code

jobs = project.list_jobs()

I only get information on the 100 most recent jobs. Is it possible to get the complete job history using the Python API?

Answers

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron

    @mallisundaresan

    I’ve looked at the documentation

    https://developer.dataiku.com/latest/api-reference/python/projects.html

    And I see no documentation about a limit for this call.

    However, in my experience the Dataiku documentation can be incomplete. On the same page I see other calls that take a count= or item_count= parameter followed by a number. I’d simply run a test to see whether list_jobs() has an undocumented parameter. I might also try job_count= or count= and see whether that has the desired effect.
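
One way to run that test without guessing parameter names is to ask Python for the method's signature. The sketch below uses only the standard-library inspect module. FakeProject, list_things and max_items are hypothetical stand-ins so the example runs outside DSS; they are not part of the Dataiku API and say nothing about what list_jobs() actually accepts.

```python
import inspect


def describe_parameters(func):
    """Return each parameter of a callable as text, e.g. 'max_items=100'."""
    return [str(p) for p in inspect.signature(func).parameters.values()]


class FakeProject:
    """Hypothetical stand-in so the sketch runs without a DSS instance."""

    def list_things(self, max_items=100):
        return []


# For a bound method, 'self' is not part of the reported signature.
params = describe_parameters(FakeProject().list_things)
print(params)
```

Inside DSS the same helper could be called as describe_parameters(project.list_jobs), assuming project is a project handle from the API client. If the signature shows no size-related parameter, the limit is probably not adjustable from the client.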

    If none of that helps, I invite you to consider two other options.

    1. Open a support ticket and ask. The support team can be very helpful.

    2. If you determine that this feature does not exist, I’d like to invite you to create a new product feature suggestion:

    https://community.dataiku.com/t5/Product-Ideas/idb-p/Product_Ideas

    Let us know how you get along with this challenge.

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,740 Neuron

    @tgb417 wrote:

    I’d simply run the test to see if there is an undocumented parameter for list_jobs().


    @tgb417 You can always check the public API client on GitHub to see whether there are any undocumented parameters:

    https://github.com/dataiku/dataiku-api-client-python/blob/master/dataikuapi/dss/project.py#L1272

  • mallisundaresan
    mallisundaresan Dataiku DSS Core Designer, Registered Posts: 6

    Thanks for the suggestions. I will check the API code to see if we can pass any parameter to the function. I will update here if I find anything substantial.

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,740 Neuron

    Are you hosting the internal database externally? If not, you should. These kinds of stats become much easier to access because you can query the internal database directly. The jobs data is in the JOBS table.
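
As a rough illustration of querying that table, here is a minimal sketch that reads every row through a standard Python DB-API connection. It makes several assumptions: the in-memory SQLite database only stands in for the externally hosted database, the demo_id and demo_state columns are placeholders (the real columns of the JOBS table are not described in this thread), and the exact table name and casing may differ on your setup, so check the schema first.

```python
import sqlite3


def fetch_all_rows(connection, table_name="JOBS"):
    """Return (column_names, rows) for every row in the given table.

    table_name is inserted into the SQL text directly, so pass only a
    trusted constant, never user input.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(f"SELECT * FROM {table_name}")
        # The DB-API exposes the column names in cursor.description.
        columns = [col[0] for col in cursor.description]
        rows = cursor.fetchall()
    finally:
        cursor.close()
    return columns, rows


# Stand-in database with placeholder columns, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE JOBS (demo_id TEXT, demo_state TEXT)")
conn.executemany(
    "INSERT INTO JOBS VALUES (?, ?)",
    [("job_a", "DONE"), ("job_b", "FAILED")],
)

columns, rows = fetch_all_rows(conn)
print(columns, len(rows))
```

With an external database, conn would instead come from that database's own DB-API driver. Reading column names from cursor.description avoids hard-coding a schema that may change between DSS versions.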
