Added on May 8, 2024 11:46AM
Hi, I saw on this page: https://doc.dataiku.com/dss/latest/operations/disk-usage.html that job logs are not garbage collected and are retained in DSS for an arbitrary length of time. But when I use the following code
jobs = project.list_jobs()
I get information for only the 100 most recent jobs. Is it possible to get the complete job history using the Python API?
I’ve looked at the documentation
https://developer.dataiku.com/latest/api-reference/python/projects.html
And I see no documentation about a limit for this call.
However, it is my experience that the Dataiku documentation can be incomplete. On the same page I see other calls that take a parameter such as count= or item_count= followed by a number. I’d simply run a test to see if there is an undocumented parameter for list_jobs(). I might also try job_count= or count= and see if that has the desired effect.
If none of that helps, I invite you to consider two other options.
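One way to run that test without guessing at names is to ask Python which parameters the method actually declares. This is only a rough sketch: `accepted_parameters` and `accepts_keyword` are helper names I made up, and `list_jobs_stub` is a stand-in so the snippet runs outside DSS. In a real session you would pass `project.list_jobs` instead, and I have not verified what it reports there.

```python
import inspect


def accepted_parameters(func):
    """Return the parameter names a callable declares (excluding 'self')."""
    return [name for name in inspect.signature(func).parameters if name != "self"]


def accepts_keyword(func, keyword):
    """Return True if func declares `keyword` or takes arbitrary **kwargs."""
    params = inspect.signature(func).parameters
    if keyword in params:
        return True
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-in for project.list_jobs, used only so this runs anywhere.
# It is NOT the real client method and says nothing about its actual signature.
def list_jobs_stub():
    return []


if __name__ == "__main__":
    # In DSS, replace list_jobs_stub with project.list_jobs.
    print(accepted_parameters(list_jobs_stub))
    for candidate in ("count", "item_count", "job_count"):
        print(candidate, accepts_keyword(list_jobs_stub, candidate))
```

If the real method turns out to declare no such parameter, passing one would raise a TypeError, so checking the signature first saves a round of trial and error.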
1. Open a support ticket and ask. The support team can be very helpful.
2. If you determine that this feature does not exist, I’d like to invite you to create a new product feature suggestion:
https://community.dataiku.com/t5/Product-Ideas/idb-p/Product_Ideas
Let us know how you get along with this challenge.
@tgb417 wrote: I’d simply run the test to see if there is an undocumented parameter for list_jobs().
@tgb417
You can always check the public API client on GitHub to see if there are any additional undocumented parameters:
https://github.com/dataiku/dataiku-api-client-python/blob/master/dataikuapi/dss/project.py#L1272
Thanks for the suggestions. I will check the API code to see if we can pass any parameter to the function. I will update here if I find anything substantial.
Are you hosting the internal database externally? If you are not, you should. This sort of statistic becomes much easier to access, as you can query the internal database directly. The jobs data is in the JOBS table.