Using Python API to retrieve last_build_time
I am trying to retrieve the last_build_time_start (and end) of a dataset using the Python API. In the docs, the class dataikuapi.dss.dataset.DSSDataset has a method named get_info that returns a DSSDatasetInfo object, but I could not call it.
Is this a bug, or has this class been removed? If so, is there any other way to get this kind of information? There is a topic with a solution mentioning the 'Internal Stats Dataset', but I don't know how to use that dataset.
Here is the information about the DSS instance I am using, viewed in Chrome:
Operating system used: debian (9)
Answers
-
Hi @Viet,
I was able to retrieve last_build_time_start using the code below:
import dataikuapi

host = "http://localhost:11200"
apiKey = "*************"
project_key = "*********"

def test_get_info(project_key):
    client = dataikuapi.DSSClient(host, apiKey)
    dataset = client.get_project(project_key).get_dataset("test")
    settings = dataset.get_settings()
    print(dataset)
    print(dataset.get_info())
    print(dataset.get_info().get_raw())

test_get_info(project_key)
The output of this code is:
% python3.9 testdatasetDSS.py
<dataikuapi.dss.dataset.DSSDataset object at 0x10d022af0>
<dataikuapi.dss.dataset.DSSDatasetInfo object at 0x10c1e85e0>
{'type': 'UploadedFiles', 'name': 'test', 'analyses': [], 'charts': [], 'notebooks': [], 'worksheets': [], 'partitioned': False, 'dataset': {'type': 'UploadedFiles', 'managed': False, 'featureGroup': False, 'name': 'test', 'projectKey': 'MANAGEDFOLDERS', 'formatType': 'csv', 'checklists': {'checklists': []}, 'checks': [], 'customMeta': {'kv': {}}, 'flowOptions': {'virtualizable': False, 'rebuildBehavior': 'NORMAL', 'crossProjectBuildBehavior': 'DEFAULT'}, 'readWriteOptions': {'preserveOrder': False, 'writeBuckets': 1, 'forceSingleOutputFile': False, 'defaultReadOrdering': {'enabled': False, 'rules': []}}, 'formatParams': {'style': 'excel', 'charset': 'utf-8', 'separator': ',', 'quoteChar': '"', 'escapeChar': '\\', 'dateSerializationFormat': 'ISO', 'arrayMapFormat': 'json', 'hiveSeparators': ['\x02', '\x03', '\x04', '\x05', '\x06', '\x07', '\x08'], 'skipRowsBeforeHeader': 0, 'parseHeaderRow': True, 'skipRowsAfterHeader': 0, 'probableNumberOfRecords': 5, 'normalizeBooleans': False, 'normalizeDoubles': True, 'readAdditionalColumnsBehavior': 'INSERT_IN_DATA_WARNING', 'readMissingColumnsBehavior': 'DISCARD_SILENT', 'readDataTypeMismatchBehavior': 'DISCARD_WARNING', 'writeDataTypeMismatchBehavior': 'DISCARD_WARNING', 'fileReadFailureBehavior': 'FAIL', 'compress': ''}, 'partitioning': {'ignoreNonMatchingFile': False, 'considerMissingRequestedPartitionsAsEmpty': False, 'dimensions': []}, 'versionTag': {'versionNumber': 1, 'lastModifiedBy': {'login': 'admin'}, 'lastModifiedOn': 1659620057647}, 'creationTag': {'versionNumber': 0, 'lastModifiedBy': {'login': 'admin'}, 'lastModifiedOn': 1659620057476}, 'tags': [], 'params': {'uploadConnection': 'Default (in DSS data dir.)', 'notReadyIfEmpty': False, 'filesSelectionRules': {'mode':
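The timestamps in the raw output above (for example 'lastModifiedOn': 1659620057647) look like milliseconds since the Unix epoch. That is an assumption based on their magnitude, not something the output states. Under that assumption, a minimal sketch for turning such a value into a readable UTC datetime could look like this (the helper name epoch_ms_to_utc is hypothetical):

```python
from datetime import datetime, timedelta, timezone

# Assumption: the value is a count of milliseconds since 1970-01-01 UTC.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def epoch_ms_to_utc(epoch_ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    # Using timedelta keeps the arithmetic exact (no float rounding).
    return EPOCH + timedelta(milliseconds=epoch_ms)

# Value copied from 'versionTag' -> 'lastModifiedOn' in the output above.
print(epoch_ms_to_utc(1659620057647).isoformat())
```

The same helper should apply to any other timestamp field in the dictionary returned by get_raw(), provided it uses the same unit.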
I checked the code of release 10 of the Dataiku API client, and it contains the get_info() method.
Can you please share more code that shows how you obtained the DSSDataset object?
What version of the Dataiku API client do you use? Please confirm the version by running the following command with your version of Python:
python3.9 -m pip list |grep dataiku-api-client
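If running pip from a shell is inconvenient (for example inside a notebook), the same check can be sketched in Python with the standard library's importlib.metadata (Python 3.8+). The helper name installed_version is hypothetical, and the code has to run in the same environment as the script that calls the API:

```python
from importlib.metadata import PackageNotFoundError, version

def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None

# Prints the version of the client package, or None if it is not installed
# in the current Python environment.
print(installed_version("dataiku-api-client"))
```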
-
Emma Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 52 Dataiker
Hey @Viet,
In addition to the solution provided by Catalina, you asked about the Internal Stats dataset.
This dataset is accessed through the Flow > + Dataset > Internal > Internal Stats. The info you're interested in is found in the "Objects state" view. See the screenshot below:
More information in the docs: https://doc.dataiku.com/dss/latest/connecting/internal-stats.html
Hope that helps,
Emma
-
Hi Catalina,
Here is the code.
I was unable to run it.
However, I tried get_last_metric_value() and it worked.
-
Great! Thanks
-
Hi @Viet,
Thanks for the details! You cannot retrieve the dataset's last build information because you are using DSS version 10.0.2, and the get_info function was added in DSS version 10.0.6. You will therefore need to upgrade DSS to version 10.0.6 or later to use this function.
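As a rough illustration of the version requirement, the comparison can be sketched as below. This assumes plain dotted numeric version strings such as "10.0.2"; the names parse_version and supports_get_info are hypothetical, and the version string has to be supplied by hand since this sketch does not query DSS.

```python
# Minimum DSS version that provides get_info, per the answer above.
MIN_VERSION_FOR_GET_INFO = (10, 0, 6)

def parse_version(text):
    """Turn a dotted numeric version like '10.0.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))

def supports_get_info(dss_version):
    """Return True if the given DSS version is 10.0.6 or later."""
    return parse_version(dss_version) >= MIN_VERSION_FOR_GET_INFO

print(supports_get_info("10.0.2"))  # False
print(supports_get_info("10.0.6"))  # True
```

Comparing tuples of integers avoids the pitfall of comparing version strings directly, where "10.0.10" would sort before "10.0.6".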