I don't know which version of DSS you are using. However, it appears that Google BigQuery is natively supported by DSS only in the paid version; if you are using the community edition, this feature is not directly available. (That said, you might be able to use a Python or R library to "roll your own".)
Tom, might one possibility for users of the community edition be to pull BigQuery data (for example, their Google Analytics dataset) into AWS, and then pull the data from AWS into Dataiku?
Yes, that might be possible.
What I was thinking of was using a Python library like the one described here:
With Python code somewhat like this:
from google.cloud import bigquery

client = bigquery.Client()

# Perform a query.
QUERY = (
    'SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` '
    'WHERE state = "TX" '
    'LIMIT 100')
query_job = client.query(QUERY)  # API request
rows = query_job.result()  # Waits for query to finish

for row in rows:
    print(row.name)
Here is some further documentation: https://cloud.google.com/bigquery/docs/reference/libraries#python
Thanks a lot @tgb417 for your response!
I have Version 7.0.2.
I am planning to create a 'Python Function' containing a Python function to interact with and manage BQ.
Additionally, it should be able to interact with an existing webapp and accept input JSON from it.
I hope this is doable in Dataiku?
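For the read side, a minimal sketch of what such a function could look like is below, assuming the google-cloud-bigquery package is installed and credentials are configured; the function name, the payload field, and the query itself are hypothetical placeholders, not a specific Dataiku API:

from google.cloud import bigquery

def handle_request(payload):
    """Hypothetical entry point: receives a JSON-decoded dict from the
    webapp and runs a parameterized BigQuery query with it."""
    client = bigquery.Client()

    # Bind the webapp input as a query parameter instead of formatting it
    # into the SQL string, to avoid injection issues.
    job_config = bigquery.QueryJobConfig(
        query_parameters=[
            bigquery.ScalarQueryParameter("state", "STRING", payload["state"])
        ]
    )
    QUERY = (
        'SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` '
        'WHERE state = @state '
        'LIMIT 100')
    rows = client.query(QUERY, job_config=job_config).result()

    # Return a JSON-serializable structure back to the webapp.
    return {"names": [row.name for row in rows]}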
Writing into BigQuery, however, is a complicated topic. You cannot simply write one record after another: BigQuery is an analytical database designed for very-large-scale analytics workloads, not at all for online transaction processing (i.e. modifying records one by one).
The only way to add data to BigQuery is to write that data to a "Google Cloud Storage" kind of dataset and then to sync this GCS dataset to BigQuery, which uses BigQuery's fast load capabilities.
Writing to a GCS dataset is covered by the regular dataset write APIs of Dataiku.
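For illustration, the "fast load" that this sync performs corresponds to a BigQuery load job from GCS. A minimal sketch using the Python client is below; the bucket, file, and table names are made-up placeholders, and inside DSS itself this step is handled by the Sync recipe rather than hand-written code:

from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical GCS object and destination table.
gcs_uri = "gs://my-bucket/exports/data.csv"
table_id = "my-project.my_dataset.my_table"

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,  # skip the CSV header row
    autodetect=True,      # infer the schema from the file
)

# Kick off the bulk load and wait for it to complete.
load_job = client.load_table_from_uri(gcs_uri, table_id, job_config=job_config)
load_job.result()
print(client.get_table(table_id).num_rows, "rows loaded into", table_id)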