Sign up to take part
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Registered users can ask their own questions, contribute to discussions, and be part of the Community!
Hi. I have a straightforward workflow that is scheduled to run hourly. The issue I'm currently experiencing is that the output tables become unqueryable / unusable while the workflow is running. Any information on why this happens and how I can fix this? Thanks!
The workflow goes: SQL Query --> prepare recipe --> save to google cloud storage --> sync to BQ table historical --> sql query recipe --> BQ table historical daily (getting the recent snapshot only -- so a subset of the historical table).
Hi
for the flow you describe, at each run the output BQ table will be dropped, created anew, and filled again with a INSERT INTO from the historical table. So while the insert is running, the output BQ table is empty (and/or locked). If you need a 0 downtime scheme, you need to have a way to "swap" the table that you then query (outside DSS?). For example you can use views: load to table_in_the_back1, when it's ready (re)create view on table_in_the_back1, and use view for querying. On the next run, load to table_in_the_back2, when it's ready drop and recreate view on table_in_the_back2. That way the downtime is only the time needed to change the view. Then you alternate between table_in_the_back1 and table_in_the_back2.