Added on September 2, 2022 4:13AM
Hi. I have a straightforward workflow that is scheduled to run hourly. The issue I'm currently experiencing is that the output tables become unqueryable/unusable while the workflow is running. Does anyone know why this happens and how I can fix it? Thanks!
The workflow goes: SQL query --> Prepare recipe --> save to Google Cloud Storage --> sync to BQ table (historical) --> SQL query recipe --> BQ table (historical daily), which keeps the most recent snapshot only, i.e. a subset of the historical table.
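(For context, a minimal sketch of what that last recipe's query might look like; the project, dataset, table, and snapshot_ts column names are all assumptions for illustration:)

```sql
-- Hypothetical sketch: keep only the most recent snapshot
-- from the historical table.
SELECT *
FROM `my_project.my_dataset.historical`
WHERE snapshot_ts = (
  SELECT MAX(snapshot_ts)
  FROM `my_project.my_dataset.historical`
);
```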
Hi,

For the flow you describe, on each run the output BQ table is dropped, created anew, and filled with an INSERT INTO from the historical table. So while the insert is running, the output BQ table is empty (and/or locked). If you need a zero-downtime scheme, you need a way to "swap" the table that you then query (outside DSS?). For example, you can use views:

- Run 1: load into table_in_the_back1; when it's ready, (re)create a view on table_in_the_back1 and use the view for querying.
- Run 2: load into table_in_the_back2; when it's ready, drop and recreate the view on table_in_the_back2.
- Keep alternating between table_in_the_back1 and table_in_the_back2 on subsequent runs (see the sketch below).

That way, the downtime is only the time needed to change the view.
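A minimal sketch of the view swap in BigQuery Standard SQL; the project, dataset, and view names here (my_project, my_dataset, daily_snapshot) are assumptions for illustration:

```sql
-- Run 1: after table_in_the_back1 has been fully loaded,
-- point the consumer-facing view at it.
CREATE OR REPLACE VIEW `my_project.my_dataset.daily_snapshot` AS
SELECT * FROM `my_project.my_dataset.table_in_the_back1`;

-- Run 2: after table_in_the_back2 has been fully loaded,
-- repoint the same view at the other backing table.
CREATE OR REPLACE VIEW `my_project.my_dataset.daily_snapshot` AS
SELECT * FROM `my_project.my_dataset.table_in_the_back2`;
```

Since CREATE OR REPLACE VIEW is a metadata operation rather than a data copy, the swap itself is quick, and queries against daily_snapshot never see an empty table.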