Implement Sampling > Random as Engine: In-Database (SQL) for Snowflake

Currently, if I select Sampling method: Random (approx. ratio) or Random (approx. nb. records), the only allowed engine is DSS, which requires downloading the input dataset to DSS.

It's possible to do the sampling on the Snowflake side with the SAMPLE / TABLESAMPLE construct: https://docs.snowflake.com/en/sql-reference/constructs/sample

For Random (approx. nb. records), I believe it would be as easy as generating the following SQL:

 

select * from input_table sample row (10 rows);
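
To make the idea concrete, here is a minimal sketch of how the SQL for both sampling methods could be generated. The function name `build_snowflake_sample_sql` and its parameters are hypothetical and are not part of DSS. The sketch only attaches a seed in the ratio case, on the assumption that Snowflake rejects a seed for fixed-size (`n rows`) sampling.

```python
def build_snowflake_sample_sql(table, method, value, seed=None):
    """Build a Snowflake SELECT that samples `table` in-database.

    Hypothetical helper, for illustration only.
    method: "ratio"      -> value is a fraction in (0, 1]
            "nb_records" -> value is a target number of rows
    """
    if method == "ratio":
        if not 0 < value <= 1:
            raise ValueError("ratio must be in (0, 1]")
        # Snowflake's SAMPLE takes a percentage (0-100), not a fraction.
        clause = f"sample row ({value * 100:g})"
        if seed is not None:
            clause += f" seed ({int(seed)})"
    elif method == "nb_records":
        if seed is not None:
            # Assumption: a seed is not accepted with fixed-size sampling.
            raise ValueError("seed is not supported with a fixed row count")
        clause = f"sample row ({int(value)} rows)"
    else:
        raise ValueError(f"unknown sampling method: {method}")
    # NOTE: the table name is interpolated as-is; real code would quote it.
    return f"select * from {table} {clause};"


if __name__ == "__main__":
    print(build_snowflake_sample_sql("input_table", "nb_records", 10))
    print(build_snowflake_sample_sql("input_table", "ratio", 0.1, seed=99))
```

Random (approx. ratio) maps to a percentage-based `sample row (p)`, and Random (approx. nb. records) maps to `sample row (n rows)`.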

1 Comment
ecerulm

SAMPLE/TABLESAMPLE is supported in many databases: 

 


* Postgres

* Teradata

* MySQL

* Google BigQuery

* Microsoft SQL Server (Transact-SQL)

* Oracle
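
As a rough illustration of how the ratio case differs per database, the sketch below maps a few dialects to their sampling clause. The dictionary, the dialect keys and the function name are hypothetical, and the templates should be checked against each vendor's documentation before use. MySQL is left out because no equivalent clause is assumed for it here. Some engines (for example SQL Server and BigQuery) sample whole pages or blocks rather than individual rows, so the returned size is only approximate.

```python
# Hypothetical per-dialect templates for "Random (approx. ratio)".
# {pct} is a percentage (0-100); {fraction} is the same ratio in (0, 1].
RATIO_SAMPLE_TEMPLATES = {
    "snowflake": "SELECT * FROM {table} SAMPLE ROW ({pct:g})",
    "postgresql": "SELECT * FROM {table} TABLESAMPLE BERNOULLI ({pct:g})",
    "sqlserver": "SELECT * FROM {table} TABLESAMPLE ({pct:g} PERCENT)",
    "bigquery": "SELECT * FROM {table} TABLESAMPLE SYSTEM ({pct:g} PERCENT)",
    "oracle": "SELECT * FROM {table} SAMPLE ({pct:g})",
    "teradata": "SELECT * FROM {table} SAMPLE {fraction:g}",
}


def ratio_sample_sql(dialect, table, ratio):
    """Return a sampling query for `dialect`, or raise if it is not covered."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1]")
    try:
        template = RATIO_SAMPLE_TEMPLATES[dialect]
    except KeyError:
        raise ValueError(f"no sampling template for dialect: {dialect}")
    # Unused placeholders are simply ignored by str.format.
    return template.format(table=table, pct=ratio * 100, fraction=ratio)


if __name__ == "__main__":
    for name in RATIO_SAMPLE_TEMPLATES:
        print(name, "->", ratio_sample_sql(name, "input_table", 0.1))
```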