In Dataiku DSS when working with Snowflake there is an option to use a stage. This apparently speeds up performance by increasing the number of different types of processes one can do inside Snowflake without having to ship data back to the DSS server for processing. Are folks using this feature? What has your experience…
What I know is to use pyspark --packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.1 so that I could access s3 file, how do I configure it on dss?