Hi Dataiku Community,
I hope you're all doing well. I wanted to reach out with some questions regarding our current implementation, where we are utilizing EMR Serverless with TBs of data flowing between Snowflake and S3.
I appreciate any insights or advice from the community. Your experiences and recommendations will be invaluable in guiding our decisions. Thank you in advance!
This question is too long and too broad to be answered directly. I think you should engage with Dataiku Professional Services, as the depth and breadth required to answer all your points is quite substantial, and the decisions you make based on this information will have a big impact on your architecture and the outcomes it provides. In other words, these are not the sort of questions you want to leave to people in a community forum, no matter how good the forum is, and this one is quite good.
I would say that, in general, you shouldn't be looking to replace your ETL/DWH/Data Lake/Big Data store with Dataiku. If you aim for that, you will most likely fail. Dataiku is an excellent end-to-end ML platform, but it's not a silver bullet for everything else. If you already have something that works, leave all of that complexity outside Dataiku and just bring the data in, in the most ready state possible, to be used for machine learning inside Dataiku. That's where you are going to get the most value out of Dataiku.