Assistant Professor (Adjunct)
New York University
Founded in 1831, NYU is one of the world’s foremost research universities and is a member of the selective Association of American Universities. The first Global Network University, NYU has degree-granting university campuses in New York and Abu Dhabi, and has announced a third in Shanghai; has a dozen other global academic sites, including London, Paris, Florence, Tel Aviv, Buenos Aires, and Accra; and sends more students to study abroad than any other U.S. college or university. Through its numerous schools and colleges, NYU conducts research and provides education in the arts and sciences, law, medicine, business, dentistry, education, nursing, the cinematic and performing arts, music and studio arts, public administration, social work, and continuing and professional studies, among other areas.
I designed and teach a two-semester course on Applied Data Analytics at New York University, which has been running for the last few years. It starts with basic topics in statistics and simple visuals, and ends the second semester with Deep Learning and AI frameworks. We start with Excel and Excel Data Analysis, then move to Python and Python data science packages.
The challenge has always been to instill in students the curiosity they need to master the basics and to approach data science problem solving in a way that lets them own the answers.
Typically, we learn what the concepts mean while practicing them with tools and code.
Dataiku comes into this learning journey after students have learned how to solve data science problems manually, the “harder” way. By design, the course introduces Dataiku DSS only once students know how to answer these challenges on their own: they are expected to have mastered the theory and to know how to solve such problems in the lab.
With its plethora of related capabilities, Dataiku creates a “wow” effect. It shows students how they can go through the pipeline faster and more thoroughly. One of my students in the Spring 2021 semester asked: “So now, by using Dataiku, I can complete the course project in a matter of a few days instead of a few weeks?” My answer was a simple “yes”, knowing from their homework submissions that they could complete the project without Dataiku.
The course uses Python, Python statistical packages, Data Science/Machine Learning/Deep Learning packages, Excel, and Excel Data Analysis add-ons as its core tools for practicing the concepts. At a high level, we start with the theory of data and analytics, then move to the basic use of spreadsheets and visualizations. At the same time, we touch upon basic Python programming and move quickly to related packages. Next we cover statistics and probability theory, followed by more practice with both tool categories as we continue with sampling, estimation, and statistical inference.
After these foundational ideas are mastered and we have covered descriptive analytics thoroughly, we move to predictive and prescriptive analytics concepts while we introduce machine learning. We spend a good amount of time learning how a number of algorithms work, including the ins and outs of the related math, while practicing each of them on an appropriate dataset (a sort of mini project in the form of team homework). In the second half of the second semester, we review frameworks, cloud computing, and big data, then move to Deep Learning, Deep Learning architectures, and related packages, and close the course by touching upon machine learning operations.
Most of these concepts can be seen in action in the Dataiku user interface. When my students learn to use Dataiku, it becomes the ‘aha’ moment: they see that once they know what data science means, they can use tools to help them execute a project faster and more thoroughly.