
Ask Me Anything on DSS 7 with Sunny Porinju

Dataiker

Hi Dataiku Community! This is Sunny and I am here to answer your questions about Dataiku DSS 7. We recently launched DSS 7 with a number of new features to help improve your data journey. 

I know you may have a lot of questions about this release (and maybe future releases), so I will do my best between now and April 3rd to answer all of your questions and hear all of your feedback.

Before getting started, if you are not familiar with DSS 7, I ask that you review this blog post, which provides a high-level overview, as well as the DSS 7.0 release notes. Also, if you are new to AMAs, please review the Ask Me Anything Guidelines and the Dataiku Community Guidelines.

To participate, simply hit reply, and craft your question. Be sure to tag me, @sunnyporinju, so I can be notified of your post. I’ll be keeping an eye out as well, so not to worry if you forget to tag me. (But it’s good practice!) 

Let the questions begin!

 

Sunny Porinju is a Senior Product Marketing Manager at Dataiku.

15 Replies
Dataiker

@sunnyporinju Can you guys comment on the “how” of the localized feature importance? Does it use LIME, or Shapley values? Is this available when Spark is the execution engine? Or is it just applicable in local execution mode?

Dataiku
Dataiker
Author

@SeanA Thanks for the question. Individual prediction explanations are computed using either ICE (Individual Conditional Expectation) or Shapley values.

The computation runs in memory. Like other in-memory recipes, it can be executed in a container.
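
For readers who have not met Shapley values before, here is a minimal sketch of the idea in plain Python. It is only an illustration, not how DSS computes its explanations: the names `shapley_values` and `toy_model`, the all-zeros baseline, and the linear toy model are all assumptions made up for this example. "Missing" features are simulated by swapping in the baseline value, and every feature subset is enumerated exactly, which is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction (hypothetical helper).

    A feature that is "absent" from a coalition takes its baseline value.
    """
    n = len(instance)
    features = range(n)

    def value(subset):
        # Model output when only the features in `subset` come from the instance.
        x = [instance[i] if i in subset else baseline[i] for i in features]
        return predict(x)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Marginal contribution of feature i to this coalition.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(x):
    # Hypothetical linear model, used only for illustration.
    return 3.0 * x[0] + 2.0 * x[1] - 1.0 * x[2]


instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

For a linear model like this one, each feature's contribution works out to its weight times its distance from the baseline, and the contributions add up to the difference between the prediction for the instance and the prediction for the baseline.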

Level 6

Can I use the new git integration to share projects and libraries through GitHub and other git-based tools like Bitbucket?

I would like to build DSS projects in one instance of DSS and have colleagues at other non-profits be able to pick up the project via GitHub or Bitbucket.

I also contribute to a few open-source projects. Can I use the git integration to develop a Python library in my instance of DSS and post my pull request to GitHub?

--Tom
Dataiker

Hi @tgb417,


The new git integration for projects allows you to synchronize your DSS project with any hosted Git solution, including GitHub, GitLab, or Bitbucket.
That being said, heavy artifacts like datasets or saved models can't be versioned in Git, so the Git integration can't be used to share a project across multiple DSS instances. The appropriate way to do that is to use project exports.
However, once the project has been synced with a remote Git repository, the source code can be cloned outside of DSS. Other people can then edit the code recipes or the project libraries from any code editor and push their changes to the remote Git repository, and DSS can pull those changes to update the project.


The project-libraries-level git integration doesn't allow pushing changes to a remote Git repository. For now, pushing is only possible through the project-level git integration. So at this point you can use the DSS git integration to import libraries from open-source projects, but you can't use DSS to contribute back to them.

Level 6

Is library-level sharing, for the use case of contributing to open source, under consideration?

--Tom
Dataiker

Yes, the ability to commit and push to git repositories, at the project libraries level, is being considered. This would enable users to contribute to any open source libs directly from DSS. 

Level 2

Can we get an update on when Pandas will be updated? In 2018 there was an indication that an update was "nearby".

Dataiker
Author

Hi @casper,

We did update Pandas in 2018. We are looking into the next update.

 

Dataiker

Hi @sunnyporinju my question is: For SharePoint support, what does DSS ingest?

Dataiker
Author

Hi @taraku,

The SharePoint support gives DSS users read and write access to files and lists on SharePoint.

Dataiker

@sunnyporinju a question from a prospect who saw the new Statistics component. 
"I see that you are doing univariate and bivariate analysis and those are pretty simple. I'm interested to know on the Statistical Tests, PCA, Fit Curves, and Correlation Matrix did Dataiku leverage specific packages to do these (like you have done with Python Sci-Kit for ML) or did Dataiku implement these on their own?" 

Dataiker

Hi @GCase,

Statistical features are built using well-known Python packages such as scipy, scikit-learn, numpy and statsmodels.
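
To give a feel for what these packages offer, here is a minimal sketch that computes a correlation matrix and a PCA using numpy alone. It is not Dataiku's code: the function name `correlation_and_pca` and the synthetic data are made up for this example, and the PCA is done here with a plain SVD rather than any particular library routine DSS may call.

```python
import numpy as np


def correlation_and_pca(data):
    """Return (correlation matrix, principal axes, explained variance ratio).

    Hypothetical helper: rows are observations, columns are variables.
    """
    X = np.asarray(data, dtype=float)
    # Pearson correlation between columns.
    corr = np.corrcoef(X, rowvar=False)
    # PCA via SVD of the mean-centered data.
    centered = X - X.mean(axis=0)
    _, singular_values, components = np.linalg.svd(centered, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return corr, components, explained


# Synthetic data: the second column is nearly a multiple of the first,
# the third column is independent noise.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
data = np.column_stack([
    x,
    2.0 * x + rng.normal(scale=0.1, size=200),
    rng.normal(size=200),
])

corr, components, explained = correlation_and_pca(data)
print(np.round(corr, 2))
print(np.round(explained, 3))
```

The first two columns should show a correlation close to 1, and the explained variance ratios come out sorted from largest to smallest because the SVD returns singular values in descending order.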

Dataiker

Hi @sunnyporinju,

Does the model interpretability feature work for all visual models?

Dataiker
Author

Hi @bencoffey,

Yes, it works for all visual models.

Community Manager

Just a reminder that you have until this Friday, April 3rd, to get all of your questions in!

Don't forget to mark as "Accepted Solution" when someone provides the correct answer to your question.