Add pin packages functionality in the Packages to install screen

Dataiku users can create and manage Python code environments in Dataiku, which is great. As part of managing a Python code environment, users need to manage the package versions they use carefully, to avoid incompatibilities or breaking their code by installing newer or older versions of packages. In the beginning, people would use "pip freeze" to pin package versions in requirements.txt. The problem with this approach is that "pip freeze" pins all pip-installed packages, including their dependencies. This can lead to problems when you update one of the "initially required" packages: its dependencies are pinned and therefore can't be upgraded along with it, causing incompatibilities (see here for an example).

The solution to this issue is to pin only the "initially required" packages, not the resulting installed ones. This product idea is to add a "Pin requested packages" check box on the "Packages to install" screen that pins only the requested packages. When the check box is selected and the Update button is clicked, Dataiku would look up on the "Currently installed packages" screen which version of each package listed in "Packages to install" was installed, and then pin that version in "Packages to install". For instance, the user requests the package ldap3 in "Packages to install", checks the "Pin requested packages" check box and clicks Update. Dataiku installs ldap3 v2.9.1 and updates the "Packages to install" section to "ldap3==2.9.1", pinning the ldap3 package to v2.9.1.
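To make the proposed behaviour concrete, here is a minimal sketch in Python of the pinning step. It is not how Dataiku implements anything; the function names (normalize, pin_requested, installed_version) and the sample versions for pandas and pyasn1 are assumptions made up for illustration. Only importlib.metadata and re from the standard library are used.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name as described in PEP 503 (case, '-', '_', '.')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def pin_requested(requested, installed):
    """Pin each bare requested package to the version found in `installed`.

    requested: lines as typed in a hypothetical "Packages to install" box.
    installed: mapping of package name -> installed version, standing in
               for the "Currently installed packages" screen.
    Lines that already carry a version specifier, and packages that are not
    installed, are returned unchanged. Dependencies are never added.
    """
    versions = {normalize(name): ver for name, ver in installed.items()}
    pinned = []
    for line in requested:
        spec = line.strip()
        # Only a bare package name is pinned; extras, specifiers, markers
        # and URLs do not match this pattern and are left alone.
        is_bare_name = re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9._-]*", spec)
        if is_bare_name and normalize(spec) in versions:
            pinned.append(f"{spec}=={versions[normalize(spec)]}")
        else:
            pinned.append(line)
    return pinned


def installed_version(name):
    """Return the version installed in the current environment, or None."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None
```

With this sketch, pin_requested(["ldap3", "pandas>=1.3"], {"ldap3": "2.9.1", "pandas": "1.5.3", "pyasn1": "0.4.8"}) returns ["ldap3==2.9.1", "pandas>=1.3"]. The dependency pyasn1 is installed but never pinned, so it stays free to move when ldap3 is upgraded later.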

This functionality would greatly simplify the administration of code environments, as users could choose to always pin "Packages to install", making sure there are no package version changes unless they are explicitly requested.