Call Julia in notebooks inside Dataiku?

Florent · June 2019

I have Julia installed on my CentOS 7 server alongside Dataiku. Is there a smart way to open notebooks with IJulia inside Dataiku?

Regards,

Florent

kldehoff · May 2020

Perhaps a bit late on this post, but there is a somewhat hack-ish way to load the Julia kernel inside Dataiku. Please note that I have only accomplished this on a single-user setup, although it is likely to work similarly for a multi-user setup.

1. Make sure Julia is installed and on the current (Dataiku) user's path.
2. Create and run a new shell recipe (output to an empty folder if required) with the following contents:
   ```
   julia -E 'using Pkg; Pkg.add("IJulia")'
   ```
   Running it installs the IJulia package and adds the IJulia kernel to Dataiku's list of kernels.
3. Start a new Jupyter notebook (any kernel will do).
4. Change the kernel to "Julia x.y.z" (x.y.z is the installed version).
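
If the Julia kernel does not show up in the kernel list, it may help to check whether IJulia actually wrote a kernelspec. The sketch below assumes the default Linux layout, where IJulia places kernelspecs under the user's Jupyter data directory (`~/.local/share/jupyter/kernels` unless `JUPYTER_DATA_DIR` or `XDG_DATA_HOME` is set). A Dataiku install may be configured differently, so treat the path as a starting point rather than a guarantee.

```shell
# Assumed default location of per-user Jupyter kernelspecs on Linux
kernel_dir="${JUPYTER_DATA_DIR:-${XDG_DATA_HOME:-$HOME/.local/share}/jupyter}/kernels"

# List any Julia kernelspecs (directories named julia-<major>.<minor>),
# or print a hint if none are found
if ls -d "$kernel_dir"/julia-* >/dev/null 2>&1; then
    ls -d "$kernel_dir"/julia-*
else
    echo "no Julia kernel found in $kernel_dir"
fi
```

This can be run as a shell recipe under the same user as the recipe that installed IJulia.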

In addition, it is possible using the PyCall and DataFrames packages to load Dataiku data into Julia. It may be possible to save it back in a similar fashion, but I have not attempted this yet.

For loading Dataiku datasets, the following template should work, but requires the PyCall.jl, Pandas.jl, and DataFrames.jl packages to be installed.

# Force PyCall to use Dataiku's built-in Python environment (adjust the path to your install)
using Pkg
ENV["PYTHON"] = "/data/dataiku/bin/python"
Pkg.build("PyCall")  # a kernel restart may be needed before the rebuilt PyCall is picked up

using PyCall, Pandas, DataFrames

# Load the Dataiku Python libraries
dataiku = pyimport("dataiku")
pd = pyimport("pandas")

# Load the dataset into both a Pandas dataframe and a Julia DataFrames dataframe
mydataset = dataiku.Dataset("test")
pd_df = Pandas.DataFrame(mydataset.get_dataframe())
jl_df = DataFrames.DataFrame(pd_df)

# Test the load
Pandas.head(pd_df)
first(jl_df, 5)  # first five rows; DataFrames.head is not available in current DataFrames.jl
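
The three packages the template depends on can be installed ahead of time from a shell recipe, in the same way as IJulia. This is only a sketch: the `PYTHON` path is the one from this post and will differ between installs, and the `command -v` guard is just there so the recipe fails gently when Julia is not on the path.

```shell
# Python interpreter PyCall should bind to when it is built
# (path taken from the post above; adjust it to your own install)
export PYTHON="/data/dataiku/bin/python"

if command -v julia >/dev/null 2>&1; then
    # Pkg.add accepts a vector of package names
    julia -e 'using Pkg; Pkg.add(["PyCall", "Pandas", "DataFrames"])'
else
    echo "julia not found on PATH"
fi
```

If PyCall was already installed against a different Python, `Pkg.add` will not rebuild it, so the `Pkg.build("PyCall")` step from the template is still required.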

CoreyS · May 2020

It's never too late! Thank you for your contribution.

dshurgatwa · September 2021

How to use

After the installation, it will be possible to create and execute Julia recipes the same way you would use any other code recipe. A Julia kernel also becomes available for Jupyter notebooks. Inside recipes and notebooks, use the package Dataiku.jl to interact with DSS. This package is a wrapper around the DSS Public API and provides functions to read and write datasets and folders in DSS easily. See the documentation in the package's README.md.

Code environments

For now, it is not possible to have multiple code environments in Julia. Therefore, all Julia recipes and notebooks use the same environment, located at $DSS_HOME/code-envs/julia. To install or remove packages, this environment has to be managed manually using Julia's built-in package manager. There are two ways to do that:

- By using Pkg inside a Jupyter notebook in DSS
- By running julia with the environment variable JULIA_DEPOT_PATH=$DSS_HOME/code-envs/julia
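
As a rough illustration of the second option, the snippet below points Julia at the shared environment and adds a package from the command line. The fallback value for `DSS_HOME` is a placeholder borrowed from the path mentioned earlier in this thread, and `CSV` is just an example package; neither comes from the plugin documentation.

```shell
# DSS_HOME should point at the DSS data directory; the fallback is only a placeholder
DSS_HOME="${DSS_HOME:-/data/dataiku}"

# Make Julia use the shared DSS Julia environment as its depot
export JULIA_DEPOT_PATH="$DSS_HOME/code-envs/julia"

if command -v julia >/dev/null 2>&1; then
    # Install an example package into that environment
    julia -e 'using Pkg; Pkg.add("CSV")'
else
    echo "julia not found on PATH"
fi
```

Removing a package works the same way with `Pkg.rm("CSV")` in place of `Pkg.add`.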
