
Call Julia lang in Notebooks inside Dataiku ?

Florent
Level 1
I have Julia installed on my CentOS 7 server alongside Dataiku. Is there a good way to open IJulia notebooks inside Dataiku?



Regards,

Florent
3 Replies
kldehoff
Level 2

Perhaps a bit late on this post, but there is a somewhat hack-ish way to load the Julia kernel inside Dataiku. Please note that I have only done this on a single-user setup, although it will likely work similarly on a multi-user setup.

  1. Make sure Julia is installed and on the current (Dataiku) user path.
  2. Create and run a new shell recipe (output to an empty folder if required) with the following contents:
    julia -E 'using Pkg; Pkg.add("IJulia")'
    When run, this installs the IJulia package and adds the IJulia kernel to Dataiku's list of kernels.
  3. Start a new Jupyter notebook (any kernel will do)
  4. Change the kernel to "Julia x.y.z" (x.y.z is the installed version)
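
If the "Julia x.y.z" kernel does not show up in step 4, it can help to check whether IJulia actually wrote a kernelspec. The sketch below is only an illustration: the helper name list_julia_kernels is made up for this post, and the default directory is the usual per-user Jupyter location on Linux, which is an assumption. Dataiku may look for kernels elsewhere on your install, so pass the right directory explicitly if needed.

```shell
#!/bin/sh
# Hypothetical helper: list Julia kernelspec directories under a Jupyter
# kernels directory. The default path is an assumption (the usual per-user
# location on Linux); pass another directory as the first argument if the
# kernels live elsewhere on your setup.
list_julia_kernels() {
    kernels_dir="${1:-${HOME:-}/.local/share/jupyter/kernels}"
    for spec in "$kernels_dir"/julia-*/kernel.json; do
        # Skip the unexpanded glob pattern when nothing matches
        [ -f "$spec" ] || continue
        # Print the kernelspec directory name, e.g. julia-1.6
        basename "$(dirname "$spec")"
    done
}
```

Running list_julia_kernels from the same shell recipe (as the Dataiku user) prints one line per Julia kernelspec it finds, and nothing if IJulia has not registered a kernel in that directory.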

In addition, the PyCall and DataFrames packages make it possible to load Dataiku data into Julia. Saving it back may be possible in a similar fashion, but I have not attempted this yet.

To load Dataiku datasets, the following template should work. It requires the PyCall.jl, Pandas.jl, and DataFrames.jl packages to be installed.

#this forces PyCall to reference the built-in python environment for Dataiku
#(set PYTHON first, then rebuild PyCall so it picks up that interpreter)
using Pkg
ENV["PYTHON"] = "/data/dataiku/bin/python"
Pkg.build("PyCall")

using PyCall, Pandas, DataFrames

#load Dataiku Python libraries
dataiku = pyimport("dataiku")
pd = pyimport("pandas")

#load the dataset into both a Pandas dataframe and a Julia Dataframes dataframe
mydataset = dataiku.Dataset("test")
pd_df = Pandas.DataFrame(mydataset.get_dataframe())
jl_df = DataFrames.DataFrame(pd_df)

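#test the load (show the first rows of each dataframe)
Pandas.head(pd_df)
#recent DataFrames.jl versions no longer provide head; first(df, n) is the replacement
first(jl_df, 5)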
CoreyS
Community Manager

It's never too late! Thank you for your contribution.

dshurgatwa
Level 1

How to use

After the installation, you can create and execute Julia recipes the same way you would any other code recipe. A Julia kernel also becomes available for Jupyter notebooks.

Inside recipes and notebooks, use the Dataiku.jl package to interact with DSS. This package is a wrapper around the DSS Public API and provides functions to easily read and write datasets and folders in DSS. See the documentation in the package's README.md.

Code environments

For now, it is not possible to have multiple Julia code environments. All Julia recipes and notebooks therefore use the same environment, located at $DSS_HOME/code-envs/julia. To install or remove packages, this environment has to be managed manually using Julia's built-in package manager. There are two ways to do that:

  1. By using Pkg inside a Jupyter notebook in DSS
  2. By running julia with the environment variable JULIA_DEPOT_PATH=$DSS_HOME/code-envs/julia
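
For the second option, something like the following might work as a starting point. It is a rough sketch, not taken from the plugin documentation: the function names julia_depot_path and add_julia_package are invented here, and /data/dataiku in the usage note is just an example DSS_HOME. It assumes julia is on the PATH of the user running it.

```shell
#!/bin/sh
# Hypothetical helper: build the shared Julia environment path from a DSS
# home directory (a trailing slash on the argument is tolerated).
julia_depot_path() {
    dss_home="${1:?usage: julia_depot_path DSS_HOME}"
    printf '%s/code-envs/julia\n' "${dss_home%/}"
}

# Hypothetical helper: add one package to the shared environment by running
# julia with JULIA_DEPOT_PATH pointed at it. Requires julia on the PATH.
add_julia_package() {
    dss_home="${1:?usage: add_julia_package DSS_HOME PACKAGE}"
    pkg="${2:?usage: add_julia_package DSS_HOME PACKAGE}"
    JULIA_DEPOT_PATH="$(julia_depot_path "$dss_home")" \
        julia -e 'using Pkg; Pkg.add(ARGS[1])' "$pkg"
}
```

For example, add_julia_package /data/dataiku CSV would run Pkg.add("CSV") against the depot at /data/dataiku/code-envs/julia. Replacing Pkg.add with Pkg.rm in the same command removes a package instead.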
