-
Storing Datasets in Dataiku (with SQL Server and More)
Hi, I am about to start a project with Dataiku and have two questions: 1) I saw in the tutorials that one of the basic things to do with Dataiku is to create a dataset. I wanted to know where datasets are stored. 2) Is it possible (and also logical) to store them in SQL Server? The reason I am asking is that the projects…
-
Function does not reduce error
Hi, I'm facing some trouble in the following Python recipe.
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from statsmodels.stats.stattools import medcouple
# Read recipe inputs
COLETA_f_datas = dataiku.Dataset("COLETA_f_datas")
COLETA_f_datas_df =…
-
Can't write Dataframe to Dataset
In my current setup, I have some .las files in a Managed folder, and I convert them into DataFrames using the lasio library. After converting them and joining them into a single DataFrame, I try to write it to the dataset on the other end of the code recipe by running this: TypeError:…
-
How to have a recipe write its output to an S3 bucket?
I have a workflow whose recipe outputs were, at the time of its creation, written to the local filesystem. I now want all of its recipes to save their outputs to an S3 connection instead. How can I implement this? Thanks.
-
Efficient way to write massive dataset to output in Dataiku
Hi Team, I'm using PySpark with Dataiku. After processing the data, I'm facing an issue writing it to the output. Could you please suggest an efficient way to write the data to the output? Dataset size: 40 million (approx.). Getting an error at line 15 while writing (as the data is massive). #Recipe 1 import dataiku 2…
-
R Recipe Streaming
We're working on a project utilizing an R notebook with some very large datasets and are wondering what the recommended approach is for working with a dataset that does not fit into memory. We are big fans of the streaming API for Python - is there any equivalent for R? Thanks!
-
GPU is unavailable in code notebook
Hi, the result of tf.test.is_gpu_available() is True in the shell, but the GPU is not available in the code notebook, where tf.test.is_gpu_available() returns False. Could you let me know how to solve this problem? Operating system used: Ubuntu
-
Link to a notebook in the Dataiku Visual Studio Code extension
Hi. I successfully connected to Dataiku using the Visual Studio Code extension. I was able to open a Dataiku Python recipe as a .py file, but I want to open it as .ipynb. How can I change this? Or is there no way to connect to a code notebook?
-
Running unit tests and dealing with project paths
Hi all! I'm fairly new to unit testing, but I've been reading more and more and creating my own tests for a specific Dataiku project with a fairly well-furnished custom Python lib. As our project is hosted on a remote repo, I'd like to be able to implement a way for the tests of lib/tests/ to run…
-
SQL Notebooks accessible via API?
Hello, I see that there is some support for interacting with Jupyter Notebooks via API in version 10 of DSS. However, I didn't see any functionality that supports SQL Notebooks. Am I missing it? Thanks, Marlan Operating system used: Linux