Using Dataiku
- I have a custom multiclass classification algorithm. I cannot figure out the conditions under which the Dataiku scoring system will call predict_proba. Thanks for any help. Gordon Operating system used: Mar… Last answer by Erlebacher
I have made some progress within the "lab" attached to one of my training sets. I have defined two scoring functions, and they run properly (I return "synthetic" data consistent with the required formats). But then I get the following error:
ValueError: Classification metrics can't handle a mix of multiclass and continuous targets
It is true that my features are a mixture of multiclass and continuous variables. Still, I have three questions:
1) If I am writing custom functions, why should Dataiku care about this mixture?
2) Why can't this error be raised before computational resources are wasted processing my custom functions? Dataiku must know about this mix immediately, at the very first stage, when my custom algorithm is trained (which Dataiku had no issues with).
3) What is happening after the function scoring?
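For context, this particular ValueError is raised by scikit-learn's metric type checking, and it concerns the target and the predictions, not the features: a classification metric is being handed multiclass ground-truth labels together with continuous predicted values. A minimal sketch that reproduces the same message outside Dataiku (the metric and the values below are illustrative only, not taken from the thread):

from sklearn.metrics import accuracy_score

y_true = [0, 2, 1, 2]            # multiclass ground-truth labels
y_pred = [0.1, 1.8, 0.9, 2.2]    # continuous scores instead of class labels

# Raises: ValueError: Classification metrics can't handle a mix of
# multiclass and continuous targets
accuracy_score(y_true, y_pred)

If the custom scoring or prediction code returns continuous scores (for example probabilities) where class labels are expected, any classification metric computed after scoring will fail in exactly this way, which would explain why training succeeds but the evaluation step does not.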
- Is it possible to set up notifications for admins in Dataiku for certain events? Below is a (non-exhaustive) list of events I am interested in: * A new deployment is created by a user * A new …
- Hello, I'm working with partitioned managed folders and I would like to get the pattern of the partitioning (available in the web interface; see screenshot). For datasets, I can get such information wi…Last answer by
- Hello everyone, I began using Dataiku a few days ago. I have a lot of "address" data, and I tried to use the Geocoder plugin in order to convert them into usable coordinates and geopoints. As this pl…Last answer by Alexandru
Hi @PeteGore,
The Python processor within a Prepare recipe can only apply to one column.
You can compute both values, write them to the targeted column in the Python processor, and later split it into separate columns with another processor.
You can perform this in a Python recipe instead.
You can leverage project libraries if you need to reuse Python code: https://doc.dataiku.com/dss/latest/python/reusing-code.html
Or package a Python recipe as a plugin:
https://doc.dataiku.com/dss/latest/python/reusing-code.html#packaging-code-as-plugins
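As a minimal sketch of the workaround described above (column names and the computed values are hypothetical, purely for illustration): a cell-mode Python function step can pack two computed values into the single target column, and a later Split column step on the separator then turns them into two columns.

# Cell-mode Python function step in a Prepare recipe (hypothetical columns)
def process(row):
    lat = float(row["latitude"]) + 0.5    # first computed value (illustrative)
    lon = float(row["longitude"]) - 0.5   # second computed value (illustrative)
    # Both values go into the step's single output column, separated by "|",
    # so a later "Split column" step on "|" can produce two columns
    return "{}|{}".format(lat, lon)

In a Python recipe, by contrast, you can simply add both columns to the dataframe and write it back with Dataset.write_with_schema(), so no packing and splitting is needed.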
- Using the following code: from sklearn.metrics import rand_score (which is what is required per scikit-learn.org), I get an error …Last answer by
- I have a dataset which I would like to update several times to give me monthly records, for 12 months, for example. Can this be done in an SQL recipe using loops? Or how else can it be done? Please let me know ho…Last answer by
- Hello community, I want to write a .TIF file into a folder; after I run the code recipe, the run is successful but the file is empty. Here is an example of my code and a photo of the output: input_folder…Solution by razan
I have solved it using the os library: open the output folder using os.chdir(), and after you flush the file cache, delete the output dataset (out_ds) or set it to None.
import os
import dataiku
from osgeo import gdal

input_folder = dataiku.Folder('input_folder_name')
output_folder = dataiku.Folder('output_folder_name')

# Work inside the output managed folder so the GeoTIFF is created there
os.chdir(output_folder.get_info()['path'])

# Open the first file of the input folder as a GDAL dataset
band_path = input_folder.get_info()['path'] + input_folder.list_paths_in_partition()[0]
in_ds = gdal.Open(band_path)
in_band = in_ds.GetRasterBand(1)

# Create a single-band GeoTIFF with the same size, type and georeferencing
gtiff_driver = gdal.GetDriverByName('GTiff')
out_ds = gtiff_driver.Create('nat_color.tif', in_band.XSize, in_band.YSize, 1, in_band.DataType)
out_ds.SetProjection(in_ds.GetProjection())
out_ds.SetGeoTransform(in_ds.GetGeoTransform())

out_band = out_ds.GetRasterBand(1)
out_band.WriteArray(in_band.ReadAsArray())
out_ds.FlushCache()
# Only one band was created, so compute statistics on band 1 only
out_ds.GetRasterBand(1).ComputeStatistics(False)

# Setting the GDAL objects to None closes them and flushes the file to disk
out_ds = None
in_band = None
in_ds = None
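One caveat with the os.chdir() approach: it assumes the output managed folder lives on the local filesystem (get_info()['path'] is only meaningful there). A hedged alternative sketch, reusing the same hypothetical folder names, writes the GeoTIFF to a local temporary file and then pushes it into the managed folder with Folder.upload_stream(), which also works for non-local (for example cloud-backed) output folders:

import os
import tempfile
import dataiku
from osgeo import gdal

input_folder = dataiku.Folder('input_folder_name')
output_folder = dataiku.Folder('output_folder_name')

# Read the source raster from the (local-filesystem) input folder
src_path = os.path.join(input_folder.get_info()['path'],
                        input_folder.list_paths_in_partition()[0].lstrip('/'))
in_ds = gdal.Open(src_path)
in_band = in_ds.GetRasterBand(1)

# Write the GeoTIFF to a local temporary file first
tmp_path = os.path.join(tempfile.mkdtemp(), 'nat_color.tif')
out_ds = gdal.GetDriverByName('GTiff').Create(
    tmp_path, in_band.XSize, in_band.YSize, 1, in_band.DataType)
out_ds.SetProjection(in_ds.GetProjection())
out_ds.SetGeoTransform(in_ds.GetGeoTransform())
out_ds.GetRasterBand(1).WriteArray(in_band.ReadAsArray())
out_ds = None  # close and flush to disk
in_ds = None

# Upload the finished file into the output managed folder
with open(tmp_path, 'rb') as f:
    output_folder.upload_stream('/nat_color.tif', f)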
- How can one remove a message? It no longer applies. I am referring to this message. Thanks. Last answer by
- Hi all, I am deploying an API pod in Kubernetes, and our process seems to be slow. Where can we find the Python printouts in the logs within the pods? Is there a place where we need to set the ht…Last answer by JordanB
Hi @james147,
Where are you experiencing the slowness: is it during the deployment (when you select deploy or update), or while the API is deployed? Note that API deployment requires rebuilding the code env. If you are using R, code envs with very large Python packages, or Python packages for which precompiled binaries are not available, this can take some time. If you are not using R, you can rebuild the deployment without it.
In DSS 10.0.6+, to obtain the pod logs (apimain.log), you can use Administration - Cluster - Actions and run the following command:
kubectl exec <podname> -- cat /home/dataiku/data/run/apimain.log
To identify the pod name, go to Administration - Cluster - Monitoring and find the name(s). You may need to check each pod's logs. You can also manually redirect the apimain.log output to stdout:
cat /home/dataiku/data/run/apimain.log 2>&1
> Is there a place where we need to set the http timeouts?
Can you expand on this?
> Where can I find the Dockerfile that is used to build this Docker image when we do a deployment with a code env?
The image can be found within the datadir under /tmp/api_deployer; however, I'm not sure this will help troubleshoot the slowness unless you have customized it.
If you could provide details regarding where/when this slowness is occurring, that would be great!
Thanks!
Jordan
- Are there any examples of how to build a many-to-many relationship within Dataiku? I find it strange that no one appears to have even asked about this before. Any direction would be greatly app…Last answer by