Using Dataiku

441 - 450 of 517
  • I have a custom multiclass classification algorithm. I cannot figure out the conditions under which the Dataiku scoring system will call predict_proba. Thanks for any help. Gordon Operating system used: Mar…
    Question
    Started by Erlebacher
    Most recent by Erlebacher
    Last answer by Erlebacher

    I have made some progress within the "lab" attached to one of my training sets. I have defined two scoring functions, and they run properly (I return "synthetic" data consistent with the required formats). But then I get the following error:

    ValueError: Classification metrics can't handle a mix of multiclass and continuous targets

    It is true that my features are a mixture of multiclass and continuous targets. Still, I have three questions:

    1) If I am writing custom functions, why should Dataiku care about this mixture?

    2) Why isn't this error reported before computational resources are wasted processing my custom functions? Dataiku must know about this mix at the very first stage, when my custom algorithm is trained (which Dataiku had no issues with).

    3) What is happening after the function scoring?
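
    For reference, this ValueError is raised by scikit-learn's classification metrics when the ground truth looks multiclass but the predictions look continuous. A minimal sketch, independent of the Dataiku internals, that reproduces the same message:

    from sklearn.metrics import accuracy_score

    y_true = ["cat", "dog", "bird"]   # multiclass labels
    y_pred = [0.12, 0.87, 0.55]       # continuous scores instead of class labels

    # Raises: ValueError: Classification metrics can't handle a mix of
    # multiclass and continuous targets
    accuracy_score(y_true, y_pred)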


  • Is it possible to set up notifications for admins in Dataiku for certain events? Below is a (non-exhaustive) list of certain events I am interested in: * A new deployment is created by a user * A new …
    Question
    Started by yashpuranik
    Most recent by fvargaspiedra
    Last answer by fvargaspiedra

    I have the same question. Subscribing to @yashpuranik's question.


  • Hello, I'm working with partitioned managed folders and I would like to get the pattern of the partitioning (available in the web interface; see screenshot). For Datasets, I can get such information wi…
    Question
    Started by Moon11
    Most recent by Moon11
    Last answer by Moon11

    Thanks, it was exactly what I was looking for!
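
    The answer that solved this is not quoted in the excerpt above. Purely as an illustration, one way to read a managed folder's settings programmatically is through the dataikuapi client; the connection details below are placeholders, and the assumption that the partitioning configuration sits under a 'partitioning' key of the definition should be checked against your DSS version:

    import dataikuapi

    # Hypothetical connection details
    client = dataikuapi.DSSClient("https://dss.example.com:11200", "my_api_key")
    project = client.get_project("MY_PROJECT")

    # Managed folders are addressed by their id, not their display name
    folder = project.get_managed_folder("folder_id")
    definition = folder.get_definition()

    # Assumption: partitioning dimensions and the file path pattern are exposed here
    print(definition.get("partitioning"))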

  • Hello everyone, I began using Dataiku a few days ago. I have a lot of "address" data, and I tried to use the Geocoder plugin in order to convert them into usable coordinates and geopoints. As this pl…
    Question
    Started by PeteGore
    Most recent by Alexandru
    Last answer by Alexandru

    Hi @PeteGore,

    The Python processor within a Prepare recipe can only apply to one column.

    You can compute both values, write them into the target column of the Python processor, and later split them into separate columns with another processor.

    You can perform this in a Python recipe instead.

    You can leverage project libraries if you need to reuse Python code: https://doc.dataiku.com/dss/latest/python/reusing-code.html

    Or package a Python recipe as a plugin:

    https://doc.dataiku.com/dss/latest/python/reusing-code.html#packaging-code-as-plugins
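
    As a sketch of the Python-recipe route, a plain recipe can add both coordinate columns at once; the dataset names and the geocode() helper below are hypothetical placeholders, not part of the Geocoder plugin:

    import dataiku

    # Hypothetical input/output dataset names
    addresses = dataiku.Dataset("addresses")
    geocoded = dataiku.Dataset("addresses_geocoded")

    df = addresses.get_dataframe()

    def geocode(address):
        # Placeholder: call whatever geocoding library or service you use
        # and return a (latitude, longitude) pair
        return (None, None)

    # Unlike the single-column Python processor, a recipe can write several new columns
    coords = df["address"].apply(geocode)
    df["latitude"] = coords.apply(lambda c: c[0])
    df["longitude"] = coords.apply(lambda c: c[1])

    geocoded.write_with_schema(df)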

  • Using the following code: from sklearn.metrics import rand_score (which is what is required per scikit-learn.org), I get the error…
    Question
    Started by cwentz
    Most recent by cwentz
    Last answer by cwentz

    ImportError                               Traceback (most recent call last)
    <ipython-input-187-d43b243c5958> in <module>
    ----> 1 from sklearn.metrics import rand_score
    
    ImportError: cannot import name 'rand_score'
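
    For context, sklearn.metrics.rand_score was only added in scikit-learn 0.24, so this ImportError usually means the notebook's code environment ships an older version. A quick check, as a sketch:

    import sklearn
    print(sklearn.__version__)   # rand_score requires scikit-learn >= 0.24

    # On older versions, adjusted_rand_score has been available for much longer
    from sklearn.metrics import adjusted_rand_score
    print(adjusted_rand_score([0, 0, 1, 1], [0, 0, 1, 1]))   # 1.0 for identical clusterings
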
  • I have a dataset which I would like to update several times to give me monthly records, for 12 months, for example. Can this be done in an SQL recipe using loops? Or how else can it be done? Please let me know ho…
    Question
    Started by jaronarboleda
    Most recent by jaronarboleda
    Last answer by jaronarboleda

    Thanks for the suggestions.
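
    The suggestions themselves are not quoted in this excerpt. Purely as an illustration, one common alternative to looping in SQL is to do the looping in a Python recipe; the dataset names and dates below are hypothetical:

    import dataiku
    import pandas as pd

    # Hypothetical input and output datasets
    src = dataiku.Dataset("monthly_source")
    out = dataiku.Dataset("monthly_history")

    df = src.get_dataframe()

    # Build one snapshot per month and stack the 12 snapshots into a single output
    snapshots = []
    for month in pd.date_range("2022-01-01", periods=12, freq="MS"):
        snap = df.copy()
        snap["snapshot_month"] = month.strftime("%Y-%m")
        snapshots.append(snap)

    out.write_with_schema(pd.concat(snapshots, ignore_index=True))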

  • Hello community, I want to write a .TIF file into a folder. After I run the code recipe, the run is successful but the file is empty. Here is an example of my code and a photo of the output: input_folder…
    Answered ✓
    Started by razan
    Most recent by razan
    Solution by razan

    I have solved it using the os library: open the output folder using os.chdir(), and after you flush the file cache, delete the output GDAL dataset object (out_ds) or set it to None so the file is closed and written to disk.

    import os
    import dataiku
    from osgeo import gdal

    input_folder = dataiku.Folder('input_folder_name')
    output_folder = dataiku.Folder('output_folder_name')

    # Work inside the output managed folder on the local filesystem
    os.chdir(output_folder.get_path())

    # Path to the first file stored in the input folder
    band_path = input_folder.get_path() + input_folder.list_paths_in_partition()[0]

    in_ds = gdal.Open(band_path)
    in_band = in_ds.GetRasterBand(1)

    # Create a single-band GeoTIFF with the same size and data type as the input
    gtiff_driver = gdal.GetDriverByName('GTiff')
    out_ds = gtiff_driver.Create('nat_color.tif',
                                 in_band.XSize, in_band.YSize, 1,
                                 in_band.DataType)

    out_ds.SetProjection(in_ds.GetProjection())
    out_ds.SetGeoTransform(in_ds.GetGeoTransform())

    out_band = out_ds.GetRasterBand(1)
    out_band.WriteArray(in_band.ReadAsArray())
    out_ds.FlushCache()

    # Only one band was created, so compute statistics on that band alone
    out_band.ComputeStatistics(False)

    # Setting the GDAL objects to None closes them and flushes the .tif to disk
    out_band = None
    out_ds = None
    in_band = None
    in_ds = None

  • How can one remove a message? It no longer applies. I am referring to this message. Thanks.
    Question
    Started by Erlebacher
    Most recent by tgb417
    Last answer by tgb417

    @Erlebacher,

    The message you are looking at did not come through.

  • Hi all, I am deploying an API pod in Kubernetes, and our process seems to be slow. Can I find out where the Python printouts appear in the logs within the pods? Is there a place where we need to set the ht…
    Question
    Started by james147
    Most recent by JordanB
    Last answer by JordanB

    Hi @james147,

    Where are you experiencing the slowness: is it during the deployment (when you select deploy or update) or while the API is deployed? Note that API deployment requires rebuilding the code env. If you are using R, or code envs with very large Python packages, or Python packages for which precompiled binaries are not available, this can take some time. If you are not using R, you can rebuild the deployment without it.

    In DSS 10.0.6+, to obtain the pod logs (apimain.log), you can go to Administration - Cluster - Actions and run the following command:
    kubectl exec <podname> -- cat /home/dataiku/data/run/apimain.log
    To identify the pod name, go to Administration - Cluster - Monitoring and find the name(s). You may need to check each pod's logs.
    You can also manually dump apimain.log to stdout:
    cat /home/dataiku/data/run/apimain.log 2>&1
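
    To make your own Python printouts easier to find in those logs, one option, sketched here under the assumption of a custom Python endpoint function (the function and logger names are illustrative), is to use the logging module or flushed prints rather than bare print calls:

    import logging
    import sys

    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("my_endpoint")   # hypothetical logger name

    def api_py_function(param):                 # endpoint function name is illustrative
        # Log records and flushed stderr output are captured with the pod's output
        logger.info("received param: %s", param)
        print("processing started", flush=True, file=sys.stderr)
        return {"result": param}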

    "Is there a place where we need to set the http timeouts?" Can you expand on this?

    "Where can I find the dockerfile that is used to build this docker image when we do a deployment with a code env?" The image can be found within the datadir under /tmp/api_deployer; however, I'm not sure this will help troubleshoot the slowness unless you have customized it.

    If you could provide details regarding where/when this slowness is occurring, that would be great!

    Thanks!

    Jordan

  • Are there any examples on how to build a many-to-many relationship within Dataiku? I find it strange that no one appears to have even asked about this before. Any direction would be greatly app…
    Question
    Started by jrmathieu63
    Most recent by jrmathieu63
    Last answer by jrmathieu63

    We are looking for direction on whether it is possible to define a bridge table within Dataiku to support the many-to-many relationship between the other tables.

    Is this even possible?
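
    For illustration only (this is not an answer given in the thread), a bridge table is simply a dataset holding the key pairs that link the two sides, which can then be joined through. A minimal pandas sketch with hypothetical students/courses data:

    import pandas as pd

    # Hypothetical example: students and courses are related many-to-many
    students = pd.DataFrame({"student_id": [1, 2], "name": ["Ada", "Linus"]})
    courses = pd.DataFrame({"course_id": [10, 20], "title": ["SQL", "Python"]})

    # The bridge (association) table stores only pairs of foreign keys
    enrollments = pd.DataFrame({
        "student_id": [1, 1, 2],
        "course_id": [10, 20, 20],
    })

    # Resolving the relationship is then two joins through the bridge table
    resolved = (enrollments
                .merge(students, on="student_id")
                .merge(courses, on="course_id"))
    print(resolved)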
