-
Web logs analysis
Hello, I am writing to you again. There is another part where I don't understand what I have to do: Feature engineering the referrer URLs - Split URL in referer, extracting only the hostname - Use the Find and replace processor on referer_host, replacing t.co with twitter.com and matching on the complete value of the string - In…
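To make the two quoted steps concrete, here is a minimal plain-Python sketch of what they appear to ask for. It is not the Dataiku processors themselves. The column names `referer` and `referer_host` come from the post; the helper names `extract_host` and `normalize_host` and the sample rows are hypothetical.

```python
from urllib.parse import urlparse

def extract_host(referer):
    """Return only the hostname of a referrer URL, or None if there is none."""
    if not referer:
        return None
    return urlparse(referer).hostname

def normalize_host(host, mapping=None):
    """Replace a host only when the complete value matches a key in mapping."""
    mapping = {"t.co": "twitter.com"} if mapping is None else mapping
    return mapping.get(host, host)

# Hypothetical sample rows, for illustration only.
rows = [
    {"referer": "http://t.co/abc123"},
    {"referer": "https://www.google.com/search?q=dataiku"},
    {"referer": "http://at.co.example/page"},
    {"referer": ""},
]
for row in rows:
    row["referer_host"] = normalize_host(extract_host(row["referer"]))
```

Matching on the complete value means a host such as `at.co.example` is left untouched, whereas a substring match would rewrite part of it.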
-
Determine whether or not data contains noise
Dataiku's time series prediction learns waveform patterns and generates a model. Using that model, is it possible to input waveform data as test data and determine whether or not the test data contains noise? The purpose is to do something like data cleansing to determine whether the waveform input as test data is usable…
-
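Checking the quality of time series data
With Dataiku's time series forecasting, is it possible to learn waveform patterns and generate a model, then input waveform data as test data and determine whether or not that test data contains noise? The goal is something like data cleansing: determining whether the waveform input as test data is usable. If this is possible, I would appreciate it if you could explain the steps. Thank you in advance.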
-
Sampling Techniques for AutoML Lab Recipe
Hi - I am new to using the Dataiku modeling interface. I am looking to create a two-class classification algorithm using the 'AutoML' feature in the Visual ML lab section. The data I am using has a large class imbalance (80% negative & 20% positive). Where/how can I make sure the data is being rebalanced correctly? i.e.…
-
Problem with web logs analysis
Hello. I have a problem with the training: the course asks us to do the following: * Cleaning * Remove four columns: br_width, br_height, sc_width and sc_height * Rename the column client_addr to ip_address * Clear invalid cells in the ip_address column for the IP address meaning * Rename the column location to url How do I make those…
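For readers unsure what the listed cleaning steps do to the data, the following is a rough plain-Python sketch of the same transformations. It is not the Dataiku Prepare recipe. The column names are taken from the post; `clean_row`, `is_valid_ip`, and the sample row are hypothetical, and treating "invalid for the IP address meaning" as "does not parse as an IPv4 or IPv6 address" is an assumption.

```python
import ipaddress

DROP_COLUMNS = {"br_width", "br_height", "sc_width", "sc_height"}
RENAME = {"client_addr": "ip_address", "location": "url"}

def is_valid_ip(value):
    """True if value is a string that parses as an IPv4 or IPv6 address."""
    if not isinstance(value, str):
        return False
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False

def clean_row(row):
    """Drop the four size columns, rename two columns, clear invalid IPs."""
    cleaned = {RENAME.get(k, k): v for k, v in row.items() if k not in DROP_COLUMNS}
    if not is_valid_ip(cleaned.get("ip_address")):
        cleaned["ip_address"] = None  # clear the cell, keep the row
    return cleaned

# Hypothetical sample row, for illustration only.
sample = {
    "client_addr": "999.1.1.1",
    "location": "http://example.com/",
    "br_width": 1280, "br_height": 720, "sc_width": 1920, "sc_height": 1080,
}
result = clean_row(sample)
```

Clearing the cell rather than dropping the row keeps the rest of the log record available for later steps.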
-
Error while redeploying model from a notebook
I am trying to build an adjusting model in Dataiku. I want to leverage the APIs to do the same. I already have a model deployed in the flow. After each data refresh, I want to check the model's performance, and if it is below a threshold, I want to retrain the model. Below is the code I am using: if trained_model_MAPE >…
-
When training a model with a visual recipe, does Dataiku fit the model on the entire dataset?
Context: * I have deployed a model to the flow * I want to retrain that model with its associated "train" recipe * I understand that the model's performance is evaluated using a test set or K-folds under a cross-validation strategy My question: after retraining the model using the "train" recipe, is the resulting new…
-
Load from Oracle to Vertica
Hi everyone, How can I load data from an Oracle database to Vertica without dropping the destination table each time I run the process? I tried to use the Sync recipe, but each time I run the flow, Dataiku recreates the table in Vertica instead of appending the new rows. I already tried with the configurations of free…
-
R recipe dkuWriteDataset "Missing Activity ID" Error
Hello, I am using an R recipe on Dataiku v12.5.1. At the end of the code block I am using the dkuWriteDataset function to write results to a dataset. I am getting the error below. [17:50:01] [INFO] [dku.utils] - > dkuWriteDataset(All_preds_cutoff_Final,"sow_output_deneme") [17:50:01] [INFO] [dku.utils] - Start writing table to file ...…
-
ML Model Training Failed
Hello, I attempted to execute a partitioned XGBoost model using a Python 3.8 code environment. The log files containing the details are enclosed in the attached .txt file, along with the training diagnosis in a zip file. Could you assist me in understanding the underlying cause and suggest potential solutions to address this…