# Transition some coding steps to Dataiku Recipe

Hello,

My team previously built a machine learning model in code, and I am now transitioning its steps to Dataiku recipes.

I would like to know whether I can use visual recipes to replicate the same data processing, or whether I have to stick with R.

i. Grouping Stage

Code written in R

```r
library(dplyr)

# Share of rows per category; keep categories above 0.5% of all rows
jf <- dataS %>%
  group_by(COLUMN_NAME) %>%
  summarise(count_jf = n()) %>%
  mutate(Per = prop.table(count_jf)) %>%
  arrange(desc(Per)) %>%
  filter(Per > 0.005)

# Bucket column: categories not present in jf become 'OTHER'
dataS$BUCKET_COLUMN_NAME <- ifelse(dataS$COLUMN_NAME %in% jf$COLUMN_NAME,
                                   dataS$JCOLUMN_NAME, 'OTHER')

# Median of COLUMN_NAME2 per bucket
NEW_BUCKET_COLUMN_NAME <- dataS %>%
  group_by(BUCKET_COLUMN_NAME) %>%
  summarise(MED_NEW_BUCKET_COLUMN_NAME = median(COLUMN_NAME2))
```

Basically, this creates some new columns based on grouping. I think I can do this with the Group recipe (with computed columns in it). The only issue in this step is the percentile: is there a way to get the top/bottom 5% and eliminate them?
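To make the expected result concrete, here is a minimal base-R sketch of the same stage on invented data. The `toy` data frame and its values are made up for illustration, and I assume that kept categories retain their original `COLUMN_NAME` value in the bucket column.

```r
# Toy data: column names mirror the snippet above, values are invented.
toy <- data.frame(
  COLUMN_NAME  = c(rep("A", 600), rep("B", 397), rep("C", 3)),
  COLUMN_NAME2 = c(rep(10, 600), rep(20, 397), rep(1000, 3)),
  stringsAsFactors = FALSE
)

# Share of rows per category (count divided by total row count).
share <- prop.table(table(toy$COLUMN_NAME))

# Categories whose share exceeds 0.5% keep their own bucket.
keep <- names(share)[as.vector(share) > 0.005]

# Every other category is folded into "OTHER" (assumption: kept
# categories reuse their COLUMN_NAME value as the bucket label).
toy$BUCKET_COLUMN_NAME <- ifelse(toy$COLUMN_NAME %in% keep,
                                 toy$COLUMN_NAME, "OTHER")

# Median of COLUMN_NAME2 per bucket.
med <- aggregate(COLUMN_NAME2 ~ BUCKET_COLUMN_NAME, data = toy, FUN = median)
print(med)
```

Here category `C` covers only 0.3% of the rows, so it ends up in `OTHER`. The filter works on each category's share of all rows, so any replacement needs both the per-group count and the overall row count.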

ii. Removing outliers

```r
# Cap outliers: values more than 1.5 * IQR beyond the quartiles are
# replaced with the 5th / 95th percentile
outlier_norm <- function(x) {
  qntile <- quantile(x, probs = c(.25, .75), na.rm = TRUE)
  caps <- quantile(x, probs = c(.05, .95), na.rm = TRUE)
  H <- 1.5 * IQR(x, na.rm = TRUE)
  x[x < (qntile[1] - H)] <- caps[1]
  x[x > (qntile[2] + H)] <- caps[2]
  return(x)
}
```

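This function handles outliers by capping rather than dropping them: values below Q1 - 1.5 × IQR are replaced with the 5th percentile, and values above Q3 + 1.5 × IQR are replaced with the 95th percentile. For this one, I don't know which recipe I can use to perform the same calculation. Can anyone tell me whether this can be replicated with a Dataiku recipe?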
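For reference, here is a small worked run of that logic on an invented vector, so the expected behaviour is clear. The input values are made up, and the function is repeated only so the snippet runs on its own.

```r
# Same capping logic as above, repeated so this snippet is self-contained.
outlier_norm <- function(x) {
  qntile <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
  caps   <- quantile(x, probs = c(0.05, 0.95), na.rm = TRUE)
  H <- 1.5 * IQR(x, na.rm = TRUE)
  x[x < (qntile[1] - H)] <- caps[1]
  x[x > (qntile[2] + H)] <- caps[2]
  x
}

# Invented input: twenty ordinary values plus one extreme value.
x <- c(1:20, 100)

# Fences used to decide what counts as an outlier.
q <- quantile(x, probs = c(0.25, 0.75))
lower_fence <- q[1] - 1.5 * IQR(x)
upper_fence <- q[2] + 1.5 * IQR(x)

capped <- outlier_norm(x)
print(capped)
```

In this run only the value 100 lies outside the fences, so it is replaced by the 95th percentile of `x`, while the other twenty values are left unchanged. Any recipe-based replacement would need the same four quantiles (5%, 25%, 75%, 95%) per column before the capping rule can be applied.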

Thank you very much for reading. Hopefully I can get some answers for these questions.

Best,

Tim
