How to access the Jupyter notebook setup for Dataiku?
Answers
-
Hi,
You can find the ipython log file in the DSS administration screen, under Maintenance > Log files > ipython.log. Alternatively, if you have command-line access to the server, the file is located at <DSS_DATA_DIR>/run/ipython.log.
Having said that, it is a bit surprising to get a memory crash on a 1GB CSV file. Assuming you use pandas to load and transform it, a good rule of thumb is to have 5 to 10 times the dataset size in free memory, so 5-10GB here (see http://wesmckinney.com/blog/apache-arrow-pandas-internals/). Have you checked that you are not printing too much to the notebook output? [EDIT] Sorry, I read too quickly and did not notice that you were using R. Could you check the object.size of the data once the CSV file is loaded into an R object?
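In case it helps, here is a minimal sketch of that check in base R. The file and its columns are made up for illustration (a tiny temporary CSV stands in for your real one), so point `csv_path` at your own file instead.

```r
# Hypothetical stand-in for the real CSV; replace csv_path with your own file.
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(Page = c("a", "b"), Visits = c(1, 2)),
  csv_path,
  row.names = FALSE
)

# Load the file and measure how much memory the resulting object occupies.
X <- read.csv(csv_path)
size_bytes <- object.size(X)
print(size_bytes, units = "MB")

# gc() also reports the memory currently used by the R session.
gc()
```

If the reported size is already several times larger than the file on disk, any later reshaping step will need a multiple of that again.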
Are you at liberty to share your code and underlying data?
Cheers,
Alex -
Sure. For fun, I tried some forecasting analysis by rerunning the following code in a Dataiku R notebook: https://www.kaggle.com/merckel/preliminary-investigation-holtwinters-arima/data
I found that the part that crashes is the melt of the forecast results (probably unique() is too much to handle here):
```
meltX <- melt(
  X[, which(names(X) %in% c(unique(keys$Date), "Page")), with = FALSE],
  measure.vars = unique(keys$Date),
  variable.name = "Date",
  value.name = "Visits")
meltX$Date <- as.character(meltX$Date)
```
-
Which package does the melt function come from?