Matrix prediction

Hello 🙂

I am studying a system that would predict the value of a house. With Dataiku I can already get quite far, the software is powerful... BUT!

I realized that the software does not support "multiple interpolated predictions" (matrix prediction).


To explain myself better, here is an example set of features:

  • House: Price
  • House: Square meters of the main floor (floor 0)
  • House: Number of floors
  • House: Luminosity (% of windows relative to concrete walls)
  • House: Garage
  • Garden: Square meters
  • Garden: Secluded (enclosed)
  • Area: Quiet index
  • Area: Convenience (proximity of schools, shops, etc.)


Today, if I want a prediction of price or floor space, I have to create a prediction, choose the features, run it, and finally I get my prediction... but it will be highly rigid and inflexible with respect to the set of input features.
(With this prediction, I couldn't estimate the price tomorrow if I don't have all the features.)


My idea:
Create a new type of automated prediction that takes the 9 features as input and outputs hundreds of predictions for each feature, one per possible configuration of the other features:

  • prediction 1: price (floor space)
  • prediction 2: price (floor space, garage, luminosity, etc.)
  • prediction 3: quiet index (price, convenience, etc.)
  • ...
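  • prediction 2295: convenience (all other features as input)


(2295 because with 9 features, that is the number of analyses needed to cover every configuration: each of the 9 features can be predicted from any non-empty subset of the other 8, so 9 × (2^8 - 1) = 2295)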
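
To make the counting concrete, here is a minimal sketch that enumerates every (target, inputs) configuration. It assumes each feature is predicted from every non-empty subset of the remaining features, which gives n × (2^(n-1) - 1) configurations for n features. The column names and the `enumerate_predictions` helper are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def enumerate_predictions(features):
    """Yield (target, inputs) pairs: each feature is predicted from
    every non-empty subset of the remaining features."""
    for target in features:
        others = [f for f in features if f != target]
        for size in range(1, len(others) + 1):
            for inputs in combinations(others, size):
                yield target, inputs


# Hypothetical column names for the 9 features listed above.
features = [
    "price", "floor_area", "floors", "luminosity", "garage",
    "garden_area", "garden_secluded", "quiet_index", "convenience",
]

pairs = list(enumerate_predictions(features))
print(len(pairs))   # 9 * (2**8 - 1) = 2295
print(pairs[0])     # ('price', ('floor_area',))
```

The count grows exponentially with the number of features, so a real implementation would probably need a way to prune configurations rather than train all of them.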


Thx in advance

Status changed to: Acknowledged

Thanks for the suggestion.

Here are some screens I made to give a better idea of what I have in mind:


New analysis screen (part 1):
You choose the new type of model (the 5th one).




New analysis screen (part 2) ==> Top, Left: "All features"
Once done, "All features" is selected at the top left. The rest doesn't change.




Design ==> Metric screen:
Since the analysis will be a set of several analyses (performed automatically), it is logical that one cannot pick a type of optimization without knowing the type of the variable being predicted.
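
One way this could work is to infer the task type for each target automatically and pick a default metric from it. The sketch below is only an assumption about how that might look: the rule (all-numeric target means regression, anything else means classification), the metric names, and the `infer_task` / `choose_metric` helpers are all hypothetical and not part of Dataiku.

```python
from numbers import Number

# Hypothetical defaults; the metric names are illustrative only.
DEFAULT_METRIC = {"regression": "rmse", "classification": "accuracy"}


def infer_task(values):
    """Treat a target as regression if every non-missing value is a
    number (booleans excluded), otherwise as classification."""
    present = [v for v in values if v is not None]
    numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in present
    )
    return "regression" if present and numeric else "classification"


def choose_metric(values):
    """Pick the default metric for a target column from its values."""
    return DEFAULT_METRIC[infer_task(values)]


print(choose_metric([250000, 310000.0, None]))  # e.g. the price column
print(choose_metric([True, False, True]))       # e.g. the garage column
```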




Design ==> "Feature's matrix" new section:
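Here, the logic of the operation is explained.
If we have an array of 4 features (A, B, C, D), then 28 predictions in total will be calculated (each of the 4 features explained by any of the 7 non-empty subsets of the other 3), for example: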
A explained by B
A explained by B and C

Of course, the more input features are used to explain a target, the more accurate the prediction should generally be (represented by a general diagram).




The choice of prediction will be made automatically based on the inputs:

If I have a dataset with only "A" and "D", then the Matrix prediction module will automatically know which prediction to choose.
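
A minimal sketch of that selection step, assuming the trained models are stored in a registry keyed by (target, set of inputs). The registry layout, the `select_prediction` helper, and the rule "use the model covering the most available columns" are assumptions for illustration, and the string values stand in for trained models.

```python
def select_prediction(registry, target, available):
    """Return the registered input set for `target` that uses the most
    of the available columns, or None if no registered model fits."""
    usable = frozenset(available) - {target}
    candidates = [
        inputs for (t, inputs) in registry
        if t == target and inputs <= usable
    ]
    return max(candidates, key=len, default=None)


# Keys are (target, frozenset of inputs); values stand in for trained models.
registry = {
    ("A", frozenset({"D"})): "model A ~ D",
    ("A", frozenset({"B", "C"})): "model A ~ B + C",
    ("A", frozenset({"B", "C", "D"})): "model A ~ B + C + D",
    ("D", frozenset({"A"})): "model D ~ A",
}

# Dataset with only columns A and D, assuming A is the column to predict:
chosen = select_prediction(registry, "A", ["A", "D"])
print(registry[("A", chosen)])  # model A ~ D
```

With the full matrix of predictions there is always an exact match for the available columns; the "largest usable subset" rule only matters if some configurations were never trained.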

The rest also needs to be thought through and developed:
Flow screen: a new logo