Hi,
I am used to analysing regression coefficients in R, and I am a little confused about how to do the same in Dataiku. For instance, if I fit a regression on the Iris dataset to explain sepal length with the species and the petal length, I get:
Call:
lm(formula = iris$Sepal.Length ~ iris$Petal.Length + iris$Species)
Residuals:
       Min        1Q    Median        3Q       Max
  -0.75310  -0.23142  -0.00081   0.23085   1.03100
Coefficients:
                         Estimate Std. Error t value Pr(>|t|)
(Intercept)               3.68353    0.10610  34.719  < 2e-16 ***
iris$Petal.Length         0.90456    0.06479  13.962  < 2e-16 ***
iris$Speciesversicolor   -1.60097    0.19347  -8.275 7.37e-14 ***
iris$Speciesvirginica    -2.11767    0.27346  -7.744 1.48e-12 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.338 on 146 degrees of freedom
Multiple R-squared: 0.8367, Adjusted R-squared: 0.8334
F-statistic: 249.4 on 3 and 146 DF, p-value: < 2.2e-16
The two regression coefficients iris$Speciesversicolor and iris$Speciesvirginica are to be compared with the species taken as reference (setosa). That is, iris$Speciesvirginica is the difference in mean sepal length between the species virginica and setosa, at a given petal length.
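To make this reading of the reference level concrete, here is a minimal sketch using R's built-in iris data. It assumes R's default treatment contrasts are in effect, and it uses the `data = iris` form of the formula, so the coefficient names come out as `Speciesvirginica` rather than `iris$Speciesvirginica`. The names `reference`, `iris_virginica_ref` and `fit_virginica_ref` are just mine for illustration.

```r
# Sketch on R's built-in iris data; assumes the default treatment contrasts.
fit <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)

# The first factor level is the reference: it gets no coefficient of its own.
reference <- levels(iris$Species)[1]
print(reference)

# Dummy coding applied to Species: the reference level is the all-zero row.
print(contrasts(iris$Species))

# Refit with virginica as the reference level instead of setosa.
iris_virginica_ref <- transform(iris, Species = relevel(Species, ref = "virginica"))
fit_virginica_ref <- lm(Sepal.Length ~ Petal.Length + Species, data = iris_virginica_ref)

# The species coefficients change with the reference level...
print(coef(fit))
print(coef(fit_virginica_ref))

# ...but the fitted values are the same, so it is the same model.
print(all.equal(fitted(fit), fitted(fit_virginica_ref)))
```

So the species coefficients can only be read once the reference level is known, which is exactly the information I am missing on the Dataiku side.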
In Dataiku, I get three coefficients for the species and I don't know which one is the reference. Besides, none of my coefficients are significant in Dataiku, whereas they are all significant in R:
species = Iris-virginica    ☆☆☆  9.01e-2   1.3485   0.3129
species = Iris-setosa       ☆☆☆  8.16e-2  -1.4032  -0.3081
petal_l                     ☆☆☆  4.02e-1   0.2486   0.2450
species = Iris-versicolor   ☆☆☆  4.57e-1  -0.1091  -0.0233
Intercept                                           5.8531
Could you explain why?