Community Conundrum 12: Flight Delays

MichaelG
Community Manager

Flight delays - frustrating for sure but are they predictable?

Attached is a dataset of just under 400,000 domestic US flights. Each flight has some useful data on origin, destination, time of departure, and (vitally for our purposes) whether the flight had an arrival delay of 15 minutes or more.

Can you use all this data to predict if a flight will be delayed? Share your most significant features in the comments!

 

PS: Before you open the data, ask yourself: what percentage of flights do you think will have a delay of 15 minutes or more? Test your intuition before you test your DSS skills!

I hope I helped! Do you Know that if I was Useful to you or Did something Outstanding you can Show your appreciation by giving me a KUDOS?

Looking for more resources to help you use DSS effectively and upskill your knowledge? Check out these great resources: Dataiku Academy | Documentation | Knowledge Base

A reply answered your question? Mark as ‘Accepted Solution’ to help others like you!
3 Replies
tgb417

@MichaelG 

My incorrect first guess was 10%.

Others, what is your first guess on flight delays?

--Tom
MichaelG
Community Manager
Author

10% is a pretty good guess though! 

tgb417

With the imbalanced target class, do we think it is advisable to actually use all of the data? It took my little 2-core laptop between 16 and 50 minutes to compute some of the basic models. I'm going to start out with a sample for some of my early models in order to decrease compute time for each trial.
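
For anyone wanting to try the same thing outside DSS, here is a rough sketch of one way to sample when the target is imbalanced: draw the same fraction from each class so the delayed/on-time ratio in the sample matches the full data. This is plain Python, not a DSS feature; the `stratified_sample` helper, the `delayed` column name, and the toy data (with a made-up 15% delay rate) are all assumptions for illustration, not taken from the attached dataset.

```python
import random
from collections import defaultdict


def stratified_sample(rows, target, frac, seed=0):
    """Sample the same fraction of rows from each target class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[target]].append(row)

    sample = []
    for group in by_class.values():
        # Keep at least one row per class so no class vanishes from the sample.
        k = max(1, round(len(group) * frac))
        sample.extend(rng.sample(group, k))

    rng.shuffle(sample)  # avoid having the classes grouped together
    return sample


# Synthetic stand-in data: the 15% delay rate is invented for this example.
flights = [{"flight_id": i, "delayed": int(i % 20 < 3)} for i in range(2000)]

subset = stratified_sample(flights, target="delayed", frac=0.1)

rate_full = sum(r["delayed"] for r in flights) / len(flights)
rate_subset = sum(r["delayed"] for r in subset) / len(subset)
print(len(subset), rate_full, rate_subset)
```

The class ratio in the subset matches the full data, so early model comparisons on the sample should at least see the same imbalance as the full set, at a fraction of the compute time.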

--Tom