Community Conundrum 12: Flight Delays

MichaelG
Community Manager

Flight delays - frustrating for sure but are they predictable?

Attached is a dataset of just under 400,000 domestic US flights. Each flight record includes origin, destination, departure time, and (vitally for our purposes) whether the flight arrived 15 or more minutes late.

Can you use all this data to predict if a flight will be delayed? Share your most significant features in the comments!

 

PS: before you open the data, ask yourself what percentage of flights you think will have a delay of 15 minutes or more. Test your intuition before you test your DSS skills!

I hope I helped! Do you Know that if I was Useful to you or Did something Outstanding you can Show your appreciation by giving me a KUDOS?

Looking for more resources to help you use DSS effectively and upskill your knowledge? Check out these great resources: Dataiku Academy | Documentation | Knowledge Base

A reply answered your question? Mark as ‘Accepted Solution’ to help others like you!
3 Replies
tgb417

@MichaelG 

My incorrect first guess was 10%.

Others, what was your first guess on flight delays?

--Tom
MichaelG
Community Manager
Author

10% is a pretty good guess though! 

tgb417

With an imbalanced target class, do we think it is advisable to actually use all of the data? It took my little 2-core laptop between 16 and 50 minutes to compute some of the basic models. I'm going to start with a sample for my early models to cut the compute time of each trial.
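
For anyone who wants to try the same thing outside the DSS sampling settings, here is a minimal sketch of a stratified sample in plain Python. It draws the same fraction from each target class, so the delayed/on-time ratio in the sample matches the full data. The `stratified_sample` helper, the `delayed` and `flight_id` field names, and the 20% delay rate in the synthetic rows are all assumptions for illustration, not the real dataset's schema or delay rate.

```python
import random
from collections import defaultdict


def stratified_sample(rows, label_key, fraction, seed=0):
    """Draw about `fraction` of the rows from each class separately,
    so the class proportions of the sample match the full data."""
    rng = random.Random(seed)  # fixed seed so every trial sees the same sample

    # Group the rows by their target class.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[label_key]].append(row)

    # Sample the same fraction from every class (at least one row each).
    sample = []
    for group in by_class.values():
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))

    rng.shuffle(sample)  # avoid leaving the rows ordered by class
    return sample


# Synthetic stand-in for the flight data. The 20% delay rate is made up.
flights = [{"flight_id": i, "delayed": i % 5 == 0} for i in range(10_000)]

subset = stratified_sample(flights, "delayed", fraction=0.1)
delayed_share = sum(row["delayed"] for row in subset) / len(subset)
print(len(subset), delayed_share)
```

A plain random sample of this size would usually land close to the same ratio; stratifying just removes that bit of luck, which matters more as the sample gets smaller or the minority class gets rarer.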

--Tom