Anomaly detection | Need Expertise

adf057
Level 2
I need to detect anomalies in a daily shipment report.

The shipment report contains planned shipments from location A to location B.

---- key fields ----

shipment id

material id

shipment created date

shipment created time

quantity

amount

international/domestic

material type

Carrier name

business unit

source location

destination location

distance 

---- end key fields ----

I then preprocessed the data, extracted date and time components, and created some new features, such as:

price per unit

price per km

ratio of distance/quantity 
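
For reference, here is a minimal pandas sketch of how features like these might be derived. The column names and the three sample rows are hypothetical stand-ins, not the real report schema. Zero quantities or distances are mapped to NaN so the ratios do not become infinite; whether that is the right treatment depends on the data.

```python
import numpy as np
import pandas as pd

# Hypothetical column names and made-up rows; adjust to the real shipment report.
shipments = pd.DataFrame({
    "shipment_id": [1, 2, 3],
    "created_date": ["2023-01-02", "2023-01-02", "2023-01-05"],
    "created_time": ["08:15:00", "13:40:00", "22:05:00"],
    "quantity": [10, 0, 25],
    "amount": [500.0, 120.0, 1000.0],
    "distance": [100.0, 50.0, 0.0],
})


def add_features(df):
    out = df.copy()
    # Combine the separate date and time fields into one timestamp.
    created = pd.to_datetime(out["created_date"] + " " + out["created_time"])
    out["day_of_week"] = created.dt.dayofweek  # Monday = 0
    out["hour"] = created.dt.hour
    out["month"] = created.dt.month
    # Replace zero denominators with NaN so ratios become NaN rather than inf.
    qty = out["quantity"].replace(0, np.nan)
    dist = out["distance"].replace(0, np.nan)
    out["price_per_unit"] = out["amount"] / qty
    out["price_per_km"] = out["amount"] / dist
    out["distance_per_unit"] = out["distance"] / qty
    return out


features = add_features(shipments)
print(features[["price_per_unit", "price_per_km", "distance_per_unit"]])
```

Rows with NaN ratios would need imputing or dropping before modeling, since scikit-learn's Isolation Forest does not accept missing values.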


Modeling:

I ran an Isolation Forest model with PCA enabled and with PCA disabled. The silhouette scores were:

1. PCA enabled : 0.56

2. PCA disabled : 0.13
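
For anyone who wants to reproduce this kind of comparison outside the visual ML tool, below is a rough scikit-learn sketch. It runs on synthetic data, not the shipment data, and it assumes the silhouette score treats the anomaly and normal rows as two clusters, which may not match how the tool computes it. Both runs are scored in the same scaled feature space, because silhouette values computed in different spaces (raw versus PCA-reduced) are not directly comparable.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in data: 500 typical rows plus 15 shifted outliers, 6 features.
X = np.vstack([
    rng.normal(0, 1, size=(500, 6)),
    rng.normal(6, 1, size=(15, 6)),
])
X_scaled = StandardScaler().fit_transform(X)


def fit_and_score(X_model, X_eval):
    """Fit on X_model, then score the anomaly/normal split in X_eval space."""
    model = IsolationForest(n_estimators=200, contamination=0.03, random_state=0)
    labels = model.fit_predict(X_model)  # -1 = anomaly, 1 = normal
    return labels, silhouette_score(X_eval, labels)


# Without PCA: fit and evaluate on the scaled features.
labels_raw, sil_raw = fit_and_score(X_scaled, X_scaled)

# With PCA: keep enough components to explain 90% of the variance
# (the 0.9 threshold is an arbitrary illustrative choice).
X_pca = PCA(n_components=0.9, svd_solver="full").fit_transform(X_scaled)
labels_pca, sil_pca = fit_and_score(X_pca, X_scaled)

print(f"silhouette without PCA: {sil_raw:.2f}")
print(f"silhouette with PCA:    {sil_pca:.2f}")
```

The `contamination` value here is a guess at the expected anomaly share; it directly controls how many rows get flagged, so it also influences the silhouette score.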


Additionally, materials are not shipped on a daily basis; shipment frequency depends entirely on demand.

The input dataset for the model has 9,000 rows covering the last 13 months.

I am still considering which new features to create, and whether aggregates and lags would add any business value.
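
To make the aggregate and lag idea concrete, here is a small pandas sketch of what such features could look like when shipments are irregular. It works per material on previous shipments instead of previous calendar days. The column names and sample rows are hypothetical, and grouping by material is only one option (a route or carrier key would work the same way).

```python
import pandas as pd

# Hypothetical columns and made-up rows for illustration only.
shipments = pd.DataFrame({
    "material_id": ["M1", "M1", "M1", "M2", "M2"],
    "created_date": pd.to_datetime(
        ["2023-01-02", "2023-01-09", "2023-02-20", "2023-01-03", "2023-01-04"]
    ),
    "quantity": [10, 12, 40, 5, 6],
})

df = shipments.sort_values(["material_id", "created_date"]).reset_index(drop=True)

# Gap in days since the previous shipment of the same material.
df["days_since_prev"] = df.groupby("material_id")["created_date"].diff().dt.days

# Lag feature: quantity of the previous shipment of the same material.
df["prev_quantity"] = df.groupby("material_id")["quantity"].shift(1)

# Aggregate feature: mean quantity over all earlier shipments of the material.
# shift(1) keeps the current row out of its own history.
df["hist_mean_qty"] = df.groupby("material_id")["quantity"].transform(
    lambda s: s.shift(1).expanding().mean()
)

# How far the current shipment is from the material's own history.
df["qty_vs_hist"] = df["quantity"] / df["hist_mean_qty"]

print(df)
```

The first shipment of each material has no history, so its lag and aggregate values are NaN and need a fill strategy before modeling.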


Also, when I train the model, should I exclude the base features (amount, quantity, and distance), since I have already derived new features from them?
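
One way to explore this empirically is to fit the model with and without the base features and compare which rows get flagged. The sketch below does that on synthetic data with hypothetical column names and an assumed 5% contamination rate. It does not answer the question on its own, but a low overlap would suggest the base features materially change what the model treats as unusual.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
n = 400
# Synthetic stand-in data; not the real shipment report.
df = pd.DataFrame({
    "quantity": rng.integers(1, 100, size=n).astype(float),
    "distance": rng.uniform(10, 1000, size=n),
})
df["amount"] = df["quantity"] * rng.normal(20, 2, size=n)
df["price_per_unit"] = df["amount"] / df["quantity"]
df["price_per_km"] = df["amount"] / df["distance"]

base = ["quantity", "distance", "amount"]
derived = ["price_per_unit", "price_per_km"]


def flagged_rows(columns):
    """Return the index labels that the model marks as anomalies."""
    model = IsolationForest(contamination=0.05, random_state=0)
    labels = model.fit_predict(df[columns])  # -1 = anomaly, 1 = normal
    return set(df.index[labels == -1])


with_base = flagged_rows(base + derived)
derived_only = flagged_rows(derived)

# Jaccard overlap between the two sets of flagged rows.
overlap = len(with_base & derived_only) / len(with_base | derived_only)
print(f"flagged with base features: {len(with_base)}")
print(f"flagged with derived only:  {len(derived_only)}")
print(f"Jaccard overlap:            {overlap:.2f}")
```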

Any help or feedback would be appreciated.
