Anomaly detection | Need Expertise

adf057 Registered Posts: 6 ✭✭✭

I have a problem where I need to detect anomalies in a daily shipment report.

The shipment report contains planned shipments from location A to location B.

--- key fields ---

shipment id

material id

shipment created date

shipment created time

quantity

amount

international/domestic

material type

carrier name

business unit

source location

destination location

distance

--- end key fields ---

I then preprocessed my data, extracted date and time components, and created some new features, such as:

price per unit

price per km

ratio of distance/quantity
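
For reference, a minimal sketch of how these three derived features could be computed with pandas. The column names and sample rows are hypothetical, not taken from the actual report, and the `safe_ratio` helper is only an illustration of one way to avoid division by zero when quantity or distance is 0.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; column names are assumed, not from the real report.
shipments = pd.DataFrame({
    "shipment_id": ["S1", "S2", "S3"],
    "quantity": [10, 0, 4],
    "amount": [250.0, 90.0, 100.0],
    "distance": [500.0, 120.0, 0.0],
})

def safe_ratio(numerator, denominator):
    """Divide two Series, yielding NaN wherever the denominator is zero."""
    return numerator / denominator.where(denominator != 0)

shipments["price_per_unit"] = safe_ratio(shipments["amount"], shipments["quantity"])
shipments["price_per_km"] = safe_ratio(shipments["amount"], shipments["distance"])
shipments["distance_per_unit"] = safe_ratio(shipments["distance"], shipments["quantity"])

print(shipments)
```

Rows with a zero quantity or distance end up with NaN in the affected ratio, so they would need to be imputed or dropped before being passed to a model that does not accept missing values.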


Modeling:

I ran my Isolation Forest model with PCA enabled and disabled, and my silhouette scores are as follows:

1. PCA enabled : 0.56

2. PCA disabled : 0.13
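
To make the comparison concrete, here is a rough sketch of one way such scores could be produced with scikit-learn. It is an assumption about the setup, not a record of what I actually ran: the data is synthetic, the contamination rate and the number of PCA components are arbitrary, and the silhouette score is computed on the inlier/outlier labels returned by the model.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-in for the engineered features: 500 typical rows, 15 extreme rows.
typical = rng.normal(0.0, 1.0, size=(500, 6))
extreme = rng.normal(6.0, 1.0, size=(15, 6))
X = StandardScaler().fit_transform(np.vstack([typical, extreme]))

def score_isolation_forest(features, contamination=0.03):
    """Fit an Isolation Forest and score its inlier/outlier split."""
    model = IsolationForest(contamination=contamination, random_state=0)
    labels = model.fit_predict(features)  # 1 = inlier, -1 = outlier
    return labels, silhouette_score(features, labels)

# Without PCA: silhouette is measured in the full scaled feature space.
labels_raw, sil_raw = score_isolation_forest(X)

# With PCA: silhouette is measured in the reduced 2-component space.
X_pca = PCA(n_components=2, random_state=0).fit_transform(X)
labels_pca, sil_pca = score_isolation_forest(X_pca)

print(f"PCA disabled: {sil_raw:.2f}")
print(f"PCA enabled:  {sil_pca:.2f}")
```

One caveat with this kind of comparison: the two scores are measured in different feature spaces (full versus reduced), so a higher score with PCA does not necessarily mean the flagged anomalies are better.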


Additionally, materials are not shipped on a daily basis; shipment frequency depends entirely on demand.

My input dataset for the model has 9,000 rows, covering the last 13 months.

I am still thinking about what new features I can create, and whether aggregates and lags would be of any business value.
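
As an illustration of what lag and aggregate features might look like when shipments are irregular, here is a small pandas sketch. The column names, the sample data, and the 30-day window are all assumptions made for the example. It computes the previous shipment's quantity, the gap in days since that shipment, and the mean quantity over the preceding 30 days, each per material.

```python
import pandas as pd

# Hypothetical shipments for two materials on irregular dates (names assumed).
shipments = pd.DataFrame({
    "material_id": ["M1", "M1", "M1", "M2", "M2"],
    "created_date": pd.to_datetime(
        ["2024-01-02", "2024-01-20", "2024-03-15", "2024-01-05", "2024-01-06"]
    ),
    "quantity": [10.0, 12.0, 40.0, 5.0, 7.0],
}).sort_values(["material_id", "created_date"]).reset_index(drop=True)

by_material = shipments.groupby("material_id")

# Lag features: previous shipment's quantity and days elapsed since it.
shipments["prev_quantity"] = by_material["quantity"].shift(1)
shipments["days_since_prev"] = by_material["created_date"].diff().dt.days

# Time-based aggregate: mean quantity over the prior 30 days for the same
# material. closed="left" excludes the current row from its own window.
rolling_mean = (
    shipments.set_index("created_date")
    .groupby("material_id")["quantity"]
    .rolling("30D", closed="left")
    .mean()
)
# The frame is already sorted by material and date, so positions line up.
shipments["mean_qty_prev_30d"] = rolling_mean.to_numpy()

print(shipments)
```

A time-based window is used here rather than a fixed row count because the gap between consecutive shipments varies with demand. Rows with no earlier shipment in the window get NaN.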


Also, when I train my model, should I exclude the base features (amount, quantity, and distance), since I have created new features from them?

Any help or feedback would be appreciated.
