Outlier identification in pharmaceutical retail
When a recent month's sales value, corrected for event-log entries and lost sales, is “too high or too low” compared to “what we expect it to be” given the real demand of the last months, it is called an outlier.
The purpose of outlier identification is to point out incoherent values that should be double-checked by sales and category managers.
Definition of the expected sales value for a product
To make outlier correction simple yet efficient, an expected sales volume should be calculated for each SKU. This section defines the methodology for doing so and the underlying theory that justifies this choice.
Every product follows its own life cycle, which means its corrected sales could evolve over time along a product life cycle curve.
[Figure: product life cycle curve. Source: http://www.quickmba.com/marketing/product/lifecycle/]
Corrected sales should therefore follow a predictable pattern, referred to as the model. This assumption also holds locally in time: if market conditions stay unchanged over a given period, then the corrected sales (referred to simply as sales from now on) for that period follow a given model.
Any difference between the model and the sales is due to the random component of sales, referred to as the perturbation. Whatever probability distribution it follows, the perturbation has an average close to 0 over time (the more sales data available, the better this holds) and a given standard deviation. In this simulation, the corrected sales (green triangles) follow the same model for 20 months with an error of about ±50%. The average error is 0.05% (of model value) and the standard deviation is 0.29%.
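The decomposition of sales into a model plus a perturbation can be illustrated with a minimal sketch. It is not the simulation used for the figure: the flat model of 100 units, the uniform perturbation, the seed and the function name simulate_sales are all assumptions made for illustration, so the resulting statistics will not match the numbers quoted above.

```python
import random
import statistics


def simulate_sales(model_values, max_rel_error=0.5, seed=0):
    """Return simulated sales and their relative errors against the model.

    Assumption: the perturbation is uniform in [-max_rel_error, +max_rel_error],
    expressed as a fraction of the model value. The source does not state
    which distribution its simulation used.
    """
    rng = random.Random(seed)
    sales = [m * (1 + rng.uniform(-max_rel_error, max_rel_error))
             for m in model_values]
    rel_errors = [(s - m) / m for s, m in zip(sales, model_values)]
    return sales, rel_errors


# Hypothetical model: 20 months at a constant level of 100 units.
model = [100.0] * 20
sales, rel_errors = simulate_sales(model)

mean_error = statistics.mean(rel_errors)   # expected to be close to 0
std_error = statistics.pstdev(rel_errors)  # spread of the perturbation

print(f"average relative error:  {mean_error:.3f}")
print(f"std of relative error:   {std_error:.3f}")
```

Increasing the number of months in the hypothetical model should bring the average relative error closer to 0, which is the property the text relies on.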
If the standard deviation of the perturbation is close to (or even higher than) the value of the model (sales in quantity) over the last months, then the sales of the next month are unpredictable.
In that case, the most naive forecast, the moving average, is the best way to determine the expected value.
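The rule above can be sketched as follows, under stated assumptions. The source gives no numeric cut-off for "close to", so ratio_threshold is a hypothetical parameter, as are the window length, the function names and the sample data. The sketch covers only the unpredictable case.

```python
import statistics


def moving_average(values, window):
    """Mean of the last `window` values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return sum(values[-window:]) / window


def is_unpredictable(sales, model_values, window, ratio_threshold=1.0):
    """True when the perturbation is as large as the recent model level.

    Assumption: "close to (or even higher than)" is read as
    std(perturbation) >= ratio_threshold * recent model level;
    ratio_threshold is a hypothetical tuning parameter.
    """
    residuals = [s - m for s, m in zip(sales, model_values)]
    noise_std = statistics.pstdev(residuals)
    recent_model_level = moving_average(model_values, window)
    return noise_std >= ratio_threshold * recent_model_level


# Hypothetical erratic SKU: sales swing between 0 and about twice the model.
erratic_sales = [0, 14, 0, 13, 0, 15]
erratic_model = [7] * 6

# Hypothetical stable SKU: sales stay within a few units of the model.
stable_sales = [98, 103, 100, 97, 102, 100]
stable_model = [100] * 6

if is_unpredictable(erratic_sales, erratic_model, window=3):
    # Fall back to the naive forecast as the expected value.
    expected_value = moving_average(erratic_sales, window=3)
    print(f"expected value (moving average): {expected_value:.2f}")
```

A short window reacts faster to recent changes but is itself noisier. The right length depends on the SKU and is not specified in the source.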
If it’s not the case (perturbation