Anomaly Detection is the identification of outliers or exceptions that do not conform to an expected pattern. It is a widely used concept in statistics and machine learning, with applications in a variety of fields. There are many ways to implement it, and the right choice generally depends on the type of data and on how an anomaly is defined for the dataset.
At dv01, we chose to apply a more traditional statistical method to define and detect anomalies within loan performance. Under the assumption that performance metrics are distributed normally, we calculate how loan pool performance differs from its historical mean at each point in time.
Using dv01's immense loan database, we construct cohorts of loans, grouping them by platform, program, term, grade, and age. We use this grouping because our research has shown that loans within these groupings tend to perform similarly. For each performance metric, we find the mean and volatility of each cohort for each month, as well as across all months. We are then able to match each loan in a given pool to its respective cohort's historical means and volatilities. (Since loans purchased into securitizations and portfolios are almost always current, we make sure to match them to a cohort that contains only loans that were current, and of the same age as the securitization/pool loan, at the purchase date.) Aggregating the historical performance numbers of each loan's matched cohort gives an overall historical mean and volatility at each date. Aggregating the loan pool's own performance and comparing it to that aggregated historical mean and volatility gives a z-score: the number of standard deviations between the pool's performance and its historical cohort average.
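The matching and comparison steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not dv01's implementation: the cohort keys, the figures, and the function name `pool_z_score` are hypothetical, and because the article does not say how cohort numbers are weighted when they are aggregated, the sketch uses a simple equal-weighted average.

```python
from statistics import mean

# Hypothetical cohort history keyed by (platform, program, term, grade, age).
# Each value is (historical mean, historical volatility) for one metric.
COHORT_HISTORY = {
    ("PlatformA", "Prime", 36, "A", 6): (0.010, 0.004),
    ("PlatformA", "Prime", 36, "B", 6): (0.020, 0.006),
}


def pool_z_score(loans, actual, cohort_history):
    """Return (expected, volatility, z_score) for a pool of loans.

    Assumption: matched cohort figures are combined with an equal-weighted
    average. The real weighting scheme is not described in the article.
    """
    # Match every loan to the history of its cohort.
    matched = [cohort_history[loan] for loan in loans]
    expected = mean(m for m, _ in matched)
    volatility = mean(v for _, v in matched)
    # Standard deviations between the pool's actual value and its expected value.
    z_score = (actual - expected) / volatility
    return expected, volatility, z_score


# A two-loan pool, each loan described by its cohort key.
pool = [
    ("PlatformA", "Prime", 36, "A", 6),
    ("PlatformA", "Prime", 36, "B", 6),
]
expected, volatility, z_score = pool_z_score(
    pool, actual=0.025, cohort_history=COHORT_HISTORY
)
```

With these made-up inputs the pool's actual value sits about two standard deviations above its expected value.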
Available Platforms & Programs
Due to the statistical nature of this tool, a long history is required for the calculations to be meaningful. Thus, only a subset of platforms and programs is covered by Anomaly Detection. Currently, only loans within the consumer unsecured asset class are available; the platforms and programs covered are:
- Lending Club Super Prime/Prime/Near Prime
- Prosper Core/Extended
Using Anomaly Detection
Anomaly Detection is available in our Intelligence product. Choose the loan pool you wish to explore, then select the performance type you wish to explore. The resulting graph shows how the selected loan pool's performance at each date compares with its historical mean and volatility.
Clicking on a bar in the chart or using the dropdown menu below it allows for deeper exploration into anomalies within the selected date. It is important to note that months with seemingly low deviation might have subsets that deviate strongly from their cohort history. By default, the loans are stratified by loan term (this can be changed), and up to two more strats can be selected using the Attribution Factor dropdown menus. The table generated shows how the loans in each stratification compare in performance to the historical mean and volatility of the same stratification.
The Actual value is the performance for the given stratification for the selected date, while the Expected value is the historical performance mean. The Z-Score is the number of standard deviations the Actual performance is from the Expected performance.
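Written as a formula, the relationship between the three columns looks like this. The symbol σ for the historical volatility is a notational assumption; the article names the quantity but does not assign it a symbol.

```latex
\[
  Z = \frac{\text{Actual} - \text{Expected}}{\sigma}
\]
```

For a hypothetical stratification with Actual = 2.5%, Expected = 1.5%, and σ = 0.5%, the Z-Score is (2.5 − 1.5) / 0.5 = 2.0. Under the normality assumption described earlier, a value more than two standard deviations from the mean occurs roughly 5% of the time.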
If the stratification produces many small groups of loans, they can be filtered out by entering a value in the Min # Loans box. Results can also be sorted in multiple ways: Diff from Expected (the difference between the Actual and Expected values), Highest Z-Score, Lowest Z-Score, and Absolute Value of Z-Score.
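To make the filter and sort options concrete, here is a minimal Python sketch of the same logic applied to a table of stratification rows. Everything in it is hypothetical: the `StratRow` fields, the option names, and the sample rows are illustrative, and two behaviours are assumptions rather than documented product behaviour (the Min # Loans threshold is treated as inclusive, and Diff from Expected is sorted largest first).

```python
from typing import NamedTuple


class StratRow(NamedTuple):
    name: str
    n_loans: int
    actual: float
    expected: float
    z_score: float


# Each option maps to (sort key, descending?). Names are illustrative only.
SORT_OPTIONS = {
    "diff_from_expected": (lambda r: r.actual - r.expected, True),  # assumed descending
    "highest_z": (lambda r: r.z_score, True),
    "lowest_z": (lambda r: r.z_score, False),
    "abs_z": (lambda r: abs(r.z_score), True),
}


def filter_and_sort(rows, min_loans=0, sort_by="abs_z"):
    """Drop stratifications with fewer than min_loans loans, then sort."""
    key, descending = SORT_OPTIONS[sort_by]
    kept = [r for r in rows if r.n_loans >= min_loans]  # assumed inclusive threshold
    return sorted(kept, key=key, reverse=descending)


# Made-up rows: the third one is a tiny group with an extreme z-score.
rows = [
    StratRow("36 months", 1200, 0.025, 0.015, 2.0),
    StratRow("60 months", 800, 0.010, 0.018, -1.6),
    StratRow("36 months / grade G", 12, 0.090, 0.030, 3.5),
]

by_abs_z = [r.name for r in filter_and_sort(rows, min_loans=100, sort_by="abs_z")]
by_lowest_z = [r.name for r in filter_and_sort(rows, min_loans=100, sort_by="lowest_z")]
unfiltered = [r.name for r in filter_and_sort(rows, sort_by="highest_z")]
```

With a minimum of 100 loans, the 12-loan group drops out of the results; without the filter, it would top the Highest Z-Score ordering.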