Response to consultation on supervisory handbook on the validation of rating systems under the Internal Ratings Based approach


1a) How is the split between the first and the subsequent validation implemented in your institution?

NA

1b) Do you see any constraints in implementing the proposed expectations (i) as described in section 4 for the first validation for a) newly developed models; and b) model changes; and (ii) as described in section 5 for the subsequent validation of unchanged models?

The proposal seems consistent with the generally adopted definitions of model change, new models, etc., and also appears to be in line with how internal models are inspected. Therefore, we see no particular constraints in implementing this differentiation.

Question 2: For rating systems that are used and validated across different entities, do you have a particular process in place to share the findings of all relevant validation functions? Do you apply a singular set of remedial action across all the entities or are there cases where remedial actions are tailor-made to each level of application?

In general, a deficiency in a model remains a deficiency regardless of the level at which the model is applied. The management of findings should therefore preferably be unified, applying a single set of remedial actions across all the entities.
Possibly, downstream of the validation process, some findings could be identified as not material / not relevant for specific portfolios or sub-portfolios, with the aim of not invalidating the assessment of a model on a given perimeter because of deficiencies that do not affect it (e.g. if a cross-country model has a finding that does not have a material impact on the Country A portfolio, then the assessment of that model for the Country A entity should not be negative because of that finding).
In several engagements, we have seen how the use of dedicated Model Risk Management software tools brings considerable advantages, both in optimising and tracking the sharing of findings between validation functions (so as to avoid duplications, inconsistencies, etc.) and in defining a structured process for sharing and applying these findings.

3a) Do you deem it preferential to split the review of the definition of default between IRB-related topics and other topics?

We believe that to avoid a proliferation of definitions and also to simplify communication with third parties, it would be preferable to use a single definition.

3b) If you do prefer a split in question 3a, which topics of the definition of default would you consider to be IRB-related, and hence should be covered by the internal validation function?

NA

Question 4: Which approach factoring in the rating philosophy of a model into the back-testing analyses should be considered as best practices?

In our opinion, a best practice for incorporating the rating philosophy into back-testing analyses should include the following steps:
1) the model should be defined, a priori, as PIT, TTC or Hybrid by the credit risk control unit (CRCU);
2) this classification (PIT, TTC or Hybrid) should be validated by Internal Validation (IV);
3) IV should develop differentiated metrics (e.g. more focussed on risk quantification for TTC models and on risk differentiation for PIT models) and formally make them explicit in the institution's validation framework.
For example, for a TTC model it might be interesting to evaluate the fluctuations over time of the average scores produced, which should remain within predefined ranges. To this end, in addition to analysing score trends over time, it could be useful to test the model results after artificial perturbations of the inputs: TTC models should produce results that are not overly reactive to variations in the inputs (model variables), as illustrated in the sketch after this paragraph. In this way, validation outcomes would be more consistent with the nature of the model (PIT, TTC or Hybrid).
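A minimal sketch of such a perturbation test, assuming a generic logistic scoring function with illustrative coefficients and synthetic portfolio data (all names and figures below are hypothetical, not a prescribed setup):

```python
import numpy as np

# Hypothetical logistic scoring function standing in for the fitted rating model
# (coefficients and portfolio data are purely illustrative assumptions).
def rating_score(X, coef, intercept):
    return 1.0 / (1.0 + np.exp(-(X @ coef + intercept)))

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))            # 500 obligors, 4 model variables
coef = np.array([0.8, -0.5, 0.3, 0.1])   # illustrative coefficients
base_mean = rating_score(X, coef, -2.0).mean()

# Shock all inputs by +/- one standard deviation and record the shift in the
# portfolio-average score; for a TTC model the validator would expect these
# shifts to stay within a predefined tolerance band.
tolerance = 0.05  # illustrative threshold
for shock in (-1.0, 1.0):
    X_pert = X + shock * X.std(axis=0)
    shift = rating_score(X_pert, coef, -2.0).mean() - base_mean
    print(f"shock {shock:+.0f} sd: mean-score shift {shift:+.4f} "
          f"({'within' if abs(shift) <= tolerance else 'outside'} tolerance)")
```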

Question 5: What analyses do you consider to be best practice to empirically assess the modelling choices in paragraph [76] and, more generally, the performance of the slotting approach used (i.e. the discriminatory power and homogeneity)?

The fact that the slotting approach produces fixed regulatory risk weights poses a challenge when testing its performance, given the absence of a direct measure to compare against “observed” outcomes (both in terms of defaults and losses).
A possible solution is to “imply” the PD (LGD values are not considered in this example) by inverting the regulatory equation for the RW calculation (where the factor “K” is set equal to the slotting regulatory values), reported in the attachment.
In this formula:
• for institutions under the foundation approach, LGD is fixed to the F-IRB regulatory values;
• for institutions under the advanced approach, the Corporate LGD model is applied (as average values at appropriate sector levels).
The approach makes it possible to obtain an “implied PD” (Impl_PD_K) value for each of the five slotting regulatory categories. These values can then be used to assess the discriminatory power of the slotting approach against:
• observed historical defaults (challenging given the low-default nature of specialised lending portfolios);
• PD_EL derived from Expected Loss provisioning: PD_EL = EL / (LGD × EAD).
The same applies to the homogeneity test: to assess whether each slotting category contains homogeneous observations, Impl_PD_K can be compared against observed defaults and PD_EL values.
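As a rough illustration of this inversion, the sketch below numerically inverts the standard supervisory capital-requirement function for corporate exposures (asset correlation and maturity adjustment as in the IRB framework) to back out an implied PD from a slotting risk weight. The LGD, maturity and risk-weight values are illustrative assumptions, and any supervisory scaling factor is ignored; this is not the formula reported in the attachment but a generic sketch of the same idea.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def capital_requirement(pd_, lgd, maturity):
    """IRB capital requirement K for corporate exposures (standard supervisory formula)."""
    # Asset correlation, interpolated between 0.12 and 0.24 as a function of PD
    r = (0.12 * (1 - np.exp(-50 * pd_)) / (1 - np.exp(-50))
         + 0.24 * (1 - (1 - np.exp(-50 * pd_)) / (1 - np.exp(-50))))
    # Maturity adjustment
    b = (0.11852 - 0.05478 * np.log(pd_)) ** 2
    mat_adj = (1 + (maturity - 2.5) * b) / (1 - 1.5 * b)
    # Conditional expected loss in excess of PD, scaled by LGD and maturity
    return lgd * (norm.cdf((norm.ppf(pd_) + np.sqrt(r) * norm.ppf(0.999))
                           / np.sqrt(1 - r)) - pd_) * mat_adj

def implied_pd(risk_weight, lgd, maturity=2.5):
    """Back out the PD whose capital requirement matches the given risk weight."""
    k_target = risk_weight / 12.5  # any supervisory scaling factor is ignored here
    # K is increasing in PD over this bracket; adjust the bracket for high risk weights
    return brentq(lambda p: capital_requirement(p, lgd, maturity) - k_target,
                  1e-6, 0.15)

# Illustration: 90% slotting risk weight, F-IRB senior unsecured LGD of 45%
print(f"Implied PD: {implied_pd(0.90, lgd=0.45):.4%}")
```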

6a) Which of the above mentioned approaches do you consider as best practices to assess the performance of the model in the context of data scarcity?

Among the alternative options proposed to evaluate the performance of models in data scarcity contexts:
1) conduct the validation solely on an OOT or OOS sample, using data not used at all by the CRCU for the model development;
2) leverage the analyses performed by the CRCU, where the CRCU has assessed the performance of the model via OOT and OOS samples only during intermediate steps, but has used the whole sample to train the final model;
3) complement the tests performed by the CRCU with in-sample tests and qualitative analysis (such as the ones mentioned above).

Solution 1 would be the ideal option from a validation point of view, but it may generate further instability in the model: if the context is already one of "data scarcity", excluding observations from the development sample for validation purposes may make the situation even worse.

Solution 2 makes it possible to use all available data while evaluating the stability of the model on OOT and OOS samples during intermediate steps, but it relies exclusively on analyses carried out by the CRCU, and this certainly cannot be considered a best practice from our point of view.

Solution 3 probably represents the best practice because it allows all available data to be used for estimating the model while complementing the analyses performed by the CRCU with those performed by the validation function: for example, in our experience in data scarcity contexts, cross-validation analyses based on leave-one-out strategies have proved useful for identifying the likely range of variation of the performance metrics. Furthermore, in this case IV can re-train the model on leave-one-out samples and assess the stability of the results in terms of variable contributions, coefficient values, performance, etc. (see the sketch below). In this way it is possible to combine the maximum exploitation of the available data with a rigorous and independent assessment of the stability of the estimates and of their performance.
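A minimal sketch of such a leave-one-out exercise, using scikit-learn and a synthetic low-default sample purely for illustration (the data, model and metrics are assumptions, not a prescribed setup):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import LeaveOneOut

# Synthetic, deliberately small and imbalanced sample standing in for a
# data-scarce portfolio (illustrative only).
X, y = make_classification(n_samples=120, n_features=6, weights=[0.9],
                           random_state=0)

oof_pred = np.empty(len(y))   # out-of-fold predicted probabilities
coefs = []                    # coefficients of each leave-one-out refit

for train_idx, test_idx in LeaveOneOut().split(X):
    model = LogisticRegression(max_iter=1000).fit(X[train_idx], y[train_idx])
    oof_pred[test_idx] = model.predict_proba(X[test_idx])[:, 1]
    coefs.append(model.coef_.ravel())

coefs = np.array(coefs)
# The pooled out-of-fold AUC gives a near-unbiased view of discriminatory power,
# while the dispersion of coefficients across refits signals estimation stability.
print(f"Pooled out-of-fold AUC: {roc_auc_score(y, oof_pred):.3f}")
print("Coefficient std across LOO refits:", np.round(coefs.std(axis=0), 4))
```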

6b): More in general, which validation approaches do you consider as best practices to assess the performance of the model in the context of data scarcity?

A further alternative, not mentioned above, could be to apply solution 1) while supplementing the development sample with synthetic data generated through advanced oversampling techniques.
In this way, real OOS/OOT data could be used to validate a model estimated on both real and synthetic data, as sketched below. It is worth noting that this approach has the obvious drawback of using synthetic data for model training purposes.
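A minimal sketch of what such a setup could look like, assuming the SMOTE implementation from imbalanced-learn and a synthetic sample purely for illustration (all names and figures are assumptions):

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic imbalanced sample standing in for a data-scarce portfolio (illustrative only).
X, y = make_classification(n_samples=300, n_features=6, weights=[0.92], random_state=0)

# Hold out real observations as OOS data before any synthetic augmentation.
X_train, X_oos, y_train, y_oos = train_test_split(X, y, test_size=0.3,
                                                  stratify=y, random_state=0)

# Augment only the training sample with synthetic defaults via SMOTE oversampling.
X_res, y_res = SMOTE(random_state=0).fit_resample(X_train, y_train)

model = LogisticRegression(max_iter=1000).fit(X_res, y_res)

# Validate on real, untouched OOS data only.
oos_auc = roc_auc_score(y_oos, model.predict_proba(X_oos)[:, 1])
print(f"OOS AUC (real data only): {oos_auc:.3f}")
```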


Name of the organization

Prometeia