Response to consultation on supervisory handbook on the validation of rating systems under the Internal Ratings Based approach


1a) How is the split between the first and the subsequent validation implemented in your institution?


In ESBG’s view, the organizational positioning of the validation function should be independent of the size of the bank. It should be assessed against criteria geared toward the objective of unbiased validation decisions rather than toward formal organizational structures.

In addition, the EBA should ensure that the validation manual is free of inconsistencies with existing supervisory regulations. In particular, the ECB Guide to Internal Models (General Topics, lit. 65) contains detailed expectations for validation; for example, specific advanced analyses are to be performed every three years and as part of the initial validation after each material model change. In our view, the EBA Validation Manual should be aligned with the rules of the ECB Guide to Internal Models, which have meanwhile proven their worth in practice. Instead of detailed organizational requirements, such as separate units for the operational execution of development and validation activities, it should generally be ensured through organizational measures (including standardized control procedures) that development and validation activities are carried out free of conflicts of interest.

We believe that it would also be helpful to consider the aspect of proportionality more closely. The intensity and scope of validation activities must always be based on the expected data situation, the importance of the rating procedure, and the scope and complexity of the changes made.

The manual formulates the requirement (paragraph [33]) that the validation concept include a description of the data collection process and the selection of all data sets used for validation. At the same time, it is expected (Focus Box 1) that all data preparation steps be well documented in the validation report. The manual should instead clarify that it is sufficient if the relevant documentation is available in one place and appropriate references allow a clear picture of the data collection and preparation.


The validation policy of some ESBG members envisages 3 types of validation activities: initial, full, and regular validation. An initial validation is a comprehensive and in-depth validation across all validation areas and tests, which is performed for any new model or material model change. Full validation is performed every 3 years, and regular validation is performed every year, to cover all relevant aspects at the relevant frequency as defined in the European Central Bank (ECB) Guide to internal models.

The split between the first and subsequent validation is defined as follows:
- First (initial) validation is conducted for the new models (roll-out to IRB) or material model changes;
- Subsequent validation is conducted annually (regular validation) or every 3 years (full validation) to cover all changed as well as unchanged aspects of the model since the last validation.
- The reason to assess non-material model changes within the annual validation (i.e. unchanged aspects in line with section 5 and changed aspects in line with section 4) after implementation is threefold:
- Internal validation comprehensively assesses materiality of any ex-ante and ex-post notification in line with the qualitative and quantitative criteria;
- Ex-ante and ex-post notifications have very limited impact on the RWA variability;
- Efficiency and comprehensiveness of the internal validation processes.

1b) Do you see any constraints in implementing the proposed expectations (i) as described in section 4 for the first validation for a) newly developed models; and b) model changes; and (ii) as described in section 5 for the subsequent validation of unchanged models?

For points (i) a) and (ii), we do not see any major constraints to implementing the expectations of sections 4 and 5 respectively. However, we strongly believe that a requirement to implement the expectations of section 4 for ex-ante or ex-post notifications in line with Commission Delegated Regulation (EU) No 529/2014 (regulatory technical standards for assessing the materiality of extensions and changes of the Internal Ratings Based Approach) would create inefficiencies and place an undue burden on the assessments of the internal validation function.

In its basic approach, the EBA focuses validation activities on changed aspects, which are to be examined intensively; for unchanged aspects, by contrast, a regime based on standard analyses is pursued. We welcome this approach in principle. However, the approach proposed by the EBA hardly differentiates between the materiality of models/portfolios and of model changes in line with the proportionality principle. In our view, a fundamentally identical validation approach for all unchanged models and all types of model changes is not risk-adequate, i.e., a risk-adequate differentiation should be made here.

In ESBG’s opinion, in order to arrive at a comprehensive overall validation outcome, all validation areas and tests have to be executed, both for changed and for unchanged aspects of the model. We therefore propose that model changes driven by ex-ante or ex-post notifications be assessed in the subsequent (annual) validation after implementation, in line with section 4 for the changed aspects and in line with section 5 for the unchanged aspects of the model. Moreover, with respect to point c. of paragraph [26], we would expect further operationalization, as material changes in the range of application of a model might vary in their magnitude.

Practical difficulties can also be encountered with respect to the additional data used in the construction of the models when executing out-of-sample (OOS)/out-of-time (OOT) tests with more recent data. Moreover, with respect to the use of external data (paragraph 57), the EBA benchmarking may not be comparable to the information already known to the institution due to, for example, a different portfolio composition.

Furthermore, as we understand it, paragraph 88 formulates the expectation that the documentation of any notifiable change be reviewed by the validation function before notification. We do not consider this appropriate, as this task is not related to the other tasks of validation, especially where non-material changes are concerned. If an independent formal review of notification documents is considered necessary prior to notification, this should be specified in general terms and, in any event, without assigning the task to the validation function.

Question 2: For rating systems that are used and validated across different entities, do you have a particular process in place to share the findings of all relevant validation functions? Do you apply a singular set of remedial action across all the entities or are there cases where remedial actions are tailor-made to each level of application?

Some of our members define 2 types of internal validation findings, namely central and local validation findings. Local findings always refer to the model under investigation and are addressed directly to the model owner and the local entity at the solo level, whereas central findings usually refer to, e.g., centrally defined methodologies or guidance and usually identify deficiencies related to more than one model or to group-wide models. Both types of findings are considered, with their respective severity, in the evaluation of the final validation outcomes.

3a) Do you deem it preferential to split the review of the definition of default between IRB-related topics and other topics?

ESBG believes that a generally applicable answer to these questions is not reasonably possible. Whether it makes sense to split the review of the definition of default (DoD) between IRB and non-IRB issues depends on the specific IRB processes in the respective institution, the portfolios affected, and the other processes involved (e.g., accounting, depending on the accounting standard). The determination of which role, if any, the validation function should have in the DoD review, and which tasks, if any, should be assumed by other organizational units, must therefore be made on a case-by-case basis. In our opinion, no general specification should be made in this regard.

3b) If you do prefer a split in question 3a, which topics of the definition of default would you consider to be IRB-related, and hence should be covered by the internal validation function?

All topics having an impact on the performance of the model in terms of risk differentiation and risk quantification, which includes:
- Quality of the DoD simulation, if relevant;
- Representativeness analysis;
- Impact on the discriminatory power of the models;
- Impact on back-testing of final estimates;
- Impact on stability of final estimates.

Question 4: Which approach factoring in the rating philosophy of a model into the back-testing analyses should be considered as best practices?

Regulation 2022/439 Art. 12 lit. f specifies that the rating philosophy must be considered in back-testing analyses, among other things. Furthermore, the EBA Guidelines (EBA/GL/2017/16), para 66 lit. c in conjunction with para 67, also specify how this should be done: the expected responsiveness of PDs to changes in macroeconomic conditions, based on the respective rating philosophy, is examined to determine whether the actual behavior of PDs relative to default rates over time corresponds to these expectations. In our view, this specification is as specific and concrete as is reasonably possible in a generally usable form.

The appropriate approach in the validation of the concrete procedure in each case must be specific to the rating philosophy chosen in each case, the characteristics of the respective model, and the underlying segment, and must be designed accordingly (e.g., taking into account the cyclicality of the segment and the calibration method in the respective model).

Therefore, in our view, there is no generally applicable "best practice" approach for the specific procedure to consider the rating philosophy in back-testing analyses. Accordingly, the EBA should refrain from defining or recommending a single "best practice" approach. It would instead make sense to examine the development of the one-year default rates and mean PDs over time and to define substantial deviations as a trigger for further checks concerning the rating philosophy.

An example of best practice, which has been identified by some of our members to factor the rating philosophy into the back-testing, would be to assess final probability of default (PD) estimates at any given point in time and at any relevant level against the default rate (DR), appropriately adjusted to reflect the rating philosophy of the model. For this purpose, one needs to operationalize the concept of rating philosophy (through the grade assignment dynamic) and the appropriate adjustment.

In the case of the minimal dynamic of a pure through-the-cycle (TTC) rating system, rating distributions follow the portfolio structure but are insensitive to economic circumstances. In such a situation the portfolio average PD remains stable, and there is no correlation between portfolio average PDs and default rates. In this case, no adjustment to the long-run average default rate (LRADR) is required, and the portfolio PD back-test is conducted against the LRADR at any given point in time.

In the case of the maximum dynamic of a pure point-in-time (PIT) rating system, ratings follow the cycle, and default rates per grade are stable (apart from random fluctuations) irrespective of the cycle status. In this case, the default rates on the validation sample are the proper targets for back-testing.

As neither of these extreme cases is achieved in practice, the hybrid grade assignment dynamic should be characterized by its degree of PITness, specified as a value between 0% and 100%. With such a quantification at hand, the adjusted target for back-testing shall be defined as (1 – degree of PITness) * LRADR (reflecting the long-run average) + (degree of PITness) * ODR (the observed default rate on the validation sample).
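The blended target described above can be sketched in a few lines of Python. This is a purely illustrative assumption on our part; the function name and the sample figures are hypothetical, not taken from the handbook:

```python
def adjusted_backtest_target(pitness: float, lradr: float, odr: float) -> float:
    """Blend the long-run average default rate (LRADR) with the observed
    default rate (ODR) on the validation sample, weighted by the model's
    degree of PITness (0.0 = pure TTC, 1.0 = pure PIT)."""
    if not 0.0 <= pitness <= 1.0:
        raise ValueError("degree of PITness must lie between 0% and 100%")
    return (1.0 - pitness) * lradr + pitness * odr

# Hypothetical hybrid model: 40% PITness, LRADR of 2.0%, ODR of 3.5%
target = adjusted_backtest_target(0.40, 0.020, 0.035)
# 0.6 * 0.020 + 0.4 * 0.035 = 0.026, i.e. a 2.6% back-testing target
```

At 0% PITness the target collapses to the LRADR (the pure TTC case above); at 100% it equals the observed default rate on the validation sample (the pure PIT case).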

Question 5: What analyses do you consider to be best practice to empirically assess the modelling choices in paragraph [76] and, more generally, the performance of the slotting approach used (i.e. the discriminatory power and homogeneity)?

We believe that the best practice approach to empirically assess the modelling choices in paragraph [76] would be:
- Sensitivity analysis by statistical means of migrations driven by the changes in the inputs or aggregation logic at any relevant level;
- Cash-flow back-testing by statistical means at any relevant level of the model, including material sub-ranges of the portfolio – this approach may nevertheless imply higher costs than benefits for the understanding of the model performance.

Moreover, for the assessment of discriminatory power and for back-testing purposes, the best practice would be to factorize realized losses into default rate (DR) and loss rate (LR) components over a longer time period and to test them against expected losses (EL) at any relevant level (i.e. model and slot). Discriminatory power can be assessed via the monotonicity of the realized losses per slot. Homogeneity can be assessed by employing the back-testing procedure on any relevant material sub-ranges of the portfolio (i.e. geography, residual maturity, balloon payments, exposure at default (EAD), etc.). It should, however, be considered that this practice may not be conclusive depending on the number of defaults available.
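A minimal sketch of the monotonicity and back-testing checks described above, assuming illustrative slot labels and loss figures of our own (not supervisory values):

```python
# Supervisory slotting categories, ordered from least to most risky.
slots = ["strong", "good", "satisfactory", "weak"]

# Hypothetical realized and expected loss rates per slot.
realized_loss_rate = {"strong": 0.002, "good": 0.006,
                      "satisfactory": 0.015, "weak": 0.045}
expected_loss_rate = {"strong": 0.004, "good": 0.008,
                      "satisfactory": 0.028, "weak": 0.080}

# Discriminatory power: realized losses should increase monotonically
# with the riskiness of the slot.
rates = [realized_loss_rate[s] for s in slots]
monotone = all(a < b for a, b in zip(rates, rates[1:]))

# Simple back-test: flag slots where realized losses exceed EL.
breaches = [s for s in slots
            if realized_loss_rate[s] > expected_loss_rate[s]]

print(f"monotone ranking: {monotone}, EL breaches: {breaches}")
```

In practice each check would be run per relevant material sub-range of the portfolio and, as noted above, interpreted with caution where the number of observed defaults is small.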

From a methodological point of view, we believe that the validation procedure for supervisory slotting approaches must also be designed in a risk-adequate manner in line with the proportionality principle. In particular, the materiality of the portfolio covered by the respective slotting approach must be taken into account.

6a) Which of the above mentioned approaches do you consider as best practices to assess the performance of the model in the context of data scarcity?

To conduct the validation solely on the basis of either an out-of-time (OOT) or an out-of-sample (OOS) sample, using data not used at all by the credit risk control unit (CRCU) for the model development.

6b): More in general, which validation approaches do you consider as best practices to assess the performance of the model in the context of data scarcity?

ESBG would consider the following validation approaches as best practice:
- Aggregation of data from different observation periods or consideration of analyses based on multi-year periods;
- Testing with external benchmarks (e.g. external ratings or market-driven metrics such as bond spreads) – this could however be inconclusive depending on the number of defaults available;
- Comparison with internal credit expert rankings (e.g. blind rank-ordering tests, whereby the ranking produced by the model is assessed against the ranking produced by credit experts) – this process may nevertheless entail considerable effort for the institution.

ADDITIONAL COMMENTS BY ESBG (also available from page 6 of the attached full consultation response)


Name of the organization

European Savings and Retail Banking Group (ESBG)