Response to consultation on proposed RTS in the context of the EBA’s response to the European Commission’s Call for advice on new AMLA mandates
Question 1: Do you have any comments on the approach proposed by the EBA to assess and classify the risk profile of obliged entities?
Comments from FTS, developer of STRIX. Fourteen supervisors have used STRIX to assess over 300 sector reporting cycles.
1. Calculation of Residual Risk
FTS finds the suggested methodology of averaging an entity's Inherent Risk (IR) and Controls scores to be an acceptable way of combining two distinct models. This avoids the logical problem of the oft-used IR – C = RR relationship (or some modification of it, e.g. a haircut): some Controls are not directly related to any specific Inherent Risk, so a subtraction becomes difficult to justify.
When used as a coordinate system (Inherent Risk = X, and Controls = Y), the suggested methodology produces a consistent heatmap of entity placement and the resulting Residual Risk classification.
As addressed later under Question 2, FTS advocates allowing poor controls to result in a residual risk that is higher than inherent risk.
2. Methodology does not Address whether it is Relative Risk or Absolute Risk
It is noted that the methodology for selecting thresholds has been deferred by the formation committee for the time being. FTS suggests giving some attention to identifying the fundamental type of thresholds to be used: Relative vs. Absolute. This understanding is important in order to gauge the level of effort needed to implement the AMLA methodology.
Relative and Absolute thresholds are a key difference between entity-level risk assessments and sectoral-level risk assessments. Relative thresholds are typically used to force the distribution of entities so that there are entities in each inherent risk classification, which is useful for informing a risk-based approach to supervision. Absolute thresholds are typically used to identify a fixed measure of risk for comparison, for trending scores, and for enabling downstream calculations on the scores.
FTS assumes that all jurisdictions will be using the same thresholds for AMLA’s EU-wide assessment. Confirmation is requested.
FTS also assumes that Absolute thresholds would be used, for consistent comparison. Confirmation is requested. However, this means that NCAs must still perform an additional entity-level risk assessment using Relative Risk thresholds (of their own choosing), so that their riskiest entities are identified as high inherent risk to them, enabling risk-based supervision of their own sectors.
FTS assumes that AMLA's first thresholds would be most appropriately selected once all of the collected data is available in one place.
3. Controls Quality Score vs. Lack of Controls Score
Currently, the use of the term ‘Controls Quality’ is inverted. A scale on which a higher score denotes weaker Controls is a scale for ‘Lack of Controls’, rather than a scale for ‘Controls Quality’. This part of the methodology is a ‘controls’ assessment, not a ‘risk’ assessment.
For clarity, FTS recommends that a higher Controls score mean better controls. In that case, the Residual Risk calculation would need to be modified to (IR + (5 - Controls)) / 2, and the heat map would invert to resemble a Cartesian coordinate system, which is easily understood.
(heatmap image not possible to show)
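The equivalence of the two orientations can be checked with a few lines of Python. This is a minimal sketch, assuming the 1-4 scale and the simple average described above; the function names are illustrative and do not come from the draft RTS.

```python
SCALE_MIN, SCALE_MAX = 1, 4  # rating scale described in the consultation


def residual_from_lack_of_controls(ir: float, lack: float) -> float:
    """Draft orientation: a higher controls score means weaker controls."""
    return (ir + lack) / 2


def residual_from_controls_quality(ir: float, quality: float) -> float:
    """Suggested orientation: a higher controls score means better controls.

    On a 1-4 scale the mirror of a score is (1 + 4) - score = 5 - score.
    """
    return (ir + (SCALE_MIN + SCALE_MAX - quality)) / 2


# Both orientations give the same Residual Risk once the controls score is mirrored.
for ir in (1.0, 2.5, 4.0):
    for lack in (1.0, 2.0, 4.0):
        quality = SCALE_MIN + SCALE_MAX - lack
        assert residual_from_lack_of_controls(ir, lack) == residual_from_controls_quality(ir, quality)
```

Only the labelling of the Controls axis changes; the resulting Residual Risk values are identical.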
4. No Geographic Risk Model
The suggested methodology does not have a Geographic Risk Model.
Additionally, there is no description of how geographic responses are to be scored. Data points in the Geographies section in Annexes Section A – Inherent Risk include Country Facts and Numeric Facts with breakdown.
FTS recommends allocating a weighting to each jurisdiction commensurate with its known risk of ML or TF/PF.
For example, a country fact: What is the country of parent company formation?
This could be scored according to the country’s weighting: if countries are weighted from 1 to 10, assign thresholds between 1 and 10 and score the weighting against them.
An example of a numeric with breakdown: How many Natural Person customers by nationality?
Each responded value could be multiplied by the weighting of its jurisdiction. The sum of the weighted responses could then be scored against thresholds, or the ratio ‘weighted sum’ / ‘responded value sum’ could be calculated and scored instead.
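The two scoring options above can be sketched as follows. This is an illustration only: the country codes, the weightings, the customer counts, and the thresholds are all hypothetical values chosen for the example, not figures from the draft RTS or from any FTS model.

```python
from bisect import bisect_right

# Hypothetical jurisdiction weightings on a 1-10 scale (illustrative values only).
COUNTRY_WEIGHTS = {"AA": 1, "BB": 3, "CC": 9}

# Hypothetical thresholds splitting the 1-10 weighting range into scores 1-4.
THRESHOLDS = [2.5, 5.0, 7.5]


def score_against_thresholds(value: float) -> int:
    """Return a score from 1 to 4; a value on a threshold takes the higher score."""
    return bisect_right(THRESHOLDS, value) + 1


def weighted_sum(breakdown: dict[str, int]) -> int:
    """Sum of each responded value multiplied by its jurisdiction weighting."""
    return sum(count * COUNTRY_WEIGHTS[country] for country, count in breakdown.items())


def weighted_ratio(breakdown: dict[str, int]) -> float:
    """'Weighted sum' / 'responded value sum', i.e. the average weighting per unit."""
    total = sum(breakdown.values())
    return weighted_sum(breakdown) / total if total else 0.0


# Country fact: country of parent company formation.
parent_country_score = score_against_thresholds(COUNTRY_WEIGHTS["CC"])

# Numeric fact with breakdown: Natural Person customers by nationality.
customers = {"AA": 900, "BB": 80, "CC": 20}
ratio = weighted_ratio(customers)
ratio_score = score_against_thresholds(ratio)
```

The ratio stays within the 1-10 weighting range regardless of entity size, so the same thresholds can be reused for country facts and for numeric facts with breakdown. The weighted sum grows with volume and would need its own thresholds.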
5. Use of a Complicated Range and Odd Divisions
A rating from 1-4 is complicated for users to understand, because the range of 3.00 is divided into 4 classifications of 0.75 each. Because the scale starts at 1 rather than 0, models must handle this shift and calculations have to be offset by 1. A further result is that the classification of a rating can be determined by the 3rd decimal place of the scored item. For example, the border for Low-to-Medium is 1.75, so users must be familiar with a result down to three decimals: 1.749 would be Low, but 1.751 would be Medium.
As a comparison, the Basel AML Index uses 0-10, the Global Terrorism Index uses 0-10, and Transparency International uses 0-100, all of which are examples of related AML/CFT/CPF rating systems that use Base-10 metrics. These ranges are easy to interpret and are consumer- and modelling-friendly. The classification level could be determined at the 2nd or 1st decimal place, and the boundaries would involve more whole numbers.
From experience, we recognize that supervisors, especially those not dedicated to AML/CFT full time, could benefit from this simplification.
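The boundary arithmetic can be made concrete with a short sketch. It assumes equal-width classes, which is consistent with the 1.75 border quoted above. The class labels and the handling of a score that sits exactly on a boundary are assumptions made for the example.

```python
from bisect import bisect_right

# Assumed class labels for a four-class scheme.
LABELS_4 = ["Low", "Medium", "Substantial", "High"]


def class_boundaries(scale_min: float, scale_max: float, n_classes: int) -> list[float]:
    """Equal-width boundaries between n_classes classes on [scale_min, scale_max]."""
    width = (scale_max - scale_min) / n_classes
    return [scale_min + width * i for i in range(1, n_classes)]


def classify(score: float, boundaries: list[float], labels: list[str]) -> str:
    # Assumption: a score exactly on a boundary falls into the higher class.
    return labels[bisect_right(boundaries, score)]


bounds_1_4 = class_boundaries(1, 4, 4)    # [1.75, 2.5, 3.25]
bounds_0_10 = class_boundaries(0, 10, 5)  # [2.0, 4.0, 6.0, 8.0]

print(classify(1.749, bounds_1_4, LABELS_4))  # Low
print(classify(1.751, bounds_1_4, LABELS_4))  # Medium
```

On the 1-4 scale, two of the three boundaries fall on quarter points. On a 0-10 scale with five classes, every boundary is a whole number.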
6. Inconsistent Application of Weightings
With reference to pg 19, Article 2, which provides that the weights used for different categories shall be proportional to the risk score, FTS has the following concerns:
- This approach will drive inconsistent application of weighting within a model, unique per entity
- How does a high-risk score, already weighted high, become further amplified?
- Higher risk scores could also be achieved simply by using different thresholds
- Depending on the amplification, the % of total risk allocation by category could have strong variations between entities even though their responses are not as strongly varied
This technique does help to distinguish risk (by amplifying it), which compensates for the narrowness of scores.
However, as this extra step adds significant, as yet undefined complexity and results in a unique application of weightings per entity, this technique is not recommended. Alternative ways to distinguish risk could be to select thresholds that intentionally elevate risk, to score activity performed rather than all activity allowed by license type, and/or to use more classifications (5 instead of 4).
7. Narrow Score Distribution Expected due to Tight Score Range
The suggested methodology, using a rating scale of 1-4, is expected to contribute to a very narrow range of entity scores.
This produces IR scores in a very tight grouping. STRIX used this approach circa 2020, also with a 1-4 scale, and it produced, for example, a resulting Inherent Risk score of 1.4 +/-0.2 for a statistically significant population of responding entities, having >150 data points. This behavior was observed in many different sectors.
FTS recommends fostering a clearer and more obvious separation among entities by using a larger scale range.
8. Narrow Score Distribution Expected due to Dilution of Risk
From the description provided, it appears the methodology will use all data points. This will also contribute to a narrowness of results in the following way: entities will receive low-risk ‘credit’ for activities they do not perform. If an entity has no activity related to a certain group of products, services, and transactions, and is given low-risk credit for this, its risk score in the other groups where it does offer products, services, and transactions, potentially in a risky way, is then diluted. This steers the score of a hypothetically risky entity lower, toward the average score. If the only thing the entity does is Service A, and it does this in a risky way, is this not a risky entity?
FTS also employed the currently suggested methodology in earlier implementations. However, feedback from a supervisor who compared model results to manual assessments showed that entities with narrow business models were not well identified.
FTS recommends improving dispersion by employing the ‘dimensionality reduction’ concept from machine learning, which would result in scoring entities based on the activities they performed rather than all activities allowed by the license type. Ref: FATF Opportunities and Challenges of New Technologies for AML/CFT, July 2021, Box 2 Case Study, https://www.fatf-gafi.org/content/dam/fatf-gafi/guidance/Opportunities-Challenges-of-New-Technologies-for-AML-CFT.pdf.coredownload.inline.pdf
9. Compensation for Narrowness in Group-wide Assessments
The suggested methodology attempts to compensate for the narrowness of entity scores by amplifying risk, where it exists, using a formula (pg 31). This extra step adds complexity and is a systematic manipulation of scores. It could potentially be removed by using a wider score range (e.g. Basel’s 0-10), and/or a methodology which better distinguishes risk, and/or more classifications (5 instead of 4).
10. Indicator Organization
The comments below relate to the organization of Section 6, Annexes A and B, for Inherent Risk and Controls indicators. FTS has focused on methodology inputs rather than the substance of specific questions. Comments on specific questions, where noted, are recorded opportunistically in this document and do not represent comprehensive feedback on data points.
Inherent Risk Indicator Organization
- FTS agrees with the general organization of Categories / Sub Categories / Data Points
- Customers: there are questions about customers with foreign residency and registered abroad, which have Geographic risk impacts (geographic risk cuts across most sections)
- Geographies: as noted in the Customers bullet above, geographic risk can impact the other risk pillars; most questions in the Geographies section could alternatively be placed in their related risk pillar; this may depend on how the suggested methodology defines its geographic risk model
- Distribution Channels: consider capturing the client types onboarded (Natural Persons, Legal Persons, Legal Arrangements)
Controls Indicator Organization
- FTS recommends segregating automatically generated scores (Quantitative) from supervisor-generated scores (Qualitative)
- As if Quantitative and Qualitative were two separate Categories
- This may be easier to integrate / harmonize (true for FTS)
- This would allow supervisor review of Quantitative data collected which is unscored
- Based on experience, there are many data points which are not scorable on their own and which require context
- For example, Suspicious Activity Reporting: How many STRs? The answer is typically not scorable, because it will depend on the volume of transactions, the types of transactions, lines of business, and the supervisor’s understanding of the entity
- A Qualitative assessment could be made upon this data and then scored
- Number of Categories is relatively low or Subcategories could be elevated to Category level
- The number of data points used appears low for TFS, STRs (AML vs. CFT vs. CPF)
- Relatively few Y/N questions which are easily scorable
- Record Keeping scoring may be automated if appropriate questions were asked
- The entities’ definition of risky clients will differ from entity to entity
- An entity’s self-assessment of their clients would be different than a supervisor’s assessment of the same entity’s clients
- A common definition of risk classifications could be recommended
- 3A could be expanded for EDD, or new subcategory 3X
Question 2: Do you agree with the proposed relationship between inherent risk and residual risk, whereby residual risk can be lower, but never be higher, than inherent risk? Would you favour another approach instead, whereby the obliged entity’s residual risk score can be worse than its inherent risk score? If so, please set out your rationale and provide evidence of the impact the EBA’s proposal would have.
FTS advocates allowing poor controls to result in a residual risk that is higher than inherent risk.
Reasons:
Past Risk vs. Potential Risk: the proposed cap may be appropriate for past risk, but a lack of controls compounds future or potential risk. For example:
- A given bank offers a new product related to Virtual Assets. Controls to identify beneficial owners of virtual assets are missing. The activity is conducted for a period of time before the next regular assessment, which itself will take some time before the necessary attention is applied. This would suggest a higher risk than simply ‘low risk’.
No Incentive: a small bank would have no incentive to follow best practices, for example:
- A bank consistently performs a low level of activity, but caters to high-risk clients with poor controls. The low level of activity earns the bank a low Residual Risk score, so it does not trigger immediate or significant supervisory attention. As a result, smaller and low-volume activity is permitted to continue indefinitely in this risky way.
There should also be value in preventing risky activity rather than only reacting to it.
Suggested Methodology:
(heatmap image not possible to show)
If Controls could elevate Residual Risk:
(heatmap image not possible to show)
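The difference between the two approaches can be sketched numerically. This is a minimal illustration under stated assumptions: the simple average of Inherent Risk and a ‘Lack of Controls’ score on the 1-4 scale, with the "never higher than inherent risk" rule modelled as a plain cap. The draft RTS may implement the cap differently, and the function names and example scores are hypothetical.

```python
def residual_uncapped(ir: float, lack_of_controls: float) -> float:
    """Average of Inherent Risk and a 'Lack of Controls' score (higher = weaker controls)."""
    return (ir + lack_of_controls) / 2


def residual_capped(ir: float, lack_of_controls: float) -> float:
    """Same average, but Residual Risk is never allowed to exceed Inherent Risk."""
    return min(ir, residual_uncapped(ir, lack_of_controls))


# Hypothetical low-activity bank with the weakest possible controls.
ir, lack = 1.5, 4.0
print(residual_capped(ir, lack))    # 1.5
print(residual_uncapped(ir, lack))  # 2.75
```

With the cap, the weak controls have no effect on this entity's Residual Risk. Without it, the entity moves up by more than one point on the scale and becomes visible to the supervisor.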
Question 3a: What will be the impact, in terms of cost, for credit and financial institutions to provide this new set of data in the short, medium and long term?
To reduce the burden of data collection for form fillers, FTS suggests a partially automated submission technology, such as one based on XBRL, so that the reporting burden is reduced in the long term. However, some questions cannot simply be mapped to a database, such as those requiring a human to read, take context into account, and respond with paragraphs, or those requiring documents to be attached. XBRL alone is therefore not sufficient.
Question 6: When assessing the geographical risks to which obliged entities are exposed, should crossborder transactions linked with EEA jurisdictions be assessed differently than transactions linked with third countries? Please set out your rationale and provide evidence.
FTS recommends the following when advising authorities onboarding its risk assessment solutions:
- The home jurisdiction is lowest weighted since these entities are licensed by the jurisdiction and must abide by laws of the jurisdiction over which the authority has full control.
- Other-than-home jurisdictions which are not notably recognized as being risky are weighted more than the home jurisdiction, but minimally
- Jurisdictions which have cause for being higher risk, due to AML, CFT, or CPF reasons, are assigned commensurately higher weightings
Sometimes the authority resides in a jurisdiction which is itself considered high risk. Even so, from that authority’s perspective, transactions within its own jurisdiction are considered lowest risk, because they are under its full control and it may act upon them.
Under the same principle, in the context of AMLA, the jurisdictions that fall under its purview would be considered the lowest risk jurisdictions.
Question 5: Do you agree that the selection methodology should not allow the adjustment of the inherent risk score provided in article 2 of draft under article 40(2) AMLD6? If you do not agree, please provide the rationale and evidence of the impact the EBA’s proposal would have.
We understand the question to have two parts, intended as “Do you agree that the selection methodology should allow adjustments of the inherent risk score…” and “Do you agree that the adjusted score shall not lead to an increase or decrease by more than one category…”.
Para 4 under Article 2 states that supervisors may adjust the inherent risk score accordingly.
FTS agrees that justified adjustments should be permitted as an exception, not systematically.
However, FTS does not agree that scores should be limited to one classification adjustment. The suggested methodology would produce a heatmap that allows these changes to scores.
(heatmap image not possible to show)
However, a change of this lower magnitude would be prevented from producing the indicated adjustment of two classifications. Would the magnitude of the score change be reduced to prevent the movement into high risk? Or would the magnitude be allowed, with the classification becoming substantial rather than high risk? Either option results in inconsistent handling of override scores and inconsistent application of classifications, which is confusing and not recommended.
(heatmap image not possible to show)
FTS advocates overriding scores when a justification of the change is provided, and that the override should be applicable to a total IR score, OR to a category level score (Customer, Products & Services & Transactions, Distribution Channel), depending on the reason for override.
If the reason for an override is that the model did not allocate due risk, the model should be evolved so that it calculates representative risk and so that systematic use of overrides does not occur.
Question 6: Do you agree with the methodology for the calculation of the group-wide score that is laid down in article 5 of the RTS? If you do not agree, please provide the rationale for it and provide evidence of the impact the EBA’s proposal and your proposal would have.
How will AMLA approach group handling? Will a group be 1) compared as a peer to non-group entities, or 2) assessed separately, after its individual entities have first been assessed, without comparison to non-group entities?
The aggregation method identified in Article 5 (combination of scores via equation) does not address that the total volume of Customers, Products & Services & Transactions, and Distribution Channels of the group is much larger and therefore, together, holds a greater overall Inherent Risk. If compared to a single non-group entity which had the same total values, this entity would have earned a significantly higher risk rating. Accordingly, FTS would not agree with this approach due to the implied inconsistency in handling of Inherent Risk.
The suggested combination of Article 5 is a more reasonable approach for the Controls Quality scores, as these data points are less practical to sum and involve qualitative elements.
Group or cluster eligibility will require attention in order to separately manage and track the status of entities’ eligibility.
FTS agrees that specific attention will be required to distinguish the entities’ AML/CFT governance and internal control framework, ML/TF risk assessment framework, policies and procedures, and the AML/CFT compliance framework. This would likely involve an additional set of Controls questions for the entities involved. For example:
- Does the entity have a group-wide AML/CFT/CPF program?
- Has the entity conducted an analysis to identify whether the group AML/CFT/CPF program complies with the jurisdiction’s legislation and regulatory framework for each branch/subsidiary?
- Did the entity take part in training provided by other group members? Which trainings?
- Did other group members take part in training provided by the entity? Which trainings?
- Did training / awareness raising activities cover topics related to Group AML Procedures?
- Did training / awareness raising activities cover topics related to Group CFT Procedures?
- Did training / awareness raising activities cover topics related to Group CPF Procedures?
- When did your entity last have a group AML/CFT/CPF audit?
- Who prepares the AML/CFT/CPF policies and procedures? Self, external consultants, or another member of the group