Fighting Eventual Consistency-Based Persistence – An Analysis of notyet
Eventual Consistency
Eventual consistency in AWS’s Identity & Access Management (IAM) service is a well-documented phenomenon. In short, when IAM changes are made in AWS, those changes actually take a few seconds to propagate through AWS’s internal system. Within this propagation window, an attacker-controlled identity with the right starting permissions could theoretically detect and reverse any changes to their privileges.
Over the past several months, Eduard Agavriloae, Director of R&D at OFFENSAI, has conducted a flurry of research on eventual consistency, releasing a number of articles on the subject [1][2]. These pieces scope out which standard Incident Response (IR) playbooks risk being ineffective against an advanced attacker who leverages the security gaps caused by IAM eventual consistency.
notyet
OFFENSAI has now released notyet, an open-source tool designed to help IR teams test their identity containment playbooks. Their release article details how the tooling works in depth, but the gist is that notyet is an adversarial system designed to exploit the eventual consistency propagation window to maintain persistence automatically. Given an access key or role session credentials, notyet will self-escalate to administrator access, then attempt to hang on to these privileges no matter what containment steps are taken in the AWS account.
Red-Blue Collaboration
As part of a collaboration with notyet’s creator, Eduard Agavriloae, I was granted access to early builds of notyet with the aim of hardening it through independent testing. I acted as an incident responder, working through various IR playbooks to attempt to restrict notyet’s access when it was equipped with admin-level credentials.
As new builds of notyet were developed, I worked through a suite of self-curated IR actions to try to identify gaps in notyet’s persistence loop, providing feedback and recommendations once I was done. We iterated on this cycle a number of times, leading to improvements and new persistence features prior to notyet’s public release.
Tool Analysis
My goal when evaluating this tool was to act as an adversary to notyet. I would equip it with * permissions (via an IAM User access key or IAM Role session credentials), kick off the automated persistence loop, then do everything in my power to claw back those privileges as it used the eventual consistency propagation window to undo my containment actions. This analysis details all the different ways in which I tried to contain the tool, and which strategies actually worked.
For some background on access containment, there are a number of ways to effectively remove privileges from a compromised identity in AWS. The AWS Security Incident Response User Guide references a number of steps that can be taken, including:
Direct removal of permissions via the IAM service
Disabling access keys
Revoking temporary security credentials
But these techniques aren’t all that exist. For automated containment, the guide also leverages the AWSSupport-ContainIAMPrincipal SSM automation runbook.
Moreover, AWS has provided additional guidance for manually quarantining specific principals. As noted in an OFFENSAI article on IAM eventual consistency, AWS has published a Credential Cleanup Procedure on re:Post that instructs responders to use direct-attached inline policies to limit privileged IAM operations.
Rather than simply following these guides, I wanted to go a step further and test every way I could find to restrict an identity’s permissions. This included the commonly used IR techniques listed above, but also additional techniques that legitimately remove privilege yet aren’t commonly used for IR. The results of this testing are as follows:
Inline Policies [Ineffective]
notyet establishes its base state by attaching an inline policy to the provided User/Role that grants it full access, then removing all other IAM policies, group memberships, and permission boundaries:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "*",
      "Resource": "*"
    }
  ]
}
There are a number of ways we can use this same inline policy system to try and restrict access. Specifically, we can:
Delete notyet’s base inline policy
Edit notyet’s base inline policy
Add a new inline policy with a Deny effect to override notyet’s base inline policy
The examples below demonstrate attempts to remove privilege from an IAM Role equipped with notyet, but the results were the same when tested with users as well – the exact commands are just a bit different.
Test 1: Deleting the Inline Policy
This test is relatively easy: use AWS CLI commands to identify the inline policy name, then delete it from the role or user. notyet detects this change immediately and re-adds the inline policy within the eventual consistency propagation window.
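The containment action in this test can also be scripted. The following is a minimal sketch, assuming `iam` is a boto3 IAM client (for example `boto3.client("iam")`); the helper name is hypothetical and this is not taken from notyet or from any AWS playbook.

```python
def delete_inline_role_policies(iam, role_name):
    """Delete every inline policy on a role and return the names removed.

    `iam` is assumed to be a boto3 IAM client; it is passed in rather than
    created here so the function can be exercised without AWS access.
    """
    names = []
    kwargs = {"RoleName": role_name}
    # Collect all inline policy names first, following pagination markers.
    while True:
        page = iam.list_role_policies(**kwargs)
        names.extend(page["PolicyNames"])
        if not page.get("IsTruncated"):
            break
        kwargs["Marker"] = page["Marker"]
    # Then delete them one by one.
    for name in names:
        iam.delete_role_policy(RoleName=role_name, PolicyName=name)
    return names
```

For a user, the equivalent calls would be `list_user_policies` and `delete_user_policy` with `UserName` in place of `RoleName`.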
Test 2: Editing the Inline Policy
There are a number of ways to modify notyet’s base inline policy to restrict privilege:
Flipping the Effect to Deny
Altering the Action to grant minimal permissions rather than *
Altering the Resource field to scope the * permissions to a resource that doesn’t exist, effectively negating it
Adding additional policy statements with the effect Deny. As per the AWS enforcement code logic, a deny in any relevant policy statement always overrides allows
All four of these cases were independently tested. notyet checks that its inline policy matches the expected document exactly, so it is able to detect and remediate all of these and more. If the inline policy contains more than one statement, or any unexpected keys or values, the policy is reset within the consistency window.
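To illustrate why an exact-match check catches all four edits, here is a small sketch of such a comparison. It is a hypothetical reconstruction for illustration only; the function name is invented and notyet's actual detection code may look quite different.

```python
import json

# The base inline policy shown earlier in this section.
EXPECTED_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "*", "Resource": "*"}],
}

def policy_was_tampered(current_policy_json):
    """Return True if the policy differs from the expected document at all.

    Parsed documents are compared structurally, so key order and whitespace
    are ignored, but any changed value, extra key, or extra statement counts
    as tampering.
    """
    return json.loads(current_policy_json) != EXPECTED_POLICY
```

Because the check is an equality test rather than a search for specific bad patterns, a defender cannot find an edit that slips past it.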
Test 3: Adding a New Inline Policy
Instead of adding additional inline policy statements to notyet’s pre-existing inline policy, we can add entirely separate inline policies to attempt to deny the permissions granted. Role session revocation actually works this way – limiting permissions by adding a new Deny effect inline policy. This is clear when looking at the policy presented in the role session revocation guide, but we can also verify this by attempting to revoke notyet’s role session via the console.
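The "explicit deny wins" rule that this test relies on can be shown with a toy evaluator. This is a heavily simplified sketch: it ignores conditions, NotAction, permission boundaries, SCPs and resource-based policies, and it uses Python's `fnmatch` as a stand-in for IAM wildcard matching. All names are hypothetical.

```python
from fnmatch import fnmatchcase

def _as_list(value):
    return value if isinstance(value, list) else [value]

def _matches(patterns, value, ignore_case=False):
    for pattern in _as_list(patterns):
        candidate = value
        if ignore_case:  # IAM action names are case-insensitive
            pattern, candidate = pattern.lower(), value.lower()
        if fnmatchcase(candidate, pattern):
            return True
    return False

def evaluate(policies, action, resource):
    """Return "Deny", "Allow" or "ImplicitDeny" for a single request."""
    allowed = False
    for policy in policies:
        for stmt in _as_list(policy["Statement"]):
            if not _matches(stmt["Action"], action, ignore_case=True):
                continue
            if not _matches(stmt["Resource"], resource):
                continue
            if stmt["Effect"] == "Deny":
                return "Deny"  # an explicit deny overrides any allow
            allowed = True
    return "Allow" if allowed else "ImplicitDeny"
```

With notyet's base allow-all policy alone, a request evaluates to "Allow"; adding a second inline policy that denies `*` flips the result to "Deny", which is exactly the state notyet races to undo.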
notyet can detect the addition of new inline policies and remove them.
Managed Policy Attachments [Ineffective]
Adding new inline policies isn’t the only way to explicitly deny permissions for an identity. We can also use managed policies for this purpose. AWS actually has a number of managed policies that can be used for quarantining compromised identities, such as AWSDenyAll.
notyet polls for any managed policy attachments and removes them. This approach is identical whether it has user or role credentials. For example, trying to remove all privilege from notyet’s role using the AWSDenyAll policy is detected and remediated immediately.
You can also define deny-based customer-managed policies. API operations dealing with managed policy attachments don’t distinguish between customer-managed and AWS-managed policies, so notyet handles customer-managed deny policy attachments in an identical manner.
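As a rough sketch of this containment action, the attachment could be scripted as below. It assumes `iam` is a boto3 IAM client; the helper name is hypothetical. The same call accepts either an AWS-managed or a customer-managed policy ARN, which is why the tool does not need to treat the two differently.

```python
# ARN of the AWS-managed AWSDenyAll policy.
AWS_DENY_ALL_ARN = "arn:aws:iam::aws:policy/AWSDenyAll"

def attach_deny_policy(iam, role_name=None, user_name=None,
                       policy_arn=AWS_DENY_ALL_ARN):
    """Attach a managed deny policy to exactly one role or user.

    `iam` is assumed to be a boto3 IAM client. `policy_arn` may point at an
    AWS-managed or a customer-managed policy.
    """
    if (role_name is None) == (user_name is None):
        raise ValueError("pass exactly one of role_name or user_name")
    if role_name is not None:
        iam.attach_role_policy(RoleName=role_name, PolicyArn=policy_arn)
    else:
        iam.attach_user_policy(UserName=user_name, PolicyArn=policy_arn)
    return policy_arn
```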
Permission Boundaries [Ineffective]
Permission boundaries positively define the maximal permissions of an identity, and can be applied to users or roles. By attaching a permission boundary limiting the identity’s privilege to a minimal set of permissions rather than *, we can attempt to restrict notyet’s privileges.
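The effect of a boundary can be pictured as an intersection: an action is only permitted if both the identity policy and the boundary allow it. The sketch below models just that idea with lists of action patterns; it is a simplification with hypothetical names and leaves out resources, conditions and explicit denies.

```python
from fnmatch import fnmatchcase

def _allows(patterns, action):
    # IAM action names are case-insensitive, so compare in lower case.
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)

def effectively_allowed(identity_actions, boundary_actions, action):
    """Return True if `action` survives both the policy and the boundary.

    `boundary_actions=None` models an identity with no permission boundary.
    """
    if not _allows(identity_actions, action):
        return False
    return boundary_actions is None or _allows(boundary_actions, action)
```

Under this model, an identity with `*` and a boundary of `["s3:GetObject"]` can no longer call `iam:PutRolePolicy`, but regains it as soon as the boundary is removed.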
notyet polls for permission boundaries and removes any that it finds, the same way it does for managed policies.
Group Membership [Ineffective]
All of the previous examples have been common to both roles and users, but there are features unique to each of these. For users, one additional way we can attempt to limit permissions is via group membership. As mentioned in the AWS enforcement code logic, a deny in any relevant policy statement always overrides allows, and this applies to a user’s permissions even when the deny is attached to a group the user is a part of.
notyet polls for group memberships and removes any that it finds, the same way it does for managed policies.
Access Key Deactivation [Ineffective]
One key component of access containment in AWS is revoking compromised static credentials. When notyet is equipped with an access key, it therefore makes sense to try deactivating the key it uses.
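A minimal sketch of that deactivation step, assuming `iam` is a boto3 IAM client and using a hypothetical helper name:

```python
def deactivate_access_keys(iam, user_name):
    """Set every active access key on a user to Inactive.

    `iam` is assumed to be a boto3 IAM client. Returns the IDs of the keys
    that were deactivated.
    """
    deactivated = []
    keys = iam.list_access_keys(UserName=user_name)["AccessKeyMetadata"]
    for key in keys:
        if key["Status"] == "Active":
            iam.update_access_key(
                UserName=user_name,
                AccessKeyId=key["AccessKeyId"],
                Status="Inactive",
            )
            deactivated.append(key["AccessKeyId"])
    return deactivated
```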
Until recently, the eventual consistency problem was such that all notyet had to do was poll for a change in its user’s access keys, then recreate them if modified. To AWS’s credit, this is no longer possible; as documented by OFFENSAI, a deactivated access key can no longer be used to create a new access key.
But a deactivated key can be used to perform other IAM actions. When notyet detects its access key has been deactivated, it:
1. Creates a temporary role assumable by the user
2. Attaches an inline policy to the role granting full admin access
3. Assumes the temporary role
4. Creates a new user
5. Attaches an inline policy to the user granting full admin access
6. Creates an access key on the user
7. Swaps back to using the access key credentials
Only steps 1-3 need to occur within the eventual consistency propagation window (since those are the only steps completed using revoked credentials), and the new access key is not directly created by the deactivated one.
notyet handles access key deletion and user deletion the same way as it does key deactivation, as each of those containment actions must be preceded by deactivating any access keys, which in turn triggers this credential rotation.
Role Deletion [Ineffective]
notyet handles role deletion in a similar fashion to access key deactivation. When it detects its role has been deleted, it creates a new one with identical permissions and assumes it.
This is actually rather cumbersome to do using the CLI, as aws iam delete-role requires that all policies be removed from the role prior to deletion. Instead, we can delete the role in the console, which has no such requirement.
SSM Runbooks [Ineffective]
There are a handful of SSM runbooks that are relevant to IAM management, namely AWSSupport-ContainIAMPrincipal and AWS-DeleteIAMInlinePolicy.
We can execute these automation runbooks, but they effectively perform the same steps we were taking manually in the previous tests. notyet is able to remediate these actions in the same ways we’ve seen before.
Session policies [Inapplicable]
Session policies are only applied when the session is created; there is no way to retroactively apply them to an existing role session. Because role assumptions after starting notyet are controlled by the tool itself, there is no way to use session policies to prevent the persistence mechanism.
It is possible to prevent notyet from starting at all by applying a session policy that restricts IAM actions to the role session provided to the tool. But this defeats the purpose of using the tool, where the point is to try to stop the persistence loop of an identity that already has the requisite permissions.
Trust relationship manipulation [Inapplicable]
Role sessions eventually expire, so notyet needed a mechanism to refresh its creds when configured with temporary role credentials. Had role session refreshes occurred by re-assuming the same compromised role, then the assume-role policy of the role would be important. By manipulating the policy, we could potentially prevent the session from being refreshed.
However, when notyet needs a new session, it instead creates a new role with identical permissions and an assume-role policy it controls. By assuming that role, it gains access to a brand new session with equivalent permissions.
Service Control Policies [Effective]
As documented by OFFENSAI, Service Control Policies (SCPs) are an effective mechanism for removing permissions from an identity, as even identities with * privileges in member accounts cannot undo those policy attachments. In the article, Eduard laid out a framework for creating a quarantine policy that cannot be detached by any identity but a defined Incident Response role.
This approach is the ideal for creating an incident response playbook that accounts for eventual consistency, but for the purpose of containing a single identity on an ad-hoc basis I took a simpler approach, using the following policy template:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "QuarantineNotYetIdentity",
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "ArnEquals": {
          "aws:PrincipalArn": "<CompromisedPrincipalArn>"
        }
      }
    }
  ]
}
By attaching this to the org root or the specific account the identity belongs to, the identity’s privileges are revoked.
There are a few caveats to using this approach. For one, SCPs exist only in AWS Organizations; if you’re using a standalone account, this policy type won’t be available. Secondly, SCPs don’t apply to identities in the org management account. That said, if an administrative identity in your AWS org management account is compromised, you have much bigger problems than eventual consistency-based persistence…
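Filling in the template and attaching it can be scripted. The sketch below assumes `org` is a boto3 Organizations client created with credentials that are allowed to manage SCPs (typically in the org management account); the function names and the default policy name are hypothetical.

```python
import json

def build_quarantine_scp(principal_arn):
    """Return the quarantine SCP from the template above as a JSON string."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "QuarantineNotYetIdentity",
            "Effect": "Deny",
            "Action": "*",
            "Resource": "*",
            "Condition": {"ArnEquals": {"aws:PrincipalArn": principal_arn}},
        }],
    }, indent=2)

def apply_quarantine_scp(org, principal_arn, target_id,
                         name="QuarantineNotYetIdentity"):
    """Create the SCP and attach it to a root, OU, or account ID.

    `org` is assumed to be a boto3 Organizations client. Returns the ID of
    the newly created policy.
    """
    created = org.create_policy(
        Name=name,
        Description="Ad-hoc quarantine of a single compromised principal",
        Type="SERVICE_CONTROL_POLICY",
        Content=build_quarantine_scp(principal_arn),
    )
    policy_id = created["Policy"]["PolicySummary"]["Id"]
    org.attach_policy(PolicyId=policy_id, TargetId=target_id)
    return policy_id
```

Here `target_id` would be the org root ID or the ID of the account that owns the compromised identity. Because the attachment lives in AWS Organizations rather than in the member account, the quarantined identity has no IAM call available to reverse it.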
Sonrai Security’s Cloud Permissions Firewall [Effective]
Given we at Sonrai Security produce our own cloud security platform targeting IAM specifically, I couldn’t help but test our own platform against notyet.
Sonrai Security’s Cloud Permissions Firewall is a SaaS platform that assists in enforcing least privilege without breaking existing developer workflows. One of its features is the ability to quarantine zombie identities – identities that are unused, but still have standing privilege. Identities are quarantined using the Firewall via a combination of SCPs and Attribute-Based Access Control (ABAC).
The Firewall’s UI features for quarantining identities do mainly target those unused zombie identities. But using the advanced console, we can quarantine arbitrary identities, including the one leveraged by notyet.
The Cloud Permissions Firewall is effective in preventing eventual consistency-based persistence primarily because it relies on Service Control Policies to enforce permission denial. As such, the same caveats that apply to SCP-based remediation apply here too.
Timing-Based Tactics [Effective]
In notyet’s release article, Eduard notes that “if the attacker automates [abusing the propagation window], the defender is playing whack-a-mole against a script that reacts faster than they can click.” It’s a truly apt description, given that manual containment actions can be easily caught and reversed atomically.
But notyet’s detection and response loop does still consist of a series of API calls. By wrapping some of our remediation efforts into a script that can run fast enough to interrupt the persistence actions as they’re occurring, we can sometimes trip up the tool. In a reversal of the norm, here it’s the blue team that only needs to be lucky once.
For example, by repeatedly disabling access keys for the initial compromised user or any recently created users, we can attempt to interfere with the credential rotation process. It doesn’t work on the first try, but it only needs to work once. In this case, we were able to disable the access key of the new user early enough that the propagation window passed before notyet could successfully kick off another credential rotation.
Similar approaches succeed in preventing role session refreshes – deleting the new role before it can be used. You can even spam delete-user-policy or delete-role-policy requests until you get rate limited in an attempt to trick the system into thinking it was unsuccessful in re-adding the inline policy.
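A rough sketch of the key-disabling variant of this tactic is shown below. It is not the script used in the testing above; it assumes `iam` is a boto3 IAM client, that `created_after` is a timezone-aware datetime marking the start of the incident, and it only reads the first page of `list_users`. Note that it will also deactivate keys on any legitimate user created after the cutoff.

```python
import time

def deactivate_keys_repeatedly(iam, compromised_user, created_after,
                               rounds=20, delay_seconds=0.25,
                               sleep=time.sleep):
    """Repeatedly deactivate keys on the compromised user and on any user
    created after `created_after`. Returns (user, key_id) pairs acted on.
    """
    deactivated = []
    for _ in range(rounds):
        users = iam.list_users()["Users"]
        targets = [
            u["UserName"] for u in users
            if u["UserName"] == compromised_user
            or u["CreateDate"] >= created_after
        ]
        for user_name in targets:
            keys = iam.list_access_keys(UserName=user_name)["AccessKeyMetadata"]
            for key in keys:
                if key["Status"] == "Active":
                    iam.update_access_key(
                        UserName=user_name,
                        AccessKeyId=key["AccessKeyId"],
                        Status="Inactive",
                    )
                    deactivated.append((user_name, key["AccessKeyId"]))
        sleep(delay_seconds)  # brief pause between rounds to limit throttling
    return deactivated
```

The round count and delay are arbitrary illustration values; in practice they would need tuning against API rate limits.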
However, these timing-based tactics all inherently target the tool rather than the core risk itself. These succeed in stopping notyet, but can’t be guaranteed to work against another eventual consistency-based automated persistence tool. They can be useful if all else fails, but SCP-based approaches really are the best way to target the phenomenon rather than the specific implementation.
Conclusion
Having now tested the most recent version of notyet against nearly a dozen distinct mechanisms for containing access and seen it persist against nearly all of them, I have to say its resiliency is astounding. None of the standard AWS-recommended ways to contain an identity worked against it.
While scripts that targeted the timing of the tool could be effective with some regularity, the real takeaway from this exercise is that anyone worried about eventual consistency-based attacker techniques needs to be using AWS Organizations and SCP-based controls for quarantining identities.
But don’t just take my word for it – go try out notyet for yourself. Iterate on the techniques detailed above, or try out containment methods that I missed. Test out your own IR playbooks – and make sure you’re ready when we start to see this attacker behaviour in the real world.
*** This is a Security Bloggers Network syndicated blog from Sonrai | Enterprise Cloud Security Platform authored by Nigel Sood. Read the original post at: https://sonraisecurity.com/blog/fighting-eventual-consistency-based-persistence-an-analysis-of-notyet/
