How AWS Client VPN SAML Federation Actually Works (And Where It Will Quietly Destroy You)
The silent failures, routing gotchas, and Terraform pain you'll hit setting up federated authentication for AWS Client VPN, and how to get through them.
There’s a version of this story where setting up federated authentication for AWS Client VPN is clean and elegant.
You configure a SAML application in IAM Identity Center, point the VPN endpoint at it, and your engineers authenticate through an SSO browser flow instead of managing certificate files.
It’s genuinely a better experience than the alternative.
The reality of getting there, though, involves a series of silent failures, configuration gotchas that aren’t documented anywhere obvious, and at least one Terraform apply that will just... hang.
I’ve been through it, and I want to save you the same debugging spiral.
The traditional approach to Client VPN authentication is mutual TLS, where every user needs a client certificate signed by your VPN CA. That means generating certs, distributing them securely, revoking them when someone leaves the organization, and maintaining the PKI infrastructure around all of it.
At small scale this is manageable.
At enterprise scale it becomes a whole operational discipline.
Someone has to own the certificate lifecycle, someone has to handle the rotation story, and someone inevitably loses their cert file or locks themselves out at 10pm before a production incident.
Federated authentication via SAML 2.0 solves the distribution problem by replacing that per-user cert with your existing identity provider. If the user exists in IAM Identity Center and is assigned to the VPN application, they can authenticate. When they leave, you deprovision them in one place and they lose access everywhere, including the VPN.
Understanding why this matters requires thinking about what a VPN endpoint actually is in your AWS architecture. It’s not just a secure tunnel into your VPC. It’s a network boundary control, a layer of access management that sits in front of resources you’ve decided shouldn’t be publicly addressable. When that boundary is guarded by distributed certificates you don’t have full visibility into, you have a security posture that’s hard to audit and hard to respond to. Federated auth gives you the authentication event in CloudWatch, the SAML assertion in your identity provider logs, and a single control plane for access management. That’s a meaningful improvement, and it’s worth the setup complexity.
There’s also the audit story.
Security and compliance teams increasingly want to answer the question “who had network access to these resources and when?”
With certificate-based VPN auth, that question requires correlating cert serial numbers with user identities, assuming you’re even logging cert-level connection events.
With SAML federation through IAM Identity Center, every successful VPN connection shows up in CloudWatch with the user’s SSO identity attached, provided connection logging is enabled on the endpoint. Your audit trail becomes a real-time log of named human beings, not certificate thumbprints. For organizations operating under any meaningful compliance framework, this is not a minor quality-of-life improvement.
The Architecture Before You Touch Anything
Before writing a single line of Terraform, it helps to understand what you’re actually configuring and why each piece exists independently.
AWS Client VPN still requires a server-side certificate even when you switch to federated client authentication.
These two things are separate concerns.
The server certificate lives in ACM and proves to the connecting client that it’s talking to a legitimate VPN endpoint, not an impostor. The client authentication mechanism is how the VPN endpoint verifies who the user is. You can replace the client cert with SAML, but the server cert is non-negotiable. Every TLS-terminating AWS service works this way, and Client VPN is no different.
In Terraform, you generate this server cert using the TLS provider and import it into ACM. A minimal version looks like creating a tls_private_key, a tls_self_signed_cert, and an aws_acm_certificate resource that imports both. The cert is just for server identification, so a self-signed cert from your own CA is perfectly fine here.
One thing I learned from this build: make the certificate’s validity period long.
Take validity_period_hours = 17520.
That’s two years, and it sounds like plenty of time until it isn’t.
Use 43800 for five years, and immediately put a calendar reminder at the four-year mark.
I’ll explain what happens when this expires in a moment, and it’s unpleasant enough that the five-minute effort of setting a reminder is worth it.
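The arithmetic behind those two numbers is easy to check. Here’s a minimal sketch, assuming 365-day years (leap days ignored, which is how 17520 and 43800 come out) and a hypothetical issue date. The helper names are mine, not part of any Terraform or AWS API.

```python
from datetime import datetime, timedelta, timezone

HOURS_PER_YEAR = 365 * 24  # 8760; leap days are ignored on purpose


def validity_hours(years: int) -> int:
    """Convert whole years into a validity_period_hours value."""
    return years * HOURS_PER_YEAR


def expiry_and_reminder(issued_at: datetime, validity_period_hours: int,
                        lead_hours: int = HOURS_PER_YEAR):
    """Return (expiry, reminder); the reminder fires lead_hours before expiry."""
    expires_at = issued_at + timedelta(hours=validity_period_hours)
    return expires_at, expires_at - timedelta(hours=lead_hours)


if __name__ == "__main__":
    issued = datetime(2026, 3, 1, tzinfo=timezone.utc)  # hypothetical issue date
    expires, remind = expiry_and_reminder(issued, validity_hours(5))
    print(validity_hours(2), validity_hours(5))
    print("expires:", expires.date(), "remind:", remind.date())
```

Because the hour count ignores leap days, a “five-year” cert expires a day or two before the calendar anniversary, which is one more reason to set the reminder early.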
On the IAM Identity Center side, you’re creating what AWS calls a custom SAML 2.0 application. Not an AWS-managed integration, a custom one. You configure the ACS URL as:
http://127.0.0.1:35001
and the audience URI as:
urn:amazon:webservices:clientvpn
That localhost ACS URL is not a typo. The AWS VPN Client application on the user’s machine opens a local HTTP listener on that port to receive the SAML assertion after the browser-based authentication completes.
It’s an interesting design choice: the browser handles the IdP auth flow, and then the assertion is posted back to localhost where the VPN client picks it up and passes it to the endpoint. Users who’ve never seen this before sometimes ask if the URL is wrong. It isn’t.
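To make that handoff concrete: in the standard SAML HTTP-POST binding, the browser submits a form whose SAMLResponse field holds the base64-encoded response XML. Below is a minimal sketch of decoding such a form body. This is my illustration of the binding, not the AWS VPN Client’s actual implementation, and the function name and placeholder XML are hypothetical.

```python
import base64
from urllib.parse import parse_qs, urlencode


def extract_saml_response(form_body: bytes) -> str:
    """Decode the SAMLResponse field from a URL-encoded HTTP POST body."""
    fields = parse_qs(form_body.decode("ascii"))
    if "SAMLResponse" not in fields:
        raise ValueError("form body has no SAMLResponse field")
    return base64.b64decode(fields["SAMLResponse"][0]).decode("utf-8")


if __name__ == "__main__":
    # Simulate what a browser would post to the local listener.
    xml = "<Response>placeholder</Response>"
    encoded = base64.b64encode(xml.encode("utf-8")).decode("ascii")
    body = urlencode({"SAMLResponse": encoded}).encode("ascii")
    print(extract_saml_response(body))
```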
The Silent Failure That Will Ruin Your Afternoon
Here is the single most important thing in this entire article, and it isn’t documented well enough:
AWS Client VPN requires at least one attribute in the SAML AttributeStatement beyond the Subject NameID.
If your SAML assertion doesn’t include any additional attributes, the authentication will fail with a message that says “credentials received were incorrect” and the username in your CloudWatch logs will show as N/A.
Not a helpful error.
Not a pointer toward the actual problem.
Just a silent rejection that looks like a misconfiguration on the client side or a cert issue when it’s actually an IdP configuration issue.
The fix is adding an attribute mapping in your IAM Identity Center application configuration.
Go into the attribute mappings for your custom SAML app and add an entry that maps email to ${user:email}. This populates the AttributeStatement in the SAML assertion, which is what the VPN endpoint is checking for. Without this mapping, IAM Identity Center’s default behavior generates an assertion with only the Subject NameID, which isn’t enough.
I spent a significant chunk of time on this before I found it.
The CloudWatch log group for your VPN endpoint, which you should absolutely enable during setup, will show you every connection attempt including the SAML username and the failure reason. When you see username: N/A paired with “credentials received were incorrect,” look at your SAML assertion before you look anywhere else.
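If you export those log events, a few lines of Python can flag the pattern for you. This is a rough sketch that assumes each event is already parsed into a dict. The key names and the "failed" status value are my assumptions about the connection log format, so check them against your own events before relying on this.

```python
# Assumed key names and values; verify against your own connection log events.
STATUS_KEY = "connection-attempt-status"
USERNAME_KEY = "username"


def missing_attribute_suspects(events):
    """Return failed connection attempts whose username was logged as N/A."""
    suspects = []
    for event in events:
        failed = event.get(STATUS_KEY) == "failed"
        if failed and event.get(USERNAME_KEY) == "N/A":
            suspects.append(event)
    return suspects


if __name__ == "__main__":
    sample = [  # hypothetical events for illustration only
        {STATUS_KEY: "failed", USERNAME_KEY: "N/A"},
        {STATUS_KEY: "successful", USERNAME_KEY: "user@example.com"},
    ]
    print(len(missing_attribute_suspects(sample)), "suspect attempt(s)")
```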
IAM Identity Center has a built-in SAML debugger that’s genuinely useful here: hold Shift and click the application tile in the access portal, and you’ll get the raw SAML assertion XML. You can inspect exactly what attributes are being sent. If you don’t see an AttributeStatement with at least one attribute, add the email mapping and test again.
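If you’d rather not eyeball the XML, the same check can be scripted. Here’s a minimal sketch using only the standard library, assuming you’ve saved the assertion XML as a string. The function names are mine, and the sample assertion is a stripped-down, hypothetical one.

```python
import xml.etree.ElementTree as ET

# Standard SAML 2.0 assertion namespace.
SAML_NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}


def attribute_names(saml_xml: str):
    """List the Name of every Attribute inside any AttributeStatement."""
    root = ET.fromstring(saml_xml)
    attrs = root.findall(".//saml:AttributeStatement/saml:Attribute", SAML_NS)
    return [attr.get("Name", "") for attr in attrs]


def has_attribute_statement(saml_xml: str) -> bool:
    """True when the assertion carries at least one attribute."""
    return len(attribute_names(saml_xml)) > 0


if __name__ == "__main__":
    sample = (  # hypothetical, heavily trimmed assertion
        '<saml2:Assertion xmlns:saml2="urn:oasis:names:tc:SAML:2.0:assertion">'
        "<saml2:Subject><saml2:NameID>user@example.com</saml2:NameID></saml2:Subject>"
        "</saml2:Assertion>"
    )
    print("has attributes:", has_attribute_statement(sample))
```

A result of False on your real assertion is the cue to add the email mapping.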
For DevOps engineers, this is the kind of thing that’s important to document internally the moment you discover it. The error message provides no signal toward the actual fix.
New team members will hit this.
Contractors will hit this.
Anyone who sets up a new Client VPN endpoint without prior context will hit this.
Write it down in your runbook.
What Split Tunnel Mode Actually Means for Routing
Client VPN endpoints support split tunnel mode, which routes only the traffic you explicitly configure through the VPN rather than sending all of the user’s traffic through it.
This is the operationally sensible choice for most enterprise setups: you don’t want your engineers’ Slack traffic and Spotify streams going through your AWS network. But split tunnel mode has a consequence that’s easy to overlook when you’re first setting it up.
Every CIDR you want reachable through the VPN has to be explicitly added as a route on the Client VPN endpoint.
This is not automatic.
If you have a VPC at 10.0.0.0/16 and you want clients to reach it, you add a route for 10.0.0.0/16.
If you have another VPC at 172.31.0.0/16, you add that too.
If a destination isn’t covered by a route in the endpoint’s route table, traffic to it silently goes out over the user’s local internet connection instead of the tunnel.
There’s no error.
The traffic just doesn’t go where you expected it to, which in practice means someone on the VPN tries to reach a private database endpoint, gets a connection timeout, and opens a ticket saying the VPN is broken.
It isn’t broken.
You just haven’t told it where to send that traffic.
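A quick way to answer “should this address even be going through the tunnel?” is to test it against the route list. This is a minimal sketch with Python’s ipaddress module. The function name is hypothetical, and the CIDRs are the example ranges from above.

```python
import ipaddress


def covering_route(destination_ip: str, route_cidrs):
    """Return the most specific route CIDR containing the IP, or None."""
    ip = ipaddress.ip_address(destination_ip)
    matches = []
    for cidr in route_cidrs:
        network = ipaddress.ip_network(cidr)
        if ip in network:
            matches.append(network)
    if not matches:
        return None  # split tunnel: this traffic leaves via the local network
    return str(max(matches, key=lambda network: network.prefixlen))


if __name__ == "__main__":
    routes = ["10.0.0.0/16", "172.31.0.0/16"]
    for ip in ("10.0.5.20", "10.1.0.5"):
        print(ip, "->", covering_route(ip, routes))
```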
This becomes particularly relevant when you’re doing cross-account access, which is common in multi-account AWS architectures. Your VPN endpoint might live in one account, but the engineers connecting to it need to reach resources in three other accounts. That requires VPC peering or Transit Gateway between the accounts, route table entries on both sides of each peering connection, and a Client VPN route for each destination CIDR. The VPN client’s CIDR needs to be routed in the destination account so return traffic knows how to get back, and the destination CIDRs need to be in the VPN endpoint’s route table so outbound traffic is directed through the tunnel.
In practice this means onboarding a new account into your VPN isn’t just “create the peering.”
It’s a checklist.
✔ Peer the VPCs
✔ Update the route tables in both accounts
✔ Add the destination CIDR to the Client VPN route table
✔ Verify connectivity
Missing any one of these produces the same symptom: connection timeout, no obvious error message.
Treat this as a runbook item with explicit steps and you’ll save a lot of troubleshooting time.
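Here’s one way that runbook check could look in code. It’s a sketch under assumptions: you’ve already gathered the relevant CIDR lists from each route table (how you fetch them is up to you), the function and parameter names are hypothetical, and return_cidr is whatever range return traffic needs to reach.

```python
import ipaddress


def _covered(cidr: str, routes) -> bool:
    """True when some route in the list contains the whole CIDR."""
    target = ipaddress.ip_network(cidr)
    return any(target.subnet_of(ipaddress.ip_network(route)) for route in routes)


def onboarding_gaps(destination_cidr, return_cidr, vpn_vpc_routes,
                    destination_vpc_routes, client_vpn_routes):
    """Return a list of missing routing steps; an empty list means none found."""
    gaps = []
    if not _covered(destination_cidr, vpn_vpc_routes):
        gaps.append("VPN-side route table has no route to the destination CIDR")
    if not _covered(return_cidr, destination_vpc_routes):
        gaps.append("destination route table has no return route")
    if not _covered(destination_cidr, client_vpn_routes):
        gaps.append("Client VPN route table is missing the destination CIDR")
    return gaps


if __name__ == "__main__":
    # Hypothetical CIDRs for illustration only.
    for gap in onboarding_gaps("10.20.0.0/16", "10.100.0.0/22",
                               ["10.20.0.0/16"], ["10.100.0.0/22"],
                               ["10.0.0.0/16"]):
        print("MISSING:", gap)
```

An empty result doesn’t replace the final connectivity test; it only tells you the routing entries this sketch knows about are in place.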
The Certificate Expiry Story
I mentioned setting a five-year validity period on your server cert and calendaring a reminder.
Here’s why that’s not optional.
When the server cert expires, clients connecting to the VPN get a TLS handshake error. This is expected behavior, but it happens to everyone simultaneously, which means your VPN becomes completely unavailable for your whole engineering team at the same time.
That’s bad enough. The renewal process is what makes it worse.
Renewing the cert with Terraform requires tainting/destroying the PKI resources so they get recreated. The problem is that Terraform’s dependency graph, when you destroy and recreate these resources, will try to delete the old ACM certificate before the new one is attached to the VPN endpoint. While the endpoint still references the old cert, the deletion will fail or block. In practice, the apply hangs while Terraform waits to delete a certificate it can’t delete because it’s still in use.
The solution? You end up having to carefully manage the apply order, potentially using -target flags to create the new cert and update the endpoint reference before running the full apply, or restructuring your Terraform to handle the lifecycle overlap.
There’s also a user-facing piece to certificate renewal that’s not obvious until you’ve done it.
The .ovpn configuration file that users download from the console embeds the CA certificate from the server cert. When you renew the cert and rotate the CA, the embedded CA in every user’s existing .ovpn file is now wrong. They have to re-download the config file from the console and re-import it in the AWS VPN Client. Until they do, they’ll get TLS handshake errors even after the new cert is valid and the endpoint is healthy.
This needs to be communicated clearly when you do a cert rotation, with explicit instructions for users.
The five-year validity period and the calendar reminder are your primary defense against this being a surprise. The secondary defense is keeping this entire rotation procedure in a runbook so that when it does happen, you’re not figuring it out under pressure.
The User Experience of Federated Auth
One of the genuine wins of this setup is what the authentication experience looks like for users after it’s working.
The .ovpn config file, when you download it from the console after configuring SAML auth, includes auth-federate and auth-retry interact flags. These tell the AWS VPN Client to open a browser window for authentication rather than prompting for a certificate or username/password. The user clicks Connect, a browser window opens to the IAM Identity Center portal, they authenticate with their SSO credentials (including MFA if you’ve configured it), and the browser posts the SAML assertion back to the local listener on port 35001. The VPN connects. The whole flow takes about ten seconds once you’re used to it.
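When a user reports that the client is asking for a certificate instead of opening a browser, it’s worth confirming their config actually carries those two directives. This is a minimal sketch, assuming the file contents are already loaded as text. The function name and the sample config are hypothetical.

```python
REQUIRED_DIRECTIVES = ("auth-federate", "auth-retry interact")


def federated_directives(ovpn_text: str) -> dict:
    """Report which federated-auth directives appear as active config lines."""
    active = set()
    for raw_line in ovpn_text.splitlines():
        line = " ".join(raw_line.split())  # normalize internal whitespace
        if line and not line.startswith(("#", ";")):  # skip comment lines
            active.add(line)
    return {name: name in active for name in REQUIRED_DIRECTIVES}


if __name__ == "__main__":
    sample = "client\ndev tun\nauth-federate\nauth-retry interact\n"
    print(federated_directives(sample))
```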
Compare this to the certificate-based flow:
⛔︎ The user needs a cert file.
⛔︎ They need to know where to put it.
⛔︎ They need to not lose it.
⛔︎ When it expires or gets revoked, they need a new one distributed to them.
The federated flow removes all of that. Onboarding a new engineer means assigning them to the VPN application in IAM Identity Center, either directly or through a group. That’s it. The first time they download the config and connect, the browser handles the rest.
The access control model also gets cleaner.
You can use IAM Identity Center groups to manage VPN access, which means your group membership drives VPN access and you have one place to review who has connectivity.
When someone leaves, deprovisioning their IAM Identity Center account removes their VPN access automatically on the next authentication attempt.
Worth noting: users who are not assigned to the VPN SAML application get an “access denied” error in the browser during the auth flow, not at the VPN client level.
That error is clear enough that they can self-diagnose, which reduces support load.
Honest Assessment of the Operational Overhead
This is a better setup than certificate-based auth for most organizations, but it’s not without ongoing complexity.
The routing table maintenance is the biggest recurring operational concern. Every new VPC, every new account, every new CIDR range is a potential routing gap that produces silent connectivity failures. Establishing a clear process for updating Client VPN routes when you expand your infrastructure is worth doing proactively rather than reactively.
The Terraform lifecycle for cert rotation is genuinely awkward and I don’t have a clean solution for it. I’ve seen people work around it with lifecycle blocks that set create_before_destroy, but the attachment constraint means you have to be careful about ordering. This is an area where the tooling hasn’t fully caught up with the operational pattern. If you’re managing this in a team, make sure the rotation procedure is documented by someone who’s actually done it, not someone who’s just read the Terraform docs.
What I still don’t know: whether there are meaningful performance differences between SAML auth and cert auth for connection setup latency.
The browser round trip and the SAML assertion processing presumably add some latency relative to mutual TLS cert verification, but I haven’t benchmarked this in a way I’d trust for general guidance.
For typical VPN use cases this almost certainly doesn’t matter, but I’d want to verify before building something latency-sensitive on top of the federated auth flow.
What’s your experience using AWS Client VPN and AWS IAM Identity Center to manage access to your organization?
I’d love to hear about it in the comments.
With Love and DevOps,
Maxine
Last Updated: March 2026
If this kind of infrastructure depth is useful to you, this is exactly what I cover in my learning library.
My DevOps Career Switch Blueprint covers the foundational AWS and IaC patterns that make setups like this make sense, and my Docker Essentials course is a free place to start if you’re building toward cloud infrastructure work from scratch.
I Love DevOps is where I write about production systems the way engineers actually encounter them, not the way the documentation presents them.
Come find me there if you want more of this. 🩷





