Migrating from RSA to ML-KEM: A Practical Guide

RSA has been the backbone of public-key cryptography for nearly five decades. Introduced in 1977 by Rivest, Shamir, and Adleman, it secures everything from TLS connections to email encryption to code signing. But RSA's security rests entirely on the difficulty of factoring large integers -- a problem that Shor's algorithm solves in polynomial time on a quantum computer. Migrating from RSA to ML-KEM (FIPS 203) is not optional. It is a matter of when, not if. This guide covers the technical details, practical deployment strategies, and common pitfalls of executing that migration.

Understanding the Vulnerability

RSA-2048, the most commonly deployed variant, provides approximately 112 bits of classical security. A classical computer would need approximately 2^112 operations to factor a 2048-bit RSA modulus -- an astronomical number that would take on the order of a hundred million years even at a sustained 10^18 operations per second. From a classical perspective, RSA-2048 is secure.

Shor's algorithm changes this calculus entirely. Published by Peter Shor in 1994, the algorithm factors integers in polynomial time on a quantum computer. Specifically, factoring an n-bit integer requires O(n^3) quantum gates and O(n) logical qubits. For RSA-2048, current estimates suggest approximately 4,000-10,000 logical qubits would be sufficient, depending on the error correction scheme used.
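To get a feel for the cubic scaling, the sketch below evaluates n^3 for a few modulus sizes. It deliberately ignores the constant factors and error correction overhead hidden by the O(n^3) notation, so the numbers are relative growth figures, not gate-count estimates. The function name is illustrative.

```go
package main

import "fmt"

// cubicGateScale returns n^3, the asymptotic growth term quoted above for
// Shor's algorithm. Constant factors are intentionally left out.
func cubicGateScale(modulusBits uint64) uint64 {
	return modulusBits * modulusBits * modulusBits
}

func main() {
	base := cubicGateScale(1024)
	for _, n := range []uint64{1024, 2048, 4096} {
		g := cubicGateScale(n)
		fmt.Printf("n=%d bits: n^3=%d (%dx the 1024-bit figure)\n", n, g, g/base)
	}
}
```

Under this scaling, doubling the RSA key size multiplies the quantum attacker's work by only about eight, which is why moving to larger RSA keys is not a viable defense.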

The estimated resources required have been steadily decreasing as quantum computing researchers find optimizations. In 2012, the estimate was roughly one billion physical qubits. By 2023, optimized implementations had reduced this to approximately 20 million physical qubits with surface code error correction, encoding only a few thousand logical qubits. The gap between physical and logical qubits depends on error rates and error correction overhead, both of which continue to improve.

Critically, this is not a theoretical concern limited to the future. The Harvest Now, Decrypt Later (HNDL) threat means that RSA-encrypted data being transmitted today can be recorded and stored by adversaries. Every TLS handshake using RSA key exchange, every RSA-encrypted email, and every RSA-signed document is a candidate for future decryption. The cost of storage is negligible compared to the potential intelligence value.

ML-KEM: The Replacement

ML-KEM (Module-Lattice-Based Key-Encapsulation Mechanism), standardized as FIPS 203, is the NIST-approved replacement for RSA and ECDH key exchange. It is based on the Module Learning With Errors (MLWE) problem, a lattice problem with no known efficient classical or quantum attack.

The key differences between RSA key exchange and ML-KEM:

Property	RSA-2048	ML-KEM-768
Security basis	Integer factoring	Module-LWE
Quantum safe	No	Yes
Public key size	256 bytes	1,184 bytes
Ciphertext size	256 bytes	1,088 bytes
Encapsulation time	~0.5 ms	~0.05 ms
Decapsulation time	~10 ms	~0.05 ms
NIST security level	Below Level 1 (~112-bit)	Level 3
Standard	NIST SP 800-56B (key transport)	FIPS 203

Several things stand out from this comparison. ML-KEM public keys and ciphertexts are larger than RSA -- approximately 4-5x larger. This has implications for protocols with strict message size limits. However, ML-KEM's performance characteristics are dramatically better than RSA. Key generation and encapsulation/decapsulation are approximately 10-200x faster than RSA key generation and decryption. In practice, the transition to ML-KEM often improves TLS handshake latency even with the larger key sizes.

It is also important to understand the conceptual difference between RSA key exchange and ML-KEM. RSA key transport works by encrypting a premaster secret with the server's RSA public key -- the client generates a random value, encrypts it, and sends it to the server. ML-KEM is a Key Encapsulation Mechanism (KEM), which means the encapsulation function generates both the ciphertext and the shared secret simultaneously. The server decapsulates to recover the same shared secret. This is a cleaner cryptographic construction with better security properties.
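The encapsulate/decapsulate flow described above can be sketched with Go's standard library. This assumes Go 1.24 or later, where the crypto/mlkem package is available; the roundTrip helper is a name invented for this example, and the client and server are simulated in one process rather than exchanging bytes over a network.

```go
package main

import (
	"bytes"
	"crypto/mlkem"
	"fmt"
	"log"
)

// roundTrip runs one ML-KEM-768 exchange and reports the public key size,
// the ciphertext size, and whether both sides derived the same secret.
func roundTrip() (pubLen, ctLen int, match bool, err error) {
	// Server: generate a decapsulation (private) key and publish the
	// encoded encapsulation (public) key.
	dk, err := mlkem.GenerateKey768()
	if err != nil {
		return 0, 0, false, err
	}
	publicKey := dk.EncapsulationKey().Bytes()

	// Client: parse the public key. A single Encapsulate call yields both
	// the shared secret and the ciphertext to send back.
	ek, err := mlkem.NewEncapsulationKey768(publicKey)
	if err != nil {
		return 0, 0, false, err
	}
	clientSecret, ciphertext := ek.Encapsulate()

	// Server: recover the same shared secret from the ciphertext.
	serverSecret, err := dk.Decapsulate(ciphertext)
	if err != nil {
		return 0, 0, false, err
	}
	return len(publicKey), len(ciphertext), bytes.Equal(clientSecret, serverSecret), nil
}

func main() {
	pubLen, ctLen, match, err := roundTrip()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("public key bytes:", pubLen)
	fmt.Println("ciphertext bytes:", ctLen)
	fmt.Println("secrets match:", match)
}
```

Note that the client never chooses the secret itself, unlike RSA key transport where the client picks the premaster secret and encrypts it.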

The Hybrid Approach

The recommended migration strategy is to deploy ML-KEM in hybrid mode -- combining a classical algorithm (like X25519 or ECDH-P256) with ML-KEM so that the connection remains secure even if one of the two algorithms is broken. This provides defense in depth during the transition period.

Why hybrid mode matters:

Protection against quantum attacks. The ML-KEM component provides quantum resistance that the classical component lacks.
Protection against lattice breakthroughs. The classical component provides security even if a new attack is discovered against MLWE, much as a classical attack unexpectedly broke the isogeny-based candidate SIKE in 2022.
Regulatory compliance. Some regulatory frameworks still require classical algorithms. Hybrid mode satisfies both PQC and classical requirements simultaneously.
Backward compatibility. If one party does not support ML-KEM, the handshake can fall back to classical-only key exchange.

In TLS 1.3, hybrid key exchange is implemented by concatenating the key shares. The client offers an X25519 public key and an ML-KEM-768 encapsulation key, concatenated into a single key share, in the ClientHello. The server responds in the ServerHello with its own X25519 public key and an ML-KEM ciphertext encapsulated to the client's key. The shared secret is derived from the concatenation of both shared secrets, ensuring that an attacker would need to break both algorithms to compromise the session.
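A minimal sketch of the secret combination, assuming Go 1.24+ for crypto/mlkem. The helper names are invented for this example, both parties run in one process, and key objects are passed directly instead of being serialized into key shares. Placing the ML-KEM secret first reflects my reading of the X25519MLKEM768 definition; check the current specification before relying on the order. In real TLS the concatenated value then feeds the key schedule, which is omitted here.

```go
package main

import (
	"bytes"
	"crypto/ecdh"
	"crypto/mlkem"
	"crypto/rand"
	"fmt"
	"log"
)

// combine concatenates the two shared secrets, ML-KEM first (assumed order).
func combine(mlkemSecret, x25519Secret []byte) []byte {
	out := make([]byte, 0, len(mlkemSecret)+len(x25519Secret))
	out = append(out, mlkemSecret...)
	return append(out, x25519Secret...)
}

// hybridExchange simulates both key exchanges and returns the combined
// secret as seen by the client and by the server.
func hybridExchange() (clientSecret, serverSecret []byte, err error) {
	// Client: one X25519 key pair and one ML-KEM-768 key pair.
	clientX, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	clientKEM, err := mlkem.GenerateKey768()
	if err != nil {
		return nil, nil, err
	}

	// Server: its own X25519 key pair, plus an encapsulation to the
	// client's ML-KEM key.
	serverX, err := ecdh.X25519().GenerateKey(rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	serverKEMSecret, ciphertext := clientKEM.EncapsulationKey().Encapsulate()
	serverXSecret, err := serverX.ECDH(clientX.PublicKey())
	if err != nil {
		return nil, nil, err
	}

	// Client: decapsulate the ciphertext and complete the X25519 exchange.
	clientKEMSecret, err := clientKEM.Decapsulate(ciphertext)
	if err != nil {
		return nil, nil, err
	}
	clientXSecret, err := clientX.ECDH(serverX.PublicKey())
	if err != nil {
		return nil, nil, err
	}

	return combine(clientKEMSecret, clientXSecret), combine(serverKEMSecret, serverXSecret), nil
}

func main() {
	c, s, err := hybridExchange()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("combined secret bytes:", len(c))
	fmt.Println("both sides agree:", bytes.Equal(c, s))
}
```

An attacker who recovers only the X25519 half (for example with a quantum computer) still lacks the ML-KEM half of the combined input, and vice versa.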

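The IETF TLS working group specifies this approach in draft-ietf-tls-hybrid-design, and draft-ietf-tls-ecdhe-mlkem defines the X25519MLKEM768 named group for TLS 1.3. This is the recommended configuration for most deployments. For IPsec, the analogous mechanism is RFC 9370, which adds multiple key exchanges to IKEv2.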

Step-by-Step Migration

Step 1: Build Your Cryptographic Inventory

Before changing anything, you need to know exactly what you are running. Identify every system using RSA for key exchange or key transport. This includes:

TLS server configurations: Web servers, API gateways, load balancers, reverse proxies. Check the cipher suites and key exchange groups configured on each.
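VPN endpoints: IPsec (IKEv2) configurations often specify key exchange algorithms explicitly. WireGuard has no algorithm negotiation and always uses Curve25519, which is also quantum-vulnerable, so inventory those tunnels as well.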
SSH servers: Check KexAlgorithms and HostKeyAlgorithms in sshd_config.
Email servers: SMTP, IMAP, and POP3 servers using STARTTLS or implicit TLS.
Application-layer protocols: Any custom protocol using RSA encryption for key establishment.

Build this into a Cryptographic Bill of Materials (CBOM) so you have a comprehensive, maintainable view of your cryptographic landscape.
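As a starting point, a CBOM can be as simple as a list of records with a quantum-vulnerability flag. The sketch below is an assumption-laden illustration: the CryptoAsset fields, the sample systems, and the name-matching rules are all hypothetical and do not follow any particular CBOM schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
	"strings"
)

// CryptoAsset is a hypothetical CBOM record; field names are illustrative.
type CryptoAsset struct {
	System            string `json:"system"`
	Protocol          string `json:"protocol"`
	Algorithm         string `json:"algorithm"`
	QuantumVulnerable bool   `json:"quantum_vulnerable"`
}

// isQuantumVulnerable flags public-key algorithms that Shor's algorithm
// breaks. The matching is a simplification, not an exhaustive classifier.
func isQuantumVulnerable(algorithm string) bool {
	a := strings.ToUpper(algorithm)
	// Hybrid or pure ML-KEM groups are treated as quantum-safe.
	if strings.Contains(a, "MLKEM") || strings.Contains(a, "ML-KEM") {
		return false
	}
	for _, prefix := range []string{"RSA", "ECDH", "ECDSA", "DH", "X25519", "ED25519"} {
		if strings.HasPrefix(a, prefix) {
			return true
		}
	}
	return false
}

// newAsset builds a record and fills in the vulnerability flag.
func newAsset(system, protocol, algorithm string) CryptoAsset {
	return CryptoAsset{
		System:            system,
		Protocol:          protocol,
		Algorithm:         algorithm,
		QuantumVulnerable: isQuantumVulnerable(algorithm),
	}
}

func main() {
	// Sample entries are invented for illustration.
	inventory := []CryptoAsset{
		newAsset("api-gateway", "TLS 1.2", "RSA-2048"),
		newAsset("web-frontend", "TLS 1.3", "X25519MLKEM768"),
		newAsset("bastion", "SSH", "ecdsa-sha2-nistp256"),
	}
	out, err := json.MarshalIndent(inventory, "", "  ")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(out))
}
```

Keeping the inventory machine-readable makes it straightforward to diff over time and to count remaining quantum-vulnerable assets.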

Step 2: Prioritize by Risk

Not all RSA deployments carry the same risk. Rank systems by the sensitivity and longevity of the data they protect:

Migrate first (highest HNDL risk):

Systems handling healthcare data, financial records, or classified information
Internet-facing services (highest interception probability)
VPN concentrators protecting long-term business communications

Migrate second (moderate risk):

Internal services handling confidential but shorter-lived data
Cloud-to-cloud communication channels
CI/CD infrastructure and code signing

Migrate last (lower immediate risk):

Development and staging environments
Systems handling only public data
Ephemeral communication channels with short-lived session data
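The ranking above can be turned into a rough ordering with a simple score. The sketch below is one possible heuristic, not a standard formula: the System fields, the 1-5 sensitivity scale, and the doubling for internet-facing services are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// System describes one inventory entry. Fields and scales are hypothetical.
type System struct {
	Name           string
	Sensitivity    int // 1 (public data) to 5 (regulated or classified)
	RetentionYears int // how long the data must remain confidential
	InternetFacing bool
}

// migrationPriority scores a system: sensitivity times confidentiality
// lifetime, doubled when traffic crosses the public internet.
func migrationPriority(s System) int {
	score := s.Sensitivity * s.RetentionYears
	if s.InternetFacing {
		score *= 2
	}
	return score
}

// prioritize returns a copy of systems sorted from highest to lowest score.
func prioritize(systems []System) []System {
	ranked := append([]System(nil), systems...)
	sort.SliceStable(ranked, func(i, j int) bool {
		return migrationPriority(ranked[i]) > migrationPriority(ranked[j])
	})
	return ranked
}

func main() {
	// Sample systems are invented for illustration.
	systems := []System{
		{Name: "staging-api", Sensitivity: 1, RetentionYears: 1},
		{Name: "patient-portal", Sensitivity: 5, RetentionYears: 25, InternetFacing: true},
		{Name: "internal-wiki", Sensitivity: 3, RetentionYears: 3},
	}
	for _, s := range prioritize(systems) {
		fmt.Printf("%-15s priority=%d\n", s.Name, migrationPriority(s))
	}
}
```

Tune the weights to your own data classification policy; the point is to make the ordering explicit and repeatable rather than ad hoc.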

Step 3: Upgrade Your TLS Stack

The most impactful single action is upgrading your TLS stack to support hybrid ML-KEM key exchange. Here is how to do it for common platforms:

Nginx with OpenSSL 3.5+:

ssl_protocols TLSv1.3;
ssl_conf_command Groups X25519MLKEM768:X25519:P-256;
ssl_prefer_server_ciphers on;

Apache with OpenSSL 3.5+:

SSLProtocol all -SSLv3 -TLSv1 -TLSv1.1 -TLSv1.2
SSLOpenSSLConfCmd Groups X25519MLKEM768:X25519:P-256

Node.js (with OpenSSL 3.5+ linked):

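const https = require("https");
const fs = require("fs");

const server = https.createServer({
  // Placeholder paths: substitute your own key and certificate files
  key: fs.readFileSync("server.key"),
  cert: fs.readFileSync("server.crt"),
  // OpenSSL 3.5+ will negotiate X25519MLKEM768 automatically
  // when the client supports it
  minVersion: "TLSv1.3",
});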

Go 1.24+:

tlsConfig := &tls.Config{
    MinVersion: tls.VersionTLS13,
    // Go 1.24+ supports ML-KEM key exchange natively
    // X25519MLKEM768 is enabled by default
}

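The key point is that hybrid ML-KEM key exchange is a server-side configuration change. It is non-breaking for clients that already speak TLS 1.3: those that support ML-KEM will negotiate hybrid mode, and those that do not will fall back to classical X25519 or P-256 key exchange. Note that the examples above also restrict the server to TLS 1.3, so confirm that no TLS 1.2-only clients remain before applying that part. You can then deploy incrementally without any client-side changes.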
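If you would rather state the allowed groups explicitly in Go than rely on the defaults, a sketch along these lines should work, assuming Go 1.24+ where the tls.X25519MLKEM768 constant exists. The function name is invented for this example. As I understand the crypto/tls documentation, recent Go releases treat CurvePreferences as a set and apply their own internal preference order, so the ordering below is for readability only.

```go
package main

import (
	"crypto/tls"
	"fmt"
)

// hybridCapableConfig returns a TLS 1.3-only configuration that allows the
// hybrid group and keeps two classical groups available as fallback.
func hybridCapableConfig() *tls.Config {
	return &tls.Config{
		MinVersion: tls.VersionTLS13,
		CurvePreferences: []tls.CurveID{
			tls.X25519MLKEM768, // hybrid group, requires Go 1.24+
			tls.X25519,         // classical fallback
			tls.CurveP256,      // classical fallback
		},
	}
}

func main() {
	cfg := hybridCapableConfig()
	for _, id := range cfg.CurvePreferences {
		fmt.Printf("%v (0x%04x)\n", id, uint16(id))
	}
}
```

Listing the groups explicitly is mainly useful when policy requires you to exclude other curves; otherwise the Go defaults already include the hybrid group.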

Step 4: Update Certificate Authentication

Key exchange migration (RSA/ECDH to ML-KEM) and signature migration (RSA/ECDSA to ML-DSA) are separate workstreams. Step 3 addresses key exchange. Certificate migration is more complex because it requires coordination across the trust chain.

For now, plan to:

Issue dual certificates (ECDSA + ML-DSA) for internal services where you control both server and client.
Continue using ECDSA certificates for public-facing services until browser support for ML-DSA certificate verification reaches sufficient levels.
Test ML-DSA certificate chains in staging environments to identify compatibility issues early.

The signature migration is less urgent than key exchange migration because signatures provide authentication (integrity), not confidentiality. An attacker who can forge signatures in the future cannot retroactively read data encrypted today. However, signatures do need to be migrated before a CRQC arrives to prevent real-time authentication bypass.

Step 5: Validate Everything

After deploying hybrid key exchange, validate thoroughly:

Interoperability testing: Ensure all clients can connect successfully. Test major browsers, mobile apps, API clients, and any custom TLS implementations.
Performance testing: Measure TLS handshake latency before and after. ML-KEM key exchange typically adds 1-3 KB to the handshake but reduces computation time, so net latency impact is often negligible or positive.
Load testing: Verify that your servers can handle the computational load of ML-KEM operations under peak traffic. ML-KEM is computationally lighter than RSA, so this is rarely an issue.
Certificate chain validation: Ensure that certificate verification still works correctly through your full trust chain, including intermediate CAs and cross-signed roots.
Monitoring: Set up alerting for TLS handshake failures, unexpected cipher suite negotiation, and performance regressions.
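The handshake size figure in the performance item can be sanity-checked with simple arithmetic. The sketch below uses the ML-KEM-768 sizes from the comparison table and the 32-byte X25519 public key size; it ignores extension framing and any record-layer overhead, so treat the result as a lower bound. The function name is illustrative.

```go
package main

import "fmt"

// Sizes in bytes.
const (
	x25519Share        = 32   // X25519 public key, sent in each direction
	mlkem768EncapKey   = 1184 // client to server
	mlkem768Ciphertext = 1088 // server to client
)

// hybridOverhead returns the extra key share bytes each side sends when
// moving from X25519-only to X25519 plus ML-KEM-768.
func hybridOverhead() (clientExtra, serverExtra int) {
	clientHybrid := mlkem768EncapKey + x25519Share
	serverHybrid := mlkem768Ciphertext + x25519Share
	return clientHybrid - x25519Share, serverHybrid - x25519Share
}

func main() {
	c, s := hybridOverhead()
	fmt.Printf("client sends %d extra bytes, server sends %d extra bytes, total %d\n", c, s, c+s)
}
```

The total lands a little over 2 KB, inside the 1-3 KB range quoted above, which is worth keeping in mind when testing paths with small MTUs or middleboxes that inspect the ClientHello.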

Step 6: Monitor and Iterate

After deploying hybrid key exchange on your first set of endpoints, monitor the results and expand to additional systems:

Track the percentage of connections using hybrid key exchange versus classical-only.
Set a target date for disabling RSA-only key exchange on all endpoints.
Continue expanding coverage to internal services, VPN endpoints, and application-layer protocols.
Update your CBOM to reflect the current state and track migration progress.
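Tracking the hybrid share can be as simple as counting negotiated groups. The sketch below assumes you already export per-group handshake counts from access logs or metrics; how you collect them is deployment-specific, and the sample numbers are invented.

```go
package main

import "fmt"

// hybridShare returns the percentage of handshakes that negotiated the
// hybrid group. counts maps a negotiated group name to a handshake count.
func hybridShare(counts map[string]int) float64 {
	total, hybrid := 0, 0
	for group, n := range counts {
		total += n
		if group == "X25519MLKEM768" {
			hybrid += n
		}
	}
	if total == 0 {
		return 0 // avoid dividing by zero when there is no traffic
	}
	return 100 * float64(hybrid) / float64(total)
}

func main() {
	// Made-up sample data for one endpoint.
	counts := map[string]int{
		"X25519MLKEM768": 620,
		"X25519":         330,
		"P-256":          50,
	}
	fmt.Printf("hybrid key exchange: %.1f%% of handshakes\n", hybridShare(counts))
}
```

Plotting this value per endpoint over time shows both rollout progress and which client populations still lack ML-KEM support.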

Common Pitfalls

Ignoring Non-TLS Uses of RSA

RSA is not just used in TLS. It appears in:

S/MIME email encryption: Email encrypted with RSA is HNDL-vulnerable. Migrate to ML-KEM-based S/MIME or consider alternative secure communication channels.
JWT tokens: RS256 (RSA with SHA-256) is one of the most common JWT signing algorithms. This is a signature, not key exchange, so there is no HNDL exposure, but an attacker who can forge the signature can mint tokens with arbitrary claims and impersonate any user or service.
XML signatures (XMLDSig): Enterprise SOAP services and SAML identity providers often use RSA signatures.
PDF and document signing: Adobe PDF signatures commonly use RSA.
SSH authentication: RSA host keys and user keys are still widespread.
PGP/GPG: Email encryption and code signing with RSA keys.

Each of these requires its own migration plan and timeline.
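For JWTs in particular, finding affected tokens is easy to automate because the algorithm is declared in the header. The sketch below reads the alg field without verifying the token; the function names are invented for this example, and the prefix check covers only the common JWS algorithm families.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"log"
	"strings"
)

// jwtAlg extracts the "alg" header field from a compact JWT.
// It does not verify the signature.
func jwtAlg(token string) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", errors.New("expected three dot-separated segments")
	}
	raw, err := base64.RawURLEncoding.DecodeString(parts[0])
	if err != nil {
		return "", err
	}
	var header struct {
		Alg string `json:"alg"`
	}
	if err := json.Unmarshal(raw, &header); err != nil {
		return "", err
	}
	return header.Alg, nil
}

// usesQuantumVulnerableSignature reports whether alg is an RSA (RS*, PS*),
// ECDSA (ES*), or EdDSA signature. HMAC algorithms (HS*) are symmetric and
// are not broken by Shor's algorithm.
func usesQuantumVulnerableSignature(alg string) bool {
	for _, prefix := range []string{"RS", "PS", "ES", "EdDSA"} {
		if strings.HasPrefix(alg, prefix) {
			return true
		}
	}
	return false
}

func main() {
	// Build a sample token: RS256 header, empty payload, dummy signature.
	header := base64.RawURLEncoding.EncodeToString([]byte(`{"alg":"RS256","typ":"JWT"}`))
	token := header + ".e30.c2ln"

	alg, err := jwtAlg(token)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("alg=%s quantum-vulnerable=%v\n", alg, usesQuantumVulnerableSignature(alg))
}
```

Running a check like this over sampled tokens gives a quick count of issuers that still need a signature migration plan.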

Underestimating Key Size Impact

ML-KEM keys are larger than RSA keys (1,184 bytes vs 256 bytes for the public key). This can cause issues in:

Protocols with strict message size limits: Some legacy protocols have maximum message sizes that may not accommodate larger key material.
Embedded systems with constrained memory: IoT devices and microcontrollers may not have sufficient RAM for ML-KEM operations.
Databases storing public keys: If your database schema has a fixed-width column for public keys, it will need to be expanded.
QR codes and barcodes: Some systems encode public keys in QR codes, which have limited capacity.

Plan for the size increase in your architecture and data models before deploying.
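The database case is easy to check ahead of time. The sketch below assumes keys are stored as standard base64 text in a fixed-width column; the 512-character column is a hypothetical example, and the raw sizes are the ones quoted above (real encodings such as PEM add further overhead).

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// encodedLen returns the number of characters needed to store rawBytes of
// key material as padded standard base64.
func encodedLen(rawBytes int) int {
	return base64.StdEncoding.EncodedLen(rawBytes)
}

// fitsColumn reports whether the encoded key fits a fixed-width column.
func fitsColumn(rawBytes, columnChars int) bool {
	return encodedLen(rawBytes) <= columnChars
}

func main() {
	const columnChars = 512 // hypothetical column sized for RSA-era keys

	keys := []struct {
		name string
		size int
	}{
		{"RSA-2048 public key", 256},
		{"ML-KEM-768 public key", 1184},
	}
	for _, k := range keys {
		fmt.Printf("%-22s %4d chars, fits in %d: %v\n",
			k.name, encodedLen(k.size), columnChars, fitsColumn(k.size, columnChars))
	}
}
```

A check like this belongs in the schema review for any table, cache, or message format that was sized around RSA or elliptic-curve keys.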

Waiting for Perfection

The standards are finalized. Hybrid deployment is safe. The implementations are audited and production-ready. There is no reason to delay. Every day you wait is another day of data being transmitted with quantum-vulnerable encryption -- data that may be recorded and stored for future decryption.

The perfect is the enemy of the good. Deploy hybrid ML-KEM key exchange today on your highest-risk endpoints, then iterate from there. A partial deployment that protects your most sensitive data is infinitely better than a comprehensive plan that remains on a whiteboard.

Measuring Your Progress

Track your migration progress with concrete metrics:

Percentage of TLS endpoints using hybrid ML-KEM: This is your primary KPI. Start at 0% and drive toward 100%.
HNDL Score trend: Your HNDL Score should decrease as you deploy hybrid key exchange. Factor 5 (Cryptographic Posture) directly improves with each endpoint migrated.
CBOM quantum-vulnerable asset count: The total number of quantum-vulnerable cryptographic assets in your CBOM should decrease over time.
Mean time to migrate new endpoints: As your team gains experience, the time to configure hybrid key exchange on a new endpoint should decrease.

The migration from RSA to ML-KEM is the most consequential cryptographic transition since the move from DES to AES. The difference is that this time, you have the tools, standards, and guidance to execute proactively rather than reactively. Start now.