There are quite a few dimensions to how performance can vary between TLS libraries.
Handshake performance covers how quickly new TLS sessions can be set up. There are broadly two kinds of TLS handshake: full and resumed. Full handshake performance will be dominated by the expense of public key crypto – certificate validation, authentication and key exchange. Resumed handshakes require no or few public key operations, so are much quicker.
Bulk performance covers how quickly application data can be transferred over an already set-up session. Performance here will be dominated by symmetric crypto performance – the name of the game is for the TLS library to stay out of the way and minimise overhead in the main data path. The data rates involved are usually many times a typical network link speed.
A TLS library will represent separate sessions in memory while they are in use. How much memory these sessions use will dictate how many sessions can be concurrently terminated on a given server.
This blog post covers memory usage. See the introduction for details of the other measurements.
Another important facet is how much memory each TLS session consumes. For workloads with many concurrent connections this can be a limiting factor.
To test this, we’ll measure the peak memory usage of a process which creates N sessions (N/2 client sessions associated with N/2 server sessions) and then takes these sessions through a full handshake in lockstep. This benchmark design captures any memory usage peaks during the handshake, such as in certificate handling.
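The lockstep design can be illustrated with a minimal sketch. This is not the benchmark used for the measurements below: `PlaceholderSession`, its `STEPS` count and its 16KB scratch buffer are hypothetical stand-ins for real TLS sessions, chosen only to show how every session is held mid-handshake at the same time and how peak resident memory can be read with Python's `resource.getrusage`.

```python
import resource


class PlaceholderSession:
    """Hypothetical stand-in for a TLS session (no real TLS happens here)."""

    STEPS = 4  # assumed number of handshake steps, for illustration only

    def __init__(self):
        self.step = 0
        self.scratch = None

    def is_handshaking(self):
        return self.step < self.STEPS

    def advance(self):
        # Simulate handshake-time allocations that are released on completion.
        self.step += 1
        self.scratch = bytearray(16 * 1024) if self.is_handshaking() else None


def peak_rss_kb():
    # ru_maxrss is reported in kilobytes on Linux (macOS reports bytes).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def run_lockstep(n):
    """Create n sessions (n/2 client/server pairs) and handshake them in lockstep."""
    pairs = [(PlaceholderSession(), PlaceholderSession()) for _ in range(n // 2)]
    # Advance every pair by one step per pass, so all sessions reach their
    # most memory-hungry handshake phase at the same moment.
    while any(c.is_handshaking() or s.is_handshaking() for c, s in pairs):
        for client, server in pairs:
            if client.is_handshaking():
                client.advance()
            if server.is_handshaking():
                server.advance()
    return pairs


if __name__ == "__main__":
    sessions = run_lockstep(1000)
    print(f"{len(sessions) * 2} sessions, peak RSS {peak_rss_kb()} KB")
```

Running the handshakes one pair at a time would let each pair free its handshake-time allocations before the next pair starts. Stepping all pairs together is what makes those transient peaks add up in the measurement.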
We run this benchmark for both OpenSSL and rustls at several values of N, which yields these results:
Cipher suite | N | OpenSSL (KBytes) | Rustls (KBytes) | Rustls vs. OpenSSL (2 s.f.) |
---|---|---|---|---|
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (TLS1.2) | 100 | 13452 | 6140 | -54% |
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (TLS1.2) | 1000 | 73668 | 29836 | -59% |
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (TLS1.2) | 5000 | 341604 | 135984 | -60% |
TLS_AES_256_GCM_SHA384 (TLS1.3) | 100 | 12812 | 6184 | -52% |
TLS_AES_256_GCM_SHA384 (TLS1.3) | 1000 | 66000 | 30596 | -54% |
TLS_AES_256_GCM_SHA384 (TLS1.3) | 5000 | 303424 | 139592 | -54% |
If we fit a line against the TLS1.3 values, we get 27.2N + 3417 for rustls and 59.3N + 6789 for OpenSSL (both in KBytes): very roughly, a rustls session costs 27KB and an OpenSSL session costs 59KB at peak in this workload. This multiplies up to give a C10K memory usage of 269MB for rustls and 586MB for OpenSSL.
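The fit can be reproduced from the TLS1.3 rows of the table. The sketch below assumes an ordinary least-squares line through the three points for each library, which the text does not state explicitly, and converts KBytes to MB by dividing by 1024. The helper names `fit_line` and `c10k_mb` are illustrative.

```python
# TLS1.3 measurements from the table: (N sessions, peak KBytes).
TLS13 = {
    "rustls": [(100, 6184), (1000, 30596), (5000, 139592)],
    "OpenSSL": [(100, 12812), (1000, 66000), (5000, 303424)],
}


def fit_line(points):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def c10k_mb(slope, intercept, sessions=10_000):
    """Projected peak memory for `sessions` concurrent sessions, in MB (KB / 1024)."""
    return (slope * sessions + intercept) / 1024


if __name__ == "__main__":
    for name, points in TLS13.items():
        slope, intercept = fit_line(points)
        print(f"{name}: {slope:.1f}N + {intercept:.0f} KB, "
              f"C10K = {c10k_mb(slope, intercept):.0f} MB")
```

Under these assumptions the slopes, intercepts and C10K figures round to the values quoted above. The slope is the per-session cost, and the intercept is the fixed overhead of the process.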
We see evidence that OpenSSL uses less memory during a TLS1.3 handshake compared to TLS1.2, but rustls does not. This might be an area for future work in rustls.