Memory Lane: How the Dual_EC_DRBG Backdoor Works
Table of Contents
How well do you remember 2014? A bit fuzzy? Me too. I’ll try to recall things from nearly 10 years ago as I walk back through old code I wrote to explore and teach others about the fundamentals of the Dual_EC_DRBG backdoor, Which I often call “Dual_EC” for short. What better way to learn than by breaking a toy version of Dual_EC?
The messy code is all here, updated and
tested to work under Python 3.13. Just run server.py
in one terminal and attack.py
in another, this should set up all 4 challenges and then sequentially solve them.
Huge thanks to Jeremy Kun’s blog post and code on doing math with elliptic curves over finite fields, which these challenges couldn’t have happened without.
Disclaimer: I am not a cryptographer, and my math is a bit rusy. These challenges are flawed in the intended ways, and probably also unintented ways.
Context#
Back in 2014 the Snowden leaks were still fresh. Among the revelations was that the NSA had a program codenamed “BULLRUN”1 that involved intentionally undermining encryption standards, explicitly the Dual_EC_DRBG2 algorithm. It was already a highly suspicious algorithm to people who pay attention to such things, but hard evidence from one of the largest leaks of secret documents in history does wonders for awareness. The Citizenfour documentary is a popcorn-friendly look at the leaks, if you want more high-level background and don’t feel like reading.
I wasn’t following this closer than any other techie. But when I came across a paper in the Notices of the AMS, “The NSA Back Door to NIST”3 by Thomas C. Hales, I realized the math behind the backdoor wasn’t as complex as I first thought. Maybe I could show this to other people with less math experience and demystify this backdoor a bit? Also, what better than learning by doing, and having a series of challenges to demonstrate each building block in the backdoor?
Technical Background#
The Dual_EC_DRBG algorithm is meant to be used as a cryptographically secure pseudo-random number generator (CSPRNG). Every cryptographic scheme requires a source of random bits to function. The Dual_EC_DRBG algorithm can supply those bits. These pseudo-random number generators need to be cryptographically secure to be suitable for cryptographic schemes – you should not be able to predict future output by observing previous output. If an attacker can predict output of the random number generator (RNG) used by a cryptographic scheme, they can break that scheme. This makes the RNG an attractive target for a backdoor. You just have to – ideally – make it a backdoor only usable by you and nobody else.
Dual_EC_DRBG uses elliptic curves, which mostly just matters for the details of
the algorithm and the backdoor. There are great intros to Elliptic Curve
Cryptography45 so I won’t repeat what those guides
have already done. The important part for us is that points on the curve –
pairs of numbers (x, y) where x and y are solutions for the curve’s equation –
form a Group6. This allows us to define very “basic” math
among these points (addition and subtraction) and have it work overall like we
expect addition and subtraction to work. For us, the important thing is that
given points P
and Q
on a curve, P + Q
gives us a third point R
on the
curve. Knowing this, we can define a “scalar multiplication” in this group as
repeated addition. So for some integer n
and point P
, n * P
is just P
added to itself n
times: n * P = P + P + ... + P
. We’ll be doing that a
lot.
The Algorithm#
In very general terms, Dual_EC_DRBG maintains some internal state s
that is
used to derive random bits that are output to the user. As more bits are
needed, the internal state is mutated, and more output bits can be derived from
the new internal state. Repeat as needed. The fact that we’re using an elliptic
curve is only important for the details. Here is the core of it as described by
Thomas C. Hales in the paper3 mentioned in the context
section. In math convention, s'
is pronounced “s prime”, and usually denotes
something derived from or closely related to another variable, in this case
s
.
let pseudo_random s =
let r = x1 (s * P) in
let s' = x1 (r * P) in
let t = x1 (r * Q) in
let b = extract_bits t in
(s',b);
The x1
function just takes the x-coordinate from the given point, as a simple
way of getting an integer from a point on the curve. extract_bits
just masks
its input, throwing away some of the most significant bits and only giving the
user what’s left. An ASCII diagram might help make this clearer. The internal
states s
, s'
, and so on are entirely internal to the RNG, and the user only
sees bits b
derived from these states. Which means all an external attacker
could hope to see would be the bits b
themselves, or things derived from
them.
s --P--> r --P--> s'
|
'---Q--> b
s' = x1( x1(s * P) * P )
t = x1( x1(s * P) * Q )
b = extract_bits( t )
Note that the internal state is mutated by multiplications against point P
,
while the random bits we observe also involve multiplication by a point Q
.
If there was some way to go from b
back to r
we could break the algorithm.
But this would require solving for r
in X = r * Q
, which is the Discrete
Logarithm Problem7 which in these circumstances is
considered difficult. This problem is the basis of many other cryptographic
schemes’ security, so that’s probably not our path forward.
The Backdoor#
For an attacker, everything8 about the random number generator is known except for the internal state. If we could somehow recover the internal state, we could simply set up our own Dual_EC_DRBG with that identical state to predict all future output of the targeted Dual_EC instance.
That’s a great goal, but how does the backdoor get us there? I’ll quote extensively from Thomas C. Hales who does an amazing job explaining this to mathematicians. Emphasis and formatting tweaks are mine:
The NIST standard gives a list of explicit mathematical data
(E,p,n,f,P,Q)
to be used for pseudo-random number generation … The groupE(Fp)
has order n, which is prime for all of the curves that occur in the NIST standard. … Finally, P and Q are given points on the affine curve. … No explanation is given of the particular choices(E,p,n,f,P,Q)
. … The standard stipulates that “one of the following NIST approved curves with associated points shall be used in applications requiring certification under FIPS-140 [U.S. government computer security accreditation].”
So we have certification and standardization being used to force adoption and
use of DUAL_EC_DRBG with the specific constants that were chosen by the NSA
with no explanation. I emphasized the fact that the order of our group – the
number of elements it has – is prime because this tips the hand of the
backdoor and these magic constants P
and Q
.
Since
P
andQ
are nonidentity elements of a cyclic group of prime order, each is a multiple of the other. WriteP = e*Q
for some integere
. We show that, once we havee
in hand, it is a simple matter to determine the secret internal states
of the pseudo-random bit generator by observing the outputb
and thus to compromise the entire system. The functionextract_bits
discards 16 bits. Given the outputb
, we take the 2^16 (a small number of) possible preimagest
ofb
underextract_bits
. For eacht
, the coordinatex
is known, and solving a quadratic, there are at most two possibilities for the coordinatey
of a pointA
on the elliptic curve such thatt = x1(A)
. One suchA
isr * Q
. For eachA
, we computee * A
. One of the small number of possibilities fore * A
is(1)
e * (r * Q) = r * (e * Q) = r * P
Finally
s′ = x1(r * P)
.
Closer to plain english:
As an outside observer, we just get output b
. The output had 16 bits thrown
away that we don’t get to see. But 2^16 = 65536
, which just isn’t that large
of a number for computers. So write down our 65536 possibilities for t
and
one of them must be correct, we just don’t know which one yet.
Next, the correct t
was the x-coordinate for some point. Let’s call that
point A
. Because of how these elliptic curves work, we can start with x
,
and solve the equation for two values of y
. One of these is the correct y
for this point A
. So for each of our t
guesses, it’s easy to solve the
curve’s equation to find possible points A1
and A2
. One of these has to be
r * Q
. 131072 guesses now, still not many.
Because of the properties of these groups, we know there must exist some
integer e
such that P = e*Q
for the points P
, Q
on the standard’s
curves. If we’re a lowly member of the public, we only know P
and Q
, and
finding this e
is solving the Discrete Logarithm Problem7,
which is not known to be feasible in this situation. However if we’re the NSA
– or any other outfit that could pick the points P
and Q
that people use
Dual_EC with – then choosing e
, P
and Q
such that P = e*Q
is a
cakewalk.
Let’s assume we know e
.
One of our 131072 points A
is the correct one, where A = r*Q
. Taking e
,
and the correct A
, we can do e * A = e * (r * Q)
and based on equation
(1)
from the paper, we know that this is r * P
9, which
gives us the target Dual_EC’s current state s'
. So do e * A
for all of the
guesses we have, use each resulting state guess s'
to instantiate a different
instance of Dual_EC, and one of these will be synchronized with the target
Dual_EC. Broken10.
The Challenges#
Making CTF-style challenges is tricky. They must be intentionally vulnerable and exploitable, but only in the intended way, so you also have to avoid all other vulnerabilities (good luck!). Also, in my opinion they should lead the attacker towards the intended solution so people don’t waste time getting frustrated with other unrelated rabbitholes.
I decided a guessing game with an open-source network server worked best. How the server worked shouldn’t be a mystery for this, and the code would also be reusable by attackers so they wouldn’t have to waste time re-implementing Dual_EC or searching for libraries for doing math on elliptic curves over finite fields.
The only interaction possible was over a network socket, and Python should make simple network communication relatively secure. JSON is used so that network traffic is human-readable and serializing/de-serializing is handled. Only hardcoded elements of the object from the attacker are accessed. SSL was used so teams would at least need to use ARP poisoning (if possible) and some more active interception to steal the flag from other teams, and the cert was available so attackers could hardcode cert checking.
The server gives players output of its random number generator, and accepts a guess for the next number. If the challenger can make 50 correct guesses, the server starts sending the flag. They win! Only a few bad guesses and the player gets disconnected. They have to try again, with a freshly reseeded RNG to guess against. This should make it only possible to win by actually breaking the PRNG.
There’s the issue of knowing e
however, and the groups used in the
standardized curves are all too large to brute-force this from P
and Q
. We
make this easier by just drastically reducing the group size so that anyone –
even someone with a netbook from 2009 – can brute-force to find e
. The
challenge is supposed to test people’s knowledge of the backdoor, not their
wallet.
Throwing even most cybersecurity professionals at a weakened Dual_EC and expecting them to break it isn’t fair, the elliptic curve math is still pretty specialized. We’ll break the full problem down into more digestible steps. This also gives us multiple related challenges with a difficulty progression to pad out the challenge count and hopefully give people at least one achievable challenge.
Level 0#
For devoping & testing the rest of the challenge design besides the PRNG parts, we use a dummy PRNG that always returns 42. I don’t think this was live during the actual CTF, but it’s in the code.
self.prng = type("",(),dict(get_num=lambda: 42))
Level 1#
We use a similar design as Dual_EC, but in a simpler finite field11
and without output masking. This should put the focus on seeing where doing
math with the given P
and Q
in the given finite field gets us.
This challenge uses the arbitrarily chosen GF(104729)
with P = 7
and Q = 2
. 104729 is prime, but the choice of P
and Q
with that prime
was based on vibes and basic trial-and-error that the PRNG itself had a
“long enough” run before repeating a value. If I were a cryptographer
I could’ve picked all this more intelligently, but I’m not.
Apparently an oversight on my part, but this weak PRNG uses multiplication and not addition. The finite field also gives us more flexibility, and the hardcoded way I have for cracking the PRNG involves using multiplicative inverses in a way that doesn’t directly match what’s used for Dual_EC. Oops!12
def get_num(self):
r = self.state * self.P
self.state = r * self.P
t = r * self.Q
return t.n
The t.n
is just us getting a Python integer from the finite field element t
so we can do typical arithmetic with it, rather than more limited finite field
arithmetic. For attacking, we have this:
class Guesser:
def __init__(self, output):
F = FiniteField(104729,1) #use the same finite field at the prng
P = F(7) #same parameters
Q = F(2)
PQ_inverse = (P*Q).inverse() #the trick! we have multiplicative inverses!
T = F(output)
oldstate = T*PQ_inverse
curstate = oldstate*P*P
self.p = prng(seed=curstate.n)
def guess(self):
return self.p.get_num()
This method doesn’t exactly match Dual_EC. Instead of finding the current state
s'
we can actually go back even farther and get s
, which we then
fast-forward to s'
. We could’ve followed Dual_EC more closely by just
finding the inverse for Q
, but nothing really forces us to brute-force
an e
to solve P = e * Q
.
Level 2#
Same field, same algorithm, same ‘points’ P
and Q
. However this time we
add in some bit masking.
def get_num(self):
r = self.state * self.P
self.state = r * self.P
t = r * self.Q
return t.n & 0x7fff # throw 2 leading bits away
The math for breaking this is identical, but now we have to do some guessing to handle the bits that were thrown out before we got the output. Our program and challenge interaction get more complex
class Guesser:
def __init__(self, output):
F = FiniteField(104729,1)
P = F(7)
Q = F(2)
PQ_inverse = (P*Q).inverse()
#recover 'potential' outputs
outputs = [(i<<15) | output for i in range(4)]
Ts = [F(o) for o in outputs]
oldstates = [T*PQ_inverse for T in Ts]
print("old states:",oldstates)
curstates = [o*P*P for o in oldstates]
self.candidates = [prng(seed=c.n) for c in curstates]
self.refine_step = 0
def guess(self):
print("generating number")
return self.p.get_num()
def refine(self, num):
"""
based on next output (num)
determine which candidate prng
is correct.
"""
for i in range(len(self.candidates)):
if self.candidates[i].get_num() == num:
print("old state found!")
self.p = self.candidates[i]
Basically we make a number of “candidate” PRNGs like in Level 1. One of them is correct, but we have to find out which. We just make a wrong guess to get another “random” number from the server, then use that for a refinement step to determine which candidate is correct.
Level 3#
Here’s where we move to elliptic curves, and not being a cryptographer really
shows. Solely because it’s a cool prime that’s about as large as I want, I
chose a prime of 331337 for our underlying field, and the curve y^2 = x^3 + x + 1
.
The cycle this gives our RNG, based on the comment, is really short but
probably long enough to work for the challenge. I probably found Q
by picking
random points until I found one that generated a decently sized group, then
just randomly picked an e
to generate P
by computing e * Q
.
class prng:
def __init__(self,n=331337,seed=42):
#don't need to keep field or curve beyond constructor
F = FiniteField(n,1)
C = EllipticCurve(a=F(1),b=F(1)) #choice of curve assumes default
self.state = F(seed).n
#this gives a 'cycle' of 71 elements...
self.Q = Point(C,F(153116),F(171795)) # <Q> has order 6257
self.P = Point(C,F(285710),F(143307))
def get_num(self):
r = (self.state * self.P).x.n
self.state = (r * self.P).x.n
t = (r * self.Q).x.n
return t #define SO LONG AS we're in integers mod P
It’s all just good enough to get the point across, but actually knowing the
cryptography at play would’ve made it easier to pick a curve and points that
would work properly. I don’t actually know if the set of all points on this
curve is of prime order, but by choosing P
and Q
as I have, I at least
restrict us to a prime order cyclic subgroup that we know would have P = e * Q
,
and is small enough to brute-force e
.
class Guesser:
def __init__(self, out):
"""
given the initial output,
init out guesser so we can guess
all remaining outputs
"""
prime=331337
F = FiniteField(prime,1)
C = EllipticCurve(a=F(1),b=F(1))
e = F(3) #backdoor! we'd have to pre-compute this
val = out*out*out + C.a * out + C.b
print(time.time(),":","finding points...")
points = [Point(C,F(out),F(y)) for y in tonelli_shanks(val.n,prime)]
#print("points: ",points)
print(time.time(),":","recovering states...")
states = [(e.n*T).x.n for T in points]
#as both candidates are additive inverses of
#one another, they have the same x coordinates
print(time.time(),":","making prng")
self.p = prng(seed=states[0])
def guess(self):
"""
by the nature of level 3, we
don't have to 'modify' our guesses.
we're guaranteed to get precisely
the right prng
"""
return self.p.get_num()
Aha, the backdoor e
for this situation is apparently 3
. This relationship
between P
and Q
would be figured out some other way then inserted into this
attack script. Once someone started looking for e
to do P = e * Q
it
wouldn’t have taken long. Because the output is the x-coordinate and we know
the curve equation, we can easily get to y^2 = n
for some number n
.
Normally you’d just take the square root here, but we’re in a finite field, how
would we even do that?
Luckily the Tonelli-Shanks Algorithm13 can do this because we’re
working in a field of prime order. This is included in the prng library used in
the challenge, and is used in a find_point
function that returns a point on
the curve C
that has the given x-coordinate. The player doesn’t have to know
about this beforehand or implement it themselves, assuming they read the code
for the challenge. This is also very helpful for returning to these challenges
nearly 10 years later, I’d completely forgotten about Tonelli-Shanks!
Level 4#
Ok, full strength now! Well, except we’re still in our small and brute-forceable
subgroup of points generated by Q
. Like in the jump from 1 to 2, we just add
in bit masking.
def get_num(self):
r = (self.state * self.P).x.n
self.state = (r * self.P).x.n
t = (r * self.Q).x.n
return t & (0x7fff)
Which means that the attack from 3 just has to add in generating multiple PRNGs and a refinement step that determines which one is the correct guess and synchronized with the target.
class Guesser:
def __init__(self, out):
"""
given the initial output,
init out guesser so we can guess
all remaining outputs
"""
prime=331337
F = FiniteField(prime,1)
C = EllipticCurve(a=F(1),b=F(1))
e = F(3) #backdoor! we'd have to pre-compute this
xs = [(i<<15) | out for i in range(16)]
print("xs:",xs)
vals = [x*x*x + C.a * x + C.b for x in xs]
print("vals:",vals)
print(time.time(),":","finding roots...")
# at this point, *some* vals won't be quadratic
# residues, thus invalid points. But, we must
# associate each possible preimage with its root
coords = [] # list of tuples (x,y)
for i in range(len(xs)):
try:
t = tonelli_shanks(vals[i].n,prime)
coords.append( (xs[i], t[0]) )
coords.append( (xs[i], t[1]) )
except Exception:
# not quadratic residue
pass
print("coords:",coords)
print(time.time(),":","making points...")
points = [Point(C,F(c[0]),F(c[1])) for c in coords]
print(time.time(),":","recovering states...")
states = [(e.n*T).x.n for T in points]
print("states:",states)
print(time.time(),":","generating candidates...")
self.candidates = [prng(seed=s) for s in states]
def guess(self):
return self.p.get_num()
def refine(self, num):
"""
given the next output, refine our
candidates list to only the correct one
"""
print("refining...")
for i in range(len(self.candidates)):
if self.candidates[i].get_num() == num:
self.p = self.candidates[i]
print("successfully refined!")
break
Conclusion#
Wow! We just broke our own toy version of Dual_EC. Sure it was smaller and
simpler, but we dealt with the bit masking, the point arithmetic, and even the
extra – and for actual Dual_EC, unfeasible – step of brute-forcing the
backdoor e
for ourselves. It does take a bit of a dive into math world to
grasp, but hopefully with some code and comments to guide the way, it can be
less of a mystery to someone than it was before.
-
https://en.wikipedia.org/wiki/Bullrun_(decryption_program) ↩︎
-
https://blog.cloudflare.com/a-relatively-easy-to-understand-primer-on-elliptic-curve-cryptography/ ↩︎
-
https://www.jeremykun.com/2014/02/08/introducing-elliptic-curves/ ↩︎
-
A group is a specific algebraic structure. In simple terms, we have some elements (numbers, points, objects …) that are members of the group, and an addition operation
+
that works between them. We have an identity element that works like zero, and each element has an inverse.A + -A = 0
andA + 0 = A
. For points on an elliptic curve, the trick is defining a+
operation so thatP + Q
makes sense and follows these group rules. The other Elliptic Curve intros get into all that. ↩︎ -
https://en.wikipedia.org/wiki/Discrete_logarithm#Cryptography ↩︎ ↩︎
-
the process for deriving random bits and mutating internal state, and the approved elliptic curves and points P and Q on those curves are public as part of the standard. It would generally be safe to assume that an in-the-wild instance of Dual_EC_DRBG is one of these public standardized versions, rather than something on a completely different custom curve, or using different points P and Q. ↩︎
-
Because our scalar multiplication is just repeated addition, we can do this
e * (r * Q) = e * (Q + Q + ... + Q) //(r many Qs in parens) = Q + Q + ... + Q //(e*r = r*e many Qs) = r * (Q + Q + ... + Q) //(e many Qs in parens) = r * (e * Q)
It’s basically just mentally re-grouping the Qs in the expression. ↩︎
-
Broken until the target Dual_EC gets reseeded, that is. But once we have a method worked out to synchronize to a target Dual_EC by observing a few bytes of output, this might not be a huge deal. ↩︎
-
https://en.wikipedia.org/wiki/Finite_field
Doing math in a finite field seems intimidating at first, but just involves taking remainders when you do arithmetic. You might know this as the modulus operator
mod
. For a simple example in the finite field of order 5GF(5)
, we have elements{0,1,2,3,4}
.1 + 3 = 4
and4 / 5 = 0 remainder 4
, so that works as expected. But3 + 4 = 7
and7
isn’t in our group, which isn’t allowed. But7 / 5 = 1 remainder 2
, so our actual answer inGF(5)
is3 + 4 = 2
. Usingmod
this is3 + 4 = 7 mod 5 = 2
.Multiplication also works! That’s what makes this a field and not just a group. We just ignore the element
0
.2 * 3 = 6 mod 5 = 1
. ↩︎ -
In this finite field there is some number of times
Q
can be added to itself to getP
, so theP = e * Q
trick would still work, if addition were used rather than multiplication. Because the field is of order 104729, it’s small enough to easily brute-forcee
. ↩︎ -
https://en.wikipedia.org/wiki/Tonelli%E2%80%93Shanks_algorithm
This uses number theory lingo, but basically if
p
is prime, you known
, and you want to find anx
such thatx^2 = n (mod p)
, this algorithm can findx
. Butp
must be prime. ↩︎