# Linear independence and GF(2) – 34C3 CTF software_update (crypto) part 2/2

We are going to solve a crypto challenge with

some cool math. I think it’s an awesome example of linear

algebra applied to some real world security problems. The story about this challenge is that we

have a firmware update that will only be applied if verified, and works by hashing files and

folders with sha256, and xoring those hashes together. The resulting xored result is then checked

against a signature. If you want to hear more about how we explored

different attack ideas you can checkout part 1, but in this part I just want to focus on

the mathematical solution and try provide an intuitive understanding for the linear

algebra involved. And I’m also sorry to all mathematicians

out there if I butcher some terminology, I’m a math noob. Okay, so to understand the solution I have

to introduce two concepts. One is about linear independence, and how

linear independent vectors in some vector space can match to any point. And the second concept will be Galois Fields

or finite fields, specifically the GF(2), the field of only two elements. If you didn’t learn this kind of math, for

example in university or you are still too young, then these terms sound crazy, but I

wanted to mention them because while it sounds crazy, I think it’s really easy to understand

once I explained it. So let’s start with linear independence

of vectors. If you are not familiar with vectors, just

ignore the term and just see how we are using it. They are basically just arrows. So… Imagine a 1 dimensional space. So just a line of numbers. And I give you a one dimensional vector. Now a vector is like an arrow, so this vector

starts at 0 and points to 2. With a scalar, or just “regular number”

we can multiply that vector to move that arrow around. For example 2x the vector will point to 4. And 4x the vector will point to eight, and

-3x this vector will point to -6. Easy. And as you know, multiplication is just fancy

addition. For 4x the vector we just add the vector to

itself 4 times. So you see that with this one vector, if we

multiply it with some value, we can make it point to EVERY number in that one dimensional

space. Now let’s look at this in two dimensional

space. Just a flat surface. We add another axis. Now a vector will have two values, x and y. Now let’s take one vector, for example [1

1]. Now this is an arrow that points to this point

(1/1). Again we can multiply a scalar or just add/subtract

this vector many times, we can move around where it points to. With 2x this vector we point to (2/2). With 5x we point to (5/5). And with -3x this vector we point to (-3/-3). But you see that we can’t add or subtract

and reach every value on our plane. We only ever can reach any point on this line. Let’s introduce a second vector. [3 3]. Now this one points to (3/3) and again we

can add or subtract it, we can even combine our two vectors, but we are still limited

to this line. You might also have noticed, that we could

basically create this vector by taking our first vector 3 times. These two vectors are linearly dependent. But if we get a vector that is linear independent,

that is not laying on the same line, for example [1 0], now we can’t reach this point with

only the first vector. And vice versa, the new vector can’t reach

a point of the second vector. Except of course 0. But we can now use them together to get to

any point on this 2 dimensional field. For example to reach the point (3/5) we can

take the first vector 5 times, now we are at (5/5). And then we take the second vector and subtract

it two times, and now we reached (3/5). And so with TWO independent vectors we can

get to any point of the TWO dimensional field. And this also works with three dimensions. If you have three linear independent vectors

in this three dimensional space, you can get to ANY point you want by adding or subtracting

them. That makes sense, right? Just think about it for a second and imagine

in your head how you can combine three different arrows in a three dimensional space and reach

any point in there. And of course this also works in 4-, 5- or

higher dimensional space. In 4 dimensions we need 4 linear independent

vectors and in 5 dimensional space we need 5 linear independent vectors, and so on. Of course you can’t really visualise that

anymore, but you can just write them down as a vectors and play around with adding and

subtracting them from another to reach any point you want. Amazing right? But if you have very large vectors, it’s

a bit cumbersome by hand to select the vectors you want to add or subtract to reach certain

points. That’s where you get into linear algebra. You just put all those vectors as columns

into a matrix, here is the point you want to reach, and then you can solve this. But we don’t have to do that by hand, we

can let computers do that for us. Ok so, what the heck does this have to do

with this challenge? We get there. Patience. So now let’s introduce the second concept. Finite field is already a fancy sounding name. But there is a much nicer name for it. That’s synonymous. Galois fields. That’s a nice french name, right? Ok. So when you wanna, tell your girlfriend or

boyfriend or your mother, whoever you want to impress what you did today, at university,

say I learned about Galois fields. Ok? Specifically we want to have a look at GF,

Galois Field (2). Ahh it sounds sooo cooool. But it’s really dumb. But clever. Basically, throw away any numbers you know

and just keep two. 0 and 1. And you live now in a world where only 0 and

1 exists. The binary world. No 2, no 3. Only 0 and 1. Now let’s think about how addition could

look like in this world. 0 + 0 could be 0, that makes sense

0 + 1 could be 1, that makes sense too So does 1 + 0, that’s also 1. But 1+1 is a bit weird, because in our world,

there is nothing larger than 1. And so we are imprisoned in this weird world

and if we try leave here to find a larger number, we just wrap around in our world and

arrive at the start again. Like the game snake. And as a computer scientist, do you notice

something? What is this? This is XOR. Addition in the finite field of two elements,

GF(2) is equivalent to the logical XOR operation. Ding ding ding dinggg. Does it ring a bell? So when we XOR for example two bytes, each

byte has 8 bits. Eight 1s or 0s. For example 0x55, in ascii the capital letter

‘U’, would be 0101 0101. And if we XOR that with 0xD9, in binary 1101

1001, we get 1000 1100 , or 0x8C. So these bytes could also be understood as

a vector, it’s an 8 dimensional vector with 8 elements either 0 or 1. They are vectors in an 8 dimensional GF(2)

vector space. Each vector has 8 bits, and a bit exists only

in a world of 0 or 1. A bit can never be 2. It’s always either 0 or 1. And we just learned, that an XOR in GF(2)

is like addition. So no we just added two vectors together and

got another point in that vector space. And this means a sha256 hash, has 256 bits,

and can be understood as a 256 dimensional vector in a Galois Field (2) vector space. And the first concept we learned about was,

that if we have enough linear independent vectors, we can add or subtract them to reach

any point in the space. And so if we had 256 linear independent vectors,

256 linear independant sha256 hashes, we could reach ANY point in the 256 dimensional space. And the point we want to reach, is the 256

bit vector, or better to say the final xor result of the hashes that is already valid

and signed! So if we can create these 256 linear independent

vectors, we should be able to combine them by addition, or as you know now, XOR, to craft

any XOR result that we want. In mathematical terms, we then just have to

solve this matrix equation of the 256 independent vectors with the vector that we want, any

XOR result that we want. And actually, that is very simple. Hashes are basically random data. Random bits. They are a random vector. And it’s very very unlikely that such a

random vector is linear dependent. But you can also very easily test that. So let’s start by creating 256 linear independent

bit vectors. Or vectors operating in GF(2). To do this I’m using sage, which is basically

a tool to perform mathematical calculations. Sage scripts are basically python. The code gets transformed into a python script

and that is then executed. Also as I explained in part 1, I did not come

up with this. The solution and the code I show was written

by bennofs from our CTF team. Just so that you don’t get the wrong impression

that I’m some kind of wizard. He is. So we can start with a loop to find these

independent vectors. We know how the hashes are generated by the

firmware installer, check out part 1 if you forgot, so we basically we do the same now

here, we generate a filename from just counting numbers, and append the null-byte. Then pass it to something that will return

the 256bit vector for it. The make_vector function takes the filename,

generates the sha256 hash and then goes over each byte of that hash. For each byte it extracts the single bits. It does that via a lookup table of bits that

was generated at the start of the program. Long story short, that function will just

return the vector with all 256 bits of the hash. Now that we have the vector that represents

the hash for this file, we can create a matrix of all the possible vectors. Defined over the Galois Field GF(2). And then we can check the rank of the matrix. In linear algebra, the rank of a matrix is

the dimension of the vector space generated by its columns. This corresponds to the maximal number of

linearly independent columns of the matrix. So if the new vector increases the rank of

the matrix, so increases the number of linear independent columns of the matrix, then we

can keep that one. And so we just remember all those vectors

and filenames that are good. And we can also print if a vector wouldn’t

be linear independent. If we run that now, it takes a moment for

sage to get going, and then we print out all the filenames that produces linear independent

vectors. And you can see, after the first 256 we didn’t

even get ONE that was linear dependent. Next we have to define the point in our 256

dimensional vector space, that we want to reach by combining our vectors. To do that we have to read the current valid

xor result, the one that is signed, then we do our modification of the pre_copy python

script to spawn a shell, and then we get the new invalid xor result. If we xor those two, we get our target goal. If we can somehow craft this XOR result with

our files, basically find which vectors we have to combine to reach that point in space,

then they would be xored with the now invalid xor result to produce the valid signed result. So we take that result now and define a GOAL

vector, and then we take the matrix of all our vectors, and say that we want to solve

this system of equations for the goal vector to the right, basically just solve this matrix

equation and the resulting vector will tell us which vector we take and not take. Where we have a 1 we take it, where we have

0 we don’t take it. Or to be more mathematical, we multiply the

first vector with 1 and add it, we multiply the second vector with 1 and add it, and we

take the third vector and multiply it with 0 and add it. And so forth. So we can now loop over this resulting vector,

and wherever we have a 1 we print the name of the file we have to create in order to

craft the corresponding XOR result. And if we do all of that. We can create the new .zip update file with

all of these chosen files, and that should have a valid signature, but a modified pre_copy

script. And so we can connect again to the service

and send our modified update.zip. It gets unpacked, the signature checked, and

the precopy script is executed. And this should have now spawned a shell on

the server that we can use to look around, and in the end, read out the flag. Boom. Solved. Isn’t that amazing. t

Hello. Pls I Don't understen

First

Like, as always

Finally 😀

1 AM

Perfect time for crypto mathFinally! The solution!

Awesome. Combining Comp Sci and Advanced Math, my 2 favorite subjects.

wow

Why are you using redstar os

Are you in korea

Somewhat unrelated to the actual CTF challenge, but boy have you made me understand matrices a whole lot better!

Neat! I knew the math, but didn't know you could apply it to xor like that!

Two questions. a) You find a method to forge the hash, but it does not include your actual python payload? Its not part of the 'added vectors', is it? And b), how do you solve a series of mathematical equations that have xor instead of plus as the operator?

Amazing video!

How to defend against something like this?

5:40 pass auf was du sagst sonst erkennt dir die TU Lina ab ^^

Danke für das Video jetzt hab ich ne idee für was ich in den Mathe modulen eigentlich lerne!

I'm probably dumb, but,

If I remember my ALGA classes right, you need to compare 2 vectors to say if they are independent or dependent.

At 11:50 when you run the SAGE script what are you comparing against to know if they are or not.

Da ich ab August in den Mathekurs für die Fachhochschulreife (neben der FISI-Ausbildung) einsteige, danke für die Einführung in Vektoren.

Aber das auf SHA-256 und XOR in Verbindung mit französisch anmutenden Feldern anzuwenden, ist pure Magie.

Ok bat I dont understen how read creating or reding vurnability in any programs? Is important programme lengu?*

Your pronounciation of scalar is very german. It should sound almost like skate but with a lar at the end.

7:26 i'm going to use your

dingdingdingas a ringtone for my phonepuffffMINDBLOWN!!What is your GF like? Crypto nerd: She is ideal.

Everyone! Head over to http://crypto-textbook.com . Cool videos spotted!

[Ignore this comment]

To get 256 linearly independent vectors, you could have used unit vector in each direction. Something like [1,0,0……0] and [0,1,0,0….,0] and …

Same as standard unit vectors i, j and k.

Can you provide us with the router python scripts so that we don't have to retype them in order to practice this thing ?

"Isn't that amazing?"Yes, that is mind blowing.

Finally, closure from part 1. Need to clean the walls now, my mind just got blown and there are bits everywhere.

Great visualisation of vector algebra BTW. Either you've seen +3Blue1Brown's videos, or he needs to see this before you collaborate (you really should). I had no idea these two topics could overlap, seriously well done.

11:17 Defined over the Galo-is field

So, I wondered what was the probability of those 256 random 256-dimensional-vectors in GF(2) being linear independent?

I got 28.8788…%, can somebody confirm or deny this?

Oh wow… That was a lot more involved than I was anticipating…

Despite learning all this in university, this would never have occurred to me.

In retrospect, of course, it does make perfect sense.

Haha, my former professor at your video! Did you studie at Ruhr Universität Bochum? 😉 I got my master degree in IT-Security at RUB.

Mind blowing. I loved the part of Christof Paar's lecture. It reminded me of my cryptography classes and his videos helped me to learn theories of cryptography

Nicely explained

That one was really mind blowing. Probably have to watch it again to get all things sorted out in my brain 😀

This was amazing! Your explaination and visualization were perfect!

Ok, but somewhere is a dude sitting who came up with this challenge. What. The. And I am happy I understood it for second watch. I am inspired!

Wieso sind in dieser Kommentar Section so viele Deutsche? xD

Fantastic vid

Brilliant. If only it was solved for ADD instead of XOR we may have been millionaires…

You seriously make some of the best stuff on YouTube.

I get it! Awesome. BTW any suggestions on how to form/find/join a CTF team?

Splendidly done! And an a great write-up!

I wished I had a professor like you at our university. Instead, we had a lady who corrects herself 3 times in a row. Please don't stop.

Today in school while making a cross word puzzle for a project, I injected some HTML into the hints for the cross word words. Now I’m the only student in our grade with a cross word puzzle with images for the hints.

Thank you. I've always been interesting in "hacking" but never had the motivation to do something with it. Mostly because I just did not know where to start. But since I found out about your channel, the motivation came out of nowhere. Your video's are really helping me understanding how "hacking" works. You also heped me discover CTF challenges, wth I didn't even know these existed. One week later I finished about 20 easy challenges without cheating/hints. You are amazing, and you make amazing video's. Please keep up the good work!!!

Hello have you a server discord ,

omfg

around 7:33 when he started talking about XOR-ing bytes thats when i got the picture and understood what was wrong. i remembered the code from part 1.

HOLY SHIT,

just…holy shit! o.O

Very interesting, have to try these crypto challenges myself but I'm pretty sure I wouldn't figure this one out and just waste time. At least my goal is to learn what not to do in cryptography, and your videos sure help a lot. In this case it's how not to make an update system.

Amazing!!

I need to learn linear algebra.

I've had enough calc anyways.

Wait so in gf(2) 0+0=1+1?? What is this?

I was curious about the actual probability of 256 vectors being linearly independent. Quick calculation with WolframAlpha gave me 0.288788, which means you were in a little bit of luck. Interestingly enough, (1/2; 1/2)_n, the probability of n n-dimensional vectors being linearly independent, approaches ~0.288788 as n goes to infinity. (That symbol (a; q)_k is called q-Pochhammer Symbol: http://mathworld.wolfram.com/q-PochhammerSymbol.html )

This was brilliant. Thanks.

Pretty good video, I wanted to understand most of it but had a good sleep instead. I understand most of your videos, but sleep is good as well

What's the probability that 256 random bitvectors will be linearly independent?

If I did the calculations right, it's:

prod_{i=0}^{255} (1 – 2^{i – 256})

(see https://arachnoid.com/latex/?equ=%5Cprod_%7Bi%3D0%7D%5E%7B255%7D%20(1%20-%202%5E%7Bi%20-%20256%7D) for a picture)

The idea is that every time we add a new linearly independent 256-dimensional bitvector to the set, the span of the set will double in size. Each new bitvector must not be in the span of the previous bitvectors, and the first bitvector can't be the zero bitvector.

The expressions evaluates to about 28.88%.

Re Linear dependence: Aren't all non-equal, non-zero vectors in GF(2) independent?

My thinking is that if you have two vectors P and Q that are dependent, there must be a scalar S such that S*P=Q.

But since we're in GF(2), the possible values for S are just 0 and 1.

That's not to say anything you did is incorrect, but it does mean that the dependence check using sage is unnecessary if you ensure all file names are unique (and that hash collisions are improbable).

Isn‘t what you‘re talking about linear independence js actually just span of the vectors? Linear independce is when all of your vector collection can‘t be built by another vector in that collection

This is interesting and I think I kind of get it. Is there a way of intuitively or even mathematically check quickly if the runtime of such an attack is feasible? Obviously, it is but how could you guess beforehand?

Scary stuff! The xoring looked so sane to me…

That is magic, but I kinda understood it. But my linear algebra is not that good though

sploosh

Wow, nice stuff. Thanks for the video.

I know this is an old video but is this essentially just describing how a hash collision work? If so this is the best description of the actual mathematical reasoning behind it I've seen.

That or I totally missed something and this is different, but still cool none the less.

6:05 probably no one ever told his girlfriend 🙁

I thought this would be a lot simpler. Include the original pre_copy.sh and two copies of your modified version. Hack the zip file so they all have the same name. The hashes of the two copies of your version cancel out, making it as if they didn't exist, but they overwrite the original when extracted.

Also, could you defend against this by also including the number of files in the zip in the hash input?

My mind is so blown

In gf 2 two vectors need to be identical to be linearly dependant. Just saying.

Watch out 3blue1brown! LiveOverflow's stealing the market!

The finite field GF(2^p) is everywhere in crypto

Any chance you could talk about eigen values? :3

Solving a linear equation in 256×256 matrix sounds pretty impossible to me and yet the program seems to do it so quickly…

This trick is also used in competitive programming.