Function Calls, Arguments, and Walking the Stack - by dcx2

This will be an academic discussion of how to “walk the stack”.  When writing a code that modifies the game's assembly code, you need to be careful about when the modified code is being called.  For example, I made a Super Mario Galaxy code that increased star bits by 1 when you shot them.  However, it had the nasty side effect of subtracting 1 from your star bits when you picked them up.  Worse, if you started with 0, you couldn't shoot any, and picking up star bits subtracted 1, so you were screwed!

In programming, we have functions which are called with arguments.  When a function has to do some work, it needs to free up some registers, so it makes some room on the stack and pushes the registers it will use onto the stack.  When the function is done computing, it can return the registers to normal by popping the registers off the stack before returning to the caller.

So you could say that the AddStarBits() function was being called when I collected [ AddStarBits(1); ] or shot [ AddStarBits(-1); ] star bits.  Because this code is called from multiple places, changing it could have many wide-ranging and unintended side effects.  It would be much better to see who is calling the function and change the argument in the “shot star bits” code.
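Here's a rough sketch of that problem as a toy model.  The Game class, its method names, and the two flags are made up for illustration; the real game obviously isn't written like this.  It just shows why patching the shared function hits both callers, while patching one caller's argument hits only that caller.

```python
class Game:
    """Toy model: one shared function, two callers (all names hypothetical)."""

    def __init__(self, star_bits, callee_patched=False, shoot_arg=-1):
        self.star_bits = star_bits
        self.callee_patched = callee_patched  # models swapping add for sub
        self.shoot_arg = shoot_arg            # models the argument at the "shoot" call site

    def add_star_bits(self, amount):
        # The shared callee: every caller ends up here.
        if self.callee_patched:
            self.star_bits -= amount
        else:
            self.star_bits += amount

    def collect(self):
        self.add_star_bits(1)                 # caller 1: picking up a star bit

    def shoot(self):
        self.add_star_bits(self.shoot_arg)    # caller 2: shooting a star bit


# Patching the callee: shooting now adds 1, but collecting subtracts 1.
hacked_callee = Game(20, callee_patched=True)
hacked_callee.shoot()
hacked_callee.collect()
print(hacked_callee.star_bits)   # 20 (went up to 21, then back down)

# Patching only the shoot caller's argument: collecting still works.
hacked_caller = Game(20, shoot_arg=0)
hacked_caller.shoot()
hacked_caller.collect()
print(hacked_caller.star_bits)   # 21
```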

DCX2, why would we want to do that?  Just write the max to the RAM value all the time!  For your general h4x that's true, but this is an academic discussion so we're using something a bit more contrived in order to demonstrate this process.

Go to Good Egg Galaxy (or any world that's not the observatory, because my example requires you to be in a level).  Get some Star Bits (I got 20), do some Code Search, find the memory location.

Too many results, so I shoot one

Ah, that's better.  Right click on the address, then poke (to 0x14) and verify that it is the star bit count.  Then right click on the address again, then breakpoint.  Switch tabs and set a Write breakpoint and shoot a star bit.  This will provide a clue as to who is changing our star bits.

[NOTE: Sometimes when using data breakpoints you will want Exact match, but this road is occasionally fraught with perils ]

This is our friend who is writing the new star bit count from r3 (was 0x14 but is now 0x13).  We could nop this out if we wanted to, but remember, this is academic (and nopping it would break adding star bits as well).  So let's explore the disassembly surrounding this instruction.

Note the highlighted add instruction.  It sets r3, and a few instructions later, stores r3 into the star bit count.  So what's the state of the registers when this is going on?  Right click the add and breakpoint.  Shoot another bit.

[ NOTE:  WiiRdGUI automatically makes it an execute breakpoint when you use right-click in the disassembly ]

r4 has 0xFFFFFFFF, which is two's complement for -1.  So we're adding -1 to the current star bit count in r0 and putting the result in r3 before storing it back to RAM.  So let's try making the add a sub instead!
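If the two's complement part is fuzzy, this little sketch shows the arithmetic.  The helper names are mine, and r0 = 0x14 is just an example count; the point is that a 32-bit add of 0xFFFFFFFF wraps around and behaves exactly like subtracting 1.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret a 32-bit register value as a signed two's complement number."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

def add32(a, b):
    """32-bit add with wraparound, like the add instruction on 32-bit registers."""
    return (a + b) & MASK32

r0 = 0x14            # example star bit count (20)
r4 = 0xFFFFFFFF      # the argument seen in the register view
r3 = add32(r0, r4)

print(to_signed32(r4))   # -1
print(hex(r3))           # 0x13
```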

This succeeds.  Shall we call it a day?  We shouldn't...let's run around and make sure nothing odd happens.  Go grab a star bit and suddenly our star bit count is decreasing!  Set the breakpoint on that instruction again and grab another star bit.

r4 now has a 1!  That's going to subtract from our current star bit count, making it go down.  So where does r4 come from?  If you search through the disassembly of this function, you will find nothing that sets r4.  That's because r4 is an argument.  So when we get a star bit, the caller passes in 1 as the argument, and when we shoot one, the caller passes in -1.  We must walk the stack and find the original caller and change the -1.  But how do we find where the caller was from?

When you call a function, you need some way to get back.  This mechanism is provided by the Link Register, LR.  When you use the Branch and Link instruction, BL, it places the address of the next instruction into the Link Register as a way of “linking” program execution back to where we left off.  Once a function is done, it will Branch to Link Register, BLR, and program execution continues after the call.

We can't just look at the LR immediately, though.  Functions can call functions can call functions can call functions...Any BL between the start of the function and the current instruction will have modified the LR.  So if a function is going to modify the LR, it will need somewhere to put the current LR, so that it can get back to the function that called it.
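To see why the LR has to be saved, here's a minimal sketch that models nothing but the Link Register.  The Cpu class and both call-site addresses are invented for the example.  Function A gets called, then makes a call of its own; if it doesn't save the LR first, its blr goes to the wrong place.

```python
OUTER_CALL_SITE = 0x80001000   # hypothetical address of the bl that calls A
INNER_CALL_SITE = 0x80002010   # hypothetical address of a bl inside A

class Cpu:
    """Only the Link Register matters for this sketch."""

    def __init__(self):
        self.lr = 0

    def bl(self, call_site):
        # bl puts the address of the NEXT instruction (4 bytes on) into the LR
        self.lr = (call_site + 4) & 0xFFFFFFFF

    def blr(self):
        # blr jumps to whatever the LR holds right now
        return self.lr

def function_a(cpu, save_lr):
    saved = cpu.lr if save_lr else None   # stands in for mflr + stw
    cpu.bl(INNER_CALL_SITE)               # A calls another function, clobbering the LR
    if save_lr:
        cpu.lr = saved                    # stands in for lwz + mtlr
    return cpu.blr()

def call_a(save_lr):
    cpu = Cpu()
    cpu.bl(OUTER_CALL_SITE)
    return function_a(cpu, save_lr)

print(hex(call_a(save_lr=False)))  # 0x80002014, stuck inside A
print(hex(call_a(save_lr=True)))   # 0x80001004, back in the original caller
```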

To store the LR, we will use the stack.  The stack is just a large portion of contiguous memory used to hold register values so that a function can use those registers without worrying about destroying any of the caller's data.  The stack pointer, r1, points to the top of it.  When a function needs to modify any registers (LR or otherwise) without losing the original value, it will subtract the required space from r1 to make room on the stack and then push the register's value into the blank spot.

When making room on the stack, you will typically see stwu r1,-[space required](r1); the offset is negative because the stack grows toward lower addresses.  In the case below, we're clearing out 16 bytes for storing registers on the stack.  The next instruction, Move From Link Register mflr, puts the LR into r0.  This is necessary because there's no instruction for writing the LR directly onto the stack.

The next two highlighted stw's are pushing registers r0 (the “return address” from the LR, so that we can bl safely) and r31 (room required for a local variable) onto the stack.

At the end of the function, the highlighted lwz's are popping the registers' original values off the stack now that we're done with them.  Then we Move To Link Register mtlr to put the return address back to normal, add 16 to the stack pointer to release the memory we requested, and finally BLR back to our caller!
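The whole push/pop dance can be sketched on a fake register file and a fake memory.  This is only a simulation under assumptions: the 16-byte frame matches the example above, but the slot offsets (20 and 12) and all the starting values are ones I picked for illustration, not read out of the game.

```python
def run_function(regs, mem, body):
    """Simulate a prologue, a body that trashes registers, and an epilogue."""
    # stwu r1,-16(r1): store the old r1 at r1-16, then point r1 at it
    old_sp = regs["r1"]
    regs["r1"] = old_sp - 16
    mem[regs["r1"]] = old_sp
    # mflr r0: the LR can't be stored directly, so copy it into r0 first
    regs["r0"] = regs["lr"]
    # stw r0,20(r1) and stw r31,12(r1)  (offsets are assumed)
    mem[regs["r1"] + 20] = regs["r0"]
    mem[regs["r1"] + 12] = regs["r31"]

    body(regs)  # free to clobber lr, r0 and r31

    # lwz r0,20(r1) and lwz r31,12(r1): pop the original values
    regs["r0"] = mem[regs["r1"] + 20]
    regs["r31"] = mem[regs["r1"] + 12]
    # mtlr r0, then addi r1,r1,16 to release the frame
    regs["lr"] = regs["r0"]
    regs["r1"] += 16
    # blr: execution continues at the restored LR
    return regs["lr"]

def messy_body(regs):
    regs["lr"] = 0xDEADBEEF   # as if the body did a bl of its own
    regs["r31"] = 0
    regs["r0"] = 0

regs = {"r0": 0, "r1": 0x80400000, "r31": 0x1234, "lr": 0x80001004}
mem = {}
print(hex(run_function(regs, mem, messy_body)))  # 0x80001004
print(hex(regs["r31"]), hex(regs["r1"]))         # 0x1234 0x80400000
```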

This provides two opportunities to find our caller – before the function pushes the LR and after it pops the LR.  I usually choose to break on stwu because there could potentially be multiple blr's in a single function.

Right click and set a breakpoint on the stwu and shoot another star bit.  When we break, the link register will hold the address of the instruction after the bl that called us. 

Copy and paste the value in the LR into the disassembler.  Scroll up several instructions, to see the calling bl and whatever comes before it.

[ NOTE: bl stores the address of the NEXT instruction in the LR, so you won't be staring at the caller when you enter the address into the disassembly, but the instruction AFTER the caller ]

The highlighted bl is one instruction (4 bytes) before the address in the LR.  That's the instruction that calls the function that's updating the star bit count.  As a sanity check, double click the bl and the disassembler will jump to that instruction.

It doesn't point to the stwu!  That's okay because there's a branch that DOES go to the stwu.
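If you want to do this step by hand instead of double clicking, here's a hedged sketch.  The function names and the one-word "memory" are hypothetical; the decoding follows the standard PowerPC branch encoding (opcode 18, a signed offset, then the absolute and link bits).  Given the LR, it steps back 4 bytes to the bl and works out where that bl goes.

```python
def decode_branch(address, word):
    """Decode a PowerPC b/bl/ba/bla instruction word found at 'address'."""
    if (word >> 26) != 18:
        raise ValueError("not a b/bl style branch")
    offset = word & 0x03FFFFFC
    if offset & 0x02000000:          # sign-extend the 26-bit offset
        offset -= 0x04000000
    absolute = bool(word & 2)        # AA bit
    link = bool(word & 1)            # LK bit: set for bl
    target = offset if absolute else address + offset
    return target & 0xFFFFFFFF, link

def caller_from_lr(lr, read32):
    """The LR holds the address AFTER the bl, so the call site is LR - 4."""
    call_site = (lr - 4) & 0xFFFFFFFF
    target, link = decode_branch(call_site, read32(call_site))
    return call_site, target, link

# Hypothetical memory: a single bl at 0x80001000 that jumps 0x100 bytes ahead.
fake_memory = {0x80001000: 0x48000101}
site, target, link = caller_from_lr(0x80001004, fake_memory.__getitem__)
print(hex(site), hex(target), link)   # 0x80001000 0x80001100 True
```

If the target turns out to be a plain b rather than a stwu, as happened here, run decode_branch on that word too and follow it one more hop.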

Back to the disassembly.  There are a bunch of bl's before the one that leads to our first breakpoint, and any one of them *could* load r4, or r4 could be another argument passed in by whoever called this function, so let's break at the start of this new function and see if we can spot the argument.

Got a star bit here...

Shot a star bit there...

Gee, r4 looks like a really good candidate for the argument.  After staring at the Matrix for a while, you'll start to notice that r3 and r4 are often used to pass things into/out of functions.  So let's walk the stack again, by copying/pasting LR into the disassembly tab, and scroll back a few lines.  (remember that the LR holds the address of the NEXT instruction AFTER the bl, which is why you end up staring at the lwz if you don't scroll up)

Notice that this time the bl points straight at the stwu; this is the typical case.  So the argument r4 came from r31.  Where did r31 come from? 

So once again, someone passed something in, this time with r3.  This was then cached in r31.  So let's break on the start of this new function!  (a journey of a thousand functions begins with a single call...)

Copy and paste the LR into the disassembly, scroll up, look around...

HMMM li r3,-1 right before the bl?  Gee, I wonder what that does.  Let's try changing it to li r3,0 and see what happens.  Whoo, star bit count stays unchanged when shooting!
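For the curious, here's what that one-instruction change looks like as raw machine code.  The encode_li helper is my own; it relies on li rD,value being shorthand for addi rD,0,value (opcode 14).  I'm leaving the address out, since that depends on your copy of the game.

```python
def encode_li(rd, value):
    """Encode 'li rD,value', i.e. addi rD,0,value, as a 32-bit instruction word."""
    if not 0 <= rd <= 31:
        raise ValueError("register number must be 0..31")
    if not -0x8000 <= value <= 0x7FFF:
        raise ValueError("li only takes a signed 16-bit immediate")
    return (14 << 26) | (rd << 21) | (value & 0xFFFF)

original = encode_li(3, -1)   # li r3,-1
patched = encode_li(3, 0)     # li r3,0

print(f"{original:08X}")   # 3860FFFF
print(f"{patched:08X}")    # 38600000
```

So the finished hack boils down to a single 32-bit write of the patched word over the original one.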

We started with rudely swapping the add for a sub, which led to unintended consequences when getting star bits legitimately.  After walking the stack a few times, we found the one line of code that only gets run when we're shooting star bits, and now our hack is sleek.

I hope that now you understand function calls, arguments, and stack management a little better, and that it might come in handy when writing more interesting hacks.