<ol><li>You must be able to force the game to call your subroutine.</li><li>Your subroutine must return cleanly to the game. Ideally, the only noticeable changes in the game's behavior will be the effects the code was designed to create.</li><li>Registers used by your subroutine's code must be preserved.</li></ol>
Each of these will be covered by this example.
A slow-motion code is a fairly generic custom subroutine. The actual code is very straightforward, and there are a handful of guides with generic routines already floating around the internet. The basic premise is to create a loop that will slow down the game's execution just enough to be a useful cheat. We're going to create a slow-motion code for Final Fantasy X-2.
For this code, following Rule 1 is pretty easy. The simplest place to call the routine from (often referred to as "hooking") is the scePadRead function in the game [1]. The scePadRead is part of a generic developer library that deals with the control pad. The easiest way to find it is to import labels in PS2Dis from a game that has its scePadRead labeled. There are other ways (link to joker/mcode guide(s)). The scePadRead in FFX-2 is located at 0x00321600.
Now, the code definitely shouldn't interfere with the functions of the scePadRead. We only hook it here so that the loop will be executed frequently. Sticking the call in the wrong place might have some interesting effects on the pad. So we'll hook it at the return instruction, which is at 0x00321674. Here's what the scePadRead looks like; the hook will be the "jr ra" op near the bottom of the image:
Now that we know where our call will be, we need to write the routine.
The routine will be written in memory at 0x000C4000. The "jr ra" at the end of the scePadRead will need to be changed to jump to that address. There are a couple of ways this can be done. One is to use a "jal" op code. However, "jal" is "jump and link," which means that the contents of register "ra" will be overwritten when it is executed. In this case, we don't want to destroy that value, so this code will use a plain "j" op code to 0x000C4000. The cheat code for that will be: 20321674 08031000
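If you'd rather not assemble the hook by hand, here's a minimal Python sketch of how that cheat line comes together. The helper names (encode_j, cheat_line) are made up for this illustration, and it assumes the usual raw-code convention seen throughout this guide, where a leading 2 means "write this 32-bit value to this address".

```python
def encode_j(target):
    """Encode a MIPS `j target` op: opcode 2 plus the target's 26-bit word index."""
    if target % 4:
        raise ValueError("jump targets must be word-aligned")
    return (0x02 << 26) | ((target >> 2) & 0x03FFFFFF)


def cheat_line(address, value):
    """Format a 32-bit constant write (assumed convention: leading digit 2)."""
    return f"{0x20000000 | (address & 0x01FFFFFF):08X} {value:08X}"


# Replace the "jr ra" at 0x00321674 with a jump to the routine at 0x000C4000.
hook = cheat_line(0x00321674, encode_j(0x000C4000))
print(hook)
```

The jump target is divided by four because MIPS instructions are always word-aligned, so the two low bits never need to be stored.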
Now the delay loop must be written. The code for that is pretty simple.
A high value is dropped onto register v0 (lui v0, $0020 leaves 0x00200000 there) and then decremented in a loop until it reaches zero. The li op in the branch delay slot (0x000C4010) accomplishes the decrement. The actual op is addiu v0, v0, $-1; the li is just an attempt to be helpful by PS2Dis.
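To get a feel for how much work the loop does, here's a rough Python model of it. This is only a sketch: it assumes the three ops from the finished code (lui v0, $0020 / bgtz v0 branching to itself / addiu v0, v0, $-1 in the delay slot), and delay_loop is a hypothetical name. The thing to notice is that the delay slot runs on every pass, including the last one where the branch falls through.

```python
def delay_loop(start):
    """Model `bgtz v0, self` with `addiu v0, v0, -1` in its branch delay slot.

    Returns (number of times the branch executed, final value of v0).
    """
    v0 = start
    branches = 0
    while True:
        taken = v0 > 0   # bgtz tests v0 before the delay slot changes it
        branches += 1
        v0 -= 1          # the delay slot executes whether or not the branch is taken
        if not taken:
            return branches, v0


# lui v0, $0020 puts 0x0020 in the upper half of v0 and zeros the lower half.
branches, final_v0 = delay_loop(0x0020 << 16)
print(branches, final_v0)
```

So v0 actually ends up one below zero when the loop exits, which doesn't matter here because its old contents get restored afterwards.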
You may be wondering why the code in the screen print starts at 0x000C4008 instead of 0x000C4000. Patience.
Now we need to follow Rule 2 and return cleanly to the game. Rule 2 can be broken down into a few general guidelines:
1. If you jal into a subroutine, you should jr ra back out of it and the return address (register ra) of the function you hooked had better be stored somewhere so it can be retrieved.
2. If your subroutine will only be called from one place, it's often simpler to just j to it and then j back to the instruction after the branch delay for your hook. This saves you having to maintain the ra register.
3. If your subroutine is sufficiently simple, you can merely use the return address of the hooked function and return to its caller. You have to be careful not to do this before the hooked function finishes its work.
For this code, the third option will be used. That's why the hook replaced the jr ra op. Our little subroutine will be able to safely return to the scePadRead's return address.
Now to add the final touch to the code. Notice that register v0 is used to create the delay. Whatever was on that register before is now gone. The scePadRead is called from a few places in FFX-2, so it's difficult to tell if any register can be safely used the way s7 was in the previous example. To be safe, we'll need to preserve register v0 before the main body of our code.
Registers are generally preserved as a preamble to most subroutines. The most used method is to move the stack pointer (register sp) back a few bytes. That is, add a negative value to the stack pointer. The registers that need to be preserved are then stored at offsets from the stack pointer. You need to be aware of how much data you need to store and shift the stack pointer back a sufficient number of bytes. There's no harm in using a larger value than necessary to be safe. We've only got one register to store in this subroutine, so we'll just back up the stack by -0x10 (0xfff0) or 16 bytes. The code for that is addiu sp, sp, $fff0; the cheat code for that will be 200C4000 27BDFFF0. To store v0 and preserve its contents we need only do a "sd v0, $0000(sp)" [2]; the cheat code will be 200C4004 FFA20000.
That's only half of the task involved in preserving registers, though. The second half is restoring the registers once the subroutine has finished its work, and that includes restoring the stack pointer. All that's required for this code is an "ld v0, $0000(sp)" to restore v0's old contents. The cheat code for that is 200C4014 DFA20000. Finally, the stack pointer gets its old value back by adding positive 0x10 to it: "addiu sp, sp, $0010". The cheat code for that is 200C401C 27BD0010.
You may notice the single instruction gap between the restore of v0 and the restore of the stack pointer. This is there to allow room for the "jr ra" op that will return from the subroutine. The stack pointer restore sits in the branch delay slot of that "jr ra", so it still executes before control returns to the game. The cheat code for the "jr ra" will be 200C4018 03E00008.
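For anyone who wants to check the machine code for the preserve and restore ops, here's a small Python sketch that builds them from the standard MIPS I-type layout (6-bit opcode, rs, rt, 16-bit immediate). The helper name encode_i is made up for this example; the register numbers (sp = 29, v0 = 2) and opcodes (addiu = 0x09, ld = 0x37, sd = 0x3F) are the standard MIPS values.

```python
SP, V0 = 29, 2                      # MIPS register numbers for sp and v0
ADDIU, LD, SD = 0x09, 0x37, 0x3F    # I-type opcodes used by the preamble and epilogue


def encode_i(opcode, rs, rt, imm):
    """Pack an I-type instruction; negative immediates wrap to 16 bits."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


routine = [
    (0x000C4000, encode_i(ADDIU, SP, SP, -0x10)),  # addiu sp, sp, $fff0
    (0x000C4004, encode_i(SD, SP, V0, 0)),         # sd v0, $0000(sp)
    (0x000C4014, encode_i(LD, SP, V0, 0)),         # ld v0, $0000(sp)
    (0x000C401C, encode_i(ADDIU, SP, SP, 0x10)),   # addiu sp, sp, $0010
]

lines = [f"{0x20000000 | addr:08X} {word:08X}" for addr, word in routine]
print("\n".join(lines))
```

Note that for loads and stores the base register (sp) goes in the rs field and the register being saved goes in rt.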
The final result in assembly looks like this:
The cheat code, including the "hook" will be:
Final Fantasy X-2
Slow Motion Code
200C4000 27BDFFF0
200C4004 FFA20000
200C4008 3C020020
200C400C 1C40FFFF
200C4010 2442FFFF
200C4014 DFA20000
200C4018 03E00008
200C401C 27BD0010
20321674 08031000
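If you don't have PS2Dis handy, the listing can be recovered from the cheat lines themselves. Below is a bare-bones Python decoder sketch that only understands the seven ops this code uses; it is nowhere near a real disassembler, and the function names are invented for the example. Register names follow the common MIPS naming, which may differ slightly from what your tool shows.

```python
REGS = ("zero at v0 v1 a0 a1 a2 a3 t0 t1 t2 t3 t4 t5 t6 t7 "
        "s0 s1 s2 s3 s4 s5 s6 s7 t8 t9 k0 k1 gp sp fp ra").split()


def signed16(value):
    """Interpret a 16-bit immediate as a two's-complement number."""
    return value - 0x10000 if value & 0x8000 else value


def disassemble(addr, word):
    """Decode only the handful of ops this routine uses."""
    op = word >> 26
    rs, rt = (word >> 21) & 31, (word >> 16) & 31
    imm = signed16(word & 0xFFFF)
    if op == 0x00 and (word & 0x3F) == 0x08:
        return f"jr {REGS[rs]}"
    if op == 0x02:
        target = ((addr + 4) & 0xF0000000) | ((word & 0x03FFFFFF) << 2)
        return f"j 0x{target:08X}"
    if op == 0x07:
        return f"bgtz {REGS[rs]}, 0x{addr + 4 + imm * 4:08X}"
    if op == 0x09:
        return f"addiu {REGS[rt]}, {REGS[rs]}, {imm}"
    if op == 0x0F:
        return f"lui {REGS[rt]}, 0x{word & 0xFFFF:04X}"
    if op in (0x37, 0x3F):
        name = "ld" if op == 0x37 else "sd"
        return f"{name} {REGS[rt]}, {imm}({REGS[rs]})"
    return f".word 0x{word:08X}"


CODE = """
200C4000 27BDFFF0
200C4004 FFA20000
200C4008 3C020020
200C400C 1C40FFFF
200C4010 2442FFFF
200C4014 DFA20000
200C4018 03E00008
200C401C 27BD0010
20321674 08031000
"""

listing = []
for line in CODE.split("\n"):
    if line.strip():
        raw_addr, word = (int(field, 16) for field in line.split())
        addr = raw_addr & 0x01FFFFFF  # drop the leading "2" write command
        listing.append(f"{addr:08X}  {disassemble(addr, word)}")

print("\n".join(listing))
```

The bgtz decodes to a branch onto its own address, which is the whole delay loop: the branch keeps re-executing itself while the op in its delay slot counts v0 down.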
Wait, though: slow motion is annoying and not the sort of thing you want active all the time. So let's make an on/off joker setup. We'll use R3 and L3 to activate and deactivate the code. All you have to do is joker the line with the hook in it (the last line) and provide two extra lines to "turn off" the hook by putting the game's old code back. So the real final code will be:
Press R3 For Slow Motion, L3 Returns To Normal
200C4000 27BDFFF0
200C4004 FFA20000
200C4008 3C020020
200C400C 1C40FFFF
200C4010 2442FFFF
200C4014 DFA20000
200C4018 03E00008
200C401C 27BD0010
D05B29C2 0000FFFB
20321674 08031000
D05B29C2 0000FFFD
20321674 03E00008
The lines with 20321674 are the on/off codes (03E00008 is off). The joker command for FFX-2 is D05B29C2 0000????. I'd provide a screenshot of the code in action, but it really wouldn't show anything. All pictures are in slow motion.
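The ???? part of the joker command is just an inverted button mask. Here's a hedged Python sketch of that idea. The bit positions for L3 and R3 are assumptions inferred from the FFFD and FFFB values used in the code above (a pressed button reads as a cleared bit), and joker_value / joker_line are names invented for the example.

```python
L3_BIT = 0x0002  # assumed bit position, consistent with the FFFD value used for L3
R3_BIT = 0x0004  # assumed bit position, consistent with the FFFB value used for R3


def joker_value(*buttons):
    """Return the 16-bit pad value for the given buttons (pressed = bit cleared)."""
    mask = 0
    for button in buttons:
        mask |= button
    return ~mask & 0xFFFF


def joker_line(pad_address, *buttons):
    """Format a D-type joker line for this game's pad address."""
    return f"{0xD0000000 | pad_address:08X} 0000{joker_value(*buttons):04X}"


slow_on = joker_line(0x005B29C2, R3_BIT)
slow_off = joker_line(0x005B29C2, L3_BIT)
print(slow_on)
print(slow_off)
```

Passing more than one button gives the value for a combination, since each pressed button simply clears one more bit.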
<div style="border-top: thin solid; font-size: 8pt;">1 - Not all games have a scePadRead. If a game you're hacking doesn't have one, you'll need to find somewhere else to place the "hook". <p>2 - The store op is dependent on the number of bits in use in the register. While most times storing a word will be sufficient, it doesn't hurt to play it safe. When in doubt, store quad (<font face="courier new">sq reg, off(reg)</font>).</p><p>General Note - This code probably could have been done without the need to preserve registers; however, I wanted to do so for demonstration purposes.</p></div>
More Complex Subroutines And Code Optimizing
Now we've got the basic rules of writing custom subroutines down, so let's expand on the concept a bit to create a more complex routine. This example will also include some tips on optimizing the code. In this case "optimization" refers more to refining the code into the fewest lines we can. Speed really isn't a consideration when you're writing such tiny amounts of machine code.
This example will create a code for the game "Genso Suikoden III." What the code will do is give all characters an S rank in all their skills. Skills have an S rank when 0x8 is present in the memory that represents the rank, so there's the value we need. Now at first blush this might sound like something that could be accomplished simply with cheat device commands, but it really can't be.
Each character has at most eight skills and ranks (all characters have space for eight, but they aren't always used).
The way the skills are organized in memory looks like this, where "S" stands for skill and "R" stands for rank: |S|R|S|R|S|R| and so on. Each value is one byte. Therein lies the problem. You have to do 8-bit writes to set the ranks unless you want to overwrite the skills as well. There are 113 characters in the game, so that's about 8 * 113 = 904 lines of code to do it. Now on an AR MAX, you can do 8-bit slide codes and that would reduce it to 16, but that doesn't help people who don't have an AR MAX. And there's another problem: the AR MAX will write an 8 to all eight skills for all characters, even if the skill slot is unused. That may not be a problem, but it would be nice if it can be avoided.
Luckily, each character occupies the same amount of memory and they're all adjacent to each other. That means all the subroutine has to do is load the address of the first character's first skill, make jumps to set all eight for that character and finally jump to the next character. There are 109 characters in the block, so the code can just use two nested loops to set the whole mess.
Pseudo-code for this routine:
<pre> LOAD INITIAL ADDRESS
 CHARACTER_COUNT = 0
 WHILE (CHARACTER_COUNT < 109)
     SKILL_COUNT = 0
     WHILE (SKILL_COUNT < 8)
         IF SKILL IS ZERO
             JUMP TO NEXT SKILL
         ELSE
             SET SKILL RANK TO 'S' (8)
         END_IF
         INCREMENT SKILL_COUNT
     END_WHILE
     CHARACTER_COUNT = CHARACTER_COUNT + 1
     JUMP TO NEXT CHARACTER
 END_WHILE
</pre>
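The same logic can be modelled in a few lines of Python against a fake block of memory, which is a handy way to sanity-check the loop before committing it to assembly. This is only a sketch: promote_all is a hypothetical name, and the 140-byte record stride is an assumption worked out from the increments the finished routine uses (2 bytes per skill/rank pair, eight times, plus 0x7C to reach the next character).

```python
SKILLS_PER_CHARACTER = 8
CHARACTER_COUNT = 109
S_RANK = 8
RECORD_STRIDE = 8 * 2 + 0x7C  # assumed distance between characters (140 bytes)


def promote_all(memory, first_skill, characters=CHARACTER_COUNT):
    """Set the rank byte to S for every skill slot that actually holds a skill."""
    for character in range(characters):
        base = first_skill + character * RECORD_STRIDE
        for slot in range(SKILLS_PER_CHARACTER):
            skill_at = base + slot * 2        # layout is |S|R|S|R|...
            if memory[skill_at] != 0:         # leave unused slots alone
                memory[skill_at + 1] = S_RANK


# Tiny demonstration with two made-up characters.
demo = bytearray(2 * RECORD_STRIDE)
demo[0], demo[1] = 0x11, 3                    # character 0, slot 0: skill present, rank 3
demo[RECORD_STRIDE + 14] = 0x22               # character 1, slot 7: skill present, rank 0
promote_all(demo, 0, characters=2)
print(demo[1], demo[3], demo[RECORD_STRIDE + 15])
```

Only the rank bytes next to a non-zero skill byte change, which is exactly the behavior the 8-bit slide code can't give you.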
Now we have to decide where to hook it. After that, we have to choose what registers to use and then the subroutine can be written. Since this deals with skills, the normal routine for increasing them seems like a good place to hook it. The jump should be near the bottom of the routine, for safety and other reasons that will be explained shortly. Here's the end of the skill increase routine (it's rather large):
Several registers are being restored from the stack at the end of this routine. That's perfect for us, because you can save instructions in your subroutine by putting the hook in the right spot and using as many registers as possible that are already saved on the stack. The ideal place to put the hook is before any registers, other than ra, are restored. Unfortunately, we can't do that here. See the labels "__016c7fb4", "__016c7fb8" and "__016c7fbc"? Each of those indicates that some branch instruction or other refers to that address, so it's tough to guarantee execution of instructions above it. We need to guarantee execution of our subroutine, so the hook should be on or after the last branch target (label). So we'll put our hook at 0x016C7FBC. (The label "__016c7fc4" is mine because my version of the subroutine references it. It's nothing to worry about.)
Putting our hook there means that the "ld s4, $0040(sp)" op will be replaced. So somewhere in the subroutine we'll need to execute it on our own. The "ld s3, $0030(sp)" will still be executed because it will be in the branch delay slot of our jump. That means s3 will already be restored by the time our routine runs, so if we used it we'd need to restore it again. We'll take care to avoid using it. That leaves s2, s1 and s0 free for use, and they need not be preserved by us because they'll be restored on returning from our routine.
We could just as well use whatever registers we want and preserve them ourselves, but there's a good reason for avoiding the need to preserve registers. Each register requires a store and a load op to preserve. In even a very simple subroutine, you can easily add a dozen or more lines to the code just to save registers. In a subroutine that uses five registers, finding a way to avoid preserving four of them in your routine will save eight lines. If you can find a way to avoid preserving all five, then the code can be twelve lines shorter: ten for the stores and loads, plus two more because you no longer need to maintain the stack pointer.
Now we know that we can use s0, s1 and s2 freely. We might as well use s4 since we'll have to restore it anyway. So let's try to write the subroutine using only those four registers.
We'll use s0 for the address. s1 and s2 will be the character and skill counters respectively and s4 will be multi-purpose. Here's the assembly for the pseudo-code:
The last two operations are the replacement for the s4 restore and the return to the game routine. The return address is for the instruction immediately following the branch delay for the hook. We already know the address for the hook, so we just need to make it a j $000C4028 at that address. Making the addresses and machine code for the routine and hook into cheat codes gives us the final result:
Genso Suikoden III
One Lesson Promotes All Skills For All Characters To S Rank
200C4028 3C100196
200C402C 3610E79C
200C4030 0000882D
200C4034 0000902D
200C4038 82140000
200C403C 12800002
200C4040 24140008
200C4044 A2140001
200C4048 26520001
200C404C 2A540008
200C4050 1680FFF9
200C4054 26100002
200C4058 26310001
200C405C 2A34006D
200C4060 1680FFF4
200C4064 2610007C
200C4074 DFB40040
200C4078 085B1FF1
216C7FBC 0803100A
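Branch offsets are the easiest thing to get wrong when hand-assembling a routine like this, so it's worth checking where each one lands. The Python sketch below resolves the branch and jump targets from the code lines above, taking the addresses exactly as listed; the function names are made up for the example.

```python
def branch_target(addr, word):
    """Branches are relative to the delay slot: target = addr + 4 + offset * 4."""
    offset = word & 0xFFFF
    if offset & 0x8000:          # sign-extend the 16-bit offset
        offset -= 0x10000
    return addr + 4 + offset * 4


def jump_target(addr, word):
    """Jumps hold a 26-bit word index within the current 256 MB region."""
    return ((addr + 4) & 0xF0000000) | ((word & 0x03FFFFFF) << 2)


skip_empty = branch_target(0x000C403C, 0x12800002)  # beq s4, zero: skip the rank write
skill_loop = branch_target(0x000C4050, 0x1680FFF9)  # bne s4, zero: next skill
char_loop = branch_target(0x000C4060, 0x1680FFF4)   # bne s4, zero: next character
back_to_game = jump_target(0x000C4078, 0x085B1FF1)  # return jump
hook = jump_target(0x016C7FBC, 0x0803100A)          # hook in the game's routine

for name, value in [("skip_empty", skip_empty), ("skill_loop", skill_loop),
                    ("char_loop", char_loop), ("back_to_game", back_to_game),
                    ("hook", hook)]:
    print(f"{name:12s} 0x{value:08X}")
```

The return jump lands two instructions past the hook, which skips the hook itself and the op in its branch delay slot, just as described above.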
So our final result is nineteen lines long. Not too shabby. If we had been forced to preserve the registers we used, the final line count would have been twenty-nine, which is rather high. The AR MAX equivalent would be sixteen lines long and seventeen counting the verifier code. And the AR MAX won't be able to check whether a skill is actually present. That takes up two lines in this code, so the code-line count is really pretty good.
General Note - There's a happy coincidence with the value for S rank being the same as the skill count. An extra code line could have been removed by testing for equality between the skill count and s4 instead of using the slti s4, s2, $0008 op.