A Beginner's Guide to MIPS

Team CMP
This article is for beginner game hackers (or advanced game hackers who don't know anything about MIPS) who need help understanding the MIPS assembly language.

This section gets into some more complex stuff, as you will learn about the MIPS assembly language (the language you see when a PS2 game is disassembled), but it is still meant for beginners who want to understand the code they see in ps2dis. Let's get started.

When it comes to programming in 'assembly', there is NOT just one type of assembly. ALL assembly languages are programming languages in which the source code deals directly with the processor chip, and every family of processors has its own assembly language. The PS2 runs off of a MIPS processor chip, so every PS2 game ends up as MIPS machine code, no matter what language the developers wrote it in (usually C or C++, which a compiler turns into MIPS instructions). MIPS assembly is the code you see when you open a SLUS file in ps2dis. Because assembly instructions map directly onto what the processor chip does, nothing is hidden from you, and well-written assembly can be EXTREMELY fast. In fact, when you read about a computer that has, let's say, a 2.4 GHz processor, that number tells you how fast the processor's clock ticks: 2.4 GHz means 2.4 billion clock cycles per second. Let's think about that. There's hertz, megahertz, and gigahertz: 1,000,000 hertz in a megahertz and 1,000 megahertz in a gigahertz. That many cycles per SECOND... that's REALLY fast. Anyway, back to the part that matters.
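To put numbers on the clock speed units, here is a tiny Python sketch (Python is used only for illustration; the names are made up for this example). It sticks to whole numbers so the arithmetic stays exact.

```python
# Unit conversions for clock speeds (whole numbers only, to keep the math exact).
HZ_PER_MHZ = 1_000_000   # one megahertz is a million cycles per second
MHZ_PER_GHZ = 1_000      # one gigahertz is a thousand megahertz

def mhz_to_hz(mhz):
    """Convert a clock speed in megahertz to cycles per second."""
    return mhz * HZ_PER_MHZ

clock_mhz = 2_400                         # 2.4 GHz written as 2400 MHz
cycles_per_second = mhz_to_hz(clock_mhz)
print(cycles_per_second)                  # 2400000000
```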

There are some key points about the MIPS assembly language which MUST be taken into account when reading MIPS assembly source code (or even more so, when writing it). I'll start from the beginning.

Each and every action done by the processor is done by a line of code called an 'instruction'. EVERY instruction in the MIPS assembly language is stored as a 32-bit value. A single bit is a single binary digit that can be either '0' or '1' (think 'false' and 'true'), so there are 32 of these on/off digits in every instruction. There are 8 bits in a single byte, and the 8 hex digits that make up an address (or an instruction) come to 4 bytes, since every 2 hex digits are one byte. You can check this by multiplying the 8 bits that make up a byte by the 4 bytes: the answer is 32, hence the '32-bit' instructions.
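As a rough illustration of the bit counting, the Python sketch below packs one instruction into a single 32-bit number and prints it as bits and as hex digits. The helper names are invented for this example; the field layout used is the standard MIPS 'immediate' instruction format (6-bit opcode, two 5-bit register numbers, 16-bit immediate), and t0 is register number 8.

```python
BITS_PER_BYTE = 8
BYTES_PER_INSTRUCTION = 4
bits_per_instruction = BITS_PER_BYTE * BYTES_PER_INSTRUCTION   # 8 * 4 = 32

def encode_i_type(opcode, rs, rt, imm):
    """Pack the four fields of an immediate-type instruction into one word."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)

# ori t0, zero, 0x0008  ->  opcode 0x0D, rs = 0 (zero), rt = 8 (t0)
word = encode_i_type(0x0D, 0, 8, 0x0008)

as_bits = format(word, "032b")   # 32 binary digits
as_hex = format(word, "08x")     # 8 hex digits = 4 bytes

print(as_bits)
print(as_hex)                    # 34080008
```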

MIPS assembly uses 'registers' to hold data while the program is executing. There are 32 general purpose registers (numbered 0 through 31) and 32 floating point registers (if you don't know what I mean by 'float', read up on some C++, specifically the types of variables). On standard 32-bit MIPS the float registers can be paired up to hold doubles; the PS2's FPU only works with single floats. The general purpose registers are broken down even more, though, because by convention certain registers are used for certain things. Have you ever seen a register in ps2dis that was identified with a 't', i.e. t0 or t1? These are 'temporary' registers: a function can use them freely, but it should not count on their values surviving a call to another function. Also, there are 2 registers that are not meant for storing your own information. The zero register (known as $0 or zero) ALWAYS holds the value zero. If you try to store data in it for an important comparison or for a branch, the write is simply thrown away, and the program will compare the other register with zero instead of with what you tried to store. The other special register is 'ra', the 'return address' register. A 'jal' instruction puts the address to come back to into ra, and 'jr ra' jumps back to it, so it should not be used to hold values for comparison or anything else.
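Here is a small Python model of the general purpose registers, mainly to show how the zero register behaves. It is only a sketch: the class and method names are invented for this example, the register names are the conventional ones, and the registers are modeled as 32 bits wide even though the real PS2 registers are wider.

```python
class RegisterFile:
    """Toy model of the 32 general purpose registers (32 bits wide here)."""

    NAMES = ["zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
             "t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7",
             "s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7",
             "t8", "t9", "k0", "k1", "gp", "sp", "fp", "ra"]

    def __init__(self):
        self.values = {name: 0 for name in self.NAMES}

    def write(self, name, value):
        if name == "zero":
            return                          # writes to the zero register are discarded
        self.values[name] = value & 0xFFFFFFFF

    def read(self, name):
        return self.values[name]

regs = RegisterFile()
regs.write("t0", 0x1234)      # an ordinary register keeps what you give it
regs.write("zero", 0x1234)    # this write has no effect

print(hex(regs.read("t0")))   # 0x1234
print(regs.read("zero"))      # 0
```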

There are also little rules that one MUST abide by when using MIPS assembly. The first of the two major issues I'll talk about involves the 'PC' (the 'program counter', which keeps track of the address of the instruction the program is on). The PC is incremented by 4 during the execution of each instruction, because every instruction takes up four bytes of memory. The PC is increased by four during the MIDDLE of the instruction, so when the program comes across a 'j' or 'jal' or any other kind of jump, the processor has already started on the next line by the time it works out where the jump goes. Because of this, the program executes the line of code after the jump instruction, and only then is the PC set to the address the jump referred to. That line after the jump is called the 'delay slot'. This is NOT a big deal at all. In fact, because of the one line delay, you can make good use of the time and put a useful instruction after the jump. Just remember that whatever sits after the jump WILL be executed, so if you don't put something sensible there, who knows what could happen (the program would most likely misbehave or crash). This is why, when you are viewing the code for the games, there is ALWAYS a line of code after the jump, even if it's just a 'nop' (nop or no-op stands for 'no operation').
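The jump delay can be sketched with a toy interpreter in Python. This is not how the real chip is built; it is a minimal model, with a made-up program format of (command, argument) pairs and a made-up 'halt' command, that shows the order in which the lines run: the jump, then the line after it, then the jump target.

```python
def run_with_delay_slot(program, max_steps=20):
    """Run a list of (op, arg) pairs, 4 bytes apart, and return the lines executed."""
    pc = 0
    pending_jump = None          # target of a jump whose delay slot has not run yet
    trace = []
    for _ in range(max_steps):
        index = pc // 4          # every instruction is 4 bytes long
        if index >= len(program):
            break
        op, arg = program[index]
        trace.append((pc, op))
        next_pc = pc + 4
        if pending_jump is not None:
            next_pc = pending_jump   # the delay slot just ran; now the jump lands
            pending_jump = None
        if op == "j":
            pending_jump = arg       # takes effect after the NEXT instruction
        elif op == "halt":
            break
        pc = next_pc
    return trace

program = [
    ("j", 12),          # address 0: jump to address 12
    ("nop", None),      # address 4: delay slot, still executed
    ("skipped", None),  # address 8: never reached
    ("halt", None),     # address 12: jump target
]

trace = run_with_delay_slot(program)
print(trace)            # [(0, 'j'), (4, 'nop'), (12, 'halt')]
```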

The second of the two key issues is the 'load delay'. MIPS (because it uses 32-bit addresses) has addresses that range from 00000000 all the way to FFFFFFFF. BUT the MIPS processor sections off certain ranges of addresses for certain uses, and one of these uses is memory where you can store data and call upon it at a later time (if you are using the PCSpim MIPS simulator, the data range starts at 10000000). There is, however, a delay when it comes to loading information from memory. On the original MIPS processors the delay is one instruction long, which is NOTHING considering how fast the programs execute, but it means you SHOULD NOT use the register you are loading into until at least one more instruction has been executed. Stores don't have this restriction. Newer MIPS processors, including the one in the PS2, will wait for the load on their own if you use the register too early, so the code still works but loses a little time. You will often see the effect of this in ps2dis: compilers like to put an unrelated instruction between a load and the first instruction that uses the loaded register.

Now I'll go over a few commands from the MIPS assembly language which should really help you when it comes to hacking PS2 games...

-----------------------------------------------------------------------
Part 2:
The 'ori' command (or 'logical OR immediate') will 'logical OR' two values and catch the result in the specified register. 'Immediate' means that you give one of the values straight up in the instruction itself, rather than in a register. The other value, however, must be in a register. For instance, you can do this: ori t0, t1, 0x0008. This combines the bit pattern held in register t1 with the bit pattern that represents the value 8: every bit that is '1' in either value comes out as '1' in the result. (The 0x means it's a hex value, which in this case doesn't make a difference, but 0x0010 is sixteen while 0010 is ten.) The immediate is 16 bits, so it can only affect the last four hex digits of the register. The result of the ori instruction is caught in register t0. You can also have a value in t0 and do: ori t0, t0, 0x0008. This does the same thing, only it spares us the use of another register. If you leave off the 'immediate', you have to use 2 source registers, like this: or t0, t0, t1, which does the same thing, only it ORs the values of 2 registers (registers t0 and t1). This instruction can also be used to simply assign a value to a register, for instance: ori t0, zero, 0x0008. This ORs 0x0008 with zero, which simply puts the 8 into register t0.
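To see what the OR actually does to the bits, here is a minimal Python sketch. The function names are made up to mirror the commands; the registers are just plain numbers here.

```python
MASK32 = 0xFFFFFFFF

def ori(rs_value, imm):
    """OR a register value with a 16-bit immediate (upper 16 bits treated as 0)."""
    return (rs_value | (imm & 0xFFFF)) & MASK32

def or_(rs_value, rt_value):
    """OR two register values together."""
    return (rs_value | rt_value) & MASK32

zero = 0
t1 = 0x00000001

print(hex(ori(t1, 0x0008)))            # 0x9  (bits 0001 OR 1000 = 1001)
print(hex(ori(zero, 0x0008)))          # 0x8  (OR with zero just copies the value)
print(hex(or_(0x00F0, 0x000F)))        # 0xff
```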

Next are the sl's and sr's. You may have seen an instruction in which the command was 'sll'. This is a 'shift left logical' instruction. There are other commands that start with 'sl', like 'slt', that are NOT shifts, but generally when you see 'sll', 'srl', or 'sra' you are looking at a shift command ('l' after the 's' for left, 'r' for right). What these do is shift the specified bit pattern in the specified direction and catch the result in the specified register. For instance: sll t0, t1, 4. This would shift the bit pattern held in register t1 to the left 4 bits and catch the result in register t0. If the value in register t1 is 0008, the result that is caught in t0 after the shift would be 0080 (remember that each hex digit is 4 bits, so the 8 moved over exactly one digit, or half a byte). This can be very useful in many situations (especially in game programs), and as you get more advanced you will realize that the 'sll' command can be used to simply multiply a value (hint: shifting left 1 bit multiplies by 2, shifting 2 bits multiplies by 4, 3 bits multiplies by 8...). You can also use the same register twice in these commands, like this: sll t0, t0, 4. Here the value already held in t0 would shift left 4 bits and then be stored back in t0 (in which case the value in t0 would have been multiplied by 16).
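The shifts are easy to try out in Python. This is a small sketch with invented function names that mirror the commands; the mask keeps the result to 32 bits, so anything shifted off the left end is lost, just like in a 32-bit register.

```python
MASK32 = 0xFFFFFFFF

def sll(value, amount):
    """Shift left logical: bits pushed past bit 31 are dropped."""
    return (value << amount) & MASK32

def srl(value, amount):
    """Shift right logical: zeros come in from the left."""
    return (value & MASK32) >> amount

print(hex(sll(0x0008, 4)))        # 0x80  (the 8 moves over one hex digit)
print(sll(3, 1), sll(3, 2))       # 6 12  (multiply by 2, multiply by 4)
print(hex(srl(0x0080, 4)))        # 0x8
print(hex(sll(0x80000000, 1)))    # 0x0   (the top bit fell off the end)
```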

And, as I mentioned earlier, there is a 'slt' command. You may also see it as 'slti', but remember that the 'i' or 'immediate' only means that you are using a specific value written into the instruction (like $0004 is ALWAYS gonna equal $0004). 'slt' stands for 'set on less than'. This command tests whether one value is less than another, and if so it puts the value '1' into the specified register; otherwise, the specified register will catch '0'. It looks like this: slti t0, t1, 0x0004. This would catch the value '1' in register t0 if the value in t1 is less than 4. If the value in t1 is equal to or greater than 4, zero will be caught in register t0. Both commands treat the values as signed, so FFFFFFFF counts as -1 (the 'sltu' and 'sltiu' versions compare unsigned). You can also catch the result in one of the registers being compared, like this: slt t0, t0, t1, which will catch '1' in register t0 if the value in t0 is less than the value in t1; otherwise t0 will be assigned the value 0.
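Here is a hedged Python sketch of the 'set on less than' family. The function names mirror the commands but are otherwise made up; the point is the difference between the signed comparison of slt/slti and the unsigned comparison of sltu.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Read a 32-bit pattern as a signed number (top bit set means negative)."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

def sign_extend16(imm):
    """Read a 16-bit immediate as a signed number."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def slt(rs_value, rt_value):
    return 1 if to_signed32(rs_value) < to_signed32(rt_value) else 0

def slti(rs_value, imm):
    return 1 if to_signed32(rs_value) < sign_extend16(imm) else 0

def sltu(rs_value, rt_value):
    return 1 if (rs_value & MASK32) < (rt_value & MASK32) else 0

print(slti(3, 0x0004))        # 1  (3 is less than 4)
print(slti(4, 0x0004))        # 0  (4 is not less than 4)
print(slt(0xFFFFFFFF, 0))     # 1  (signed: FFFFFFFF is -1)
print(sltu(0xFFFFFFFF, 0))    # 0  (unsigned: FFFFFFFF is huge)
```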

Next there are a whole bunch of addition, subtraction, multiplication, and division commands. These simply carry out the math: add t0, t0, t1 would add the values held in t0 and t1 and store the result in t0. With the 'immediate' versions (addi and addiu) you give a constant value instead of a second register, and that constant is 16 bits, i.e. four hex digits. Note that 'addiu' means 'add immediate unsigned'; there is no 'add upper immediate' instruction. If you want to work with the first four digits of a register, you use 'lui' ('load upper immediate'), which comes up with the load commands below. There are also multiplication and division commands, and these are pretty straightforward, as the commands are 'mult' (which is obviously multiply) and 'div' (which is obviously divide). However, with multiplication and division the answer is stored in two special registers called HI and LO. For mult, HI holds the upper half of the product and LO the lower half; for div, LO holds the quotient and HI the remainder. The values must be copied out into a normal register with the 'mfhi' and 'mflo' ('move from HI' / 'move from LO') instructions, and you must retrieve them before you carry out another mult or div instruction, because that would overwrite HI and LO.
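The way mult and div split their answers across the two special registers HI and LO (the ones read back with mfhi and mflo) can be sketched in Python like this. The function names are invented to mirror the commands, each function returns a (HI, LO) pair, and dividing by zero is simply refused here because the result on real hardware is not defined.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Read a 32-bit pattern as a signed number."""
    value &= MASK32
    return value - 0x100000000 if value & 0x80000000 else value

def mult(rs_value, rt_value):
    """Signed multiply: returns (HI, LO) = upper and lower 32 bits of the product."""
    product = to_signed32(rs_value) * to_signed32(rt_value)
    return (product >> 32) & MASK32, product & MASK32

def div(rs_value, rt_value):
    """Signed divide: returns (HI, LO) = (remainder, quotient), rounding toward zero."""
    a, b = to_signed32(rs_value), to_signed32(rt_value)
    if b == 0:
        raise ValueError("divide by zero is undefined on the real processor")
    quotient = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        quotient = -quotient
    remainder = a - quotient * b
    return remainder & MASK32, quotient & MASK32

hi, lo = mult(0x00010000, 0x00010000)
print(hex(hi), hex(lo))       # 0x1 0x0  (the product does not fit in LO alone)

hi, lo = div(17, 5)
print(hi, lo)                 # 2 3  (remainder in HI, quotient in LO)
```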

Next, we have the load and store commands. There are load and store instructions for bytes, half-words, and words (lb, lh, lw and sb, sh, sw). A word is a full 32-bit value, 00000000 (4 bytes, or 8 hex digits). A half-word is... well... half of a word (0000), and a byte is simply 00. These are commonly used to read or write a desired location in memory, and often go together with a 'lui' ('load upper immediate') instruction, which puts a value into the first four hex digits of a register and zeroes the rest. For instance, you can do this:
lui t0, 0x0040
lw t1, $240c(t0)
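The end result is that the value stored at address 0040240c gets loaded into register t1. The lui puts 00400000 into t0, and the (t0) part tells the program to take the address held in t0 and add the offset 240c to it (which together equals 0040240c), then load the full word from there. One catch: the offset is signed, so offsets of 8000 and above count as negative, and you will see the lui value bumped up by one to make up for it. You can also use 'lb' the same way, and the same goes for 'lh'.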
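The address math behind the lui/lw pair can be sketched in Python. Everything here is a toy: memory is a plain dictionary of made-up byte values, the function names are invented to mirror the commands, and the bytes are read lowest address first (little-endian), which is how the PS2 stores words.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(imm):
    """Read a 16-bit offset as a signed number."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def lui(imm):
    """Load upper immediate: the value goes in the top 16 bits, the rest is zero."""
    return (imm & 0xFFFF) << 16

def effective_address(base, offset):
    """Address used by a load or store: base register plus signed offset."""
    return (base + sign_extend16(offset)) & MASK32

def lw(memory, base, offset):
    """Load a word: four bytes, lowest address holds the lowest byte."""
    address = effective_address(base, offset)
    return sum(memory[address + i] << (8 * i) for i in range(4))

def lb(memory, base, offset):
    """Load a single byte, sign-extended to 32 bits."""
    byte = memory[effective_address(base, offset)]
    return (byte - 0x100 if byte & 0x80 else byte) & MASK32

# Hypothetical memory contents at 0040240c..0040240f
memory = {0x0040240C: 0x78, 0x0040240D: 0x56, 0x0040240E: 0x34, 0x0040240F: 0x12}

t0 = lui(0x0040)                   # t0 = 00400000
t1 = lw(memory, t0, 0x240C)        # loads from 0040240c

print(hex(t0))                                        # 0x400000
print(hex(effective_address(t0, 0x240C)))             # 0x40240c
print(hex(t1))                                        # 0x12345678
print(hex(effective_address(lui(0x0041), 0xA40C)))    # 0x40a40c (negative offset)
```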

Well, that's pretty much it. You should now have a pretty good start on understanding a little bit of MIPS assembly. This, of course, is NOWHERE near knowing the whole language, but if you keep all these things in mind when you are hacking, you should understand the code a lot more... happy hackin'.