Metroid ※ Cracking VG Passwords S2e2

When I was a child, I liked videogames But even more so, I was fascinated by how they work And in this series, we continue this tradition – by studying how videogames save and load their progress, using passwords Metroid is an insanely successful Nintendo brand – that was made even more popular, when they made the Game Boy Advance remake, which introduced the very cosplayable form of the titular character But the game was initially released on the Nintendo Entertainment System, and was simply titled, “Metroid” This game was initially going to have – a battery-backed save and load feature – similar to its Japanese Famicom Disk System counterpart, but at some point in the development, they forwent this choice, probably to make the game cartridges cheaper, and they produced a password save feature instead What makes this password screen particularly memorable, besides the strange lowercase font, are two things: First, it has the full English alphabet, meaning that contrary to the later guidelines from Nintendo themselves, it is actually possible to input obscene words in this one Secondly, the password is long 24 characters, at 6 bits per symbol, brings the password length to 144 bits, or 18 bytes For comparison, here’s the apparent password complexities – in games we have previously featured in this series: The least complex of them has been Bubble Bobble, where you can only possibly try and input 100000 different passwords, before you have exhausted them all And the most complex has been Swords & Serpents, where if you had a billion trillion people typing – a billion trillion different sets of input – a billion trillion times in a second – on a billion trillion different planets – in a billion trillion different universes, it would still take about 3000 quadrillion years – before all possibilities would have been covered [ Dramatic pause ] Most of the passwords have been less than 144 bits long So where does Metroid need all this storage? Let’s have a look at the Metroid world This is planet Zebes, where Metroid is set In planet Zebes there is an extensive tunnel system This map consists of 32 by 32 screens, some of which are separated by airlock-type doors Most of these airlocks are free to open from either side, but some of them – require blasting them several times with missiles Throughout the map – there are also plenty of collectable items, such as weapon upgrades, and health tanks With such a huge map, you can imagine there are quite a few of these, and the game remembers your interactions with all of them In fact, let’s dive straight into the password contents This is my password encoder and decoder that I wrote in C++ I used the user literals operators to help make the code easier to read The password is 144 bits long as previously discussed, or eighteen bytes The following list will be grouped topically, not by bit position, so the bit positions may appear to jump from place to place A checksum and an encryption key are stored in the last two bytes, eight bits each More about these later That leaves exactly sixteen bytes for everything else There are eight equipment bits, which are saved in the ninth byte Each of these equipment bits is, well, one bit There is a ninth bit – that indicates if Samus is running around in a leotard, a special mode, that is unlocked by previously having beaten the game – in an exceptionally short time Five of these equipment bits are seemingly duplicated in the password, but they are not actually duplicates These five separate bits indicate whether the powerup has been picked up, not whether Samus has it It is possible to create a password – in which this item is removed from the game, without Samus ever acquiring it 19 missile doors as discussed earlier 21 missile powerup bits 8 energy tank pickup spots Now we get into enemies and bosses First, one bit for each of the 5 zebetites These are the glass fuse things in Mother Brain room – that regenerate if you don’t destroy them quickly enough 1 bit for the Mother Brain, also in the Mother Brain room Then there are two sub-bosses: Ridley and Kraid The state of each boss is indicated in two bits A value greater than zero indicates the boss is dead, and a value greater than one indicates – that the player has activated the statue corresponding to the boss, in the statue room The number of missiles that Samus has is saved as an eight-bit number The maximum missile count is not saved separately It is calculated from the number of missile acquisition bits, times five, plus the number of beaten sub-bosses, times 75, for a maximum of 255 It is possible for Samus to have more missiles than her maximum number Next comes the age The game keeps track of how long you have been playing This timer ticks once per 256 frames,

which means about 17 seconds per four ticks However, this number has a peculiar format It is stored as a 32-bit integer, but its lowest order byte counts 0—207, not 0—255 This timer is used by the game – for deciding which ending you are eligible for The timer code is the same for the NTSC and PAL versions, even though NTSC counts 60 frames per second – and the PAL clocks 50 frames per second This means that on the North-American and Japanese consoles, you have to beat the game in under 59 minutes to get the best ending, but the rest of the world can take almost 71 minutes to qualify It is funny that they chose a four-byte integer for the time It might not sound like much, but this is so much capacity, that it takes almost 500 years for the counter to wrap, and almost two years of non-stop playing – even to get the fourth byte to advance by one It is no wonder that Kent Hansen a.k.a. SnowBro, whose excellent disassembly I referred to extensively – while making this video, thought that this byte is unused by the game Anyway, moving on our list, next up is the area code The area code is four bits, five bits or six bits, depending how you look at it Six bits are stored by the game, but all the major decisions are made from the four lowest bits – and sometimes the fifth bit as well I call the fifth bit an extension code to the area There are five areas in the game Let’s review this variable in more depth later Besides the sixth bit of the area, that the game does not use for any purpose, the password contains ten other bits – that do not influence the game in any manner I call these bits ghost bits, because they are totally ignored by the game There’s just room for them in the password, but they are not saved in the game’s RAM, and new passwords generated by the game always have these bits zero Finally my structure has two special flags here, but these are just internal parts of the password codec, not part of the password encoder Now let’s move on to the encryption and decryption There are three steps to the password encryption: Calculate a sum of the first 17 bytes – and store it in the 18th byte Rotate the first 16 bytes right – by the number of bits indicated by the 17th byte, and convert the result into 6-bit characters – in a manner very similar to BASE64 encoding Conversely, the decryption repeats the same steps in reverse: Convert the 6-bit characters into bytes in a manner resembling BASE64 decoding, rotate the first 16 bytes left – by the number of bits indicated by the 17th byte, and verify that the sum of the first 17 bytes matches the 18th byte Yes, this is one of the few games – where password encryption does not involve the XOR operator In my design, I would have preferred to handle the password as a single 144-bit integer, but alas, there is no such datatype in standard C The biggest that GCC has is 128 bits wide, but there would be no particular advantage – on choosing such a compiler-specific type over a standards-compliant 64-bit type So I represent the password as an array of 64-bit integers The outermost operation is – to convert between 8-bit and 6-bit representations of the password Here is how the conversion works First, the symbols correspond to some particular numbers, in range zero to 63 Then, groups of four characters are packed into three bytes, as indicated by the diagram In encoding, this just means pushing six bits in from the right side In decoding, I pop the same six bits out from the left side To convert numbers into characters, I just index the string, and the opposite operation, character into a number, is managed by the find() method of the string_view [ Sound effect: Record needle scratch ] Actually that’s wrong I missed one symbol Again Yes, the password character set is indeed 64 symbols long, but there is one more symbol There are four different kind of spaces! First, of course, the password is grouped in four groups of six symbols each, and there is a mandatory space between each group Okay that’s cheating! That’s not really a space But if you press the B button, you can skip over some slot and leave it empty This is kind of inconvenient – since you can only skip backwards, not forwards In any case, this is a valid password It gives you the leotard, four health tanks, five missile packs, and kills three zebetites and the mother brain The slots that you skip over are actually zeros, and you can just input a zero instead This is the same password essentially However, if you navigate to the bottom-right corner and press A, to actually input a space, the password is no longer valid That’s because this is a different kind of space now

And yes, that is a valid symbol too This is a valid password that contains the second kind of space Replacing this space with a zero will not work, but I can replace it with a dash They space has the numeric value of 63, like the dash symbol, not zero However, in this password, which is also valid, neither zero nor dash works as a replacement for the space This is a third and yet different kind of space, with a numeric value of 255 It leaks into the next character; I have to adjust an innocent neighboring letter too And now it validates, and produces identical result compared to before Here is the updated chart Space has three different values, as indicated When the value is 255, the two extra bits overlap with the preceding character – and are orred together as indicated by the red arrows And this has to be accounted for in the password decoder, of course I do not account for skipped slots, because it is difficult to think how to write them other than as verbatim zeroes The next operation is to rotate the password: left when decoding and right when encoding The number of bits to shift is indicated in byte 16, which is stored near the top of the third 64-bit integer Next, is the checksum calculation The first 17 bytes are added together, and this result must match the 18th byte When encoding, the result is placed in that byte, and when decoding, these two values are compared This is my checksum function It basically just creates a byte pointer into the data – and calls std::accumulate on it with a plus operator Everything else in it is just metaprogramming – to give a compile-time error in non-portable use cases If we check out the Compiler Explorer… Pretty much all compilers generate tight SIMD code for this function, except the Microsoft product, which kind of goes “meh, who cares”, producing a sixteen-iteration loop instead Let’s have a closer look at the SIMD code First, it loads all 16 bytes from the memory address – into XMM0, a 128-bit register Then it creates a copy of that register, XMM1, where the bottom half contains a copy of the top half GCC uses a shift instruction for this purpose, while Clang uses a shuffle instruction I am not sure if there is any difference in performance Next, these two registers are added bytewise – and the result is stored in XMM0 At this point the low 64 bits of XMM0 contains – the pairwise sums of the high and low bytes Next, it initializes XMM1 with zero in preparation to the next step Then it calls an operation called VPSADBW This little-used instruction does the following: First, it subtracts all bytes in the first operand – from the corresponding bytes in the second operand This produces 16 new bytes from the two sets of 16 bytes Then, the low eight bytes are added together, and stored – in the first 16 bits of the lower half of the target register, and the same is done for the upper half Finally, the VPEXTRB instruction is used – to fetch the lowest-order byte from this register, and store it in EAX, the function return value register Here, the VMOVD instruction chosen by ICC would have been shorter, but in terms of operation and outcome, VPEXTRB is kind of cleaner I guess And this is how it calculates the sum of sixteen bytes – in just six CPU instructions ICC does an additional MOVSX here, but that instruction is redundant in this context, because the function returns an 8-bit value – and the rest of the bits in the register are irrelevant Note that this is not what these compilers originally did GCC 9 and Clang 8 both compile this function into a longer sequence, but I filed in both compilers bugreports that were well received, and the code generators were changed Once all the checksum stuff has been done, it is time to access the actual data inside the password First though, I will do a byteswap for the data, because the left-aligned byte-indexes are a bit cumbersome to work with Here is how the byteswap operation is performed If we use the Compiler Explorer – and inspect how four popular compilers compile this function, we can see that – GCC and Clang both optimize it literally into a single CPU instruction, while the two corporation-made compilers kind of do their own thing When encoding, the data is placed in the password – using a bitshift and the bitwise OR, and when extracting, the data is taken from the password – using a bitshift and the bitwise AND The C++17 structured binding declaration proves very useful here,

making the code simultaneously very compact and expressive Oh yeah, did you know that this game has a special password? That’s right! If you enter this password: NARPASSWORD00000-BISQWIT, you will get a secret game mode This game mode starts like a normal game, but on every frame, that is fifty or sixty times in a second, three things will happen: Samus is given all powerups She is given five missiles Her health tenths digit is set to three And you can see how it works My default weapon is now the ice beam, and I have an infinite supply of missiles, and Samus can take any amount of hits from enemies, but the health will never go below thirty Actually you only need to enter these eleven letters This mode is activated by any password that begins with those 11 letters, followed by five blanks or zeros Whatever comes after those 16 symbols is irrelevant This password skips all the checksum logic – as well as the encryption and decryption, and I reproduced this behavior That’s what the MagicWords array is for This is the only hardcoded password in the game There is some debate on what the NAR in the NARPASSWORD refers to Rather than parrot fan theories, I will just refer and urge anyone to check out the Metroid Fandom Wiki – for this article, if you are interested Now let’s have a closer look – at some interesting details in the real passwords There are eight energy tanks that are separately saved in the password However, Samus can only have up to six spare energy tanks If you collect more than six, the game just discards the rest The password has a bit for the Mother Brain However, the game will actually never give you a password in which this bit is set If you die in the game after the Mother Brain is killed, the game deliberately punishes you – by giving you a password in which not only the Mother Brain is alive, but so are the five zebetites – and the two missile doors leading there For the two bosses, Ridley and Kraid, the password saves a two-bit value for both A nonzero value means the boss is dead, and a value greater than one means – that the statue corresponding to the boss has been lifted The value three is unused by the game, but it is interpreted to mean the same as two If you encounter a password in which the value is three, it is definitely crafted manually There are ten unused bits in the password These bits are not saved by the game anywhere They are always zero in passwords generated by the game This means two things: If you encounter a password in which one of these values is nonzero, it is a manually crafted password And, for every valid game state, there are 262144 different passwords that represent that same state There is one more unused bit right here This is the sixth bit of the area code The game will never set this bit, but if you manually craft a password in which this bit is set, the game will remember it, at least until the next elevator, where it is reset Nevertheless, the value of this bit does not affect the game in any way The seed is a eight-bit integer – that is used for rotating the password bits The seed also contributes to the checksum calculation There are 256 possible values for this byte However, because the data area that it shifts is only 128 bits wide, the first 128 values accomplish exactly the same outcome – as the second set of 128 values Moreover, the game has a condition, that it will never generate a password where the seed is divisible by sixteen I am not sure why I mean, I know that a shift divisible by eight means – that the bytes are just shuffled around, but the password operates on 6-bit symbols, not 8-bit bytes I digress In any case, if you come across a password where the seed is divisible by 16, that password is definitely man-made and not generated by the game Now, the area code! The area code is a six-bit value that has four significant bits, one bit that indicates something I am not sure what it means, and one bit that is completely unused Let’s study the four significant bits This value, which ranges 0—15, has five valid values Values 0—4 correspond to the five game locations: Brinstar, Norfair, Kraid’s hideout, Tourian, and Ridley’s hideout respectively But what happens if you enter a value that is outside this range? To find out, let’s study what the game does with this number Let’s make a table Let’s just search for references to this variable, InArea Here’s the first one The scrolling direction is vertical if the area code is nonzero, and horizontal if zero There was other similar bit of code somewhere –

that sets Samus’s starting coordinates, but I could not find it again when I was making this footage, so I will just add it here Here is the next one The ROM bank number is set using a table This table contains five values: 2, 3, 5, 4 and 6 If the number is five or greater, the code will access data that immediate follows this table, which is this routine over here The exact values are shown on the screen This value is saved into the SwitchPending variable The code that reads the SwitchPending variable – subtracts one from the value, and writes this to the MMC1 mapper If we look at the MMC1 documentation for that register, we see the low four bits of this value are used to select the ROM bank, and the fifth bit is used to enable or disable the PRG RAM However, it also says the fifth bit is ignored on MMC1A, and this game is MMC1A, so we only need to care about the four lowest bits Additionally, because this game only contains eight banks of code, not sixteen, I think it will discard the fourth bit and only care about the lowest three After the ROM bank has been switched, the next part is that the full 8-bit byte, minus 1, is multiplied by two, and the resulting value is used as an index to this jump table below The first five values map nicely into the jump pointers, but after that, things go very wrong It loads pointers from code that is not meant to be interpreted as data This is the table of jump addresses The first five are normal Value ten loads a pointer to a memory region that begins with byte $02 This is an illegal CPU instruction that kills the CPU Literally! When the CPU executes this instruction, it goes in an infinite loop where nothing can wake it, except a console reset Values five and six point to the console’s internal RAM This is the cache of the OAM, where sprites are stored At this point, the memory contains nothing but F0 F0 values, which is a branch-if-equal instruction At this point, the zero flag is not set, so no branch will happen, and the next instruction is executed This instruction, too, is F0 F0 This process will repeat 48 times, until it hits memory address $0300, which contains the following instruction: BRK BRK activates a software interrupt, and the interrupt vector in Metroid happens to point to the reset vector So, area codes five and six will unconditionally reset the console Area code 7 loads a pointer to $EF0 This is a mirrored copy of $6F0, which is part of the CPU’s RAM This portion of the RAM is unused by the game, and is zero-initialized on reset So, it will also trigger the interrupt, which leads to an unconditional reset Area codes 8, 12, and 15 load pointers to valid program code within the game, albeit in totally wrong contexts There is nothing that should outright cause a crash in those areas, but in effect, they will produce a reset anyway, because sooner or later they will run into an invalid call stack There is mild potential for ACE exploits there Codes 13 and 14 point to addresses that are write-only registers of the PPU Attempting to read those addresses will map into “open bus” This means that the CPU will receive whichever value – last happened to be transferred between the PPU and the CPU My time and available tools were not sufficient – to investigate the full implications of this, but it is possible that – this might lead into an arbitrary code execution exploit at some point Area code 9 loads a null pointer In other words, the CPU will begin executing code from the beginning of the RAM By chance, the first byte that it happens to execute – is a copy of the mapper byte, which is C9, and this is an executable instruction (CMP imm) The next instruction is, eh, a three-byte store instruction, at least that’s what it was in my tests And this happens to be valid as well, albeit the PPU register that it writes to – ignores all writes, but the CPU happily churns along The third instruction that it executes is BRK, causing a system reset However, this is just what it did in my tests, and I’m not sure what sets the RAM contents from byte 1 onwards Again, there could potential for ACE exploits in this, but the exact mechanism would require some extensive study Value eleven is interesting – in that it actually selects bank three, which is Tourian By coincidence this value works nicely! [ Sound effect: Record needle scratch ] Not so fast, my dear Watson! I made a little mistake earlier Remember that ROM bank switch code I showed earlier? I thought this code only writes five bits to the mapper –

and is done with it Nope, that is not what happens! The register that is written here is interfaced through an address – that accepts two bits of data at once, not just one While the lowest-order bit is used for loading data to the relevant register, the highest-order bit will reset the shift-register if set Here is what is actually happening On areas number 8, 9, 10, and 11, the mapper shift register is reset during one or more of the bit writes, and the bank does not actually get changed It remains as bank 0, which is used by the password codec Area 9 is not affected by this revelation it still executes those same RAM bytes and ends up resetting the console On area 10, it now loads code from bank 0 instead of bank 4 This code runs valid but nonsensical instructions for several clock cycles, followed by one undocumented instruction, and then it issues a jump into a memory area – that is mapped to a write-only PPU register In area 11, it now successfully calls InitBank3, but remember, two things have changed The mapper is not actually programmed for bank 3 now, but for bank 0, which is for Brinstar Secondly, the mapper shift register is in a desynced state The next time, that the game wants to reprogram the mapper, it expects that the fifth write will commit the changes that it is making, but instead, the very first write will do it This will cause the game to behave in very erratic manner, and it is only a matter of time before it crashes or resets The game contains no mechanism to gracefully recover from this desynced state, and in fact, if you attempt this area code on the Nintendo 3DS, the console becomes unresponsive until you power-cycle it, and the virtual console on the Wii just rage-quits I did not trace it, but I suspect that – almost exactly the same thing will also happen with area code eight Now I have been asked to explain some particular popular passwords One of them is this JUSTIN BAILEY followed by twelve dashes However, the fact that this password is accepted by the game – is completely coincidental In fact, I was able to craft several other short passwords – that work just as well For example, “PLEASE WEEP?” gives you the leotard Samus But the player is expected to weep, because the morph ball is gone, and Samus does not have it, making the game unbeatable To fix it, let’s do another: “BAILED ME” And now, the morph ball is there, but Samus is wearing her normal suit again; she bailed me One particular popular request has been this: ENGAGE RIDLEY MUFFIN KICKER You thought I was going somewhere else with this? This is a well-known password that invokes the area number 11 The emulator that I am using will simply hang here, and this is because – executed a mapper switch at an inopportune time – due to the shift register desynchronization, and this caused the CPU to execute a KIL instruction This is exactly the same as what would likely happen – with a real console as well However, I have a feeling that this is not the only possible outcome Again, I would not count out – the possibility for arbitrary code exploits through this area code As a bonus, here’s a list of some single-word passwords that I found There are plenty more, but I excluded passwords that break the game And that’s all I have for this game I am Bisqwit Be sure to join me next time, when we take a step back, and cover a game that is easier to approach even by novice hackers By the way, according to YouTube analytics, less than six percent of my subscribers – have enabled notifications about my new uploads If you like my content, make sure you click the bell icon and enable all notifications, and enable YouTube notifications as well, so you won’t miss new videos I wish you a peaceful and revitalizing day or night See you again Bye!