Written by Anadara on 16.10.2019
An interview with Vitor Vilela, the engineer who polished Gradius III, by Lucas Milani Santiago.
– Hi Vitor! We’re honored to have you today at this edition of the Canal3 convention, in São Paulo. It’s the 41st edition of the Canal3 convention, to be precise. We’re very curious about your work and about how you managed to make Gradius 3 even better, patching it to run smoother and faster and eliminating all slowdown. This kind of work is very important for the retrogaming community and deserves to be shared with everyone.
So, first I’d like to ask you to talk a little bit about yourself. How did you get so interested in programming for a classic console like the SNES?
Vilela: Ok, so, first, to start, I’d like to thank you guys for having me here for the first time. I really hope it’s just the first of many! It’s great to have a chance to share my experience.
Well, everything started when I was 7 years old and my father gave me a SNES. I was a quiet kid, liked to stay at home, didn’t have many hobbies. And I quickly fell in love with it. I started playing Super Mario World. As the years passed, I started looking for new games. I am from a very humble family, so it wasn’t easy to get new games.
When I turned 11 years old I got my first computer and internet access. I was always very curious about how videogames work. I was curious about how to edit Super Mario World stages, for example. Then I found a ROM editor called “Lunar Magic”. “Lunar Magic” is very famous in the romhacking community. With it you can edit almost everything in a game, every attribute. And that’s how I first got interested in editing game ROMs.
Over time my interests got broader: How does a videogame work? How does a computer work? Because I knew how to edit Super Mario World ROMs and became very skilled at it, this knowledge helped me develop my programming skills. I mean, informally, as a curious child. I was in basic school, had plenty of free time and could use my computer as much as I wanted. I had to fix my computer many times, format the hard drive and re-install the OS many times, losing all my work…
Time passed, I gained experience, and by the time I turned 13 or 14 I had gathered a lot of knowledge about how the SNES works. I also picked up some knowledge about programming in general.
– Did you have a chance to formally study programming after you got interested in the SNES? Let’s say, high school or college classes? Because it looks like you taught yourself a lot during this time. From your account, it looks like your passion fueled your desire to learn more about programming.
Vilela: That’s true. After this stage I became very interested in everything regarding this subject, from electronics, when I was younger, to programming, when I became a teenager. It wasn’t hard for me to decide to study computer engineering. I am still studying, to be honest.
– Talking about the game a little bit. Could you please explain to us how you fixed Gradius 3? How did you modify it?
Vilela: Gradius 3 was a very early game for the SNES. The programmers focused on making an arcade port. Since it was an arcade port, they didn’t optimize the code at all. Because it wasn’t optimized, the game ran at a very low framerate compared to the arcade. The infamous slowdown is present even in the arcade version.
What I did was to use the SA-1 chip. The SA-1 is just like the SNES CPU, but it runs at 4 times its clock speed. The SNES CPU base clock is 2.68MHz, while the SA-1 runs at 10.74MHz. I basically had to open the game code, understand what each code routine does and, using automation tools, migrate the memory (RAM) area which used to live on the SNES to the SA-1 RAM.
The SA-1 chip has its own bus. So it’s possible for its code routines to read and write its own memory addresses. After this whole memory area was migrated to the SA-1 I managed to route the most CPU-intensive parts of the code to the SA-1 instead of the much slower SNES CPU. Which means the SA-1 is doing all the hard work, leaving the other basic processing to the SNES CPU, like audio processing, for example.
The SNES CPU is still taking care of video and audio processing, for example. As a result you have a game whose internal logic is much faster than the original. The SNES can now process everything that is necessary to update a frame within the limit, which is about 16 milliseconds. And as you can see, the game now runs very fast and very smoothly.
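To put the 16 millisecond limit in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the two clock figures quoted in the interview and assumes a nominal 60 frames per second; the function names are made up for illustration, and real hardware timing (memory access speed, DMA, and so on) is more complicated than a simple division.

```python
# Rough frame-budget arithmetic for the figures quoted in the interview.
# Assumption: a nominal refresh rate of 60 frames per second.

SNES_CPU_HZ = 2_680_000    # base SNES CPU clock quoted above (2.68 MHz)
SA1_HZ = 10_740_000        # SA-1 clock quoted above (10.74 MHz)
FRAMES_PER_SECOND = 60


def frame_budget_ms(fps: int = FRAMES_PER_SECOND) -> float:
    """Time available to prepare one frame, in milliseconds."""
    return 1000.0 / fps


def cycles_per_frame(clock_hz: int, fps: int = FRAMES_PER_SECOND) -> int:
    """Whole clock cycles a chip gets within one frame."""
    return clock_hz // fps


if __name__ == "__main__":
    print(f"frame budget: {frame_budget_ms():.2f} ms")
    print(f"SNES CPU cycles per frame: {cycles_per_frame(SNES_CPU_HZ)}")
    print(f"SA-1 cycles per frame:     {cycles_per_frame(SA1_HZ)}")
    print(f"clock ratio: {SA1_HZ / SNES_CPU_HZ:.2f}x")
```

Under these assumptions the budget is about 16.67 ms per frame, and the SA-1 gets roughly four times as many cycles as the SNES CPU inside that window.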
– It runs in what, 30fps, 60fps?
Vilela: No, 60fps!
– Oh, wow, 60fps! That’s impressive. So you had to transfer a memory area to the SA-1, using it as a co-processor to avoid frame drops. Do other games use the SA-1?
Vilela: That’s correct. I had to migrate the most CPU-intensive parts of the code to the SA-1, which is four times faster than the original SNES CPU.
– The SA-1 has its own memory, right?
Vilela: Exactly. The SNES can access the SA-1. The SA-1 can only access what’s within the cartridge. A regular Gradius 3 frame takes between 15 and 100ms to be processed. The frame rate could drop to around 20 frames per second. This doesn’t happen when you use the SA-1 for the task. You have a guarantee that the game will run at 60fps.
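The link between processing time and frame rate can be sketched with a small Python model. It rests on one simplifying assumption that is not stated in the interview: a frame that misses a screen refresh waits for the next one, so the game logic always occupies a whole number of 1/60 second intervals. The function name is hypothetical.

```python
import math

REFRESH_HZ = 60  # nominal screen refreshes per second


def effective_fps(frame_time_ms: float) -> float:
    """Frame rate seen by the player when game logic takes frame_time_ms.

    Assumption: logic that overruns a refresh waits for the next one,
    so each game frame spans a whole number of refresh intervals.
    """
    intervals = max(1, math.ceil(frame_time_ms * REFRESH_HZ / 1000))
    return REFRESH_HZ / intervals


if __name__ == "__main__":
    for ms in (15, 40, 100):
        print(f"{ms} ms per frame -> {effective_fps(ms):.0f} fps")
```

In this simplified model a 15 ms frame still runs at 60 fps, while anything between roughly 33 and 50 ms lands at 20 fps, and longer frames fall lower still. The real game may behave differently.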
– You already told us a little bit about how you started working on this mod, but I have a question: Would it be possible to achieve the same result using other SNES chips, like the Super-FX?
Vilela: No, because there’s something very important about the SA-1. The SA-1 and the SNES CPU share exactly the same internal architecture. You can run SNES code on the SA-1 with minimal tweaks. This was critical to getting this patch done. I just had to migrate the memory area from the WRAM (SNES memory) to the SA-1 memory, which is the BW-RAM, allowing the SA-1 to run the same code that the programmers originally planned to run on the SNES CPU.
The main difference is that the Super-FX uses a RISC architecture, which is radically different from the SNES CPU and the SA-1. Those are CISC, based on the… 65C816. Same architecture as the Apple IIGS.
The biggest advantage of using the SA-1 was being able to route the most CPU-intensive part of the code to a faster chip without having to change the code at all.
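The WRAM to BW-RAM migration described above boils down to rewriting addresses. The Python sketch below illustrates the idea only: SNES work RAM does live in banks $7E and $7F, but the target bank chosen here ($40) and the one-to-one layout are assumptions made for the example, not the mapping used by the actual Gradius 3 patch. The function name is hypothetical.

```python
# Illustrative only: the target bank and layout are assumptions for this
# sketch, not the mapping used by the actual Gradius 3 patch.

WRAM_BANKS = (0x7E, 0x7F)   # SNES work RAM banks (128 KB in total)
BWRAM_BASE_BANK = 0x40      # assumed first bank of cartridge BW-RAM


def remap_address(addr: int) -> int:
    """Translate a 24-bit WRAM address to the assumed BW-RAM location.

    Addresses outside the two WRAM banks are returned unchanged.
    WRAM mirrors in other banks are not handled in this sketch.
    """
    bank, offset = addr >> 16, addr & 0xFFFF
    if bank in WRAM_BANKS:
        new_bank = BWRAM_BASE_BANK + (bank - WRAM_BANKS[0])
        return (new_bank << 16) | offset
    return addr


if __name__ == "__main__":
    for addr in (0x7E1234, 0x7F0010, 0x008000):
        print(f"${addr:06X} -> ${remap_address(addr):06X}")
```

A real patch also has to find every instruction that refers to those addresses, which is why the disassembly and the automation tools mentioned in the interview matter so much.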
– So this means that, if you tried to do the same using the Super-FX (the Super-FX is very famous, that’s why I asked about it), you’d have to change big chunks of code, but not with the SA-1, because it shares the same architecture. Is that right?
Vilela: Yes. The patch I programmed allows the same code to run on both. The code can automatically switch between CPUs. If it’s running on the SNES CPU, it switches to the SA-1. This happens back and forth. For example, if it’s code that only the SNES is supposed to run, such as video or audio processing, the SA-1 chip switches the code to the SNES CPU. When the game is running, both chips operate in parallel, exchanging information and optimizing the processing. This is only possible because of the SA-1.
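The switching rule described above can be pictured with a toy Python model. Everything in it is hypothetical: the routine names, the `touches_snes_hardware` flag and the `schedule` function are invented for illustration, and the model runs routines one after another, whereas the real chips work in parallel.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Routine:
    name: str
    touches_snes_hardware: bool  # hypothetical flag: video or audio work


def choose_cpu(routine: Routine) -> str:
    """Pick the CPU a routine should run on in this toy model."""
    return "SNES" if routine.touches_snes_hardware else "SA-1"


def schedule(routines: List[Routine]) -> List[str]:
    """Return one 'name -> cpu' line per routine, in order."""
    return [f"{r.name} -> {choose_cpu(r)}" for r in routines]


if __name__ == "__main__":
    frame = [
        Routine("update_enemies", False),
        Routine("check_collisions", False),
        Routine("upload_sprites_to_video", True),
        Routine("send_sound_commands", True),
    ]
    for line in schedule(frame):
        print(line)
```

The point of the sketch is only the decision rule: game logic goes to the faster chip, while anything that talks to the video or audio hardware stays on the SNES CPU.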
– From the moment you first had the idea of using the SA-1 until the game was finally patched, how much time did you have to dedicate to make it happen?
Vilela: That’s quite hard to measure. I can say it took me months, probably 3 or 4 months.
– So 3 or 4 months all together?
Vilela: Sometimes I worked 3 hours per day, sometimes 12 hours… The biggest challenge wasn’t to migrate parts of the code to the SA-1. The worst part was disassembling the game, because it’s quite hard to disassemble the game, keep it readable and understand which part does what.
It’s very hard to clearly read parts of the code, then analyse them using tools and create viable changes to the code, allowing the SA-1 to switch automatically as I explained before. It would be impossible to do this manually, editing the game byte by byte. Gradius 3 is 512KB. That’s more than half a million bytes to analyze and edit manually.
– Why Gradius 3, specifically? Did you consider patching other games too? Or did you always aim to fix Gradius 3? Were you familiar with the Gradius series? Do you consider yourself a fan?
Vilela: As I mentioned before, everything started with Super Mario World. I implemented the SA-1 patch for Super Mario World back in 2012. But it wasn’t very popular, because it was used mainly for ROM hacking. The SA-1 became viable if you were interested in fixing the slowdown in Super Mario World, and especially because you could implement new features by hacking the ROM. You could use the SA-1 for extra CPU power, to add more enemies, special effects and so on.
I discovered the Gradius franchise when I was younger. I remember playing the arcade version of Gradius 3. But, you know, it was a long time ago, and I forgot I had played it. Only later, when I was checking the list of games that people recommended to me, I saw Gradius 3 on it and said “Oh, wow, I remember this one! I like this game a lot!” And by playing it I noticed how slow the game was. I thought that if I implemented the SA-1 patch on this game, it would be a dramatic upgrade in speed. Most SNES games run at an acceptable speed and framerate. But on Gradius 3, just by watching the attract mode, you could notice a massive amount of slowdown. And that slowdown would be fixed using the SA-1 chip.
– There’s a group of people in the retrogaming community, specifically in the shmup community, who consider Gradius 3’s slowdown to be part of the game. Since the arcade version has some slowdown, they’re kind of purists and consider it to be a mandatory part of the game. They consider the slowdown so important that it’s impossible to beat the game without it. So, I’d like to ask you, out of curiosity, did you manage to beat the game after you patched it?
Vilela: No, I didn’t. Gradius 3 is a pretty hard game even in its natural form. And I totally understand and agree that a Gradius 3 with slowdown was what the developers intended to achieve. They probably noticed that the SNES wasn’t up to the task. So they slowed down the game. I’m sorry, the other way around: they sped up the game in some parts in order to compensate for the slowdown. Which means that by eliminating Gradius 3’s slowdown, it became a faster game. I mean, running at a higher speed than the developers planned for it.
Playing it this way is a totally different experience, just by the fact that you can shoot at the same speed from the beginning to the end of the game. Playability-wise, it is totally different. The game just… flows! And I know that, even if the game is faster now, some people managed to beat it. They beat it on arcade mode, using the SA-1 chip, with no problem at all. If you’re curious about it, just check YouTube.
– There’s a PS2 package called “Gradius 3 and 4”. On this version of Gradius 3 there are options to play the game with full, partial or no slowdown at all. Did you know that? Did you investigate it, or someone told you about its existence? Or you just didn’t know about it at all?
Vilela: Well, I got to know about it after releasing the SA-1 patch. And for me it makes total sense. If you’re used to playing the good old Gradius 3 and then you decide to try the PS2 version or other platforms, the presence or absence of slowdown will give you a different experience. You will feel like you’re playing a different game.
So, for me, it makes total sense that Konami implemented this feature, even allowing you to select the amount of slowdown you want to experience when playing, which also changes the difficulty. I found it a very clever design decision. It could even be an option for the SA-1 patch of Gradius 3, to allow the player to add or remove the slowdown factor.
But I had no idea this version existed when I started working on the original patch. My intent was just to make Gradius 3 run faster and smoother, at a constant 60fps. I just focused on making it as fast as possible.
– That’s ok! I was just curious whether you used the PS2 game as a reference or simply tried to make it run at a constant 60fps, as smoothly as possible, all the way to the end. From our conversation I understood that you just planned to make it run at a constant 60fps, and that’s the result. I played it and loved it, the game is simply amazing! I am a long-time fan of Gradius 3, but I can’t beat the game, not even on “easy” mode. But anyway, the game is incredible!
Do you believe it’s possible to implement the same SA-1 solution to improve other SNES games?
Vilela: For sure! It’s the same architecture. The vast majority of the SNES library, the games that use only the SNES CPU and nothing else, can definitely be “ported” to the SA-1, eliminating the slowdown in several other games the same way, or just making them run smoother.
Take Super Mario World, for example. There’s not much slowdown in this game. But when you use the SA-1 chip you can make sure it runs smoothly, and you can improve the performance of specific animations. For example, when you find a secret passage, or when you’re about to fight Bowser and there’s a special lighting effect. Most people may not even notice, but those specific areas run at 15 frames per second.
– Oh, yeah! Definitely! I noticed that! I noticed it after reading an article about your patch on the internet. People specifically mentioned your work with the SA-1 and the Super Mario World ROM. I really hadn’t noticed it before. But after you finish a stage or leave a ghost house, the screen kind of “zooms” and closes in on your character, using the Mode 7 effect, like a classic cartoon ending. There’s a massive slowdown there!
Someone made a side-by-side comparison of the original game and your SA-1 patched game. There you can see the amount of slowdown. Ok, it’s one of the first SNES games, if not the first, but the slowdown is massive. We just had no idea.
Do you believe there’s probably a very long list of games that could receive SA-1 implementations and improvements?
Vilela: Exactly. The problem is, probably, creating the patches! It’s impossible to patch them all, because it’s very time-consuming! Even using all the tools as shortcuts, it’s a massive amount of work.
For example, there’s a tool that lets you play the game and, while you play it, another tool automatically generates a partial SA-1 version of the code. Even so, it’s a massive amount of work. But by knowing which games are the most popular ones and which ones are the worst cases of slowdown, I can prioritize fixing them.
Of course, not counting games that already use other specific chips, for example Star Fox. There’s also a technical limitation regarding the SNES hardware, an issue regarding the “SNES GPU”. I mean, the SNES has a “PPU” (the “Picture Processing Unit”).
Barring pre-existing technical limitations, the SA-1 chip can certainly be used for every game whose logic is slow because of the way it was coded and its internal complexity.
– For our last “bonus” question, to finish the interview. This one came from the fans. One of the most impressive shmups on the SNES, also by Konami, is Axelay. It’s also an early SNES game. Unfortunately, it also suffers from a tragic slowdown issue. Even back in the day, the slowdown in some areas was considered horrible.
Some magazines reduced the game score drastically because of this slowdown. So, some people consider Axelay beautiful, but a failure as a game. Because of the slowdown.
So here’s my question: Do you know Axelay? Do you think it’s possible to use the SA-1 to fix the slowdown issue on this game?
Vilela: Axelay was introduced to me after Gradius 3. I never had a chance to play it, didn’t know the game. But by looking at its gameplay I am absolutely sure it’s possible to fix it using the SA-1. If I find time to do it, I’ll definitely do it. It’s definitely something I want to do.
– It may be surprising for some people, but that’s not your main occupation, right? I mean, you have a life.
Vilela: Oh, definitely! I have to attend college classes and I have a full-time job. I do this kind of thing in my free time. I receive a lot of help from supporters; I have a Patreon account.
This chip, for example, was a donation, because all the work I did with the SA-1 was being developed on emulators. The chip I use now was donated by a Canadian supporter and this helped me a lot. Let me show you.
– Oh, ok, where is it?
Vilela: You can just get the chip. You can pre-program the chip using an EPROM writer connected to a computer, and then you simply take a special cartridge with a socket. You remove one chip and insert a different one.
And then, after replacing the game chip, I can load it on the real SNES hardware and check if the SA-1 is really working as expected. What sets the SA-1 apart from other techniques to fix slowdown is that with the SA-1 you can also play the final result on the real hardware, which is the complete experience.
Whatever changes you made, you’ll experience how to play it on a real SNES. Everything running on the real hardware.
– This is especially important because sometimes the emulators aren’t properly coded. They rarely deliver a real cycle-by-cycle implementation when running the games. There’s a new SNES emulator, a recent one, I think it’s BSNES if I’m not wrong, which is intended to be a perfect implementation of the SNES at the hardware level, but it requires a Pentium IV to run properly or something like that. It’s heavy on processing, reproducing every frame, every cycle. For your work, this is very important, because you can code and then test on real hardware.
Vilela: That’s right. Also, to this day, there’s no SNES emulator capable of reproducing the SA-1 in a truly accurate fashion. It’s a chip that nobody has managed to fully understand. There’s nothing conclusive about it. The SNES has been studied inside out, completely, but there’s not much about the SA-1.
So, the SNES runs on a base clock of 2.68MHz or 3.58MHz. What the SA-1 does is this: while the SNES is idle, the SA-1 uses all the unused resources (ROM and RAM) and runs the code during this time, but faster. No emulator can achieve this perfectly.
Because you’re not just emulating the SNES cycle by cycle. You have to emulate all the internal chips cycle by cycle, simultaneously, and this requires a lot of CPU power. Not to mention a lot of hardware knowledge. Technically, it’s like emulating the SNES at the clock level. Clock level here means matching the NTSC master clock, which is about 21MHz.
So, BSNES is a very well-known emulator. Actually, to emulate the SA-1 with about 98% accuracy, in the author’s words, you’ll need a 3GHz Intel Core i7. So, far more than a Pentium IV… Exactly, because you have to run sequential instructions.
It’s impossible to run in parallel, because on a computer sometimes one part will run faster, another part will run slower and so on. And on the real hardware everything runs in perfect harmony. So all components stay synchronized all the time.
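The clock figures mentioned in the interview (2.68MHz, 3.58MHz, 10.74MHz and the roughly 21MHz master clock) are related by simple division. The short Python sketch below works through that arithmetic. The master clock value and the dividers of 8, 6 and 2 are the commonly documented NTSC figures rather than something stated in the interview, and the labels are only descriptive.

```python
# Commonly documented NTSC figures; not taken from the interview itself.
MASTER_CLOCK_HZ = 21_477_272  # NTSC SNES master clock, about 21.48 MHz

DIVIDERS = {
    "SNES CPU, slow access": 8,
    "SNES CPU, fast access": 6,
    "SA-1": 2,
}


def derived_clock_mhz(divider: int) -> float:
    """Clock obtained by dividing the master clock, in MHz."""
    return MASTER_CLOCK_HZ / divider / 1_000_000


if __name__ == "__main__":
    for label, divider in DIVIDERS.items():
        print(f"{label}: master / {divider} = {derived_clock_mhz(divider):.2f} MHz")
```

Since every chip derives its timing from the same master clock, the components stay in step on real hardware, which is the synchronization an emulator has to reproduce in software.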
– All right! Well, it was very nice to have you here, it was like having a masterclass! Thank you very much for the interview. And especially, thank you for dedicating so much time and energy to improving our beloved games, allowing them to be enjoyed to their fullest. It’s incredible that we are able to enjoy playing our good old games in such a way, because people like you decided to make it happen. Again, thank you very much!
Vilela: I thank you too, it was a great honor to give this interview. I hope to have more SA-1 games soon to make the community even happier. I’ll always make myself available! Please feel free to contact me on your favorite social media, be it Twitter, Discord, YouTube… Oh, yeah, the Patreon channel too… I’m very active on social media. That’s my hobby, that’s what I do in my free time. It’s been my passion since I was a kid. Once again, thank you very much for inviting me.
YouTube: ShmupsBR “Entrevista c/ Vitor Vilela – O cara que consertou Gradius 3!”