Summary of my experiment with AI in RPGMaker

The plugin

The basic idea of the experiment was to write a plugin that lets the game developer (me) use arbitrary LLMs as NPCs in RPGMaker MZ, within the scope of an event. Besides plugin commands that tell the LLM what to do, this required two things: actions the LLM can perform in RPGMaker (move the event, say something, give away an item, flip a switch), and ways for the player to respond to the NPC (initiate a chat, reply to a chat message, give an item).
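To make the "actions the LLM can perform" part concrete, here is a minimal sketch of how a model reply could be checked before it is turned into an event action. It assumes the model answers with a small JSON object; the action names, field names, and the "wait" fallback are made up for illustration and are not taken from the plugin. It is written in Python for brevity, whereas an RPGMaker MZ plugin itself is JavaScript.

```python
import json

# Hypothetical action names and their required fields; the real plugin's
# command set and field names may differ.
ALLOWED_ACTIONS = {
    "move": {"direction"},
    "say": {"text"},
    "give_item": {"item"},
    "set_switch": {"switch", "value"},
}

# Harmless default used whenever the reply cannot be trusted (an assumption).
FALLBACK = {"action": "wait"}


def parse_action(reply: str) -> dict:
    """Return the validated action from an LLM reply, or the fallback."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return dict(FALLBACK)  # the model answered with something that is not JSON
    if not isinstance(data, dict):
        return dict(FALLBACK)  # valid JSON, but not an object
    name = data.get("action")
    if not isinstance(name, str) or name not in ALLOWED_ACTIONS:
        return dict(FALLBACK)  # unknown or missing action name
    if not ALLOWED_ACTIONS[name].issubset(data):
        return dict(FALLBACK)  # a required field is missing
    return data


print(parse_action('{"action": "say", "text": "Welcome, traveller."}'))
print(parse_action("I think I will walk north."))
```

Restricting the model to a fixed whitelist like this is one way to keep a weaker model from inventing actions the game cannot execute.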

And the LLM had to be taught enough about the situation to make reasonable decisions: who is the NPC? Where is it located? Where are the player and other events? What has happened so far? What items does it have? And what does it want?
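The questions above map fairly directly onto the text that gets sent to the model. The following sketch shows one way such a situation description could be assembled. The `NpcContext` class, its field names, and the wording of the prompt are all assumptions for illustration; the plugin's own data model and prompt are not shown in this post. Again in Python for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class NpcContext:
    # All field names are illustrative assumptions, not the plugin's data model.
    name: str                          # who is the NPC?
    persona: str
    goal: str                          # what does it want?
    position: tuple[int, int]          # where is it located?
    player_position: tuple[int, int]   # where is the player?
    other_events: dict[str, tuple[int, int]] = field(default_factory=dict)
    inventory: list[str] = field(default_factory=list)   # what items does it have?
    history: list[str] = field(default_factory=list)     # what has happened so far?


def build_prompt(ctx: NpcContext, max_history: int = 10) -> str:
    """Assemble a plain-text situation description for the model."""
    events = ", ".join(f"{n} at {p}" for n, p in ctx.other_events.items()) or "none"
    items = ", ".join(ctx.inventory) or "nothing"
    # Keep only the most recent entries so the prompt does not grow without bound.
    recent = ctx.history[-max_history:] if max_history > 0 else []
    lines = [
        f"You are {ctx.name}. {ctx.persona}",
        f"Your goal: {ctx.goal}",
        f"You are at {ctx.position}; the player is at {ctx.player_position}.",
        f"Other events: {events}",
        f"You carry: {items}",
        "What has happened so far:",
        *(f"- {entry}" for entry in recent),
    ]
    return "\n".join(lines)


ctx = NpcContext(
    name="Mira",
    persona="A cautious merchant.",
    goal="Sell the potion for a good price.",
    position=(4, 7),
    player_position=(5, 7),
    inventory=["Potion"],
    history=["The player greeted you."],
)
print(build_prompt(ctx))
```

Capping the history is a deliberate choice here: the longer the prompt, the more every single call costs.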

With the help of the Cursor development environment and an LLM—I recently switched from Claude 4.5 to GPT-5—developing a plugin with these features was straightforward. I was amazed at the machine’s comprehensive knowledge regarding RPGMaker plugin development, which isn’t exactly the most common kind of development work. A few small things didn’t work at first. But “vibe coding” (I hate the term) worked. I only roughly know how my plugin works.

BTW, the plugin is freely available on GitHub: This way to the plugin

The joys and woes of LLMs

The differences between the big proprietary models on the big servers and what you can run locally are enormous. So are the costs.

If you entrust your NPC’s fate to GPT-5, almost everything is fine. The NPC behaves as you’d expect from a human. It pursues goals in a reasonable way, can handle items realistically, and adheres to constraints in the prompt. However, the model is very slow—who wants to wait 5 seconds for every action? And it’s very expensive.

Originally I worked exclusively with Mistral Large. It’s not quite as clever as GPT-5, but much faster: about one action per second, which is almost usable. After 3 days of tinkering came the cost shock. Two NPCs each making one LLM call per second add up to 120 calls per minute, and with detailed prompts (environment description, items, history) I ended up paying 6 euros for not all that much testing. Amusingly, Mistral Large is more expensive than GPT-5, and because it is also five times faster and therefore makes five times as many calls in the same playtime, that difference is quintupled.
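The arithmetic behind the cost shock can be sketched in a few lines. The call rates come from the numbers above (2 NPCs, one call per second versus one call every 5 seconds). The function names are made up for this sketch, and token counts and prices are left as parameters because the actual figures are not given here.

```python
def calls_per_minute(npc_count: int, seconds_per_call: float) -> float:
    """Number of LLM calls per minute if every NPC acts as fast as the model allows."""
    return npc_count * 60 / seconds_per_call


def cost_per_minute(
    npc_count: int,
    seconds_per_call: float,
    tokens_per_call: float,
    price_per_million_tokens: float,
) -> float:
    """Cost of one minute of play; tokens and price are inputs, not measured values."""
    calls = calls_per_minute(npc_count, seconds_per_call)
    return calls * tokens_per_call * price_per_million_tokens / 1_000_000


fast = calls_per_minute(2, 1.0)  # model that answers in about 1 second
slow = calls_per_minute(2, 5.0)  # model that answers in about 5 seconds
print(fast, slow, fast / slow)   # 120.0 24.0 5.0
```

At the same price per token, the model that answers five times faster produces five times the bill per minute of play, simply because the NPCs act five times as often.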

The subsequent comprehensive test with 11 LLMs made two things clear. First, local models don’t really feel like a real person and can’t handle normal NPC tasks and constraints. Second, GPT-5-mini hits the price-performance sweet spot best. If only it weren’t also so slow: a 5-second wait doesn’t work for most NPC situations.

I’ll now see how I get on with it in a real small game, but I’m afraid it will still take half a year, an eternity in LLM terms, before you can build RPGMaker games with LLM NPCs that are actually fun.

Independent of the technical issues (performance and capabilities of the LLMs), designing a game around LLM NPCs is quite tricky in itself. When the NPCs suddenly have a lot of freedom to act, it becomes difficult to realize a coherent storyline.