Building a Custom IME with Codex and Automating Debugging

Using an AI agent to improve the input environment itself

This article describes a practical experiment in building a custom macOS IME with Codex. The point is not only code generation. It is about using an AI coding agent to build, test, and improve a tool that sits at the beginning of daily work: text input.

Why Build a Custom IME?

Japanese input is a tool used every day. Writing articles, searching, sending mail, writing code, and editing pages all begin with input.

That means even a small mismatch in IME behavior changes how the work feels. Candidate order, space handling, and the ranking of frequently used words all matter because they come up many times a day.

The first reason for building a custom IME was simple: while using kana input, I wanted the default space to be a half-width space.

Once development started, it became clear that an IME is not just a conversion feature. It includes input modes, candidate display, conversion history, learning, dictionaries, settings, and the overall feel of daily text input.

Building Time IME with Codex

This experiment used Codex to build a custom IME for macOS.

For macOS to recognize an IME as an input source, the bundle, configuration files, signing, and registration location all need to be handled correctly. Careless changes can affect the normal input environment.

For that reason, the work was separated into a development bundle, production registration, rollback backups, and verification commands. Codex was used not only to write code, but also to help with registration, checks, logs, and fixes.
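The backup-and-rollback part of that separation can be sketched as follows. This is a minimal illustration, not the procedure used for Time IME: the function names, the directory arguments, and the idea of treating the bundle as a plain directory copy are all assumptions, and signing and input source registration are left out entirely.

```python
import shutil
from pathlib import Path
from typing import Optional


def install_with_backup(dev_bundle: Path, install_dir: Path, backup_dir: Path) -> Optional[Path]:
    """Copy dev_bundle into install_dir, moving any existing copy to backup_dir first.

    Returns the backup path, or None if nothing was installed before.
    All paths are hypothetical; the caller decides where they point.
    """
    target = install_dir / dev_bundle.name
    backup = None
    if target.exists():
        backup_dir.mkdir(parents=True, exist_ok=True)
        backup = backup_dir / dev_bundle.name
        if backup.exists():
            shutil.rmtree(backup)  # keep only the most recent backup
        shutil.move(str(target), str(backup))
    install_dir.mkdir(parents=True, exist_ok=True)
    shutil.copytree(dev_bundle, target)
    return backup


def rollback(backup: Path, install_dir: Path) -> None:
    """Replace the installed bundle with the backed-up copy."""
    target = install_dir / backup.name
    if target.exists():
        shutil.rmtree(target)
    shutil.move(str(backup), str(target))
```

Keeping the rollback as a single function matters here because a broken IME can make it hard to type the commands needed to fix it.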

The first version was far from complete. Input failed, candidates did not appear, candidate order was wrong, and arrow-key presses sometimes leaked through to the target input field instead of being handled by the IME. The work moved forward by fixing those failures one by one.

An IME Cannot Grow Without Data

One thing became clear very quickly: an IME cannot be built with code alone.

Dictionaries, history, learning, prediction, suppressed words, and user settings are what make an IME fit a person's real input. A conversion engine alone does not make daily input comfortable.

At the same time, learning everything is dangerous. If terms extracted from articles are added directly to candidates, strange fragments and uncertain readings can pollute the input experience.

A useful IME needs a buffer between learning and production candidates. Learned candidates should be isolated first, and only approved items should be promoted. An IME that learns is useful. An IME that pollutes its own candidate list is not.
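One way to picture that buffer is a store with two separate tables, where learning only ever writes to the pending side. The sketch below is an assumption about how such a buffer could look, not Time IME's actual data model; `CandidateStore` and its method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CandidateStore:
    """Hypothetical store that keeps learned terms apart from live candidates."""

    production: dict[str, list[str]] = field(default_factory=dict)
    pending: dict[str, list[str]] = field(default_factory=dict)

    def learn(self, reading: str, word: str) -> None:
        """Record a learned term in the pending table only."""
        if word in self.production.get(reading, []):
            return  # already live, nothing to review
        bucket = self.pending.setdefault(reading, [])
        if word not in bucket:
            bucket.append(word)

    def approve(self, reading: str, word: str) -> bool:
        """Promote a pending term to production. Returns False if it was not pending."""
        bucket = self.pending.get(reading, [])
        if word not in bucket:
            return False
        bucket.remove(word)
        self.production.setdefault(reading, []).append(word)
        return True

    def reject(self, reading: str, word: str) -> bool:
        """Drop a pending term without promoting it."""
        bucket = self.pending.get(reading, [])
        if word not in bucket:
            return False
        bucket.remove(word)
        return True

    def candidates(self, reading: str) -> list[str]:
        """Only production entries are ever offered during input."""
        return list(self.production.get(reading, []))
```

With this shape, a term extracted from an article cannot reach the candidate window until `approve` is called for it, so a bad reading stays in the pending table where it can be rejected.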

Creating a Debug Window

An IME is harder to test than a normal app.

In a regular app, you can open a window and press a button. With an IME, the target app, input mode, key events, candidate window, and commit process all interact. Candidates can exist in code and still fail during real input.

Time IME therefore needed a debug window. It allows actual text input while checking candidate display and conversion results.

The debug flow also types a prepared sentence one character at a time, saves screenshots of candidate display, and records conversion results in logs. With that environment, Codex can receive a much clearer description of what failed.
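The keystroke-by-keystroke part of that flow can be sketched as a small replay loop that records what the candidate lookup returned after each key. Everything here is illustrative: `replay_sentence`, the `lookup` callback, and the log fields are assumed names, and screenshot capture is not shown because it depends on the real debug window.

```python
import json
from typing import Callable


def replay_sentence(keys: str, lookup: Callable[[str], list[str]]) -> list[dict]:
    """Feed keys one at a time and record the candidates seen after each keystroke.

    lookup is a stand-in for whatever returns candidates for the current
    composing string in the real IME.
    """
    log = []
    composing = ""
    for step, key in enumerate(keys, start=1):
        composing += key
        log.append(
            {
                "step": step,
                "key": key,
                "composing": composing,
                "candidates": lookup(composing),
            }
        )
    return log


def to_jsonl(log: list[dict]) -> str:
    """Serialize the log as one JSON object per line, keeping kana readable."""
    return "\n".join(json.dumps(entry, ensure_ascii=False) for entry in log)
```

A log in this form can be pasted into a prompt as is, which is what makes a failure report concrete: the step number, the key, and the candidates that were actually returned are all on one line.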

Testing the Real Space-Key Path

The most important lesson was that conversion tests must pass through the real Space-key path.

Calling an internal function and confirming that a reading returns candidates is not enough. A candidate can exist in the dictionary, yet pressing Space may insert a half-width space instead of starting conversion. A candidate can appear in the window and still fail to commit.

Time IME therefore checks each stage of that path: candidate generation, Space confirmation, whether the committed text actually appears in the target input field, screenshots, and conversion history.
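The difference between an internal lookup test and a test through the key path can be shown with a toy model. The session class below is not Time IME's code. Its key assignments (Space starts conversion, Enter commits, Space with nothing composing inserts a half-width space) and all names are assumptions made for the sketch.

```python
from dataclasses import dataclass, field

HALF_WIDTH_SPACE = " "


@dataclass
class ToySession:
    """Toy stand-in for an IME session plus the target input field."""

    dictionary: dict[str, list[str]]
    composing: str = ""
    candidates: list[str] = field(default_factory=list)
    committed: str = ""  # plays the role of the target input field
    history: list[tuple[str, str]] = field(default_factory=list)

    def handle_key(self, key: str) -> None:
        if key == "SPACE":
            if not self.composing:
                # Nothing to convert: Space is a plain half-width space.
                self.committed += HALF_WIDTH_SPACE
            elif not self.candidates:
                # Start conversion; fall back to the reading itself.
                self.candidates = list(
                    self.dictionary.get(self.composing, [self.composing])
                )
        elif key == "ENTER":
            if self.composing:
                text = self.candidates[0] if self.candidates else self.composing
                self.history.append((self.composing, text))
                self.committed += text
                self.composing = ""
                self.candidates = []
        else:
            self.composing += key


def check_space_path(session: ToySession, reading: str, expected: str) -> dict:
    """Drive the session through key events only, then report each stage."""
    for ch in reading:
        session.handle_key(ch)
    session.handle_key("SPACE")
    shown = list(session.candidates)
    session.handle_key("ENTER")
    return {
        "candidate_shown": expected in shown,
        "reflected_in_field": session.committed.endswith(expected),
        "recorded_in_history": (reading, expected) in session.history,
    }
```

The point of `check_space_path` is that it never touches the dictionary directly. If the Space branch were wired wrong, the dictionary would still contain the word, but `candidate_shown` would come back false.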

At that point, Codex is no longer only a code writer. It becomes a worker that helps verify behavior through a route close to real input.

Building the Work Tool Itself with an AI Agent

The interesting part of this experiment was that Codex did not only help build an app. It helped modify the work environment itself.

An IME is a tool that sits very close to the user. It comes before the browser and before the editor. Building that input layer with an AI agent changes how Codex feels as a development partner.

This is different from asking AI to write text. It is about working with AI to reshape the tools used every day.

Once Codex is used to build the test environment, keep logs, compare behavior, and run improvement loops, it starts to feel less like a code-completion tool and more like a worker inside the development environment.

Summary

Codex was used to build Time IME, a custom macOS input method.

The work included input source registration, kana and alphanumeric switching, conversion candidates, Space confirmation, settings, learning, a debug window, and automated input tests. The IME is not finished. Candidate order and vocabulary quality still need real usage and adjustment.

Even so, the experiment showed that an AI coding agent can help build and debug a daily input environment.

AI usage is no longer limited to text generation or code generation. It can also help improve the tools used every day. Using Codex as a partner for that kind of work is becoming practical.