BACK_TO_INDEX

AI GAME CREATOR // TYPESCRIPT + PYTHON

PROMPTBLOX

AI-powered Roblox game creator — type a prompt, get a complete playable .rbxlx. A conversation-first agent orchestrates frontier LLMs on a hybrid backend where LLMs design (WHAT) and deterministic TypeScript builds (WHERE). A self-hosted GPU pipeline generates themed 3D meshes on demand at a fraction of hosted-API cost. Ongoing work.

TYPESCRIPT · LLM ORCHESTRATION · GPU INFRA · ROBLOX · NEXT.JS 15 · R3F
Isometric voxel landscape: neon obby track, candy kingdom, medieval castle — sample worlds generated by PromptBlox

TECH_STACK

TypeScript · Next.js 15 · Claude (Anthropic) · Gemini (Google) · Modal (self-hosted GPU) · React Three Fiber · Drizzle ORM · Neon Postgres · Upstash Redis · Railway · Stripe · Vitest · Playwright

KEY_FEATURES

Conversation-First Agent

An LLM classifies every prompt along four intent axes (clear, novel, ambiguous, impossible), asks experience-based clarifying questions when needed, then hands off to a designer LLM. No template selector, no wizard.
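The 4-way classification gate can be sketched as a small discriminated type. This is an illustrative shape, not the production schema; the field names (`intent`, `questions`) and the `needsClarification` helper are assumptions for the sketch.

```typescript
// Hypothetical sketch of the classifier's 4-way intent result.
// Field names are illustrative, not the production schema.
type Intent = "clear" | "novel" | "ambiguous" | "impossible";

interface ClassifierResult {
  intent: Intent;
  /** Clarifying questions — present only for ambiguous prompts. */
  questions?: string[];
}

// Only ambiguous prompts enter the clarifying-question loop;
// everything else proceeds straight to the designer LLM.
function needsClarification(result: ClassifierResult): boolean {
  return result.intent === "ambiguous" && (result.questions?.length ?? 0) > 0;
}
```

The point of the gate: nothing expensive (design, assets, GPU) runs until the conversation has resolved ambiguity.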

LLMs Design, TypeScript Builds

The designer outputs a semantic GameSpec (zones, mechanics, economy). A deterministic TypeScript layout engine with a small set of spatial archetypes turns the spec into coordinates — LLMs never predict raw XYZ, which sidesteps the spatial-reasoning failure mode.
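The WHAT/WHERE split hinges on the GameSpec staying purely semantic. A minimal sketch of what that contract might look like — the exact fields are assumptions; the real spec is richer:

```typescript
// Illustrative GameSpec shape — semantic only: zones, mechanics, economy.
// No coordinates anywhere; positions come from the layout engine.
interface GameSpec {
  archetype: string; // e.g. "linear-course" — names here are hypothetical
  zones: { name: string; difficulty: number; mechanics: string[] }[];
  economy?: { currency: string; rewardPerZone: number };
}

// The only spatial inputs the layout engine needs are the archetype and
// the zone list; every XYZ is derived deterministically downstream.
function spatialInputs(spec: GameSpec): { archetype: string; zoneCount: number } {
  return { archetype: spec.archetype, zoneCount: spec.zones.length };
}
```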

Self-Hosted 3D Asset Pipeline

A text-to-3D + texturing chain runs on self-hosted GPU workers behind a typed RPC boundary. The resulting unit cost is order-of-magnitude cheaper than hosted-API equivalents, which is what makes per-user free tiers economically viable.

Resolver Pattern

A single data-driven document is the source of truth for prompt-to-pipeline routing — keywords, aesthetics, archetypes, decoration rules. It is versioned, greppable, and covered by trigger-eval regression tests in CI. Shadow-mode migrations validate new routing against the legacy classifier in production before any cutover.
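A data-driven resolver entry might look like the following sketch. The entries, keyword lists, and field names are invented for illustration; the real document covers far more routes:

```typescript
// Minimal sketch of a data-driven resolver document (entries hypothetical).
interface ResolverEntry {
  keywords: string[]; // trigger terms matched against the prompt
  archetype: string;  // spatial archetype to route to
  aesthetic: string;  // theme handed to decoration + asset stages
}

const RESOLVER_DOC: ResolverEntry[] = [
  { keywords: ["obby", "obstacle"], archetype: "linear-course", aesthetic: "neon" },
  { keywords: ["tycoon"], archetype: "plot-grid", aesthetic: "industrial" },
];

// First keyword match wins; undefined means "fall through to LLM design".
function resolveRoute(prompt: string): ResolverEntry | undefined {
  const p = prompt.toLowerCase();
  return RESOLVER_DOC.find((e) => e.keywords.some((k) => p.includes(k)));
}
```

Because the routing lives in data rather than branching code, a diff to `RESOLVER_DOC` is greppable and can be replayed against the trigger-eval suite before merge.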

Multi-Agent Orchestration

Built using a planner / engineer / reviewer pipeline of autonomous coding agents. File-claim isolation and per-track git worktrees let several engineer agents work in parallel without merge conflicts. At peak, single sessions have shipped 35-80 commits.
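The core invariant of file-claim isolation — no two agents may hold the same file — can be sketched as an exclusive-claim check. This is a toy stand-in; the real mechanism presumably persists claims alongside the worktrees.

```typescript
// Sketch of file-claim isolation: an engineer agent may only touch files
// it has claimed, and claims are exclusive across agents.
type Claims = Map<string, string>; // file path -> owning agent id

function tryClaim(claims: Claims, agent: string, file: string): boolean {
  const owner = claims.get(file);
  if (owner !== undefined && owner !== agent) return false; // held by another agent
  claims.set(file, agent);
  return true;
}
```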

Complete .rbxlx Output

Outputs a fully playable Roblox game file with 100+ injected Luau gameplay snippets (currency, checkpoints, kill bricks, pet hatching, CTF flags, round timers, etc.). Users publish straight to Roblox from the browser via OAuth.
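Snippet injection reduces to wrapping a Luau source string in a Script element inside the .rbxlx XML. A hedged sketch — the element layout below is an approximation of the rbxlx format, and the snippet text is a placeholder, not one of the real 100+ snippets:

```typescript
// Hedged sketch of snippet injection: each gameplay feature maps to a Luau
// source string wrapped in a Script item in the .rbxlx XML. Placeholder
// snippet text; the real injector and library are more involved.
const SNIPPETS: Record<string, string> = {
  killBrick: "script.Parent.Touched:Connect(function(hit) ... end)",
};

function escapeXml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function wrapAsScript(name: string, luau: string): string {
  return `<Item class="Script"><Properties>` +
    `<string name="Name">${escapeXml(name)}</string>` +
    `<ProtectedString name="Source">${escapeXml(luau)}</ProtectedString>` +
    `</Properties></Item>`;
}
```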

APPROACH

PromptBlox uses a resolver-driven routing layer to keep generation logic consistent as game types scale. LLMs handle creative decisions (what the game is, how it feels); deterministic TypeScript handles spatial decisions (where parts go, how they connect). Details on request.

system_overview.txt PROMPT → PLAYABLE GAME
user prompt
     │
     ▼
┌─────────────────┐
│   CLASSIFIER    │──► trigger-eval tests
│  (LLM, intent)  │    (CI-gated regression)
└──────────┬──────┘
           │
           ▼
┌─────────────────┐
│    RESOLVER     │──► single source of truth —
│  (data-driven)  │    routing, aesthetics,
└──────────┬──────┘    archetypes, decoration
           │
   ┌───────┼───────┬───────┐
   ▼       ▼       ▼       ▼
┌──────┐┌──────┐┌──────┐┌──────┐
│ STRU ││ DECO ││ASSET ││SCRIPT│
└──┬───┘└──┬───┘└──┬───┘└──┬───┘
   │       │       │       │
   └───────┴───┬───┴───────┘
               ▼
       ┌───────────────┐
       │   STITCHER    │
       └───────┬───────┘
               ▼
       playable game file

legend:
  STRU   structure gen — deterministic TypeScript
  DECO   decoration — LLM code-gen + theme context
  ASSET  asset pipeline — self-hosted GPU 3D + texturing
  SCRIPT script injector — gameplay snippets + runtime APIs

Each fan-out stage has its own observability, fallback path, and test suite. The Resolver is the pressure point: every routing change is gated by a trigger-eval regression harness so behavioral regressions show up in CI, not in prod.
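The trigger-eval idea is a table of (prompt, expected route) pairs replayed against the resolver on every change; any drift fails CI. A minimal harness sketch — `route` stands in for the real resolver, and the case table is invented:

```typescript
// Sketch of a trigger-eval regression harness: replay known prompts
// against the router and report any behavioral drift.
interface TriggerCase { prompt: string; expected: string }

function runTriggerEvals(
  route: (prompt: string) => string,
  cases: TriggerCase[],
): { pass: number; failures: string[] } {
  const failures: string[] = [];
  for (const c of cases) {
    const got = route(c.prompt);
    if (got !== c.expected) failures.push(`${c.prompt}: got ${got}, want ${c.expected}`);
  }
  return { pass: cases.length - failures.length, failures };
}
```

In CI the equivalent suite would simply fail the build when `failures` is non-empty, which is what keeps routing regressions out of prod.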

build_pipeline.txt HOW FEATURES SHIP
# How features actually ship.

┌──────────────┐
│   PLANNER ───┼──► plan doc on disk
└──────┬───────┘
       │
       ▼
┌──────────────┐
│ ENGINEER A ──┼─┐
├──────────────┤ │
│ ENGINEER B ──┼─┼──► REVIEWER ──► PASS / FAIL
├──────────────┤ │                     │ (on pass)
│ ENGINEER C ──┼─┘                     ▼
└──────────────┘               parent merges ──► production deploy
  (file-claim isolation, engineers run in parallel)

Each stage emits artifacts on disk (plan docs, diffs, review reports) rather than passing state through inline prompts. That lets multiple agents work in parallel without tripping over each other, and leaves an audit trail when something goes sideways.

SOURCE_CODE

Sketches, not production code — model names, prompts, timeouts, and cache strategies are elided on purpose.

classifier.ts INTENT → RESOLVER
// Intent classifier — LLM returns a 4-way decision.
// Output schema is validated before anything downstream touches it.

export async function classifyPrompt(prompt: string): Promise<ClassifierResult> {
  const { object } = await generateObject({
    model:  /* frontier LLM */,
    schema: ClassifierSchema,
    prompt: buildClassifierPrompt(prompt),
  });

  // Ambiguous prompts trigger a clarifying-question conversation loop
  // before anything expensive runs.
  if (object.intent === "ambiguous") {
    return { intent: "ambiguous", questions: object.questions };
  }

  // Resolver: single source of truth for prompt -> pipeline.
  // Versioned as data, not baked into code paths.
  const routed = await resolve(prompt, object);
  return { intent: "clear", ...routed };
}
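The "output schema is validated before anything downstream touches it" contract can be shown with a hand-rolled validator. The real code likely uses a schema library; this stand-in (names and fields assumed) just makes the gate explicit:

```typescript
// Hand-rolled stand-in for ClassifierSchema validation: reject anything
// that is not a well-formed 4-way intent result before it reaches the
// resolver. Field names are illustrative.
const INTENTS = ["clear", "novel", "ambiguous", "impossible"] as const;
type Intent = (typeof INTENTS)[number];

function validateClassifierOutput(raw: unknown): { intent: Intent; questions?: string[] } {
  const o = raw as Record<string, unknown>;
  if (!INTENTS.includes(o?.intent as Intent)) {
    throw new Error(`invalid intent: ${String(o?.intent)}`);
  }
  if (o.questions !== undefined &&
      !(Array.isArray(o.questions) && o.questions.every((q) => typeof q === "string"))) {
    throw new Error("questions must be string[]");
  }
  return { intent: o.intent as Intent, questions: o.questions as string[] | undefined };
}
```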
structure-generator.ts DETERMINISTIC LAYOUT
// LLMs design (WHAT), TypeScript builds (WHERE).
// Spatial archetypes replace O(n) template duplication with
// a small set of parameterized layout algorithms.

export function generateStructure(spec: GameSpec): GameState {
  const archetype = ARCHETYPES[spec.archetype];
  const parts: Part[] = [];

  for (const zone of spec.zones) {
    const layout    = archetype.layoutFor(zone);
    const obstacles = pickObstacles(zone);

    for (const slot of layout.slots) {
      parts.push(placePart(obstacles.next(), slot, zone));
      // position is computed, never LLM-predicted — avoids
      // the spatial-reasoning failure mode entirely.
    }
  }

  return { parts, scripts: collectSnippets(spec) };
}
mesh_worker.py SELF-HOSTED GPU WORKER
# Self-hosted GPU worker: text prompt -> textured 3D mesh.
# Exact model chain, timeouts, and cache-key strategy redacted;
# the architectural point is the separation of concerns.

@app.cls(gpu="…")
class MeshWorker:
    @modal.enter()
    def load(self):
        self.geometry_model = load_geometry(…)   # text -> untextured mesh
        self.texturing_model = load_texturing(…) # image -> PBR texture

    @modal.method()
    async def generate(self, prompt, theme):
        if hit := await cache.get(prompt, theme):
            return hit                         # warm path

        mesh    = self.geometry_model.run(prompt)
        concept = await render_concept_image(prompt, theme)
        glb     = self.texturing_model.run(mesh, concept)
        return await cache.put(prompt, theme, glb)
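The mesh cache mentioned above is content-addressed: identical requests hash to the same key, so a warm hit skips the GPU entirely. A sketch of what such a key might look like — the normalization and the version field are assumptions, and the real cache-key strategy is redacted upstream:

```typescript
import { createHash } from "node:crypto";

// Sketch of a content-addressed cache key for generated meshes. Including
// a pipeline version in the key invalidates old meshes when the model
// chain changes. Normalization rules here are illustrative.
function meshCacheKey(prompt: string, theme: string, pipelineVersion: string): string {
  return createHash("sha256")
    .update(JSON.stringify([prompt.trim().toLowerCase(), theme, pipelineVersion]))
    .digest("hex");
}
```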

LAYERS

Prompt Layer

  • LLM intent classifier
  • Data-driven Resolver
  • Clarifying-question conversation
  • Streaming progress (SSE)

Design + Build

  • LLM game designer (GameSpec)
  • Spatial archetypes (TypeScript)
  • Hardcoded templates for known types
  • LLM decorator + image concepts

Asset + Output

  • Self-hosted GPU text-to-3D
  • Self-hosted GPU texturing
  • Content-addressed mesh cache
  • .rbxlx export + Roblox OAuth publish

SCALE_METRICS

120K+   lines of TypeScript
2,100+  tests passing
100+    Luau snippets
675+    commits shipped
8       spatial archetypes
83%     prompt coverage
10x     cheaper meshes vs hosted API
14      build sessions to V3

PRODUCT_NOTES

Positioned for parents as education + entrepreneurship: kids learn real game design and can publish to Roblox with one click. The free tier runs on a daily credit allowance with a signup bonus; Pro ($14.99/mo) and Pro+ ($29.99/mo) scale the allowance up. Credits are priced so that a map costs more than a mod, and a mod more than a concept, so the most expensive actions are the ones users ask for least often.

Ongoing work: the observability MVP is scoped but not deployed, a heavier "conductor" architecture is deferred pending real prod data, and a Roblox Studio plugin is built but shelved until user demand signals it. The core wedge remains first-shot game quality in the browser.
