Jarvis

An Iron Man–style assistant that builds holograms

Built

A voice-driven personal AI assistant with a holographic fabrication engine. It sees through a camera with Gemini vision and runs on an always-on wake word. Hand it a reference image and it turns that into a 3D hologram, then designs and renders its own builds. Each result gets compared against the reference, with fixes hot-swapped in a self-correction loop.

12-30 parts per component

The problem

Voice assistants stop at conversation. They can't see what's in front of them. They can't build anything you can inspect, and they have no way to check whether their own output is actually correct.

What I built

Built and working. The assistant sees through a camera, deconstructs a reference image into a 3D hologram, then designs and renders its own multi-part 3D build. From there it critiques and corrects that build against the reference in a closed loop.

How it works

A FastAPI backend ties three layers together.

The input layer stays always-on: a wake word opens a voice channel, and a camera feed goes to Gemini vision so the assistant can reason about what it sees.

The graphics layer renders in the browser on Three.js/WebGL. Its Hologram Deconstructor takes a reference image and runs Depth Anything V2 in-browser to lift it into a depth-mapped 3D hologram. That depth-lifting stage is the Relief engine.

The key build is the Fabrication Engine. Rather than treating schematics as flat pictures, it generates them as JARVIS-designed 3D builds described in a rich geometry language, assembling roughly 12 to 30 parts per component.

What makes it more than a renderer is the Iron Man self-correction loop. The engine renders its own build, sends that render back through Gemini vision to compare against the reference, then hot-swaps fixes and re-renders. A REFINE control triggers another pass on demand.
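
The render, critique, fix cycle described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: `refine_loop`, `Critique`, and the three callables (`render`, `critique`, `apply_fixes`) are hypothetical names, and the stubs stand in for the Three.js renderer and the Gemini vision call.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Tuple


@dataclass
class Critique:
    """Hypothetical result of comparing a render against the reference."""
    matches: bool                                   # render judged close enough
    fixes: List[str] = field(default_factory=list)  # suggested corrections


def refine_loop(
    build: Any,
    reference: Any,
    render: Callable[[Any], Any],
    critique: Callable[[Any, Any], Critique],
    apply_fixes: Callable[[Any, List[str]], Any],
    max_passes: int = 3,
) -> Tuple[Any, List[Critique]]:
    """Render the build, critique it against the reference, apply fixes, repeat."""
    history: List[Critique] = []
    for _ in range(max_passes):
        image = render(build)
        result = critique(image, reference)
        history.append(result)
        if result.matches:
            break  # the vision check passed, so stop refining
        build = apply_fixes(build, result.fixes)
    return build, history


# Stub demo: the "build" is a dict with one dimension, the "render" is that number.
def render(build):
    return build["width"]


def critique(image, reference):
    if image == reference:
        return Critique(matches=True)
    return Critique(matches=False, fixes=["widen"])


def apply_fixes(build, fixes):
    # Return a new build instead of mutating, mimicking a hot-swap.
    return {**build, "width": build["width"] + 1}


final_build, history = refine_loop({"width": 1}, 3, render, critique, apply_fixes, max_passes=5)
```

In this sketch a REFINE control would map to calling `refine_loop` again with `max_passes=1` on the current build. The pass cap keeps the loop from running forever when the critique never reports a match.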

Self-correction loop renders, critiques, and refines its own 3D builds

Highlights

  • Always-on wake word and voice, FastAPI backend with Gemini vision
  • Hologram Deconstructor plus Relief engine using Depth Anything V2, in-browser
  • Fabrication Engine builds schematics as 3D geometry, 12-30 parts per component
  • Iron Man self-correction loop: render the build, vision-critique against the reference, refine
  • REFINE control re-runs the render-compare-fix pass on demand
  • Browser graphics on Three.js/WebGL, no server round-trip for rendering
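
To illustrate the Relief engine highlight above, here is one common way to turn a depth map into relief geometry: lay a grid over the image and push each vertex out by its depth value. The page does not say how the Relief engine builds its mesh, so treat this as an assumption. `depth_to_relief` and `z_scale` are hypothetical names, and the sketch is in Python for readability even though the real rendering runs in the browser.

```python
from typing import List, Tuple

Vertex = Tuple[float, float, float]
Triangle = Tuple[int, int, int]


def depth_to_relief(
    depth: List[List[float]], z_scale: float = 1.0
) -> Tuple[List[Vertex], List[Triangle]]:
    """Build a grid mesh whose z coordinate follows a per-pixel depth map."""
    rows = len(depth)
    cols = len(depth[0]) if rows else 0
    if rows < 2 or cols < 2:
        raise ValueError("depth map must be at least 2x2")
    if any(len(row) != cols for row in depth):
        raise ValueError("depth map rows must all have the same length")

    # One vertex per pixel: x and y normalised to [0, 1], z displaced by depth.
    vertices: List[Vertex] = [
        (x / (cols - 1), y / (rows - 1), depth[y][x] * z_scale)
        for y in range(rows)
        for x in range(cols)
    ]

    # Two triangles per grid cell, indexing into the row-major vertex list.
    triangles: List[Triangle] = []
    for y in range(rows - 1):
        for x in range(cols - 1):
            i = y * cols + x
            triangles.append((i, i + 1, i + cols))
            triangles.append((i + 1, i + cols + 1, i + cols))
    return vertices, triangles


# A 2x3 depth map that ramps from near (0.0) to far (1.0), left to right.
vertices, triangles = depth_to_relief([[0.0, 0.5, 1.0], [0.0, 0.5, 1.0]], z_scale=2.0)
```

The resulting vertex and index lists have the same shape as the position and index buffers a WebGL mesh expects.
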
Python, FastAPI, Gemini, Three.js, Depth Anything V2, WebGL
Want one of these?

Let's build yours.

Tell me what you're trying to ship, and you'll get a scoped plan and a straight answer.