AGENTV / VTV Review Packet

One public door for the screen prototype.

AGENTV is the wall-mounted embodied endpoint of the Spirit Live Encounter primitive: a Raspberry Pi-powered portrait display with mic and speaker, paired with any nearby browser-capable screen as the agent's external canvas. This page mirrors the review packet so VTV can read, answer, and share it without private GitHub or planning repo access.

What VTV Is Reviewing

AGENTV is not a separate app. It is the hardware body for Live Encounter.

The frame carries the agent face, mic, speaker, and always-present chamber. Any nearby browser-capable screen can become the canvas: a TV, iPhone, laptop, Raspberry Pi display, Google Cast receiver, or second tab. The question for VTV is whether the physical object, pairing ritual, privacy posture, and safe-zone composition make that primitive feel trusted in a home.

Apr 28 Sync

The product shape that emerged from the working demo.

Product Model

Hardware body for Spirit

AGENTV is not a SOLIENNE-only frame. It is the hardware endpoint for Spirit agents: a persistent face/voice presence on the wall, with any addressable screen becoming the agent's canvas.

Voice first, app second

The default experience should work after unboxing with no app ritual: speak to the resident agent, change channel by voice, and let the canvas appear on nearby screens. A phone/web app is for setup, power, mic privacy, ordering, and manual channel management.

Launch fleet, then drops

The first unit can ship with a small built-in fleet: SOLIENNE, SAL, COLTRANE, HENRI, and one or two others. Later agents arrive as drops: artist collaborations, protocol agents, and commercial/service agents.

Canvas is the cheat code

The avatar can stay low-resolution, ghostly, and framed by the frosted acrylic. Intelligence and usefulness come from the agent's speech plus its ability to take over a second screen with images, diagrams, notes, checkout cards, or field guidance.

Frame Constraints

Design for the object we are actually building.

Hardware

1024 square safe zone

The current physical direction crops the 1024x1280 panel into an approximately 1024x1024 visible square. Critical face, status, captions, and setup prompts must remain centered inside that field.
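A minimal sketch of a layout check for this constraint. It assumes the visible square is vertically centered on the panel, which the packet implies but does not state, and the caption and banner rectangles are hypothetical examples rather than real layout values.

```python
from dataclasses import dataclass

PANEL_W, PANEL_H = 1024, 1280  # panel resolution given in the text
SAFE_SIDE = 1024               # approximate visible square given in the text


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def contains(self, other: "Rect") -> bool:
        """True when `other` lies fully inside this rectangle."""
        return (
            self.x <= other.x
            and self.y <= other.y
            and other.x + other.w <= self.x + self.w
            and other.y + other.h <= self.y + self.h
        )


def centered_safe_zone(panel_w=PANEL_W, panel_h=PANEL_H, side=SAFE_SIDE) -> Rect:
    # Assumption: the bezel/acrylic crop is centered on the panel.
    return Rect((panel_w - side) // 2, (panel_h - side) // 2, side, side)


SAFE = centered_safe_zone()

# Hypothetical UI elements, in panel pixel coordinates.
caption = Rect(112, 1000, 800, 120)  # caption strip near the bottom
banner = Rect(0, 40, 1024, 80)       # status banner near the panel top

print(SAFE)                    # the square the layout must respect
print(SAFE.contains(caption))  # inside the visible field
print(SAFE.contains(banner))   # hidden by the crop under this assumption
```

Under the centering assumption the crop hides 128 pixels at the top and bottom of the panel, so anything placed against the panel edges is lost. If the real crop is offset, only `centered_safe_zone` needs to change.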

Frosted acrylic is a feature

The diffuser makes avatars feel less like video calls and more like spirits/portraits. Treat blur, monochrome, and softened motion as a design language, not a defect to optimize away.

Provisioning without screen precision

Do not rely on tiny QR codes or dense text on the front screen. Prefer a QR code or serial number engraved on the back or printed on a card in the box, phone-based setup, and a Raspberry Pi first-boot hotspot as the fallback.

Kiosk launch recipe

The prototype Pi now boots through a user-level AGENTV launcher pointed at /agentv/frame?agent=solienne&wall=1&autostart=1. Required Chromium flags include --password-store=basic to avoid keyring prompts, --use-fake-ui-for-media-stream to grant the mic without a prompt, --autoplay-policy=no-user-gesture-required, and --disable-async-dns, added after WebRTC host resolution proved flaky on the Pi.
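The recipe can be sketched as a small launcher helper. This is an illustration, not the actual AGENTV launcher: the host (`agentv.example`), the binary name (`chromium-browser`), and the `--kiosk` flag are assumptions not given in the packet. The flag names come from the paragraph above, written here with the usual leading dashes.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: the packet names neither the host nor the binary.
BASE_URL = "https://agentv.example"
CHROMIUM_BIN = "chromium-browser"


def frame_url(agent="solienne", base_url=BASE_URL):
    """Build the frame URL with the query parameters named in the packet."""
    query = urlencode({"agent": agent, "wall": 1, "autostart": 1})
    return f"{base_url}/agentv/frame?{query}"


def kiosk_argv(agent="solienne"):
    """Return the argument vector for launching the frame in Chromium."""
    return [
        CHROMIUM_BIN,
        "--kiosk",  # assumption: full-screen kiosk mode, not stated in the text
        "--password-store=basic",                      # avoid keyring prompts
        "--use-fake-ui-for-media-stream",              # grant mic without a prompt
        "--autoplay-policy=no-user-gesture-required",  # let audio start unattended
        "--disable-async-dns",                         # workaround named in the text
        frame_url(agent),
    ]


print(" ".join(kiosk_argv()))
```

On the device the list would typically be handed to `subprocess.Popen(kiosk_argv())` from a user-level service, which keeps the flags in one reviewable place.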

Power and cooling are reliability

The prototype reported throttled=0x50005, which flags under-voltage and active throttling, and temperatures around 76 °C during live Runway playback. Treat a stable Pi 5 power supply, airflow/thermal path, and cable discipline as part of the product, not bench setup.
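The reported value reads like the output of the Raspberry Pi `vcgencmd get_throttled` command, which is a bitmask. A short sketch that decodes it, using the bit meanings from the Raspberry Pi documentation; the function name is illustrative.

```python
# Bit positions as documented for `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(value):
    """Accept an int or a string such as 'throttled=0x50005'; return set flags."""
    if isinstance(value, str):
        value = int(value.split("=")[-1], 16)
    return [label for bit, label in THROTTLE_FLAGS.items() if value & (1 << bit)]


for flag in decode_throttled("throttled=0x50005"):
    print(flag)
# under-voltage detected
# currently throttled
# under-voltage has occurred
# throttling has occurred
```

None of the temperature-limit bits are set in 0x50005, so the throttling in this reading points at the power path first and cooling second.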

No camera by default

A camera creates immediate privacy creep. Keep v1 mic/speaker/display only; expose ports for future optional modules if needed, but do not make a camera part of the default promise.

Go-To-Market

Limited art hardware first, Spirit operating system underneath.

Launch Thesis

SOLIENNE walks so others can fly

SOLIENNE is the first compelling proof: people want live encounters. AGENTV packages that proof as a home object, then reveals the broader Spirit agent operating system.

Three agent categories

The three categories are the built-in Spirit fleet, artist/creative drops, and commercial/service agents. COLTRANE behaves more like Keynote for meetings; HENRI behaves like Photos for photographic thinking; service agents can eventually handle ordering or commerce.

Limited edition first

Early units should be numbered, constrained, and treated like art hardware, not mass electronics. The goal is to prove desire, reliability, and language before scale.

VTV Ask

If time is short, answer one P0.

We want VTV to review the product, hardware, privacy, and pairing assumptions before the first screen prototype review. If time is short, answer one P0 question with a concrete recommendation.

Read Order

The packet, condensed for public access.

01 Product Spec

What AGENTV is, who it is for, the wall-mounted home use case, the launch-fleet model, and how the agent/canvas split should feel.

02 Architecture

The live encounter stack: Raspberry Pi kiosk, Vercel web surface, RPC cognition service, Runway avatar lane, paired canvas lane, storage, and secrets.

03 Code Map

Where the relevant code lives in the clean Codex-built Live Encounter reference repo and where production SOLIENNE pieces remain separate.

04 Hardware

Frame, safe zone, mic/speaker, Raspberry Pi 5, panel, acrylic diffuser, physical privacy indicators, and failure modes.

05 Open Questions

Fourteen questions grouped by P0/P1/P2. This is the most important page for VTV to answer.

06 Glossary

Shared vocabulary for AGENTV, Spirit Live Encounter, canvas, chamber, receiver, Runway avatar, and AgentV home mode.

P0 Questions

These block the first shippable unit.

Q1 / P0

Is the Live Encounter primitive the right shape for an unsupervised always-on home unit?

AGENTV consumes the same chamber plus canvas primitive used by SOLIENNE and COLTRANE, but home mode may run for hours or days rather than a bounded gallery encounter.

  • Is direct_address_only the right wake posture, or does a home need silent-unless-addressed-and-human-present?
  • Does long-lived home presence need a new ambient phase, or should idle plus presenceMode cover it?
  • Can ignored noise still pollute participant.turn logs, and what transcript hygiene is required?

Q2 / P0

Is VAD-before-cloud defensible as a privacy posture for a European home?

The intended claim is that mic audio does not leave the device until voice activity is detected, with a visible indicator while the agent lane is listening.

  • Is browser-side VAD enough, or must enforcement be firmware/OS-level to be credible?
  • What consent regime applies in an EU private home, especially for guests?
  • Does emotion or relational tagging introduce EU AI Act risk?
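To make the VAD-before-cloud claim concrete for review, here is a minimal sketch of a gate that withholds audio frames until voice activity is detected. It uses a simple RMS energy threshold over 16-bit PCM frames as a stand-in, not the prototype's actual detector, and the class name, threshold, and hangover values are all hypothetical.

```python
import math
from array import array


def rms(frame: bytes) -> float:
    """Root-mean-square level of a 16-bit signed PCM frame (native byte order)."""
    samples = array("h", frame)
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


class VadGate:
    """Forward frames only while speech-like energy is present."""

    def __init__(self, threshold=500.0, hangover_frames=3):
        self.threshold = threshold              # assumed level, needs tuning
        self.hangover_frames = hangover_frames  # quiet frames tolerated mid-speech
        self.listening = False                  # would drive the visible indicator
        self._quiet = 0

    def process(self, frame: bytes):
        """Return the frame if it may leave the device, otherwise None."""
        if rms(frame) >= self.threshold:
            self.listening = True
            self._quiet = 0
        elif self.listening:
            self._quiet += 1
            if self._quiet > self.hangover_frames:
                self.listening = False
        return frame if self.listening else None


gate = VadGate()
silence = array("h", [0] * 160).tobytes()
speech = array("h", [4000] * 160).tobytes()
print(gate.process(silence) is None)  # withheld: nothing detected yet
print(gate.process(speech) is None)   # forwarded: indicator would be on
```

The sketch shows why the first bullet matters: a gate like this lives in application code, so the privacy claim is only as strong as whatever guarantees that no other path reads the microphone.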

Q3 / P0

Is the pairing UX defensible as both a security surface and a human ritual?

The primitive supports QR, short pair codes, direct display links, and Google Cast. The product question is which one should lead in a home.

  • Is a read-only 8-character pair code acceptable, or does /display need per-session auth?
  • Should the home ritual lead with Cast, QR, or short code?
  • If the canvas drops mid-encounter, should the agent notice and acknowledge it?
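For the pair-code question, a rough sketch of how much guessing resistance an 8-character code offers. The alphabet is an assumption (uppercase letters and digits with look-alike characters removed); the packet does not specify the real one, and the function names are illustrative.

```python
import math
import secrets

# Assumed alphabet: excludes 0/O and 1/I/L to keep codes readable aloud.
ALPHABET = "ABCDEFGHJKMNPQRSTUVWXYZ23456789"


def new_pair_code(length=8):
    """Generate a pair code from a cryptographically secure random source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def entropy_bits(length=8, alphabet=ALPHABET):
    """Bits of entropy in a uniformly random code of the given length."""
    return length * math.log2(len(alphabet))


def seconds_to_exhaust(guesses_per_second, length=8, alphabet=ALPHABET):
    """Worst-case time to try every code at a fixed guess rate."""
    return len(alphabet) ** length / guesses_per_second


print(new_pair_code())
print(round(entropy_bits(), 1))
```

With this 31-character alphabet an 8-character code carries just under 40 bits, so its safety depends heavily on rate limiting and short code lifetimes at the /display endpoint. That is the trade-off the first bullet asks VTV to judge against per-session auth.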