-
Component conventions — the authoring contract Maestro AI is built against
M1 of the four-milestone Maestro AI roadmap. A new docs/conventions/ tree formalizes how skills, agents, scores, and orchestras get authored — hand-written content, machine-readable form, both human authors and LLM authors (Maestro AI v0.6+) read the same pages. The canonical Score document Zod schema (packages/types/src/score-document.ts) finally exists as a single artifact representing 'a score' instead of being implicit across three DB tables; validateScoreDocument() is the shared validation gate every consumer uses (Composer save, future Maestro AI proposals, future skill-author proposals). New GET /api/registry/skills endpoint surfaces a workspace-agnostic spec of every installed skill — operations, input schemas, output ports — for downstream tooling. The @operation decorator now accepts ports=[...] so multi-port skill ops (find_contact's exists/not_found, eventually) can declare their branching outcomes. manifest.schema.json published for Monaco binding in M2's in-browser skill editor. Example skills (Slack + HubSpot) ship in v0.4.1.
-
M1 foundations: ScoreDocument spec + shared validator + registry introspection
Three substrate pieces M2/M3/M4 reuse, shipped early so the lab box can validate them independently. (1) Canonical ScoreDocument Zod schema covering score + nodes + edges as one tree; validateScoreDocument returns all errors at once (not first-error-only) so an LLM emitter sees every issue per turn. (2) The inline validateGraph in composer.ts is replaced with a validateScoreDocument call; the save handler now projects current DB rows + diff into a full ScoreDocument and runs it through the shared gate. (3) New GET /api/registry/skills endpoint — workspace-agnostic, machine-readable: name, version, description, operations + input schemas + reserved outputPorts.
-
Cold-outreach naming neutralized + Apollo find_leads cycles pages per run
Two related fixes. (1) 'B2B SaaS Outbound' was a misnomer — the score graph works for any ICP once the Apollo filters and LLM prompts are tuned. Renamed score / pipeline / orchestra display names + descriptions to be ICP-neutral; score version bump triggers re-seed on existing installs. (2) The seeded find_leads node passed page=1 every run, so a daily cron orchestra against an unchanged ICP re-pulled the same 10 candidates forever. The skill operation now accepts random_page_max and picks a random page when set; seed config wires random_page_max=20 — ~200 distinct candidates per ICP per run window, dedup downstream still skips already-contacted leads.
-
Score runners renamed to orchestras — Maestro conducts orchestras playing scores
The conductor metaphor lands properly. The cron-attached deployment of a score is now an 'orchestra' — Maestro conducts orchestras playing scores from agents and skills. Every noun lines up. The sidebar's bottom section reads 'Orchestras · N'. URLs flip from /scores/$id to /orchestras/$id (clean break, the legacy URL 404s — operators bookmark from the dashboard, which auto-updates). The sidebar's '+ New' button now opens a dedicated OrchestraBuilder modal: pick a score, name the orchestra, set tag (auto-defaults from the score's pipelineKind), set cron, optionally describe. Server creates with enabled: false; flip Enable on the dashboard once secrets are wired. Migration 0015 renames the score_runners table to orchestras + every FK column (orchestra_id everywhere) + the SSE notify trigger function emits orchestra_id instead of agent_id. Single coordinated commit across schema + API + web + runtime + skills + docs. Breaking external API rename — minor-version bump.
-
Composer + Agents catalog — operators author scores end-to-end
The Composer ships. Operators clone a seeded score, drag skills from the palette, drop new LLM and control nodes from buttons in the same palette, draw edges between port handles, edit per-node config (including JSON-Schema-driven args forms for deterministic skill ops, structured map_over editors for control nodes, and an agent-picker for LLM nodes), and Save for a single version-bumped commit to the score graph. The Agents catalog at /agents promotes LLM-node configs to first-class workspace entities — pick an existing agent or create one inline, edit its system prompt + model + allowed tools on a dedicated detail page, see which scores reference it, delete safely with API-side guard rails. Multi-port handles, hover-highlight to trace a node's upstream + downstream path, bezier ↔ smoothstep edge toggle, dagre-driven Reset Layout for messy canvases. The legacy LLM-loop hero-score runners (cold-leads, reply-triage) retire here — the graph-decomposed v2 versions have proven themselves on the lab box (~9× cheaper on cold-leads, single-shot reply classification). v0.2.0 is the milestone where 'configure-and-run-only' ends.
-
Workspace skill authoring (Composer 1c, deferrable)
Form-driven new skill creation: name, description, secrets needed, operations with input/output JSON Schemas, Python body in an in-browser Monaco editor. Save to a workspace_skills DB row; the runtime exec()s the source into a registry-aware namespace. Same dispatch path as catalog skills. Sandboxing deferred to v2's multi-tenant work — self-host operators trust their own composer-authored code. May slip off the v1.x roadmap entirely if catalog skills cover real demand from design partners.
-
Hero score: cold-leads + reply-triage agents, in review mode
First end-to-end run of the v1 B2B SaaS Outbound score. Cold-leads finds candidates from Apollo, drafts personalized openers with Compose, writes them to the Pipeline as note activities at stage 'ready', and pings the operator via the bell icon. The operator reviews each draft on the contact detail page and clicks Send via Gmail when satisfied — the activity flips to 'contacted' and the email goes out. Reply-triage classifies inbound replies and notifies on 'interested' / 'needs_review' intents.
-
Notify skill + bell-icon UI
Agents can now ping the operator when something needs attention. Two ops: send_event (informational) and send_attention (needs review). Bell icon in the topbar with unread count, dropdown of recent notifications with mark-read / dismiss / follow-link affordances, real-time via Postgres LISTEN/NOTIFY → SSE.
-
Pipeline skill — agent-facing reads + writes against contacts
Agents can now write to the contacts and activities tables. Three operations: find_contact, add_contact, log_activity (with optional stage advancement). Writes stream to the Pipelines UI in real time so the operator sees what the agent is doing as it works.
-
Compose skill — LLM-backed openers + reply triage
draft_personalized_opener writes a cold-email subject + body grounded in a recipient's recent signal. classify_reply_intent triages an inbound reply into one of six intents (interested, not_interested, out_of_office, wrong_person, unsubscribe, needs_review). Both use Anthropic tool-use to force structured output. Defaults to Haiku 4.5; pass model='claude-sonnet-4-6' for higher-quality drafts on hard prospects.
-
Apollo skill — find_leads, enrich_person, enrich_domain
B2B prospecting + enrichment via apollo.io's API. Trims Apollo's 50-field firehose down to the fields the agent + Pipeline can use. Empty email is a real signal — free-tier reveals are limited; the agent treats those as 'needs enrichment'.
-
Gmail skill — OAuth + read + send + label
Connect a Gmail account via OAuth (refresh-on-401 baked into the SDK), read the inbox, fetch full thread bodies (MIME-parsed text + html), send messages (plain or with html alternative, with optional thread_id for replies), apply labels by name. queue_paced_send is planned — needs a persistent send queue.
-
Encrypted secrets vault
API keys and OAuth tokens are now AES-256-GCM encrypted at rest in Postgres. Master key in MAESTRO_SECRET_KEY env var, never in the DB. Skill SDK resolves secrets via a chain: env vars (dev override) → DB vault. Web UI for listing, adding, rotating, deleting secrets. Compromised DB backup is no longer compromised secrets.
-
Pipeline foundation — multi-pipeline support, real-time UI
Workspace can have many named pipelines. Email uniqueness is per-pipeline (same lead can exist across A/B-test pipelines). Companies stay workspace-scoped + de-duped. Every contact/activity write streams live to the Pipelines UI via Postgres LISTEN/NOTIFY → SSE.
-
Skills SDK — @skill, @operation, auto-derived JSON Schema
Decorator-based skill authoring: @skill('name') on the class, @operation on each method. Pydantic input models generate JSON Schema automatically; the agent loop hands those schemas to Claude as tool definitions. Manifest YAML declares secrets + concurrency; runtime auto-discovers skills under skills/catalog/.
-
Runtime — scheduler + worker pool + agent loop
Python runtime with psycopg connection pool, croniter-based scheduler, SELECT FOR UPDATE SKIP LOCKED for safe job claim across workers, full agent loop with Anthropic tool-use dispatch back into skill operations. Run steps stream to the dashboard via SSE in real time.
-
Web app + API + design system
Vite + React 19 + Tailwind v4 web app on the Hono + Drizzle TS API. Sidebar / topbar / agent dashboard / run rows / run cards / run log / agent builder / pipelines kanban / contact detail. Mobile-optimized. Manuscript-inspired design system with custom typography (Geist body + Instrument Serif display + JetBrains Mono).
Changelog
Builder's log
Reverse-chronological record of what's shipped while we work toward the v1 hero score and the closed beta.