CURTISION • OS ASSISTANT • UI-FIRST AUTOMATION

The OS Assistant That Feels Like Windows

The goal isn’t “chat with an agent.” The goal is: right-click verbs, zero disconnect,
and Windows-grade reliability — background jobs, queues, retries, rollback, and receipts.

UI-First • Click verbs, not prompts

99% Reliable • Playbooks + safety tiers

Always Aware • Selection + active app + optional screen

Background Jobs • Queue • retry • fail gracefully

Big idea: Humans choose what action happens (“Optimize”, “Expand”, “Lookup”).
The assistant automates the thinking — picks the best preset and parameters — while deterministic playbooks handle execution.
That’s why it stays stable.

What we’re building (in one sentence)

A local-first OS layer where you can invoke an assistant anytime from a right-click menu or hotkey,
and it instantly understands your current context, runs safe automations in the background,
tracks failures, and always leaves clean outputs.

The core UX: verbs everywhere

🖱️ Context menu verbs in Explorer/Finder: Optimize • Expand • Lookup • Chat about this • Run preset • Task Center
⌨️ Command palette (Ctrl+Space): type a verb, hit Enter; no “prompt engineering” required
🧭 Task Center like a downloads manager: queued/running/done/failed, with logs + receipts + rollback
💬 Chat side panel is optional: it’s for “why?”, “explain”, “make a preset”, and “teach me”, and it is never required to operate

Zero disconnect: how it knows what you’re doing

The assistant builds a Context Pack every time you click a verb — so you never need to copy/paste or explain.
It’s layered for privacy and reliability:

Selection context: selected files/folders/text, metadata, current directory.
Active app adapters: browser URL/title/selection, VS Code file/selection, terminal cwd + last output (bounded).
Optional screen awareness: active window capture + local OCR/vision for visible errors and UI state.

Context Pack (conceptual)
{
  "verb": "optimize",
  "selection": ["D:\\Downloads\\images\\…"],
  "active_app": "explorer.exe",
  "adapter": { "folder": "D:\\Downloads\\images" },
  "screen": { "enabled": false },
  "preset": "WebOptimizeImages",
  "safety": { "keep_originals": true, "two_phase_commit": true }
}
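To make the shape above concrete, here is a minimal Python sketch of a Context Pack builder. The class name `ContextPack`, the helper `build_context_pack`, and the file name `example.png` are assumptions for illustration; only the field names and default safety values come from the conceptual pack above.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ContextPack:
    """Snapshot of what the user was doing when a verb was clicked."""
    verb: str
    selection: list[str]
    active_app: str
    adapter: dict = field(default_factory=dict)
    # Screen awareness is optional, so it defaults to off.
    screen: dict = field(default_factory=lambda: {"enabled": False})
    preset: Optional[str] = None
    safety: dict = field(
        default_factory=lambda: {"keep_originals": True, "two_phase_commit": True}
    )

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


def build_context_pack(verb, selection, active_app, adapter=None, preset=None):
    # Hypothetical helper: normalizes the verb and copies the selection list.
    return ContextPack(
        verb=verb.lower(),
        selection=list(selection),
        active_app=active_app,
        adapter=adapter or {},
        preset=preset,
    )


# "example.png" is a made-up file name used only for this demo.
pack = build_context_pack(
    "Optimize",
    [r"D:\Downloads\images\example.png"],
    "explorer.exe",
    adapter={"folder": r"D:\Downloads\images"},
    preset="WebOptimizeImages",
)
print(pack.to_json())
```

Keeping the pack a plain, serializable record is what would let every task store the exact context it ran with.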

Why this beats fully-agentic automation

Fully agentic systems often fail because they must interpret ambiguous language, guess missing context,
and execute risky actions without tight guardrails.
This OS assistant stays stable because:

Intent is explicit: you clicked “Optimize” — no guessing what you meant.
Scope is bounded: it applies to your selection / current app.
Execution is standardized: playbooks do the work (preflight, validation, rollback).
Chat is optional: no constant corrections, no “watch the agent think.”

Windows-grade reliability stack

1) Human picks a verb (explicit intent)
Right-click / palette. The verb is the “contract.”

2) Assistant selects preset (smart defaults)
AI picks parameters (format, size, plan), but it doesn’t execute raw commands.

3) Playbook engine runs (deterministic)
Preflight → steps → validation → receipts → rollback-ready output.

4) Job queue + Task Center (retry / pause)
Background execution, graceful failure, always recoverable.
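The playbook stage (preflight → steps → validation → receipts → rollback) can be sketched as a small runner. This is one possible shape, not the actual engine: `Step`, `Receipt`, and `run_playbook` are hypothetical names, and each step is assumed to supply its own undo function.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    name: str
    do: Callable[[], None]
    undo: Callable[[], None]


@dataclass
class Receipt:
    playbook: str
    status: str = "pending"
    log: list = field(default_factory=list)


def run_playbook(name, preflight, steps, validate):
    """Run steps in order; undo completed steps if anything goes wrong."""
    receipt = Receipt(playbook=name)
    if not preflight():
        receipt.status = "failed_preflight"
        receipt.log.append("preflight failed; nothing was changed")
        return receipt
    done = []
    try:
        for step in steps:
            step.do()
            done.append(step)
            receipt.log.append(f"ok: {step.name}")
        if not validate():
            raise RuntimeError("validation failed")
        receipt.status = "done"
    except Exception as exc:
        receipt.log.append(f"error: {exc}")
        # Roll back in reverse order so later steps are undone first.
        for step in reversed(done):
            step.undo()
            receipt.log.append(f"rolled back: {step.name}")
        receipt.status = "rolled_back"
    return receipt


# Toy demo: the "work" is appending to a list, the undo is removing it again.
state = []
demo_steps = [
    Step("add a", lambda: state.append("a"), lambda: state.remove("a")),
    Step("add b", lambda: state.append("b"), lambda: state.remove("b")),
]
good = run_playbook("demo", lambda: True, demo_steps, lambda: state == ["a", "b"])
print(good.status, good.log)
```

In this sketch a step that raises halfway through is not undone itself; a real step would need to clean up its own partial work.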

Task states (simple + powerful)

Queued → waiting
Running → working
Needs Input → only when truly ambiguous
Failed → classified + retry options
Done → receipts + outputs
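The five states map naturally onto a small state machine. The sketch below is an assumption about which transitions are legal (for example, that a retry sends a failed task back to the queue); the source only names the states, and pause/cancel are left out.

```python
from enum import Enum


class TaskState(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    NEEDS_INPUT = "needs_input"
    FAILED = "failed"
    DONE = "done"


# Assumed transition table; adjust to match the real Task Center rules.
ALLOWED = {
    TaskState.QUEUED: {TaskState.RUNNING},
    TaskState.RUNNING: {TaskState.NEEDS_INPUT, TaskState.FAILED, TaskState.DONE},
    TaskState.NEEDS_INPUT: {TaskState.RUNNING, TaskState.FAILED},
    TaskState.FAILED: {TaskState.QUEUED},  # a retry re-queues the task
    TaskState.DONE: set(),                 # terminal state
}


def transition(current, target):
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = transition(TaskState.QUEUED, TaskState.RUNNING)
state = transition(state, TaskState.DONE)
print(state)
```

Rejecting illegal moves up front keeps the Task Center from ever showing a task that is, say, both done and running.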

Safety tiers (automation without drama)

Green: auto-run (safe, reversible)
Yellow: dry-run + confirm commit
Red: never auto-run; always explicit + preview
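The three tiers can be expressed as a small gating function that turns a tier into a list of phases. The names `Tier` and `plan_execution` and the phase labels are hypothetical; the behavior per tier follows the list above.

```python
from enum import Enum


class Tier(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def plan_execution(tier, user_confirmed=False):
    """Return the ordered phases a task may go through for its tier."""
    if tier is Tier.GREEN:
        # Safe and reversible: runs without asking.
        return ["run"]
    if tier is Tier.YELLOW:
        # Always dry-run first; commit only after the user confirms.
        return ["dry_run", "commit"] if user_confirmed else ["dry_run", "await_confirm"]
    # RED: never auto-runs; the user sees a preview and must confirm explicitly.
    return ["preview", "run"] if user_confirmed else ["preview", "await_confirm"]


print(plan_execution(Tier.GREEN))
print(plan_execution(Tier.YELLOW))
print(plan_execution(Tier.RED, user_confirmed=True))
```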

Full feature list (what the app includes)

1) UI & Integrations

Explorer/Finder context menu: Optimize • Expand • Lookup • Chat about this • Run preset • Task Center
Global command palette: verbs + quick search + suggestions
Dockable side chat: context-aware, optional, “Create task” button from chat replies
Notifications/toasts: started/done/failed + “open output folder”
App adapters: browser / VS Code / terminal context extraction
Optional screen awareness: capture + local OCR/vision (toggle + privacy controls)

2) Task Center & Job Queue

Persistent queue: survives restarts
Concurrency limits: per drive/app/type
Priority: interactive vs background
Pause/resume/cancel
Retry/backoff with error classification
Task receipts: logs, file lists, diffs, validation outputs
Rollback: restore originals or undo changes
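Retry/backoff with error classification might look like the sketch below. The classification rules (which exception types count as transient) and the names `classify` and `run_with_retry` are assumptions; the delay doubles on each attempt up to a cap, and `sleep` is injectable so the logic can be tested without waiting.

```python
import time

# Assumption: these error types are worth retrying automatically.
TRANSIENT = (TimeoutError, ConnectionError, BlockingIOError)


def classify(exc):
    if isinstance(exc, TRANSIENT):
        return "transient"
    if isinstance(exc, PermissionError):
        return "needs_input"  # the user likely has to grant access
    return "permanent"


def run_with_retry(job, max_retries=3, base=0.5, cap=30.0, sleep=time.sleep):
    """Run job(); retry transient failures with capped exponential backoff."""
    attempts = []
    for attempt in range(max_retries + 1):
        try:
            result = job()
            return {"status": "done", "result": result, "attempts": attempts}
        except Exception as exc:
            kind = classify(exc)
            attempts.append({"attempt": attempt + 1, "error": repr(exc), "class": kind})
            if kind != "transient" or attempt == max_retries:
                return {"status": "failed", "class": kind, "attempts": attempts}
            sleep(min(cap, base * 2 ** attempt))


# Demo: a job that times out twice, then succeeds.
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("drive busy")
    return "ok"

delays = []
outcome = run_with_retry(flaky, sleep=delays.append)
print(outcome["status"], delays)
```

The `attempts` list doubles as receipt material: every failure is recorded with its class, whether or not the task eventually succeeds.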

3) Context Pack System

Selection capture: files/folders/text + metadata
Active app capture: adapter snapshots
Optional screen pack: screenshot + extracted text
User prefs & presets: embedded so actions become predictable
Audit trail: every task stores the context pack used

4) Playbook Library (deterministic automation)

Playbooks are the “Windows copy dialog” equivalent: boring on purpose, dependable, and testable.

Optimize: images (resize/compress/convert/dedupe), audio normalize/convert, video transcode presets
Folder hygiene: rename clean, sort into structure, dedupe, archive, checksum reports
Module packaging: create ZIP “module pack” + manifest + health check + README
Dev workflows: format/lint/build/test; parse errors; generate patch suggestions
Docs: expand text into outline/checklist/docs; extract action items
Lookup: identify formats, parse logs, summarize error causes, fact-check (optional web)

5) Presets & Rules (so you never reconfigure)

Preset manager: “Web Optimize Images”, “Archive Lossless”, “Game Project Packager”
Per-folder defaults: auto-pick preset based on path pattern
Confidence gating: green tasks auto-run; yellow tasks dry-run, then confirm; red tasks always require explicit confirmation
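Per-folder defaults can be sketched as an ordered rule list matched against the current folder. The rule format, the path patterns, and `pick_preset` are assumptions for illustration; `WebOptimizeImages` is the preset identifier from the Context Pack, and `ArchiveLossless` is an assumed identifier for the “Archive Lossless” preset.

```python
from fnmatch import fnmatchcase

# Hypothetical rules: (verb, folder pattern, preset). First match wins.
FOLDER_RULES = [
    ("optimize", "*/downloads/images/*", "WebOptimizeImages"),
    ("optimize", "*/archive/*", "ArchiveLossless"),
]


def pick_preset(verb, folder, rules=FOLDER_RULES, default=None):
    """Return the first preset whose verb and folder pattern match."""
    # Normalize Windows paths so one pattern style works everywhere.
    normalized = folder.replace("\\", "/").lower().rstrip("/") + "/"
    for rule_verb, pattern, preset in rules:
        if rule_verb == verb and fnmatchcase(normalized, pattern):
            return preset
    return default


print(pick_preset("optimize", r"D:\Downloads\images"))
print(pick_preset("optimize", r"C:\Users\me\Documents"))
```

When no rule matches, the function returns the default, which leaves room for the assistant or the user to choose a preset instead.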

6) Learning (optional, practical, not creepy)

Episode memory: “when I click Optimize here, I usually want WebP 1920px quality 82”
Outcome scoring: faster completion, fewer retries, user satisfaction
Personal defaults: improves preset selection over time
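Episode memory does not need to be exotic. A minimal sketch, assuming the assistant simply counts which parameters the user accepted per verb and folder and only suggests them after a few repeats (`EpisodeMemory` and `min_count` are hypothetical names):

```python
from collections import Counter, defaultdict


class EpisodeMemory:
    """Counts accepted parameter sets per (verb, folder)."""

    def __init__(self, min_count=3):
        self.min_count = min_count
        self._seen = defaultdict(Counter)

    def record(self, verb, folder, params):
        # Sort items so equal dicts always produce the same hashable key.
        self._seen[(verb, folder)][tuple(sorted(params.items()))] += 1

    def suggest(self, verb, folder):
        """Return the most common params, or None if evidence is too thin."""
        counts = self._seen.get((verb, folder))
        if not counts:
            return None
        params, n = counts.most_common(1)[0]
        return dict(params) if n >= self.min_count else None


memory = EpisodeMemory(min_count=3)
choice = {"format": "webp", "max_px": 1920, "quality": 82}
for _ in range(3):
    memory.record("optimize", r"D:\Downloads\images", choice)
print(memory.suggest("optimize", r"D:\Downloads\images"))
```

The threshold keeps a single one-off choice from silently becoming the new default.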

Structure (what the app is made of)

Runtime components

Shell integration (thin): context menu + passes context pack
Local daemon (the brain + reliability): queue + playbooks + safety enforcement
UI client: Task Center + side chat + palette + notifications

Internal module layout

context/
  selection_reader
  app_adapters/ (browser, vscode, terminal)
  screen_capture (optional)
  context_pack_builder

router/
  rules_engine
  preset_picker
  ai_classifier (optional)

playbooks/
  optimize_images
  pack_module_zip
  rename_clean
  dedupe_folder
  summarize_logs
  project_format_build

engine/
  executor
  preflight
  validation
  rollback_manager
  sandbox_outputs

queue/
  scheduler
  persistence
  retry_backoff
  task_state_machine

ui/
  task_center
  side_panel_chat
  command_palette
  notifications

security/
  safety_tiers
  policies
  redaction (screen mode)

storage/
  tasks_db
  artifacts_index
  settings_presets

Build stages (so it becomes “real” fast)

MVP (already feels like an OS feature)

Local daemon + Task Center
4 verbs: Optimize / Expand / Lookup / Chat
5–10 playbooks with safe outputs + rollback
Persistent queue + retries + receipts

v1 (daily driver)

Explorer/Finder integration
App adapters (browser/VS Code/terminal)
Preset manager + per-folder defaults
Better failure classification + fallbacks

v2 (true “zero disconnect”)

Optional screen awareness (local OCR/vision)
Suggested Next Actions (only high-confidence)
Autopilot for Green-tier tasks
Plugin SDK for adding playbooks

Next step for Curtision

The fastest way to make this feel like “Windows with AI” is to ship a rock-solid
Task Center + Playbooks first, then layer in context menus and app adapters.
Chat stays optional — your workflow stays click-driven.


Right-Click Intelligence: Building a Real OS Assistant (Without the Agent Chaos)