CURTISION • OS ASSISTANT • UI-FIRST AUTOMATION
The OS Assistant That Feels Like Windows
The goal isn’t “chat with an agent.” The goal is: right-click verbs, zero disconnect,
and Windows-grade reliability — background jobs, queues, retries, rollback, and receipts.
Big idea: Humans choose what action happens (“Optimize”, “Expand”, “Lookup”).
The assistant automates the thinking — picks the best preset and parameters — while deterministic playbooks handle execution.
That’s why it stays stable.
What we’re building (in one sentence)
A local-first OS layer where you can invoke an assistant anytime from a right-click menu or hotkey,
and it instantly understands your current context, runs safe automations in the background,
tracks failures, and always leaves clean outputs.
The core UX: verbs everywhere
- 🖱️ Context menu verbs in Explorer/Finder: Optimize • Expand • Lookup • Chat about this • Run preset • Task Center
- ⌨️ Command palette (Ctrl+Space): type a verb, hit Enter; no “prompt engineering” required
- 🧭 Task Center like a downloads manager: queued/running/done/failed, with logs + receipts + rollback
- 💬 Chat side panel is optional: it’s for “why?”, “explain”, “make a preset”, and “teach me”, and is never required to operate
Zero disconnect: how it knows what you’re doing
The assistant builds a Context Pack every time you invoke a verb, from the menu or the palette, so you never need to copy/paste or explain what you are working on.
The pack is layered for privacy and reliability:
- Selection context: selected files/folders/text, metadata, current directory.
- Active app adapters: browser URL/title/selection, VS Code file/selection, terminal cwd + last output (bounded).
- Optional screen awareness: active window capture + local OCR/vision for visible errors and UI state.
{
  "verb": "optimize",
  "selection": ["D:\\Downloads\\images\\…"],
  "active_app": "explorer.exe",
  "adapter": { "folder": "D:\\Downloads\\images" },
  "screen": { "enabled": false },
  "preset": "WebOptimizeImages",
  "safety": { "keep_originals": true, "two_phase_commit": true }
}
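A minimal sketch of how such a pack could be assembled, assuming a Python daemon. The field names mirror the sample above; the `ContextPack` and `Safety` classes, the `build_context_pack` helper, and the example file name `a.png` are hypothetical and only illustrate the shape of the data.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class Safety:
    # Defaults match the sample pack: originals are kept, commits are two-phase.
    keep_originals: bool = True
    two_phase_commit: bool = True


@dataclass
class ContextPack:
    verb: str
    selection: list
    active_app: str
    adapter: dict = field(default_factory=dict)
    # Screen awareness is optional, so it is off unless explicitly enabled.
    screen: dict = field(default_factory=lambda: {"enabled": False})
    preset: Optional[str] = None
    safety: Safety = field(default_factory=Safety)


def build_context_pack(verb, selection, active_app, adapter=None, preset=None):
    """Hypothetical helper: combine the selection and adapter layers into one pack."""
    if not selection:
        raise ValueError("a verb needs a non-empty selection")
    return ContextPack(
        verb=verb.lower(),
        selection=list(selection),
        active_app=active_app,
        adapter=dict(adapter or {}),
        preset=preset,
    )


pack = build_context_pack(
    "Optimize",
    [r"D:\Downloads\images\a.png"],  # illustrative path, not from the brief
    "explorer.exe",
    adapter={"folder": r"D:\Downloads\images"},
    preset="WebOptimizeImages",
)
print(json.dumps(asdict(pack), indent=2))
```

Because the pack is plain data, the same object can be stored with the task for the audit trail and replayed later.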
Why this beats fully-agentic automation
Fully agentic systems tend to fail because they must interpret ambiguous language, guess missing context,
and execute risky actions without tight guardrails.
This OS assistant stays stable because:
- Intent is explicit: you clicked “Optimize” — no guessing what you meant.
- Scope is bounded: it applies to your selection / current app.
- Execution is standardized: playbooks do the work (preflight, validation, rollback).
- Chat is optional: no constant corrections, no “watch the agent think.”
Full feature list (what the app includes)
1) UI & Integrations
- Explorer/Finder context menu: Optimize • Expand • Lookup • Chat about this • Run preset • Task Center
- Global command palette: verbs + quick search + suggestions
- Dockable side chat: context-aware, optional, “Create task” button from chat replies
- Notifications/toasts: started/done/failed + “open output folder”
- App adapters: browser / VS Code / terminal context extraction
- Optional screen awareness: capture + local OCR/vision (toggle + privacy controls)
2) Task Center & Job Queue
- Persistent queue: survives restarts
- Concurrency limits: per drive/app/type
- Priority: interactive vs background
- Pause/resume/cancel
- Retry/backoff with error classification
- Task receipts: logs, file lists, diffs, validation outputs
- Rollback: restore originals or undo changes
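The queue behaviour listed above can be sketched as a small state machine plus a retry loop. This is an illustration under stated assumptions, not the product's design: the state names beyond queued/running/done/failed, the choice of which exceptions count as transient, and the backoff constants are all hypothetical.

```python
import random
import time

# Allowed task transitions. queued/running/done/failed come from the Task Center;
# paused, cancelled and rolled_back are assumed names for the other listed features.
TRANSITIONS = {
    "queued": {"running", "cancelled"},
    "running": {"done", "failed", "paused", "cancelled"},
    "paused": {"queued", "cancelled"},
    "failed": {"queued", "rolled_back"},
    "done": {"rolled_back"},
    "cancelled": set(),
    "rolled_back": set(),
}

# Assumed error classification: these are worth retrying, everything else is not.
TRANSIENT = (TimeoutError, ConnectionError, BlockingIOError)


def advance(state, new_state):
    """Move a task to new_state, rejecting transitions the table does not allow."""
    if new_state not in TRANSITIONS[state]:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state


def classify(exc):
    return "transient" if isinstance(exc, TRANSIENT) else "permanent"


def backoff_delay(attempt, base=0.5, cap=30.0, jitter=random.random):
    """Exponential backoff, capped, scaled by a random factor in [0, 1)."""
    return min(cap, base * 2 ** attempt) * jitter()


def run_with_retries(job, max_attempts=4, sleep=time.sleep):
    """Run job(); retry transient errors with backoff, fail fast on permanent ones."""
    state = advance("queued", "running")
    for attempt in range(max_attempts):
        try:
            result = job()
            return advance(state, "done"), result
        except Exception as exc:
            last_try = attempt == max_attempts - 1
            if classify(exc) == "permanent" or last_try:
                return advance(state, "failed"), exc
            sleep(backoff_delay(attempt))
```

Passing `sleep` and `jitter` as parameters keeps the loop testable without real waiting; a persistent queue would additionally write each state change to disk before acting on it.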
3) Context Pack System
- Selection capture: files/folders/text + metadata
- Active app capture: adapter snapshots
- Optional screen pack: screenshot + extracted text
- User prefs & presets: embedded so actions become predictable
- Audit trail: every task stores the context pack used
4) Playbook Library (deterministic automation)
Playbooks are the “Windows copy dialog” equivalent: boring on purpose, dependable, and testable.
- Optimize: images (resize/compress/convert/dedupe), audio normalize/convert, video transcode presets
- Folder hygiene: rename clean, sort into structure, dedupe, archive, checksum reports
- Module packaging: create ZIP “module pack” + manifest + health check + README
- Dev workflows: format/lint/build/test; parse errors; generate patch suggestions
- Docs: expand text into outline/checklist/docs; extract action items
- Lookup: identify formats, parse logs, summarize error causes, fact-check (optional web)
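To make "boring on purpose" concrete, here is a rough sketch of the preflight, stage, validate, commit sequence a playbook might follow. The `run_playbook` function, its parameters, and the receipt fields are assumptions for illustration; the sketch also assumes unique file names within one run and does not handle a failure in the middle of the commit phase.

```python
import shutil
import tempfile
from pathlib import Path


def run_playbook(files, transform, validate, out_dir):
    """Preflight -> stage -> validate -> commit. Input files are never modified."""
    files = [Path(f) for f in files]
    out_dir = Path(out_dir)

    # Preflight: refuse to start if any input is missing.
    missing = [str(f) for f in files if not f.is_file()]
    if missing:
        raise FileNotFoundError(f"preflight failed, missing: {missing}")

    staging = Path(tempfile.mkdtemp(prefix="playbook_"))
    receipt = {"inputs": [str(f) for f in files], "outputs": [], "status": "failed"}
    try:
        # Phase 1: write every result into the staging folder and validate it there.
        staged = []
        for f in files:
            target = staging / f.name
            target.write_bytes(transform(f.read_bytes()))
            if not validate(target):
                raise ValueError(f"validation failed for {f.name}")
            staged.append(target)

        # Phase 2: only after everything validated, move results to the output folder.
        out_dir.mkdir(parents=True, exist_ok=True)
        for target in staged:
            final = out_dir / target.name
            shutil.move(str(target), str(final))
            receipt["outputs"].append(str(final))
        receipt["status"] = "done"
    except ValueError as exc:
        # Validation failure: nothing was committed, so rollback is just cleanup.
        receipt["error"] = str(exc)
    finally:
        shutil.rmtree(staging, ignore_errors=True)
    return receipt
```

A call such as `run_playbook(paths, bytes.upper, lambda p: p.stat().st_size > 0, "out")` uses a trivial stand-in transform; a real Optimize playbook would plug an image or audio converter into the same skeleton.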
5) Presets & Rules (so you never reconfigure)
- Preset manager: “Web Optimize Images”, “Archive Lossless”, “Game Project Packager”
- Per-folder defaults: auto-pick preset based on path pattern
- Confidence gating: green tasks auto-run; yellow tasks get a dry-run first; red tasks require confirmation
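Per-folder defaults and confidence gating could look roughly like the sketch below. The path patterns, the `pick_preset` and `gate` helpers, and the 0.9 / 0.6 thresholds are hypothetical; only the preset names and the green/yellow/red behaviour come from the list above.

```python
from fnmatch import fnmatchcase

# Hypothetical per-folder rules: the first matching pattern wins.
FOLDER_RULES = [
    ("*/Downloads/images/*", "Web Optimize Images"),
    ("*/Archive/*", "Archive Lossless"),
    ("*/GameProjects/*", "Game Project Packager"),
]

# What each tier does, as described in the confidence-gating bullet.
ACTIONS = {"green": "auto_run", "yellow": "dry_run", "red": "confirm"}


def pick_preset(path, rules=FOLDER_RULES, default=None):
    """Return the preset for the first pattern matching the path, else default."""
    normalized = path.replace("\\", "/")  # treat Windows and POSIX paths alike
    for pattern, preset in rules:
        if fnmatchcase(normalized, pattern):
            return preset
    return default


def gate(confidence, green=0.9, yellow=0.6):
    """Map a confidence score in [0, 1] to a safety tier (thresholds are assumed)."""
    if confidence >= green:
        return "green"
    if confidence >= yellow:
        return "yellow"
    return "red"


preset = pick_preset(r"D:\Downloads\images\cat.png")
tier = gate(0.95)
print(preset, tier, ACTIONS[tier])
```

Keeping the rules as ordered data rather than code means a preset manager UI can edit them without touching the router.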
6) Learning (optional, practical, not creepy)
- Episode memory: “when I click Optimize here, I usually want WebP 1920px quality 82”
- Outcome scoring: faster completion, fewer retries, user satisfaction
- Personal defaults: improves preset selection over time
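One simple way to read "episode memory" is a frequency count of the parameters chosen for each verb and folder. The sketch below assumes exactly that; the `EpisodeMemory` class and the `min_episodes` threshold are hypothetical, and a real system might weight recency or outcome scores as well.

```python
from collections import Counter, defaultdict


class EpisodeMemory:
    """Remember which parameters were used per (verb, folder) and suggest the usual ones."""

    def __init__(self, min_episodes=3):
        # Do not suggest anything until a choice has been seen this many times.
        self.min_episodes = min_episodes
        self._history = defaultdict(Counter)

    def record(self, verb, folder, params):
        # Parameters are stored as a sorted tuple so identical choices count together.
        self._history[(verb, folder)][tuple(sorted(params.items()))] += 1

    def suggest(self, verb, folder):
        counts = self._history.get((verb, folder))
        if not counts:
            return None
        params, seen = counts.most_common(1)[0]
        return dict(params) if seen >= self.min_episodes else None


memory = EpisodeMemory(min_episodes=2)
usual = {"format": "webp", "max_px": 1920, "quality": 82}
memory.record("optimize", "images", usual)
memory.record("optimize", "images", usual)
print(memory.suggest("optimize", "images"))
```

Returning `None` until the threshold is met keeps the feature conservative: the assistant falls back to the folder preset rather than guessing from a single episode.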
Structure (what the app is made of)
Runtime components
- Shell integration (thin): adds the context menu and passes the context pack to the daemon
- Local daemon (the brain + reliability): queue + playbooks + safety enforcement
- UI client: Task Center + side chat + palette + notifications
Internal module layout
selection_reader
app_adapters/ (browser, vscode, terminal)
screen_capture (optional)
context_pack_builder
router/
  rules_engine
  preset_picker
  ai_classifier (optional)
playbooks/
  optimize_images
  pack_module_zip
  rename_clean
  dedupe_folder
  summarize_logs
  project_format_build
engine/
  executor
  preflight
  validation
  rollback_manager
  sandbox_outputs
queue/
  scheduler
  persistence
  retry_backoff
  task_state_machine
ui/
  task_center
  side_panel_chat
  command_palette
  notifications
security/
  safety_tiers
  policies
  redaction (screen mode)
storage/
  tasks_db
  artifacts_index
  settings_presets
Build stages (so it becomes “real” fast)
MVP (already feels like an OS feature)
- Local daemon + Task Center
- 4 verbs: Optimize / Expand / Lookup / Chat
- 5–10 playbooks with safe outputs + rollback
- Persistent queue + retries + receipts
v1 (daily driver)
- Explorer/Finder integration
- App adapters (browser/VS Code/terminal)
- Preset manager + per-folder defaults
- Better failure classification + fallbacks
v2 (true “zero disconnect”)
- Optional screen awareness (local OCR/vision)
- Suggested Next Actions (only high-confidence)
- Autopilot for Green-tier tasks
- Plugin SDK for adding playbooks
Next step for Curtision
The fastest way to make this feel like “Windows with AI” is to ship a rock-solid
Task Center + Playbooks first, then layer in context menus and app adapters.
Chat stays optional — your workflow stays click-driven.