Technical architecture and AI pipeline.
Central service in Node.js/TypeScript with WebSocket connections. Lightweight agent process on customer servers executes SSH-constrained commands and streams structured results back. PostgreSQL for audit logs and incident history. Hosted at app.mttrly.com.
Two-model architecture. Claude Haiku handles real-time triage, intent parsing, and Telegram message formatting — optimized for sub-second response. Claude Sonnet handles multi-step incident diagnosis, log interpretation, and remediation planning — optimized for reasoning quality.
The split reduces cost and latency while maintaining quality on reasoning-intensive tasks.
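The routing described above can be sketched as a static lookup from task type to model tier. This is a minimal illustration, not the service's actual code: the names TaskKind, ModelTier, MODEL_FOR_TASK, and pickModel are hypothetical, and the task list simply mirrors the responsibilities named in the text.

```typescript
// Hypothetical task kinds, taken from the responsibilities listed in the text.
type TaskKind =
  | "triage"
  | "intent_parsing"
  | "message_formatting"
  | "incident_diagnosis"
  | "log_interpretation"
  | "remediation_planning";

// Assumed tier labels; real model identifiers would be configured elsewhere.
type ModelTier = "haiku" | "sonnet";

// Latency-sensitive tasks go to the fast tier, reasoning-heavy tasks to the stronger tier.
const MODEL_FOR_TASK: Record<TaskKind, ModelTier> = {
  triage: "haiku",
  intent_parsing: "haiku",
  message_formatting: "haiku",
  incident_diagnosis: "sonnet",
  log_interpretation: "sonnet",
  remediation_planning: "sonnet",
};

// Returns the tier for a task; the Record type forces every TaskKind to be mapped.
function pickModel(kind: TaskKind): ModelTier {
  return MODEL_FOR_TASK[kind];
}
```

Keeping the mapping in a typed table means that adding a new task kind fails to compile until someone decides which tier it belongs to.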
Every design decision assumes the AI can be wrong.
The AI can only suggest commands from an operator-defined allowlist per workspace. No arbitrary code execution.
Restart, rollback, config changes, secret rotation — none execute without explicit Telegram approval from the workspace owner.
Every action is logged with the complete model input, model output, and execution result, so you can see exactly why the AI recommended each step.
When the model lacks sufficient context, it escalates to human review instead of attempting a low-confidence fix.
Diagnostic commands run automatically. State-changing commands require approval. This split is enforced at the architecture level, not the prompt level.
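One way to read "enforced at architecture level" is that the decision to run a command lives in ordinary code that the model cannot override. The sketch below shows such a gate under stated assumptions: AllowedCommand, Suggestion, Decision, and decide are hypothetical names, the sufficientContext flag stands in for however the system detects missing context, and the ordering of the checks is a guess rather than the product's documented behavior.

```typescript
type CommandKind = "diagnostic" | "state_changing";

// One entry in the operator-defined allowlist for a workspace (hypothetical shape).
interface AllowedCommand {
  id: string;
  kind: CommandKind;
}

// What the model proposes (hypothetical shape).
interface Suggestion {
  commandId: string;
  sufficientContext: boolean; // assumed signal that the model had enough context
}

type Decision =
  | { action: "run" }
  | { action: "await_approval" }
  | { action: "escalate"; reason: string }
  | { action: "reject"; reason: string };

// Decides what happens to a suggested command. The model only supplies the
// suggestion; every branch here is plain code outside the prompt.
function decide(
  suggestion: Suggestion,
  allowlist: ReadonlyMap<string, AllowedCommand>,
  approvedByOwner: boolean,
): Decision {
  const cmd = allowlist.get(suggestion.commandId);
  if (!cmd) {
    // Anything not on the allowlist is refused outright.
    return { action: "reject", reason: "command not in workspace allowlist" };
  }
  if (cmd.kind === "diagnostic") {
    // Diagnostic commands run automatically.
    return { action: "run" };
  }
  if (!suggestion.sufficientContext) {
    // Low-confidence fixes go to a human instead of being attempted.
    return { action: "escalate", reason: "insufficient context for a fix" };
  }
  // State-changing commands wait for explicit owner approval.
  return approvedByOwner ? { action: "run" } : { action: "await_approval" };
}
```

In this sketch a state-changing command such as a restart returns "await_approval" until the owner approves, while a command missing from the allowlist is rejected regardless of approval.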
19+ built-in validated operations: