Nexus Agent Engine

LLM agent engine focused on reliability and extensibility

Detailed Project 2024–2025
Python · FastAPI · SQLAlchemy · Alembic · PostgreSQL · Redis · Celery · Docker Compose · OpenAI · Ollama · SearXNG · Tavily

Description

Nexus is a production-ready agent engine with a FastAPI backend and a modular core. It unifies provider semantics (OpenAI, Ollama) behind consistent adapters, streams tokens via SSE for responsive UIs, and manages context windows with trimming and summarization to fit model limits. Reliability comes from structured errors, retries, and caching. The platform exposes REST APIs for users, threads, agent runs, tools, and scheduled tasks, backed by Docker Compose services (PostgreSQL, Redis, SearXNG) for a complete dev stack.
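
To make the context-window trimming mentioned above concrete, here is a minimal sketch. It is not taken from the Nexus codebase: the `Message` type, `estimate_tokens`, and `trim_to_budget` are hypothetical names, and the four-characters-per-token estimate is a placeholder assumption for a real provider tokenizer. It shows only the trimming half; summarizing the dropped history is left out.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a provider tokenizer (assumed ~4 characters per token)."""
    return max(1, len(text) // 4)


def trim_to_budget(messages: list[Message], budget: int) -> list[Message]:
    """Keep all system messages, then the newest other messages that still fit."""
    system = [m for m in messages if m.role == "system"]
    remaining = budget - sum(estimate_tokens(m.content) for m in system)

    kept: list[Message] = []
    # Walk the conversation from newest to oldest and stop at the first
    # message that no longer fits, so the kept history stays contiguous.
    for msg in reversed([m for m in messages if m.role != "system"]):
        cost = estimate_tokens(msg.content)
        if cost > remaining:
            break
        kept.append(msg)
        remaining -= cost

    # Restore chronological order for the kept tail.
    return system + kept[::-1]
```

Dropping from the oldest end keeps the most recent turns intact; a fuller version would probably replace the dropped prefix with a summary message rather than discard it.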

Key Features

  • API — health, users, threads, agent runs, tools, and scheduler endpoints form a clean contract for frontend or CLI clients.
  • Providers — OpenAI/Ollama adapters are switchable via .env so deployments can swap models without code changes.
  • Streaming — SSE token streaming delivers partial outputs and events for responsive, progressively rendered experiences.
  • Context — trimming and summarization keep prompts within model windows while preserving task‑relevant history.
  • Tokens — provider‑specific counting/accounting enables performance tracking and cost governance across workloads.
  • Caching — cache keys with hit/miss stats reduce repeated calls and provide token savings reports.
  • Tools — web search (SearXNG/Tavily), browser, calculator, and smart scraper tools integrate behind a common tool interface.
  • Scheduler — Celery‑backed reminders and scheduled jobs run reliably with retries and status introspection.
  • Sandbox — isolated code execution with configurable drivers for safe, inspectable tool runs.
  • Examples — openai_simple, agent_usage, and smart_scraper_examples demonstrate core patterns and best practices.
  • Tests — unit, integration, and end-to-end suites with performance assertions catch regressions early.
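
As an illustration of the caching feature listed above, the following sketch derives a deterministic cache key from a request and tracks hit/miss counts. All names (`cache_key`, `CompletionCache`, `get_or_compute`) are hypothetical, and an in-memory dict stands in for whatever backend the project actually uses (Redis would be a natural fit given the stack, but that is an assumption).

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Callable


def cache_key(provider: str, model: str, messages: list[dict]) -> str:
    """Hash the request fields that determine the completion."""
    payload = json.dumps(
        {"provider": provider, "model": model, "messages": messages},
        sort_keys=True,  # stable key regardless of dict ordering
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class CompletionCache:
    """In-memory cache with hit/miss counters (a dict stands in for the real store)."""
    store: dict[str, str] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0

    def get_or_compute(self, key: str, compute: Callable[[], str]) -> str:
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        value = compute()  # only called on a miss, e.g. the provider request
        self.store[key] = value
        return value

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A token savings report could be built on top of this by recording the token count of each cached response and summing it on every hit.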

Challenges and Solutions

  • Provider semantics — unify SDKs/models and streaming behaviors to keep agent logic consistent across backends.
  • Streaming reliability — SSE delivery with heartbeats and graceful completion under network churn.
  • Token accuracy — consistent counting and cache invalidation across providers for trustworthy metrics.
  • Concurrency — thread/message lifecycle guarded by DB transactions and backpressure to avoid race conditions.
  • Scheduling — coordinate Celery/Redis tasks with durable state and retry semantics.
  • Search integration — SearXNG/Tavily setup with sane rate limits and safety filtering.
  • Migrations — Alembic schema evolution across environments without breaking running services.
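
The streaming reliability item above (heartbeats plus graceful completion) can be sketched with plain asyncio. This is an assumption-laden illustration, not the project's implementation: `sse_event`, `stream_with_heartbeats`, the `token` and `done` event names, and the 15-second default interval are all hypothetical. The heartbeat is an SSE comment line (a line starting with `:`), which clients ignore but which keeps idle connections open.

```python
import asyncio
import json
from collections.abc import AsyncIterator

_DONE = object()  # sentinel marking the end of the upstream token stream


def sse_event(event: str, data: dict) -> str:
    """Format one Server-Sent Event frame."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


async def _next_or_done(iterator: AsyncIterator[str]) -> object:
    """Return the next token, or the sentinel when the stream is exhausted."""
    try:
        return await iterator.__anext__()
    except StopAsyncIteration:
        return _DONE


async def stream_with_heartbeats(
    tokens: AsyncIterator[str], interval: float = 15.0
) -> AsyncIterator[str]:
    """Yield SSE frames for tokens, with comment heartbeats while upstream is idle."""
    iterator = tokens.__aiter__()
    pending: asyncio.Task | None = None
    try:
        while True:
            if pending is None:
                pending = asyncio.create_task(_next_or_done(iterator))
            # Wait without cancelling the fetch, so a timeout never loses a token.
            done, _ = await asyncio.wait({pending}, timeout=interval)
            if not done:
                yield ": heartbeat\n\n"
                continue
            result, pending = pending.result(), None
            if result is _DONE:
                break
            yield sse_event("token", {"text": result})
        # Explicit completion event so clients can tell "finished" from "dropped".
        yield sse_event("done", {})
    finally:
        if pending is not None:
            pending.cancel()  # client went away mid-wait
```

In a FastAPI app, a generator like this could be passed to `StreamingResponse` with `media_type="text/event-stream"`.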