# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
AI Robot Core is a multi-tenant AI service platform providing intelligent chat, RAG (Retrieval-Augmented Generation) knowledge base, and LLM integration capabilities. The system consists of a Python FastAPI backend (ai-service) and a Vue.js admin frontend (ai-service-admin).
## Architecture

### Service Components
- **ai-service**: FastAPI backend (port 8080 internal, 8182 external)
  - Multi-tenant isolation via `X-Tenant-Id` header
  - SSE streaming support via `Accept: text/event-stream`
  - RAG-powered responses with confidence scoring
  - Intent-driven script flows and metadata governance
- **ai-service-admin**: Vue 3 + Element Plus frontend (port 80 internal, 8183 external)
  - Admin UI for knowledge base, LLM config, RAG experiments
  - Nginx reverse proxy to backend at `/api/*`
- **postgres**: PostgreSQL 15 database (port 5432)
  - Chat sessions, messages, knowledge base metadata
  - Multi-tenant data isolation
- **qdrant**: Vector database (port 6333)
  - Document embeddings and vector search
  - Collections prefixed with `kb_`
- **ollama**: Local embedding model service (port 11434)
  - Default: `nomic-embed-text` (768 dimensions)
  - Recommended: `toshk0/nomic-embed-text-v2-moe:Q6_K`
### Key Architectural Patterns
**Multi-Tenancy**: All data is scoped by `tenant_id`. The `TenantContextMiddleware` extracts the tenant from the `X-Tenant-Id` header and stores it in request state. Database queries must filter by `tenant_id`.
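As a minimal, framework-free illustration of that pattern, the middleware below shows tenant extraction in plain ASGI terms. This is a sketch only — the repository's actual `TenantContextMiddleware` lives in the backend code, and the exact attribute names are assumptions:

```python
# Minimal ASGI-style sketch of tenant extraction. The real
# TenantContextMiddleware in the backend may differ in detail.
class TenantContextMiddleware:
    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        # ASGI headers are (bytes, bytes) pairs; normalize keys to lowercase
        headers = {k.decode().lower(): v.decode() for k, v in scope.get("headers", [])}
        tenant_id = headers.get("x-tenant-id")
        # Downstream handlers read the tenant from request state
        scope.setdefault("state", {})["tenant_id"] = tenant_id
        await self.app(scope, receive, send)
```

Handlers then use the stored tenant to scope every database query, as shown in the Multi-Tenant Data Access section.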
**Service Layer Organization**:
- `app/services/llm/`: LLM provider adapters (OpenAI, DeepSeek, Ollama)
- `app/services/embedding/`: Embedding providers (OpenAI, Ollama, Nomic)
- `app/services/retrieval/`: Vector search and indexing
- `app/services/document/`: Document parsers (PDF, Word, Excel, Text)
- `app/services/flow/`: Intent-driven script flow engine
- `app/services/guardrail/`: Input/output filtering and safety
- `app/services/monitoring/`: Dashboard metrics and logging
- `app/services/mid/`: Mid-platform dialogue and session management
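The provider directories above follow an adapter pattern: a shared interface with one subclass per provider. A hedged sketch of what that shape typically looks like (class and method names here are assumptions, not the repository's actual API):

```python
# Sketch of the adapter pattern behind app/services/llm/ — a common base
# interface with per-provider subclasses. Names are illustrative only.
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    @abstractmethod
    async def chat(self, messages: list[dict]) -> str:
        """Return the assistant reply for a list of chat messages."""


class EchoProvider(LLMProvider):
    """Stand-in provider; a real adapter would call OpenAI/DeepSeek/Ollama."""

    async def chat(self, messages: list[dict]) -> str:
        return f"echo: {messages[-1]['content']}"
```

Swapping providers then means constructing a different subclass behind the same `chat()` call site.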
**API Structure**:
- `/ai/chat`: Main chat endpoint (supports JSON and SSE streaming)
- `/ai/health`: Health check
- `/admin/*`: Admin endpoints for configuration and management
- `/mid/*`: Mid-platform API for dialogue sessions and messages
**Configuration**: Uses pydantic-settings with the `AI_SERVICE_` prefix. All settings in `app/core/config.py` can be overridden via environment variables.
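The prefix means each settings field maps to an upper-cased `AI_SERVICE_*` environment variable. A stdlib-only sketch of that mapping (pydantic-settings does this automatically; the field names here are hypothetical, not taken from `app/core/config.py`):

```python
# Stdlib illustration of the AI_SERVICE_ env-var override that
# pydantic-settings performs. Field names are hypothetical.
import os
from dataclasses import dataclass


@dataclass
class Settings:
    llm_model: str = "default-model"
    llm_temperature: float = 0.7


def load_settings() -> Settings:
    # AI_SERVICE_LLM_MODEL overrides Settings.llm_model, and so on.
    return Settings(
        llm_model=os.environ.get("AI_SERVICE_LLM_MODEL", "default-model"),
        llm_temperature=float(os.environ.get("AI_SERVICE_LLM_TEMPERATURE", "0.7")),
    )
```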
**Database**: SQLModel (SQLAlchemy + Pydantic) with async PostgreSQL. Entities in `app/models/entities.py`. Session factory in `app/core/database.py`.
## Development Commands

### Backend (ai-service)
```bash
cd ai-service

# Install dependencies (development)
pip install -e ".[dev]"

# Run development server
uvicorn app.main:app --reload --port 8000

# Run tests
pytest

# Run a specific test
pytest tests/test_confidence.py -v

# Run tests with coverage
pytest --cov=app --cov-report=html

# Lint with ruff
ruff check app/
ruff format app/

# Type check with mypy
mypy app/

# Database migrations
python scripts/migrations/run_migration.py scripts/migrations/005_create_mid_tables.sql
```
### Frontend (ai-service-admin)
```bash
cd ai-service-admin

# Install dependencies
npm install

# Run development server (port 5173)
npm run dev

# Build for production
npm run build

# Preview production build
npm run preview
```
### Docker Compose
```bash
# Start all services
docker compose up -d

# Rebuild and start
docker compose up -d --build

# View logs
docker compose logs -f ai-service
docker compose logs -f ai-service-admin

# Stop all services
docker compose down

# Stop and remove volumes (clears data)
docker compose down -v

# Pull the embedding model inside the Ollama container
docker exec -it ai-ollama ollama pull toshk0/nomic-embed-text-v2-moe:Q6_K
```
## Important Conventions

### Acceptance Criteria Traceability
All code must reference AC (Acceptance Criteria) codes in docstrings and comments:
- Format: `[AC-AISVC-XX]` for ai-service
- Example: `[AC-AISVC-01] Centralized configuration`
- See `spec/contracting.md` for contract maturity levels (L0-L3)
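For illustration, an AC reference in a docstring looks like this (the function name is hypothetical; the AC number matches the example above):

```python
# Illustrative only: how an AC code appears in a docstring so tooling and
# reviewers can trace code back to its acceptance criterion.
def get_settings():
    """Return the cached settings singleton.

    [AC-AISVC-01] Centralized configuration
    """
```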
### OpenAPI Contract Management
- Provider API: `openapi.provider.yaml` (APIs this module provides)
- Consumer API: `openapi.deps.yaml` (APIs this module depends on)
- Must declare `info.x-contract-level: L0|L1|L2|L3`
- L2 required before merging to main
- Run contract checks: `scripts/check-openapi-level.sh`, `scripts/check-openapi-diff.sh`
### Multi-Tenant Data Access

Always filter by `tenant_id` when querying the database:
```python
from sqlmodel import select

from app.core.tenant import get_current_tenant_id
from app.models.entities import ChatSession

tenant_id = get_current_tenant_id()
result = await session.exec(
    select(ChatSession).where(ChatSession.tenant_id == tenant_id)
)
```
### SSE Streaming Pattern

Check the `Accept` header for streaming mode:
```python
from fastapi.responses import JSONResponse

from app.core.sse import create_sse_response

if request.headers.get("accept") == "text/event-stream":
    return create_sse_response(event_generator())
else:
    return JSONResponse({"response": "..."})
```
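A sketch of what an `event_generator()` might yield — each SSE event must be a `data: {json}\n\n` frame (see Common Pitfalls). The `delta` field and `[DONE]` sentinel here are assumptions for illustration, not necessarily this service's exact wire format:

```python
# Sketch of an SSE event generator producing data: {json}\n\n frames.
# Payload shape and the [DONE] sentinel are illustrative assumptions.
import json


async def event_generator():
    for chunk in ("Hello", ", ", "world"):
        yield f"data: {json.dumps({'delta': chunk})}\n\n"
    yield "data: [DONE]\n\n"
```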
### Error Handling

Use custom exceptions from `app/core/exceptions.py`:
```python
from app.core.exceptions import AIServiceException, ErrorCode

raise AIServiceException(
    code=ErrorCode.KNOWLEDGE_BASE_NOT_FOUND,
    message="Knowledge base not found",
    details={"kb_id": kb_id},
)
```
### Configuration Access
```python
from app.core.config import get_settings

settings = get_settings()  # Cached singleton
llm_model = settings.llm_model
```
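The "cached singleton" behavior is typically achieved with `functools.lru_cache`: repeated calls return the same object, so settings are parsed once per process. A sketch under that assumption (the real `get_settings` is in `app/core/config.py`):

```python
# Why get_settings() is a cached singleton: lru_cache memoizes the first
# call, so every later call returns the identical Settings object.
from functools import lru_cache


class Settings:
    pass  # stand-in for the real pydantic-settings class


@lru_cache(maxsize=1)
def get_settings() -> Settings:
    return Settings()  # constructed once, then reused
```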
### Embedding Configuration

Embedding config is persisted to `config/embedding_config.json` and loaded at startup. Use `app/services/embedding/factory.py` to get the configured provider.
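A hedged sketch of the load-with-fallback step that startup implies. The real factory is `app/services/embedding/factory.py`; the JSON keys and the fallback values (taken from the defaults documented above) are assumptions:

```python
# Sketch of loading the persisted embedding config with documented defaults
# as fallback. Keys "provider"/"model" are assumptions, not the real schema.
import json
from pathlib import Path


def load_embedding_config(path: str = "config/embedding_config.json") -> dict:
    p = Path(path)
    if not p.exists():
        # Fall back to the defaults noted in Service Components above
        return {"provider": "ollama", "model": "nomic-embed-text"}
    return json.loads(p.read_text(encoding="utf-8"))
```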
## Testing Strategy
- Unit tests in `tests/test_*.py`
- Use `pytest-asyncio` for async tests
- Fixtures in `tests/conftest.py`
- Mock external services (LLM, Qdrant) in tests
- Integration tests require running services (use `docker compose up -d`)
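A minimal sketch of mocking an external LLM dependency in a test, using the standard library's `AsyncMock` (the `answer` function and its injected client are hypothetical examples, not code from this repo):

```python
# Sketch: mock an async LLM client so the unit test never hits the network.
import asyncio
from unittest.mock import AsyncMock


async def answer(llm) -> str:
    # Hypothetical code under test: delegates to the injected LLM client
    return await llm.chat([{"role": "user", "content": "hi"}])


def test_answer_with_mocked_llm():
    llm = AsyncMock()
    llm.chat.return_value = "mocked reply"
    assert asyncio.run(answer(llm)) == "mocked reply"
    llm.chat.assert_awaited_once()
```

With pytest-asyncio, the test itself could instead be declared `async` and awaited directly by the plugin.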
## Session Handoff Protocol

For complex multi-phase tasks, use the Session Handoff Protocol (v2.0):
- Progress docs in `docs/prog[project]-progress.md`
- Must include: task overview, requirements reference, technical context, next steps
- See `~/.claude/specs/session-handoff-protocol-ai-ref.md` for details
- Stop proactively after 40-50 tool calls or phase completion
## Common Pitfalls
- **Tenant Isolation**: Never query without a `tenant_id` filter
- **Embedding Model Changes**: Switching models requires rebuilding all knowledge bases (vectors are incompatible)
- **SSE Streaming**: Must yield events in the correct format: `data: {json}\n\n`
- **API Key**: Backend auto-generates a default key on first startup (check logs)
- **Long-Running Commands**: Don't use `--watch` modes in tests/dev servers via the bash tool
- **Windows Paths**: Use forward slashes in code, even on Windows (Python handles conversion)
## Key Files
- `app/main.py`: FastAPI app entry point
- `app/core/config.py`: Configuration settings
- `app/models/entities.py`: Database models
- `app/api/chat.py`: Main chat endpoint
- `app/services/flow/engine.py`: Intent-driven flow engine
- `docker-compose.yaml`: Service orchestration
- `spec/contracting.md`: Contract maturity rules