# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

AI Robot Core is a multi-tenant AI service platform providing intelligent chat, RAG (Retrieval-Augmented Generation) knowledge base, and LLM integration capabilities. The system consists of a Python FastAPI backend (ai-service) and a Vue.js admin frontend (ai-service-admin).

## Architecture

### Service Components

- **ai-service**: FastAPI backend (port 8080 internal, 8182 external)
  - Multi-tenant isolation via `X-Tenant-Id` header
  - SSE streaming support via `Accept: text/event-stream`
  - RAG-powered responses with confidence scoring
  - Intent-driven script flows and metadata governance

- **ai-service-admin**: Vue 3 + Element Plus frontend (port 80 internal, 8183 external)
  - Admin UI for knowledge base, LLM config, RAG experiments
  - Nginx reverse proxy to backend at `/api/*`

- **postgres**: PostgreSQL 15 database (port 5432)
  - Chat sessions, messages, knowledge base metadata
  - Multi-tenant data isolation

- **qdrant**: Vector database (port 6333)
  - Document embeddings and vector search
  - Collections prefixed with `kb_`

- **ollama**: Local embedding model service (port 11434)
  - Default: `nomic-embed-text` (768 dimensions)
  - Recommended: `toshk0/nomic-embed-text-v2-moe:Q6_K`

### Key Architectural Patterns

**Multi-Tenancy**: All data is scoped by `tenant_id`. The `TenantContextMiddleware` extracts the tenant from the `X-Tenant-Id` header and stores it in request state. Database queries must filter by `tenant_id`.
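
As a rough, framework-free sketch of the rule that middleware enforces (the names below are illustrative, not the actual ai-service API):

```python
# Illustrative only: the real TenantContextMiddleware is ASGI middleware,
# but the core rule is the same: read X-Tenant-Id, reject the request if
# it is absent, and make the tenant available to downstream handlers.

class MissingTenantError(Exception):
    """Raised when a request carries no X-Tenant-Id header."""


def resolve_tenant_id(headers: dict[str, str]) -> str:
    # HTTP header names are case-insensitive; normalize before lookup.
    normalized = {k.lower(): v for k, v in headers.items()}
    tenant_id = normalized.get("x-tenant-id")
    if not tenant_id:
        raise MissingTenantError("missing X-Tenant-Id header")
    return tenant_id


print(resolve_tenant_id({"X-Tenant-Id": "tenant-42"}))  # -> tenant-42
```

In the real service the resolved tenant lands on `request.state`, so every handler and query can scope by it without re-parsing headers.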
+ +**Service Layer Organization**: +- `app/services/llm/`: LLM provider adapters (OpenAI, DeepSeek, Ollama) +- `app/services/embedding/`: Embedding providers (OpenAI, Ollama, Nomic) +- `app/services/retrieval/`: Vector search and indexing +- `app/services/document/`: Document parsers (PDF, Word, Excel, Text) +- `app/services/flow/`: Intent-driven script flow engine +- `app/services/guardrail/`: Input/output filtering and safety +- `app/services/monitoring/`: Dashboard metrics and logging +- `app/services/mid/`: Mid-platform dialogue and session management + +**API Structure**: +- `/ai/chat`: Main chat endpoint (supports JSON and SSE streaming) +- `/ai/health`: Health check +- `/admin/*`: Admin endpoints for configuration and management +- `/mid/*`: Mid-platform API for dialogue sessions and messages + +**Configuration**: Uses `pydantic-settings` with `AI_SERVICE_` prefix. All settings in `app/core/config.py` can be overridden via environment variables. + +**Database**: SQLModel (SQLAlchemy + Pydantic) with async PostgreSQL. Entities in `app/models/entities.py`. Session factory in `app/core/database.py`. 
+ +## Development Commands + +### Backend (ai-service) + +```bash +cd ai-service + +# Install dependencies (development) +pip install -e ".[dev]" + +# Run development server +uvicorn app.main:app --reload --port 8000 + +# Run tests +pytest + +# Run specific test +pytest tests/test_confidence.py -v + +# Run tests with coverage +pytest --cov=app --cov-report=html + +# Lint with ruff +ruff check app/ +ruff format app/ + +# Type check with mypy +mypy app/ + +# Database migrations +python scripts/migrations/run_migration.py scripts/migrations/005_create_mid_tables.sql +``` + +### Frontend (ai-service-admin) + +```bash +cd ai-service-admin + +# Install dependencies +npm install + +# Run development server (port 5173) +npm run dev + +# Build for production +npm run build + +# Preview production build +npm run preview +``` + +### Docker Compose + +```bash +# Start all services +docker compose up -d + +# Rebuild and start +docker compose up -d --build + +# View logs +docker compose logs -f ai-service +docker compose logs -f ai-service-admin + +# Stop all services +docker compose down + +# Stop and remove volumes (clears data) +docker compose down -v + +# Pull embedding model in Ollama container +docker exec -it ai-ollama ollama pull toshk0/nomic-embed-text-v2-moe:Q6_K +``` + +## Important Conventions + +### Acceptance Criteria Traceability + +All code must reference AC (Acceptance Criteria) codes in docstrings and comments: +- Format: `[AC-AISVC-XX]` for ai-service +- Example: `[AC-AISVC-01] Centralized configuration` +- See `spec/contracting.md` for contract maturity levels (L0-L3) + +### OpenAPI Contract Management + +- Provider API: `openapi.provider.yaml` (APIs this module provides) +- Consumer API: `openapi.deps.yaml` (APIs this module depends on) +- Must declare `info.x-contract-level: L0|L1|L2|L3` +- L2 required before merging to main +- Run contract checks: `scripts/check-openapi-level.sh`, `scripts/check-openapi-diff.sh` + +### Multi-Tenant Data Access + +Always 
filter by `tenant_id` when querying the database:
```python
from app.core.tenant import get_current_tenant_id

tenant_id = get_current_tenant_id()
result = await session.exec(
    select(ChatSession).where(ChatSession.tenant_id == tenant_id)
)
```

### SSE Streaming Pattern

Check the `Accept` header for streaming mode:
```python
from app.core.sse import create_sse_response

if request.headers.get("accept") == "text/event-stream":
    return create_sse_response(event_generator())
else:
    return JSONResponse({"response": "..."})
```

### Error Handling

Use custom exceptions from `app/core/exceptions.py`:
```python
from app.core.exceptions import AIServiceException, ErrorCode

raise AIServiceException(
    code=ErrorCode.KNOWLEDGE_BASE_NOT_FOUND,
    message="Knowledge base not found",
    details={"kb_id": kb_id}
)
```

### Configuration Access

```python
from app.core.config import get_settings

settings = get_settings()  # Cached singleton
llm_model = settings.llm_model
```

### Embedding Configuration

Embedding config is persisted to `config/embedding_config.json` and loaded at startup. Use `app/services/embedding/factory.py` to get the configured provider.

## Testing Strategy

- Unit tests in `tests/test_*.py`
- Use `pytest-asyncio` for async tests
- Fixtures in `tests/conftest.py`
- Mock external services (LLM, Qdrant) in tests
- Integration tests require running services (use `docker compose up -d`)

## Session Handoff Protocol

For complex multi-phase tasks, use the Session Handoff Protocol (v2.0):
- Progress docs in `docs/[project]-progress.md`
- Must include: task overview, requirements reference, technical context, next steps
- See `~/.claude/specs/session-handoff-protocol-ai-ref.md` for details
- Stop proactively after 40-50 tool calls or phase completion

## Common Pitfalls

1. **Tenant Isolation**: Never query without a tenant_id filter
2. 
**Embedding Model Changes**: Switching models requires rebuilding all knowledge bases (vectors are incompatible)
3. **SSE Streaming**: Must yield events in the correct format: `data: {json}\n\n`
4. **API Key**: Backend auto-generates a default key on first startup (check logs)
5. **Long-Running Commands**: Don't use `--watch` modes in tests/dev servers via the bash tool
6. **Windows Paths**: Use forward slashes in code, even on Windows (Python handles conversion)

## Key Files

- [app/main.py](ai-service/app/main.py): FastAPI app entry point
- [app/core/config.py](ai-service/app/core/config.py): Configuration settings
- [app/models/entities.py](ai-service/app/models/entities.py): Database models
- [app/api/chat.py](ai-service/app/api/chat.py): Main chat endpoint
- [app/services/flow/engine.py](ai-service/app/services/flow/engine.py): Intent-driven flow engine
- [docker-compose.yaml](docker-compose.yaml): Service orchestration
- [spec/contracting.md](spec/contracting.md): Contract maturity rules