docs: add CLAUDE.md for project guidance

This commit is contained in:
MerCry 2026-03-05 20:36:41 +08:00
parent e69fab7bb2
commit 9c40509225
1 changed file with 233 additions and 0 deletions

CLAUDE.md Normal file

@@ -0,0 +1,233 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
AI Robot Core is a multi-tenant AI service platform providing intelligent chat, RAG (Retrieval-Augmented Generation) knowledge base, and LLM integration capabilities. The system consists of a Python FastAPI backend (ai-service) and a Vue.js admin frontend (ai-service-admin).
## Architecture
### Service Components
- **ai-service**: FastAPI backend (port 8080 internal, 8182 external)
- Multi-tenant isolation via `X-Tenant-Id` header
- SSE streaming support via `Accept: text/event-stream`
- RAG-powered responses with confidence scoring
- Intent-driven script flows and metadata governance
- **ai-service-admin**: Vue 3 + Element Plus frontend (port 80 internal, 8183 external)
- Admin UI for knowledge base, LLM config, RAG experiments
- Nginx reverse proxy to backend at `/api/*`
- **postgres**: PostgreSQL 15 database (port 5432)
- Chat sessions, messages, knowledge base metadata
- Multi-tenant data isolation
- **qdrant**: Vector database (port 6333)
- Document embeddings and vector search
- Collections prefixed with `kb_`
- **ollama**: Local embedding model service (port 11434)
- Default: `nomic-embed-text` (768 dimensions)
- Recommended: `toshk0/nomic-embed-text-v2-moe:Q6_K`
### Key Architectural Patterns
**Multi-Tenancy**: All data is scoped by `tenant_id`. The `TenantContextMiddleware` extracts the tenant from the `X-Tenant-Id` header and stores it in request state. Database queries must filter by `tenant_id`.
**Service Layer Organization**:
- `app/services/llm/`: LLM provider adapters (OpenAI, DeepSeek, Ollama)
- `app/services/embedding/`: Embedding providers (OpenAI, Ollama, Nomic)
- `app/services/retrieval/`: Vector search and indexing
- `app/services/document/`: Document parsers (PDF, Word, Excel, Text)
- `app/services/flow/`: Intent-driven script flow engine
- `app/services/guardrail/`: Input/output filtering and safety
- `app/services/monitoring/`: Dashboard metrics and logging
- `app/services/mid/`: Mid-platform dialogue and session management
**API Structure**:
- `/ai/chat`: Main chat endpoint (supports JSON and SSE streaming)
- `/ai/health`: Health check
- `/admin/*`: Admin endpoints for configuration and management
- `/mid/*`: Mid-platform API for dialogue sessions and messages
**Configuration**: Uses `pydantic-settings` with `AI_SERVICE_` prefix. All settings in `app/core/config.py` can be overridden via environment variables.
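To illustrate the `AI_SERVICE_` prefix convention without depending on `pydantic-settings`, here is a stdlib-only stand-in; the field names and defaults are placeholders (the real fields live in `app/core/config.py`):

```python
import os
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class Settings:
    # Placeholder defaults; any field can be overridden via
    # an AI_SERVICE_<FIELD_NAME> environment variable.
    llm_model: str = "placeholder-model"
    qdrant_url: str = "http://qdrant:6333"

def load_settings() -> Settings:
    """Mimic pydantic-settings' env_prefix="AI_SERVICE_" behaviour."""
    overrides = {}
    for f in fields(Settings):
        env_key = "AI_SERVICE_" + f.name.upper()
        if env_key in os.environ:
            overrides[f.name] = os.environ[env_key]
    return Settings(**overrides)
```

So exporting `AI_SERVICE_LLM_MODEL=my-model` overrides `llm_model` while untouched fields keep their defaults.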
**Database**: SQLModel (SQLAlchemy + Pydantic) with async PostgreSQL. Entities in `app/models/entities.py`. Session factory in `app/core/database.py`.
## Development Commands
### Backend (ai-service)
```bash
cd ai-service
# Install dependencies (development)
pip install -e ".[dev]"
# Run development server
uvicorn app.main:app --reload --port 8000
# Run tests
pytest
# Run specific test
pytest tests/test_confidence.py -v
# Run tests with coverage
pytest --cov=app --cov-report=html
# Lint with ruff
ruff check app/
ruff format app/
# Type check with mypy
mypy app/
# Database migrations
python scripts/migrations/run_migration.py scripts/migrations/005_create_mid_tables.sql
```
### Frontend (ai-service-admin)
```bash
cd ai-service-admin
# Install dependencies
npm install
# Run development server (port 5173)
npm run dev
# Build for production
npm run build
# Preview production build
npm run preview
```
### Docker Compose
```bash
# Start all services
docker compose up -d
# Rebuild and start
docker compose up -d --build
# View logs
docker compose logs -f ai-service
docker compose logs -f ai-service-admin
# Stop all services
docker compose down
# Stop and remove volumes (clears data)
docker compose down -v
# Pull embedding model in Ollama container
docker exec -it ai-ollama ollama pull toshk0/nomic-embed-text-v2-moe:Q6_K
```
## Important Conventions
### Acceptance Criteria Traceability
All code must reference AC (Acceptance Criteria) codes in docstrings and comments:
- Format: `[AC-AISVC-XX]` for ai-service
- Example: `[AC-AISVC-01] Centralized configuration`
- See `spec/contracting.md` for contract maturity levels (L0-L3)
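A docstring carrying an AC reference might look like the following; the function and its threshold are hypothetical, and the AC code is reused from the example above:

```python
def confidence_label(score: float) -> str:
    """Map a RAG confidence score to a display label.

    [AC-AISVC-01] Centralized configuration
    (hypothetical function; the 0.8 cutoff is illustrative only)
    """
    return "high" if score >= 0.8 else "low"
```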
### OpenAPI Contract Management
- Provider API: `openapi.provider.yaml` (APIs this module provides)
- Consumer API: `openapi.deps.yaml` (APIs this module depends on)
- Must declare `info.x-contract-level: L0|L1|L2|L3`
- L2 required before merging to main
- Run contract checks: `scripts/check-openapi-level.sh`, `scripts/check-openapi-diff.sh`
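A minimal `info` block declaring the contract level might look like this (title and version are placeholders):

```yaml
info:
  title: ai-service Provider API
  version: 0.1.0
  x-contract-level: L2   # must be L2 or higher before merging to main
```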
### Multi-Tenant Data Access
Always filter by `tenant_id` when querying the database:
```python
from app.core.tenant import get_current_tenant_id
tenant_id = get_current_tenant_id()
result = await session.exec(
select(ChatSession).where(ChatSession.tenant_id == tenant_id)
)
```
### SSE Streaming Pattern
Check `Accept` header for streaming mode:
```python
from app.core.sse import create_sse_response
if request.headers.get("accept") == "text/event-stream":
return create_sse_response(event_generator())
else:
return JSONResponse({"response": "..."})
```
### Error Handling
Use custom exceptions from `app/core/exceptions.py`:
```python
from app.core.exceptions import AIServiceException, ErrorCode
raise AIServiceException(
code=ErrorCode.KNOWLEDGE_BASE_NOT_FOUND,
message="Knowledge base not found",
details={"kb_id": kb_id}
)
```
### Configuration Access
```python
from app.core.config import get_settings
settings = get_settings() # Cached singleton
llm_model = settings.llm_model
```
### Embedding Configuration
Embedding config is persisted to `config/embedding_config.json` and loaded at startup. Use `app/services/embedding/factory.py` to get the configured provider.
## Testing Strategy
- Unit tests in `tests/test_*.py`
- Use `pytest-asyncio` for async tests
- Fixtures in `tests/conftest.py`
- Mock external services (LLM, Qdrant) in tests
- Integration tests require running services (use `docker compose up -d`)
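Mocking the LLM in an async test can be done with `unittest.mock.AsyncMock`; the service function `answer` and the provider method `complete` below are hypothetical stand-ins (in a real test this would be a `pytest-asyncio` test function rather than `asyncio.run`):

```python
import asyncio
from unittest.mock import AsyncMock

async def answer(question: str, llm) -> str:
    """Hypothetical service function that delegates to an LLM provider."""
    reply = await llm.complete(question)
    return reply.strip()

# Stub the provider so the test needs no network and no API key.
mock_llm = AsyncMock()
mock_llm.complete.return_value = "  Paris  "
result = asyncio.run(answer("Capital of France?", mock_llm))
```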
## Session Handoff Protocol
For complex multi-phase tasks, use the Session Handoff Protocol (v2.0):
- Progress docs in `docs/progproject]-progress.md`
- Must include: task overview, requirements reference, technical context, next steps
- See `~/.claude/specs/session-handoff-protocol-ai-ref.md` for details
- Stop proactively after 40-50 tool calls or phase completion
## Common Pitfalls
1. **Tenant Isolation**: Never query without tenant_id filter
2. **Embedding Model Changes**: Switching models requires rebuilding all knowledge bases (vectors are incompatible)
3. **SSE Streaming**: Must yield events in correct format: `data: {json}\n\n`
4. **API Key**: Backend auto-generates default key on first startup (check logs)
5. **Long-Running Commands**: Don't use `--watch` modes in tests/dev servers via bash tool
6. **Windows Paths**: Use forward slashes in code, even on Windows (Python handles conversion)
## Key Files
- [app/main.py](ai-service/app/main.py): FastAPI app entry point
- [app/core/config.py](ai-service/app/core/config.py): Configuration settings
- [app/models/entities.py](ai-service/app/models/entities.py): Database models
- [app/api/chat.py](ai-service/app/api/chat.py): Main chat endpoint
- [app/services/flow/engine.py](ai-service/app/services/flow/engine.py): Intent-driven flow engine
- [docker-compose.yaml](docker-compose.yaml): Docker Compose orchestration
- [spec/contracting.md](spec/contracting.md): Contract maturity rules