zpc bd1e1201f1 12

2025-12-27 16:21:09 +08:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

MCP Database Server is a WebSocket/SSE-based PostgreSQL tooling service that exposes database operations through the Model Context Protocol (MCP). It allows AI clients to interact with multiple PostgreSQL databases through a unified, authenticated interface.

Background and Goals

Problem Statement: The original MCP implementation used STDIO transport, requiring each AI client to install and maintain a local copy of the codebase. This caused maintenance overhead and made it difficult to share database access across multiple clients or hosts.

Solution: A long-running server that:

Runs as a daemon/container accessible from multiple AI clients
Supports remote transports (WebSocket and SSE)
Provides centralized authentication and audit logging
Enables multi-database/multi-schema access through a single endpoint
Eliminates per-client installation requirements

Key Design Decisions (from v1.0.0):

Transport Layer: MCP SDK lacks server-side WebSocket support, so we implemented custom WebSocketServerTransport and SSEServerTransport classes
Multi-Schema Access: A single configuration supports multiple PostgreSQL databases with different schemas, selected via the environment parameter
Authentication: Token-based (Bearer) authentication by default; mTLS support is reserved for a future release
Concurrency Model: Per-client session isolation with independent connection pools
Code Separation: Complete separation from original STDIO-based codebase; this is a standalone server implementation

Build and Development Commands

Build

npm run build

Compiles TypeScript to JavaScript in the dist/ directory.

Start Server

# Production
npm start

# Development (with hot reload)
npm run dev

Generate Authentication Token

node scripts/generate-token.js

Generates a secure 64-character hex token for Bearer authentication.

Test Database Connection

npx tsx scripts/test-connection.ts

IMPORTANT: After making any code changes, always update changelog.json with version number, date, and description of changes. See "Version History and Roadmap" section for details.

Architecture

Core Layers

The codebase is organized into distinct layers:

server.ts - Main entry point that:
- Loads and validates configuration
- Initializes the UnifiedServerManager (handles both WebSocket and SSE)
- Creates PostgresMcp instance for database operations
- Manages session lifecycle and graceful shutdown
transport/ - Multi-transport support:
- unified-server.ts: Single HTTP server handling both WebSocket and SSE transports
- websocket-server-transport.ts: Server-side WebSocket transport, one instance per connected client
- sse-server-transport.ts: Server-side Server-Sent Events transport, one instance per connected client
- Both transports share the same authentication and session management
core/ - Database abstraction layer (PostgresMcp):
- connection-manager.ts: Pool management for multiple database environments
- query-runner.ts: SQL query execution with schema path handling
- transaction-manager.ts: Transaction lifecycle (BEGIN/COMMIT/ROLLBACK)
- metadata-browser.ts: Schema introspection (tables, views, functions, etc.)
- bulk-helpers.ts: Batch insert operations
- diagnostics.ts: Query analysis and performance diagnostics
tools/ - MCP tool registration:
- Each file (metadata.ts, query.ts, data.ts, diagnostics.ts) registers a group of MCP tools
- Tools use zod schemas for input validation
- Tools delegate to PostgresMcp core methods
session/ - Session management:
- Per-client session tracking with unique session IDs
- Transaction-to-session binding (transactions are bound to the session's client)
- Query concurrency limits per session
- Automatic stale session cleanup
config/ - Configuration system:
- Supports JSON configuration files with environment variable resolution (ENV:VAR_NAME syntax)
- Three-tier override, from lowest to highest priority: config file → environment variables → CLI arguments
- Validation using zod schemas
- Multiple database environments per server
auth/ - Authentication:
- Token-based authentication (Bearer tokens in WebSocket/SSE handshake)
- Verification occurs at connection time (both WebSocket upgrade and SSE endpoint)
audit/ - Audit logging:
- JSON Lines format for structured logging
- SQL parameter redaction for security
- Configurable output (stdout or file)
health/ - Health monitoring:
- /health endpoint provides server status and per-environment connection status
- Includes active connection counts and pool statistics
changelog/ - Version tracking:
- /changelog endpoint exposes version history without authentication
- Version information automatically synced from changelog.json
- Used for tracking system updates and changes

Key Design Patterns

Environment Isolation: Each configured database "environment" (e.g., "drworks", "ipworkstation") has:

Isolated connection pool
Independent schema search paths
Separate permission modes (readonly/readwrite/ddl)
Per-environment query timeouts
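
One way the permission modes above could be enforced is to classify each statement by its leading keyword before it reaches the pool. This is a minimal sketch: the `isAllowed` helper, the keyword lists, and the assumption that `ddl` includes everything `readwrite` allows are all illustrative, not taken from the server's code.

```typescript
type Mode = "readonly" | "readwrite" | "ddl";

// Hypothetical keyword groups. WITH and EXPLAIN are deliberately left out:
// a CTE can wrap a write, and EXPLAIN ANALYZE executes the statement it explains.
const READ = new Set(["SELECT", "SHOW"]);
const WRITE = new Set(["INSERT", "UPDATE", "DELETE"]);
const DDL = new Set(["CREATE", "ALTER", "DROP", "TRUNCATE"]);

// Returns true when the statement's first keyword is permitted under `mode`.
function isAllowed(sql: string, mode: Mode): boolean {
  const keyword = (sql.trim().split(/\s+/)[0] ?? "").toUpperCase();
  if (READ.has(keyword)) return true;
  if (WRITE.has(keyword)) return mode === "readwrite" || mode === "ddl";
  if (DDL.has(keyword)) return mode === "ddl";
  return false; // unknown statements are rejected by default
}
```

A first-keyword check is only a coarse filter; database-level grants on the configured user remain the reliable way to guarantee read-only access.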

Session-Transaction Binding:

When pg_begin_transaction is called, a dedicated database client is bound to that session
All subsequent queries in that session use the same client until commit/rollback
Sessions are automatically cleaned up on disconnect or timeout
This prevents transaction leaks across different AI clients

Schema Path Resolution:

Tools accept optional schema parameter
Resolution order: tool parameter → environment defaultSchema → environment searchPath
Search path is set per-query using PostgreSQL's SET search_path
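
The resolution order above can be sketched as a pair of pure functions. The names `resolveSearchPath`, `buildSetSearchPath`, and `quoteIdent` are hypothetical, and keeping the remaining `searchPath` entries as fallbacks behind the chosen schema is an assumption of this sketch rather than documented behavior.

```typescript
interface EnvSchemaConfig {
  defaultSchema?: string;
  searchPath?: string[];
}

// Quote an identifier the PostgreSQL way: wrap in double quotes, double embedded quotes.
function quoteIdent(name: string): string {
  return `"${name.replace(/"/g, '""')}"`;
}

// Primary schema: tool parameter, then defaultSchema, then the first searchPath entry.
// Remaining searchPath entries follow as fallbacks, without duplicates.
function resolveSearchPath(toolSchema: string | undefined, env: EnvSchemaConfig): string[] {
  const rest = env.searchPath ?? [];
  const primary = toolSchema ?? env.defaultSchema ?? rest[0];
  if (primary === undefined) return [];
  return [primary, ...rest.filter((s) => s !== primary)];
}

// Build the statement issued before the query; an empty list is rejected.
function buildSetSearchPath(schemas: string[]): string {
  if (schemas.length === 0) throw new Error("search path must not be empty");
  return `SET search_path TO ${schemas.map(quoteIdent).join(", ")}`;
}
```

Quoting every schema name keeps a caller-supplied `schema` value from being interpreted as SQL, at the cost of making the names case-sensitive.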

Unified Transport Architecture:

Single HTTP server handles both WebSocket (upgrade requests) and SSE (GET /sse)
Transport-agnostic MCP server implementation
Both transports use the same authentication, session management, and tool registration
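
To illustrate how one HTTP server can serve both transports, the routing decision can be reduced to a pure function over the request method, path, and `Upgrade` header. The `classifyRequest` helper and the `Route` type are hypothetical; the paths are the ones this document names (`/sse`, `/messages`, `/health`, `/changelog`).

```typescript
type Route = "websocket" | "sse-stream" | "sse-message" | "health" | "changelog" | "not-found";

// Decide which handler a request belongs to.
function classifyRequest(method: string, url: string, upgradeHeader?: string): Route {
  const path = url.split("?")[0];
  if (method === "GET" && upgradeHeader?.toLowerCase() === "websocket") return "websocket";
  if (method === "GET" && path === "/sse") return "sse-stream";
  if (method === "POST" && path === "/messages") return "sse-message";
  if (method === "GET" && path === "/health") return "health";
  if (method === "GET" && path === "/changelog") return "changelog";
  return "not-found";
}
```

In Node.js, WebSocket upgrade requests arrive through the HTTP server's `upgrade` event rather than the `request` event, so a real server would likely wire the first branch there and the remaining branches in the request handler.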

Configuration

Configuration File Structure

Configuration uses config/database.json (see config/database.example.json for template).

{
  "server": {
    "listen": { "host": "0.0.0.0", "port": 7700 },
    "auth": { "type": "token", "token": "ENV:MCP_AUTH_TOKEN" },
    "allowUnauthenticatedRemote": false,
    "maxConcurrentClients": 50,
    "logLevel": "info"
  },
  "environments": {
    "drworks": {
      "type": "postgres",
      "connection": {
        "host": "localhost",
        "port": 5432,
        "database": "shcis_drworks_cpoe_pg",
        "user": "postgres",
        "password": "ENV:MCP_DRWORKS_PASSWORD",
        "ssl": { "require": true }
      },
      "defaultSchema": "dbo",
      "searchPath": ["dbo", "api", "nurse"],
      "pool": { "max": 10, "idleTimeoutMs": 30000 },
      "statementTimeoutMs": 60000,
      "slowQueryMs": 2000,
      "mode": "readwrite"
    }
  },
  "audit": {
    "enabled": true,
    "output": "stdout",
    "format": "json",
    "redactParams": true,
    "maxSqlLength": 200
  }
}

Configuration Fields

server - Global server settings:

listen.host/port: Listen address (default: 0.0.0.0:7700)
auth.type: Authentication type (token | mtls | none)
auth.token: Bearer token value (supports ENV: prefix)
allowUnauthenticatedRemote: Allow listening on non-localhost without auth (default: false, use with caution)
maxConcurrentClients: Max WebSocket connections (default: 50)
logLevel: Log level (debug | info | warn | error)
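
The interaction between `listen.host`, `auth.type`, and `allowUnauthenticatedRemote` can be sketched as a startup check. The `checkListenSecurity` function and the loopback host list are assumptions made for illustration; the server's real validation may differ.

```typescript
interface ServerSecurityConfig {
  listen: { host: string; port: number };
  auth: { type: "token" | "mtls" | "none"; token?: string };
  allowUnauthenticatedRemote?: boolean;
}

// Hosts treated as local-only in this sketch.
const LOOPBACK_HOSTS = new Set(["127.0.0.1", "::1", "localhost"]);

// Returns a list of problems; an empty list means the combination is acceptable.
function checkListenSecurity(cfg: ServerSecurityConfig): string[] {
  const problems: string[] = [];
  const remote = !LOOPBACK_HOSTS.has(cfg.listen.host);
  if (cfg.auth.type === "token" && !cfg.auth.token) {
    problems.push("auth.type is 'token' but no token is configured");
  }
  if (cfg.auth.type === "none" && remote && !cfg.allowUnauthenticatedRemote) {
    problems.push(`listening on ${cfg.listen.host} without auth requires allowUnauthenticatedRemote: true`);
  }
  return problems;
}
```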

environments - Database connection configurations:

Each environment is an isolated connection pool with a unique name
type: Database type (currently only postgres)
connection: Standard PostgreSQL connection parameters
defaultSchema: Default schema when not specified in tool calls
searchPath: Array of schemas for PostgreSQL search_path
pool.max: Max connections in pool (default: 10)
pool.idleTimeoutMs: Idle connection timeout (default: 30000)
statementTimeoutMs: Query timeout (default: 60000)
slowQueryMs: Slow query threshold for warnings (default: 2000)
mode: Permission mode (readonly | readwrite | ddl)
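
The documented defaults can be applied in one place when an environment is loaded. The sketch below uses plain TypeScript rather than the project's zod schemas; the type names and `applyEnvironmentDefaults` are hypothetical, and `mode` is treated as required because no default is documented for it.

```typescript
interface EnvironmentInput {
  mode: "readonly" | "readwrite" | "ddl";
  pool?: { max?: number; idleTimeoutMs?: number };
  statementTimeoutMs?: number;
  slowQueryMs?: number;
}

interface EnvironmentResolved {
  mode: "readonly" | "readwrite" | "ddl";
  pool: { max: number; idleTimeoutMs: number };
  statementTimeoutMs: number;
  slowQueryMs: number;
}

// Fill in the defaults listed above for any field the config file omits.
function applyEnvironmentDefaults(input: EnvironmentInput): EnvironmentResolved {
  return {
    mode: input.mode,
    pool: {
      max: input.pool?.max ?? 10,
      idleTimeoutMs: input.pool?.idleTimeoutMs ?? 30000,
    },
    statementTimeoutMs: input.statementTimeoutMs ?? 60000,
    slowQueryMs: input.slowQueryMs ?? 2000,
  };
}
```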

audit - Audit logging configuration:

enabled: Enable audit logging (default: true)
output: Output destination (stdout or file path)
format: Log format (json recommended)
redactParams: Redact SQL parameters (default: true)
maxSqlLength: Max SQL preview length (default: 200)

Environment Variable Resolution

Environment variables can be referenced using ENV:VAR_NAME syntax:

{
  "password": "ENV:MCP_DRWORKS_PASSWORD"
}
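
A minimal sketch of how `ENV:` references might be resolved: walk the parsed configuration and replace any string that starts with `ENV:`. The `resolveEnvRefs` function is hypothetical, and failing fast on a missing variable is an assumption; passing the variable map as a parameter (for example `process.env`) keeps the function easy to test.

```typescript
type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Recursively replace "ENV:NAME" strings with values from `env`.
// Throws when a referenced variable is missing (an assumption of this sketch).
function resolveEnvRefs(value: Json, env: Record<string, string | undefined>): Json {
  if (typeof value === "string" && value.startsWith("ENV:")) {
    const name = value.slice(4);
    const resolved = env[name];
    if (resolved === undefined) throw new Error(`Environment variable ${name} is not set`);
    return resolved;
  }
  if (Array.isArray(value)) return value.map((v) => resolveEnvRefs(v, env));
  if (value !== null && typeof value === "object") {
    const out: { [key: string]: Json } = {};
    for (const [k, v] of Object.entries(value)) out[k] = resolveEnvRefs(v, env);
    return out;
  }
  return value; // numbers, booleans, null, and plain strings pass through
}
```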

Naming Convention: MCP_<ENVIRONMENT>_<FIELD>

Environment names are uppercased
Non-alphanumeric chars become underscores
Example: drworks → MCP_DRWORKS_PASSWORD
Nested fields use double underscore: ssl.ca → MCP_DRWORKS_SSL__CA
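
The naming rules above are mechanical enough to express as a function. This is a sketch with a hypothetical `envVarName` helper; how the server treats camelCase field names is not documented here, so only the cases from the list are relied on.

```typescript
// Build the override variable name for a field of a given environment:
// uppercase, non-alphanumerics become "_", nested path segments are joined with "__".
function envVarName(environment: string, fieldPath: string): string {
  const norm = (s: string): string => s.toUpperCase().replace(/[^A-Z0-9]/g, "_");
  const field = fieldPath.split(".").map(norm).join("__");
  return `MCP_${norm(environment)}_${field}`;
}
```

For example, `envVarName("drworks", "ssl.ca")` yields `MCP_DRWORKS_SSL__CA`, matching the nested-field example above.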

Priority order (highest to lowest):

CLI arguments (--auth-token, --listen, --log-level)
Environment variables (MCP_AUTH_TOKEN, MCP_LISTEN, MCP_LOG_LEVEL)
Configuration file
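
The priority order can be sketched as a merge in which later sources overwrite earlier ones. The `Overrides` shape and `mergeOverrides` function are hypothetical and cover only the three settings named above.

```typescript
interface Overrides {
  authToken?: string;
  listen?: string;
  logLevel?: string;
}

// Sources are applied lowest priority first; undefined never overwrites a defined value.
function mergeOverrides(file: Overrides, env: Overrides, cli: Overrides): Overrides {
  const merged: Overrides = {};
  for (const source of [file, env, cli]) {
    for (const key of Object.keys(source) as (keyof Overrides)[]) {
      const value = source[key];
      if (value !== undefined) merged[key] = value;
    }
  }
  return merged;
}
```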

Multi-Schema Access Examples

Scenario 1: Use default schema

// Uses environment's defaultSchema ("dbo")
await client.callTool('pg_list_tables', {
  environment: 'drworks'
});

Scenario 2: Switch to specific schema

// Temporarily use "api" schema
await client.callTool('pg_list_tables', {
  environment: 'drworks',
  schema: 'api'
});

Scenario 3: Custom search path

// Query with custom search path priority
await client.callTool('pg_query', {
  environment: 'drworks',
  searchPath: ['nurse', 'dbo', 'api'],
  query: 'SELECT * FROM patient_info'  // Searches nurse → dbo → api
});

Key Implementation Notes

Security and Authentication

Token Authentication (default in v1.0.0):

Uses Bearer tokens in HTTP Authorization header during WebSocket/SSE handshake
Token format: 64-character hex string (256-bit random, generated via crypto.randomBytes(32))
Token verification happens at connection time via verifyClient hook
Failed authentication returns HTTP 401 or WebSocket close code 1008
Token source priority (highest to lowest): CLI args → env vars → config file
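
Token generation and verification as described above can be sketched with Node's `crypto` module. The helper names are hypothetical, and the constant-time comparison via `timingSafeEqual` is a suggested practice, not something this document confirms the server does.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

// 32 random bytes rendered as 64 hex characters.
function generateToken(): string {
  return randomBytes(32).toString("hex");
}

// Extract the token from an "Authorization: Bearer <token>" header value.
function parseBearer(header: string | undefined): string | undefined {
  const match = /^Bearer\s+(\S+)$/i.exec(header ?? "");
  return match?.[1];
}

// Constant-time comparison. Lengths are checked first because
// timingSafeEqual throws when the buffers differ in length.
function verifyToken(header: string | undefined, expected: string): boolean {
  const presented = parseBearer(header);
  if (presented === undefined) return false;
  const a = Buffer.from(presented);
  const b = Buffer.from(expected);
  return a.length === b.length && timingSafeEqual(a, b);
}

// The first 8 characters serve as a client identifier in logs.
function clientIdFor(token: string): string {
  return token.slice(0, 8);
}
```

With the `ws` library, a check like `verifyToken(req.headers.authorization, expected)` would typically run inside the `verifyClient` hook, rejecting the upgrade with HTTP 401 when it returns false.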

Token Transmission:

GET / HTTP/1.1
Upgrade: websocket
Authorization: Bearer <64-char-token>

Security Best Practices:

Always use wss:// (WebSocket over TLS) in production
Never write full tokens to audit logs; only the first 8 characters appear, as clientId
Rotate tokens periodically
Use environment variables for token storage, not config files
Enable SSL for database connections (ssl.require: true)
Use mode: "readonly" for read-only access scenarios

mTLS Support (reserved for future):

Mutual TLS authentication via client certificates
Configured via auth.type: "mtls" with CA/cert/key paths

Audit Logging

Format: JSON Lines (one JSON object per line)

{
  "timestamp": "2025-12-23T10:30:45.123Z",
  "level": "audit",
  "sessionId": "550e8400-e29b-41d4-a716-446655440000",
  "clientId": "abc12345",
  "environment": "drworks",
  "tool": "pg_query",
  "sqlHash": "a3b2c1d4",
  "sqlPreview": "SELECT * FROM dbo.users WHERE id = $1 LIMIT 100",
  "params": "[REDACTED]",
  "durationMs": 234,
  "rowCount": 15,
  "status": "success"
}

Privacy Protection:

SQL parameters are never logged (always "[REDACTED]")
SQL preview is truncated to maxSqlLength (default: 200 chars)
String literals in SQL are replaced with [STRING], numbers with [NUMBER]
Error messages are sanitized to remove potential data leaks
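
The masking and truncation rules above can be sketched with two regular expressions. The function names are hypothetical, the regexes are a simplification (they ignore comments and dollar-quoted strings), and deriving `sqlHash` from the first 8 hex characters of a SHA-256 digest is an assumption; the document only shows that the hash is short.

```typescript
import { createHash } from "node:crypto";

// Mask literals, then truncate. Placeholders such as $1 are left untouched
// because a digit directly preceded by "$" or a word character is not matched.
function sanitizeSql(sql: string, maxSqlLength = 200): string {
  const masked = sql
    .replace(/'(?:[^']|'')*'/g, "[STRING]")
    .replace(/(^|[^\w$])\d+(?:\.\d+)?\b/g, "$1[NUMBER]");
  return masked.length > maxSqlLength ? masked.slice(0, maxSqlLength) : masked;
}

// Short, stable identifier for correlating log entries about the same statement.
function sqlHash(sql: string): string {
  return createHash("sha256").update(sql).digest("hex").slice(0, 8);
}

// One JSON object per line; parameters are always replaced before serialization.
function auditLine(entry: Record<string, unknown>): string {
  return JSON.stringify({ ...entry, params: "[REDACTED]" }) + "\n";
}
```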

Slow Query Warnings:

Queries exceeding slowQueryMs (default: 2000ms) trigger warning logs
Includes sqlHash for correlation with audit logs
Used for performance monitoring without exposing full queries

Concurrency and Session Management

Architecture:

WebSocket Connection → Session (UUID) → Query Queue → Connection Pool
                                      ↓
                              Transaction Client (if active)

Concurrency Limits:

Global: maxConcurrentClients (default: 50) - max WebSocket connections
Per-Session: maxQueriesPerSession (default: 5) - concurrent queries per client
Per-Environment: pool.max (default: 10) - database connections per environment

Session Isolation:

Each WebSocket connection gets a unique sessionId (UUID v4)
Sessions track activeQueries count and enforce per-session limits
Sessions automatically timeout after sessionTimeout (default: 1 hour)
Stale sessions are cleaned up every 60 seconds
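
The per-session limit and stale-session cleanup can be sketched as a small in-memory registry. `SessionRegistry` and its method names are hypothetical; the defaults mirror the values documented above (5 concurrent queries, 1 hour timeout).

```typescript
import { randomUUID } from "node:crypto";

interface Session {
  id: string;
  activeQueries: number;
  lastSeen: number; // epoch milliseconds of the last activity
}

class SessionRegistry {
  private readonly sessions = new Map<string, Session>();
  private readonly maxQueries: number;
  private readonly timeoutMs: number;

  constructor(maxQueries = 5, timeoutMs = 60 * 60 * 1000) {
    this.maxQueries = maxQueries;
    this.timeoutMs = timeoutMs;
  }

  create(now = Date.now()): Session {
    const session: Session = { id: randomUUID(), activeQueries: 0, lastSeen: now };
    this.sessions.set(session.id, session);
    return session;
  }

  // Run `work` if the session is under its limit; the counter is always decremented.
  async run<T>(session: Session, work: () => Promise<T>, now = Date.now()): Promise<T> {
    if (session.activeQueries >= this.maxQueries) throw new Error("Too many concurrent queries");
    session.activeQueries++;
    session.lastSeen = now;
    try {
      return await work();
    } finally {
      session.activeQueries--;
    }
  }

  // Remove sessions idle longer than the timeout and return their IDs.
  sweep(now = Date.now()): string[] {
    const removed: string[] = [];
    this.sessions.forEach((session, id) => {
      if (now - session.lastSeen > this.timeoutMs) {
        this.sessions.delete(id);
        removed.push(id);
      }
    });
    return removed;
  }
}
```

A periodic `setInterval(() => registry.sweep(), 60_000)` would match the 60-second cleanup cadence described above.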

Transaction Binding:

pg_begin_transaction acquires a dedicated client from the pool
This client is stored in session.transactionClient and bound to the session
All subsequent queries in that environment use this client (not the pool)
pg_commit_transaction or pg_rollback_transaction releases the client
If the client disconnects during a transaction, the server issues an automatic ROLLBACK
This prevents transaction state leaks across different AI clients

Implementation Detail (src/session/session-manager.ts:80-120):

// Transaction begins: take a dedicated client out of the pool
session.transactionClient = await pool.connect();
session.transactionEnv = environmentName;

// Subsequent queries in that environment route to the transaction client
if (session.transactionClient && session.transactionEnv === env) {
  return session.transactionClient.query(sql);
}

// On disconnect or timeout: roll back, and return the client to the pool
// even if the ROLLBACK itself fails
if (session.transactionClient) {
  try {
    await session.transactionClient.query('ROLLBACK');
  } finally {
    session.transactionClient.release();
  }
}

Transaction Safety:

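The TransactionManager stores transaction clients in a WeakMap keyed per session (WeakMap keys must be objects, so the key is presumably the session object rather than the session ID string)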
On disconnect, the SessionManager automatically rolls back active transactions
Never use pool.query() for operations within a transaction; always use the session-bound client

Connection Pool Lifecycle:

Pools are lazily created on first use (getPool method)
Each pool has configurable max connections, idle timeout, and statement timeout
Graceful shutdown closes all pools via PostgresConnectionManager.closeAll()

MCP Tool Registration:

Tools are registered in tools/index.ts by calling registration functions from each tool category
Each tool must have a unique name (prefixed with pg_)
Tool schemas use zod for validation; the MCP SDK handles schema conversion

Error Handling:

Database errors are caught and returned as MCP error responses
The server never crashes on query errors; only fatal startup errors exit the process
Audit logger sanitizes SQL and redacts parameters before logging
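
The rule that query errors never crash the server can be sketched as a wrapper around each tool handler. `safeTool` is a hypothetical helper; the result shape (a `content` array plus `isError`) follows the MCP tool-result convention. This sketch passes the error message through unchanged, whereas the server described here also sanitizes messages.

```typescript
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Wrap a tool handler so that thrown errors become error results.
function safeTool<A>(handler: (args: A) => Promise<ToolResult>): (args: A) => Promise<ToolResult> {
  return async (args: A): Promise<ToolResult> => {
    try {
      return await handler(args);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      return { content: [{ type: "text", text: `Database error: ${message}` }], isError: true };
    }
  };
}
```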

Client Configuration

MCP Client Setup

AI clients (Claude Code, Cursor, etc.) connect via MCP client configuration:

{
  "mcpServers": {
    "database": {
      "transport": "websocket",
      "endpoint": "ws://localhost:7700",
      "headers": {
        "Authorization": "Bearer your-token-here"
      }
    }
  }
}

For SSE transport (added in v1.0.1):

{
  "mcpServers": {
    "database": {
      "url": "http://localhost:7700/sse",
      "headers": {
        "Authorization": "Bearer your-token-here"
      }
    }
  }
}

Available MCP Tools

The server exposes 30+ PostgreSQL tools grouped by category:

Metadata Tools:

pg_list_environments - List configured environments
pg_list_schemas - List schemas in environment
pg_list_tables - List tables in schema
pg_describe_table - Get table structure (columns, types, constraints)
pg_list_views - List views
pg_list_functions - List functions
pg_list_indexes - List indexes
pg_list_constraints - List constraints
pg_list_triggers - List triggers

Query Tools:

pg_query - Execute read-only SELECT query
pg_explain - Get query execution plan (EXPLAIN)

Data Manipulation Tools:

pg_insert - Insert single row
pg_update - Update rows
pg_delete - Delete rows
pg_upsert - Insert or update (ON CONFLICT)
pg_bulk_insert - Batch insert multiple rows

Transaction Tools:

pg_begin_transaction - Start transaction
pg_commit_transaction - Commit transaction
pg_rollback_transaction - Rollback transaction

Diagnostic Tools:

pg_analyze_query - Analyze query performance
pg_check_connection - Verify database connectivity

All tools require the environment parameter, which specifies which database to use.

Version History and Roadmap

Changelog Maintenance (IMPORTANT)

Every code change MUST be documented in the changelog. This is a critical project requirement.

How to Update Changelog

When making any code changes:

Update changelog.json in the project root:
- Increment the version number following semantic versioning
- Use format: major.minor.patch or major.minor.patch.buildnumber (e.g., 1.0.1.03)
- Add/update the version entry with:
  - version: New version number
  - date: Current date (YYYY-MM-DD format)
  - description: Brief summary of changes (Chinese or English)
  - changes: Array of specific changes made
Version number is auto-synced:
- The server automatically reads currentVersion from changelog.json
- No need to manually update package.json or server.ts
- The /changelog endpoint exposes full version history
Example changelog entry:

{
  "version": "1.0.1.04",
  "date": "2024-12-25",
  "description": "Add new feature",
  "changes": [
    "Add XXX feature",
    "Fix YYY bug",
    "Optimize ZZZ performance"
  ]
}
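
Because the version strings used here have an optional fourth, zero-padded component, a plain string comparison is not reliable when checking that a new entry is newer than the previous one. The sketch below compares versions numerically; `parseVersion` and `compareVersions` are hypothetical helpers, and treating a missing fourth component as 0 is an assumption.

```typescript
// Parse "1.0.1" or "1.0.1.03" into numeric parts; anything else is rejected.
function parseVersion(version: string): number[] {
  if (!/^\d+(\.\d+){2,3}$/.test(version)) throw new Error(`Invalid version: ${version}`);
  return version.split(".").map((part) => Number(part));
}

// Returns -1, 0, or 1, like a sort comparator. A missing build number counts as 0.
function compareVersions(a: string, b: string): number {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < 4; i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff < 0 ? -1 : 1;
  }
  return 0;
}
```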

View changelog:

# Via HTTP endpoint (no authentication required)
curl http://localhost:7700/changelog

# Or check the file directly
cat changelog.json

Remember: Documentation is as important as code. Always update the changelog before committing!

Version History

v1.0.0 (Initial Release)

WebSocket transport with custom WebSocketServerTransport implementation
Token-based authentication
Multi-environment configuration with per-environment connection pools
Multi-schema access via defaultSchema and searchPath
Session-based transaction management
Audit logging with SQL parameter redaction
Health check endpoint
Docker support with graceful shutdown
30+ PostgreSQL MCP tools

v1.0.1 (2024-12-21)

Added SSE transport to support clients without WebSocket (e.g., cursor-browser-extension)
Unified server now handles both WebSocket and SSE on same port
SSE endpoint: GET /sse for stream, POST /messages for client messages
Backward compatible with v1.0.0 WebSocket clients

v1.0.1.01 (2024-12-22)

Fixed connection pool leak issue
Fixed SSE disconnect/reconnect logic
Improved error handling

v1.0.1.02 (2024-12-23)

Fixed ssl.require: false configuration not taking effect
Improved SSL configuration validation logic
Updated documentation for SSL configuration

v1.0.1.03 (2024-12-24)

Added allowUnauthenticatedRemote configuration option
Allow explicit enabling of unauthenticated remote access in trusted networks
Improved security validation error messages
New /changelog endpoint to view version update history (no authentication required)

Future Roadmap

Multi-database support (SQL Server, MySQL adapters)
mTLS authentication implementation
RBAC (role-based access control) for fine-grained permissions
Rate limiting and quota management per client
Configuration hot reload
Metrics and Prometheus export

Testing Notes

The project currently has placeholder tests. When adding tests:

Create tests under __tests__/ directory
Use the connection manager's withClient method for database interaction in tests
Test files should use .test.ts extension
Consider testing transaction rollback behavior and session cleanup

Deployment and Operations

Docker Deployment

Build Image:

docker build -t mcp-database-server:1.0.1 .

Run Container:

docker run -d \
  --name mcp-database-server \
  -p 7700:7700 \
  -v $(pwd)/config/database.json:/app/config/database.json:ro \
  -e MCP_AUTH_TOKEN=your-token \
  -e MCP_DRWORKS_PASSWORD=your-password \
  mcp-database-server:1.0.1

Docker Compose:

docker compose up -d

Health Check

curl http://localhost:7700/health

Response format:

{
  "status": "ok",
  "uptime": 3600,
  "version": "1.0.0",
  "clients": 5,
  "environments": [
    {
      "name": "drworks",
      "status": "connected",
      "poolSize": 10,
      "activeConnections": 2
    }
  ],
  "timestamp": "2025-12-23T10:30:00.000Z"
}

Status values: ok (all connected) | degraded (some disconnected) | error (critical failure)
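
The overall status could be derived from the per-environment states as sketched below. The `overallStatus` function and the `"disconnected"` value are assumptions, as is the reading of "critical failure" as "no environment is connected".

```typescript
type EnvStatus = "connected" | "disconnected";
type OverallStatus = "ok" | "degraded" | "error";

// ok: every environment connected; degraded: some connected; error: none connected.
function overallStatus(environments: { name: string; status: EnvStatus }[]): OverallStatus {
  const connected = environments.filter((e) => e.status === "connected").length;
  if (environments.length > 0 && connected === environments.length) return "ok";
  if (connected > 0) return "degraded";
  return "error";
}
```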

Changelog Endpoint

View version history and system updates:

curl http://localhost:7700/changelog