database/CLAUDE.md
2025-12-27 16:21:09 +08:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project Overview

MCP Database Server is a WebSocket/SSE-based PostgreSQL tooling service that exposes database operations through the Model Context Protocol (MCP). It allows AI clients to interact with multiple PostgreSQL databases through a unified, authenticated interface.

Background and Goals

Problem Statement: The original MCP implementation used STDIO transport, requiring each AI client to install and maintain a local copy of the codebase. This caused maintenance overhead and made it difficult to share database access across multiple clients or hosts.

Solution: A long-running server that:

  • Runs as a daemon/container accessible from multiple AI clients
  • Supports remote transports (WebSocket and SSE)
  • Provides centralized authentication and audit logging
  • Enables multi-database/multi-schema access through a single endpoint
  • Eliminates per-client installation requirements

Key Design Decisions (from v1.0.0):

  1. Transport Layer: MCP SDK lacks server-side WebSocket support, so we implemented custom WebSocketServerTransport and SSEServerTransport classes
  2. Multi-Schema Access: A single configuration supports multiple PostgreSQL databases with different schemas, selected via the environment parameter
  3. Authentication: Token-based (Bearer) authentication by default; mTLS support is reserved for a future release
  4. Concurrency Model: Per-client session isolation with independent connection pools
  5. Code Separation: Complete separation from original STDIO-based codebase; this is a standalone server implementation

Build and Development Commands

Build

npm run build

Compiles TypeScript to JavaScript in the dist/ directory.

Start Server

# Production
npm start

# Development (with hot reload)
npm run dev

Generate Authentication Token

node scripts/generate-token.js

Generates a secure 64-character hex token for Bearer authentication.

Test Database Connection

npx tsx scripts/test-connection.ts

IMPORTANT: After making any code changes, always update changelog.json with version number, date, and description of changes. See "Version History and Roadmap" section for details.

Architecture

Core Layers

The codebase is organized into distinct layers:

  1. server.ts - Main entry point that:

    • Loads and validates configuration
    • Initializes the UnifiedServerManager (handles both WebSocket and SSE)
    • Creates PostgresMcp instance for database operations
    • Manages session lifecycle and graceful shutdown
  2. transport/ - Multi-transport support:

    • unified-server.ts: Single HTTP server handling both WebSocket and SSE transports
    • websocket-server-transport.ts: Server-side WebSocket transport for a connected client (WebSocketServerTransport)
    • sse-server-transport.ts: Server-side Server-Sent Events transport for a connected client (SSEServerTransport)
    • Both transports share the same authentication and session management
  3. core/ - Database abstraction layer (PostgresMcp):

    • connection-manager.ts: Pool management for multiple database environments
    • query-runner.ts: SQL query execution with schema path handling
    • transaction-manager.ts: Transaction lifecycle (BEGIN/COMMIT/ROLLBACK)
    • metadata-browser.ts: Schema introspection (tables, views, functions, etc.)
    • bulk-helpers.ts: Batch insert operations
    • diagnostics.ts: Query analysis and performance diagnostics
  4. tools/ - MCP tool registration:

    • Each file (metadata.ts, query.ts, data.ts, diagnostics.ts) registers a group of MCP tools
    • Tools use zod schemas for input validation
    • Tools delegate to PostgresMcp core methods
  5. session/ - Session management:

    • Per-client session tracking with unique session IDs
    • Transaction-to-session binding (transactions are bound to the session's client)
    • Query concurrency limits per session
    • Automatic stale session cleanup
  6. config/ - Configuration system:

    • Supports JSON configuration files with environment variable resolution (ENV:VAR_NAME syntax)
    • Three-tier override, lowest to highest precedence: config file → environment variables → CLI arguments
    • Validation using zod schemas
    • Multiple database environments per server
  7. auth/ - Authentication:

    • Token-based authentication (Bearer tokens in WebSocket/SSE handshake)
    • Verification occurs at connection time (both WebSocket upgrade and SSE endpoint)
  8. audit/ - Audit logging:

    • JSON Lines format for structured logging
    • SQL parameter redaction for security
    • Configurable output (stdout or file)
  9. health/ - Health monitoring:

    • /health endpoint provides server status and per-environment connection status
    • Includes active connection counts and pool statistics
  10. changelog/ - Version tracking:

    • /changelog endpoint exposes version history without authentication
    • Version information automatically synced from changelog.json
    • Used for tracking system updates and changes

Key Design Patterns

Environment Isolation: Each configured database "environment" (e.g., "drworks", "ipworkstation") has:

  • Isolated connection pool
  • Independent schema search paths
  • Separate permission modes (readonly/readwrite/ddl)
  • Per-environment query timeouts

Session-Transaction Binding:

  • When pg_begin_transaction is called, a dedicated database client is bound to that session
  • All subsequent queries in that session use the same client until commit/rollback
  • Sessions are automatically cleaned up on disconnect or timeout
  • This prevents transaction leaks across different AI clients

Schema Path Resolution:

  • Tools accept optional schema parameter
  • Resolution order: tool parameter → environment defaultSchema → environment searchPath
  • Search path is set per-query using PostgreSQL's SET search_path
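
The resolution order above can be sketched as follows. This is a minimal illustration, not the project's code: the EnvironmentConfig shape and the function names resolveSearchPath and buildSetSearchPath are hypothetical, and the sketch assumes the first available source wins.

```typescript
// Hypothetical shape; only defaultSchema and searchPath come from the config format.
interface EnvironmentConfig {
  defaultSchema?: string;
  searchPath?: string[];
}

// Picks the schema list for a query in the documented order:
// tool parameter, then defaultSchema, then searchPath.
function resolveSearchPath(env: EnvironmentConfig, toolSchema?: string): string[] {
  if (toolSchema) return [toolSchema];
  if (env.defaultSchema) return [env.defaultSchema];
  return env.searchPath ?? [];
}

// Builds the SET search_path statement, quoting each schema as an identifier.
function buildSetSearchPath(schemas: string[]): string {
  if (schemas.length === 0) {
    throw new Error("No schema available for search_path");
  }
  const quoted = schemas.map((s) => `"${s.replace(/"/g, '""')}"`);
  return `SET search_path TO ${quoted.join(", ")}`;
}
```

Quoting each schema name as an identifier keeps an unusual schema name from altering the statement.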

Unified Transport Architecture:

  • Single HTTP server handles both WebSocket (upgrade requests) and SSE (GET /sse)
  • Transport-agnostic MCP server implementation
  • Both transports use the same authentication, session management, and tool registration
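
As a rough sketch of how a single HTTP server could tell the two transports apart, the function below classifies an incoming request. The type and function names are hypothetical; the paths /sse, /messages, /health, and /changelog are the ones named elsewhere in this document.

```typescript
type Route = "websocket" | "sse-stream" | "sse-message" | "health" | "changelog" | "not-found";

// Hypothetical minimal view of an incoming HTTP request.
interface IncomingRequestInfo {
  method: string;
  url: string;
  upgrade?: string; // value of the Upgrade header, if present
}

function classifyRequest(req: IncomingRequestInfo): Route {
  // WebSocket clients announce themselves with an Upgrade header.
  if (req.upgrade?.toLowerCase() === "websocket") return "websocket";

  // Ignore any query string when matching the path.
  const path = req.url.split("?")[0];
  if (req.method === "GET" && path === "/sse") return "sse-stream";
  if (req.method === "POST" && path === "/messages") return "sse-message";
  if (req.method === "GET" && path === "/health") return "health";
  if (req.method === "GET" && path === "/changelog") return "changelog";
  return "not-found";
}
```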

Configuration

Configuration File Structure

Configuration uses config/database.json (see config/database.example.json for template).

{
  "server": {
    "listen": { "host": "0.0.0.0", "port": 7700 },
    "auth": { "type": "token", "token": "ENV:MCP_AUTH_TOKEN" },
    "allowUnauthenticatedRemote": false,
    "maxConcurrentClients": 50,
    "logLevel": "info"
  },
  "environments": {
    "drworks": {
      "type": "postgres",
      "connection": {
        "host": "localhost",
        "port": 5432,
        "database": "shcis_drworks_cpoe_pg",
        "user": "postgres",
        "password": "ENV:MCP_DRWORKS_PASSWORD",
        "ssl": { "require": true }
      },
      "defaultSchema": "dbo",
      "searchPath": ["dbo", "api", "nurse"],
      "pool": { "max": 10, "idleTimeoutMs": 30000 },
      "statementTimeoutMs": 60000,
      "slowQueryMs": 2000,
      "mode": "readwrite"
    }
  },
  "audit": {
    "enabled": true,
    "output": "stdout",
    "format": "json",
    "redactParams": true,
    "maxSqlLength": 200
  }
}

Configuration Fields

server - Global server settings:

  • listen.host/port: Listen address (default: 0.0.0.0:7700)
  • auth.type: Authentication type (token | mtls | none)
  • auth.token: Bearer token value (supports ENV: prefix)
  • allowUnauthenticatedRemote: Allow listening on non-localhost without auth (default: false, use with caution)
  • maxConcurrentClients: Max WebSocket connections (default: 50)
  • logLevel: Log level (debug | info | warn | error)

environments - Database connection configurations:

  • Each environment is an isolated connection pool with unique name
  • type: Database type (currently only postgres)
  • connection: Standard PostgreSQL connection parameters
  • defaultSchema: Default schema when not specified in tool calls
  • searchPath: Array of schemas for PostgreSQL search_path
  • pool.max: Max connections in pool (default: 10)
  • pool.idleTimeoutMs: Idle connection timeout (default: 30000)
  • statementTimeoutMs: Query timeout (default: 60000)
  • slowQueryMs: Slow query threshold for warnings (default: 2000)
  • mode: Permission mode (readonly | readwrite | ddl)

audit - Audit logging configuration:

  • enabled: Enable audit logging (default: true)
  • output: Output destination (stdout or file path)
  • format: Log format (json recommended)
  • redactParams: Redact SQL parameters (default: true)
  • maxSqlLength: Max SQL preview length (default: 200)

Environment Variable Resolution

Environment variables can be referenced using ENV:VAR_NAME syntax:

{
  "password": "ENV:MCP_DRWORKS_PASSWORD"
}
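
A minimal sketch of how an ENV: reference might be resolved is shown below. The function name resolveEnvRef is hypothetical, the environment is passed in as a plain map so the example stays self-contained, and failing on a missing variable is an assumption of this sketch.

```typescript
type EnvMap = Record<string, string | undefined>;

// Returns the value unchanged unless it uses the ENV:VAR_NAME syntax.
function resolveEnvRef(value: string, env: EnvMap): string {
  const prefix = "ENV:";
  if (!value.startsWith(prefix)) return value;

  const name = value.slice(prefix.length);
  const resolved = env[name];
  if (resolved === undefined) {
    // Assumption: a missing variable is treated as a configuration error.
    throw new Error(`Environment variable ${name} is not set`);
  }
  return resolved;
}
```

In a Node.js process the second argument would typically be process.env.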

Naming Convention: MCP_<ENVIRONMENT>_<FIELD>

  • Environment names are uppercased
  • Non-alphanumeric chars become underscores
  • Example: drworks → MCP_DRWORKS_PASSWORD
  • Nested fields use double underscore: ssl.ca → MCP_DRWORKS_SSL__CA
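
The naming convention can be expressed as a small function. The name envVarName and the dotted field-path argument are hypothetical; the rules themselves are the ones listed above.

```typescript
// Uppercases a name and replaces every non-alphanumeric character with "_".
function normalizePart(part: string): string {
  return part.toUpperCase().replace(/[^A-Z0-9]/g, "_");
}

// Builds MCP_<ENVIRONMENT>_<FIELD>; nested fields are joined with a double underscore.
function envVarName(environment: string, fieldPath: string): string {
  const field = fieldPath.split(".").map(normalizePart).join("__");
  return `MCP_${normalizePart(environment)}_${field}`;
}
```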

Priority order (highest to lowest):

  1. CLI arguments (--auth-token, --listen, --log-level)
  2. Environment variables (MCP_AUTH_TOKEN, MCP_LISTEN, MCP_LOG_LEVEL)
  3. Configuration file
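
One way to picture this priority order is a helper that returns the first defined value, plus a parser for the host:port form used by --listen and MCP_LISTEN. Both function names are hypothetical and the validation rules are assumptions of this sketch.

```typescript
// Returns the first defined value, checking sources from highest to lowest priority.
function pickSetting<T>(
  cli: T | undefined,
  envVar: T | undefined,
  file: T | undefined,
  fallback: T
): T {
  return cli ?? envVar ?? file ?? fallback;
}

// Parses a "host:port" string into its parts.
function parseListen(value: string): { host: string; port: number } {
  const idx = value.lastIndexOf(":");
  const host = value.slice(0, idx);
  const port = Number(value.slice(idx + 1));
  if (idx <= 0 || !Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid listen address: ${value}`);
  }
  return { host, port };
}
```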

Multi-Schema Access Examples

Scenario 1: Use default schema

// Uses environment's defaultSchema ("dbo")
await client.callTool('pg_list_tables', {
  environment: 'drworks'
});

Scenario 2: Switch to specific schema

// Temporarily use "api" schema
await client.callTool('pg_list_tables', {
  environment: 'drworks',
  schema: 'api'
});

Scenario 3: Custom search path

// Query with custom search path priority
await client.callTool('pg_query', {
  environment: 'drworks',
  searchPath: ['nurse', 'dbo', 'api'],
  query: 'SELECT * FROM patient_info'  // Searches nurse → dbo → api
});

Key Implementation Notes

Security and Authentication

Token Authentication (default in v1.0.0):

  • Uses Bearer tokens in HTTP Authorization header during WebSocket/SSE handshake
  • Token format: 64-character hex string (256-bit random, generated via crypto.randomBytes(32))
  • Token verification happens at connection time via verifyClient hook
  • Failed authentication returns HTTP 401 or WebSocket close code 1008
  • Token can be configured via, in order of precedence: CLI args → env vars → config file

Token Transmission:

GET / HTTP/1.1
Upgrade: websocket
Authorization: Bearer <64-char-token>
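
The token handling described above might look roughly like this. The function names are hypothetical, and the constant-time comparison is an assumption of this sketch rather than something the project is known to do; only crypto.randomBytes(32) and the 8-character clientId come from this document.

```typescript
import { randomBytes, timingSafeEqual } from "node:crypto";

// 32 random bytes encoded as hex give a 64-character token.
function generateToken(): string {
  return randomBytes(32).toString("hex");
}

// Checks an Authorization header value against the expected token.
function isAuthorized(authorizationHeader: string | undefined, expectedToken: string): boolean {
  const scheme = "Bearer ";
  if (!authorizationHeader || !authorizationHeader.startsWith(scheme)) return false;

  const presented = Buffer.from(authorizationHeader.slice(scheme.length));
  const expected = Buffer.from(expectedToken);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return presented.length === expected.length && timingSafeEqual(presented, expected);
}

// Only a short prefix of the token is used to identify the client in logs.
function clientIdFromToken(token: string): string {
  return token.slice(0, 8);
}
```

A handshake hook would return HTTP 401 when isAuthorized is false.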

Security Best Practices:

  • Always use wss:// (WebSocket over TLS) in production
  • Never log tokens in audit logs (only first 8 chars as clientId)
  • Rotate tokens periodically
  • Use environment variables for token storage, not config files
  • Enable SSL for database connections (ssl.require: true)
  • Use mode: "readonly" for read-only access scenarios

mTLS Support (reserved for future):

  • Mutual TLS authentication via client certificates
  • Configured via auth.type: "mtls" with CA/cert/key paths

Audit Logging

Format: JSON Lines (one JSON object per line)

{
  "timestamp": "2025-12-23T10:30:45.123Z",
  "level": "audit",
  "sessionId": "550e8400-e29b-41d4-a716-446655440000",
  "clientId": "abc12345",
  "environment": "drworks",
  "tool": "pg_query",
  "sqlHash": "a3b2c1d4",
  "sqlPreview": "SELECT * FROM dbo.users WHERE id = $1 LIMIT 100",
  "params": "[REDACTED]",
  "durationMs": 234,
  "rowCount": 15,
  "status": "success"
}

Privacy Protection:

  • SQL parameters are never logged (always "[REDACTED]")
  • SQL preview is truncated to maxSqlLength (default: 200 chars)
  • String literals in SQL are replaced with [STRING], numbers with [NUMBER]
  • Error messages are sanitized to remove potential data leaks
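
A simplified sketch of the SQL preview sanitization follows. The function name sanitizeSqlForAudit and the regular expressions are illustrative assumptions; a real implementation would need to handle more SQL syntax (comments, dollar-quoted strings, and so on).

```typescript
// Replaces literals with placeholders, then truncates to the preview length.
function sanitizeSqlForAudit(sql: string, maxSqlLength = 200): string {
  const sanitized = sql
    // Single-quoted strings, including doubled '' escapes.
    .replace(/'(?:[^']|'')*'/g, "[STRING]")
    // Standalone numbers; digits inside identifiers or $1-style placeholders are kept.
    .replace(/(^|[^\w$])(\d+(?:\.\d+)?)\b/g, "$1[NUMBER]");

  return sanitized.length > maxSqlLength ? sanitized.slice(0, maxSqlLength) : sanitized;
}
```

Sanitizing before truncating means a cut can never land inside a literal and leave part of it in the log.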

Slow Query Warnings:

  • Queries exceeding slowQueryMs (default: 2000ms) trigger warning logs
  • Includes sqlHash for correlation with audit logs
  • Used for performance monitoring without exposing full queries

Concurrency and Session Management

Architecture:

WebSocket Connection → Session (UUID) → Query Queue → Connection Pool
                                      ↓
                              Transaction Client (if active)

Concurrency Limits:

  1. Global: maxConcurrentClients (default: 50) - max WebSocket connections
  2. Per-Session: maxQueriesPerSession (default: 5) - concurrent queries per client
  3. Per-Environment: pool.max (default: 10) - database connections per environment

Session Isolation:

  • Each WebSocket connection gets a unique sessionId (UUID v4)
  • Sessions track activeQueries count and enforce per-session limits
  • Sessions automatically time out after sessionTimeout (default: 1 hour)
  • Stale sessions are cleaned up every 60 seconds
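
The per-session limit and stale-session cleanup could be modeled as below. SessionRegistry and its methods are hypothetical names, and timestamps are passed in explicitly so the behavior is easy to test; the defaults mirror the values stated above.

```typescript
interface SessionState {
  id: string;
  activeQueries: number;
  lastActivity: number; // milliseconds since epoch
}

class SessionRegistry {
  private readonly sessions = new Map<string, SessionState>();
  private readonly maxQueriesPerSession: number;
  private readonly sessionTimeoutMs: number;

  constructor(maxQueriesPerSession = 5, sessionTimeoutMs = 60 * 60 * 1000) {
    this.maxQueriesPerSession = maxQueriesPerSession;
    this.sessionTimeoutMs = sessionTimeoutMs;
  }

  open(id: string, now: number): void {
    this.sessions.set(id, { id, activeQueries: 0, lastActivity: now });
  }

  // Returns false when the session is unknown or already at its query limit.
  tryStartQuery(id: string, now: number): boolean {
    const session = this.sessions.get(id);
    if (!session || session.activeQueries >= this.maxQueriesPerSession) return false;
    session.activeQueries += 1;
    session.lastActivity = now;
    return true;
  }

  finishQuery(id: string): void {
    const session = this.sessions.get(id);
    if (session && session.activeQueries > 0) session.activeQueries -= 1;
  }

  // Meant to run on a timer; removes idle sessions and returns their IDs.
  cleanupStale(now: number): string[] {
    const removed: string[] = [];
    for (const [id, session] of this.sessions) {
      if (now - session.lastActivity > this.sessionTimeoutMs) {
        this.sessions.delete(id);
        removed.push(id);
      }
    }
    return removed;
  }
}
```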

Transaction Binding:

  • pg_begin_transaction acquires a dedicated client from the pool
  • This client is stored in session.transactionClient and bound to the session
  • All subsequent queries in that environment use this client (not the pool)
  • pg_commit_transaction or pg_rollback_transaction releases the client
  • If client disconnects during transaction, automatic ROLLBACK occurs
  • This prevents transaction state leaks across different AI clients

Implementation Detail (src/session/session-manager.ts:80-120):

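// Transaction begins: take a dedicated client from the pool and open the transaction
const client = await pool.connect();
await client.query('BEGIN');
session.transactionClient = client;
session.transactionEnv = environmentName;

// Subsequent queries in the same environment route to the transaction client
if (session.transactionClient && session.transactionEnv === env) {
  return session.transactionClient.query(sql);
}

// On disconnect or timeout: roll back, then always return the client to the pool
if (session.transactionClient) {
  const txClient = session.transactionClient;
  session.transactionClient = undefined;
  session.transactionEnv = undefined;
  try {
    await txClient.query('ROLLBACK');
  } finally {
    txClient.release();
  }
}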

Transaction Safety:

  • The TransactionManager stores transaction clients in a WeakMap keyed by the session object (a string session ID cannot be a WeakMap key)
  • On disconnect, the SessionManager automatically rolls back active transactions
  • Never use pool.query() for operations within a transaction; always use the session-bound client
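
To make the safety rules concrete, here is a self-contained sketch that can be exercised against a fake pool. The interfaces mirror only the small part of a pg-style pool that the sketch needs, and all names (TxClient, PoolLike, TxSession, beginTransaction, runQuery, endSession) are hypothetical.

```typescript
interface TxClient {
  query(sql: string): Promise<unknown>;
  release(): void;
}

interface PoolLike {
  connect(): Promise<TxClient>;
  query(sql: string): Promise<unknown>;
}

interface TxSession {
  transactionClient?: TxClient;
  transactionEnv?: string;
}

async function beginTransaction(session: TxSession, pool: PoolLike, env: string): Promise<void> {
  if (session.transactionClient) throw new Error("Transaction already active");
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
  } catch (err) {
    client.release(); // do not leak the client if BEGIN fails
    throw err;
  }
  session.transactionClient = client;
  session.transactionEnv = env;
}

// Routes to the session-bound client inside a transaction, otherwise to the pool.
function runQuery(session: TxSession, pool: PoolLike, env: string, sql: string): Promise<unknown> {
  if (session.transactionClient && session.transactionEnv === env) {
    return session.transactionClient.query(sql);
  }
  return pool.query(sql);
}

// Called on disconnect or timeout: roll back and always release the client.
async function endSession(session: TxSession): Promise<void> {
  const client = session.transactionClient;
  if (!client) return;
  session.transactionClient = undefined;
  session.transactionEnv = undefined;
  try {
    await client.query("ROLLBACK");
  } finally {
    client.release();
  }
}
```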

Connection Pool Lifecycle:

  • Pools are lazily created on first use (getPool method)
  • Each pool has configurable max connections, idle timeout, and statement timeout
  • Graceful shutdown closes all pools via PostgresConnectionManager.closeAll()
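
Lazy pool creation can be sketched with a map and a factory function. LazyPoolRegistry and ClosablePool are hypothetical names; the end() method matches what a pg-style pool exposes, but the factory is injected so the example does not depend on any database driver.

```typescript
interface ClosablePool {
  end(): Promise<void>;
}

class LazyPoolRegistry<P extends ClosablePool> {
  private readonly pools = new Map<string, P>();
  private readonly createPool: (environment: string) => P;

  constructor(createPool: (environment: string) => P) {
    this.createPool = createPool;
  }

  // Creates the pool on first use and reuses it afterwards.
  getPool(environment: string): P {
    let pool = this.pools.get(environment);
    if (!pool) {
      pool = this.createPool(environment);
      this.pools.set(environment, pool);
    }
    return pool;
  }

  // Closes every pool that was created; used during graceful shutdown.
  async closeAll(): Promise<void> {
    const open = Array.from(this.pools.values());
    this.pools.clear();
    await Promise.all(open.map((pool) => pool.end()));
  }
}
```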

MCP Tool Registration:

  • Tools are registered in tools/index.ts by calling registration functions from each tool category
  • Each tool must have a unique name (prefixed with pg_)
  • Tool schemas use zod for validation; the MCP SDK handles schema conversion

Error Handling:

  • Database errors are caught and returned as MCP error responses
  • The server never crashes on query errors; only fatal startup errors exit the process
  • Audit logger sanitizes SQL and redacts parameters before logging
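
The error-handling rule can be illustrated with a wrapper that turns a thrown error into an error result instead of letting it propagate. The result shape (a content array plus an isError flag) follows the MCP tool result format; the wrapper name safeToolCall is hypothetical, and message sanitization is left out of this sketch.

```typescript
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Runs a tool handler and converts any thrown error into an error result.
async function safeToolCall(run: () => Promise<string>): Promise<ToolResult> {
  try {
    const text = await run();
    return { content: [{ type: "text", text }] };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return {
      content: [{ type: "text", text: `Database error: ${message}` }],
      isError: true,
    };
  }
}
```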

Client Configuration

MCP Client Setup

AI clients (Claude Code, Cursor, etc.) connect via MCP client configuration:

{
  "mcpServers": {
    "database": {
      "transport": "websocket",
      "endpoint": "ws://localhost:7700",
      "headers": {
        "Authorization": "Bearer your-token-here"
      }
    }
  }
}

For SSE transport (added in v1.0.1):

{
  "mcpServers": {
    "database": {
      "url": "http://localhost:7700/sse",
      "headers": {
        "Authorization": "Bearer your-token-here"
      }
    }
  }
}

Available MCP Tools

The server exposes 30+ PostgreSQL tools grouped by category:

Metadata Tools:

  • pg_list_environments - List configured environments
  • pg_list_schemas - List schemas in environment
  • pg_list_tables - List tables in schema
  • pg_describe_table - Get table structure (columns, types, constraints)
  • pg_list_views - List views
  • pg_list_functions - List functions
  • pg_list_indexes - List indexes
  • pg_list_constraints - List constraints
  • pg_list_triggers - List triggers

Query Tools:

  • pg_query - Execute read-only SELECT query
  • pg_explain - Get query execution plan (EXPLAIN)

Data Manipulation Tools:

  • pg_insert - Insert single row
  • pg_update - Update rows
  • pg_delete - Delete rows
  • pg_upsert - Insert or update (ON CONFLICT)
  • pg_bulk_insert - Batch insert multiple rows

Transaction Tools:

  • pg_begin_transaction - Start transaction
  • pg_commit_transaction - Commit transaction
  • pg_rollback_transaction - Rollback transaction

Diagnostic Tools:

  • pg_analyze_query - Analyze query performance
  • pg_check_connection - Verify database connectivity

All tools require an environment parameter that specifies which database to use.

Version History and Roadmap

Changelog Maintenance (IMPORTANT)

Every code change MUST be documented in the changelog. This is a critical project requirement.

How to Update Changelog

When making any code changes:

  1. Update changelog.json in the project root:

    • Increment the version number following semantic versioning
    • Use format: major.minor.patch or major.minor.patch.buildnumber (e.g., 1.0.1.03)
    • Add/update the version entry with:
      • version: New version number
      • date: Current date (YYYY-MM-DD format)
      • description: Brief summary of changes (Chinese or English)
      • changes: Array of specific changes made
  2. Version number is auto-synced:

    • The server automatically reads currentVersion from changelog.json
    • No need to manually update package.json or server.ts
    • The /changelog endpoint exposes full version history
  3. Example changelog entry:

{
  "version": "1.0.1.04",
  "date": "2024-12-25",
  "description": "Add new feature",
  "changes": [
    "Add XXX feature",
    "Fix YYY bug",
    "Optimize ZZZ performance"
  ]
}

  4. View changelog:

# Via HTTP endpoint (no authentication required)
curl http://localhost:7700/changelog

# Or check the file directly
cat changelog.json

Remember: Documentation is as important as code. Always update the changelog before committing!


Version History

v1.0.0 (Initial Release)

  • WebSocket transport with custom WebSocketServerTransport implementation
  • Token-based authentication
  • Multi-environment configuration with per-environment connection pools
  • Multi-schema access via defaultSchema and searchPath
  • Session-based transaction management
  • Audit logging with SQL parameter redaction
  • Health check endpoint
  • Docker support with graceful shutdown
  • 30+ PostgreSQL MCP tools

v1.0.1 (2024-12-21)

  • Added SSE transport to support clients without WebSocket (e.g., cursor-browser-extension)
  • Unified server now handles both WebSocket and SSE on same port
  • SSE endpoint: GET /sse for stream, POST /messages for client messages
  • Backward compatible with v1.0.0 WebSocket clients

v1.0.1.01 (2024-12-22)

  • Fixed connection pool leak issue
  • Fixed SSE disconnect/reconnect logic
  • Improved error handling

v1.0.1.02 (2024-12-23)

  • Fixed ssl.require: false configuration not taking effect
  • Improved SSL configuration validation logic
  • Updated documentation for SSL configuration

v1.0.1.03 (2024-12-24)

  • Added allowUnauthenticatedRemote configuration option
  • Allow explicit enabling of unauthenticated remote access in trusted networks
  • Improved security validation error messages
  • New /changelog endpoint to view version update history (no authentication required)

Future Roadmap

  • Multi-database support (SQL Server, MySQL adapters)
  • mTLS authentication implementation
  • RBAC (role-based access control) for fine-grained permissions
  • Rate limiting and quota management per client
  • Configuration hot reload
  • Metrics and Prometheus export

Testing Notes

The project currently has placeholder tests. When adding tests:

  • Create tests under __tests__/ directory
  • Use the connection manager's withClient method for database interaction in tests
  • Test files should use .test.ts extension
  • Consider testing transaction rollback behavior and session cleanup

Deployment and Operations

Docker Deployment

Build Image:

docker build -t mcp-database-server:1.0.1 .

Run Container:

docker run -d \
  --name mcp-database-server \
  -p 7700:7700 \
  -v $(pwd)/config/database.json:/app/config/database.json:ro \
  -e MCP_AUTH_TOKEN=your-token \
  -e MCP_DRWORKS_PASSWORD=your-password \
  mcp-database-server:1.0.1

Docker Compose:

docker compose up -d

Health Check

curl http://localhost:7700/health

Response format:

{
  "status": "ok",
  "uptime": 3600,
  "version": "1.0.0",
  "clients": 5,
  "environments": [
    {
      "name": "drworks",
      "status": "connected",
      "poolSize": 10,
      "activeConnections": 2
    }
  ],
  "timestamp": "2025-12-23T10:30:00.000Z"
}

Status values: ok (all connected) | degraded (some disconnected) | error (critical failure)
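
A minimal sketch of how the overall status might be derived from the per-environment states is shown below. The "disconnected" state name and the rule that "error" means no environment is connected are assumptions of this sketch; the document only says that "error" indicates a critical failure.

```typescript
type EnvConnectionStatus = "connected" | "disconnected";
type ServerStatus = "ok" | "degraded" | "error";

interface EnvironmentHealth {
  name: string;
  status: EnvConnectionStatus;
}

function overallStatus(environments: EnvironmentHealth[]): ServerStatus {
  const connected = environments.filter((e) => e.status === "connected").length;
  if (connected === environments.length) return "ok";
  // Assumption: no reachable environment at all counts as a critical failure.
  if (connected === 0) return "error";
  return "degraded";
}
```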

Changelog Endpoint

View version history and system updates:

curl http://localhost:7700/changelog

Response format:

{
  "currentVersion": "1.0.1.03",
  "changelog": [
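    {
      "version": "1.0.1.03",
      "date": "2024-12-24",
      "description": "Improve security configuration flexibility and add changelog feature",
      "changes": [
        "Add allowUnauthenticatedRemote configuration option",
        "Allow explicitly enabling unauthenticated remote access in trusted networks",
        "Improve security validation error messages",
        "Add /changelog endpoint for viewing version update history"
      ]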
    }
  ]
}

Note: This endpoint does not require authentication and can be accessed publicly.

Graceful Shutdown

The server handles SIGTERM and SIGINT signals:

  1. Stops accepting new connections
  2. Rolls back active transactions
  3. Closes all sessions
  4. Closes database connection pools
  5. Exits cleanly

# Docker
docker stop mcp-database-server

# Direct process
kill -TERM $(pgrep -f "node dist/src/server.js")

Command Line Options

node dist/src/server.js [options]

Options:
  --config <path>       Configuration file path (default: ./config/database.json)
  --listen <host:port>  Listen address (overrides config)
  --auth-token <token>  Auth token (overrides config)
  --log-level <level>   Log level (overrides config: debug/info/warn/error)

Environment variables override the configuration file:

  • MCP_CONFIG - Configuration file path
  • MCP_LISTEN - Listen address (host:port)
  • MCP_AUTH_TOKEN - Authentication token
  • MCP_LOG_LEVEL - Log level
  • MCP_<ENV>_PASSWORD - Database password for environment