# HTML to PDF Service (MVP)

An HTML-to-PDF microservice built on PuppeteerSharp and Chromium.

## Features

- ✅ HTML content to PDF
- ✅ URL to PDF
- ✅ **HTML content to image** (PNG/JPEG/WebP)
- ✅ **URL to image** (custom resolution supported)
- ✅ Pooled browser instances for high throughput
- ✅ Concurrency control
- ✅ Local file storage (optional)
- ✅ Callbacks (global and per-request)
- ✅ Containerized deployment with Docker
- ✅ Health check endpoint

## Tech Stack

- .NET 8
- PuppeteerSharp 20.2.5
- Chromium (Headless)
- Docker

## Quick Start

### Option 1: Docker Compose (recommended)

```bash
# 1. Build and start the service
docker-compose up -d

# 2. Follow the logs
docker-compose logs -f pdf-service

# 3. Check that the service is up
curl http://localhost:5000/health
```

### Option 2: Docker build

```bash
# 1. Build the image
docker build -t html-to-pdf-service:latest .

# 2. Run the container
docker run -d \
  --name pdf-service \
  -p 5000:5000 \
  -v "$(pwd)/pdfs:/app/pdfs" \
  -e PdfService__BrowserPool__MaxConcurrent=5 \
  html-to-pdf-service:latest

# 3. Follow the logs
docker logs -f pdf-service
```

### Option 3: Local development

```bash
# 1. Change to the src directory
cd src

# 2. Restore dependencies
dotnet restore

# 3. Run the service
cd HtmlToPdfService.Api
dotnet run
```

## API Usage Examples

### 1. HTML content to PDF

```bash
curl -X POST http://localhost:5000/api/pdf/convert/html \
  -H "Content-Type: application/json" \
  -d '{
    "html": "<html><body><h1>Hello World</h1><p>This is a test</p></body></html>",
    "options": {
      "format": "A4",
      "landscape": false,
      "printBackground": true,
      "margin": {
        "top": "10mm",
        "right": "10mm",
        "bottom": "10mm",
        "left": "10mm"
      }
    },
    "saveLocal": true
  }' \
  --output test.pdf
```

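
The same request can be issued from code. What follows is a minimal Python sketch using only the standard library. It assumes the service listens on `http://localhost:5000` as in the curl example, and that the response body is the PDF itself, which the `--output` flag suggests but this README does not state. The names `build_html_request` and `convert_html_to_pdf` are illustrative and not part of the service.

```python
import json
import urllib.request

# Assumption: same address as the curl examples in this README.
BASE_URL = "http://localhost:5000"


def build_html_request(html, *, page_format="A4", landscape=False,
                       margin_mm=10, save_local=True):
    """Build the JSON body used by the HTML-to-PDF curl example."""
    side = f"{margin_mm}mm"
    return {
        "html": html,
        "options": {
            "format": page_format,
            "landscape": landscape,
            "printBackground": True,
            "margin": {"top": side, "right": side, "bottom": side, "left": side},
        },
        "saveLocal": save_local,
    }


def convert_html_to_pdf(html, output_path, base_url=BASE_URL, timeout=60):
    """POST the request and write the response body to output_path.

    Assumes the response body is the raw PDF (not confirmed by the README).
    Returns the number of bytes written.
    """
    body = json.dumps(build_html_request(html)).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/api/pdf/convert/html",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        pdf_bytes = response.read()
    with open(output_path, "wb") as fh:
        fh.write(pdf_bytes)
    return len(pdf_bytes)
```

With the service running, `convert_html_to_pdf("<html><body><h1>Hello World</h1></body></html>", "test.pdf")` would play the role of the curl command. Keeping the payload builder separate from the network call makes the request shape easy to check without a live service.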
### 2. URL to PDF

```bash
curl -X POST http://localhost:5000/api/pdf/convert/url \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.baidu.com",
    "waitUntil": "networkidle2",
    "timeout": 30000,
    "options": {
      "format": "A4"
    }
  }' \
  --output baidu.pdf
```

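
A small client-side check can catch malformed URL requests before they reach the service. The sketch here is illustrative only: `build_url_request` is a hypothetical helper, and the list of `waitUntil` values is the set of lifecycle events defined by Puppeteer. Whether this service accepts all of them is an assumption, since only `networkidle2` appears in the example.

```python
# Lifecycle events defined by Puppeteer. Assumption: the service forwards
# this string to the browser; only "networkidle2" is shown in the README.
WAIT_UNTIL_VALUES = ("load", "domcontentloaded", "networkidle0", "networkidle2")


def build_url_request(url, wait_until="networkidle2", timeout_ms=30000,
                      page_format="A4"):
    """Build the JSON body used by the URL-to-PDF curl example."""
    if not url.startswith(("http://", "https://")):
        raise ValueError("url must start with http:// or https://")
    if wait_until not in WAIT_UNTIL_VALUES:
        raise ValueError(f"waitUntil must be one of {WAIT_UNTIL_VALUES}")
    if timeout_ms <= 0:
        raise ValueError("timeout must be a positive number of milliseconds")
    return {
        "url": url,
        "waitUntil": wait_until,
        "timeout": timeout_ms,
        "options": {"format": page_format},
    }
```

Called with only a URL, the helper reproduces the body of the curl example, including the 30000 ms timeout.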
### 3. Conversion with a callback

```bash
curl -X POST http://localhost:5000/api/pdf/convert/html \
  -H "Content-Type: application/json" \
  -d '{
    "html": "<html><body><h1>Test with Callback</h1></body></html>",
    "callback": {
      "url": "https://your-callback-server.com/webhook",
      "headers": {
        "X-API-Key": "your-api-key"
      },
      "includePdfData": false
    },
    "saveLocal": true
  }' \
  --output test.pdf
```

### 4. Download a generated PDF

```bash
# Replace {requestId} with the ID of the conversion request
curl http://localhost:5000/api/pdf/download/{requestId} \
  --output downloaded.pdf
```

### 5. HTML to image

```bash
# Basic usage
curl -X POST http://localhost:5000/api/image/convert/html \
  -H "Content-Type: application/json" \
  -d '{
    "html": "<html><body><h1>Convert to image</h1></body></html>",
    "options": {
      "format": "png",
      "fullPage": true,
      "width": 1920,
      "height": 1080
    }
  }' \
  --output screenshot.png

# JPEG with a quality setting
curl -X POST http://localhost:5000/api/image/convert/html \
  -H "Content-Type: application/json" \
  -d '{
    "html": "<html><body><h1>High-quality image</h1></body></html>",
    "options": {
      "format": "jpeg",
      "quality": 90,
      "width": 1920,
      "height": 1080
    }
  }' \
  --output screenshot.jpg
```

### 6. URL to image

```bash
curl -X POST http://localhost:5000/api/image/convert/url \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.baidu.com",
    "options": {
      "format": "png",
      "fullPage": true,
      "width": 1920,
      "height": 1080
    }
  }' \
  --output baidu.png
```

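
The `options` object in the image requests combines several constraints (format-specific `quality`, png-only transparency). As a rough illustration, a client could normalize and check its options before sending them. The defaults and constraints in this sketch are taken from the image conversion parameter table in this README; `normalize_image_options` is a hypothetical helper, and how the service itself reacts to invalid combinations is not documented, so the strictness here is an assumption.

```python
ALLOWED_FORMATS = {"png", "jpeg", "webp"}

# Defaults copied from the image conversion parameter table.
DEFAULTS = {
    "format": "png",
    "quality": 90,
    "fullPage": True,
    "width": 1920,
    "height": 1080,
    "omitBackground": False,
    "clip": None,
}


def normalize_image_options(options=None):
    """Merge caller options over the documented defaults and validate them."""
    merged = {**DEFAULTS, **(options or {})}

    unknown = set(merged) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown option(s): {sorted(unknown)}")

    fmt = str(merged["format"]).lower()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    merged["format"] = fmt

    if fmt == "png":
        # quality applies to jpeg/webp only, so it is dropped for png
        merged.pop("quality")
    elif not 0 <= merged["quality"] <= 100:
        raise ValueError("quality must be between 0 and 100")

    if fmt != "png" and merged["omitBackground"]:
        raise ValueError("omitBackground is documented for png only")

    if merged["width"] <= 0 or merged["height"] <= 0:
        raise ValueError("width and height must be positive")

    clip = merged["clip"]
    if clip is not None and set(clip) != {"x", "y", "width", "height"}:
        raise ValueError("clip needs exactly x, y, width and height")

    return merged
```

For example, `normalize_image_options({"format": "JPEG", "quality": 80})` returns the full option set with `format` lowercased to `jpeg`, while a call with no arguments yields the png defaults without a `quality` key.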
#### Image conversion parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `format` | Image format: png, jpeg, webp | png |
| `quality` | Image quality (0-100), jpeg/webp only | 90 |
| `fullPage` | Capture the full page | true |
| `width` | Viewport width (pixels) | 1920 |
| `height` | Viewport height (pixels) | 1080 |
| `omitBackground` | Transparent background (png only) | false |
| `clip` | Capture region (x, y, width, height) | null |

### 7. Health check

```bash
curl http://localhost:5000/health
```

Example response:

```json
{
  "status": "Healthy",
  "timestamp": "2024-12-10T10:30:00Z",
  "browserPool": {
    "totalInstances": 3,
    "availableInstances": 2,
    "maxInstances": 10
  },
  "queue": {
    "currentTasks": 1,
    "maxConcurrent": 5
  }
}
```

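
A monitoring script might reduce the health response to a few numbers. The sketch here does that for the sample response. It assumes that `totalInstances` counts browsers currently created and `availableInstances` counts the idle ones; the README does not define these fields, and `summarize_health` is a hypothetical helper.

```python
import json

# The example health response from this README.
SAMPLE_RESPONSE = """
{
  "status": "Healthy",
  "timestamp": "2024-12-10T10:30:00Z",
  "browserPool": {"totalInstances": 3, "availableInstances": 2, "maxInstances": 10},
  "queue": {"currentTasks": 1, "maxConcurrent": 5}
}
"""


def summarize_health(payload):
    """Derive pool and queue headroom from a parsed health response."""
    pool = payload["browserPool"]
    queue = payload["queue"]
    return {
        "healthy": payload["status"] == "Healthy",
        # Assumed meaning: created instances that are not idle.
        "busyInstances": pool["totalInstances"] - pool["availableInstances"],
        # How many more browsers the pool may still create.
        "poolHeadroom": pool["maxInstances"] - pool["totalInstances"],
        # How many more conversions can start right now.
        "freeSlots": queue["maxConcurrent"] - queue["currentTasks"],
    }


summary = summarize_health(json.loads(SAMPLE_RESPONSE))
print(summary)
# {'healthy': True, 'busyInstances': 1, 'poolHeadroom': 7, 'freeSlots': 4}
```

A `freeSlots` value of 0 would indicate that new requests have to wait for a running conversion to finish.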
## Configuration

The main settings live in `appsettings.json`:

```json
{
  "PdfService": {
    "BrowserPool": {
      "MaxInstances": 10,       // Maximum number of browser instances
      "MinInstances": 2,        // Minimum number of instances kept warm
      "MaxConcurrent": 5,       // Maximum number of concurrent conversions
      "AcquireTimeout": 30000   // Timeout for acquiring an instance (ms)
    },
    "Storage": {
      "SaveLocalCopy": true,    // Whether to keep a local copy
      "LocalPath": "/app/pdfs", // Storage path
      "RetentionHours": 24      // File retention period (hours)
    },
    "Callback": {
      "Enabled": true,          // Whether callbacks are enabled
      "DefaultUrl": "",         // Default callback URL
      "Timeout": 30000,         // Callback timeout (ms)
      "IncludePdfData": false   // Whether to include the PDF as Base64 in the callback
    }
  }
}
```

### Environment variables

Any setting can be overridden with an environment variable:

```bash
export PdfService__BrowserPool__MaxInstances=20
export PdfService__BrowserPool__MaxConcurrent=10
export PdfService__Storage__SaveLocalCopy=false
export PdfService__Callback__DefaultUrl=https://your-callback.com/webhook
```

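
These names follow the standard .NET configuration convention: the path to a setting in `appsettings.json` is joined with a double underscore (`__`) to form the environment variable name. As an illustration, this short Python sketch derives the variable names from a nested settings dictionary. `to_env_vars` is a hypothetical helper for generating deployment files, not something the service ships.

```python
def to_env_vars(config, prefix=()):
    """Flatten nested settings into {ENV_NAME: value} using '__' separators."""
    result = {}
    for key, value in config.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Recurse into nested sections, carrying the path so far.
            result.update(to_env_vars(value, path))
        elif isinstance(value, bool):
            # Booleans are written in lowercase, as in the export example.
            result["__".join(path)] = "true" if value else "false"
        else:
            result["__".join(path)] = str(value)
    return result


overrides = {
    "PdfService": {
        "BrowserPool": {"MaxInstances": 20, "MaxConcurrent": 10},
        "Storage": {"SaveLocalCopy": False},
    }
}

for name, value in to_env_vars(overrides).items():
    print(f"export {name}={value}")
# export PdfService__BrowserPool__MaxInstances=20
# export PdfService__BrowserPool__MaxConcurrent=10
# export PdfService__Storage__SaveLocalCopy=false
```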
## Callback Format

When a conversion finishes, the service sends a POST request to the configured callback URL:

```json
{
  "requestId": "uuid-generated",
  "status": "success",
  "timestamp": "2024-12-10T10:30:00Z",
  "duration": 1523,
  "result": {
    "fileSize": 102400,
    "downloadUrl": "http://service/api/pdf/download/{requestId}",
    "pdfBase64": "JVBERi0xLjQK...", // present only when includePdfData is true
    "expiresAt": "2024-12-11T10:30:00Z"
  },
  "source": {
    "type": "html",
    "content": "...",
    "options": {}
  },
  "error": null
}
```

On failure:

```json
{
  "requestId": "uuid-generated",
  "status": "failed",
  "timestamp": "2024-12-10T10:30:00Z",
  "duration": 523,
  "error": {
    "code": "CONVERSION_FAILED",
    "message": "Page load timed out",
    "details": "..."
  }
}
```

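
A callback receiver has to handle both payload shapes. The following sketch shows one way to branch on `status` and decode the optional Base64 PDF, using only the fields that appear in the two examples. `handle_callback` is a hypothetical function; wiring it into a web framework, and verifying any headers such as `X-API-Key`, is left out.

```python
import base64


def handle_callback(payload):
    """Turn a parsed callback payload into a small summary dict."""
    summary = {
        "requestId": payload.get("requestId"),
        "ok": payload.get("status") == "success",
        "downloadUrl": None,
        "pdf": None,
        "error": None,
    }
    if summary["ok"]:
        result = payload.get("result") or {}
        summary["downloadUrl"] = result.get("downloadUrl")
        encoded = result.get("pdfBase64")
        if encoded:
            # Only present when includePdfData is true.
            summary["pdf"] = base64.b64decode(encoded)
    else:
        # The failure example carries no "result", only "error".
        error = payload.get("error") or {}
        summary["error"] = f"{error.get('code')}: {error.get('message')}"
    return summary
```

When `pdf` is `None` on a successful callback, the file can still be fetched from `downloadUrl` until the `expiresAt` time given in the payload.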
## Performance

| Metric | Recommended value |
|--------|-------------------|
| Memory | 1-2 GB |
| CPU | 1-2 cores |
| Concurrency | 5-10 |
| Time per conversion | 0.5-5 s |

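
These figures allow a rough capacity estimate. If every concurrency slot stays busy, steady-state throughput is approximately the concurrency divided by the time per conversion. The calculation here is a back-of-the-envelope sketch, not a benchmark: it assumes conversion time does not get worse under load, which is optimistic on 1-2 cores.

```python
def throughput_range(max_concurrent, fastest_s, slowest_s):
    """Return (low, high) conversions per second with all slots busy."""
    return max_concurrent / slowest_s, max_concurrent / fastest_s


# Concurrency of 5 with conversions taking between 0.5 and 5 seconds.
low, high = throughput_range(5, 0.5, 5.0)
print(f"{low:.0f}-{high:.0f} conversions per second")
# 1-10 conversions per second
```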
## Troubleshooting

### 1. Chromium fails to start

Make sure the Docker container has sufficient privileges:

```bash
docker run --cap-add=SYS_ADMIN ...
```

Alternatively, launch Chromium with the `--no-sandbox` flag (already configured by default).

### 2. Chinese fonts render incorrectly

The Dockerfile already installs Chinese fonts:

- fonts-wqy-zenhei
- fonts-wqy-microhei

### 3. Out of memory

Raise the memory limit of the Docker container:

```yaml
deploy:
  resources:
    limits:
      memory: 4G
```

## Roadmap

- [x] Core conversion
- [x] Browser pooling
- [x] Callbacks
- [x] Docker deployment
- [ ] Asynchronous task processing
- [ ] Authentication and authorization
- [ ] Admin console
- [ ] Kubernetes support

## License

MIT License

## Contact

For questions or suggestions, please open an issue.