<div align="center">

# 🎨 HtmlToPdf Service

**High-performance microservice for converting HTML/URLs to PDF and images**

Built on .NET 9, PuppeteerSharp, and Chromium

[.NET](https://dotnet.microsoft.com/)
[PuppeteerSharp](https://www.puppeteersharp.com/)
[Docker](https://www.docker.com/)
[License](LICENSE)

[Features](#-features) •
[Quick Start](#-quick-start) •
[API Reference](#-api-reference) •
[Configuration](#-configuration) •
[Deployment Guide](#-deployment-guide) •
[Contributing](#-contributing)

</div>

---

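## ✨ Features

### Core Conversion

- 🔄 **HTML → PDF**: converts complex HTML content
- 🌐 **URL → PDF**: converts any reachable web page, including JavaScript-rendered SPAs (React/Vue/Angular)
- 🖼️ **HTML/URL → Image**: outputs PNG, JPEG, or WebP
- 📐 **Custom options**: paper size, orientation, margins, headers and footers, and more

### High-Performance Architecture

- ⚡ **Asynchronous task queue**: Redis-backed queue built for high concurrency
- 🔧 **Browser pooling**: instance reuse, automatic restarts, and health checks
- 📦 **Batch processing**: submit multiple conversion tasks in one request
- 💾 **Result caching**: identical content is not converted twice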
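
The result cache avoids converting identical content twice. One way such a cache key could be derived is to hash a canonical form of the request. The sketch below is an illustration only: the `cache_key` helper and the choice of SHA-256 over canonical JSON are assumptions, not the service's actual implementation.

```python
import hashlib
import json


def cache_key(source: dict, options: dict) -> str:
    """Return a stable SHA-256 key for a conversion request (hypothetical helper)."""
    # sort_keys makes the JSON canonical, so key order in the request does not matter.
    canonical = json.dumps(
        {"source": source, "options": options},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


a = cache_key({"type": "html", "content": "<h1>Hi</h1>"}, {"format": "A4", "printBackground": True})
b = cache_key({"content": "<h1>Hi</h1>", "type": "html"}, {"printBackground": True, "format": "A4"})
c = cache_key({"type": "html", "content": "<h1>Bye</h1>"}, {"format": "A4", "printBackground": True})
print(a == b)  # same request with keys in a different order
print(a == c)  # different HTML content
```

Hashing both the source and the options matters here: the same HTML rendered as A4 and as Letter should not share a cache entry.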
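
### Enterprise-Grade Security

- 🔐 **API key authentication**: multi-tenant API key management
- 🛡️ **Rate limiting**: per-IP and per-user limits
- 🚫 **SSRF protection**: blocks access to internal network addresses
- 🔑 **Idempotency**: an `Idempotency-Key` prevents duplicate submissions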
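
To make the SSRF protection concrete, the following sketch shows one common way to reject URLs that point at internal addresses. It is a minimal illustration under stated assumptions: the `is_blocked_ip` and `is_url_allowed` helpers are hypothetical, and the service's real checks may differ.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_blocked_ip(ip: str) -> bool:
    """True if the address is private, loopback, link-local, or otherwise non-public."""
    addr = ipaddress.ip_address(ip)
    return (
        addr.is_private
        or addr.is_loopback
        or addr.is_link_local
        or addr.is_reserved
        or addr.is_multicast
        or addr.is_unspecified
    )


def is_url_allowed(url: str, resolve=None) -> bool:
    """Allow only http(s) URLs whose host resolves exclusively to public addresses."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False
    if resolve is None:
        # Default resolver: every address the hostname maps to.
        def resolve(host):
            return [info[4][0] for info in socket.getaddrinfo(host, None)]
    try:
        addresses = resolve(parsed.hostname)
    except OSError:
        return False
    return bool(addresses) and not any(is_blocked_ip(a) for a in addresses)


# The resolver is injected here so the example runs without DNS access.
print(is_url_allowed("http://intranet.example/", resolve=lambda host: ["10.0.0.5"]))
print(is_url_allowed("https://example.com/", resolve=lambda host: ["8.8.8.8"]))
```

A check like this only helps if it is applied to the address the browser actually connects to; a hostname that resolves differently on a second lookup (DNS rebinding) can otherwise slip past it.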
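
### Operations-Friendly

- 📊 **Prometheus monitoring**: full metrics collection
- 🏥 **Health checks**: readiness and liveness probes, compatible with Kubernetes
- 📝 **Structured logging**: log output via Serilog
- 🎛️ **Admin console**: visual management UI built with Vue 3

### Flexible Storage

- 💿 **Local storage**: suited to single-machine deployments
- ☁️ **Alibaba Cloud OSS**: supports CDN acceleration
- ☁️ **Tencent Cloud COS**: an option for multi-cloud storage

---
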
## 📁 Project Structure

```
html-to-pdf/
├── src/                                  # Full-version source code
│   ├── HtmlToPdfService.Api/             # Web API layer
│   ├── HtmlToPdfService.Core/            # Core business logic
│   ├── HtmlToPdfService.Queue/           # Task queue layer
│   ├── HtmlToPdfService.Infrastructure/  # Infrastructure layer
│   ├── HtmlToPdfService.Admin/           # Admin console (Vue 3)
│   ├── HtmlToPdfService.Tests/           # Test project
│   ├── docker-compose.yml                # Docker Compose file
│   └── Dockerfile                        # Docker build file
├── mvp/                                  # Simplified MVP (synchronous mode)
├── docs/                                 # Design documents
└── README.md                             # This file
```

---

## 🚀 Quick Start

### Option 1: Docker Compose (recommended)

```bash
# Clone the repository
git clone https://github.com/your-username/html-to-pdf.git
cd html-to-pdf/src

# Start the services
docker-compose up -d

# Follow the logs
docker-compose logs -f htmltopdf
```

Once the service is running, the following endpoints are available:

- **API**: http://localhost:5000
- **Swagger**: http://localhost:5000/swagger
- **Health check**: http://localhost:5000/health
- **Admin console**: http://localhost:5000/admin

### Option 2: Local development

```bash
# 1. Start Redis
docker run -d --name redis -p 6379:6379 redis:7-alpine

# 2. Enter the source directory
cd src

# 3. Restore dependencies
dotnet restore

# 4. Run the service
dotnet run --project HtmlToPdfService.Api
```

### Option 3: Simplified MVP

If you only need simple synchronous conversion, use the MVP version:

```bash
cd mvp
docker-compose up -d
```

> **MVP vs. full version:**
> - MVP: synchronous mode, no Redis required, suited to low-concurrency workloads
> - Full version: asynchronous queue with support for high concurrency, task management, and monitoring/alerting

---

## 📖 API Reference

### Asynchronous Task Endpoints (recommended)

#### Submit a PDF conversion task

```bash
curl -X POST http://localhost:5000/api/tasks/pdf \
  -H "Content-Type: application/json" \
  -d '{
    "source": {
      "type": "html",
      "content": "<h1>Hello World</h1><p>This is test content</p>"
    },
    "options": {
      "format": "A4",
      "printBackground": true
    },
    "callback": {
      "url": "https://your-app.com/webhook"
    }
  }'
```

**Response** (the `message` value reads "Task created and queued for processing"):

```json
{
  "taskId": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "message": "任务已创建,正在排队处理",
  "estimatedWaitTime": 3
}
```

#### Query task status

```bash
# Replace {taskId} with the ID returned when the task was submitted
curl http://localhost:5000/api/tasks/{taskId}
```
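
Because tasks run asynchronously, a client usually polls the status endpoint until the task finishes. The sketch below shows one way to do that. It assumes the status response carries a `status` field like the submission response does, and that `processing` is a possible in-flight state; only `pending` appears in this README, so treat any other status name as hypothetical.

```python
import json
import time
import urllib.request


def fetch_status(task_id, base_url="http://localhost:5000"):
    """GET /api/tasks/{taskId} and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/api/tasks/{task_id}") as resp:
        return json.load(resp)


def wait_for_task(fetch, task_id, interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll until the task leaves the in-flight states or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)
        # "processing" is an assumed status name; adjust to what the API returns.
        if task.get("status") not in ("pending", "processing"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {task.get('status')} after {timeout}s")
        sleep(interval)
```

A call such as `wait_for_task(fetch_status, task_id)` blocks until the task settles. If the request included a `callback.url`, the webhook can replace polling entirely.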

#### Download the result file

```bash
# Replace {taskId} with the ID of a finished task
curl -O http://localhost:5000/api/tasks/{taskId}/download
```

#### Submit an image conversion task

```bash
curl -X POST http://localhost:5000/api/tasks/image \
  -H "Content-Type: application/json" \
  -d '{
    "source": {
      "type": "url",
      "content": "https://www.example.com"
    },
    "options": {
      "format": "png",
      "fullPage": true,
      "width": 1920,
      "height": 1080
    }
  }'
```

#### Submit a batch of tasks

```bash
curl -X POST http://localhost:5000/api/tasks/batch \
  -H "Content-Type: application/json" \
  -d '{
    "tasks": [
      { "type": "pdf", "source": { "type": "html", "content": "<h1>Doc 1</h1>" } },
      { "type": "pdf", "source": { "type": "html", "content": "<h1>Doc 2</h1>" } }
    ]
  }'
```
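
The same batch request can be assembled programmatically. The sketch below builds the request body from a list of HTML strings and attaches an `Idempotency-Key` header, so that a retry after a network error is not processed twice. The `build_batch_request` helper is hypothetical, and whether the batch endpoint honors the header is an assumption based on the feature list, not something this README confirms.

```python
import json
import urllib.request
import uuid


def build_batch_request(html_docs, base_url="http://localhost:5000", idempotency_key=None):
    """Build (but do not send) a POST /api/tasks/batch request for a list of HTML strings."""
    body = {
        "tasks": [
            {"type": "pdf", "source": {"type": "html", "content": html}}
            for html in html_docs
        ]
    }
    headers = {
        "Content-Type": "application/json",
        # Reuse the same key when retrying so the server can detect the duplicate.
        "Idempotency-Key": idempotency_key or str(uuid.uuid4()),
    }
    return urllib.request.Request(
        f"{base_url}/api/tasks/batch",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_batch_request(["<h1>Doc 1</h1>", "<h1>Doc 2</h1>"], idempotency_key="order-42")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request)
```

Generating the key once per logical submission, rather than once per HTTP attempt, is what makes retries safe.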

### Synchronous Endpoints (MVP-compatible)

```bash
# HTML to PDF (returned synchronously)
curl -X POST http://localhost:5000/api/pdf/convert/html \
  -H "Content-Type: application/json" \
  -d '{"html": "<h1>Hello World</h1>"}' \
  -o document.pdf

# URL to PDF
curl -X POST http://localhost:5000/api/pdf/convert/url \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com"}' \
  -o document.pdf
```

### System Endpoints

```bash
# Health check
curl http://localhost:5000/health

# Readiness probe (Kubernetes)
curl http://localhost:5000/health/ready

# Prometheus metrics
curl http://localhost:5000/metrics
```

---

## ⚙️ Configuration

The main settings live in `appsettings.json`:

```json
{
  "PdfService": {
    "BrowserPool": {
      "MaxInstances": 10,
      "MinInstances": 2,
      "MaxConcurrent": 5,
      "MaxTasksPerBrowserInstance": 100,
      "BrowserRestartMemoryMb": 600
    },
    "TaskQueue": {
      "Type": "Redis",
      "Redis": {
        "ConnectionString": "localhost:6379"
      },
      "WorkerCount": 5
    },
    "Storage": {
      "Type": "Local",
      "LocalPath": "/app/files",
      "RetentionHours": 168
    },
    "Security": {
      "RequireAuthentication": false,
      "BlockPrivateNetworks": true
    },
    "RateLimit": {
      "Enabled": true,
      "RequestsPerMinutePerIp": 60
    }
  }
}
```
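
The two recycling settings in `BrowserPool` suggest that a browser instance is restarted once it has served a certain number of tasks or grown past a memory threshold. The sketch below shows that decision under this reading of the setting names. The `should_restart` function is hypothetical and the interpretation is an assumption, not a description of the service's internals.

```python
from dataclasses import dataclass


@dataclass
class BrowserPoolOptions:
    # Defaults mirror MaxTasksPerBrowserInstance and BrowserRestartMemoryMb above.
    max_tasks_per_browser_instance: int = 100
    browser_restart_memory_mb: int = 600


def should_restart(tasks_completed: int, memory_mb: float, opts: BrowserPoolOptions) -> bool:
    """Recycle a browser once it has served enough tasks or uses too much memory."""
    return (
        tasks_completed >= opts.max_tasks_per_browser_instance
        or memory_mb >= opts.browser_restart_memory_mb
    )


opts = BrowserPoolOptions()
print(should_restart(tasks_completed=100, memory_mb=250.0, opts=opts))  # task limit reached
print(should_restart(tasks_completed=12, memory_mb=250.0, opts=opts))   # within both limits
```

Recycling on either condition bounds the effect of slow memory growth in long-lived Chromium processes, which is the usual reason pools carry such limits.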

### Storage Configuration

<details>
<summary><b>Local storage (default)</b></summary>

```json
{
  "Storage": {
    "Type": "Local",
    "LocalPath": "/app/files",
    "RetentionHours": 168,
    "AutoCleanup": true
  }
}
```
</details>

<details>
<summary><b>Alibaba Cloud OSS</b></summary>

```json
{
  "Storage": {
    "Type": "OSS",
    "OSS": {
      "Endpoint": "oss-cn-hangzhou.aliyuncs.com",
      "AccessKeyId": "your-access-key-id",
      "AccessKeySecret": "your-access-key-secret",
      "BucketName": "your-bucket-name",
      "PathPrefix": "htmltopdf",
      "UrlExpireSeconds": 3600
    }
  }
}
```
</details>

<details>
<summary><b>Tencent Cloud COS</b></summary>

```json
{
  "Storage": {
    "Type": "COS",
    "COS": {
      "Region": "ap-guangzhou",
      "SecretId": "your-secret-id",
      "SecretKey": "your-secret-key",
      "BucketName": "your-bucket-1250000000",
      "PathPrefix": "htmltopdf",
      "UrlExpireSeconds": 3600
    }
  }
}
```
</details>

### Environment Variables

Every setting can be overridden with an environment variable; a double underscore (`__`) separates nested keys:

```bash
export PdfService__BrowserPool__MaxInstances=20
export PdfService__TaskQueue__Redis__ConnectionString=redis:6379
export PdfService__Security__RequireAuthentication=true
```
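
To illustrate how these variable names map onto the nested JSON settings, the sketch below mimics the convention in plain Python: each `__` descends one level, and the last segment is the key that gets overridden. The `apply_env_overrides` helper is hypothetical and only models the naming rule. The real binding is done by the .NET configuration system, which also treats keys case-insensitively and converts the string values to the target types.

```python
import copy


def apply_env_overrides(config: dict, environ: dict) -> dict:
    """Return a copy of config with Section__Sub__Key=value style overrides applied."""
    result = copy.deepcopy(config)
    for name, value in environ.items():
        path = name.split("__")
        if len(path) < 2 or path[0] not in result:
            continue  # not a hierarchical key for a known top-level section
        node = result
        for key in path[:-1]:
            if not isinstance(node.get(key), dict):
                node[key] = {}  # create missing intermediate sections
            node = node[key]
        node[path[-1]] = value  # values stay strings in this sketch
    return result


config = {"PdfService": {"BrowserPool": {"MaxInstances": 10, "MinInstances": 2}}}
environ = {
    "PdfService__BrowserPool__MaxInstances": "20",
    "PdfService__TaskQueue__Redis__ConnectionString": "redis:6379",
    "PATH": "/usr/bin",
}
merged = apply_env_overrides(config, environ)
print(merged["PdfService"]["BrowserPool"])
print(merged["PdfService"]["TaskQueue"])
```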

---

## 🐳 Deployment Guide

### Single-Instance Docker Deployment

```bash
cd src
docker-compose up -d
```

### Kubernetes Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: htmltopdf-service
spec:
  replicas: 3
  # apps/v1 requires a selector that matches the pod template labels;
  # the "app: htmltopdf" label is an illustrative choice.
  selector:
    matchLabels:
      app: htmltopdf
  template:
    metadata:
      labels:
        app: htmltopdf
    spec:
      containers:
        - name: htmltopdf
          image: htmltopdf-service:2.0
          resources:
            limits:
              memory: "2Gi"
              cpu: "2000m"
          env:
            # Read the Redis connection string from a Secret
            - name: PdfService__TaskQueue__Redis__ConnectionString
              valueFrom:
                secretKeyRef:
                  name: redis-secret
                  key: connection-string
```

### Production Recommendations

| Resource | Single instance | Cluster (3 instances) |
|----------|-----------------|-----------------------|
| Memory | 2-4 GB | 6-12 GB |
| CPU | 2-4 cores | 6-12 cores |
| Throughput | 50+ QPS | 150+ QPS |

---

## 📊 Monitoring and Alerting

### Prometheus Metrics

The service exposes a `/metrics` endpoint that includes these key metrics:

- `conversion_tasks_total` - total number of conversion tasks
- `conversion_duration_seconds` - distribution of conversion durations
- `browser_pool_instances` - state of the browser instances
- `task_queue_length` - queue length
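
For a quick check outside Prometheus, the `/metrics` output can be parsed directly, since it uses the plain-text exposition format. The sketch below reads unlabelled samples into a dictionary. The sample text, its values, and the `status` label are made up for illustration; only the metric names come from the list above.

```python
def parse_metrics(text: str) -> dict:
    """Parse unlabelled samples from Prometheus text exposition format into {name: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        name, _, rest = line.partition(" ")
        if "{" in name:
            continue  # labelled series are ignored in this sketch
        samples[name] = float(rest.split()[0])
    return samples


# Hypothetical scrape output, for illustration only.
SAMPLE = """\
# HELP task_queue_length Number of queued tasks
# TYPE task_queue_length gauge
task_queue_length 12
browser_pool_instances 4
conversion_tasks_total{status="success"} 100
"""

metrics = parse_metrics(SAMPLE)
print(metrics)
print("queue backlog" if metrics["task_queue_length"] > 10 else "queue ok")
```

In production, an alerting rule on `task_queue_length` in Prometheus itself is the more robust choice; a script like this is mainly useful for smoke tests.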

### Grafana Dashboard

Recommended dashboard panels:

- Task success rate
- Average processing time
- Queue backlog
- Browser pool utilization

---

## 🛠️ Tech Stack

| Component | Technology | Version |
|-----------|------------|---------|
| Runtime | .NET | 9.0 |
| Browser control | PuppeteerSharp | 20.x |
| Rendering engine | Chromium | Downloaded automatically |
| Task queue | Redis | 7.0+ |
| Admin frontend | Vue 3 + Element Plus | 3.x |
| Containerization | Docker | 20.x |
| Monitoring | Prometheus | 2.40+ |

---

## 📈 Performance

| Scenario | Throughput | Response time |
|----------|------------|---------------|
| Simple HTML → PDF | 100+ QPS | < 3s |
| Complex HTML → PDF | 50+ QPS | < 10s |
| URL → PDF | 30+ QPS | < 15s |
| HTML → Image | 150+ QPS | < 2s |

---

## 🤝 Contributing

Issues and pull requests are welcome!

1. Fork this repository
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a pull request

---

## 📄 License

This project is open source under the [MIT License](LICENSE).

---

## 📞 Contact

- **Issues**: [GitHub Issues](https://github.com/your-username/html-to-pdf/issues)
- **Discussions**: [GitHub Discussions](https://github.com/your-username/html-to-pdf/discussions)

---

<div align="center">

**If this project helps you, please give it a ⭐ Star!**

Made with ❤️ by the HtmlToPdf Team

</div>