chore: snapshot current working tree changes
Save all currently accumulated repository changes as a backup snapshot for Gitea so no local work is lost.
404
pdf_processor/API_DOCUMENTATION.md
Normal file
@@ -0,0 +1,404 @@
# 📚 PDF Processor Microservice - API Documentation

## 🚀 Quick Start

**Base URL:** `http://147.45.146.17:8300`

**Authentication:** All POST endpoints, as well as the task status and metrics GET endpoints, require the `X-API-Key` header.

---

## 📋 Endpoints

### 1. Health Check

```http
GET /health
```

**Response:**
```json
{
  "status": "healthy",
  "timestamp": "2025-12-29T12:00:00",
  "version": "1.0.0",
  "redis_connected": true
}
```
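
A client may want to gate its requests on this payload. The sketch below is one possible readiness check; the function name `is_service_ready` is hypothetical, and treating `redis_connected: false` as "not ready" is an assumption, since the document does not say which features depend on Redis.

```python
def is_service_ready(health: dict) -> bool:
    """Return True only if the payload reports a healthy service with Redis connected."""
    # Assumption: a missing or false `redis_connected` means the service
    # should not be used yet (for example, async task storage may be down).
    return health.get("status") == "healthy" and health.get("redis_connected") is True


# Example payload copied from the health check response above.
sample = {
    "status": "healthy",
    "timestamp": "2025-12-29T12:00:00",
    "version": "1.0.0",
    "redis_connected": True,
}
print(is_service_ready(sample))
```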

---

### 2. Synchronous Processing

```http
POST /process
Headers:
  X-API-Key: your-api-key
  Content-Type: application/json
```

**Request:**
```json
{
  "data": {
    "files": [
      {
        "file": {
          "url": "https://example.com/document.pdf",
          "file_name": "document.pdf"
        }
      }
    ]
  },
  "mode": "--pdf-merge"
}
```
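
The nested `data.files[].file` structure is easy to get wrong by hand, so a small builder can help. This is a minimal sketch: `build_payload` is a hypothetical helper, and deriving `file_name` from the last URL path segment (with `document.pdf` as a fallback) is an assumption, not documented service behavior.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def build_payload(urls, mode="--pdf-merge"):
    """Build a request body in the format shown above from a list of PDF URLs."""
    files = []
    for url in urls:
        # Assumption: use the last path segment of the URL as the file name.
        name = PurePosixPath(urlparse(url).path).name or "document.pdf"
        files.append({"file": {"url": url, "file_name": name}})
    return {"data": {"files": files}, "mode": mode}


payload = build_payload(["https://example.com/document.pdf"])
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`, as in the Python usage example later in this document.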

**Response (success):**
```json
{
  "success": true,
  "result": [
    {
      "group": "group_0",
      "session_token": "...",
      "pages": 5,
      "merged_base64": "JVBERi0xLjQKJeLjz9MK..."
    }
  ],
  "processing_time": 2.5,
  "logs": "..."
}
```
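
The merged document arrives as a base64 string in `merged_base64`. A minimal sketch of turning one `result` entry back into PDF bytes is shown here; `decode_merged_pdf` is a hypothetical helper, and the `%PDF-` signature check is an extra sanity test added for illustration rather than something the service requires.

```python
import base64
import binascii


def decode_merged_pdf(item: dict) -> bytes:
    """Decode the `merged_base64` field of one result entry into raw PDF bytes."""
    try:
        pdf_bytes = base64.b64decode(item["merged_base64"], validate=True)
    except (KeyError, binascii.Error) as exc:
        raise ValueError("result entry has no valid merged_base64 payload") from exc
    # Every PDF file starts with the "%PDF-" signature.
    if not pdf_bytes.startswith(b"%PDF-"):
        raise ValueError("decoded payload does not look like a PDF")
    return pdf_bytes
```

The returned bytes can be written to disk with `pathlib.Path("merged.pdf").write_bytes(...)`.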

**Response (error):**
```json
{
  "success": false,
  "error": "Script execution failed",
  "processing_time": 1.2,
  "logs": "..."
}
```

---

### 3. Asynchronous Processing

```http
POST /process/async
Headers:
  X-API-Key: your-api-key
  Content-Type: application/json
```

**Request:** (same format as the synchronous endpoint)

**Response:**
```json
{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "message": "Task created successfully",
  "status_url": "/process/status/550e8400-e29b-41d4-a716-446655440000"
}
```

**When to use:**
- Large files (> 10 MB)
- Multiple files (> 5 files)
- Long-running processing (> 30 seconds)
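
The three thresholds above can be encoded in a small client-side helper. This is only a sketch: `choose_endpoint` and its constants are hypothetical, "10 MB" is read as 10 MiB, and the size limit is applied per file; the service itself does not enforce any of this according to the document.

```python
MAX_SYNC_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" interpreted as 10 MiB, per file
MAX_SYNC_FILES = 5
MAX_SYNC_SECONDS = 30


def choose_endpoint(file_sizes, expected_seconds=0.0):
    """Pick the sync or async endpoint from file sizes (bytes) and an optional time estimate."""
    too_large = any(size > MAX_SYNC_BYTES for size in file_sizes)
    too_many = len(file_sizes) > MAX_SYNC_FILES
    too_slow = expected_seconds > MAX_SYNC_SECONDS
    return "/process/async" if (too_large or too_many or too_slow) else "/process"
```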

---

### 4. Checking Task Status

```http
GET /process/status/{task_id}
Headers:
  X-API-Key: your-api-key
```

**Response (pending/processing):**
```json
{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "processing",
  "created_at": "2025-12-29T12:00:00",
  "updated_at": "2025-12-29T12:01:30",
  "result": null,
  "error": null,
  "processing_time": null,
  "logs": null
}
```

**Response (completed):**
```json
{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "created_at": "2025-12-29T12:00:00",
  "updated_at": "2025-12-29T12:05:30",
  "result": [...],
  "processing_time": 330.5,
  "logs": "..."
}
```

**Response (failed):**
```json
{
  "task_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "failed",
  "created_at": "2025-12-29T12:00:00",
  "updated_at": "2025-12-29T12:02:15",
  "result": null,
  "error": "Script execution timed out",
  "processing_time": 300.0,
  "logs": "..."
}
```

**Statuses:**
- `pending` - Task created, waiting to be processed
- `processing` - Task is running
- `completed` - Task finished successfully
- `failed` - Task finished with an error
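
Because `completed` and `failed` are the only terminal statuses, a polling client can stop as soon as it sees either one. The sketch below adds a client-side deadline so the loop cannot run forever. `wait_for_task` is a hypothetical helper; the 330-second default (the 5-minute processing limit plus a margin) and the 2-second interval are assumptions. The status fetcher is passed in as a callable, for example `lambda: requests.get(status_url, headers=headers).json()`.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_task(
    fetch_status: Callable[[], dict],
    timeout: float = 330.0,   # assumption: 5-minute server limit plus a margin
    interval: float = 2.0,    # assumption: poll every 2 seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll `fetch_status` until the task reaches a terminal status or the deadline passes."""
    deadline = clock() + timeout
    while True:
        task = fetch_status()
        if task["status"] in TERMINAL_STATUSES:
            return task
        if clock() >= deadline:
            raise TimeoutError(f"task still {task['status']!r} after {timeout} seconds")
        sleep(interval)
```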

---

### 5. Metrics (JSON)

```http
GET /metrics
Headers:
  X-API-Key: your-api-key
```

**Response:**
```json
{
  "total_requests": 150,
  "successful_requests": 142,
  "failed_requests": 8,
  "async_tasks_created": 25,
  "async_tasks_completed": 23,
  "average_processing_time": 3.45,
  "errors_by_type": {
    "timeout": 3,
    "script_error": 4,
    "json_parse_error": 1
  }
}
```
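
A few derived figures are often more useful than the raw counters. The sketch below computes them from the JSON payload above; `summarize_metrics` and the names of the derived fields are hypothetical, and "not completed" covers both in-flight and failed async tasks because the payload does not separate them.

```python
def summarize_metrics(metrics: dict) -> dict:
    """Derive a success rate, unfinished async task count, and most frequent error type."""
    total = metrics.get("total_requests", 0)
    errors = metrics.get("errors_by_type", {})
    return {
        # None when there have been no requests yet, to avoid dividing by zero.
        "success_rate": metrics.get("successful_requests", 0) / total if total else None,
        "async_tasks_not_completed": (
            metrics.get("async_tasks_created", 0) - metrics.get("async_tasks_completed", 0)
        ),
        "top_error": max(errors, key=errors.get) if errors else None,
    }
```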

---

### 6. Metrics (Prometheus)

```http
GET /metrics/prometheus
Headers:
  X-API-Key: your-api-key
  Accept: text/plain
```

**Response:**
```
# HELP pdf_processor_total_requests Total number of requests
# TYPE pdf_processor_total_requests counter
pdf_processor_total_requests 150

# HELP pdf_processor_successful_requests Number of successful requests
# TYPE pdf_processor_successful_requests counter
pdf_processor_successful_requests 142

...
```
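
For quick checks without a Prometheus server, the text response can be read directly. This is a deliberately small sketch that only handles unlabeled `name value` samples like the ones shown above; `parse_prometheus_text` is a hypothetical helper, and lines with labels or timestamps are skipped rather than parsed.

```python
def parse_prometheus_text(text: str) -> dict:
    """Parse unlabeled `name value` samples from Prometheus text output into a dict."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and `# HELP` / `# TYPE` comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) != 2:
            continue  # not a plain `name value` sample
        name, value = parts
        try:
            samples[name] = float(value)
        except ValueError:
            continue  # value is not numeric, ignore the line
    return samples
```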

---

## 🔧 Usage Examples

### Python

```python
import time

import requests

API_URL = "http://147.45.146.17:8300"
API_KEY = "your-api-key"
headers = {"X-API-Key": API_KEY}

# Synchronous processing
response = requests.post(
    f"{API_URL}/process",
    headers=headers,
    json={
        "data": {
            "files": [{
                "file": {
                    "url": "https://example.com/file.pdf",
                    "file_name": "document.pdf"
                }
            }]
        },
        "mode": "--pdf-merge"
    }
)
print(response.json())

# Asynchronous processing
response = requests.post(
    f"{API_URL}/process/async",
    headers=headers,
    json={
        "data": {
            "files": [{
                "file": {
                    "url": "https://example.com/large-file.pdf",
                    "file_name": "large-document.pdf"
                }
            }]
        },
        "mode": "--pdf-merge"
    }
)
response.raise_for_status()  # fail early on 401/403/500 instead of a KeyError below
task_id = response.json()["task_id"]

# Poll the task status until it reaches a terminal state
while True:
    status_response = requests.get(
        f"{API_URL}/process/status/{task_id}",
        headers=headers
    )
    status = status_response.json()

    if status["status"] == "completed":
        print("Result:", status["result"])
        break
    elif status["status"] == "failed":
        print("Error:", status["error"])
        break

    time.sleep(2)  # check every 2 seconds
```

### cURL

```bash
# Synchronous processing
curl -X POST http://147.45.146.17:8300/process \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "files": [{
        "file": {
          "url": "https://example.com/file.pdf",
          "file_name": "document.pdf"
        }
      }]
    },
    "mode": "--pdf-merge"
  }'

# Asynchronous processing (-s hides curl's progress meter)
TASK_ID=$(curl -s -X POST http://147.45.146.17:8300/process/async \
  -H "X-API-Key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "data": {
      "files": [{
        "file": {
          "url": "https://example.com/large-file.pdf",
          "file_name": "large-document.pdf"
        }
      }]
    },
    "mode": "--pdf-merge"
  }' | jq -r '.task_id')

# Check task status
curl -X GET "http://147.45.146.17:8300/process/status/$TASK_ID" \
  -H "X-API-Key: your-api-key"

# Metrics
curl -X GET http://147.45.146.17:8300/metrics \
  -H "X-API-Key: your-api-key"
```

---

## 📊 Monitoring

### Prometheus Integration

Add the following to `prometheus.yml`:

```yaml
scrape_configs:
  - job_name: 'pdf_processor'
    static_configs:
      - targets: ['147.45.146.17:8300']
    metrics_path: '/metrics/prometheus'
    basic_auth:
      username: 'api_key'
      password: 'your-api-key'
```

### Grafana Dashboard

The following metrics are available for visualization in Grafana:
- Total number of requests
- Successful and failed requests
- Average processing time
- Errors by type
- Asynchronous tasks

---

## 🔒 Security

1. **API Key** - All POST/GET endpoints require a valid API key
2. **Task TTL** - Results of asynchronous tasks are stored for 1 hour
3. **Timeouts** - Maximum processing time: 5 minutes
4. **Validation** - All input data is validated with Pydantic

---

## 🐛 Error Handling

### Error types:

- `script_error` - The bash script failed during execution
- `json_parse_error` - The JSON result could not be parsed
- `timeout` - Processing exceeded the time limit (5 minutes)
- `exception` - Unexpected error
- `unhandled_exception` - Critical error

### HTTP codes:

- `200` - Success
- `401` - API key not provided
- `403` - Invalid API key
- `404` - Task not found
- `500` - Internal server error
- `501` - Feature not implemented
- `504` - Timeout
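
On the client side, the status codes above can be mapped to a single exception type so callers do not have to repeat the table. This is a sketch under assumptions: `ApiError`, `HTTP_MEANINGS`, and `check_status` are hypothetical names, and the messages simply restate the list above rather than any text returned by the service.

```python
# Meanings copied from the HTTP codes list above.
HTTP_MEANINGS = {
    401: "API key not provided",
    403: "Invalid API key",
    404: "Task not found",
    500: "Internal server error",
    501: "Feature not implemented",
    504: "Timeout",
}


class ApiError(Exception):
    """Raised for any non-200 response from the PDF processor."""

    def __init__(self, status_code: int, message: str):
        super().__init__(f"HTTP {status_code}: {message}")
        self.status_code = status_code


def check_status(status_code: int) -> None:
    """Return silently on 200, otherwise raise ApiError with the documented meaning."""
    if status_code == 200:
        return
    raise ApiError(status_code, HTTP_MEANINGS.get(status_code, "Unexpected status code"))
```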

---

## 📝 Swagger UI

Interactive documentation is available at:
- **Swagger UI:** http://147.45.146.17:8300/docs
- **ReDoc:** http://147.45.146.17:8300/redoc

---

**Version:** 1.0.0
**Last updated:** 2025-12-29