japi/apps/rmbg/settings.py
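jingrow 319e6aad34 refactor: implement the strict pipeline-style Plan B, with an independent worker per GPU processing the queue
- Architecture refactor: start an independent queue-processing worker for each GPU to avoid contention between workers
- Single-GPU batch collection: each worker collects only batch_size requests, no longer multiplied by the number of GPUs
- Device binding: each worker is bound to its own model and device; round-robin scheduling is removed
- Processing logic: run batches directly on the worker's model/device; the multi-GPU splitting logic is removed
- Fallback: on OOM, process images one at a time using the current worker's model/device
- Resource management: update the cleanup method to stop all worker tasks correctly
- API update: fix the deprecated PYTORCH_CUDA_ALLOC_CONF and torch_dtype parameters

Advantages:
- Avoids contention and batch conflicts between workers
- Resource isolation: each worker uses only its own GPU
- Load balancing: multiple workers process in parallel, improving throughput
- Easy to scale: the number of workers adjusts automatically when the GPU count changes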
2025-12-16 16:36:41 +00:00
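
The worker layout described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the class name `GpuWorkerPool`, the shared `asyncio.Queue`, the placeholder `_infer` method, and the use of `MemoryError` as a stand-in for a CUDA out-of-memory error are all assumptions made for the example.

```python
import asyncio


class GpuWorkerPool:
    """Hypothetical sketch: one queue-consuming task per device."""

    def __init__(self, devices, batch_size=8):
        self.devices = devices          # e.g. ["cuda:0", "cuda:1"]
        self.batch_size = batch_size    # per-worker batch size, not multiplied by GPU count
        self.queue = asyncio.Queue()    # assumed: a single queue shared by all workers
        self.tasks = []

    def start(self):
        # One worker task per device; the worker count follows the device count.
        for device in self.devices:
            self.tasks.append(asyncio.create_task(self._worker(device)))

    async def _worker(self, device):
        # Each worker is bound to one device for its whole lifetime.
        while True:
            batch = [await self.queue.get()]
            while len(batch) < self.batch_size and not self.queue.empty():
                batch.append(self.queue.get_nowait())
            inputs = [item for item, _ in batch]
            try:
                results = self._infer(device, inputs)
            except MemoryError:
                # Fallback: retry one item at a time on the same device.
                results = [self._infer(device, [item])[0] for item in inputs]
            for (_, future), result in zip(batch, results):
                if not future.done():
                    future.set_result(result)

    def _infer(self, device, inputs):
        # Placeholder for real model inference on `device`.
        return [f"{device}:{item}" for item in inputs]

    async def submit(self, item):
        future = asyncio.get_running_loop().create_future()
        await self.queue.put((item, future))
        return await future

    async def cleanup(self):
        # Cancel every worker task and wait until all of them have stopped.
        for task in self.tasks:
            task.cancel()
        await asyncio.gather(*self.tasks, return_exceptions=True)
        self.tasks.clear()


async def demo():
    pool = GpuWorkerPool(["cuda:0", "cuda:1"], batch_size=2)
    pool.start()
    results = await asyncio.gather(*(pool.submit(i) for i in range(5)))
    await pool.cleanup()
    return results
```

Running `asyncio.run(demo())` returns one result per submitted item, in submission order. Because no worker ever touches another worker's device, a failure or OOM fallback on one GPU does not affect batches on the others.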

54 lines
2.0 KiB
Python

from pydantic_settings import BaseSettings
from typing import Optional
from functools import lru_cache


class Settings(BaseSettings):
    # Japi server settings
    host: str = "0.0.0.0"
    port: int = 8106
    debug: bool = False

    # API route settings
    router_prefix: str = "/rmbg"
    file_route: str = "/file"
    batch_route: str = "/batch"
    api_name: str = "remove_background"
    upload_url: str = "http://images.jingrow.com:8080/api/v1/image"

    # Image save settings
    save_dir: str = "/home/www/public/files"

    # Japi static resource download URL
    download_url: str = "http://files.jingrow.com/files"

    # Jingrow Jcloud API settings
    jingrow_api_url: str = "https://cloud.jingrow.com"
    jingrow_api_key: Optional[str] = None
    jingrow_api_secret: Optional[str] = None

    # Model settings
    model_path: str = "./models"  # Path to the local model folder (contains model.safetensors and config.json)

    # HTTP client connection pool settings (used for downloading images)
    http_max_connections: int = 200  # httpx max concurrent connections (tune to uplink bandwidth and remote capacity)
    http_max_keepalive_connections: int = 100  # httpx max idle keep-alive connections

    # Concurrency control settings (inference side)
    max_workers: int = 60  # Max worker threads in the thread pool; tune to the CPU core count (e.g. 20-30 for 22 cores / 44 threads)
    batch_size: int = 8  # GPU batch size; the model uses a lot of VRAM, so 8 is a safe value and 16 causes OOM

    # Queue aggregation settings (Plan B: batching + queue mode)
    batch_collect_interval: float = 0.05  # Batch collection interval in seconds; collecting every 50 ms balances latency and throughput
    batch_collect_timeout: float = 0.5  # Batch collection timeout in seconds; after 500 ms the batch is processed even if it has not reached batch_size
    request_timeout: float = 60.0  # Timeout for a single request, in seconds
    enable_queue_batch: bool = True  # Whether to enable queue batching mode (recommended)

    class Config:
        env_file = ".env"


@lru_cache()
def get_settings() -> Settings:
    return Settings()


# Create the global settings instance
settings = get_settings()
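
To illustrate how `batch_size`, `batch_collect_interval`, and `batch_collect_timeout` might interact, here is a minimal sketch of a batch collector. The function `collect_batch` is hypothetical and not taken from this repository; it assumes the interval is the longest single wait for the next request and the timeout is the total time allowed after the first request arrives. The defaults mirror the values in `Settings`.

```python
import asyncio
import time


async def collect_batch(queue, batch_size=8, interval=0.05, timeout=0.5):
    """Collect up to batch_size items, or fewer once timeout has elapsed."""
    # Block until the first request arrives; the timeout clock starts here.
    batch = [await queue.get()]
    deadline = time.monotonic() + timeout
    while len(batch) < batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # Timeout reached: process a partial batch.
        try:
            # Wait at most one collection interval for the next request.
            item = await asyncio.wait_for(
                queue.get(), timeout=min(interval, remaining)
            )
        except asyncio.TimeoutError:
            continue  # Nothing arrived in this interval; re-check the deadline.
        batch.append(item)
    return batch


async def demo():
    queue = asyncio.Queue()
    for i in range(10):
        queue.put_nowait(i)
    full = await collect_batch(queue, batch_size=8)
    partial = await collect_batch(queue, batch_size=8, timeout=0.2)
    return full, partial
```

With ten queued items, the first call returns a full batch of eight immediately, and the second returns the remaining two once its timeout expires. A shorter timeout lowers worst-case latency for sparse traffic at the cost of smaller batches.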