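# Chapter 24: Project Structure and Conventions

## Learning Objectives

- Understand how Python project structure has evolved and what current best practice looks like
- Master the modern Python packaging toolchain (pyproject.toml, uv, Poetry)
- Use code quality tools fluently (Ruff, Black, isort, mypy)
- Understand layered configuration management and the security practices around it
- Learn how to build out a project's documentation
- Know the principles and common patterns of refactoring

## 24.1 Project Organization

### 24.1.1 Standard Structure for a Library Project

The Python community recommends the src layout. Its main advantage is that the package source lives under `src/`, apart from the tests and the project root, so the package cannot be imported by accident from the working directory: tests run against the installed package rather than an uninstalled local copy.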
```text
myproject/
├── src/
│   └── myproject/
│       ├── __init__.py          # Package init; exposes the public API
│       ├── py.typed             # PEP 561 typing marker
│       ├── core/
│       │   ├── __init__.py
│       │   ├── models.py        # Data models
│       │   ├── services.py      # Business logic
│       │   └── repositories.py  # Data access
│       ├── api/
│       │   ├── __init__.py
│       │   ├── routes.py        # Route definitions
│       │   └── schemas.py       # Request/response schemas
│       ├── utils/
│       │   ├── __init__.py
│       │   ├── logging.py       # Logging configuration
│       │   └── helpers.py       # General-purpose helpers
│       └── config.py            # Configuration management
├── tests/
│   ├── __init__.py
│   ├── conftest.py              # Shared pytest fixtures
│   ├── unit/
│   │   ├── __init__.py
│   │   ├── test_models.py
│   │   └── test_services.py
│   ├── integration/
│   │   ├── __init__.py
│   │   └── test_api.py
│   └── e2e/
│       ├── __init__.py
│       └── test_workflow.py
├── docs/
│   ├── conf.py
│   ├── index.rst
│   └── Makefile
├── scripts/
│   ├── setup_dev.sh
│   └── migrate_db.py
├── .github/
│   ├── workflows/
│   │   └── ci.yml
│   ├── CODEOWNERS
│   └── PULL_REQUEST_TEMPLATE.md
├── .gitignore
├── .pre-commit-config.yaml
├── LICENSE
├── README.md
├── CHANGELOG.md
├── pyproject.toml
└── Makefile
```
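**src layout vs. flat layout:**

| Dimension | src layout | flat layout |
| --- | --- | --- |
| Test isolation | The package must be installed before it can be tested | Tests may accidentally import uninstalled code |
| Packaging safety | Avoids accidentally shipping test files | Test files must be excluded explicitly |
| IDE support | Source path needs to be configured | Works out of the box |
| Community recommendation | Recommended by PyPA | Fine for simple projects |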
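The test-isolation row can be made concrete with a small check. The sketch below is only an illustration: the function name `importable_from_root` is hypothetical, and it ignores namespace packages (directories without `__init__.py`). It reports whether a package could be imported straight from the project root, which is the situation a flat layout allows and a src layout prevents.

```python
from pathlib import Path


def importable_from_root(project_root: Path, package: str) -> bool:
    """Return True if `import package` could resolve inside project_root itself.

    This is the flat-layout situation: when the project root is on sys.path
    (for example because Python was started there), the local, uninstalled
    copy of the package is found first.
    """
    candidate = project_root / package
    is_package_dir = (candidate / "__init__.py").is_file()
    is_single_module = candidate.with_suffix(".py").is_file()
    return is_package_dir or is_single_module
```

With the tree above, `importable_from_root(Path("myproject"), "myproject")` would be `False`, because the package sits under `src/` and only becomes importable once it is installed.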
### 24.1.2 Web Application Project Structure

```text
webapp/
├── app/
│   ├── __init__.py              # Application factory
│   ├── extensions.py            # Extension initialization
│   ├── models/
│   │   ├── __init__.py
│   │   ├── user.py
│   │   ├── post.py
│   │   └── mixins.py            # Reusable model mixins
│   ├── views/
│   │   ├── __init__.py
│   │   ├── auth.py
│   │   ├── dashboard.py
│   │   └── api/
│   │       ├── __init__.py
│   │       ├── v1/
│   │       │   ├── __init__.py
│   │       │   ├── users.py
│   │       │   └── posts.py
│   │       └── v2/
│   │           └── __init__.py
│   ├── services/
│   │   ├── __init__.py
│   │   ├── auth_service.py
│   │   └── email_service.py
│   ├── forms/
│   │   ├── __init__.py
│   │   └── auth_forms.py
│   ├── templates/
│   │   ├── base.html
│   │   ├── components/
│   │   │   ├── navbar.html
│   │   │   └── pagination.html
│   │   ├── auth/
│   │   │   ├── login.html
│   │   │   └── register.html
│   │   └── dashboard/
│   │       └── index.html
│   ├── static/
│   │   ├── css/
│   │   ├── js/
│   │   └── img/
│   ├── middleware/
│   │   ├── __init__.py
│   │   └── auth.py
│   └── utils/
│       ├── __init__.py
│       └── decorators.py
├── migrations/
│   ├── env.py
│   ├── versions/
│   └── alembic.ini
├── tests/
│   ├── conftest.py
│   ├── factories.py             # Test data factories
│   ├── unit/
│   └── integration/
├── requirements/
│   ├── base.txt
│   ├── dev.txt
│   └── prod.txt
├── config.py                    # Configuration classes
├── .env.example
├── .env                         # Local environment variables (not committed)
└── run.py                       # Entry point script
```
### 24.1.3 Data Science Project Structure

```text
datascience/
├── data/
│   ├── raw/                     # Raw data (read-only)
│   ├── interim/                 # Intermediate data
│   ├── processed/               # Final processed data
│   └── external/                # External data sources
├── models/
│   ├── trained/                 # Trained models
│   └── predictions/             # Model predictions
├── notebooks/
│   ├── 01_exploratory.ipynb
│   ├── 02_feature_engineering.ipynb
│   └── 03_model_training.ipynb
├── src/
│   ├── __init__.py
│   ├── data/
│   │   ├── __init__.py
│   │   ├── make_dataset.py
│   │   └── preprocess.py
│   ├── features/
│   │   ├── __init__.py
│   │   └── build_features.py
│   ├── models/
│   │   ├── __init__.py
│   │   ├── train.py
│   │   └── predict.py
│   └── visualization/
│       ├── __init__.py
│       └── visualize.py
├── tests/
├── pyproject.toml
├── Makefile
├── dvc.yaml                     # DVC pipeline
└── params.yaml                  # Hyperparameter configuration
```
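## 24.2 Package Management

### 24.2.1 pyproject.toml in Detail

pyproject.toml is the modern Python project configuration standard, defined by PEP 518 (build system requirements) and PEP 621 (project metadata). It brings the build system, project metadata, and tool configuration together in one file: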
```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "myproject"
version = "1.0.0"
description = "A production-grade Python application"
readme = "README.md"
license = "MIT"
license-files = ["LICENSE"]
authors = [
    {name = "Your Name", email = "your.email@example.com"},
]
maintainers = [
    {name = "Maintainer Name", email = "maintainer@example.com"},
]
classifiers = [
    "Development Status :: 5 - Production/Stable",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Programming Language :: Python :: 3.12",
    "Programming Language :: Python :: Implementation :: CPython",
    "Topic :: Software Development :: Libraries",
    "Typing :: Typed",
]
requires-python = ">=3.10"
dependencies = [
    "requests>=2.31.0",
    "click>=8.1.0",
    "pydantic>=2.0.0",
    "sqlalchemy>=2.0.0",
    "structlog>=23.0.0",
]

[project.optional-dependencies]
dev = [
    "pytest>=8.0.0",
    "pytest-cov>=5.0.0",
    "pytest-asyncio>=0.23.0",
    "ruff>=0.4.0",
    "mypy>=1.10.0",
    "pre-commit>=3.7.0",
    "commitizen>=3.27.0",
]
docs = [
    "sphinx>=7.0.0",
    "sphinx-rtd-theme>=2.0.0",
    "myst-parser>=3.0.0",
]
test = [
    "pytest>=8.0.0",
    "pytest-cov>=5.0.0",
    "pytest-asyncio>=0.23.0",
    "hypothesis>=6.100.0",
    "factory-boy>=3.3.0",
]

[project.scripts]
myproject = "myproject.cli:main"

[project.gui-scripts]
myproject-gui = "myproject.gui:run"

[project.entry-points."myproject.plugins"]
builtin = "myproject.plugins.builtin"

[project.urls]
Homepage = "https://github.com/user/myproject"
Documentation = "https://myproject.readthedocs.io"
Repository = "https://github.com/user/myproject"
Changelog = "https://github.com/user/myproject/blob/main/CHANGELOG.md"
Issues = "https://github.com/user/myproject/issues"
```
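### 24.2.2 Comparing Package Management Tools

| Feature | pip + venv | Poetry | uv | Hatch |
| --- | --- | --- | --- | --- |
| Dependency resolution | Full resolver (backtracking since pip 20.3), but no lock file of its own | Full | Full, very fast | Full |
| Lock file | requirements.txt | poetry.lock | uv.lock | None built in (configurable) |
| Virtual environments | Managed manually | Managed automatically | Managed automatically | Managed automatically |
| Build backend | setuptools | poetry-core | Any PEP 517 backend (uv also provides `uv_build`) | hatchling |
| Publishing | twine | Built in | Built in (`uv publish`) | Built in |
| Performance | Baseline | Moderate | Very fast | Moderate |
| Maturity | High | High | Growing quickly | Moderate |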
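**Using uv:**

```bash
uv init myproject                # create a new project
uv add requests pydantic         # add runtime dependencies
uv add --dev pytest ruff mypy    # add development dependencies
uv sync                          # sync the environment with the lock file
uv run pytest                    # run a command inside the project environment
uv build                         # build the sdist and wheel
uv publish                       # upload the built distributions
```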
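**Using Poetry:**

```bash
poetry new myproject                      # create a new project
poetry add requests pydantic              # add runtime dependencies
poetry add --group dev pytest ruff mypy   # add dependencies to the dev group
poetry install                            # install from the lock file
poetry run pytest                         # run a command inside the environment
poetry build                              # build the sdist and wheel
poetry publish                            # upload the built distributions
```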
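### 24.2.3 Dependency Version Strategy

```text
# Exact pin (deterministic builds; recommended for production deployments)
requests==2.31.0

# Bounded range (allows minor and patch updates below the next major; recommended for libraries)
requests>=2.31.0,<3.0.0

# Compatible release operator (PEP 440); equivalent to >=2.31.0,==2.31.* (patch updates only)
requests~=2.31.0

# Lower bound only (not recommended; a later release may introduce breaking changes)
requests>=2.31.0
```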
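To see what the `~=` operator accepts, here is a deliberately simplified sketch. It handles only plain numeric releases such as `2.31.0` (no pre-releases, epochs, or local versions), and the function names are hypothetical; real tools implement the full PEP 440 rules.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Parse a purely numeric release such as '2.31.0' into (2, 31, 0)."""
    return tuple(int(part) for part in version.split("."))


def matches_compatible_release(candidate: str, base: str) -> bool:
    """Simplified check for `~=base`.

    The candidate must be >= base and share every release segment of base
    except the last one. For `~=2.31.0` that prefix is (2, 31); for `~=2.31`
    it is (2,).
    """
    base_parts = parse_release(base)
    if len(base_parts) < 2:
        raise ValueError("~= needs at least two release segments")
    cand_parts = parse_release(candidate)

    # Pad with zeros so that '2.31' and '2.31.0' compare as equal.
    width = max(len(base_parts), len(cand_parts))
    base_padded = base_parts + (0,) * (width - len(base_parts))
    cand_padded = cand_parts + (0,) * (width - len(cand_parts))

    prefix = base_parts[:-1]
    return cand_padded >= base_padded and cand_padded[: len(prefix)] == prefix


print(matches_compatible_release("2.31.5", "2.31.0"))  # True: patch update
print(matches_compatible_release("2.32.0", "2.31.0"))  # False: minor update
print(matches_compatible_release("2.40.1", "2.31"))    # True: ~=2.31 allows 2.x
```

The last call shows why the number of segments matters: `~=2.31` behaves like `>=2.31,==2.*`, while `~=2.31.0` is restricted to the 2.31 series.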
**Best practice for dependency groups:**

```toml
[project.optional-dependencies]
dev = [
    "ruff>=0.4.0",
    "mypy>=1.10.0",
    "pre-commit>=3.7.0",
    "ipython>=8.0.0",
]
test = [
    "pytest>=8.0.0",
    "pytest-cov>=5.0.0",
    "hypothesis>=6.100.0",
]
docs = [
    "sphinx>=7.0.0",
    "sphinx-rtd-theme>=2.0.0",
]
all = [
    "myproject[dev,test,docs]",
]
```
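## 24.3 Code Conventions

### 24.3.1 PEP 8 and Modern Code Style

PEP 8 is the baseline for Python code style, but a modern project should combine it with the following additional standards:

| Standard | Area | Key points |
| --- | --- | --- |
| PEP 8 | Code style | Indentation, line width, naming, blank lines |
| PEP 257 | Docstrings | Docstring format and content |
| PEP 484 | Type hints | Static type annotations |
| PEP 526 | Variable annotations | Syntax for annotating variables |
| PEP 3134 | Exception chaining | `raise ... from ...` |
| Google Style Guide | General conventions | Google's in-house Python style guide |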
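Of the entries in the table, exception chaining is the easiest to overlook. The following is a minimal sketch; `ConfigError` and `parse_port` are hypothetical names introduced only for this illustration.

```python
class ConfigError(Exception):
    """Hypothetical application-level error used for this illustration."""


def parse_port(raw: str) -> int:
    """Convert a port string to int, wrapping low-level errors."""
    try:
        return int(raw)
    except ValueError as exc:
        # `from exc` stores the original error in __cause__, so the
        # traceback shows both the low-level and the high-level failure.
        raise ConfigError(f"invalid port: {raw!r}") from exc


try:
    parse_port("eighty")
except ConfigError as err:
    print(type(err.__cause__).__name__)  # ValueError
```

Callers can handle the domain-specific `ConfigError` while the original `ValueError` remains available for debugging.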
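**Naming conventions:**

```python
# Classes: CapWords; acronyms stay uppercase
class UserRepository:
    pass


class HTTPClient:
    pass


# Constants: UPPER_SNAKE_CASE
MAX_CONNECTIONS = 100
DEFAULT_TIMEOUT = 30


# Functions: snake_case
def calculate_total(items: list[dict]) -> float:
    pass


# Internal helpers: leading underscore
def _internal_helper(data: bytes) -> str:
    pass


# Variables: snake_case
user_count: int = 0
```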
### 24.3.2 Linting and Formatting: Ruff

Ruff is an extremely fast Python linter and formatter written in Rust. It replaces flake8, isort, pyupgrade, and several other tools:
```toml
[tool.ruff]
target-version = "py310"
line-length = 88
src = ["src"]

[tool.ruff.lint]
select = [
    "E",     # pycodestyle errors
    "W",     # pycodestyle warnings
    "F",     # Pyflakes
    "I",     # isort
    "N",     # pep8-naming
    "UP",    # pyupgrade
    "B",     # flake8-bugbear
    "SIM",   # flake8-simplify
    "TCH",   # flake8-type-checking
    "RUF",   # Ruff-specific rules
    "PERF",  # Perflint
    "PGH",   # pygrep-hooks
    "PLC",   # Pylint convention
    "PLE",   # Pylint error
    "PLR",   # Pylint refactor
    "PLW",   # Pylint warning
]
ignore = [
    "E501",     # line too long (left to the formatter)
    "PLR0913",  # too many arguments
]

[tool.ruff.lint.per-file-ignores]
"tests/**" = ["PLR2004", "S101"]  # magic values and assert are fine in tests

[tool.ruff.lint.isort]
known-first-party = ["myproject"]
force-single-line = true

[tool.ruff.lint.pylint]
max-args = 7  # only takes effect if PLR0913 is not ignored above

[tool.ruff.format]
quote-style = "double"
indent-style = "space"
skip-magic-trailing-comma = false
line-ending = "auto"
```
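```bash
ruff check .            # lint
ruff check --fix .      # lint and apply available fixes
ruff format .           # format files in place
ruff format --check .   # report files that would be reformatted, without writing
```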
### 24.3.3 Type Checking: mypy

```toml
[tool.mypy]
python_version = "3.12"
# strict = true already enables most of the flags below; they are listed for explicitness
strict = true
warn_return_any = true
warn_unused_configs = true
disallow_untyped_defs = true
disallow_any_generics = true
check_untyped_defs = true
no_implicit_optional = true
warn_redundant_casts = true
warn_unused_ignores = true
warn_no_return = true
strict_optional = true

# Third-party packages without type stubs
[[tool.mypy.overrides]]
module = "third_party_lib.*"
ignore_missing_imports = true

# Relax the rules for test code
[[tool.mypy.overrides]]
module = "tests.*"
disallow_untyped_defs = false
```
**Type annotation best practices:**

```python
from __future__ import annotations

from collections.abc import Callable, Sequence
from typing import Any, Protocol, TypeVar, runtime_checkable

T = TypeVar("T")


@runtime_checkable
class Comparable(Protocol):
    """Structural type: any object that defines ``__lt__`` satisfies it."""

    def __lt__(self, other: Any) -> bool: ...


def sort_items(
    items: Sequence[T],
    key: Callable[[T], Any] | None = None,
    *,
    reverse: bool = False,
) -> list[T]:
    """Return a new sorted list; ``key`` and ``reverse`` are passed to ``sorted``."""
    return sorted(items, key=key, reverse=reverse)


class DataStore:
    """Minimal key-value store with annotated accessors."""

    def __init__(self, data: dict[str, Any] | None = None) -> None:
        self._data: dict[str, Any] = data or {}

    def get(self, key: str, default: T) -> Any | T:
        # Stored values are Any, so the result is a stored value or the default.
        return self._data.get(key, default)

    def set(self, key: str, value: Any) -> None:
        self._data[key] = value
```
### 24.3.4 pre-commit Configuration

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-yaml
      - id: check-toml
      - id: check-json
      - id: check-merge-conflict
      - id: check-added-large-files
        args: ['--maxkb=500']
      - id: detect-private-key
      - id: debug-statements
      - id: no-commit-to-branch
        args: ['--branch', 'main']

  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.4.0
    hooks:
      - id: ruff
        args: ['--fix', '--exit-non-zero-on-fix']
      - id: ruff-format

  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.10.0
    hooks:
      - id: mypy
        additional_dependencies: ['pydantic>=2.0']
        args: ['--strict']
```
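## 24.4 Configuration Management

### 24.4.1 Layered Configuration Architecture

```text
┌────────────────────────────────────────────────────────┐
│ Layer 1: In-code defaults                              │
│ (sensible hard-coded defaults; works with zero config) │
├────────────────────────────────────────────────────────┤
│ Layer 2: Configuration files                           │
│ (pyproject.toml / config.yaml / .env)                  │
├────────────────────────────────────────────────────────┤
│ Layer 3: Environment variables                         │
│ (override config files; suited to containers)          │
├────────────────────────────────────────────────────────┤
│ Layer 4: Command-line arguments                        │
│ (highest priority; for temporary overrides)            │
└────────────────────────────────────────────────────────┘
```

Each layer overrides the ones above it: command-line arguments take precedence over environment variables, which take precedence over configuration files, which take precedence over in-code defaults.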
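The precedence between the four layers can be sketched with `collections.ChainMap`, which looks keys up in its maps from left to right. This is only one possible way to merge layers; the keys and the `resolve_config` helper are hypothetical, and loading each layer (parsing files, reading `os.environ`, parsing `argv`) is assumed to happen elsewhere.

```python
from collections import ChainMap
from collections.abc import Mapping
from typing import Any

# Layer 1: in-code defaults (hypothetical keys for illustration)
DEFAULTS: dict[str, Any] = {"host": "127.0.0.1", "port": 8000, "debug": False}


def resolve_config(
    file_values: Mapping[str, Any],
    env_values: Mapping[str, Any],
    cli_values: Mapping[str, Any],
) -> dict[str, Any]:
    """Merge the four layers; the leftmost map in the ChainMap wins."""
    layered = ChainMap(dict(cli_values), dict(env_values), dict(file_values), DEFAULTS)
    return dict(layered)


config = resolve_config(
    file_values={"port": 9000, "debug": True},  # Layer 2
    env_values={"port": 9100},                  # Layer 3 overrides the file
    cli_values={"host": "0.0.0.0"},             # Layer 4 overrides everything
)
print(config["host"], config["port"], config["debug"])  # 0.0.0.0 9100 True
```

Keeping the merge separate from the loading code makes the priority order easy to test in isolation.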
### 24.4.2 Class-Based Configuration

```python
from __future__ import annotations

import os
from dataclasses import dataclass, field
from enum import Enum
from pathlib import Path


class Environment(Enum):
    DEVELOPMENT = "development"
    TESTING = "testing"
    STAGING = "staging"
    PRODUCTION = "production"


@dataclass
class DatabaseConfig:
    url: str = "sqlite:///app.db"
    pool_size: int = 5
    max_overflow: int = 10
    echo: bool = False
    connect_timeout: int = 30


@dataclass
class RedisConfig:
    url: str = "redis://localhost:6379/0"
    max_connections: int = 50
    socket_timeout: int = 5


@dataclass
class SecurityConfig:
    secret_key: str = "change-me-in-production"
    algorithm: str = "HS256"
    access_token_expire_minutes: int = 30
    refresh_token_expire_days: int = 7
    bcrypt_rounds: int = 12


@dataclass
class LoggingConfig:
    level: str = "INFO"
    format: str = "json"
    file_path: Path | None = None
    max_bytes: int = 10 * 1024 * 1024
    backup_count: int = 5


@dataclass
class AppConfig:
    env: Environment = Environment.DEVELOPMENT
    debug: bool = True
    base_dir: Path = field(default_factory=lambda: Path(__file__).parent.parent)
    database: DatabaseConfig = field(default_factory=DatabaseConfig)
    redis: RedisConfig = field(default_factory=RedisConfig)
    security: SecurityConfig = field(default_factory=SecurityConfig)
    logging: LoggingConfig = field(default_factory=LoggingConfig)

    @classmethod
    def from_env(cls) -> AppConfig:
        env_name = os.getenv("APP_ENV", "development")
        env = Environment(env_name)

        config = cls(
            env=env,
            debug=os.getenv("APP_DEBUG", "true").lower() == "true",
            database=DatabaseConfig(
                url=os.getenv("DATABASE_URL", "sqlite:///app.db"),
                pool_size=int(os.getenv("DB_POOL_SIZE", "5")),
                echo=os.getenv("DB_ECHO", "false").lower() == "true",
            ),
            security=SecurityConfig(
                secret_key=os.getenv("SECRET_KEY", "change-me-in-production"),
                access_token_expire_minutes=int(
                    os.getenv("ACCESS_TOKEN_EXPIRE_MINUTES", "30")
                ),
            ),
            logging=LoggingConfig(
                level=os.getenv("LOG_LEVEL", "INFO"),
                format=os.getenv("LOG_FORMAT", "json"),
                file_path=Path(log_path)
                if (log_path := os.getenv("LOG_FILE_PATH"))
                else None,
            ),
        )

        if env == Environment.PRODUCTION:
            config.debug = False
            if config.security.secret_key == "change-me-in-production":
                raise ValueError("SECRET_KEY must be set in production")

        return config
```
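The repeated `.lower() == "true"` comparisons above silently treat values such as `1` or `yes` as false. One possible refinement is a small parsing helper. This is a sketch, not part of the configuration classes above: `env_bool` is a hypothetical name, and the accepted spellings are an assumption.

```python
import os
from collections.abc import Mapping

# Assumed set of accepted spellings; adjust to the project's conventions.
_TRUE_VALUES = {"1", "true", "yes", "on"}
_FALSE_VALUES = {"0", "false", "no", "off"}


def env_bool(name: str, default: bool, environ: Mapping[str, str] = os.environ) -> bool:
    """Read a boolean environment variable, failing loudly on unknown values."""
    raw = environ.get(name)
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in _TRUE_VALUES:
        return True
    if value in _FALSE_VALUES:
        return False
    raise ValueError(f"{name} must be a boolean-like value, got {raw!r}")


print(env_bool("APP_DEBUG", default=True, environ={"APP_DEBUG": "off"}))  # False
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.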
### 24.4.3 Pydantic Settings

```python
from pydantic import Field, SecretStr, ValidationInfo, field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=".env",
        env_file_encoding="utf-8",
        env_prefix="APP_",
        env_nested_delimiter="__",
        case_sensitive=False,
    )

    app_name: str = "MyApp"
    environment: str = Field(default="development", pattern="^(development|staging|production)$")
    debug: bool = False

    database_url: SecretStr = SecretStr("sqlite:///app.db")
    database_pool_size: int = Field(default=5, ge=1, le=100)

    redis_url: str = "redis://localhost:6379/0"

    secret_key: SecretStr = SecretStr("change-me")
    access_token_expire_minutes: int = Field(default=30, ge=1)

    log_level: str = Field(default="INFO", pattern="^(DEBUG|INFO|WARNING|ERROR|CRITICAL)$")

    @field_validator("secret_key")
    @classmethod
    def validate_secret_key(cls, v: SecretStr, info: ValidationInfo) -> SecretStr:
        # info.data holds the fields validated so far; environment is declared earlier.
        if info.data.get("environment") == "production" and v.get_secret_value() == "change-me":
            raise ValueError("SECRET_KEY must be set in production")
        return v


class Database:
    # Plain holder for database defaults; not referenced by Settings above.
    url: str = "sqlite:///app.db"
    pool_size: int = 5


settings = Settings()
```
24.4.4 环境变量安全实践 1 2 3 4 5 6 APP_ENV=production APP_DEBUG=false APP_DATABASE_URL=postgresql://user:pass@db:5432/app APP_SECRET_KEY=${VAULT_SECRET_KEY} APP_LOG_LEVEL=INFO APP_REDIS_URL=redis://redis:6379/0
1 2 3 .env .env.local .env.*.local
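Because secrets arrive through the environment rather than the repository, it helps to fail fast when one is missing. The sketch below shows one way to check this at startup; `require_env` is a hypothetical helper, and the variable names in the usage line are taken from the example `.env` above.

```python
import os
from collections.abc import Iterable, Mapping


def require_env(
    names: Iterable[str],
    environ: Mapping[str, str] = os.environ,
) -> dict[str, str]:
    """Return the requested variables, or raise naming every missing one."""
    wanted = list(names)
    # Treat unset and empty values alike: both mean "not configured".
    missing = [name for name in wanted if not environ.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing required environment variables: {', '.join(missing)}"
        )
    return {name: environ[name] for name in wanted}
```

A typical call at application startup might be `require_env(["APP_SECRET_KEY", "APP_DATABASE_URL"])`. Reporting all missing names at once avoids fixing them one restart at a time.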
**Security principles:**

- **Never commit secrets**: inject all sensitive information through environment variables
- **Provide .env.example**: list every required environment variable with an example value
- **Validate at runtime**: check at startup that the required environment variables are set
- **Least privilege**: give each environment only the permissions that environment needs
- **Key rotation**: rotate secrets regularly and support seamless switchover

## 24.5 Documentation

### 24.5.1 README.md Conventions

````markdown
# Project Name

> One-line description of what this project does

## Features

- Feature 1
- Feature 2
- Feature 3

## Quick Start

### Prerequisites

- Python >= 3.10
- uv (recommended) or pip

### Installation

```bash
uv pip install myproject
```

### Basic Usage

```python
from myproject import App

app = App()
app.run()
```

## Documentation

Full documentation is available at docs.example.com.

## Development

```bash
git clone https://github.com/user/myproject.git
cd myproject
uv sync
uv run pytest
```

## Contributing

Please read CONTRIBUTING.md for details.

## License

This project is licensed under the MIT License - see LICENSE.
````
### 24.5.2 API Documentation

```python
from __future__ import annotations


def calculate_discount(
    price: float,
    discount_rate: float,
    *,
    min_price: float = 0.0,
    max_discount: float | None = None,
) -> float:
    """Calculate the discounted price of an item.

    Applies the given discount rate to the original price, with optional
    constraints on minimum price and maximum discount amount.

    Args:
        price: The original price of the item. Must be non-negative.
        discount_rate: The discount rate as a decimal (e.g., 0.15 for 15%).
            Must be between 0.0 and 1.0.
        min_price: The minimum allowable price after discount.
            Defaults to 0.0.
        max_discount: The maximum discount amount allowed.
            If None, no cap is applied. Defaults to None.

    Returns:
        The discounted price, clamped to [min_price, price].

    Raises:
        ValueError: If price is negative or discount_rate is out of range.

    Examples:
        >>> calculate_discount(100.0, 0.15)
        85.0
        >>> calculate_discount(100.0, 0.50, max_discount=30.0)
        70.0
        >>> calculate_discount(100.0, 0.90, min_price=20.0)
        20.0
    """
    if price < 0:
        raise ValueError(f"price must be non-negative, got {price}")
    if not 0.0 <= discount_rate <= 1.0:
        raise ValueError(
            f"discount_rate must be between 0.0 and 1.0, got {discount_rate}"
        )

    discount_amount = price * discount_rate
    if max_discount is not None:
        discount_amount = min(discount_amount, max_discount)

    discounted = price - discount_amount
    return max(discounted, min_price)
```
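The `Examples` section of such a docstring is more than illustration: the standard library's `doctest` module can execute it. The sketch below uses a hypothetical `add_surcharge` function and a hypothetical `run_docstring_examples` helper to show the mechanism; in a real project, `python -m doctest` or pytest's `--doctest-modules` option would usually be used instead.

```python
import doctest


def add_surcharge(amount: float, rate: float = 0.25) -> float:
    """Hypothetical helper used only to illustrate doctest.

    Examples:
        >>> add_surcharge(100.0)
        125.0
        >>> add_surcharge(50.0, rate=0.5)
        75.0
    """
    return amount * (1 + rate)


def run_docstring_examples(func) -> doctest.TestResults:
    """Run the >>> examples in func's docstring and return the summary."""
    finder = doctest.DocTestFinder(recurse=False)
    runner = doctest.DocTestRunner(verbose=False)
    # module=False skips module lookup; globs supplies the names the examples use.
    tests = finder.find(
        func, name=func.__name__, module=False, globs={func.__name__: func}
    )
    for test in tests:
        runner.run(test)
    return runner.summarize(verbose=False)


results = run_docstring_examples(add_surcharge)
print(results.failed, results.attempted)  # 0 2
```

Running the examples in CI keeps the documentation from drifting away from the code it describes.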
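### 24.5.3 Architecture Decision Records (ADR)

```markdown
# ADR-001: Choose SQLAlchemy as the ORM

## Status

Accepted

## Context

The project needs to interact with a relational database and must choose an ORM framework.

## Decision

Use SQLAlchemy 2.0 as the ORM framework.

## Rationale

1. **Maturity**: SQLAlchemy is the most mature ORM in the Python ecosystem, with broad community support
2. **2.0 improvements**: native support for async, type annotations, and dataclass integration
3. **Flexibility**: multiple levels of abstraction, from the high-level ORM down to raw SQL
4. **Performance**: version 2.0 significantly improves bulk operations and query performance

## Alternatives

- Django ORM: tightly coupled to Django, not suited to standalone use
- Tortoise ORM: async-first, but a smaller ecosystem
- SQLModel: built on SQLAlchemy but younger; stability still to be proven

## Consequences

- The team needs to learn the new SQLAlchemy 2.0 API
- Alembic can be used for database migrations
- Care is needed to use async sessions correctly
```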
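## 24.6 Refactoring

### 24.6.1 Refactoring Principles

The core principles of refactoring come from Martin Fowler's classic *Refactoring: Improving the Design of Existing Code*:

- **Small steps**: make one small change at a time, and keep the code runnable and the tests passing after every step
- **Backed by tests**: make sure there is enough test coverage before refactoring
- **Behavior stays the same**: refactoring does not change the externally observable behavior of the code
- **Refactor continuously**: follow the "rule of three" and refactor the third time you do something similar

### 24.6.2 Code Smells and Refactorings

| Code smell | Description | Refactorings |
| --- | --- | --- |
| Long Method | The method is too long to understand easily | Extract Method, Replace Temp with Query |
| Large Class | The class has too many responsibilities | Extract Class, apply the single responsibility principle |
| Long Parameter List | Too many parameters | Introduce Parameter Object, Preserve Whole Object |
| Divergent Change | One class changes for several different reasons | Extract Class |
| Shotgun Surgery | One change touches many classes | Move Method/Field, Inline Class |
| Feature Envy | A method relies heavily on another class's data | Move Method, Extract Method |
| Data Clumps | Data items that always appear together | Extract Class |
| Primitive Obsession | Overuse of primitive types | Replace Data Value with Object |
| Switch Statements | Complex conditional logic | Replace Conditional with Polymorphism, or use the Strategy pattern |
| Speculative Generality | Over-engineering | Collapse Hierarchy, Inline Class |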
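As a small illustration of one row in the table, the sketch below applies Introduce Parameter Object to a Long Parameter List (which here is also a Data Clump). The function and class names are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


# Before: five related values travel together through every call.
def format_address_before(
    name: str, street: str, city: str, postcode: str, country: str
) -> str:
    return f"{name}\n{street}\n{postcode} {city}\n{country}"


# After: the clump becomes one immutable value object that owns the formatting.
@dataclass(frozen=True)
class Address:
    name: str
    street: str
    city: str
    postcode: str
    country: str

    def format(self) -> str:
        return f"{self.name}\n{self.street}\n{self.postcode} {self.city}\n{self.country}"


address = Address("Ada", "1 Main St", "Springfield", "12345", "US")
# The refactoring keeps the observable output identical.
assert address.format() == format_address_before(
    "Ada", "1 Main St", "Springfield", "12345", "US"
)
```

The final assertion plays the role of a characterization test: it pins down the old behavior so the new structure can be checked against it.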
24.6.3 重构实战示例 重构前 :
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 def process_order (data: dict ) -> dict : if not data.get("items" ): raise ValueError("Order must have items" ) total = 0 for item in data["items" ]: if item.get("price" ) and item.get("quantity" ): if item.get("discount" ): item_total = item["price" ] * item["quantity" ] * (1 - item["discount" ]) else : item_total = item["price" ] * item["quantity" ] total += item_total if data.get("coupon" ): if data["coupon" ] == "SAVE10" : total = total * 0.9 elif data["coupon" ] == "SAVE20" : total = total * 0.8 if total > 1000 : total = total * 0.95 tax = total * 0.08 return { "subtotal" : total, "tax" : tax, "total" : total + tax, "items_count" : len (data["items" ]), }
**After refactoring:**

```python
from __future__ import annotations

from dataclasses import dataclass, field
from enum import Enum


class Coupon(Enum):
    SAVE10 = ("SAVE10", 0.10)
    SAVE20 = ("SAVE20", 0.20)

    def __init__(self, code: str, discount_rate: float) -> None:
        self.code = code
        self.discount_rate = discount_rate

    @classmethod
    def from_code(cls, code: str) -> Coupon | None:
        return next((c for c in cls if c.code == code), None)


@dataclass
class OrderItem:
    price: float
    quantity: int
    discount: float = 0.0

    @property
    def subtotal(self) -> float:
        return self.price * self.quantity * (1 - self.discount)


@dataclass
class Order:
    items: list[OrderItem] = field(default_factory=list)
    coupon: Coupon | None = None
    bulk_discount_threshold: float = 1000.0
    bulk_discount_rate: float = 0.05
    tax_rate: float = 0.08

    @property
    def items_count(self) -> int:
        return len(self.items)

    @property
    def subtotal(self) -> float:
        base = sum(item.subtotal for item in self.items)
        after_coupon = self._apply_coupon(base)
        after_bulk = self._apply_bulk_discount(after_coupon)
        return after_bulk

    @property
    def tax(self) -> float:
        return self.subtotal * self.tax_rate

    @property
    def total(self) -> float:
        return self.subtotal + self.tax

    def to_dict(self) -> dict:
        return {
            "subtotal": round(self.subtotal, 2),
            "tax": round(self.tax, 2),
            "total": round(self.total, 2),
            "items_count": self.items_count,
        }

    def _apply_coupon(self, amount: float) -> float:
        if self.coupon is None:
            return amount
        return amount * (1 - self.coupon.discount_rate)

    def _apply_bulk_discount(self, amount: float) -> float:
        if amount > self.bulk_discount_threshold:
            return amount * (1 - self.bulk_discount_rate)
        return amount


def process_order(data: dict) -> dict:
    items = [
        OrderItem(
            price=item["price"],
            quantity=item["quantity"],
            discount=item.get("discount", 0.0),
        )
        for item in data.get("items", [])
    ]

    if not items:
        raise ValueError("Order must have items")

    coupon = None
    if coupon_code := data.get("coupon"):
        coupon = Coupon.from_code(coupon_code)

    order = Order(items=items, coupon=coupon)
    return order.to_dict()
```

Note that this version is not strictly behavior-preserving: it rounds the monetary values to two decimal places, and it raises `KeyError` for an item that lacks `price` or `quantity`, whereas the original silently skipped such items. In a real refactoring, differences like these should either be removed or be made deliberately and covered by tests.
### 24.6.4 Automation with a Makefile

```makefile
.PHONY: install lint format test test-e2e build clean check setup

PYTHON := python
UV := uv

install:
	$(UV) sync

lint:
	$(UV) run ruff check .
	$(UV) run mypy src/

format:
	$(UV) run ruff format .
	$(UV) run ruff check --fix .

test:
	$(UV) run pytest tests/ -v --cov=src --cov-report=term-missing

test-e2e:
	$(UV) run pytest tests/e2e/ -v

build:
	$(UV) build

clean:
	find . -type d -name __pycache__ -exec rm -rf {} +
	find . -type f -name "*.pyc" -delete
	rm -rf .pytest_cache .mypy_cache .ruff_cache
	rm -rf dist build *.egg-info

check: lint test
	@echo "✅ All checks passed!"

setup: install
	$(UV) run pre-commit install
	@echo "✅ Development environment ready!"
```
## 24.7 Recent Developments

### 24.7.1 Modern Python Project Management

```toml
[project]
name = "my-project"
version = "0.1.0"
requires-python = ">=3.11"
dependencies = [
    "requests>=2.28.0",
]

[project.optional-dependencies]
dev = ["pytest", "ruff", "mypy"]

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.ruff]
line-length = 88

[tool.ruff.lint]
select = ["E", "F", "I", "N", "W"]

[tool.mypy]
strict = true
```
### 24.7.2 The uv Package Manager

```bash
uv init my-project
uv add requests numpy pandas
uv add --dev pytest ruff mypy

uv sync
uv run pytest
```
### 24.7.3 Ruff as an All-in-One Tool

```toml
[tool.ruff]
line-length = 88
target-version = "py311"

[tool.ruff.lint]
select = ["E", "F", "I", "N", "W", "UP", "B", "C4", "SIM"]
ignore = ["E501"]

[tool.ruff.format]
quote-style = "double"
```
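### 24.7.4 Configuration Management with Pydantic Settings

```python
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Pydantic v2 style; replaces the nested `class Config` used in v1.
    model_config = SettingsConfigDict(env_file=".env", env_file_encoding="utf-8")

    app_name: str = "MyApp"
    debug: bool = False
    database_url: str = Field(alias="DATABASE_URL")  # required: no default


settings = Settings()  # raises a validation error if DATABASE_URL is not set
```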
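## 24.8 Chapter Summary

This chapter covered the core knowledge of Python project structure and conventions:

- **Project organization**: design principles of the src layout, web application structure, and data science project structure
- **Package management**: the pyproject.toml standard, a comparison of modern package management tools, and dependency version strategy
- **Code conventions**: PEP 8 and related standards, Ruff as an all-in-one tool, type checking with mypy, and automation with pre-commit
- **Configuration management**: layered configuration, class-based configuration, Pydantic Settings, and environment variable security
- **Documentation**: README conventions, API documentation, and architecture decision records
- **Refactoring**: refactoring principles, recognizing code smells, and a worked refactoring example

## 24.9 Exercises and Projects

### Basic Exercises

1. **Project initialization**: Use uv to create a standard Python library project and configure pyproject.toml with complete metadata, dependencies, and tool configuration.
2. **Code quality configuration**: Set up Ruff + mypy + pre-commit for an existing project and make sure every check passes.
3. **Configuration management**: Implement a configuration system based on Pydantic Settings that supports .env files and environment variable overrides.

### Advanced Exercises

1. **Project template**: Create a reusable project template (cookiecutter or copier) that includes:
   - src layout
   - A complete pyproject.toml
   - pre-commit configuration
   - GitHub Actions CI
   - A documentation skeleton
2. **Code review**: Analyze a piece of legacy code of 200 lines or more for code smells, list every problem you find, and draw up a refactoring plan.
3. **Refactoring in practice**: Refactor a procedural Python script into an object-oriented design, keeping the tests passing throughout.

### Project Exercises

1. **Full project setup**: Build a production-grade Python web project from scratch. Requirements:
   - Use the src layout
   - Set up layered configuration management
   - Integrate Ruff + mypy + pre-commit
   - Write a complete README and API documentation
   - Configure GitHub Actions CI/CD
   - Carry out at least one meaningful refactoring
2. **Code quality dashboard**: Develop a tool that analyzes code quality metrics for a Python project:
   - Cyclomatic complexity
   - Lines of code and comment ratio
   - Type annotation coverage
   - Test coverage
   - Dependency health

### Discussion Questions

1. In a microservice architecture where several services share a common library, how should the package structure and versioning strategy be designed to balance reuse against independence?
2. When a project evolves from a monolith to microservices, how should the project structure change, and which refactoring strategies need to be considered?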
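## 24.10 Further Reading

### 24.10.1 Project Structure

### 24.10.2 Package Management Tools

### 24.10.3 Code Quality

### 24.10.4 Refactoring and Architecture

- *Refactoring* (Martin Fowler): the classic work on refactoring
- *Clean Code* (Robert C. Martin): a guide to writing clean code
- *Architecture Patterns with Python*: architectural patterns in Python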