Chapter 3: Data Types

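Learning Objectives

After completing this chapter, you will be able to:

Describe the internal representation and arithmetic rules of Python's numeric types (int, float, complex)
Explain where floating-point precision problems come from (the IEEE 754 standard) and how Decimal provides exact decimal arithmetic
Apply the truth-testing rules of the Boolean type and short-circuit evaluation
Explain the semantics of None and its use as a sentinel in API design
Apply the implicit and explicit rules of type conversion and follow best practices for type checking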

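3.1 Numeric Types

3.1.1 Integers (int)

Python's integer type has no size limit, which is an important difference from the fixed-width integers of languages such as C and Java: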

a = 10
b = -5
c = 0

# Python integers never overflow (arbitrary precision)
large = 2 ** 1000
print(large)
print(type(large))  # <class 'int'>

# Underscore separators (Python 3.6+, for readability)
million = 1_000_000
credit_card = 1234_5678_9012_3456

# Literals in other bases
binary_num = 0b1010      # binary: 10
octal_num = 0o12         # octal: 10
hex_num = 0xA            # hexadecimal: 10

# Base conversion
print(bin(42))           # '0b101010'
print(oct(42))           # '0o52'
print(hex(42))           # '0x2a'
print(int('1010', 2))    # 10 - binary string to int
print(int('FF', 16))     # 255 - hexadecimal string to int

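Academic note: CPython implements int as an **arbitrary-precision integer**. Small integers (-5 through 256) are served from a preallocated cache; larger values are allocated dynamically and stored as an array of digits in base 2³⁰ (on typical 64-bit builds; sys.int_info reports the digit size in use). As a result, Python integers do not overflow the way fixed-width C integers do, but the cost of arithmetic on large integers grows with the number of digits.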
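The small-integer cache described above can be observed directly. The following is a minimal sketch that assumes CPython; object identity of integers is an implementation detail, so other interpreters may print different results for the two `is` comparisons, and programs should always compare integers with `==`.

```python
import sys

# Build the values at run time so the compiler cannot merge equal constants.
a, b = int("256"), int("256")
c, d = int("257"), int("257")

print(a is b)   # True on CPython: 256 comes from the small-integer cache
print(c is d)   # False on CPython: each 257 is a separate object
print(c == d)   # True: value equality is what programs should rely on

# Digit size used by this interpreter's int implementation
print(sys.int_info.bits_per_digit)
```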

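3.1.2 Floating-Point Numbers (float)

Python's float type is an IEEE 754 double-precision (64-bit) binary floating-point number:

import math
import sys

pi = 3.14159
scientific = 1.5e10      # scientific notation: 15000000000.0
negative = -0.0001

print(type(pi))          # <class 'float'>
print(sys.float_info.max)      # largest finite float
print(sys.float_info.epsilon)  # gap between 1.0 and the next representable float

# Floating-point rounding error (inherent to binary floating point)
print(0.1 + 0.2)         # 0.30000000000000004
print(0.1 + 0.2 == 0.3)  # False

# Compare with a tolerance instead
print(math.isclose(0.1 + 0.2, 0.3))  # True

# Special floating-point values
print(math.inf)           # inf  - positive infinity
print(-math.inf)          # -inf - negative infinity
print(math.nan)           # nan  - Not a Number
print(math.inf + 1)       # inf
print(math.nan == math.nan)  # False - NaN is not equal to anything, including itself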

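Academic note: floating-point precision problems arise because most decimal fractions cannot be represented exactly as binary fractions. For example, 0.1 in binary is the infinitely repeating fraction 0.0001100110011..., so it is rounded to the nearest value that fits in 53 significant bits, and that rounding is where the error comes from. Every language that uses IEEE 754 binary floating point behaves this way; it is not specific to Python.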
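To make the rounding concrete, the sketch below asks Python for the exact value that is stored when the literal 0.1 is written. It uses only the standard library (`float.as_integer_ratio`, `decimal.Decimal`, `fractions.Fraction`); the variable names are illustrative.

```python
from decimal import Decimal
from fractions import Fraction

x = 0.1

# Exact ratio of two integers that the float actually stores
numerator, denominator = x.as_integer_ratio()
print(numerator, denominator)        # 3602879701896397 36028797018963968
print(denominator == 2 ** 55)        # True: the denominator is a power of two

# The same stored value written out in decimal
print(Decimal(x))  # 0.1000000000000000055511151231257827021181583404541015625

# The stored value is slightly larger than the mathematical 1/10
print(Fraction(x) > Fraction(1, 10))  # True
```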

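3.1.3 Decimal: Exact Decimal Arithmetic

Finance and other domains where results must match decimal arithmetic done by hand need exact decimal operations:

from decimal import Decimal, getcontext, ROUND_HALF_UP

# Set the global precision (28 significant digits is also the default)
getcontext().prec = 28

# Create from a string (recommended)
a = Decimal("0.1")
b = Decimal("0.2")
print(a + b)              # 0.3 - exact result

# Create from a float (not recommended: the float has already been rounded)
c = Decimal(0.1)          # carries the binary rounding error
print(c)                  # 0.1000000000000000055511151231257827021181583404541015625

# Financial calculation example
price = Decimal("19.99")
tax_rate = Decimal("0.08")
tax = (price * tax_rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
total = price + tax
print(f"Tax: {tax}, total: {total}")  # Tax: 1.60, total: 21.59

# Decimal arithmetic is slower than float (how much depends on the Python
# version and the operation), so use it only where exact decimal results are required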
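The rounding mode passed to `quantize` matters whenever a value falls exactly halfway between two candidates. The sketch below contrasts two standard rounding modes and also shows `decimal.localcontext`, which changes the precision for a single block instead of the whole program; the values are chosen purely for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP, localcontext

cent = Decimal("0.01")
amount = Decimal("0.125")             # exactly halfway between 0.12 and 0.13

half_up = amount.quantize(cent, rounding=ROUND_HALF_UP)
half_even = amount.quantize(cent, rounding=ROUND_HALF_EVEN)
print(half_up)     # 0.13
print(half_even)   # 0.12

# The built-in round() on floats also rounds ties to even
print(round(0.125, 2))  # 0.12

# localcontext() changes precision for one block without touching global state
with localcontext() as ctx:
    ctx.prec = 5
    third = Decimal(1) / Decimal(3)
print(third)       # 0.33333
```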

3.1.4 Complex Numbers (complex)

c = 3 + 4j
print(c.real)             # 3.0 - real part
print(c.imag)             # 4.0 - imaginary part
print(abs(c))             # 5.0 - modulus, sqrt(3² + 4²)
print(c.conjugate())      # (3-4j) - complex conjugate

# Complex arithmetic
z1 = 1 + 2j
z2 = 3 + 4j
print(z1 + z2)            # (4+6j)
print(z1 * z2)            # (-5+10j)

# Create from polar coordinates
import cmath
r, theta = 2, cmath.pi / 3
z = cmath.rect(r, theta)
print(z)                  # approximately (1+1.732j)
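The conversion also works in the other direction. As a small sketch using only the standard `cmath` and `math` modules, `cmath.polar` recovers the modulus and phase of a complex number, and `cmath.rect` turns them back into rectangular form up to floating-point rounding.

```python
import cmath
import math

z = 3 + 4j

# polar() returns (modulus, phase); the phase is in radians
r, phi = cmath.polar(z)
print(r)                      # 5.0
print(math.degrees(phi))      # about 53.13

# rect() is the inverse, up to floating-point rounding
back = cmath.rect(r, phi)
print(cmath.isclose(back, z))  # True
```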

3.1.5 The math Module

import math

# Basic functions
print(math.sqrt(16))      # 4.0
print(math.ceil(3.2))     # 4   - round up
print(math.floor(3.8))    # 3   - round down
print(math.fabs(-5))      # 5.0 - absolute value (always a float)
print(math.gcd(12, 8))    # 4   - greatest common divisor
print(math.lcm(4, 6))     # 12  - least common multiple (Python 3.9+)

# Logarithms and powers
print(math.log(math.e))   # 1.0 - natural logarithm
print(math.log10(100))    # 2.0 - base-10 logarithm
print(math.log2(8))       # 3.0 - base-2 logarithm
print(math.pow(2, 10))    # 1024.0

# Trigonometric functions
print(math.sin(math.pi / 2))   # 1.0
print(math.cos(0))             # 1.0
print(math.degrees(math.pi))   # 180.0 - radians to degrees
print(math.radians(180))       # 3.14159... - degrees to radians

# Constants
print(math.pi)            # 3.141592653589793
print(math.e)             # 2.718281828459045
print(math.tau)           # 6.283185307179586 (2π)
print(math.inf)           # inf

3.1.6 Fractions (Fraction)

For cases that need exact rational arithmetic, Python provides fractions.Fraction:

from fractions import Fraction

a = Fraction(1, 3)        # 1/3
b = Fraction(2, 3)        # 2/3

print(a + b)              # 1
print(a * b)              # 2/9
print(a / b)              # 1/2
print(a ** 2)             # 1/9

# Automatic reduction to lowest terms
c = Fraction(4, 8)
print(c)                  # 1/2

# Create from a float (exact here only because 0.25 is representable in binary;
# Fraction(0.1) would not give 1/10)
d = Fraction(0.25)
print(d)                  # 1/4

# Create from a string (recommended)
e = Fraction('0.333')
print(e)                  # 333/1000

f = Fraction('1/3')
print(f)                  # 1/3

# Convert from Decimal
from decimal import Decimal
g = Fraction(Decimal('0.1'))
print(g)                  # 1/10

# Numerator and denominator
print(a.numerator)        # 1
print(a.denominator)      # 3

# Convert to float
print(float(a))           # 0.3333333333333333

# Practical use: exact ratios
def calculate_ratio(part: int, total: int) -> Fraction:
    """Return part/total as an exact fraction."""
    return Fraction(part, total)

ratio = calculate_ratio(3, 7)
print(f"Ratio: {ratio} = {float(ratio):.4f}")  # Ratio: 3/7 = 0.4286
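When a Fraction has to be built from a float, `Fraction.limit_denominator` can recover a nearby simple fraction. The sketch below shows the raw conversion next to the bounded one; the denominator bounds 10 and 1000 are arbitrary choices for illustration.

```python
import math
from fractions import Fraction

# A float that looks like 0.1 carries binary rounding error into the Fraction
raw = Fraction(0.1)
print(raw)                          # 3602879701896397/36028797018963968

# limit_denominator finds the closest fraction with a bounded denominator
print(raw.limit_denominator(10))    # 1/10
print(Fraction(math.pi).limit_denominator(1000))  # 355/113
```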

3.1.7 How Numeric Types Are Implemented

Knowing how Python's numeric types are implemented helps when writing efficient code.

Memory layout of integer objects:

import sys

small_int = 42
large_int = 2 ** 100

print(sys.getsizeof(small_int))   # 28 bytes (64-bit CPython)
print(sys.getsizeof(large_int))   # grows with the number of digits

def int_size_analysis():
    """Show how the size of an int object changes with its bit length."""
    for bits in [0, 8, 16, 32, 64, 128, 256]:
        n = 2 ** bits
        size = sys.getsizeof(n)
        print(f"2^{bits:3d} = {n:40d} -> {size} bytes")

int_size_analysis()

IEEE 754 representation of a float:

import struct

def float_to_binary(f: float) -> str:
    """Return the IEEE 754 bit fields of a float as text."""
    packed = struct.pack('>d', f)                # 8 bytes, big-endian
    bits = ''.join(f'{b:08b}' for b in packed)   # 64-character bit string
    sign = bits[0]          # 1 bit
    exponent = bits[1:12]   # 11 bits
    mantissa = bits[12:]    # 52 bits
    return f"sign: {sign}\nexponent: {exponent}\nmantissa: {mantissa}"

print(float_to_binary(0.1))
print(float_to_binary(-0.1))
print(float_to_binary(1.0))
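The three fields can also be turned back into the exact numeric value, which confirms how they are interpreted. This is a minimal sketch for normal doubles only (it does not handle subnormals, infinities, or NaN); `decode_double` and `rebuild` are hypothetical helper names, not part of any library.

```python
import struct
from fractions import Fraction

def decode_double(f: float) -> tuple[int, int, int]:
    """Split a float into its raw sign, exponent, and mantissa fields."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", f))
    sign = raw >> 63
    exponent = (raw >> 52) & 0x7FF
    mantissa = raw & ((1 << 52) - 1)
    return sign, exponent, mantissa

def rebuild(sign: int, exponent: int, mantissa: int) -> Fraction:
    """Reconstruct the exact value of a normal (not subnormal/inf/nan) double."""
    significand = 1 + Fraction(mantissa, 1 << 52)   # implicit leading 1
    return (-1) ** sign * significand * Fraction(2) ** (exponent - 1023)

fields = decode_double(0.1)
print(fields)                               # (0, 1019, 2702159776422298)
print(rebuild(*fields) == Fraction(0.1))    # True
```

The exponent field 1019 corresponds to an unbiased exponent of 1019 - 1023 = -4, which matches 0.1 lying between 2⁻⁴ and 2⁻³.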

Performance comparison of numeric types:

import timeit

def benchmark_numeric_types():
    """Compare the cost of addition across numeric types."""
    n = 1_000_000

    int_time = timeit.timeit('a + b',
        setup='a, b = 123456, 789012', number=n)

    float_time = timeit.timeit('a + b',
        setup='a, b = 123.456, 789.012', number=n)

    decimal_time = timeit.timeit('a + b',
        setup='from decimal import Decimal; a, b = Decimal("123.456"), Decimal("789.012")',
        number=n)

    fraction_time = timeit.timeit('a + b',
        setup='from fractions import Fraction; a, b = Fraction(123456, 1000), Fraction(789012, 1000)',
        number=n)

    print(f"int addition:      {int_time:.4f} s")
    print(f"float addition:    {float_time:.4f} s")
    print(f"Decimal addition:  {decimal_time:.4f} s")
    print(f"Fraction addition: {fraction_time:.4f} s")

benchmark_numeric_types()

Performance tip: int and float are built-in types implemented in C and are the fastest. Decimal is slower than float, and Fraction is slower still; the exact factors depend on the Python version and the operation, so measure with the benchmark above. Use Decimal or Fraction only where exact results are required.

3.2 The Boolean Type (bool)

3.2.1 Boolean Basics

is_true = True
is_false = False

print(type(is_true))      # <class 'bool'>
print(isinstance(is_true, int))  # True - bool is a subclass of int
print(True + True)        # 2 - True behaves as 1
print(False + True)       # 1 - False behaves as 0
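Because bool is a subclass of int, Boolean results can be used directly in arithmetic. The short sketch below shows one common use (counting matches with `sum`) and one caveat; the sample data is made up for illustration.

```python
scores = [72, 45, 90, 58, 81]

# Each comparison yields True or False; sum() adds them as 1 and 0
passed = sum(score >= 60 for score in scores)
print(passed)                 # 3

# A bool can also index a two-element sequence (False -> 0, True -> 1)
label = ("fail", "pass")[scores[0] >= 60]
print(label)                  # pass

# Caveat: True and 1 are equal and hash alike, so they collide as dict keys
print(len({True: "bool", 1: "int"}))  # 1
```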

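3.2.2 Truth Testing (Truthiness)

Any Python object can be tested for truth. The built-in values below are falsy; other built-in values are truthy, and instances of user-defined classes are truthy unless the class defines __bool__ returning False or __len__ returning 0:

Falsy value	Meaning
`None`	No value
`False`	Boolean false
`0`, `0.0`, `0j`	Numeric zero
`""`, `''`, `b""`	Empty string or bytes
`[]`, `()`, `{}`, `set()`, `frozenset()`	Empty containers
`range(0)`	Empty range

# Truth testing
print(bool(0))            # False
print(bool(0.0))          # False
print(bool(""))           # False
print(bool([]))           # False
print(bool(None))         # False

print(bool(1))            # True
print(bool(-1))           # True
print(bool("hello"))      # True
print(bool([0]))          # True - non-empty list
print(bool("False"))      # True - non-empty string

# Truth value of a user-defined object
class Container:
    def __init__(self, items):
        self.items = items

    def __bool__(self):
        return len(self.items) > 0

c = Container([])
print(bool(c))            # False
c = Container([1])
print(bool(c))            # True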
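When a class does not define `__bool__`, Python falls back to `__len__`, and if neither exists the instance is always truthy. The sketch below illustrates both cases; `Queue` and `Plain` are hypothetical classes invented for this example.

```python
class Queue:
    """Hypothetical container that defines __len__ but not __bool__."""

    def __init__(self, items=None):
        self.items = list(items or [])

    def __len__(self):
        return len(self.items)

class Plain:
    """No __bool__ and no __len__: instances are always truthy."""

print(bool(Queue()))        # False - len() is 0
print(bool(Queue([1])))     # True
print(bool(Plain()))        # True
```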

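3.2.3 Short-Circuit Evaluation and Practical Patterns

# Short-circuit evaluation: `and` stops at the first falsy operand, `or` at the
# first truthy one; both return that operand itself, not necessarily a bool
result = 0 or "default"           # "default" - supplies a default
result = "" or "N/A"              # "N/A"
result = "Alice" or "default"     # "Alice" - a non-empty value is kept

# Fallback-value pattern
name = input("Name: ") or "Anonymous"
response = None                   # placeholder for a value that may be missing
data = response or {}

# Chained conditions with `and`
x = 5
result = x > 0 and x < 10 and x != 3  # True

# Conditional expression
user_verified = True              # placeholder flag
status = "active" if user_verified else "pending"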
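The `or` default pattern has a known pitfall: it replaces every falsy value, not only a missing one. The following sketch contrasts it with an explicit None check and also demonstrates that the right operand of `and` is skipped once the result is decided; the function names are hypothetical.

```python
def page_size_or(limit=None):
    # `or` replaces every falsy value, including a deliberate 0
    return limit or 20

def page_size_is_none(limit=None):
    # an explicit None check keeps 0 as a real value
    return 20 if limit is None else limit

print(page_size_or(0))        # 20 - the caller's 0 was lost
print(page_size_is_none(0))   # 0

calls = []
def expensive():
    calls.append("called")
    return True

# The right operand is never evaluated once the left one decides the result
print(False and expensive())  # False
print(calls)                  # []
```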

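3.3 The None Type

3.3.1 Semantics of None

None is the singleton object that Python uses to represent "no value":

x = None
print(type(x))            # <class 'NoneType'>
print(x is None)          # True - the recommended test
print(x == None)          # True - works here, but not recommended

# None is a singleton
a = None
b = None
print(a is b)             # True - the same object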

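3.3.2 Typical Uses of None

# 1. Default return value of a function
def greet(name):
    print(f"Hello, {name}")
    # no explicit return, so the function returns None

result = greet("Alice")
print(result)             # None

# 2. Sentinel for a mutable default argument
def append_to(item, target=None):
    """Create a fresh list per call instead of sharing one default list."""
    if target is None:
        target = []
    target.append(item)
    return target

# 3. Marking a missing or uninitialized value
class User:
    def __init__(self, name):
        self.name = name
        self.email = None     # not set yet

# 4. Optional return value in an API (the `dict | None` syntax needs Python 3.10+)
def find_user(user_id: int) -> dict | None:
    """Look up a user; return None if not found."""
    users = {1: {"name": "Alice"}, 2: {"name": "Bob"}}
    return users.get(user_id)

user = find_user(999)
if user is not None:
    print(user["name"])
else:
    print("User not found")

Engineering practice: always test for None with `is None`, not `== None`. The `is` operator compares object identity, so it cannot be affected by an overridden __eq__ method; it is both safer and faster.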
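Two points above can be illustrated together. When None is itself a legitimate value, a private sentinel object can mark "argument not supplied", and a class with a misbehaving `__eq__` shows why identity comparison is the safer test. This is a sketch only: `_MISSING`, `get_setting`, and `AlwaysEqual` are hypothetical names invented for the example.

```python
_MISSING = object()   # hypothetical module-private sentinel

def get_setting(settings: dict, key: str, default=_MISSING):
    """Return settings[key]; raise KeyError only if no default was passed."""
    if key in settings:
        return settings[key]
    if default is _MISSING:
        raise KeyError(key)
    return default

settings = {"timeout": None}                        # None is a stored value here
print(get_setting(settings, "timeout"))             # None
print(get_setting(settings, "retries", None))       # None - explicit default

class AlwaysEqual:
    def __eq__(self, other):
        return True     # a badly behaved __eq__

obj = AlwaysEqual()
print(obj == None)      # True  - misleading
print(obj is None)      # False - identity cannot be overridden
```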

3.4 Type Conversion

3.4.1 Implicit Conversion

Python converts types automatically in a few specific situations:

# int combined with float → float
result = 1 + 2.0
print(type(result))       # <class 'float'>

# int combined with complex → complex
result = 1 + 2j
print(type(result))       # <class 'complex'>

# bool in arithmetic → treated as int
print(True + 1)           # 2
print(False + 0.5)        # 0.5
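Implicit conversion is limited to numeric promotion. The sketch below shows a few boundaries that are worth knowing: true division always yields a float, strings are never converted to numbers automatically, Decimal refuses to mix with float in arithmetic, and promoting a large int to float can lose precision.

```python
from decimal import Decimal

# True division of two ints always produces a float
print(7 / 2)            # 3.5
print(type(4 / 2))      # <class 'float'>
print(7 // 2)           # 3 - floor division keeps int

# There is no implicit str <-> number conversion
try:
    "1" + 1
except TypeError as exc:
    print("TypeError:", exc)

# Decimal deliberately refuses to mix with float in arithmetic
try:
    Decimal("0.1") + 0.1
except TypeError as exc:
    print("TypeError:", exc)

# Large ints lose precision when promoted to float
big = 2 ** 53 + 1
print(float(big) == big)  # False
```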

3.4.2 Explicit Conversion

# Numeric conversion
print(int(3.14))          # 3   - drops the fractional part
print(int(-3.9))          # -3  - truncates toward zero
print(int("42"))          # 42  - string to int
print(int("1010", 2))     # 10  - binary string
print(int("FF", 16))      # 255 - hexadecimal string

print(float(42))          # 42.0
print(float("3.14"))      # 3.14
print(float("1e3"))       # 1000.0

# String conversion
print(str(42))            # "42"
print(str(3.14))          # "3.14"
print(str(True))          # "True"
print(str(None))          # "None"

# Boolean conversion
print(bool(1))            # True
print(bool(0))            # False
print(bool(""))           # False
print(bool("False"))      # True - non-empty string

# Container conversion
print(list("hello"))      # ['h', 'e', 'l', 'l', 'o']
print(tuple([1, 2, 3]))   # (1, 2, 3)
print(set([1, 2, 2, 3]))  # {1, 2, 3}
print(list((1, 2, 3)))    # [1, 2, 3]

3.4.3 Safe Conversion

def safe_int(value: str, default: int = 0) -> int:
    """Convert to int, returning the default if conversion fails."""
    try:
        return int(value)
    except (ValueError, TypeError):
        return default

print(safe_int("42"))       # 42
print(safe_int("abc"))      # 0
print(safe_int("abc", -1))  # -1

def safe_float(value: str, default: float = 0.0) -> float:
    """Convert to float, returning the default if conversion fails."""
    try:
        return float(value)
    except (ValueError, TypeError):
        return default
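The same defensive style is useful for Booleans, because `bool("False")` is True. The sketch below is one possible approach, not a standard-library function: `parse_bool` and the set of accepted spellings are assumptions made for this example.

```python
def parse_bool(text: str, default: bool = False) -> bool:
    """Hypothetical helper: map common textual spellings to a bool."""
    normalized = text.strip().lower()
    if normalized in {"true", "yes", "1", "on"}:
        return True
    if normalized in {"false", "no", "0", "off"}:
        return False
    return default

print(bool("False"))          # True - any non-empty string is truthy
print(parse_bool("False"))    # False
print(parse_bool(" YES "))    # True
print(parse_bool("maybe"))    # False - falls back to the default
```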

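3.5 Type Checking

3.5.1 type() and isinstance()

x = 10

# type() - returns the object's type
print(type(x))            # <class 'int'>
print(type(x) == int)     # True - exact type match

# isinstance() - checks for an instance of a type, including subclasses
print(isinstance(x, int))     # True
print(isinstance(True, int))  # True - bool is a subclass of int
print(isinstance(True, bool)) # True

# isinstance accepts a tuple of types
print(isinstance(x, (int, float, str)))  # True

# type() and isinstance() treat subclasses differently
class MyInt(int):
    pass

n = MyInt(42)
print(type(n) == int)         # False - the exact type is not int
print(isinstance(n, int))     # True  - an instance of a subclass of int

Engineering practice: prefer isinstance() over `type() ==` for type checks. isinstance() respects inheritance, in line with the Liskov substitution principle (LSP) of object-oriented design. Use an exact type() match only when subclasses must be told apart.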
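For numeric code, the abstract base classes in the standard `numbers` module combine well with isinstance(). The sketch below accepts any real number while explicitly rejecting bool; `is_real_number` is a hypothetical helper, and whether bool should be rejected is a design choice that depends on the application.

```python
import numbers
from decimal import Decimal
from fractions import Fraction

def is_real_number(value) -> bool:
    """Hypothetical check: accept real numbers but reject bool."""
    return isinstance(value, numbers.Real) and not isinstance(value, bool)

print(is_real_number(3))                 # True
print(is_real_number(2.5))               # True
print(is_real_number(Fraction(1, 3)))    # True - Fraction is a Rational
print(is_real_number(True))              # False - excluded explicitly
print(is_real_number(1 + 2j))            # False - complex is not Real
print(is_real_number(Decimal("1.5")))    # False - Decimal registers only as Number
print(isinstance(Decimal("1.5"), numbers.Number))  # True
```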

3.5.2 Type Annotations and Static Checking

from typing import Optional

# The `int | float` syntax in annotations needs Python 3.10+
def process_data(
    value: int | float,
    name: str = "",
    flag: bool = False
) -> str:
    """A function with type annotations."""
    return f"{name}: {value} (flag={flag})"

# Optional[T] is equivalent to T | None
def find_user(user_id: int) -> Optional[dict]:
    users = {1: {"name": "Alice"}}
    return users.get(user_id)

# Run static type checking with mypy
# $ pip install mypy
# $ mypy mycode.py
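Annotations are not enforced when the program runs, which is why a separate checker such as mypy is needed. The sketch below shows a call that violates its annotation yet still executes, and how the annotations can be read back as metadata with the standard `typing.get_type_hints`; the function `double` is made up for the example.

```python
from typing import get_type_hints

def double(value: int) -> int:
    return value * 2

# The interpreter does not check the annotation; str * 2 simply repeats
print(double("ab"))              # abab

# Annotations are stored as metadata that tools can read
print(get_type_hints(double))    # {'value': <class 'int'>, 'return': <class 'int'>}
```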

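3.6 Recent Developments

3.6.1 Type Parameter Defaults (PEP 696)

Python 3.12 introduced a dedicated syntax for type parameters (PEP 695), and Python 3.13 added default values for them (PEP 696), which simplifies generic type definitions:

from typing import Generic, TypeVar

# Traditional form
T = TypeVar('T')
class Container(Generic[T]):
    ...

# Python 3.13+ form with a default for T
class Container[T = int]:
    def __init__(self, value: T):
        self.value = value

c = Container("hello")          # T is inferred as str
c2: Container = Container(42)   # a bare Container annotation means Container[int]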

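3.6.2 High-Precision Numeric Libraries

The Python numeric computing ecosystem continues to evolve:

# mpmath - arbitrary-precision math library (third-party)
from mpmath import mp, pi

mp.dps = 50  # work with 50 decimal digits
print(pi)    # 3.1415926535897932384626433832795028841971693993751

# NumPy (third-party; version 2.0 revised its type-promotion rules):
# an explicit dtype keeps the element type unambiguous
import numpy as np
arr = np.array([1, 2, 3], dtype=np.int64)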

3.6.3 Type System Enhancements

from typing import Literal, TypedDict

# Type aliases with the `type` statement (Python 3.12+)
type Point = tuple[float, float]
type Vector = list[float]

# Literal restricts a parameter to specific values
def set_mode(mode: Literal["fast", "slow", "auto"]) -> None:
    ...

# TypedDict describes a dictionary with a fixed set of keys
class UserDict(TypedDict):
    name: str
    age: int
    email: str

user: UserDict = {"name": "Alice", "age": 30, "email": "alice@example.com"}

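3.6.4 Performance Techniques

import math
import sys

# Python 3.11+ ships a faster interpreter
# built on the specializing adaptive interpreter (PEP 659)

# Details of the int implementation (digit size and related limits;
# this does not report the small-integer cache range)
print(sys.int_info)

# Accurate floating-point summation
print(sum([0.1] * 10))        # 0.9999999999999999 - rounding error accumulates
print(math.fsum([0.1] * 10))  # 1.0 - fsum tracks the lost low-order bits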

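3.7 Chapter Summary

This chapter covered Python's core data types:

Integers: arbitrary precision, no overflow, literals in several bases, stored internally as a variable-length array of digits
Floating-point numbers: IEEE 754 double precision, subject to rounding error; use Decimal where exact decimal results are required
Fractions: exact rational values, reduced automatically, well suited to ratios
Complex numbers: built in, useful for scientific computing
Boolean type: bool is a subclass of int; know the truth-testing rules
None type: the singleton that represents "no value"; test it with `is None`
Type conversion: implicit conversion follows numeric promotion rules; explicit conversion needs care at the edges
Type checking: prefer isinstance(), and combine it with type annotations for static checking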

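3.7.1 Choosing a Numeric Type

Scenario	Recommended type	Reason
General computation	int, float	Best performance
Financial calculation	Decimal	Exact decimal arithmetic
Scientific computing	float, complex	Compatible with NumPy
Ratios and fractions	Fraction	Exact rational representation
Bitwise operations	int	Native support
Large integers	int	Arbitrary precision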

3.7.2 Type Conversion Rules at a Glance

┌───────────────────────────────────────────┐
│           Implicit conversions            │
├───────────────────────────────────────────┤
│  int + float      → float                 │
│  int + complex    → complex               │
│  float + complex  → complex               │
│  bool in arithmetic → treated as int      │
└───────────────────────────────────────────┘

┌───────────────────────────────────────────┐
│           Explicit conversions            │
├───────────────────────────────────────────┤
│  int(x)       → integer (truncates)       │
│  float(x)     → floating-point number     │
│  str(x)       → string                    │
│  bool(x)      → Boolean                   │
│  Decimal(x)   → decimal number            │
│  Fraction(x)  → fraction                  │
└───────────────────────────────────────────┘

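3.8 Exercises

Basic

Write a program that reads the radius of a circle and prints its area and circumference, rounded to 2 decimal places.
Convert the string "0xFF" to a decimal integer and verify the result.
Write a function that decides whether a value is falsy, and list all of Python's built-in falsy values.

Intermediate

Use Decimal to implement an exact currency calculator that supports addition, subtraction, multiplication, and division, with results rounded half up to the cent.
Implement temperature conversion functions that convert between Celsius, Fahrenheit, and Kelvin.
Write a function safe_convert(value, target_type) that safely converts to int, float, str, or bool and returns None if the conversion fails.

Project

Scientific calculator: write a command-line scientific calculator that:
- Supports the four basic arithmetic operations and exponentiation
- Supports trigonometric and logarithmic functions
- Uses Decimal for operations that need exact results
- Includes complete type annotations and error handling
- Uses the math and cmath modules

Discussion

Why is 0.1 + 0.2 != 0.3? Explain the root of the floating-point precision problem in terms of the IEEE 754 standard.
Python's bool is a subclass of int. What conveniences and what potential problems does this design decision bring?
Why is `x is None` safer than `x == None`? Give an example of a problem caused by an overridden __eq__ method.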

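3.9 Further Reading

3.9.1 Foundations of Numerical Computing

"What Every Computer Scientist Should Know About Floating-Point Arithmetic" (David Goldberg): the authoritative paper on floating-point computation and required reading for understanding IEEE 754
"Numerical Recipes: The Art of Scientific Computing": a classic scientific computing text covering numerical stability, error analysis, and related core concepts
IEEE 754-2019 standard (https://ieeexplore.ieee.org/document/8766229): the official standard for floating-point arithmetic

3.9.2 Implementation of Python's Numeric Types

CPython source, Objects/longobject.c: the C implementation of Python's integer type, showing how arbitrary-precision integers work
CPython source, Objects/floatobject.c: the implementation of Python's float type
PEP 238, Changing the Division Operator (https://peps.python.org/pep-0238/): the design decision behind true division in Python 3

3.9.3 Exact Arithmetic Libraries

decimal module documentation (https://docs.python.org/3/library/decimal.html): the standard-library module for decimal arithmetic
fractions module documentation (https://docs.python.org/3/library/fractions.html): the standard-library module for rational arithmetic
mpmath (https://mpmath.org/): an arbitrary-precision floating-point library with support for special functions
SymPy (https://www.sympy.org/): a symbolic computation library that supports exact mathematics

3.9.4 The Numerical Computing Ecosystem

NumPy (https://numpy.org/doc/): the foundation library for scientific computing in Python, with efficient multidimensional array operations
SciPy (https://scipy.org/): a scientific computing toolkit covering optimization, integration, interpolation, and more
Pandas (https://pandas.pydata.org/): a data analysis library built on NumPy
"Python Data Science Handbook" (Jake VanderPlas): a comprehensive guide to data science in Python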
