PyTorch Beginner Tutorial: A Deep Learning Framework from Scratch¶

Audience: machine learning beginners
Companion project: DeepQuantum, a quantum computing simulation framework
Versions: latest PyTorch + DeepQuantum v4.4.0
Last updated: 2026-01-14

Contents¶

What Is PyTorch?
Core Concept: Tensors
Automatic Differentiation (Autograd)
Neural Network Modules (nn.Module)
PyTorch in Quantum Computing
Hands-On Examples
Learning Resources

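What Is PyTorch?¶

🎯 In Plain Terms¶

Imagine you are working through a math problem that involves a lot of number crunching. You could:
- Work it out with pen and paper: slow and error-prone
- Use a calculator: faster, but it only does arithmetic
- Use PyTorch: very fast, and it computes derivatives for you

PyTorch is an open-source **deep learning framework** originally developed by Facebook (now Meta). Think of it as a powerful "scientific calculator plus optimizer" that is particularly well suited to:
- ✅ Processing large amounts of data (tensor operations)
- ✅ Training neural networks (automatic differentiation)
- ✅ GPU-accelerated computation
- ✅ Quantum computing simulation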

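📊 Why PyTorch?¶

Feature	Description
Easy to learn	The API reads like ordinary Python
Dynamic computation graph	The graph is built as the code runs, so you can change the model freely and debug with normal Python tools
Strong community	Plenty of tutorials and open-source projects
GPU acceleration	Computation runs on the GPU once tensors and models are moved there with `.to(device)`
Automatic differentiation	Gradients are computed for you, so there is no need to derive them by hand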

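Core Concept: Tensors¶

🧠 What Is a Tensor?¶

A **tensor** is a multi-dimensional array: it generalizes scalars, vectors, and matrices to any number of dimensions.

Scalar (0-D tensor):    5
Vector (1-D tensor):    [1, 2, 3]
Matrix (2-D tensor):    [[1, 2], [3, 4]]
3-D tensor:             [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

Everyday analogies:
- Scalar: a single thermometer reading (e.g. 26°C)
- Vector: a series of values (e.g. a week of temperatures [23, 25, 26, 24, 22, 21, 20])
- Matrix: a spreadsheet (rows and columns of data)
- Higher-dimensional tensor: a thick book (pages × rows × columns)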
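The four levels listed above can be checked directly in code. The following is a minimal sketch, assuming only that `torch` is installed; the variable names are illustrative and do not come from DeepQuantum.

```python
import torch

# One tensor for each level of dimensionality described above
scalar = torch.tensor(5)
vector = torch.tensor([1, 2, 3])
matrix = torch.tensor([[1, 2], [3, 4]])
cube = torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# ndim is the number of dimensions; shape gives the size along each one
for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(f"{name}: ndim={t.ndim}, shape={tuple(t.shape)}")
```

`ndim` counts the axes and `shape` lists their sizes, so a scalar has an empty shape and the 3-D example has shape (2, 2, 2).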

💻 Creating Tensors¶

import torch

# Method 1: create from data
t1 = torch.tensor([1, 2, 3])           # a vector
t2 = torch.tensor([[1, 2], [3, 4]])    # a matrix

# Method 2: create special tensors
t3 = torch.zeros(2, 3)     # 2×3 matrix of zeros
# Result: [[0., 0., 0.],
#          [0., 0., 0.]]

t4 = torch.ones(2, 3)      # 2×3 matrix of ones
# Result: [[1., 1., 1.],
#          [1., 1., 1.]]

t5 = torch.eye(2)          # 2×2 identity matrix
# Result: [[1., 0.],
#          [0., 1.]]

t6 = torch.randn(2, 3)     # 2×3 random matrix (standard normal distribution)
# Example result (values differ on every run):
#         [[-0.1234,  0.5678, -0.9012],
#          [ 0.3456, -0.7890,  0.2345]]

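🔢 Tensor Data Types¶

Quantum computing relies heavily on **complex tensors**:

# Create a complex tensor (required for quantum states)
state = torch.zeros(4, 1, dtype=torch.cfloat)
# cfloat  = complex64:  two 32-bit floats (real + imaginary), 8 bytes per element
# cdouble = complex128: two 64-bit floats (real + imaginary), 16 bytes per element

# Example: the quantum state |00⟩ = [1, 0, 0, 0]ᵀ
quantum_state = torch.tensor([
    [1+0j],  # amplitude of |00⟩
    [0+0j],  # amplitude of |01⟩
    [0+0j],  # amplitude of |10⟩
    [0+0j]   # amplitude of |11⟩
], dtype=torch.cfloat)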
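To make the complex dtypes concrete, the sketch below inspects a state tensor's dtype and per-element size. It is a small illustration, not DeepQuantum code, and assumes only standard `torch` tensor attributes.

```python
import torch

# The |00> state as a single-precision complex column vector
state = torch.tensor([[1 + 0j], [0j], [0j], [0j]], dtype=torch.cfloat)

print(state.dtype)           # torch.cfloat is an alias of torch.complex64
print(state.element_size())  # bytes per element
print(state.real.dtype)      # dtype of the real part

# Converting to double precision doubles the storage per element
state_double = state.to(torch.cdouble)
print(state_double.dtype, state_double.element_size())
```

Single precision halves memory use, which matters because a state vector doubles in size with every added qubit; double precision is the safer choice when rounding error accumulates over deep circuits.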

🎯 Common Tensor Operations¶

import torch

# Create two tensors
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

# 1. Basic arithmetic (element-wise)
print(a + b)        # addition: [5, 7, 9]
print(a * b)        # element-wise multiplication: [4, 10, 18]
print(a ** 2)       # square: [1, 4, 9]

# 2. Math functions
print(torch.cos(a)) # cosine: [0.5403, -0.4161, -0.9900]
print(torch.sin(a)) # sine: [0.8415, 0.9093, 0.1411]
print(torch.exp(a)) # exponential: [2.7183, 7.3891, 20.0855]

# 3. Linear algebra
A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])
print(torch.matmul(A, B))  # matrix multiplication
# Result: [[19, 22],
#          [43, 50]]

# 4. Concatenation
x = torch.tensor([[1, 2], [3, 4]])
y = torch.tensor([[5, 6]])
print(torch.cat([x, y], dim=0))  # concatenate along rows
# Result: [[1, 2],
#          [3, 4],
#          [5, 6]]

# 5. Einstein summation (advanced)
# Essential for matrix product state (MPS) calculations
print(torch.einsum('ij,jk->ik', A, B))  # equivalent to matrix multiplication

🌟 Tensors in DeepQuantum¶

Quantum state initialization (state.py:15):

import torch
import torch.nn as nn

class QubitState(nn.Module):
    def __init__(self, nqubit):
        super().__init__()
        # Create a 2^nqubit × 1 zero vector
        self.state = torch.zeros((2 ** nqubit, 1), dtype=torch.cfloat)
        self.state[0, 0] = 1.0  # initialize to the |00...0⟩ state

Quantum gate matrices (gate.py:9):

# Pauli-X gate (quantum NOT gate)
class Xgate:
    def get_unitary(self):
        return torch.tensor([
            [0, 1],
            [1, 0]
        ], dtype=torch.cfloat)

# Rotation gate (parameterized)
class RYgate:
    def __init__(self, theta):
        # theta is expected to be a 0-dim torch.Tensor
        self.theta = theta

    def get_unitary(self):
        cos = torch.cos(self.theta / 2)
        sin = torch.sin(self.theta / 2)
        # torch.stack keeps the matrix connected to theta in the autograd graph;
        # torch.tensor([...]) would copy the values and drop the gradient
        return torch.stack([
            torch.stack([cos, -sin]),
            torch.stack([sin,  cos])
        ]).to(torch.cfloat)

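Automatic Differentiation (Autograd)¶

🎓 What Is Automatic Differentiation?¶

**Automatic differentiation** is one of PyTorch's most powerful features. It **computes derivatives (gradients) automatically**, which is essential for optimizing neural network parameters.

An everyday analogy:
- Imagine you are hiking and want to find the lowest point of a valley
- You need to know which direction is downhill (the gradient)
- PyTorch's autograd acts like a compass that tells you which way to go

📚 Why Do We Need Automatic Differentiation?¶

Machine learning and quantum computing both involve a lot of **parameter optimization**:

Differentiating by hand (tedious):

# Suppose the loss function is L = (x - 3)²
# We need dL/dx = 2(x - 3)

x = 2.0
L = (x - 3) ** 2  # L = 1

# Compute the gradient by hand
dL_dx = 2 * (x - 3)  # dL_dx = -2

Differentiating automatically (simple):

x = torch.tensor(2.0, requires_grad=True)
L = (x - 3) ** 2

# Backpropagate to compute the gradient automatically
L.backward()
print(x.grad)  # computed for us: tensor(-2.)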

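🔧 How to Use Automatic Differentiation¶

Step 1: Create tensors that require gradients¶

# requires_grad=True tells PyTorch to track operations on this tensor
x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)

Step 2: Build the computation graph¶

# Define a computation
z = x ** 2 + y ** 3  # z = x² + y³

# PyTorch records every operation behind the scenes
print(z)  # tensor(31., grad_fn=<AddBackward0>)
#                      ^^^^^^^^^^^^^^^^^^^^^^^
#        Note the grad_fn: it is the function used to backpropagate through this result

Step 3: Backpropagate to compute the gradients¶

# Compute dz/dx and dz/dy
z.backward()

# Inspect the gradients
print(x.grad)  # dz/dx = 2x = 4    -> tensor(4.)
print(y.grad)  # dz/dy = 3y² = 27  -> tensor(27.)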
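A useful sanity check is to compare autograd's result with a numerical estimate. The sketch below uses central finite differences on the same function z = x² + y³; the helper name `f` and the step size `eps` are illustrative choices, and double precision is used so that the numerical estimate is accurate.

```python
import torch

def f(x, y):
    # Same function as in the steps above: z = x^2 + y^3
    return x ** 2 + y ** 3

x = torch.tensor(2.0, dtype=torch.float64, requires_grad=True)
y = torch.tensor(3.0, dtype=torch.float64, requires_grad=True)

# Gradients from autograd
f(x, y).backward()

# Gradients from central finite differences
eps = 1e-6
with torch.no_grad():
    fd_x = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    fd_y = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)

print(f"dz/dx: autograd={x.grad.item():.6f}, finite difference={fd_x.item():.6f}")
print(f"dz/dy: autograd={y.grad.item():.6f}, finite difference={fd_y.item():.6f}")
```

The two columns should agree to several decimal places. Autograd's value is exact up to floating-point rounding, while the finite-difference value carries a small truncation error.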

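🎯 Complete Example: An Optimization Problem¶

import torch

# Goal: find the x that minimizes f(x) = (x - 3)²
# The answer should be x = 3

# 1. Initialize x (the starting point)
x = torch.tensor(0.0, requires_grad=True)

# 2. Set the learning rate
lr = 0.1

# 3. Gradient descent loop
for i in range(50):
    # Forward pass: compute the loss
    loss = (x - 3) ** 2

    # Backward pass: compute the gradient
    loss.backward()

    # Update the parameter in place, without recording the update in the graph
    with torch.no_grad():
        x -= lr * x.grad

    # Clear the gradient (important: backward() accumulates into x.grad)
    x.grad.zero_()

    if i % 10 == 0:
        # loss was computed before the update; x is the value after the update
        print(f'Iteration {i}: x = {x.item():.4f}, loss = {loss.item():.4f}')

# Final result
print(f'Optimal solution: x = {x.item():.4f}')  # should be close to 3.0

Output:

Iteration 0: x = 0.6000, loss = 9.0000
Iteration 10: x = 2.7423, loss = 0.1038
Iteration 20: x = 2.9723, loss = 0.0012
Iteration 30: x = 2.9970, loss = 0.0000
Iteration 40: x = 2.9997, loss = 0.0000
Optimal solution: x = 3.0000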

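🌟 Automatic Differentiation in DeepQuantum¶

Variational quantum eigensolver (VQE) (examples/gbs/vqe_ground_energy_for_H2_hardware/):

import torch
import torch.optim as optim

# Note: build_vqe_circuit, calculate_energy and hamiltonian are not defined in this snippet

# 1. Define the trainable parameters (rotation angles of the quantum gates)
angles = torch.nn.Parameter(torch.randn(8))

# 2. Define the optimizer
optimizer = optim.Adam([angles], lr=0.1)

# 3. Training loop
for iteration in range(100):
    optimizer.zero_grad()      # clear the gradients

    # Build the quantum circuit
    circuit = build_vqe_circuit(angles)

    # Compute the energy (expectation value)
    energy = calculate_energy(circuit, hamiltonian)

    # Backward pass
    energy.backward()          # compute the gradients automatically

    # Update the parameters
    optimizer.step()           # updates angles

    print(f'Iteration {iteration}: energy = {energy.item():.6f}')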

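Computing the derivative of a quantum gate (gate.py:11):

import torch
from torch.autograd.functional import jacobian

class ParameterizedGate:
    def __init__(self, theta):
        self.theta = theta  # 0-dim torch.Tensor

    def _unitary(self, theta):
        """Build the gate matrix as a differentiable function of theta"""
        cos = torch.cos(theta / 2)
        sin = torch.sin(theta / 2)
        # torch.stack (unlike torch.tensor) keeps the dependence on theta
        return torch.stack([torch.stack([cos, -sin]), torch.stack([sin, cos])])

    def get_unitary(self):
        """Return the parameterized unitary matrix"""
        return self._unitary(self.theta)

    def get_derivative(self):
        """Compute the derivative automatically"""
        # jacobian calls the function with theta as its argument and returns dU/dθ
        return jacobian(self._unitary, self.theta)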

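⚠️ Common Mistakes¶

Mistake 1: Forgetting to clear the gradients¶

# ❌ Wrong
for i in range(100):
    loss = model(x)
    loss.backward()
    optimizer.step()
    # Gradients accumulate across iterations, so every update uses a stale sum

# ✅ Right
for i in range(100):
    optimizer.zero_grad()  # clear the gradients
    loss = model(x)
    loss.backward()
    optimizer.step()

Mistake 2: Updating parameters while gradients are being tracked¶

# ❌ Wrong
x = torch.tensor(1.0, requires_grad=True)
y = x * 2
y.backward()
x = x - 0.1 * x.grad  # The update is recorded in the graph and x is no longer a leaf tensor

# ✅ Right
x = torch.tensor(1.0, requires_grad=True)
y = x * 2
y.backward()
with torch.no_grad():
    x -= 0.1 * x.grad  # in-place update that autograd does not track
x.grad.zero_()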
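The accumulation behavior behind Mistake 1 is easy to observe on a single tensor. This is a minimal sketch with illustrative variable names: calling `backward()` twice without clearing adds the second gradient to the first.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

# First backward pass: d(x^2)/dx = 2x = 4
(x ** 2).backward()
first = x.grad.item()

# Second backward pass without clearing: the new gradient is added to the old one
(x ** 2).backward()
accumulated = x.grad.item()

# Clearing first restores the expected value
x.grad.zero_()
(x ** 2).backward()
after_reset = x.grad.item()

print(first, accumulated, after_reset)
```

`optimizer.zero_grad()` does the same clearing for every parameter the optimizer manages, which is why it sits at the top of the training loop.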

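Neural Network Modules (nn.Module)¶

🏗️ What Is nn.Module?¶

nn.Module is the **base class** for all neural networks in PyTorch. It acts as a **container** that:
- ✅ Manages parameters (weights and biases) automatically
- ✅ Supports moving between devices (CPU ↔ GPU)
- ✅ Makes it easy to nest submodules
- ✅ Works with autograd, so gradients are computed automatically

An everyday analogy:
- Imagine you are building with LEGO bricks
- nn.Module is the baseplate
- You attach all kinds of bricks to it (layers, activation functions, and so on)

📦 How to Use nn.Module¶

Basic structure¶

import torch
import torch.nn as nn

class MyNetwork(nn.Module):
    def __init__(self):
        super(MyNetwork, self).__init__()
        # Define the layers here
        self.linear1 = nn.Linear(10, 5)  # 10 input features → 5 output features
        self.linear2 = nn.Linear(5, 1)   # 5 input features  → 1 output feature

    def forward(self, x):
        # Define the forward pass here
        x = self.linear1(x)
        x = torch.relu(x)  # activation function
        x = self.linear2(x)
        return x

# Use the network
model = MyNetwork()
inputs = torch.randn(10)  # input data (named inputs to avoid shadowing the built-in input)
output = model(inputs)    # calling the model invokes forward()
print(output)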

Parameter management¶

# List all parameters
for name, param in model.named_parameters():
    print(f'{name}: {param.shape}')

# Output:
# linear1.weight: torch.Size([5, 10])
# linear1.bias: torch.Size([5])
# linear2.weight: torch.Size([1, 5])
# linear2.bias: torch.Size([1])

# Access a specific parameter
print(model.linear1.weight)  # the weights
print(model.linear1.bias)    # the biases

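Using nn.Parameter¶

class QuantumGate(nn.Module):
    def __init__(self):
        super(QuantumGate, self).__init__()
        # Define a trainable parameter
        self.theta = nn.Parameter(torch.tensor(0.0))

    def forward(self):
        """Return the parameterized quantum gate matrix"""
        cos = torch.cos(self.theta / 2)
        sin = torch.sin(self.theta / 2)
        # torch.stack keeps the gradient path to theta; torch.tensor([...]) would cut it
        return torch.stack([torch.stack([cos, -sin]), torch.stack([sin, cos])])

# Usage
gate = QuantumGate()
print(gate.theta)  # Parameter containing: tensor(0., requires_grad=True)

# theta is tracked automatically, so it can be differentiated
loss = gate.theta ** 2
loss.backward()
print(gate.theta.grad)  # tensor(0.), since d(θ²)/dθ = 2θ = 0 at θ = 0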
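When a gate matrix is assembled from a parameter, how the matrix is built decides whether gradients can flow back. The sketch below contrasts `torch.stack` with `torch.tensor` for a rotation matrix; the variable names are illustrative and the angle 0.6 is an arbitrary choice.

```python
import torch

theta = torch.tensor(0.6, requires_grad=True)
cos, sin = torch.cos(theta / 2), torch.sin(theta / 2)

# Built with torch.stack: the matrix stays connected to theta
stacked = torch.stack([torch.stack([cos, -sin]), torch.stack([sin, cos])])

# Built with torch.tensor from plain numbers: a copy with no link to theta
copied = torch.tensor([[cos.item(), -sin.item()], [sin.item(), cos.item()]])

print(stacked.requires_grad)  # True
print(copied.requires_grad)   # False

# Differentiate one matrix element: d sin(θ/2)/dθ = cos(θ/2) / 2
stacked[1, 0].backward()
print(theta.grad)
```

Both matrices hold the same numbers, but only the stacked one can take part in training, which is why parameterized gates should be assembled with `torch.stack` (or similar tensor operations) rather than `torch.tensor`.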

🌟 nn.Module in DeepQuantum¶

Quantum state class (state.py:15):

import torch
import torch.nn as nn

class QubitState(nn.Module):
    """Quantum state: inherits from nn.Module to support automatic differentiation"""

    def __init__(self, nqubit):
        super(QubitState, self).__init__()
        self.nqubit = nqubit

        # Use register_buffer to register a non-parameter tensor
        # A buffer is not optimized, but it moves with the model between devices (CPU/GPU)
        self.register_buffer(
            'state',
            torch.zeros((2 ** nqubit, 1), dtype=torch.cfloat)
        )
        self.state[0, 0] = 1.0  # initialize to |00...0⟩

    def forward(self, gate):
        """Apply a quantum gate"""
        self.state = gate @ self.state
        return self.state

    def __str__(self):
        return f'QubitState({self.nqubit} qubits)'

# Usage
state = QubitState(nqubit=2)
print(state)  # QubitState(2 qubits)
print(state.state)
# tensor([[1.+0.j],
#         [0.+0.j],
#         [0.+0.j],
#         [0.+0.j]])

Quantum circuit class (circuit.py:38):

class QubitCircuit(nn.Module):
    """Quantum circuit: holds a sequence of quantum gates"""

    def __init__(self, nqubit):
        super(QubitCircuit, self).__init__()
        self.nqubit = nqubit
        self.elements = nn.ModuleList()  # stores the gate sequence

    def add_gate(self, gate):
        """Add a quantum gate"""
        self.elements.append(gate)

    def forward(self, state=None):
        """Run the circuit"""
        if state is None:
            state = QubitState(self.nqubit)

        for gate in self.elements:
            state = gate(state)  # apply each gate in order

        return state

# Usage example (Hadamard and CNOT are gate modules not defined in this snippet)
circuit = QubitCircuit(nqubit=2)

# Add quantum gates
circuit.add_gate(Hadamard(0))  # H gate on qubit 0
circuit.add_gate(CNOT(0, 1))   # CNOT gate

# Run the circuit
output_state = circuit()

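🎯 Why Inherit from nn.Module?¶

Feature	Description	Use in quantum computing
Parameter management	Tracks trainable parameters automatically	Optimizing quantum gate parameters
Device switching	`.to(device)` moves all parameters and buffers	GPU-accelerated quantum simulation
Gradient computation	Builds the computation graph automatically	VQE/QAOA parameter optimization
Module nesting	Modules can contain submodules	Circuits contain gates, gates contain parameters
Save/load	Model state is easy to save and restore	Saving a trained circuit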
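Two rows of the table, parameter management and save/load, can be illustrated with a tiny module. This is a minimal sketch under stated assumptions: `TinyGate` is a hypothetical class made up for this example (not part of DeepQuantum), and an in-memory `io.BytesIO` buffer stands in for a file on disk.

```python
import io

import torch
import torch.nn as nn

class TinyGate(nn.Module):
    """Hypothetical module with one trainable parameter and one buffer."""

    def __init__(self):
        super().__init__()
        self.theta = nn.Parameter(torch.tensor(0.25))   # trainable
        self.register_buffer('identity', torch.eye(2))  # not trainable

gate = TinyGate()

# Parameters are discovered automatically; buffers are not parameters
print([name for name, _ in gate.named_parameters()])
# state_dict contains both parameters and buffers
print(list(gate.state_dict().keys()))

# Save the state to an in-memory buffer (a file path works the same way)
buffer = io.BytesIO()
torch.save(gate.state_dict(), buffer)
buffer.seek(0)

# Restore it into a fresh instance whose parameter was changed
restored = TinyGate()
with torch.no_grad():
    restored.theta.fill_(0.0)
restored.load_state_dict(torch.load(buffer))
print(restored.theta.item())
```

Saving the `state_dict` rather than the whole module object keeps the checkpoint independent of the class's code layout, which is the approach PyTorch's documentation recommends.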

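PyTorch in Quantum Computing¶

🌌 1. Representing Quantum States¶

State vector:

import torch

# A 2-qubit system has 4 basis states:
# |00⟩, |01⟩, |10⟩, |11⟩
state = torch.tensor([
    [1+0j],  # amplitude of |00⟩
    [0+0j],  # amplitude of |01⟩
    [0+0j],  # amplitude of |10⟩
    [0+0j]   # amplitude of |11⟩
], dtype=torch.cfloat)

# Normalization check: the squared magnitudes must sum to 1
norm = torch.sum(torch.abs(state) ** 2)
print(f'Norm: {norm.item()}')  # should be 1.0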

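Matrix product state (MPS):

class MatrixProductState(nn.Module):
    """Represent a quantum state as a tensor network"""

    def __init__(self, nqubit, bond_dim):
        super().__init__()
        self.nqubit = nqubit
        self.bond_dim = bond_dim

        # One tensor per qubit, with shape (left bond, physical, right bond)
        self.tensors = nn.ParameterList([
            nn.Parameter(torch.randn(bond_dim, 2, bond_dim, dtype=torch.cfloat))
            for _ in range(nqubit)
        ])

    def contract(self):
        """Contract the tensor network into a full state vector"""
        # Use Einstein summation to contract the shared bond index of neighboring tensors
        state = self.tensors[0]
        for i in range(1, self.nqubit):
            state = torch.einsum('apb,bqc->apqc', state, self.tensors[i])
            # Merge the physical indices so the shape stays (bond, 2^(i+1), bond)
            state = state.reshape(self.bond_dim, -1, self.bond_dim)
        # Assumption: open boundaries, fixed by taking index 0 of both outer bonds
        # (the resulting vector is not normalized)
        return state[0, :, 0].reshape(-1, 1)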

🔲 2. Quantum Gate Operations¶

Single-qubit gates:

def hadamard_gate():
    """Hadamard gate: creates a superposition"""
    sqrt2 = torch.sqrt(torch.tensor(2.0))
    return (1 / sqrt2) * torch.tensor([
        [1,  1],
        [1, -1]
    ], dtype=torch.cfloat)

def pauli_x_gate():
    """Pauli-X gate: quantum NOT gate"""
    return torch.tensor([
        [0, 1],
        [1, 0]
    ], dtype=torch.cfloat)

# Apply a quantum gate
state = torch.tensor([[1], [0]], dtype=torch.cfloat)  # |0⟩
H = hadamard_gate()

# Matrix multiplication
new_state = H @ state
# Result: (1/√2) * [[1], [1]] = (|0⟩ + |1⟩) / √2
print(new_state)
# tensor([[0.7071+0.j],
#         [0.7071+0.j]])
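Every valid quantum gate is unitary, meaning U†U = I, and that property is what keeps a state normalized after the gate is applied. The check below is a minimal sketch; `is_unitary` is an illustrative helper written for this example, not a PyTorch or DeepQuantum function.

```python
import torch

H = (1 / torch.sqrt(torch.tensor(2.0))) * torch.tensor(
    [[1, 1], [1, -1]], dtype=torch.cfloat)
X = torch.tensor([[0, 1], [1, 0]], dtype=torch.cfloat)

def is_unitary(U, atol=1e-6):
    """Return True if U†U equals the identity within the tolerance."""
    identity = torch.eye(U.shape[0], dtype=U.dtype)
    return torch.allclose(U.conj().T @ U, identity, atol=atol)

print(is_unitary(H), is_unitary(X))  # both gates are unitary
print(is_unitary(2 * X))             # scaling a gate breaks unitarity

# A unitary gate preserves the norm of the state it acts on
state = torch.tensor([[1], [0]], dtype=torch.cfloat)
norm_after = torch.sum(torch.abs(H @ state) ** 2).item()
print(norm_after)
```

Running such a check on hand-written gate matrices catches sign and normalization mistakes early, before they show up as probabilities that do not sum to 1.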

Parameterized quantum gates:

class RotationGate(nn.Module):
    """Parameterized rotation gate"""

    def __init__(self, axis='z'):
        super().__init__()
        self.axis = axis
        # Start away from θ = 0, where the gradient of cos(θ/2) vanishes
        self.theta = nn.Parameter(torch.tensor(0.5))

    def get_matrix(self):
        """Return the rotation matrix"""
        half = self.theta / 2
        cos = torch.cos(half)
        sin = torch.sin(half)

        if self.axis == 'x':
            return torch.stack([
                torch.stack([cos, -1j*sin]),
                torch.stack([-1j*sin, cos])
            ])
        elif self.axis == 'y':
            return torch.stack([
                torch.stack([cos, -sin]),
                torch.stack([sin, cos])
            ])
        elif self.axis == 'z':
            return torch.stack([
                torch.stack([torch.exp(-1j*half), torch.zeros_like(cos)]),
                torch.stack([torch.zeros_like(cos), torch.exp(1j*half)])
            ])

# Usage
rx = RotationGate(axis='x')
optimizer = torch.optim.Adam([rx.theta], lr=0.1)

# Train the parameter
for i in range(100):
    optimizer.zero_grad()
    matrix = rx.get_matrix()
    loss = torch.abs(matrix[0, 0] - 0.5)  # illustrative target: cos(θ/2) = 0.5
    loss.backward()
    optimizer.step()

🔄 3. Quantum Circuit Simulation¶

QAOA (Quantum Approximate Optimization Algorithm) (examples/qaoa.ipynb):

import torch
import torch.nn as nn

# Note: apply_hamiltonian, apply_mixing, calculate_expectation and hamiltonian
# are not defined in this snippet

class QAOACircuit(nn.Module):
    """QAOA circuit"""

    def __init__(self, nqubit, depth):
        super().__init__()
        self.nqubit = nqubit
        self.depth = depth

        # Trainable parameters
        self.gamma = nn.Parameter(torch.randn(depth))
        self.beta = nn.Parameter(torch.randn(depth))

    def forward(self, hamiltonian):
        """Run the QAOA circuit and compute the expectation value"""
        # Initial state: uniform superposition
        state = torch.ones(2 ** self.nqubit, dtype=torch.cfloat)
        state = state / torch.sqrt(torch.tensor(2.0 ** self.nqubit))

        # Apply the QAOA layers
        for d in range(self.depth):
            # Evolve under the problem Hamiltonian
            state = apply_hamiltonian(state, hamiltonian, self.gamma[d])
            # Evolve under the mixing Hamiltonian
            state = apply_mixing(state, self.beta[d])

        # Compute the expectation value
        expectation = calculate_expectation(state, hamiltonian)
        return expectation

# Train QAOA
model = QAOACircuit(nqubit=4, depth=3)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

for iteration in range(100):
    optimizer.zero_grad()
    cost = model(hamiltonian)
    cost.backward()  # compute the gradients automatically
    optimizer.step()
    print(f'Iteration {iteration}: cost = {cost.item():.4f}')

Batch processing (circuit.py:185):

from torch import vmap

# Note: encode_data and apply_gates are not defined in this snippet

class BatchCircuit(nn.Module):
    """Quantum circuit with batch support"""

    def __init__(self, nqubit):
        super().__init__()
        self.nqubit = nqubit

    def _forward_single(self, data, state):
        """Forward pass for a single sample"""
        # Encode the data into the quantum state
        encoded = encode_data(data)
        new_state = apply_gates(state, encoded)
        return new_state

    def forward(self, data_batch, state):
        """Batched forward pass"""
        # Use vmap to vectorize automatically
        # in_dims=(0, None): map argument 0 (data) over its batch dimension,
        # and pass argument 1 (state) unchanged to every call
        self.state = vmap(self._forward_single, in_dims=(0, None))(data_batch, state)
        return self.state

# Usage
circuit = BatchCircuit(nqubit=2)
data = torch.randn(100, 2)  # 100 samples, 2 features each
state = torch.zeros(4, 1, dtype=torch.cfloat)
output = circuit(data, state)  # output shape: (100, 4, 1)

📊 4. Optimization Algorithms¶

Variational quantum eigensolver (VQE):

# Note: initialize_state, apply_ry, apply_entanglement and build_molecule_hamiltonian
# are not defined in this snippet

def vqe_optimization(hamiltonian, nqubit, n_layers=3):
    """Use VQE to find the ground-state energy"""

    # 1. Define the variational ansatz (a parameterized circuit)
    class Ansatz(nn.Module):
        def __init__(self):
            super().__init__()
            # Parameters: the rotation angles of every layer
            self.thetas = nn.Parameter(torch.randn(n_layers * nqubit))

        def forward(self):
            # Build the parameterized circuit
            # (define forward, not __call__, so nn.Module's call machinery keeps working)
            state = initialize_state(nqubit)
            for i in range(n_layers):
                for j in range(nqubit):
                    theta = self.thetas[i * nqubit + j]
                    state = apply_ry(state, j, theta)  # RY gate
                # Add an entangling layer
                state = apply_entanglement(state)
            return state

    # 2. Initialize
    ansatz = Ansatz()
    optimizer = torch.optim.Adam([ansatz.thetas], lr=0.1)

    # 3. Optimization loop
    energies = []
    for iteration in range(200):
        optimizer.zero_grad()

        # Forward pass: obtain the quantum state
        state = ansatz()

        # Compute the energy: ⟨ψ|H|ψ⟩
        energy = torch.real(state.conj().T @ (hamiltonian @ state))
        energies.append(energy.item())

        # Backward pass
        energy.backward()

        # Update the parameters
        optimizer.step()

        if iteration % 20 == 0:
            print(f'Iteration {iteration}: energy = {energy.item():.6f}')

    return ansatz, energies

# Run VQE
hamiltonian = build_molecule_hamiltonian('H2')  # Hamiltonian of the hydrogen molecule
ansatz, energies = vqe_optimization(hamiltonian, nqubit=4)

# Plot the convergence curve
import matplotlib.pyplot as plt
plt.plot(energies)
plt.xlabel('Iteration')
plt.ylabel('Energy')
plt.title('VQE optimization')
plt.show()

🖥️ 5. GPU Acceleration¶

Device management:

# Check whether a GPU is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Using device: {device}')

# Move the model and the data to the GPU
model = QubitCircuit(nqubit=10).to(device)
state = torch.zeros(2**10, 1, dtype=torch.cfloat).to(device)

# Computation now runs on whichever device holds the model and the data
output = model(state)

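Memory management (photonic/utils.py:46):

def get_optimal_batchsize(nqubit, device, dtype=torch.cfloat):
    """Estimate a batch size that fits in free GPU memory"""
    # nqubit is passed in explicitly: the memory estimate below depends on it
    device = torch.device(device)
    if device.type == 'cpu':
        return 1  # no GPU memory to budget for on the CPU

    # Query free and total GPU memory, in bytes
    free_mem, total_mem = torch.cuda.mem_get_info(device)

    # Estimate the memory needed per sample (one state vector)
    bytes_per_element = 16 if dtype == torch.cdouble else 8
    elements_per_sample = 2 ** nqubit
    mem_per_sample = elements_per_sample * bytes_per_element

    # Number of state vectors that fit in free memory
    batch_size = free_mem // mem_per_sample
    return max(1, batch_size)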
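The per-sample estimate used above is simple arithmetic: a state vector of n qubits has 2^n complex amplitudes, at 8 bytes each for `cfloat` or 16 bytes each for `cdouble`. The sketch below works through a few sizes; `state_vector_bytes` is an illustrative helper, not a DeepQuantum function.

```python
def state_vector_bytes(nqubit, bytes_per_element=8):
    """Memory for one state vector: 2^nqubit amplitudes times bytes per amplitude."""
    return (2 ** nqubit) * bytes_per_element

for n in (10, 20, 30):
    size = state_vector_bytes(n)
    print(f"{n} qubits: {size / 1024**2:,.3f} MiB (cfloat)")
```

Each additional qubit doubles the requirement: 20 qubits need 8 MiB in single precision, while 30 qubits already need 8 GiB, which is why batch size has to shrink quickly as circuits grow.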

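🌐 6. Distributed Quantum Simulation¶

Multi-GPU parallelism (distributed.py):

import torch
import torch.distributed as dist

def distributed_simulation(rank, world_size):
    """Distributed quantum simulation"""

    # Initialize the process group
    # (MASTER_ADDR and MASTER_PORT must be set in the environment; torchrun sets them)
    dist.init_process_group(
        backend='nccl',  # NVIDIA GPU communication
        rank=rank,
        world_size=world_size
    )

    # Select the GPU for this process
    torch.cuda.set_device(rank)

    # Shard the quantum state across GPUs
    nqubits = 20
    local_size = 2 ** nqubits // world_size
    local_state = torch.randn(local_size, 1, dtype=torch.cfloat).cuda()

    # Global reduction (e.g. to compute the normalization factor)
    local_norm = torch.sum(torch.abs(local_state) ** 2)
    global_norm = local_norm.clone()
    dist.all_reduce(global_norm, op=dist.ReduceOp.SUM)  # in place: sums over all ranks
    print(f'Squared norm: {global_norm.item()}')

    # Clean up
    dist.destroy_process_group()

# Launch multiple processes (torchrun replaces the deprecated torch.distributed.launch);
# the script then reads RANK and WORLD_SIZE from the environment and calls distributed_simulation
# torchrun --nproc_per_node=4 script.py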

Hands-On Examples¶

🎯 Example 1: A Simple Quantum Circuit¶

Goal: implement and run a Bell state circuit

import torch
import torch.nn as nn
import matplotlib.pyplot as plt

# Define the quantum gates
def hadamard():
    """Hadamard gate"""
    sqrt2 = torch.sqrt(torch.tensor(2.0))
    return (1 / sqrt2) * torch.tensor([
        [1, 1],
        [1, -1]
    ], dtype=torch.cfloat)

def cnot():
    """CNOT gate"""
    return torch.tensor([
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0]
    ], dtype=torch.cfloat)

# Build the Bell state circuit
class BellCircuit(nn.Module):
    def __init__(self):
        super().__init__()
        # Register the gate matrices as buffers (no gradients needed)
        self.register_buffer('H', hadamard())
        self.register_buffer('CNOT', cnot())

    def forward(self):
        # Initial state |00⟩
        state = torch.tensor([[1], [0], [0], [0]], dtype=torch.cfloat)

        # Apply H ⊗ I
        HI = torch.kron(self.H, torch.eye(2, dtype=torch.cfloat))
        state = HI @ state

        # Apply CNOT
        state = self.CNOT @ state

        return state

# Run
circuit = BellCircuit()
bell_state = circuit()

print("Bell state:")
print(bell_state)
print("\nProbability distribution:")
probs = torch.abs(bell_state) ** 2
basis = ['|00⟩', '|01⟩', '|10⟩', '|11⟩']
for b, p in zip(basis, probs.flatten()):
    print(f'{b}: {p.item():.4f}')

# Visualize
plt.bar(basis, probs.flatten().numpy())
plt.xlabel('Basis state')
plt.ylabel('Probability')
plt.title('Bell state probability distribution')
plt.show()

Output:

Bell state:
tensor([[0.7071+0.j],
        [0.0000+0.j],
        [0.0000+0.j],
        [0.7071+0.j]])

Probability distribution:
|00⟩: 0.5000
|01⟩: 0.0000
|10⟩: 0.0000
|11⟩: 0.5000
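The probability distribution can be turned into simulated measurement outcomes by sampling from it. The sketch below draws 1000 shots from the Bell state with `torch.multinomial`; the shot count and the fixed seed are arbitrary choices for this illustration.

```python
import torch

torch.manual_seed(0)  # fixed seed so the run is repeatable

# The Bell state (|00> + |11>) / sqrt(2), written out directly
bell_state = torch.tensor([[1], [0], [0], [1]], dtype=torch.cfloat) / (2 ** 0.5)
probs = (torch.abs(bell_state) ** 2).flatten()

# Each sample is the index of a measured basis state: 0=|00>, 1=|01>, 2=|10>, 3=|11>
shots = 1000
samples = torch.multinomial(probs, num_samples=shots, replacement=True)
counts = torch.bincount(samples, minlength=4)

basis = ['|00>', '|01>', '|10>', '|11>']
for b, c in zip(basis, counts.tolist()):
    print(f'{b}: {c}')
```

Only |00⟩ and |11⟩ ever appear, each in roughly half of the shots: the two qubits always agree, which is the signature of the entanglement in a Bell state.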

🧪 Example 2: Finding a Ground-State Energy with VQE¶

Goal: use VQE to minimize the energy of a simple Hamiltonian

import torch
import torch.nn as nn
import torch.optim as optim

# Define the Hamiltonian: H = σ_z ⊗ I + I ⊗ σ_z
H = torch.kron(
    torch.tensor([[1, 0], [0, -1]], dtype=torch.cfloat),
    torch.eye(2, dtype=torch.cfloat)
) + torch.kron(
    torch.eye(2, dtype=torch.cfloat),
    torch.tensor([[1, 0], [0, -1]], dtype=torch.cfloat)
)

# Define the variational ansatz
class SimpleAnsatz(nn.Module):
    def __init__(self):
        super().__init__()
        # Start slightly away from θ = 0: there E(θ) = 1 + cos θ has zero gradient
        self.theta = nn.Parameter(torch.tensor(0.1))

    def get_unitary(self):
        """Parameterized quantum gate"""
        cos = torch.cos(self.theta / 2)
        sin = torch.sin(self.theta / 2)
        # RY gate, built with torch.stack so gradients flow back to theta
        return torch.stack([
            torch.stack([cos, -sin]),
            torch.stack([sin, cos])
        ]).to(torch.cfloat)

    def get_state(self):
        """Return the quantum state"""
        # Initial state |00⟩
        state = torch.tensor([[1], [0], [0], [0]], dtype=torch.cfloat)

        # Apply RY(θ) ⊗ I
        U = torch.kron(self.get_unitary(), torch.eye(2, dtype=torch.cfloat))
        state = U @ state

        return state

    def energy(self):
        """Compute the expectation value ⟨ψ|H|ψ⟩"""
        state = self.get_state()
        # E = ψ† H ψ
        E = torch.real(state.conj().T @ (H @ state))
        return E

# Train
ansatz = SimpleAnsatz()
optimizer = optim.Adam([ansatz.theta], lr=0.1)

print("Starting VQE optimization...\n")
energies = []

for iteration in range(100):
    optimizer.zero_grad()

    # Compute the energy
    E = ansatz.energy()
    energies.append(E.item())

    # Backward pass
    E.backward()

    # Update the parameter
    optimizer.step()

    if iteration % 10 == 0:
        print(f'Iteration {iteration:3d}: energy = {E.item():.6f}, θ = {ansatz.theta.item():.6f}')

# Final result
# Note: this ansatz rotates only the first qubit, so the lowest energy it can reach
# is 0 (at θ = π); the true ground state |11⟩ has energy -2
print(f'\nLowest energy found: {energies[-1]:.6f}')
print(f'Optimal parameter θ = {ansatz.theta.item():.6f}')

# Plot the convergence curve
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 4))

plt.subplot(1, 2, 1)
plt.plot(energies)
plt.xlabel('Iteration')
plt.ylabel('Energy')
plt.title('VQE optimization')
plt.grid(True)

plt.subplot(1, 2, 2)
plt.plot(energies)
plt.xlabel('Iteration')
plt.ylabel('Energy')
plt.title('Convergence detail (last 20 iterations)')
plt.grid(True)
plt.xlim([80, 100])
plt.ylim([min(energies[-20:]) - 0.1, max(energies[-20:]) + 0.1])

plt.tight_layout()
plt.show()

📈 Example 3: Batched Quantum Circuits¶

Goal: use vmap to run many quantum circuits as a batch

import torch
from torch import vmap
import time

def single_circuit(theta, state):
    """Forward pass for a single circuit"""
    # Hadamard gate
    H = (1 / torch.sqrt(torch.tensor(2.0))) * torch.tensor([
        [1, 1],
        [1, -1]
    ], dtype=torch.cfloat)

    # Rotation gate (torch.stack works under vmap; torch.tensor([...]) does not)
    cos = torch.cos(theta / 2)
    sin = torch.sin(theta / 2)
    RY = torch.stack([
        torch.stack([cos, -sin]),
        torch.stack([sin, cos])
    ]).to(torch.cfloat)

    # Apply the gates
    state = H @ state
    state = RY @ state

    return state

# Initialize
state = torch.tensor([[1], [0]], dtype=torch.cfloat)  # |0⟩
thetas = torch.linspace(0, 2*torch.pi, 100)  # 100 different parameter values

# Method 1: Python loop (slow)
start = time.time()
results_loop = []
for theta in thetas:
    result = single_circuit(theta, state)
    results_loop.append(result)
results_loop = torch.stack(results_loop)
loop_time = time.time() - start

# Method 2: vmap (fast)
start = time.time()
results_vmap = vmap(single_circuit, in_dims=(0, None))(thetas, state)
vmap_time = time.time() - start

print(f'Loop time: {loop_time:.4f} s')
print(f'vmap time: {vmap_time:.4f} s')
print(f'Speedup: {loop_time / vmap_time:.2f}x')

# Compute probabilities
probs = torch.abs(results_vmap[:, 0, 0]) ** 2  # probability of |0⟩

# Visualize
import matplotlib.pyplot as plt
plt.figure(figsize=(12, 4))

plt.subplot(1, 3, 1)
plt.plot(thetas.numpy(), probs.numpy())
plt.xlabel('θ (rad)')
plt.ylabel('P(|0⟩)')
plt.title('Probability of |0⟩ vs. θ')
plt.grid(True)

plt.subplot(1, 3, 2)
plt.plot(thetas.numpy(), torch.abs(results_vmap[:, 0, 0]).numpy())
plt.xlabel('θ (rad)')
plt.ylabel('Amplitude magnitude')
plt.title('Amplitude magnitude of |0⟩')
plt.grid(True)

plt.subplot(1, 3, 3)
plt.plot(thetas.numpy(), torch.abs(results_vmap[:, 1, 0]).numpy())
plt.xlabel('θ (rad)')
plt.ylabel('Amplitude magnitude')
plt.title('Amplitude magnitude of |1⟩')
plt.grid(True)

plt.tight_layout()
plt.show()

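Learning Resources¶

📚 Official Documentation¶

- PyTorch Chinese documentation
  - Complete API reference
  - Translations of the official tutorials
- PyTorch 60-Minute Blitz
  - Tensor operations
  - Automatic differentiation
  - Neural network training

🎥 Video Tutorials¶

- Official PyTorch YouTube channel
  - Developer conference talks
  - Hands-on tutorials
- Chinese-language tutorials
  - Search Bilibili for "PyTorch 入门"
  - 莫烦 Python (Morvan Python), PyTorch series

🌐 Online Courses¶

- Fast.ai
  - Aimed at programmers new to deep learning
  - Built around hands-on projects
- Dive into Deep Learning (动手学深度学习)
  - Co-authored by Mu Li
  - Combines theory with practice

🔗 Recommended Articles¶

Tensor basics¶

Automatic differentiation¶

nn.Module¶

💻 Practice Platforms¶

- Google Colab
  - Free GPU access
  - No environment setup required
- Kaggle Kernels
  - Free compute resources
  - A large collection of datasets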

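Summary¶

🎯 PyTorch Key Points¶

Concept	Keyword	Where it is used
Tensors	Multi-dimensional arrays	Data representation, quantum states
Automatic differentiation	Gradient computation	Parameter optimization, VQE
nn.Module	Module container	Quantum circuits, neural networks
GPU acceleration	CUDA	Large-scale simulation
Batch processing	vmap	Parallel computation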

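🚀 Learning Path¶

Week 1: Tensor basics
  ├── Creating tensors
  ├── Tensor operations
  └── Indexing and slicing

Week 2: Automatic differentiation
  ├── Computation graphs
  ├── backward()
  └── Gradient descent

Week 3: nn.Module
  ├── Building networks
  ├── Parameter management
  └── Saving and loading

Week 4: Hands-on projects
  ├── Implementing VQE
  ├── Implementing QAOA
  └── Quantum machine learning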

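💡 Beginner FAQ¶

Q1: When should I use a Tensor, and when a Parameter?
- Tensor: ordinary data; gradients are tracked only if you set requires_grad=True
- Parameter: a model parameter to be optimized; it is registered automatically when assigned as an attribute of an nn.Module

Q2: When should I use .to(device)?
- When you want GPU acceleration
- The model and the data must be moved to the same device

Q3: Why is the gradient sometimes None?
- Check that the tensor was created with requires_grad=True
- Make sure the computation graph is unbroken between the tensor and the loss (for example, no .detach(), .item(), or torch.tensor(...) copies along the way)

Q4: When should I use vmap?
- When you need to process a batch of inputs
- It can replace a Python for loop and improve performance

Happy learning!

If this document helped you, feel free to share it with other beginners. If you have questions, consult the official documentation or ask in the community.

This document was generated from the DeepQuantum v4.4.0 project. Generated on: 2026-01-14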
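The two causes of a `None` gradient named in Q3 can be reproduced in a few lines. This is a minimal sketch with illustrative variable names: `a` never asked for gradients, and one use of `b` is cut off from the graph with `.detach()`.

```python
import torch

a = torch.tensor(1.5)                      # requires_grad defaults to False
b = torch.tensor(1.5, requires_grad=True)

# b.detach() is a copy that autograd ignores, so only b ** 2 is connected to b
loss = a * b.detach() + b ** 2
loss.backward()

print(a.grad)  # None: a was never tracked
print(b.grad)  # only the b ** 2 term contributes: 2 * b
```

If a gradient is unexpectedly `None` or smaller than expected, checking `requires_grad` on the inputs and searching the forward pass for `.detach()`, `.item()`, or `torch.tensor(...)` calls usually locates the break.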