CNN Module Design | Bamboo Traces

AlexNet(2012)

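Grouped convolution

Operation: split the feature channels into groups, then restrict each convolution to its own group.

Advantage: fewer floating-point operations and fewer parameters.

Drawback: blocks the flow of information between channels and weakens the network's representational power.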

ShuffleNet(2018)

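Channel shuffle: enables information exchange between the channel groups of a grouped convolution at low cost, which improves accuracy.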
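Channel shuffle can be sketched as a reshape, a transpose, and a reshape back. The helper name `channel_shuffle` and the tiny six-channel tensor are illustrative choices, not taken from the ShuffleNet code base.

```python
import torch

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    """Interleave channels across groups so the next grouped conv mixes them."""
    n, c, h, w = x.shape
    assert c % groups == 0, "channels must be divisible by groups"
    # (N, C, H, W) -> (N, groups, C // groups, H, W)
    x = x.view(n, groups, c // groups, h, w)
    # Swap the group axis with the per-group channel axis, then flatten back
    x = x.transpose(1, 2).contiguous()
    return x.view(n, c, h, w)

# Six channels labelled 0..5, split into 2 groups: [0, 1, 2] and [3, 4, 5]
x = torch.arange(6, dtype=torch.float32).view(1, 6, 1, 1)
shuffled = channel_shuffle(x, groups=2).flatten().tolist()
print(shuffled)  # [0.0, 3.0, 1.0, 4.0, 2.0, 5.0]
```

After the shuffle, each group of consecutive channels contains channels that originated in different groups, so a following grouped convolution sees information from all of them.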

ShuffleNetV2

Replaces grouped convolution with a channel split operation.
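A rough sketch of a stride-1 unit built around channel split follows. The class name `ShuffleV2Unit`, the exact layer ordering, and the use of batch normalization are assumptions made for illustration; treat this as a simplified reading of the idea rather than the reference implementation.

```python
import torch
import torch.nn as nn

class ShuffleV2Unit(nn.Module):
    """Stride-1 unit: split channels in half, transform one half, concat, shuffle."""

    def __init__(self, channels: int):
        super().__init__()
        assert channels % 2 == 0, "channels must be even to split in half"
        half = channels // 2
        # Only half of the channels go through convolutions (no grouped 1x1 convs)
        self.branch = nn.Sequential(
            nn.Conv2d(half, half, kernel_size=1, bias=False),
            nn.BatchNorm2d(half),
            nn.ReLU(),
            nn.Conv2d(half, half, kernel_size=3, padding=1, groups=half, bias=False),
            nn.BatchNorm2d(half),
            nn.Conv2d(half, half, kernel_size=1, bias=False),
            nn.BatchNorm2d(half),
            nn.ReLU(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Channel split: one half is passed through untouched
        identity, active = x.chunk(2, dim=1)
        out = torch.cat([identity, self.branch(active)], dim=1)
        # Channel shuffle with 2 groups so the two halves mix in the next unit
        n, c, h, w = out.shape
        out = out.view(n, 2, c // 2, h, w).transpose(1, 2).contiguous()
        return out.view(n, c, h, w)

unit = ShuffleV2Unit(channels=16)
y = unit(torch.randn(2, 16, 8, 8))
print(y.shape)
```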

MobileNet(2017)

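Depthwise separable convolution: splits a standard convolution into a depthwise convolution and a pointwise convolution, where the depthwise convolution is a grouped convolution whose number of groups equals the number of channels.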

MobileNetV2(2018)

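Inverted residual block: consists of an expanding pointwise convolution, a depthwise convolution, a projecting (dimension-reducing) pointwise convolution, and a residual connection.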
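The four components listed above can be sketched in PyTorch as follows. The class name `InvertedResidual`, the default `expand_ratio=6`, and the BatchNorm/ReLU6 placement are assumptions for illustration; the residual connection is only applied when input and output shapes match.

```python
import torch
import torch.nn as nn

class InvertedResidual(nn.Module):
    """Expand (1x1) -> depthwise (3x3) -> project (1x1), plus optional skip."""

    def __init__(self, in_channels, out_channels, stride=1, expand_ratio=6):
        super().__init__()
        hidden = in_channels * expand_ratio
        # The skip connection needs identical input and output shapes
        self.use_residual = stride == 1 and in_channels == out_channels
        self.block = nn.Sequential(
            # Pointwise convolution that raises the channel dimension
            nn.Conv2d(in_channels, hidden, kernel_size=1, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(),
            # Depthwise convolution: groups equals the number of channels
            nn.Conv2d(hidden, hidden, kernel_size=3, stride=stride,
                      padding=1, groups=hidden, bias=False),
            nn.BatchNorm2d(hidden),
            nn.ReLU6(),
            # Pointwise convolution that lowers the dimension, no activation
            nn.Conv2d(hidden, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
        )

    def forward(self, x):
        out = self.block(x)
        return x + out if self.use_residual else out

x = torch.randn(2, 32, 56, 56)
same = InvertedResidual(32, 32)(x)            # residual path is used
down = InvertedResidual(32, 64, stride=2)(x)  # no residual, spatial size halves
print(same.shape, down.shape)
```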

MobileNetV3(2019)

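Inverted residual block with an SE module: also introduces h-swish as the nonlinear activation and uses neural architecture search to further improve network efficiency.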
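A minimal sketch of the two ingredients named here: h-swish, written out as x * ReLU6(x + 3) / 6, and a squeeze-and-excitation (SE) gate. The names `h_swish` and `SqueezeExcite`, the `reduction=4` default, and the hard-sigmoid gate are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def h_swish(x: torch.Tensor) -> torch.Tensor:
    """Piecewise-linear approximation of swish: x * ReLU6(x + 3) / 6."""
    return x * F.relu6(x + 3.0) / 6.0

class SqueezeExcite(nn.Module):
    """Re-weight channels using globally pooled statistics."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        squeezed = max(1, channels // reduction)
        self.fc1 = nn.Conv2d(channels, squeezed, kernel_size=1)
        self.fc2 = nn.Conv2d(squeezed, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Squeeze: global average pool to one value per channel
        scale = x.mean(dim=(2, 3), keepdim=True)
        # Excite: bottleneck, then a gate in [0, 1] (hard sigmoid)
        scale = F.relu(self.fc1(scale))
        scale = F.relu6(self.fc2(scale) + 3.0) / 6.0
        return x * scale

se = SqueezeExcite(16)
y = se(torch.randn(2, 16, 8, 8))
print(y.shape)
print(h_swish(torch.tensor([-4.0, 0.0, 4.0])))
```

In a MobileNetV3-style block the SE gate would typically sit after the depthwise convolution, before the projecting pointwise convolution.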

EfficientNet(2019)

Applies neural architecture search on top of the MobileNetV2 block.

EfficientNetV2(2021)

Revises the MobileNetV3-style block with training speed in mind (the Fused-MBConv variant) and adds it to the search space.

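Standard convolution

As the figure shows, convolution proceeds as follows: overlay the filter on the input matrix, multiply the corresponding elements, and sum the products to obtain the target pixel value of the output (the feature map); repeat this at every position.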
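Written out, the procedure above is the following sum. The symbols are notation chosen here for illustration: x is the (padded) input, k the kernel of size k_h × k_w, s the stride, p the padding on each side, and H × W the input size.

```latex
y[i, j] = \sum_{m=0}^{k_h - 1} \sum_{n=0}^{k_w - 1} x[i \cdot s + m,\; j \cdot s + n] \, k[m, n]

H_{\text{out}} = \left\lfloor \frac{H + 2p - k_h}{s} \right\rfloor + 1, \qquad
W_{\text{out}} = \left\lfloor \frac{W + 2p - k_w}{s} \right\rfloor + 1
```

For example, a 224 × 224 input with a 3 × 3 kernel, stride 1, and padding 1 gives (224 + 2 - 3) / 1 + 1 = 224, so the output keeps the input size.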

Different kernels extract different features.

Code implementation:

import numpy as np

def numpy_conv2d(image, kernel, stride=1, padding='same'):
    # Convert the input and the kernel to NumPy arrays
    image = np.array(image)
    kernel = np.array(kernel)

    # Multi-channel input (e.g. RGB): apply the same kernel to each channel,
    # forwarding stride and padding to the recursive call
    if image.ndim == 3:
        return np.stack(
            [numpy_conv2d(image[:, :, c], kernel, stride, padding)
             for c in range(image.shape[2])],
            axis=2,
        )

    # Zero padding; keeps the input size for stride 1 and odd kernel sizes
    k_h, k_w = kernel.shape
    if padding == 'same':
        pad_h = (k_h - 1) // 2
        pad_w = (k_w - 1) // 2
        image = np.pad(image, ((pad_h, pad_h), (pad_w, pad_w)), mode='constant')

    # Compute the output size
    out_h = (image.shape[0] - k_h) // stride + 1
    out_w = (image.shape[1] - k_w) // stride + 1

    # Initialize the output matrix
    output = np.zeros((out_h, out_w))

    # Slide the kernel over every position (explicit loops, not vectorized)
    for i in range(out_h):
        for j in range(out_w):
            h_start = i * stride
            w_start = j * stride
            receptive_field = image[h_start:h_start+k_h, w_start:w_start+k_w]
            output[i, j] = np.sum(receptive_field * kernel)
    return output

# Test case
image = np.random.randn(224, 224)  # simulated input image
kernel = np.array([[1, 0, -1],      # Sobel kernel for the horizontal gradient (responds to vertical edges)
                   [2, 0, -2],
                   [1, 0, -1]])
result = numpy_conv2d(image, kernel, padding='same')

PyTorch implementation:

import torch
import torch.nn as nn

# Define a standard convolution layer
standard_conv = nn.Conv2d(
    in_channels=3,      # number of input channels
    out_channels=64,    # number of output channels
    kernel_size=3,      # kernel size (3x3)
    stride=1,           # stride
    padding=1,          # padding
    bias=False          # no bias term
)

# Input data (Batch=4, Channels=3, Height=224, Width=224)
x = torch.randn(4, 3, 224, 224)
output = standard_conv(x)
print(output.shape)  # output shape: [4, 64, 224, 224]

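Grouped convolution

Property: the parameter count drops to 1/groups of that of a standard convolution. It is used to reduce the number of parameters, but it can run slower in practice, since fewer parameters do not necessarily mean faster execution on real hardware.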

# Split into 4 groups
group_conv = nn.Conv2d(
    in_channels=64,    # must be divisible by groups
    out_channels=64,   # must be divisible by groups
    kernel_size=3,
    stride=1,
    padding=1,
    groups=4,          # key parameter: 4 groups
    bias=False
)

x = torch.randn(4, 64, 224, 224)
output = group_conv(x)
print(output.shape)  # output shape: [4, 64, 224, 224]
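The 1/groups reduction can be checked by counting parameters directly. The helper `count_params` is a small illustrative function, and the layer sizes mirror the 64-channel, 4-group example.

```python
import torch.nn as nn

def count_params(module: nn.Module) -> int:
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())

standard = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)
grouped = nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=4, bias=False)

n_standard = count_params(standard)  # 64 * 64 * 3 * 3 = 36864
n_grouped = count_params(grouped)    # 64 * (64 // 4) * 3 * 3 = 9216
print(n_standard, n_grouped, n_standard // n_grouped)  # 36864 9216 4
```

Each output channel only connects to the 64 / 4 = 16 input channels in its own group, which is where the factor of 4 comes from.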

Depthwise separable convolution

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # Depthwise convolution (each input channel is convolved separately)
        self.depthwise = nn.Conv2d(
            in_channels,
            in_channels,  # output channels = input channels
            kernel_size=3,
            padding=1,
            groups=in_channels,  # key: groups = number of input channels
            bias=False
        )
        # Pointwise convolution (1x1 convolution that mixes channel information)
        self.pointwise = nn.Conv2d(
            in_channels,
            out_channels,
            kernel_size=1,  # 1x1 kernel
            bias=False
        )

    def forward(self, x):
        x = self.depthwise(x)
        x = self.pointwise(x)
        return x

# Usage example
ds_conv = DepthwiseSeparableConv(in_channels=64, out_channels=128)
x = torch.randn(4, 64, 224, 224)
output = ds_conv(x)
print(output.shape)  # output shape: [4, 128, 224, 224]
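To see what the split buys, the parameter counts of a standard 3x3 convolution and its depthwise separable counterpart can be compared for the same 64 to 128 channel mapping. This is a self-contained sketch; the variable names are illustrative.

```python
import torch.nn as nn

def count_params(module: nn.Module) -> int:
    """Total number of learnable parameters in a module."""
    return sum(p.numel() for p in module.parameters())

c_in, c_out, k = 64, 128, 3

standard = nn.Conv2d(c_in, c_out, kernel_size=k, padding=1, bias=False)
depthwise = nn.Conv2d(c_in, c_in, kernel_size=k, padding=1, groups=c_in, bias=False)
pointwise = nn.Conv2d(c_in, c_out, kernel_size=1, bias=False)

n_standard = count_params(standard)                             # 64 * 128 * 9 = 73728
n_separable = count_params(depthwise) + count_params(pointwise)  # 576 + 8192 = 8768
print(n_standard, n_separable)  # 73728 8768
print(round(n_separable / n_standard, 4))  # 0.1189
```

The ratio matches 1/c_out + 1/k² = 1/128 + 1/9 ≈ 0.1189, so for 3x3 kernels the separable form needs roughly 8 to 9 times fewer parameters here.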