How do I figure out the output size of the max-pool layer?
I am trying to understand the three-layer CNN example code from CS231n, but there is one variable whose meaning I cannot work out. In the code below, it is the variable D.
import numpy as np

class ThreeLayerConvNet(object):
    """
    A three-layer convolutional network with the following architecture:
    conv - relu - 2x2 max pool - affine - relu - affine - softmax
    The network operates on minibatches of data that have shape (N, C, H, W)
    consisting of N images, each with height H and width W and with C input
    channels.
    """
    def __init__(
        self, input_dim=(3, 32, 32), num_filters=32, filter_size=7,
        hidden_dim=100, num_classes=10, weight_scale=1e-3, reg=0.0,
        dtype=np.float32,
    ):
        """
        Initialize a new network.
        Inputs:
        - input_dim: Tuple (C, H, W) giving size of input data
        - num_filters: Number of filters to use in the convolutional layer
        - filter_size: Width/height of filters to use in the convolutional layer
        - hidden_dim: Number of units to use in the fully-connected hidden layer
        - num_classes: Number of scores to produce from the final affine layer.
        - weight_scale: Scalar giving standard deviation for random initialization
          of weights.
        - reg: Scalar giving L2 regularization strength
        - dtype: numpy datatype to use for computation.
        """
        self.params = {}
        self.reg = reg
        self.dtype = dtype
        C, H, W = input_dim
        filter_height = filter_size  # For convolution
        filter_width = filter_size   # For convolution
        D = num_filters * (H // 2) * (W // 2)  # This line
        self.params['W1'] = np.random.normal(scale=weight_scale, size=(num_filters, C, filter_height, filter_width))
        self.params['b1'] = np.zeros((num_filters,))
        self.params['W2'] = np.random.normal(scale=weight_scale, size=(D, hidden_dim))
        self.params['b2'] = np.zeros((hidden_dim,))
        self.params['W3'] = np.random.normal(scale=weight_scale, size=(hidden_dim, num_classes))
        self.params['b3'] = np.zeros((num_classes,))
Since D is used as the input dimension of the second (affine) layer, I assume it is the number of features output by the max-pool layer. However, I would have expected the number of output features to be

output_nb = num_filters * filter_width * filter_height // 4

because the convolution is applied before max pooling, so the number of outputs should already be smaller than the original pixel count H * W. The max-pool layer then keeps the maximum of every group of 4 features, hence the division by 4.

What does D represent here?
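For reference, here is a minimal sketch of how I traced the shapes, under the assumption (taken from the CS231n assignment's forward pass, not shown above) that the conv layer uses stride 1 with "same" padding `pad = (filter_size - 1) // 2`, so it preserves the spatial size and only the 2x2 max pool shrinks it:

```python
# Shape trace for conv - relu - 2x2 max pool, assuming stride-1 "same"
# padding in the conv layer (an assumption based on the CS231n assignment).
num_filters, filter_size = 32, 7
C, H, W = 3, 32, 32

# Conv output size with stride 1 and pad = (filter_size - 1) // 2:
pad = (filter_size - 1) // 2
H_conv = (H + 2 * pad - filter_size) + 1  # stays 32
W_conv = (W + 2 * pad - filter_size) + 1  # stays 32

# 2x2 max pool with stride 2 halves each spatial dimension:
H_pool, W_pool = H_conv // 2, W_conv // 2  # 16, 16

# Flattened feature count feeding the first affine layer:
D = num_filters * H_pool * W_pool
print(D)  # 32 * 16 * 16 = 8192
```

So under that padding assumption, the conv output keeps the H x W grid (one grid per filter), and D counts `num_filters * (H // 2) * (W // 2)` pooled values rather than anything involving the filter size.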