How do you Train a Neural Network in Mini-Batches?

I am currently writing a neural network from scratch in C++.

I am now at the step of adding mini-batch logic to the program, and I am a little confused about how it works.

So, say I have 1000 data samples and I split them into 10 batches of 100. During one training pass, would I feed it one batch and then run backpropagation once? How would I calculate the error for that backward pass: would I average the network's guesses over the batch and then use that averaged set of values?

Since running backpropagation and changing the weights depends on the neurons' current activations, would I just use the values from the last sample in the batch? Or am I thinking about this wrong, and should I actually backpropagate after every sample in the batch but only adjust the weights afterwards?

Here is the function that updates a hidden layer:

void Layer::BackPropagate(Layer* previousLayer, Layer* nextLayer, float weightLearningRate, float biasLearningRate, bool adjustValues)
{
    Node* node;
    Node* nextNode;
    Node* previousNode;

    for (int n = 0; n < Nodes.size(); n++)
    {
        node = Nodes[n];
        float gamma = 0;

        // Get the gamma value relative to the next layer.
        for (int n2 = 0; n2 < nextLayer->Nodes.size(); n2++)
        {
            nextNode = nextLayer->Nodes[n2];

            // For each neuron in the next layer, add that neuron's gamma
            // times the weight connecting this neuron to it.
            gamma += nextNode->Gamma * node->Connections[n2];
        }

        // Multiply by the derivative of tanh: d/dz tanh(z) = 1 - tanh(z)^2.
        gamma *= (1 - node->Value * node->Value);
        node->Gamma = gamma;

        // Skip the weight update but keep computing gammas for the
        // remaining nodes ('return' here would stop after the first node).
        if (!adjustValues)
            continue;

        for (int n2 = 0; n2 < previousLayer->Nodes.size(); n2++)
        {
            previousNode = previousLayer->Nodes[n2];

            // Gradient of the error with respect to this weight.
            float Delta = gamma * previousNode->Value;

            // Adjust the weight that points to this neuron.
            previousNode->Connections[n] -= Delta * weightLearningRate;

            // Adjust the bias along the same gradient.
            previousNode->Bias -= Delta * biasLearningRate;
        }
    }
}
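
For reference, the gamma computed here matches the standard backpropagation delta for a tanh activation (a sketch of the math, assuming node->Value stores the tanh output $a_j$ and Connections[n2] is the weight $w_{jk}$ into the next layer):

$$\delta_j = \Big(\sum_k \delta_k \, w_{jk}\Big)\big(1 - a_j^2\big), \qquad \text{since } \frac{d}{dz}\tanh(z) = 1 - \tanh^2(z)$$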

Here is the function that updates the output layer:

void Layer::BackPropagate(Layer* previousLayer, std::vector<float> desiredAnswer, float weightLearningRate, float biasLearningRate, bool adjustValues)
{
    Node* node;
    Node* previousNode;

    for (int n = 0; n < Nodes.size(); n++)
    {
        node = Nodes[n];

        // Derivative of the squared error with respect to the output value...
        float gamma = node->Value - desiredAnswer[n];

        // ...times the derivative of tanh: d/dz tanh(z) = 1 - tanh(z)^2.
        gamma *= (1 - node->Value * node->Value);

        node->Gamma = gamma;

        // Skip the weight update but keep computing gammas for the
        // remaining output nodes.
        if (!adjustValues)
            continue;

        for (int n2 = 0; n2 < previousLayer->Nodes.size(); n2++)
        {
            previousNode = previousLayer->Nodes[n2];

            // Gradient of the error with respect to this weight.
            float Delta = gamma * previousNode->Value;

            // Adjust the weight that points to this neuron.
            previousNode->Connections[n] -= Delta * weightLearningRate;

            // Adjust the bias along the same gradient.
            previousNode->Bias -= Delta * biasLearningRate;
        }
    }
}
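
Similarly, the output-layer gamma is the derivative of the half squared error passed through the same tanh derivative (again a sketch, assuming Value is the tanh output $a_j$ and desiredAnswer holds the targets $y_j$):

$$\delta_j = (a_j - y_j)\big(1 - a_j^2\big)$$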

Here is the function that trains the network:

void NeuralNetwork::TrainNetwork(std::vector<float> desiredAnswer)
{
    Layer* layer;

    float MSE = (pow(OutputLayer()->Nodes[0]->Value - desiredAnswer[0], 2) / 2) + (pow(OutputLayer()->Nodes[1]->Value - desiredAnswer[1], 2) / 2);
    std::cout << "MSE: " << MSE << "\n";

    //Iterate from the output layer to every layer except the inputs.
    for (int l = Layers.size() - 1; l > 0; l--)
    {
        layer = Layers[l];
        
        if (layer == OutputLayer())
            layer->BackPropagate(Layers[l - 1], desiredAnswer, weightLearningRate, biasLearningRate, false);
        else
            layer->BackPropagate(Layers[l - 1], Layers[l + 1], weightLearningRate, biasLearningRate);
    }
}
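
For reference, the quantity printed as MSE above is the half squared error summed over the two output nodes (writing $a_i$ for the output activations and $y_i$ for the desired answers):

$$E = \frac{1}{2}\sum_i (a_i - y_i)^2$$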

Original question: https://stackoverflow.com//questions/71504238/how-do-you-train-a-neural-network-in-mini-batches

Replies

    lejlot replied:

    Or am I thinking about this wrong, and should I actually backpropagate after every sample in the batch but only adjust the weights afterwards?

    The correct implementation is to compute the full backward pass for each data point separately and then to average the results; this comes from the linearity of the gradient operator:

    GRAD 1/N SUM_i f_i = 1/N SUM_i GRAD f_i
    

    In other words: the gradient of the average is the average of the gradients.
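
    A minimal sketch of what this looks like in practice, using a toy one-weight linear model rather than the asker's Layer/Node classes (every name here is hypothetical): run a forward and backward pass for each sample, only accumulate the per-sample gradients, then apply a single averaged update per batch.

    #include <cstddef>
    #include <iostream>
    #include <vector>

    int main()
    {
        // Toy model y = w * x + b, fitted to data generated by y = 2x + 1.
        float w = 0.0f, b = 0.0f;
        const float lr = 0.1f;
        const std::vector<float> xs = {0.0f, 1.0f, 2.0f, 3.0f};
        const std::vector<float> ys = {1.0f, 3.0f, 5.0f, 7.0f};

        for (int epoch = 0; epoch < 500; epoch++)
        {
            float gradW = 0.0f, gradB = 0.0f;

            // Forward + backward pass per sample: ACCUMULATE the gradients,
            // but do not touch w and b yet.
            for (std::size_t i = 0; i < xs.size(); i++)
            {
                float error = (w * xs[i] + b) - ys[i]; // d(half squared error)/d(prediction)
                gradW += error * xs[i];                // per-sample gradient for w
                gradB += error;                        // per-sample gradient for b
            }

            // One weight update per batch, using the AVERAGED gradients.
            w -= lr * gradW / xs.size();
            b -= lr * gradB / xs.size();
        }

        std::cout << "w = " << w << ", b = " << b << "\n"; // approaches 2 and 1
    }

    One way to carry the same idea over to the code in the question would be to keep adjustValues false while looping over the samples of a batch, accumulate each sample's Delta values somewhere, and apply their average in one final pass.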
