How do you Train a Neural Network in Mini-Batches?

I am currently writing a neural network from scratch in C++.

I am now at the step of adding mini-batch logic to the program, and I am a little confused about how it works.

So, say I have 1000 data samples and I split them into 10 batches of 100. During one training pass, would I feed it one batch and then run backpropagation once? How would I calculate the error for that backward pass: would I average the network's guesses over the batch and then use that averaged set of values?

Since running backpropagation and changing the weights depends on the neurons' current activations, would I just use the values from the last sample in the batch? Or am I thinking about this wrong, and should I actually backpropagate after every sample in the batch but only adjust the weights afterwards?

Here is the function that updates a hidden layer:

void Layer::BackPropagate(Layer* previousLayer, Layer* nextLayer, float weightLearningRate, float biasLearningRate, bool adjustValues)
{
    Node* node;
    Node* nextNode;
    Node* previousNode;

    for (int n = 0; n < Nodes.size(); n++)
    {
        node = Nodes[n];
        float gamma = 0;

        // Get the gamma value relative to the next layer.
        for (int n2 = 0; n2 < nextLayer->Nodes.size(); n2++)
        {
            nextNode = nextLayer->Nodes[n2];

            // For each neuron in the next layer, add that neuron's gamma
            // times the weight connecting this neuron to it.
            gamma += nextNode->Gamma * node->Connections[n2];
        }

        // Multiply by the derivative of tanh: d/dz tanh(z) = 1 - tanh(z)^2.
        gamma *= (1 - node->Value * node->Value);
        node->Gamma = gamma;

        // Skip the weight update but keep computing gammas for the
        // remaining nodes ('return' here would stop after the first node).
        if (!adjustValues)
            continue;

        for (int n2 = 0; n2 < previousLayer->Nodes.size(); n2++)
        {
            previousNode = previousLayer->Nodes[n2];

            // Gradient of the error with respect to this weight.
            float Delta = gamma * previousNode->Value;

            // Adjust the weight that points to this neuron.
            previousNode->Connections[n] -= Delta * weightLearningRate;

            // Adjust the bias along the same gradient.
            previousNode->Bias -= Delta * biasLearningRate;
        }
    }
}
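
For reference, the gamma computed here matches the standard backpropagation delta for a tanh activation (a sketch of the math, assuming node->Value stores the tanh output $a_j$ and Connections[n2] is the weight $w_{jk}$ into the next layer):

$$\delta_j = \Big(\sum_k \delta_k \, w_{jk}\Big)\big(1 - a_j^2\big), \qquad \text{since } \frac{d}{dz}\tanh(z) = 1 - \tanh^2(z)$$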

Here is the function that updates the output layer:

void Layer::BackPropagate(Layer* previousLayer, std::vector<float> desiredAnswer, float weightLearningRate, float biasLearningRate, bool adjustValues)
{
    Node* node;
    Node* previousNode;

    for (int n = 0; n < Nodes.size(); n++)
    {
        node = Nodes[n];

        // Derivative of the squared error with respect to the output value...
        float gamma = node->Value - desiredAnswer[n];

        // ...times the derivative of tanh: d/dz tanh(z) = 1 - tanh(z)^2.
        gamma *= (1 - node->Value * node->Value);

        node->Gamma = gamma;

        // Skip the weight update but keep computing gammas for the
        // remaining output nodes.
        if (!adjustValues)
            continue;

        for (int n2 = 0; n2 < previousLayer->Nodes.size(); n2++)
        {
            previousNode = previousLayer->Nodes[n2];

            // Gradient of the error with respect to this weight.
            float Delta = gamma * previousNode->Value;

            // Adjust the weight that points to this neuron.
            previousNode->Connections[n] -= Delta * weightLearningRate;

            // Adjust the bias along the same gradient.
            previousNode->Bias -= Delta * biasLearningRate;
        }
    }
}
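
Similarly, the output-layer gamma is the derivative of the half squared error passed through the same tanh derivative (again a sketch, assuming Value is the tanh output $a_j$ and desiredAnswer holds the targets $y_j$):

$$\delta_j = (a_j - y_j)\big(1 - a_j^2\big)$$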

Here is the function that trains the network:

void NeuralNetwork::TrainNetwork(std::vector<float> desiredAnswer)
{
    Layer* layer;

    float MSE = (pow(OutputLayer()->Nodes[0]->Value - desiredAnswer[0], 2) / 2) + (pow(OutputLayer()->Nodes[1]->Value - desiredAnswer[1], 2) / 2);
    std::cout << "MSE: " << MSE << "\n";

    //Iterate from the output layer to every layer except the inputs.
    for (int l = Layers.size() - 1; l > 0; l--)
    {
        layer = Layers[l];
        
        if (layer == OutputLayer())
            layer->BackPropagate(Layers[l - 1], desiredAnswer, weightLearningRate, biasLearningRate, false);
        else
            layer->BackPropagate(Layers[l - 1], Layers[l + 1], weightLearningRate, biasLearningRate);
    }
}
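
For reference, the quantity printed as MSE above is the half squared error summed over the two output nodes (writing $a_i$ for the output activations and $y_i$ for the desired answers):

$$E = \frac{1}{2}\sum_i (a_i - y_i)^2$$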

Original question: https://stackoverflow.com//questions/71504238/how-do-you-train-a-neural-network-in-mini-batches

Replies

    lejlot replied:

    Or am I thinking about this wrong, and should I actually backpropagate after every sample in the batch but only adjust the weights afterwards?

    The correct implementation is to compute the full backward pass for each data point separately and then to average the results; this comes from the linearity of the gradient operator:

    GRAD 1/N SUM_i f_i = 1/N SUM_i GRAD f_i
    

    In other words: the gradient of the average is the average of the gradients.
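
    A minimal sketch of what this looks like in practice, using a toy one-weight linear model rather than the asker's Layer/Node classes (every name here is hypothetical): run a forward and backward pass for each sample, only accumulate the per-sample gradients, then apply a single averaged update per batch.

    #include <cstddef>
    #include <iostream>
    #include <vector>

    int main()
    {
        // Toy model y = w * x + b, fitted to data generated by y = 2x + 1.
        float w = 0.0f, b = 0.0f;
        const float lr = 0.1f;
        const std::vector<float> xs = {0.0f, 1.0f, 2.0f, 3.0f};
        const std::vector<float> ys = {1.0f, 3.0f, 5.0f, 7.0f};

        for (int epoch = 0; epoch < 500; epoch++)
        {
            float gradW = 0.0f, gradB = 0.0f;

            // Forward + backward pass per sample: ACCUMULATE the gradients,
            // but do not touch w and b yet.
            for (std::size_t i = 0; i < xs.size(); i++)
            {
                float error = (w * xs[i] + b) - ys[i]; // d(half squared error)/d(prediction)
                gradW += error * xs[i];                // per-sample gradient for w
                gradB += error;                        // per-sample gradient for b
            }

            // One weight update per batch, using the AVERAGED gradients.
            w -= lr * gradW / xs.size();
            b -= lr * gradB / xs.size();
        }

        std::cout << "w = " << w << ", b = " << b << "\n"; // approaches 2 and 1
    }

    One way to carry the same idea over to the code in the question would be to keep adjustValues false while looping over the samples of a batch, accumulate each sample's Delta values somewhere, and apply their average in one final pass.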
