How to do a binary classification for the output of an LSTM and a Linear layer
I am trying to build a wake-word model for my AI assistant project. I take audio, convert it to MFCCs, and feed them to an LSTM. The LSTM gives me an output (I use the h_n output) of shape (4, 32, 32), i.e. (num_layers * directions, batch, hidden_size).
Then I pass that to my Linear layer, which gives me (4, 32, 1). I am trying to solve a binary classification problem, so I have 2 classes: 0 = don't wake, 1 = wake the AI. But I don't understand the output of the Linear layer. I expected something like (32, 1), which would be (batch_size, prediction). How should I handle this (4, 32, 1) coming out of the Linear layer? I think I am missing some fundamentals here.
Could you please explain this to me? I will leave my model code below.
import torch
import torch.nn as nn


class LSTMWakeWord(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers, dropout,
                 bidirectional, num_of_classes, device='cpu'):
        super(LSTMWakeWord, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.num_layers = num_layers
        self.device = device
        self.bidirectional = bidirectional
        self.directions = 2 if bidirectional else 1
        self.lstm = nn.LSTM(input_size=input_size,
                            hidden_size=hidden_size,
                            num_layers=num_layers,
                            dropout=dropout,
                            bidirectional=bidirectional,
                            batch_first=True)
        self.layernorm = nn.LayerNorm(input_size)
        self.classifier = nn.Linear(hidden_size, num_of_classes)

    def _init_hidden(self, batch_size):
        n, d, hs = self.num_layers, self.directions, self.hidden_size
        return (torch.zeros(n * d, batch_size, hs).to(self.device),
                torch.zeros(n * d, batch_size, hs).to(self.device))

    def forward(self, x):
        # LayerNorm over the feature dimension normalizes the MFCC values
        # (the e+xxx values are gone after this).
        x = self.layernorm(x)
        # x shape -> (batch, seq_len(time), feature(n_mfcc)) since batch_first=True
        hidden = self._init_hidden(x.size()[0])
        out, (hn, cn) = self.lstm(x, hidden)
        print("hn " + str(hn.shape))      # (num_layers * directions, batch, hidden_size)
        # print("out " + str(out.shape))  # (batch, seq_len, directions * hidden_size)
        out = self.classifier(hn)
        print("out2 " + str(out.shape))
        return out
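
To make the shapes concrete, here is a small standalone snippet with dummy tensors. The sizes (num_layers=2, bidirectional=True so 2 * 2 = 4, batch=32, hidden_size=32, one output unit, seq_len and n_mfcc made up) are only assumptions for illustration, not my real training configuration, but they reproduce the (4, 32, 32) h_n and the (4, 32, 1) classifier output I described above:

import torch
import torch.nn as nn

# Made-up sizes matching the shapes in my question (assumptions for illustration).
batch, seq_len, n_mfcc, hidden_size, num_layers = 32, 100, 40, 32, 2

lstm = nn.LSTM(input_size=n_mfcc, hidden_size=hidden_size,
               num_layers=num_layers, bidirectional=True, batch_first=True)
classifier = nn.Linear(hidden_size, 1)

x = torch.randn(batch, seq_len, n_mfcc)   # (batch, seq_len, feature)
out, (hn, cn) = lstm(x)
print(hn.shape)              # torch.Size([4, 32, 32]) -> (num_layers * directions, batch, hidden_size)
print(classifier(hn).shape)  # torch.Size([4, 32, 1])  -> the output I don't understand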