Learn how to use GPT2 for text generation (torch + transformers)
GPT-2 is a pretrained language model released by OpenAI; see the paper "Language Models are Unsupervised Multitask Learners". GPT-2 leverages a unidirectional Transformer to do something that the bidirectional Transformer used by BERT cannot: generate the text that follows from the text that comes before.
There are plenty of articles covering the theory, so we won't go deeper into it here; let's jump straight into the code.
Import the required packages
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel
Load the tokenizer
tokenizer = GPT2Tokenizer.from_pretrained('gpt2')
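GPT-2 uses byte-level BPE, so a single word can be split into several subword pieces. As a quick illustrative check (the exact pieces depend on the vocabulary), you can look at the split directly:
# Inspect how the BPE tokenizer splits text into subword pieces
print(tokenizer.tokenize("Xiao Ming is a primary school student."))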
Encode the input
Encode the given text and convert it to a tensor.
indexed_tokens = tokenizer.encode("Xiao Ming is a primary school student. He likes playing games")
print(tokenizer.decode(indexed_tokens))
tokens_tensor = torch.tensor([indexed_tokens])
Xiao Ming is a primary school student. He likes playing games
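For reference, indexed_tokens is just a Python list of integer BPE IDs, and tokens_tensor wraps it as a batch of one sequence; printing both makes this explicit (the exact IDs depend on the GPT-2 vocabulary):
print(indexed_tokens)       # a plain list of integer token IDs
print(tokens_tensor.shape)  # torch.Size([1, sequence_length])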
Load the pretrained model (weights)
model = GPT2LMHeadModel.from_pretrained('gpt2')
Set the model to evaluation mode and move the model and input tensor to the GPU if one is available
model.eval()
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
tokens_tensor = tokens_tensor.to(device)
model.to(device)
Predict all tokens
with torch.no_grad():
    outputs = model(tokens_tensor)
    predictions = outputs[0]
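Here predictions holds the logits for every position in the input, with shape (batch_size, sequence_length, vocab_size); the last dimension is the GPT-2 vocabulary size (50257). A quick sanity check:
# (batch_size, sequence_length, vocab_size): logits for every input position
print(predictions.shape)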
Get the predicted next word
predicted_index = torch.argmax(predictions[0, -1, :]).item()
predicted_text = tokenizer.decode(indexed_tokens + [predicted_index])
print(predicted_text)
As you can see, the next word GPT-2 predicts is "and":
Xiao Ming is a primary school student. He likes playing games and
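Greedy argmax only shows the single most likely token. If you are curious about the runner-up candidates, here is a small sketch using torch.topk on the same predictions tensor (the actual candidates and scores depend on the model weights):
# Show the 5 highest-scoring candidates for the next token
top5 = torch.topk(predictions[0, -1, :], k=5)
for score, idx in zip(top5.values.tolist(), top5.indices.tolist()):
    print(repr(tokenizer.decode([idx])), round(score, 2))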
Generate a complete sentence
stopids = tokenizer.convert_tokens_to_ids(["."])[0]  # stop once a "." is generated
past = None
for i in range(100):
    with torch.no_grad():
        # Pass the cached past_key_values so only the new token needs to be processed
        output, past = model(tokens_tensor, past_key_values=past, return_dict=False)
    token = torch.argmax(output[..., -1, :])
    indexed_tokens += [token.tolist()]
    if stopids == token.tolist():
        break
    tokens_tensor = token.unsqueeze(0)  # feed only the newly generated token in the next step
sequence = tokenizer.decode(indexed_tokens)
print(sequence)
The generated text is "and playing with his friends.", which, together with the given sentence, forms a complete statement:
Xiao Ming is a primary school student. He likes playing games and playing with his friends.
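As an aside, transformers also ships a higher-level model.generate() helper that wraps this kind of decoding loop. The sketch below reuses the model and tokenizer loaded above; the sampling parameters (max_length, top_k, top_p) are only illustrative values, so the output will vary from run to run:
# Alternative: let generate() handle the decoding loop
input_ids = torch.tensor([tokenizer.encode("Xiao Ming is a primary school student. He likes playing games")]).to(device)
generated = model.generate(
    input_ids,
    max_length=50,                         # total length including the prompt
    do_sample=True,                        # sample instead of pure greedy decoding
    top_k=50,
    top_p=0.95,
    pad_token_id=tokenizer.eos_token_id,   # GPT-2 has no pad token by default
)
print(tokenizer.decode(generated[0], skip_special_tokens=True))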
Try other sentences
If we add a period to the end of the sentence above, GPT-2 generates a different continuation.
Original: Xiao Ming is a primary school student. He likes playing games
New: Xiao Ming is a primary school student. He likes playing games.
Generated:
Xiao Ming is a primary school student. He likes playing games. He is also a member of the team that won the World Cup in 2010.