Disclaimer
This project is built on PyTorch. You can swap in your own training text, but remember: the training text absolutely must be in TXT format!!!
Also, this project is for learning and exchange only. Commercial use and entry into competitions of any form, whether individual or organized, are strictly prohibited. Please comply strictly with the Apache License open-source license.
The author assumes no responsibility or liability for any consequences caused by this program.
Copyright belongs to the author, Hu Zhehan, an elementary school student in Chongqing. GitHub: likehuiyuanai.
PS: I've been watching A Certain Scientific Railgun lately, so the example code uses the Japanese light novel A Certain Magical Index as the training data. I'm not releasing this dataset for now (you can compile your own; the more text it has, the better the training results).
The copyright of this blog (post) also belongs to Hu Zhehan; reposting it or entering it in competitions is strictly prohibited!
Let's officially begin. (I have no plans to build a code generator for now; I don't like the idea anyway. Code without bugs is like a person without a soul!)
import torch
import torch.nn as nn
import numpy as np
import matplotlib.pyplot as plt
import time
from scipy.sparse import csr_matrix
from tensorboardX import SummaryWriter
%matplotlib inline
Import the libraries; even kindergarteners have learned this.
with open('./mfjsml.txt', 'r', encoding='utf-8') as f:
    data = f.readlines()
Read the dataset. You can rename mfjsml, but keep the dataset in the same folder as the Jupyter Lab notebook!
data = ''.join(data)
print(data[:100])
Show a snippet of the dataset.
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
print(f'data has {data_size} characters, {vocab_size} unique.')
char_to_ix = { ch:i for i,ch in enumerate(chars) }
ix_to_char = { i:ch for i,ch in enumerate(chars) }
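As a quick sanity check (a throwaway sketch that assumes the cells above have been run), the two dictionaries should be exact inverses of each other:
probe = data[0]                    # any character that appears in the corpus
idx = char_to_ix[probe]            # character -> integer id
assert ix_to_char[idx] == probe    # integer id -> character, round trip
print(probe, '->', idx, '->', ix_to_char[idx])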
Next we prepare the training data and then build the LSTM model.
# one-hot encode every character as a sparse row (np.int is removed in newer NumPy)
char_id = np.array([char_to_ix[c] for c in data])
X_train = csr_matrix((np.ones(len(data), dtype=np.float32),
                      (np.arange(len(data)), char_id)),
                     shape=(len(data), vocab_size))
# the target for each character is simply the next character in the text
y_train = np.roll(char_id, -1)
print(X_train.shape, y_train.shape)
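To convince yourself that the encoding and the labels line up, you can decode an arbitrary row back into a character; this is just a throwaway check using the variables defined above:
i = 10                                                      # any position well inside the corpus
assert ix_to_char[int(X_train[i].indices[0])] == data[i]    # the sparse row one-hot encodes data[i]
assert ix_to_char[int(y_train[i])] == data[i + 1]           # the label is the next character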
def get_batch(X_train, y_train, seq_length):
    '''Yield training pairs (X, y) of at most seq_length characters each.'''
    X = X_train
    y = torch.from_numpy(y_train).long()
    for i in range(0, len(y), seq_length):
        id_stop = min(i + seq_length, len(y))
        # densify only the slice we need; the full one-hot matrix stays sparse
        yield [torch.from_numpy(X[i:id_stop].toarray().astype(np.float32)),
               y[i:id_stop]]
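If you want to see what get_batch actually yields, peeking at the first batch is enough (assuming X_train and y_train from the cells above):
X_first, y_first = next(get_batch(X_train, y_train, 25))
print(X_first.shape, y_first.shape)    # expect torch.Size([25, vocab_size]) and torch.Size([25])
print(X_first.dtype, y_first.dtype)    # float32 inputs, int64 targets for CrossEntropyLoss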
def sample_chars(rnn, X_seed, h_prev, length=20):
    '''Generate text using the trained model.'''
    X_next = X_seed
    results = []
    with torch.no_grad():
        for i in range(length):
            y_score, h_prev = rnn(X_next.view(1, 1, -1), h_prev)
            y_prob = nn.Softmax(dim=0)(y_score.view(-1)).detach().numpy()
            y_pred = np.random.choice(chars, 1, p=y_prob).item()
            results.append(y_pred)
            X_next = torch.zeros_like(X_seed)
            X_next[char_to_ix[y_pred]] = 1
    return ''.join(results)
class nn_LSTM(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size)
        self.out = nn.Linear(hidden_size, output_size)

    def forward(self, X, hidden):
        _, hidden = self.lstm(X, hidden)
        output = self.out(hidden[0])
        return output, hidden

    def initHidden(self):
        return (torch.zeros(1, 1, self.hidden_size),
                torch.zeros(1, 1, self.hidden_size))
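Before wiring it into training, a minimal sketch like this (128 is just an arbitrary demo hidden size) shows the shapes that flow through the network for a single character:
demo_rnn = nn_LSTM(vocab_size, 128, vocab_size)
x = torch.zeros(1, 1, vocab_size)            # (seq_len=1, batch=1, input_size)
x[0, 0, char_to_ix[data[0]]] = 1             # one-hot for the first character of the corpus
score, h = demo_rnn(x, demo_rnn.initHidden())
print(score.shape, h[0].shape, h[1].shape)   # (1, 1, vocab_size), (1, 1, 128), (1, 1, 128)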
hidden_size = 256
seq_length = 25
rnn = nn_LSTM(vocab_size, hidden_size, vocab_size)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(rnn.parameters(), lr=0.005)
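Now that rnn exists, you can already call sample_chars; the output will be pure gibberish before training, but this little sketch shows how to build a one-hot seed vector from any character:
seed = torch.zeros(vocab_size)
seed[char_to_ix[data[0]]] = 1                          # seed with the first character of the corpus
print(sample_chars(rnn, seed, rnn.initHidden(), 20))   # untrained model: expect random characters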
def train(X_batch, y_batch):
    h_prev = rnn.initHidden()
    optimizer.zero_grad()
    batch_loss = torch.tensor(0, dtype=torch.float)
    for i in range(len(X_batch)):
        y_score, h_prev = rnn(X_batch[i].view(1, 1, -1), h_prev)
        loss = loss_fn(y_score.view(1, -1), y_batch[i].view(1))
        batch_loss += loss
    batch_loss.backward()
    optimizer.step()
    return y_score, batch_loss / len(X_batch)
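Before the full run, it can be worth smoke-testing train() on a single batch (a throwaway check; the batch comes from get_batch above). With random weights, the per-character loss should sit roughly around log(vocab_size):
X_dbg, y_dbg = next(get_batch(X_train, y_train, seq_length))
_, dbg_loss = train(X_dbg, y_dbg)
print('one-batch loss:', dbg_loss.item(), 'log(vocab_size):', np.log(vocab_size))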
writer = SummaryWriter(f'logs/lstm1_{time.strftime("%Y%m%d-%H%M%S")}')
Ready? All CUDA/Tensor cores, fire them all up. And this thing too, start the training!
all_losses = []
print_every = 100
for epoch in range(20):
    for batch in get_batch(X_train, y_train, seq_length):
        X_batch, y_batch = batch
        _, batch_loss = train(X_batch, y_batch)
        all_losses.append(batch_loss.item())
        if len(all_losses) % print_every == 1:
            print(f'----\nRunning Avg Loss: {np.mean(all_losses[-print_every:])} at iter: {len(all_losses)}\n----')
            # log to TensorBoard every print_every iterations; can be removed if TensorBoard is not installed
            writer.add_scalar('loss', np.mean(all_losses[-print_every:]), len(all_losses))
            # generate text every print_every iterations
            print(sample_chars(rnn, X_batch[0], rnn.initHidden(), 200))
Now we can finally fire up the generator.
print(sample_chars(rnn, X_batch[20], rnn.initHidden(), 200))
torch.save(rnn.state_dict(), 'shediao.pth')
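If you come back later to generate more text, a minimal sketch for reloading the checkpoint looks like this; it assumes the same vocabulary and hyperparameters as above, and 'shediao.pth' is simply the filename used in the save call:
rnn2 = nn_LSTM(vocab_size, hidden_size, vocab_size)
rnn2.load_state_dict(torch.load('shediao.pth'))
rnn2.eval()
seed = torch.zeros(vocab_size)
seed[char_to_ix[data[0]]] = 1
print(sample_chars(rnn2, seed, rnn2.initHidden(), 200))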
Save the model. Any model you train yourself is yours to do whatever you want with; the author claims no copyright over it and takes no responsibility for it. Also, I recommend watching A Certain Scientific Railgun, it's really good! If you want to get in touch, go to my profile and add my email.
5 replies in total
I'm only 10, so I guess I'm at the preschool level.
Newbie here.
You're already 12 and you still say you're a
6
How did I do? Personally, I think I'm only at kindergarten level in the AI field, but I'm working hard to learn new knowledge and new techniques: every day, as a 12-year-old sixth grader, I grind through foreign-language papers and higher mathematics. I don't think the material is all that hard; as long as you love what you're doing, even building a rocket isn't difficult. By the way, if you get errors running the code, reply here and I'll do my best to answer everything. Wishing everyone a happy 1024 Programmer's Day in advance and success in your studies. Keep it up, comrades!