The Difference Between Feedforward and Recurrent Neural Networks: RNN

Understanding RNN and BPTT in One Article (Recurrent Neural Networks)

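RNN: the abbreviation RNN covers both the recurrent neural network (recursive in time) and the recursive neural network (recursive in structure). In a recurrent neural network the connections between neurons form a directed cycle across time steps, while a recursive neural network has a more complex structure.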

【Basic description】

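1. An RNN is mainly used to process sequence data, such as time series and text sequences. A sequence is a set of objects arranged in order. An RNN has an advantage on such data because its hidden neurons take the earlier part of the sequence into account: the state from previous time steps feeds into the current step through learned weights.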

2. RNN network structure: like a conventional neural network, an RNN consists of an input layer, a hidden layer, and an output layer.

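What makes an RNN special is the following: SYNAPSE_h connects to the hidden-layer neurons at the current time step t and also to the hidden-layer neurons at time step t-1. The main parameters of an RNN are therefore the U matrix for SYNAPSE_0 (input to hidden), the W matrix for SYNAPSE_h (hidden to hidden), and the V matrix for SYNAPSE_1 (hidden to output). For a network with 3 inputs, 4 hidden units, and 2 outputs, U is a 3*4 matrix, W is a 4*4 matrix, and V is a 4*2 matrix.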
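The role of the three matrices can be sketched as a single recurrent step. This is a minimal illustration, assuming row-vector inputs, a sigmoid activation (as in the example later in this article), no bias terms, and a network with 3 inputs, 4 hidden units, and 2 outputs. The names `rnn_step`, `x_t`, and `h_prev` are made up for this sketch.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def rnn_step(x_t, h_prev, U, W, V):
    """One time step: the hidden layer sees the current input and the previous hidden state."""
    h_t = sigmoid(x_t @ U + h_prev @ W)  # (1, 3) @ (3, 4) + (1, 4) @ (4, 4) -> (1, 4)
    y_t = sigmoid(h_t @ V)               # (1, 4) @ (4, 2) -> (1, 2)
    return h_t, y_t


rng = np.random.default_rng(0)
U = rng.uniform(-1, 1, size=(3, 4))  # SYNAPSE_0: input -> hidden
W = rng.uniform(-1, 1, size=(4, 4))  # SYNAPSE_h: hidden(t-1) -> hidden(t)
V = rng.uniform(-1, 1, size=(4, 2))  # SYNAPSE_1: hidden -> output

h = np.zeros((1, 4))                           # initial hidden state
sequence = rng.uniform(-1, 1, size=(5, 1, 3))  # 5 time steps of 3 features each
outputs = []
for x_t in sequence:
    h, y = rnn_step(x_t, h, U, W, V)  # the same U, W, V are reused at every step
    outputs.append(y)
```

The same three matrices are applied at every time step; only the hidden state `h` changes as the sequence is consumed.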

3. Derivation: forward propagation

Forward propagation is relatively simple; the main formulas involved are as follows:
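As a sketch consistent with the code example later in this article (row-vector convention, sigmoid activation, no bias terms), the forward pass at time step t can be written as below. Here x_t is the input, h_t the hidden state, \hat{y}_t the output, and U, W, V are the matrices for SYNAPSE_0, SYNAPSE_h, and SYNAPSE_1. The symbol names are assumptions chosen for illustration.

```latex
\begin{aligned}
h_t &= \sigma\left(x_t U + h_{t-1} W\right), \qquad h_0 = 0 \\
\hat{y}_t &= \sigma\left(h_t V\right) \\
\sigma(z) &= \frac{1}{1 + e^{-z}}
\end{aligned}
```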

4. Derivation: backpropagation
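A sketch of backpropagation through time (BPTT) that matches the code example later in this article, under the same assumptions: row vectors, sigmoid activation \sigma, no bias terms, hidden state h_t = \sigma(x_t U + h_{t-1} W), output \hat{y}_t = \sigma(h_t V), target y_t, sequence length T, and learning rate \alpha. The symbols are chosen for illustration. The error signals are computed from the last time step back to the first, and the weight updates are accumulated over all steps before being applied.

```latex
\begin{aligned}
\delta^{(2)}_t &= \left(y_t - \hat{y}_t\right) \odot \hat{y}_t \odot \left(1 - \hat{y}_t\right) \\
\delta^{(1)}_t &= \left(\delta^{(1)}_{t+1} W^{\top} + \delta^{(2)}_t V^{\top}\right) \odot h_t \odot \left(1 - h_t\right),
  \qquad \delta^{(1)}_{T+1} = 0 \\
\Delta V &= \sum_{t=1}^{T} h_t^{\top}\, \delta^{(2)}_t, \qquad
\Delta W = \sum_{t=1}^{T} h_{t-1}^{\top}\, \delta^{(1)}_t, \qquad
\Delta U = \sum_{t=1}^{T} x_t^{\top}\, \delta^{(1)}_t \\
U &\leftarrow U + \alpha\, \Delta U, \qquad
W \leftarrow W + \alpha\, \Delta W, \qquad
V \leftarrow V + \alpha\, \Delta V
\end{aligned}
```

Because the error is defined as y_t - \hat{y}_t, each \delta is the negative gradient of the squared error \tfrac{1}{2}(y_t - \hat{y}_t)^2 with respect to the layer's pre-activation, so adding the updates performs gradient descent.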

【Example】

Example: use an RNN to add two eight-bit binary numbers.
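To make the setup concrete, here is a small sketch of how two integers can be turned into per-time-step inputs and targets for this task. Each number is written as an 8-bit vector, and the bits are fed from the least significant to the most significant so that the carry can travel forward through the hidden state. The helper names `to_bits` and `make_sequence` are hypothetical and are not part of the listing that follows; the sketch also assumes the sum fits in 8 bits.

```python
import numpy as np

BINARY_DIM = 8


def to_bits(n, width=BINARY_DIM):
    """Return n as an array of 0/1 values, most significant bit first."""
    return np.array([(n >> (width - 1 - i)) & 1 for i in range(width)], dtype=np.uint8)


def make_sequence(a_int, b_int, width=BINARY_DIM):
    """Build (inputs, targets) ordered from least to most significant bit."""
    a = to_bits(a_int, width)
    b = to_bits(b_int, width)
    c = to_bits(a_int + b_int, width)
    inputs = np.stack([a[::-1], b[::-1]], axis=1)  # shape (width, 2): one bit pair per step
    targets = c[::-1].reshape(-1, 1)               # shape (width, 1): one sum bit per step
    return inputs, targets


inputs, targets = make_sequence(9, 60)  # 9 + 60 = 69
```

At each time step the network receives one row of `inputs` (a bit of each operand) and is trained to produce the matching row of `targets`.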

import copy
import numpy as np

np.random.seed(0)


# compute sigmoid nonlinearity
def sigmoid(x):
    output = 1 / (1 + np.exp(-x))
    return output


# convert output of sigmoid function to its derivative
def sigmoid_output_to_derivative(output):
    return output * (1 - output)


# training dataset generation: map every integer 0..255 to its 8-bit encoding
int2binary = {}
binary_dim = 8

largest_number = pow(2, binary_dim)
binary = np.unpackbits(
    np.array([range(largest_number)], dtype=np.uint8).T, axis=1)
for i in range(largest_number):
    int2binary[i] = binary[i]

# input variables
alpha = 0.1
input_dim = 2
hidden_dim = 16
output_dim = 1

# initialize neural network weights
synapse_0 = 2 * np.random.random((input_dim, hidden_dim)) - 1
synapse_1 = 2 * np.random.random((hidden_dim, output_dim)) - 1
synapse_h = 2 * np.random.random((hidden_dim, hidden_dim)) - 1

synapse_0_update = np.zeros_like(synapse_0)
synapse_1_update = np.zeros_like(synapse_1)
synapse_h_update = np.zeros_like(synapse_h)

# training logic
for j in range(10000):

    # generate a simple addition problem (a + b = c);
    # both operands stay below 128 so the sum fits in 8 bits
    a_int = np.random.randint(largest_number // 2)  # int version
    a = int2binary[a_int]  # binary encoding

    b_int = np.random.randint(largest_number // 2)  # int version
    b = int2binary[b_int]  # binary encoding

    # true answer
    c_int = a_int + b_int
    c = int2binary[c_int]

    # where we'll store our best guess (binary encoded)
    d = np.zeros_like(c)

    overallError = 0

    layer_2_deltas = list()
    layer_1_values = list()
    layer_1_values.append(np.zeros(hidden_dim))

    # moving along the positions in the binary encoding
    # (forward pass, least significant bit first)
    for position in range(binary_dim):

        # generate input and output
        X = np.array([[a[binary_dim - position - 1],
                       b[binary_dim - position - 1]]])
        y = np.array([[c[binary_dim - position - 1]]]).T

        # hidden layer (input ~+ prev_hidden)
        layer_1 = sigmoid(np.dot(X, synapse_0) +
                          np.dot(layer_1_values[-1], synapse_h))

        # output layer (new binary representation)
        layer_2 = sigmoid(np.dot(layer_1, synapse_1))

        # did we miss?... if so, by how much?
        layer_2_error = y - layer_2
        layer_2_deltas.append(
            (layer_2_error) * sigmoid_output_to_derivative(layer_2))
        overallError += np.abs(layer_2_error[0][0])

        # decode estimate so we can print it out
        d[binary_dim - position - 1] = np.round(layer_2[0][0])

        # store hidden layer so we can use it in the next timestep
        layer_1_values.append(copy.deepcopy(layer_1))

    future_layer_1_delta = np.zeros(hidden_dim)

    # backward pass: walk the time steps in reverse order
    for position in range(binary_dim):

        X = np.array([[a[position], b[position]]])
        layer_1 = layer_1_values[-position - 1]
        prev_layer_1 = layer_1_values[-position - 2]

        # error at output layer
        layer_2_delta = layer_2_deltas[-position - 1]
        # error at hidden layer
        layer_1_delta = (future_layer_1_delta.dot(synapse_h.T) +
                         layer_2_delta.dot(synapse_1.T)) * \
            sigmoid_output_to_derivative(layer_1)

        # let's update all our weights so we can try again
        synapse_1_update += np.atleast_2d(layer_1).T.dot(layer_2_delta)
        synapse_h_update += np.atleast_2d(prev_layer_1).T.dot(layer_1_delta)
        synapse_0_update += X.T.dot(layer_1_delta)

        future_layer_1_delta = layer_1_delta

    # apply the accumulated updates, then reset them for the next problem
    synapse_0 += synapse_0_update * alpha
    synapse_1 += synapse_1_update * alpha
    synapse_h += synapse_h_update * alpha

    synapse_0_update *= 0
    synapse_1_update *= 0
    synapse_h_update *= 0

    # print out progress
    if (j % 1000 == 0):
        print("Error:" + str(overallError))
        print("Pred:" + str(d))
        print("True:" + str(c))
        out = 0
        for index, x in enumerate(reversed(d)):
            out += int(x) * pow(2, index)
        print(str(a_int) + " + " + str(b_int) + " = " + str(out))
        print("------------")

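After enough training iterations the code above gives good results (progress is printed every 1000 iterations). To understand the detailed execution logic, step through the code.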

【Summary】

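A recurrent neural network can model dynamic behavior over time, but a simple RNN cannot cope with the exploding or vanishing gradients that arise as the recursion unrolls over many time steps, and it struggles to capture long-term dependencies. An effective remedy is to forget the wrong information and remember the right information. LSTM handles this problem comparatively well.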
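The vanishing-gradient effect can be illustrated with a small numerical sketch. Assuming a sigmoid hidden layer, each step backwards through time multiplies the error signal by the transposed recurrent weights and by the sigmoid derivative h * (1 - h), which is at most 0.25. In this sketch the recurrent matrix is rescaled to spectral norm 1, so the error norm shrinks by at least a factor of 4 per step; with much larger weights the same product can grow instead, which is the exploding case. The hidden size, the number of steps, the random hidden states, and the function name `backprop_norms` are arbitrary choices for illustration.

```python
import numpy as np


def backprop_norms(W, hidden_states, delta):
    """Norm of the hidden-layer error signal after each step back through time."""
    norms = []
    for h in reversed(hidden_states):
        # propagate through the recurrent weights, then through the sigmoid derivative
        delta = (delta @ W.T) * h * (1.0 - h)
        norms.append(float(np.linalg.norm(delta)))
    return norms


rng = np.random.default_rng(0)
hidden_dim, steps = 16, 30

W = rng.uniform(-1, 1, size=(hidden_dim, hidden_dim))
W = W / np.linalg.norm(W, 2)  # rescale so the largest singular value is 1

# stand-in hidden activations; a real network would produce these in its forward pass
hidden_states = [rng.uniform(0.05, 0.95, size=hidden_dim) for _ in range(steps)]

norms = backprop_norms(W, hidden_states, np.ones(hidden_dim))
print("after 1 step: ", norms[0])
print("after 30 steps:", norms[-1])
```

After 30 steps the error signal is many orders of magnitude smaller than after the first step, so the earliest time steps receive almost no learning signal.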
