
Backpropagation

From Wikipedia, the free encyclopedia

This article is about the computer algorithm. For the biological process, see Neural backpropagation.

Backpropagation is a common method of training artificial neural networks so as to minimize the objective function. Arthur E. Bryson and Yu-Chi Ho described it as a multi-stage dynamic system optimization method in 1969.[1][2] It was not until 1974 and later, when it was applied in the context of neural networks through the work of Paul Werbos,[3] David E. Rumelhart, Geoffrey E. Hinton and Ronald J. Williams,[4][5] that it gained recognition, and it led to a "renaissance" in the field of artificial neural network research.

It is a supervised learning method, and is a generalization of the delta rule. It requires a dataset of the desired output for many inputs, which makes up the training set. It is most useful for feed-forward networks (networks that have no feedback, or simply, no connections that loop). The term is an abbreviation for "backward propagation of errors". Backpropagation requires that the activation function used by the artificial neurons (or "nodes") be differentiable.

Summary

For better understanding, the backpropagation learning algorithm can be divided into two phases: propagation and weight update.

Phase 1: Propagation

Each propagation involves the following steps:

1. Forward propagation of a training pattern's input through the neural network in order to generate the propagation's output activations.

2. Backward propagation of the propagation's output activations through the neural network, using the training pattern's target, in order to generate the deltas of all output and hidden neurons.

Phase 2: Weight update

For each weight-synapse:

1. Multiply its output delta and input activation to obtain the gradient of the weight.

2. Subtract a ratio (percentage) of the gradient from the weight, bringing the weight in the opposite direction of the gradient.

This ratio influences the speed and quality of learning; it is called the learning rate. The sign of the gradient of a weight indicates where the error is increasing; this is why the weight must be updated in the opposite direction.

Repeat phase 1 and 2 until the performance of the network is satisfactory.
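
The per-weight update of phase 2 can be written as a one-line rule. The following is a minimal sketch in Python; the names update_weight, output_delta, input_activation and learning_rate are illustrative only and not part of any standard API.

    def update_weight(weight, output_delta, input_activation, learning_rate=0.1):
        # Step 1: the gradient of the error with respect to this weight.
        gradient = output_delta * input_activation
        # Step 2: move the weight in the opposite direction of the gradient.
        return weight - learning_rate * gradient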

Modes of learning

There are two modes of learning to choose from: one is on-line (incremental) learning and the other is batch learning. In on-line (incremental) learning, each propagation is followed immediately by a weight update. In batch learning, many propagations occur before the weights are updated. Batch learning requires more memory capacity, but on-line learning requires more weight updates. A short sketch contrasting the two modes follows.
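
The sketch below is a minimal illustration of the two modes. It assumes a hypothetical helper grad_fn(weights, x, target) that performs one forward and backward propagation for a single example and returns the error gradient, and it assumes the weights support array arithmetic (for example, a NumPy array); neither assumption comes from the article.

    def train_online(examples, weights, grad_fn, lr=0.1):
        # On-line (incremental) learning: a weight update follows every propagation.
        for x, target in examples:
            weights = weights - lr * grad_fn(weights, x, target)
        return weights

    def train_batch(examples, weights, grad_fn, lr=0.1):
        # Batch learning: many propagations occur and their gradients are summed
        # before a single weight update is applied.
        total = sum(grad_fn(weights, x, target) for x, target in examples)
        return weights - lr * total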

Algorithm

Actual algorithm for a 3-layer network (only one hidden layer):

    Initialize the weights in the network (often randomly)
    Do
        For each example e in the training set
            O = neural-net-output(network, e)  ; forward pass
            T = teacher output for e
            Calculate error (T - O) at the output units
            Compute delta_wh for all weights from hidden layer to output layer  ; backward pass
            Compute delta_wi for all weights from input layer to hidden layer   ; backward pass continued
            Update the weights in the network
    Until all examples classified correctly or stopping criterion satisfied
    Return the network
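
A minimal concrete rendering of the pseudocode above, written as a sketch in Python with NumPy. It assumes logistic (sigmoid) activations, a squared-error measure, on-line weight updates and no separate bias terms; none of these choices is mandated by the algorithm itself, and the function and variable names are illustrative only.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def train(examples, n_in, n_hidden, n_out, lr=0.5, epochs=1000, rng=None):
        # Train a 3-layer (one hidden layer) feed-forward network with backpropagation.
        # examples: list of (input_vector, target_vector) pairs.
        # Returns the two weight matrices (input -> hidden, hidden -> output).
        rng = np.random.default_rng() if rng is None else rng
        w_ih = rng.uniform(-0.5, 0.5, (n_hidden, n_in))   # input -> hidden weights
        w_ho = rng.uniform(-0.5, 0.5, (n_out, n_hidden))  # hidden -> output weights

        for _ in range(epochs):
            for x, t in examples:
                x, t = np.asarray(x, float), np.asarray(t, float)

                # Forward pass: propagate the input to get the output activations.
                h = sigmoid(w_ih @ x)   # hidden activations
                o = sigmoid(w_ho @ h)   # output activations

                # Backward pass: deltas of the output and hidden neurons.
                delta_o = (o - t) * o * (1.0 - o)             # error derivative * sigmoid'
                delta_h = (w_ho.T @ delta_o) * h * (1.0 - h)

                # Weight update: gradient = output delta (outer product) input activation.
                w_ho -= lr * np.outer(delta_o, h)
                w_ih -= lr * np.outer(delta_h, x)

        return w_ih, w_ho

As a usage example, the classic XOR mapping can be posed as four (input, target) pairs with two inputs, a few hidden units and one output; the number of epochs needed for a satisfactory error depends on the random initialization and the learning rate.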

As the algorithm's name implies, the errors propagate backwards from the output nodes to the inner nodes. Technically speaking, backpropagation calculates the gradient of the error of the network with respect to the network's modifiable weights.[6] This gradient is almost always used in a simple stochastic gradient descent algorithm to find weights that minimize the error. Often the term "backpropagation" is used in a more general sense, to refer to the entire procedure encompassing both the calculation of the gradient and its use in stochastic gradient descent. Backpropagation usually allows quick convergence on satisfactory local minima for error in the kind of networks to which it is suited.

Backpropagation networks are necessarily multilayer perceptrons (usually with one input, one hidden, and one output layer). In order for the hidden layer to serve any useful function, multilayer networks must have non-linear activation functions for the multiple layers: a multilayer network using only linear activation functions is equivalent to some single-layer, linear network. Non-linear activation functions that are commonly used include the logistic function, the softmax function, and the Gaussian function.
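
The equivalence of stacked linear layers to a single linear layer can be checked numerically; the snippet below is only an illustration of that identity (the shapes and random seed are arbitrary), showing why a non-linearity is needed between layers.

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.standard_normal((4, 3))   # input -> hidden weights, no activation function
    W2 = rng.standard_normal((2, 4))   # hidden -> output weights, no activation function
    x = rng.standard_normal(3)

    two_layer = W2 @ (W1 @ x)          # two linear layers applied in sequence
    one_layer = (W2 @ W1) @ x          # the equivalent single linear layer
    assert np.allclose(two_layer, one_layer)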

The backpropagation algorithm for calculating a gradient has been rediscovered a number of times, and is a special case of a more general technique called automatic differentiation in the reverse accumulation mode.

It is also closely related to the Gauss–Newton algorithm, and is also part of continuing research in neural backpropagation.

Multithreaded backpropagation

Backpropagation is an iterative process that can often take a great deal of time to complete. When multicore computers are used, multithreaded techniques can greatly decrease the amount of time that backpropagation takes to converge. If batching is being used, it is relatively simple to adapt the backpropagation algorithm to operate in a multithreaded manner.

The training data is broken up into equally large batches, one for each of the threads. Each thread executes the forward and backward propagations and computes the weight and threshold deltas for its batch. At the end of each iteration, all threads must pause briefly while the deltas are summed and applied to the neural network. This process continues for each iteration. This multithreaded approach to backpropagation is used by the Encog Neural Network Framework.[7]
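
The following is a sketch of the structure of this approach using Python's standard thread pool. It is not the Encog implementation; grad_fn(weights, batch) is an assumed helper that runs the forward and backward passes for a list of examples and returns their summed weight deltas as an array.

    from concurrent.futures import ThreadPoolExecutor

    def parallel_batch_iteration(examples, weights, grad_fn, lr=0.1, n_threads=4):
        # Split the training data into one roughly equal batch per thread.
        batches = [examples[i::n_threads] for i in range(n_threads)]

        # Each thread runs the forward and backward propagations for its batch.
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            partial_deltas = list(pool.map(lambda b: grad_fn(weights, b), batches))

        # All threads have finished here; sum their deltas and apply them once.
        return weights - lr * sum(partial_deltas)

In CPython, a real speed-up additionally requires that the per-batch work release the global interpreter lock (as compiled numerical code typically does) or that processes be used instead of threads, but the split, compute, sum and apply structure is the same.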

Limitations

• The convergence obtained from backpropagation learning is very slow.
• The convergence in backpropagation learning is not guaranteed.
• The result may converge to any local minimum on the error surface, since the error surface on which stochastic gradient descent operates is generally not convex.
• Backpropagation learning requires input scaling or normalization. Inputs are usually scaled into the range of 0.1 to 0.9 for best performance;[clarification needed] a simple rescaling sketch follows this list.
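
One common way to satisfy this guideline is a per-feature min-max rescaling into that range; the exact method is not specified by the article, so the following is only one reasonable sketch.

    import numpy as np

    def rescale(inputs, lo=0.1, hi=0.9):
        # Min-max scale each input feature (column) into the range [lo, hi].
        inputs = np.asarray(inputs, dtype=float)
        col_min = inputs.min(axis=0)
        col_max = inputs.max(axis=0)
        span = np.where(col_max > col_min, col_max - col_min, 1.0)  # avoid division by zero
        return lo + (hi - lo) * (inputs - col_min) / span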

References

1.^ Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. p. 578. "The most popular method for learning in multilayer networks is called Back-propagation. It was first invented in 1969 by Bryson and Ho, but was largely ignored until the mid-1980s."

2.^ Arthur Earl Bryson, Yu-Chi Ho (1969). Applied Optimal Control: Optimization, Estimation, and Control. Blaisdell Publishing Company or Xerox College Publishing. p. 481.

3.^ Paul J. Werbos (1974). Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. PhD thesis, Harvard University.

4.^ Alpaydın, Ethem (2010). Introduction to Machine Learning (2nd ed.). Cambridge, Mass.: MIT Press. p. 250. ISBN 978-0-262-01243-0. "...and hence the name backpropagation was coined (Rumelhart, Hinton, and Williams 1986a)."

5.^ Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (8 October 1986). "Learning representations by back-propagating errors". Nature 323 (6088): 533–536. doi:10.1038/323533a0.

6.^ Paul J. Werbos (1994). The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting. New York, NY: John Wiley & Sons, Inc.

7.^ J. Heaton. Applying Multithreading to Resilient Propagation and Backpropagation. https://www.sodocs.net/doc/3a13438328.html,/encog/mprop/compare.html

External links

• Backpropagation for mathematicians
• Chapter 7 "The backpropagation algorithm" of Neural Networks - A Systematic Introduction by Raúl Rojas (ISBN 978-3540605058)
• NeuronDotNet - A modular implementation of artificial neural networks in C# along with sample applications
• Implementation of BackPropagation in C++
• Implementation of BackPropagation in C#
• Implementation of BackPropagation in Java
• Implementation of BackPropagation in Ruby
• Implementation of BackPropagation in Python
• Implementation of BackPropagation in PHP
• Quick explanation of the backpropagation algorithm
• Graphical explanation of the backpropagation algorithm
• Concise explanation of the backpropagation algorithm using math notation
• Backpropagation neural network tutorial at the Wikiversity
