10/5/2011. TÁMOP-4.1.2-08/2/A/KMR-2009-0006

Development of Complex Curricula for Molecular Bionics and Infobionics Programs within a consortial framework**
Consortium leader: PETER PAZMANY CATHOLIC UNIVERSITY
Consortium members: SEMMELWEIS UNIVERSITY, DIALOG CAMPUS PUBLISHER
The Project has been realised with the support of the European Union and has been co-financed by the European Social Fund.***
** Complex development of the curricula of the Molecular Bionics and Infobionics programmes within a consortium framework (Molekuláris bionika és Infobionika Szakok tananyagának komplex fejlesztése konzorciumi keretben)
*** The project is realised with the support of the European Union, co-financed by the European Social Fund (A projekt az Európai Unió támogatásával, az Európai Szociális Alap társfinanszírozásával valósul meg).

Peter Pazmany Catholic University, Faculty of Information Technology
Feedforward Neural Networks (Előrecsatolt neurális hálózatok)
J. Levendovszky, A. Oláh, K. Tornai
Digital- and Neural-Based Signal Processing & Kiloprocessor Arrays (Digitális, neurális és kiloprocesszoros architektúrákon alapuló jelfeldolgozás – signal processing on digital, neural, and kiloprocessor based architectures)
www.itk.ppke.hu

Contents
• Introduction – topology
• Representation capability
• Blum and Li construction
• Generalization capabilities
• Bias–variance dilemma
• Learning
• Applications

Introduction – FFNN
• Multilayer neural network
– Input layer
– Intermediate (hidden) layers
– Output layer
– The outputs of each layer are the inputs of the following layer
• Multiple inputs, multiple outputs
• Each layer contains a number of nonlinear perceptrons

Introduction – FFNN
• Feed Forward Neural Networks are used for
• Classification: supervised learning for classification, given inputs and class labels
• Approximation: arbitrary functions with arbitrary precision
• Prediction: "What is the next element in the future of a given time series?"

Topology
[Figure: input layer, first and second hidden layers, output layer; input signal (stimulus) x, output signal (response) y; the weight w11(2) of the first neuron of the second layer is marked]

Topology
• Each cell has
• weights w_ij^(l): the weight of the i-th neuron in the l-th layer coming from the j-th neuron of the (l−1)-th layer
• a nonlinear activation function (logistic function, biologically motivated)

Activation functions
[Figure: logistic (sigmoid) activation functions with different slope parameters]
• The parameter of the sigmoid function may be different, as can be seen in the figure
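Since the activation-function formulas on the slide were images, here is a minimal sketch of the logistic activation and its derivative (used later in back propagation); the function names and the slope parameter `lam` are illustrative assumptions, not from the slides.

```python
import numpy as np

def sigmoid(u, lam=1.0):
    """Logistic activation phi(u) = 1 / (1 + exp(-lam * u));
    lam controls the steepness (the parameter varied in the figure)."""
    return 1.0 / (1.0 + np.exp(-lam * u))

def sigmoid_derivative(u, lam=1.0):
    """phi'(u) = lam * phi(u) * (1 - phi(u)), convenient for back propagation."""
    s = sigmoid(u, lam)
    return lam * s * (1.0 - s)
```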
FFNN – mode of operation
• Output of the network: y = Net(x, w)
• Where, layer by layer (reconstructing the formula image from the notation above): y_i^(l) = φ( Σ_j w_ij^(l) · y_j^(l−1) ), with y^(0) = x
• Number of layers: L; number of neurons in the l-th layer: n_l

FFNN – weights
• The free parameters are called weights
• They can be changed in the course of the adaptation (learning) process in order to "tune" the network to perform a specific task
• This learning procedure will be discussed later
• When solving an engineering task with an FFNN we are faced with the following questions:

FFNN – questions
1. Representation – how many different tasks can be represented by an FFNN?
2. Learning – how to set up the weights to solve a specific task?
3. Generalization – if only limited knowledge is available about the task to be solved, how will the FFNN generalize this knowledge?

FFNN in operation
• The neural network works as follows:
• the network is created according to the specification
• the weights of the network are set so that the error of the network is minimal
• The weights are set using the training sequence
• Learning is guided by the error function, which determines how the weights of the neural network are adapted on the error surface

FFNN – in operation
• In most cases the error function is chosen to be the squared error
• The adaptation of the weights can be done by different methods; usually the gradient descent method is used
• In simple problems the error function is a quadratic function: it has only one minimum, so convergence to the global optimum can be assured

FFNN – example of error function
[Figure: a possible quadratic error surface]
• The learning task is to find the global minimum

Representation
• In the following, the representation capability of the FFNN is discussed
• We seek the function space F in which the FFNN approximations are uniformly dense (the symbol on the slide denotes the fact that the set of networks is uniformly dense in F)

Representation
• In this function space every function can be approximated arbitrarily well by an FFNN
• The notation ‖·‖ denotes the norm used on the space F
• For example, the error is computed as follows in Lp:
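The Lp error announced above appeared as a formula image on the original slide; a standard reconstruction consistent with the notation used here:

```latex
\left\| F - \mathrm{Net}(\cdot,\mathbf{w}) \right\|_{L_p}
  = \left( \int \left| F(\mathbf{x}) - \mathrm{Net}(\mathbf{x},\mathbf{w}) \right|^p \, d\mathbf{x} \right)^{1/p}
```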
Representation – Theorem 1
Theorem (Hornik, Stinchcombe, White, 1989)
• FFNNs are uniformly dense in the Lp space
• Recall the Lp norm above

Representation – Theorem 1
• In other words, every function in Lp can be approximated arbitrarily closely by a neural net
• More precisely, for each F ∈ Lp and each ε > 0 there exists a network with ‖F − Net(·, w)‖ < ε

Representation – Theorem 1
• Since Lp is a rather large space, the theorem implies that almost any engineering task can be solved by a one-hidden-layer neural network
• The proof of the theorem heavily draws on functional analysis and is based on the Hahn–Banach theorem
• Since it is outside the focus of the course, this proof is not presented here

Representation – Blum and Li theorem
Theorem (Blum and Li)
• FFNNs are uniformly dense in the L2 space
• In other words: for each F ∈ L2 and each ε > 0 there exists a network with ‖F − Net(·, w)‖_L2 < ε

Representation – Blum and Li theorem
• Proof:
• use step functions S
• from elementary integration theory it is clear that S is uniformly dense in L1; namely, every function in L1 can be approximated by an appropriate step function (figure)
• the step function can have arbitrarily narrow steps; for example, each step could be divided into two sub-steps

Representation – Blum and Li theorem
• These steps partition the domain of the function
• One partition cell can easily be represented by a small neural network (in two dimensions the figure gives an example)
• The borders of a partition cell are hyperplanes, each of which can be represented by one perceptron

Representation – Blum and Li theorem
• Now, since every partition cell can be represented by a corresponding sub-network, the whole function F(x) can be approximated by the FFNN
• In the following slides a constructive approximation method is introduced
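Written out, the proof idea (and the basis of the construction that follows) is to approximate F by a step function over a partition {A_i}; this reconstructs the formula image from the proof slide, with the indicator notation being an assumption:

```latex
F(\mathbf{x}) \;\approx\; S_n(\mathbf{x}) \;=\; \sum_{i=1}^{n} F_i \,\chi_{A_i}(\mathbf{x}),
\qquad
\chi_{A_i}(\mathbf{x}) =
\begin{cases}
1 & \mathbf{x} \in A_i \\
0 & \text{otherwise,}
\end{cases}
```

with ‖F − S_n‖ → 0 as the partition is refined.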
Blum and Li construction
• The Blum and Li construction is based on the "LEGO" principle
• The approximation of the function F is based on its step function
• Let us have a step function with n steps

Blum and Li construction
• This step function partitions the domain of the original function F
• For each partition cell there is a neuron responsible for approximating the "step"
• If the input x of the FFNN falls into a given range, the appropriate approximator neuron is selected
• The output of the network is this selected value

Blum and Li construction
1. An arbitrary value x arrives
2. The appropriate interval is selected
3. The response of the network is the response of the selected neuron (approximator)

Blum and Li construction
• This construction …
• … has no dimensional limits
• … has no equidistance restrictions on the tiles (partition cells)
• … can be refined further, so the approximation can be made arbitrarily precise
• 2-dimensional example: the tiles are the tops of the columns for each approximation cell [figure]

Blum and Li construction
[Figure: block for one region; inputs x1 … xM and bias inputs −1 feed separator perceptrons, whose outputs feed an AND neuron]
• Construction for one particular region: the output is I1 if we are in this region
• Construction for another region: the output is I2 if we are in that region
• Each separator is a perceptron with 0 or 1 output performing the linear separation for one side of the region; the AND neuron combines the sides

Blum and Li construction
• Each region is approximated by a block specified as above [figure]

Blum and Li construction
• Third layer:
• this neuron has a linear activation function
• the weights of this neuron are the approximation values of the function F
• the output of each block (marked with different colors) is one or zero depending on whether the input is in the specified region
• thus the approximation of the original function F over its whole domain is done by the FFNN
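A minimal 1-D sketch of the LEGO principle under the layering described above: two threshold perceptrons mark the borders of each tile, an AND neuron with weights [1, 1] and threshold 1.5 (as on the example slide) detects the tile, and a linear output neuron holds the step heights. Function names and the tiling are illustrative assumptions.

```python
import numpy as np

def heaviside(u):
    """Hard-limiter perceptron output: 1 if u >= 0, else 0."""
    return (u >= 0).astype(float)

def blum_li_1d(F, a, b, n):
    """Blum-Li style approximator of F on [a, b] with n tiles.

    Layer 1: two threshold perceptrons per tile mark its left and right border.
    Layer 2: an AND perceptron (weights [1, 1], threshold 1.5) fires iff x is in the tile.
    Layer 3: a linear neuron weighted by the step heights F(midpoint of tile).
    """
    edges = np.linspace(a, b, n + 1)
    heights = F((edges[:-1] + edges[1:]) / 2.0)   # one "LEGO brick" height per tile

    def net(x):
        x = np.atleast_1d(x)[:, None]              # shape (batch, 1)
        left  = heaviside(x - edges[:-1])          # x >= left border of each tile
        right = heaviside(edges[1:] - x)           # x <= right border
        tile  = heaviside(left + right - 1.5)      # AND: both borders satisfied
        # (points exactly on a shared border may fire two tiles; harmless here)
        return tile @ heights                      # linear output neuron
    return net

# Usage: 20 tiles already track a smooth target reasonably well.
net = blum_li_1d(np.sin, 0.0, 2 * np.pi, 20)
print(net(np.array([0.5, 1.5, 3.0])))  # approx. sin at the tile midpoints
```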
Blum and Li construction
• Minimizing the number of neurons:
• we do not have to represent a hyperplane more than once
• size of FFNN ~ max ‖grad F‖
• if there is an input region where F is very sensitive, meaning that F changes very fast (the derivative is large), then the number of regions has to be chosen according to the derivative

Blum and Li examples
[Figure: a 2D example and a 3D example of the approximation]

Blum and Li examples
• Weights – separator neurons:
1. [−1.875 −1]
2. [−1.875 +1], [−0.625 −1]
3. [−0.625 +1], [0.625 −1]
4. [0.625 +1], [1.875 −1]
5. [1.875 +1]
• AND neurons: [0.5 1] or [1.5 1 1]
• Linear neuron in the output layer, weights: [0, −0.18, 1, 0.24, 0.01]

Blum and Li in general
• The partitioning of the domain may be arbitrary
• Let us consider the 2D plane as the domain of the function F
• The following partitioning can be used: [figure]

Blum and Li – problems
• The Blum and Li construction is a good approximator, as shown previously, but it has its limitations:
• the size of the FFNN constructed by this method is quite big
• consider the task in the picture: let us have 1000 by 1000 cells to approximate the function
• in the optimal case 3003 neurons are needed (non-optimal: ~4 million)
• a smoother approximation needs even more
• We are therefore after a less complicated architecture

Learning
• The Blum and Li construction is not always applicable; therefore we seek a procedure that trains the neural network so that an arbitrary function can be approximated by it
• The function F is only partially known
• The function F behaves as a black box
• The task is to find a w which minimizes the difference between F and the network

Learning
• This minimization cannot be carried out directly: complete information would be needed about F(x)
• Instead, weak learning is performed in an incomplete environment: instead of using F(x), a training set of observations is constructed

Learning
• If the error of the network (the square of the difference between the output and the desired output) is minimal, the approximation is the best achievable
• We cannot do this due to the limited information on F; instead we seek:
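The objective announced above was an equation image on the slide; the standard form, with a training set {(x_k, d_k)}, k = 1 … K, is:

```latex
\mathbf{w}^{\ast}
  = \arg\min_{\mathbf{w}} \; R_{\mathrm{emp}}(\mathbf{w})
  = \arg\min_{\mathbf{w}} \; \frac{1}{K} \sum_{k=1}^{K}
      \bigl( d_k - \mathrm{Net}(\mathbf{x}_k, \mathbf{w}) \bigr)^2 ,
\qquad d_k = F(\mathbf{x}_k).
```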
Learning
• The questions are the following:
• what is the relationship between these optimal weights (the minimizer of the theoretical error and the minimizer of the empirical error)?
• how should this new objective function be minimized as quickly as possible?

Statistical learning theory
• Empirical error: R_emp(w), computed on the training set
• Theoretical error: R(w), the expected error
• Let x_k be random variables subject to the uniform distribution

Statistical learning theory
• For x_k random variables, where d = F(x), the empirical error converges to the theoretical error (up to a factor that is approximately constant due to the uniformity)

Statistical learning theory
• Therefore the empirical error tends to the theoretical error in the l.i.m. sense, where l.i.m. means limit in mean
• The question is how to set K so that the minimizer of the empirical error is close to the minimizer of the theoretical error

Bias–variance dilemma
• Size of the NN vs. size of the training set, K:
• the size of the neural network is the number of weights
• K is the size of the training set
• Let us investigate the difference between the error of the trained network and the best achievable error

Bias–variance dilemma
• One can write the difference (adding and subtracting the same term) as a sum of two terms plus cross terms
• The expected value of the cross terms is zero

Bias–variance dilemma
• Remarks:
• the cross terms in the expression above become zero
• the first term in the expression is the approximation error between F(x) and Net(x, w)
• the second term is the error resulting from the finite training set
• One can choose between the following options:
• either minimize the first term (which is referred to as the bias) with a relatively large network; but then, with a limited-size training set, the weights cannot be trained correctly by learning, so the second term will be large

Bias–variance dilemma
• Second option:
• minimize the second term (called the variance), which needs a small network; however, a small network makes the first term (the bias) large
• Conclusion: there is a dilemma between bias and variance
• This gives rise to the question of how to set the sizes of the network and of the training set to strike a good balance between bias and variance
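The decomposition sketched above ("adding and subtracting the same term") was an equation image; a standard reconstruction, with ŵ denoting the weights learned from the K-element training set and E taken over the training data, is:

```latex
E\!\left[ \bigl( F(\mathbf{x}) - \mathrm{Net}(\mathbf{x}, \hat{\mathbf{w}}) \bigr)^2 \right]
 = \underbrace{\bigl( F(\mathbf{x}) - E[\mathrm{Net}(\mathbf{x}, \hat{\mathbf{w}})] \bigr)^2}_{\text{bias}^2}
 \;+\;
 \underbrace{E\!\left[ \bigl( \mathrm{Net}(\mathbf{x}, \hat{\mathbf{w}}) - E[\mathrm{Net}(\mathbf{x}, \hat{\mathbf{w}})] \bigr)^2 \right]}_{\text{variance}}
```

The cross term vanishes in expectation, which is the "expected value should be zero" remark on the slide.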
VC dimension
• Question: how to set the size of the training set so that it strikes a good balance between bias and variance
• We know the theoretical and the empirical error; the question is: what is the probability that the difference of these errors is greater than a given constant?
• Furthermore, this probability must be minimized

VC dimension
• We seek this probability as a function of the training set size
• Replacing the optimal weight vector in the bound is not enough: to obtain such a result we have to introduce a stronger notion of convergence, called uniform convergence

VC dimension
• Uniform convergence requires the deviation bound to hold simultaneously for all other w

VC dimension
• If this uniform convergence holds, then the necessary size of the learning set can be estimated
• Vapnik and Chervonenkis pioneered the work of revealing such bounds, and the basic parameter of the bound is called the VC dimension in honor of their achievements
• The following slides discuss this VC dimension

VC dimension
• Let us assume that we are given a network Net(x, w) which we use for binary classification
• The VC dimension is related to the classification "power" of Net(x, w)
• More precisely, consider the set of dichotomies implemented by Net(x, w)

VC dimension
• The VC dimension of Net(x, w) is defined as the size of the largest point set whose every dichotomy can be expressed by Net(x, w)
• For example, let us consider the elementary network Net(x, w) = sgn{wᵀx − b}:
• its VC dimension is N + 1, where N is the input dimension
• if N = 2, only 2 + 1 = 3 points can be separated in all possible ways on a 2D plane
• (as we have seen when investigating the capacity of a single perceptron)

VC dimension
• VC dimension in general:
• consider the theoretical and empirical errors and the relations between them
• combining these relations with Vapnik's statement yields a uniform bound on the deviation of the two errors

VC dimension
• VC dimension result: setting the constants properly, the optimal size of the training set is driven by the VC dimension (see the bound below)
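The bound referred to above was an equation image; a standard form of the Vapnik–Chervonenkis uniform convergence result (stated here as a reconstruction from the literature, not verbatim from the slides), with h the VC dimension and K the training set size, is:

```latex
P\!\left( \sup_{\mathbf{w}} \bigl| R(\mathbf{w}) - R_{\mathrm{emp}}(\mathbf{w}) \bigr| > \varepsilon \right)
  \;\le\; 4 \left( \frac{2eK}{h} \right)^{h} e^{-\varepsilon^2 K / 8}
```

The right-hand side tends to zero provided K grows faster than h, which is why the necessary training set size scales with the VC dimension.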
VC dimension
• Value of the VC parameter:
• if we apply hard nonlinearities in the neural network, the VC dimension is on the order of W log W
• if we apply soft nonlinearities, it is on the order of W²
• where W is the number of weights in the neural network

Learning – in practice
• Learning is based on the training set
• Minimize the empirical error function (R_emp)
• Learning is a multivariate optimization task

Learning – Newton method
• Newton method:
• in each step, using the learning set, we modify the weights of the neurons in the layers in order to minimize the error
• to do this, the empirical error of the actual neuron is computed, and the gradient of this error is used to modify the weight

Learning
• The Rosenblatt algorithm is inapplicable here, since we do not know the error and the desired output in the hidden layers of the FFNN
• The error of the whole network somehow has to be distributed over the internal neurons, in a feedback fashion
• Forward propagation of function signals and back-propagation of error signals

Sequential back propagation
• Adapting the weights of the FFNN
• The weights are modified in the direction opposite to the gradient of the error function
• The elements of the training set are presented to the FFNN sequentially

Sequential back propagation
• Consider the following FFNN [figure: inputs x1, x2, one hidden layer, output y]
• Error function: the squared difference between the desired and the actual output
• Adapting the bias of a neuron in the hidden layer follows the same gradient rule, where the empirical error is the per-sample squared error

Sequential back propagation
• Activation function: the logistic function φ(u) [figure]
• The derivative of this function: φ′(u) = λ φ(u)(1 − φ(u))

Sequential back propagation
• Using the previous result on the derivative of the activation function, the weight is modified accordingly

Sequential back propagation
• Adapting the weights of the neuron in the output layer [figure]

Sequential back propagation
• Adapting the weights of the neurons in the hidden layer [figure]
Sequential back propagation
• Adapting the weights of the neurons in the hidden layer (continued) [figure]

Steps of learning
1. Initialization: setting up the initial weights w, usually with random numbers
2. Assembling the training set: the training set has pairs of inputs and desired outputs
3. Propagating the signal: compute the outputs of all neurons in the network
4. Back propagating the error and updating the weights
5. Repeating steps 3 and 4 for a new sample

Propagation and back propagation
[Figure: forward propagation of function signals and back propagation of error signals through the network]

Numerical example – steps 1 & 2
• Consider the following problem and initial state (λ = 1) [figure]

Numerical example – step 3
• Propagating the signal (k = 1) [figure]

Numerical example – step 4
• Back propagating and updating: output layer

Numerical example – step 4
• Back propagating and updating: output layer – updating

Numerical example – step 4
• Back propagating and updating: hidden layer – updating

Numerical example – step 4
• This must be repeated for the other samples in the training set until a pre-defined stopping criterion is reached
• This criterion can be:
• a limit on the number of steps
• a pre-defined level of empirical error
• the weights no longer changing
• …

Example of errors
• Consider the following problem and initial state [figure]

Example of errors
• The structure of the back propagated errors, and its simplified form (the worked formulas on the slides are sketched in the code below)
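Since the worked numbers of the example above were figures, here is a minimal runnable sketch of sequential back propagation for a 2-input, one-hidden-layer FFNN; layer sizes, the learning rate, and the XOR data are illustrative assumptions.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def train_sequential_backprop(X, d, n_hidden=2, eta=0.5, epochs=2000, seed=0):
    """Sequential (sample-by-sample) back propagation for a 1-hidden-layer FFNN.

    X: (K, N) inputs; d: (K,) desired outputs in (0, 1).
    Biases are folded in as an extra constant input of 1.
    """
    rng = np.random.default_rng(seed)
    K, N = X.shape
    W1 = rng.normal(scale=0.5, size=(n_hidden, N + 1))  # step 1: random init
    W2 = rng.normal(scale=0.5, size=n_hidden + 1)

    for _ in range(epochs):
        for k in range(K):                # step 2: training set, sample by sample
            x = np.append(X[k], 1.0)      # step 3: forward pass
            h = sigmoid(W1 @ x)
            hb = np.append(h, 1.0)
            y = sigmoid(W2 @ hb)

            err = d[k] - y                # step 4: back propagate the error
            delta_out = err * y * (1.0 - y)                  # output-layer delta
            delta_hid = W2[:-1] * delta_out * h * (1.0 - h)  # hidden-layer deltas

            W2 += eta * delta_out * hb    # gradient-descent weight updates
            W1 += eta * np.outer(delta_hid, x)
    return W1, W2

# Usage: learn XOR, a task a single perceptron cannot represent.
# (May need more epochs or another seed to converge; it is only a sketch.)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 1, 1, 0], dtype=float)
W1, W2 = train_sequential_backprop(X, d)
```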
Learning issues
• The speed of learning or the quality of the approximation is not always the best reachable:
• it may be possible to improve the result with other (better) weights w or with another neural structure
• The VC dimension must be considered when the size of the network and of the training set is planned
• It is quite possible for the FFNN to become overtrained:
• on the elements of the training set the output of the network is errorless, but on other inputs the error is huge
• consider the following figure

Learning issues
• Example: overtrained network [figure: F function, training set, learned function]
• The error of the network is almost zero when the input is a training point
• In the other cases the error is much bigger
• Therefore it is not the target function F that has been learned by the network

Learning issues
• Example: undertrained network [figure: F function, training set, learned function]

Improvements of learning
• Preprocessing the input and post-processing the output: normalization
• Altering the statistical properties of the input: type of distribution, range of the data mapping
• Use of different nonlinearities (even linearity): using different activation functions in different layers
• Use of different learning parameters

Improvements of learning
• Altering the initialization method: not using random numbers during initialization
• Improved versions of the learning algorithm:
• resilient back propagation
• Levenberg–Marquardt algorithm
• momentum methods
• Partitioning the available data into:
• learning set – to train the network
• validation set – to validate the learned weights
• testing set – to evaluate the FFNN

Comparison of two learning methods
• Resilient back propagation rule
• Convergence time (learning time) [figure: number of steps vs. time (s)]

Applications of FFNN – introduction
• Pattern (character) recognition:
• given: samples and their indices
• input: a noisy sample
• output: the index of the stored sample
• Time series prediction:
• the FFNN is able to predict the next value of a time series when historical data is available and the FFNN is trained on the historical data (see the sketch below)
• examples: power consumption, currency exchange rates
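A minimal sketch of the training-set construction for time-series prediction described on the following slides (a window of n consecutive values as input, the next value as desired output); the function name and the toy series are illustrative assumptions.

```python
import numpy as np

def sliding_window_dataset(series, n):
    """Build (input, target) pairs for one-step-ahead prediction:
    inputs are n consecutive values, the target is the value right after them."""
    X = np.array([series[i:i + n] for i in range(len(series) - n)])
    d = np.array([series[i + n] for i in range(len(series) - n)])
    return X, d

# Usage: a simple periodic series, split into training and testing parts.
t = np.linspace(0, 8 * np.pi, 400)
series = np.sin(t)
X, d = sliding_window_dataset(series, n=10)
X_train, d_train = X[:300], d[:300]   # training set
X_test,  d_test  = X[300:], d[300:]   # testing set
print(X_train.shape, d_train.shape)   # (300, 10) (300,)
```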
Applications of FFNN – introduction
• Telecommunication:
• signal detection task
• given a channel and noisy symbols arriving through this channel
• the task is to decide which symbol has been sent over the channel
• Call admission control:
• in packet-switched networks
• to provide maximal throughput and avoid overflow of the network

Applications of FFNN – time series prediction
• The task is the following:
• we know the history of a time series up to the current time instant
• we would like to estimate the next few elements of this time series
• In order to solve this task using the FFNN, a training set must be assembled:
• the training set contains an n-length vector holding the values from i to i+n−1 of the time series as input, and the value at i+n as desired output
• running i from 1 to N−n−1, where N is the length of the time series, the training set is constructed easily

Applications of FFNN – time series prediction
• For example, take the following simple function as the time series [figure: training set and testing set]

Applications of FFNN – time series prediction
• The predicted time series [figure]
• The precision of the prediction is very high

Applications of FFNN – time series prediction
• Error between the prediction and the real time series (on the order of 10⁻⁸) [figure]
• This function was learned by the FFNN
• The information in the training set was generalized, and the future values of the time series were predicted well by the FFNN

Applications of FFNN – classification
• The example classification task is the following:
• classification with two classes
• a data set of vectors is given; the two classes are not defined explicitly
• the information about which vector belongs to the first class and which to the second is available
• The training set is constructed from this information:
• a vector as input, and +1 or −1 as the classification label (see the sketch below)

Applications of FFNN – classification
• Training set in 2D space with two classes [figure]
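A minimal sketch of assembling such a ±1-labelled training set; the circular class boundary (centre (0, 0), radius 7) is the example rule quoted on the next slide, while the sampling range, sample count, and function name are illustrative assumptions.

```python
import numpy as np

def make_two_class_set(n_samples=500, radius=7.0, seed=0):
    """Sample 2D vectors and label them +1 / -1 by the circular example rule:
    points inside the circle of the given radius belong to class +1."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-10.0, 10.0, size=(n_samples, 2))
    d = np.where(np.linalg.norm(X, axis=1) < radius, 1.0, -1.0)
    return X, d

X, d = make_two_class_set()
print(X.shape, d[:10])  # training pairs: vector input, +/-1 desired output
```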
Applications of FFNN – classification
• Training set in 2D space:
• the classes may not be separable with a hyperplane
• the edges of the classes are soft edges and there is no explicit rule
• for example: a circle with centre at (0, 0) and radius 7
• This information (where the edges between the two classes lie) should be learned and generalized by the FFNN

Applications of FFNN – classification
• Real classification by an FFNN in a 3D example [figure: class 1 / class 2]

Applications of FFNN – signal detection
• Signal detection in a wireless network:
• given a channel with noise
• there is no information about the noise: no parameters, only observations
• the sender transmits its symbols through this noisy channel
• the receiver detects these symbols with noise
• the task is to determine which symbols have been sent through this channel
• To solve this task we can use an FFNN as the detector

Applications of FFNN – signal detection
• There is no information about the noise: no parameters, only observations
• These observations can be used as the training set for the network
[Figure: sender → channel → noise added → detector → receiver]

Applications of FFNN – signal detection, example
• Let us have a given impulse response of the channel (after channel identification)
• The following training set can be constructed from observations:
• sent symbols
• received values
• training set (for example, using two received samples as input)

Applications of FFNN – signal detection, example
• Structure of the FFNN [figure]
• The activation function of the output neuron should be a smooth approximation of the sign function:
• a differentiable function is needed, but it has to be very similar to the sign function in order to obtain a −1 or +1 response from the neural network

Summary
• The architecture of the Feedforward Neural Network has been introduced
• The representation capability of the FFNN: uniformly dense in Lp (and in L2)
• Blum and Li construction – LEGO principle: a constructive algorithm to approximate an arbitrary function
• Back propagation algorithm: a training set and an iterative algorithm to obtain the information from the training set
• Bias–variance dilemma, VC dimension
• Applications