10/5/2011. TÁMOP-4.1.2-08/2/A/KMR-2009-0006

Development of Complex Curricula for Molecular Bionics and Infobionics Programs within a consortial framework**
Consortium leader: PETER PAZMANY CATHOLIC UNIVERSITY
Consortium members: SEMMELWEIS UNIVERSITY, DIALOG CAMPUS PUBLISHER
The Project has been realised with the support of the European Union and has been co-financed by the European Social Fund.***
** Complex development of the curricula of the Molecular Bionics and Infobionics programmes within a consortium framework (Molekuláris bionika és Infobionika Szakok tananyagának komplex fejlesztése konzorciumi keretben)
*** The project is realised with the support of the European Union, co-financed by the European Social Fund (A projekt az Európai Unió támogatásával, az Európai Szociális Alap társfinanszírozásával valósul meg).

Peter Pazmany Catholic University, Faculty of Information Technology
Feedforward Neural Networks (Előrecsatolt neurális hálózatok)
J. Levendovszky, A. Oláh, K. Tornai
Digital- and Neural-Based Signal Processing & Kiloprocessor Arrays (Digitális, neurális és kiloprocesszoros architektúrákon alapuló jelfeldolgozás – signal processing on digital, neural, and kiloprocessor based architectures)
www.itk.ppke.hu

Contents
• Introduction – topology
• Representation capability
• Blum and Li construction
• Generalization capabilities
• Bias–variance dilemma
• Learning
• Applications

Introduction – FFNN
• Multilayer neural network
– Input layer
– Intermediate (hidden) layers
– Output layer
– The outputs of each layer are the inputs of the following layer
• Multiple inputs, multiple outputs
• Each layer contains a number of nonlinear perceptrons

Introduction – FFNN
• Feed Forward Neural Networks are used for
• Classification: supervised learning for classification, given inputs and class labels
• Approximation: arbitrary functions with arbitrary precision
• Prediction: "What is the next element in the future of a given time series?"

Topology
[Figure: input layer, first and second hidden layers, output layer; input signal (stimulus) x, output signal (response) y; the weight w11(2) of the first neuron of the second layer is marked]

Topology
• Each cell has
• weights w_ij^(l): the weight of the i-th neuron in the l-th layer coming from the j-th neuron of the (l−1)-th layer
• a nonlinear activation function (logistic function, biologically motivated)

Activation functions
[Figure: logistic (sigmoid) activation functions with different slope parameters]
• The parameter of the sigmoid function may be different, as can be seen in the figure
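Since the activation-function formulas on the slide were images, here is a minimal sketch of the logistic activation and its derivative (used later in back propagation); the function names and the slope parameter `lam` are illustrative assumptions, not from the slides.

```python
import numpy as np

def sigmoid(u, lam=1.0):
    """Logistic activation phi(u) = 1 / (1 + exp(-lam * u));
    lam controls the steepness (the parameter varied in the figure)."""
    return 1.0 / (1.0 + np.exp(-lam * u))

def sigmoid_derivative(u, lam=1.0):
    """phi'(u) = lam * phi(u) * (1 - phi(u)), convenient for back propagation."""
    s = sigmoid(u, lam)
    return lam * s * (1.0 - s)
```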
FFNN – mode of operation
• Output of the network: y = Net(x, w)
• Where, layer by layer (reconstructing the formula image from the notation above): y_i^(l) = φ( Σ_j w_ij^(l) · y_j^(l−1) ), with y^(0) = x
• Number of layers: L; number of neurons in the l-th layer: n_l

FFNN – weights
• The free parameters are called weights
• They can be changed in the course of the adaptation (learning) process in order to "tune" the network to perform a specific task
• This learning procedure will be discussed later
• When solving an engineering task with an FFNN we are faced with the following questions:

FFNN – questions
1. Representation – how many different tasks can be represented by an FFNN?
2. Learning – how to set up the weights to solve a specific task?
3. Generalization – if only limited knowledge is available about the task to be solved, how will the FFNN generalize this knowledge?

FFNN in operation
• The neural network works as follows:
• the network is created according to the specification
• the weights of the network are set so that the error of the network is minimal
• The weights are set using the training sequence
• Learning is guided by the error function, which determines how the weights of the neural network are adapted on the error surface

FFNN – in operation
• In most cases the error function is chosen to be the squared error
• The adaptation of the weights can be done by different methods; usually the gradient descent method is used
• In simple problems the error function is a quadratic function: it has only one minimum, so convergence to the global optimum can be assured

FFNN – example of error function
[Figure: a possible quadratic error surface]
• The learning task is to find the global minimum

Representation
• In the following, the representation capability of the FFNN is discussed
• We seek the function space F in which the FFNN approximations are uniformly dense (the symbol on the slide denotes the fact that the set of networks is uniformly dense in F)

Representation
• In this function space every function can be approximated arbitrarily well by an FFNN
• The notation ‖·‖ denotes the norm used on the space F
• For example, the error is computed as follows in Lp:
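The Lp error announced above appeared as a formula image on the original slide; a standard reconstruction consistent with the notation used here:

```latex
\left\| F - \mathrm{Net}(\cdot,\mathbf{w}) \right\|_{L_p}
  = \left( \int \left| F(\mathbf{x}) - \mathrm{Net}(\mathbf{x},\mathbf{w}) \right|^p \, d\mathbf{x} \right)^{1/p}
```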
Representation – Theorem 1
Theorem (Hornik, Stinchcombe, White, 1989)
• FFNNs are uniformly dense in the Lp space
• Recall the Lp norm above

Representation – Theorem 1
• In other words, every function in Lp can be approximated arbitrarily closely by a neural net
• More precisely, for each F ∈ Lp and each ε > 0 there exists a network with ‖F − Net(·, w)‖ < ε

Representation – Theorem 1
• Since Lp is a rather large space, the theorem implies that almost any engineering task can be solved by a one-hidden-layer neural network
• The proof of the theorem heavily draws on functional analysis and is based on the Hahn–Banach theorem
• Since it is outside the focus of the course, this proof is not presented here

Representation – Blum and Li theorem
Theorem (Blum and Li)
• FFNNs are uniformly dense in the L2 space
• In other words: for each F ∈ L2 and each ε > 0 there exists a network with ‖F − Net(·, w)‖_L2 < ε

Representation – Blum and Li theorem
• Proof:
• use step functions S
• from elementary integration theory it is clear that S is uniformly dense in L1; namely, every function in L1 can be approximated by an appropriate step function (figure)
• the step function can have arbitrarily narrow steps; for example, each step could be divided into two sub-steps

Representation – Blum and Li theorem
• These steps partition the domain of the function
• One partition cell can easily be represented by a small neural network (in two dimensions the figure gives an example)
• The borders of a partition cell are hyperplanes, each of which can be represented by one perceptron

Representation – Blum and Li theorem
• Now, since every partition cell can be represented by a corresponding sub-network, the whole function F(x) can be approximated by the FFNN
• In the following slides a constructive approximation method is introduced
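Written out, the proof idea (and the basis of the construction that follows) is to approximate F by a step function over a partition {A_i}; this reconstructs the formula image from the proof slide, with the indicator notation being an assumption:

```latex
F(\mathbf{x}) \;\approx\; S_n(\mathbf{x}) \;=\; \sum_{i=1}^{n} F_i \,\chi_{A_i}(\mathbf{x}),
\qquad
\chi_{A_i}(\mathbf{x}) =
\begin{cases}
1 & \mathbf{x} \in A_i \\
0 & \text{otherwise,}
\end{cases}
```

with ‖F − S_n‖ → 0 as the partition is refined.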
Blum and Li construction
• The Blum and Li construction is based on the "LEGO" principle
• The approximation of the function F is based on its step function
• Let us have a step function with n steps

Blum and Li construction
• This step function partitions the domain of the original function F
• For each partition cell there is a neuron responsible for approximating the "step"
• If the input x of the FFNN falls into a given range, the appropriate approximator neuron is selected
• The output of the network is this selected value

Blum and Li construction
1. An arbitrary value x arrives
2. The appropriate interval is selected
3. The response of the network is the response of the selected neuron (approximator)

Blum and Li construction
• This construction …
• … has no dimensional limits
• … has no equidistance restrictions on the tiles (partition cells)
• … can be refined further, so the approximation can be made arbitrarily precise
• 2-dimensional example: the tiles are the tops of the columns for each approximation cell [figure]

Blum and Li construction
[Figure: block for one region; inputs x1 … xM and bias inputs −1 feed separator perceptrons, whose outputs feed an AND neuron]
• Construction for one particular region: the output is I1 if we are in this region
• Construction for another region: the output is I2 if we are in that region
• Each separator is a perceptron with 0 or 1 output performing the linear separation for one side of the region; the AND neuron combines the sides

Blum and Li construction
• Each region is approximated by a block specified as above [figure]

Blum and Li construction
• Third layer:
• this neuron has a linear activation function
• the weights of this neuron are the approximation values of the function F
• the output of each block (marked with different colors) is one or zero depending on whether the input is in the specified region
• thus the approximation of the original function F over its whole domain is done by the FFNN
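A minimal 1-D sketch of the LEGO principle under the layering described above: two threshold perceptrons mark the borders of each tile, an AND neuron with weights [1, 1] and threshold 1.5 (as on the example slide) detects the tile, and a linear output neuron holds the step heights. Function names and the tiling are illustrative assumptions.

```python
import numpy as np

def heaviside(u):
    """Hard-limiter perceptron output: 1 if u >= 0, else 0."""
    return (u >= 0).astype(float)

def blum_li_1d(F, a, b, n):
    """Blum-Li style approximator of F on [a, b] with n tiles.

    Layer 1: two threshold perceptrons per tile mark its left and right border.
    Layer 2: an AND perceptron (weights [1, 1], threshold 1.5) fires iff x is in the tile.
    Layer 3: a linear neuron weighted by the step heights F(midpoint of tile).
    """
    edges = np.linspace(a, b, n + 1)
    heights = F((edges[:-1] + edges[1:]) / 2.0)   # one "LEGO brick" height per tile

    def net(x):
        x = np.atleast_1d(x)[:, None]              # shape (batch, 1)
        left  = heaviside(x - edges[:-1])          # x >= left border of each tile
        right = heaviside(edges[1:] - x)           # x <= right border
        tile  = heaviside(left + right - 1.5)      # AND: both borders satisfied
        # (points exactly on a shared border may fire two tiles; harmless here)
        return tile @ heights                      # linear output neuron
    return net

# Usage: 20 tiles already track a smooth target reasonably well.
net = blum_li_1d(np.sin, 0.0, 2 * np.pi, 20)
print(net(np.array([0.5, 1.5, 3.0])))  # approx. sin at the tile midpoints
```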
Blum and Li construction
• Minimizing the number of neurons:
• we do not have to represent a hyperplane more than once
• size of FFNN ~ max ‖grad F‖
• if there is an input region where F is very sensitive, meaning that F changes very fast (the derivative is large), then the number of regions has to be chosen according to the derivative

Blum and Li examples
[Figure: a 2D example and a 3D example of the approximation]

Blum and Li examples
• Weights – separator neurons:
1. [−1.875 −1]
2. [−1.875 +1], [−0.625 −1]
3. [−0.625 +1], [0.625 −1]
4. [0.625 +1], [1.875 −1]
5. [1.875 +1]
• AND neurons: [0.5 1] or [1.5 1 1]
• Linear neuron in the output layer, weights: [0, −0.18, 1, 0.24, 0.01]

Blum and Li in general
• The partitioning of the domain may be arbitrary
• Let us consider the 2D plane as the domain of the function F
• The following partitioning can be used: [figure]

Blum and Li – problems
• The Blum and Li construction is a good approximator, as shown previously, but it has its limitations:
• the size of the FFNN constructed by this method is quite big
• consider the task in the picture: let us have 1000 by 1000 cells to approximate the function
• in the optimal case 3003 neurons are needed (non-optimal: ~4 million)
• a smoother approximation needs even more
• We are therefore after a less complicated architecture

Learning
• The Blum and Li construction is not always applicable; therefore we seek a procedure that trains the neural network so that an arbitrary function can be approximated by it
• The function F is only partially known
• The function F behaves as a black box
• The task is to find a w which minimizes the difference between F and the network

Learning
• This minimization cannot be carried out directly: complete information would be needed about F(x)
• Instead, weak learning is performed in an incomplete environment: instead of using F(x), a training set of observations is constructed

Learning
• If the error of the network (the square of the difference between the output and the desired output) is minimal, the approximation is the best achievable
• We cannot do this due to the limited information on F; instead we seek:
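The objective announced above was an equation image on the slide; the standard form, with a training set {(x_k, d_k)}, k = 1 … K, is:

```latex
\mathbf{w}^{\ast}
  = \arg\min_{\mathbf{w}} \; R_{\mathrm{emp}}(\mathbf{w})
  = \arg\min_{\mathbf{w}} \; \frac{1}{K} \sum_{k=1}^{K}
      \bigl( d_k - \mathrm{Net}(\mathbf{x}_k, \mathbf{w}) \bigr)^2 ,
\qquad d_k = F(\mathbf{x}_k).
```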
Learning
• The questions are the following:
• what is the relationship between these optimal weights (the minimizer of the theoretical error and the minimizer of the empirical error)?
• how should this new objective function be minimized as quickly as possible?

Statistical learning theory
• Empirical error: R_emp(w), computed on the training set
• Theoretical error: R(w), the expected error
• Let x_k be random variables subject to the uniform distribution

Statistical learning theory
• For x_k random variables, where d = F(x), the empirical error converges to the theoretical error (up to a factor that is approximately constant due to the uniformity)

Statistical learning theory
• Therefore the empirical error tends to the theoretical error in the l.i.m. sense, where l.i.m. means limit in mean
• The question is how to set K so that the minimizer of the empirical error is close to the minimizer of the theoretical error

Bias–variance dilemma
• Size of the NN vs. size of the training set, K:
• the size of the neural network is the number of weights
• K is the size of the training set
• Let us investigate the difference between the error of the trained network and the best achievable error

Bias–variance dilemma
• One can write the difference (adding and subtracting the same term) as a sum of two terms plus cross terms
• The expected value of the cross terms is zero

Bias–variance dilemma
• Remarks:
• the cross terms in the expression above become zero
• the first term in the expression is the approximation error between F(x) and Net(x, w)
• the second term is the error resulting from the finite training set
• One can choose between the following options:
• either minimize the first term (which is referred to as the bias) with a relatively large network; but then, with a limited-size training set, the weights cannot be trained correctly by learning, so the second term will be large

Bias–variance dilemma
• Second option:
• minimize the second term (called the variance), which needs a small network; however, a small network makes the first term (the bias) large
• Conclusion: there is a dilemma between bias and variance
• This gives rise to the question of how to set the sizes of the network and of the training set to strike a good balance between bias and variance
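The decomposition sketched above ("adding and subtracting the same term") was an equation image; a standard reconstruction, with ŵ denoting the weights learned from the K-element training set and E taken over the training data, is:

```latex
E\!\left[ \bigl( F(\mathbf{x}) - \mathrm{Net}(\mathbf{x}, \hat{\mathbf{w}}) \bigr)^2 \right]
 = \underbrace{\bigl( F(\mathbf{x}) - E[\mathrm{Net}(\mathbf{x}, \hat{\mathbf{w}})] \bigr)^2}_{\text{bias}^2}
 \;+\;
 \underbrace{E\!\left[ \bigl( \mathrm{Net}(\mathbf{x}, \hat{\mathbf{w}}) - E[\mathrm{Net}(\mathbf{x}, \hat{\mathbf{w}})] \bigr)^2 \right]}_{\text{variance}}
```

The cross term vanishes in expectation, which is the "expected value should be zero" remark on the slide.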
VC dimension
• Question: how to set the size of the training set so that it strikes a good balance between bias and variance
• We know the theoretical and the empirical error; the question is: what is the probability that the difference of these errors is greater than a given constant?
• Furthermore, this probability must be minimized

VC dimension
• We seek this probability as a function of the training set size
• Replacing the optimal weight vector in the bound is not enough: to obtain such a result we have to introduce a stronger notion of convergence, called uniform convergence

VC dimension
• Uniform convergence requires the deviation bound to hold simultaneously for all other w

VC dimension
• If this uniform convergence holds, then the necessary size of the learning set can be estimated
• Vapnik and Chervonenkis pioneered the work of revealing such bounds, and the basic parameter of the bound is called the VC dimension in honor of their achievements
• The following slides discuss this VC dimension

VC dimension
• Let us assume that we are given a network Net(x, w) which we use for binary classification
• The VC dimension is related to the classification "power" of Net(x, w)
• More precisely, consider the set of dichotomies implemented by Net(x, w)

VC dimension
• The VC dimension of Net(x, w) is defined as the size of the largest point set whose every dichotomy can be expressed by Net(x, w)
• For example, let us consider the elementary network Net(x, w) = sgn{wᵀx − b}:
• its VC dimension is N + 1, where N is the input dimension
• if N = 2, only 2 + 1 = 3 points can be separated in all possible ways on a 2D plane
• (as we have seen when investigating the capacity of a single perceptron)

VC dimension
• VC dimension in general:
• consider the theoretical and empirical errors and the relations between them
• combining these relations with Vapnik's statement yields a uniform bound on the deviation of the two errors

VC dimension
• VC dimension result: setting the constants properly, the optimal size of the training set is driven by the VC dimension (see the bound below)
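The bound referred to above was an equation image; a standard form of the Vapnik–Chervonenkis uniform convergence result (stated here as a reconstruction from the literature, not verbatim from the slides), with h the VC dimension and K the training set size, is:

```latex
P\!\left( \sup_{\mathbf{w}} \bigl| R(\mathbf{w}) - R_{\mathrm{emp}}(\mathbf{w}) \bigr| > \varepsilon \right)
  \;\le\; 4 \left( \frac{2eK}{h} \right)^{h} e^{-\varepsilon^2 K / 8}
```

The right-hand side tends to zero provided K grows faster than h, which is why the necessary training set size scales with the VC dimension.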
VC dimension
• Value of the VC parameter:
• if we apply hard nonlinearities in the neural network, the VC dimension is on the order of W log W
• if we apply soft nonlinearities, it is on the order of W²
• where W is the number of weights in the neural network

Learning – in practice
• Learning is based on the training set
• Minimize the empirical error function (R_emp)
• Learning is a multivariate optimization task

Learning – Newton method
• Newton method:
• in each step, using the learning set, we modify the weights of the neurons in the layers in order to minimize the error
• to do this, the empirical error of the actual neuron is computed, and the gradient of this error is used to modify the weight

Learning
• The Rosenblatt algorithm is inapplicable here, since we do not know the error and the desired output in the hidden layers of the FFNN
• The error of the whole network somehow has to be distributed over the internal neurons, in a feedback fashion
• Forward propagation of function signals and back-propagation of error signals

Sequential back propagation
• Adapting the weights of the FFNN
• The weights are modified in the direction opposite to the gradient of the error function
• The elements of the training set are presented to the FFNN sequentially

Sequential back propagation
• Consider the following FFNN [figure: inputs x1, x2, one hidden layer, output y]
• Error function: the squared difference between the desired and the actual output
• Adapting the bias of a neuron in the hidden layer follows the same gradient rule, where the empirical error is the per-sample squared error

Sequential back propagation
• Activation function: the logistic function φ(u) [figure]
• The derivative of this function: φ′(u) = λ φ(u)(1 − φ(u))

Sequential back propagation
• Using the previous result on the derivative of the activation function, the weight is modified accordingly

Sequential back propagation
• Adapting the weights of the neuron in the output layer [figure]

Sequential back propagation
• Adapting the weights of the neurons in the hidden layer [figure]
Sequential back propagation
• Adapting the weights of the neurons in the hidden layer (continued) [figure]

Steps of learning
1. Initialization: setting up the initial weights w, usually with random numbers
2. Assembling the training set: the training set has pairs of inputs and desired outputs
3. Propagating the signal: compute the outputs of all neurons in the network
4. Back propagating the error and updating the weights
5. Repeating steps 3 and 4 for a new sample

Propagation and back propagation
[Figure: forward propagation of function signals and back propagation of error signals through the network]

Numerical example – steps 1 & 2
• Consider the following problem and initial state (λ = 1) [figure]

Numerical example – step 3
• Propagating the signal (k = 1) [figure]

Numerical example – step 4
• Back propagating and updating: output layer

Numerical example – step 4
• Back propagating and updating: output layer – updating

Numerical example – step 4
• Back propagating and updating: hidden layer – updating

Numerical example – step 4
• This must be repeated for the other samples in the training set until a pre-defined stopping criterion is reached
• This criterion can be:
• a limit on the number of steps
• a pre-defined level of empirical error
• the weights no longer changing
• …

Example of errors
• Consider the following problem and initial state [figure]

Example of errors
• The structure of the back propagated errors, and its simplified form (the worked formulas on the slides are sketched in the code below)
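Since the worked numbers of the example above were figures, here is a minimal runnable sketch of sequential back propagation for a 2-input, one-hidden-layer FFNN; layer sizes, the learning rate, and the XOR data are illustrative assumptions.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def train_sequential_backprop(X, d, n_hidden=2, eta=0.5, epochs=2000, seed=0):
    """Sequential (sample-by-sample) back propagation for a 1-hidden-layer FFNN.

    X: (K, N) inputs; d: (K,) desired outputs in (0, 1).
    Biases are folded in as an extra constant input of 1.
    """
    rng = np.random.default_rng(seed)
    K, N = X.shape
    W1 = rng.normal(scale=0.5, size=(n_hidden, N + 1))  # step 1: random init
    W2 = rng.normal(scale=0.5, size=n_hidden + 1)

    for _ in range(epochs):
        for k in range(K):                # step 2: training set, sample by sample
            x = np.append(X[k], 1.0)      # step 3: forward pass
            h = sigmoid(W1 @ x)
            hb = np.append(h, 1.0)
            y = sigmoid(W2 @ hb)

            err = d[k] - y                # step 4: back propagate the error
            delta_out = err * y * (1.0 - y)                  # output-layer delta
            delta_hid = W2[:-1] * delta_out * h * (1.0 - h)  # hidden-layer deltas

            W2 += eta * delta_out * hb    # gradient-descent weight updates
            W1 += eta * np.outer(delta_hid, x)
    return W1, W2

# Usage: learn XOR, a task a single perceptron cannot represent.
# (May need more epochs or another seed to converge; it is only a sketch.)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
d = np.array([0, 1, 1, 0], dtype=float)
W1, W2 = train_sequential_backprop(X, d)
```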
Learning issues
• The speed of learning or the quality of the approximation is not always the best reachable:
• it may be possible to improve the result with other (better) weights w or with another neural structure
• The VC dimension must be considered when the size of the network and of the training set is planned
• It is quite possible for the FFNN to become overtrained:
• on the elements of the training set the output of the network is errorless, but on other inputs the error is huge
• consider the following figure

Learning issues
• Example: overtrained network [figure: F function, training set, learned function]
• The error of the network is almost zero when the input is a training point
• In the other cases the error is much bigger
• Therefore it is not the target function F that has been learned by the network

Learning issues
• Example: undertrained network [figure: F function, training set, learned function]

Improvements of learning
• Preprocessing the input and post-processing the output: normalization
• Altering the statistical properties of the input: type of distribution, range of the data mapping
• Use of different nonlinearities (even linearity): using different activation functions in different layers
• Use of different learning parameters

Improvements of learning
• Altering the initialization method: not using random numbers during initialization
• Improved versions of the learning algorithm:
• resilient back propagation
• Levenberg–Marquardt algorithm
• momentum methods
• Partitioning the available data into:
• learning set – to train the network
• validation set – to validate the learned weights
• testing set – to evaluate the FFNN

Comparison of two learning methods
• Resilient back propagation rule
• Convergence time (learning time) [figure: number of steps vs. time (s)]

Applications of FFNN – introduction
• Pattern (character) recognition:
• given: samples and their indices
• input: a noisy sample
• output: the index of the stored sample
• Time series prediction:
• the FFNN is able to predict the next value of a time series when historical data is available and the FFNN is trained on the historical data (see the sketch below)
• examples: power consumption, currency exchange rates
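A minimal sketch of the training-set construction for time-series prediction described on the following slides (a window of n consecutive values as input, the next value as desired output); the function name and the toy series are illustrative assumptions.

```python
import numpy as np

def sliding_window_dataset(series, n):
    """Build (input, target) pairs for one-step-ahead prediction:
    inputs are n consecutive values, the target is the value right after them."""
    X = np.array([series[i:i + n] for i in range(len(series) - n)])
    d = np.array([series[i + n] for i in range(len(series) - n)])
    return X, d

# Usage: a simple periodic series, split into training and testing parts.
t = np.linspace(0, 8 * np.pi, 400)
series = np.sin(t)
X, d = sliding_window_dataset(series, n=10)
X_train, d_train = X[:300], d[:300]   # training set
X_test,  d_test  = X[300:], d[300:]   # testing set
print(X_train.shape, d_train.shape)   # (300, 10) (300,)
```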
Applications of FFNN – introduction
• Telecommunication:
• signal detection task
• given a channel and noisy symbols arriving through this channel
• the task is to decide which symbol has been sent over the channel
• Call admission control:
• in packet-switched networks
• to provide maximal throughput and avoid overflow of the network

Applications of FFNN – time series prediction
• The task is the following:
• we know the history of a time series up to the current time instant
• we would like to estimate the next few elements of this time series
• In order to solve this task using the FFNN, a training set must be assembled:
• the training set contains an n-length vector holding the values from i to i+n−1 of the time series as input, and the value at i+n as desired output
• running i from 1 to N−n−1, where N is the length of the time series, the training set is constructed easily

Applications of FFNN – time series prediction
• For example, take the following simple function as the time series [figure: training set and testing set]

Applications of FFNN – time series prediction
• The predicted time series [figure]
• The precision of the prediction is very high

Applications of FFNN – time series prediction
• Error between the prediction and the real time series (on the order of 10⁻⁸) [figure]
• This function was learned by the FFNN
• The information in the training set was generalized, and the future values of the time series were predicted well by the FFNN

Applications of FFNN – classification
• The example classification task is the following:
• classification with two classes
• a data set of vectors is given; the two classes are not defined explicitly
• the information about which vector belongs to the first class and which to the second is available
• The training set is constructed from this information:
• a vector as input, and +1 or −1 as the classification label (see the sketch below)

Applications of FFNN – classification
• Training set in 2D space with two classes [figure]
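A minimal sketch of assembling such a ±1-labelled training set; the circular class boundary (centre (0, 0), radius 7) is the example rule quoted on the next slide, while the sampling range, sample count, and function name are illustrative assumptions.

```python
import numpy as np

def make_two_class_set(n_samples=500, radius=7.0, seed=0):
    """Sample 2D vectors and label them +1 / -1 by the circular example rule:
    points inside the circle of the given radius belong to class +1."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-10.0, 10.0, size=(n_samples, 2))
    d = np.where(np.linalg.norm(X, axis=1) < radius, 1.0, -1.0)
    return X, d

X, d = make_two_class_set()
print(X.shape, d[:10])  # training pairs: vector input, +/-1 desired output
```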
Applications of FFNN – classification
• Training set in 2D space:
• the classes may not be separable with a hyperplane
• the edges of the classes are soft edges and there is no explicit rule
• for example: a circle with centre at (0, 0) and radius 7
• This information (where the edges between the two classes lie) should be learned and generalized by the FFNN

Applications of FFNN – classification
• Real classification by an FFNN in a 3D example [figure: class 1 / class 2]

Applications of FFNN – signal detection
• Signal detection in a wireless network:
• given a channel with noise
• there is no information about the noise: no parameters, only observations
• the sender transmits its symbols through this noisy channel
• the receiver detects these symbols with noise
• the task is to determine which symbols have been sent through this channel
• To solve this task we can use an FFNN as the detector

Applications of FFNN – signal detection
• There is no information about the noise: no parameters, only observations
• These observations can be used as the training set for the network
[Figure: sender → channel → noise added → detector → receiver]

Applications of FFNN – signal detection, example
• Let us have a given impulse response of the channel (after channel identification)
• The following training set can be constructed from observations:
• sent symbols
• received values
• training set (for example, using two received samples as input)

Applications of FFNN – signal detection, example
• Structure of the FFNN [figure]
• The activation function of the output neuron should be a smooth approximation of the sign function:
• a differentiable function is needed, but it has to be very similar to the sign function in order to obtain a −1 or +1 response from the neural network

Summary
• The architecture of the Feedforward Neural Network has been introduced
• The representation capability of the FFNN: uniformly dense in Lp (and in L2)
• Blum and Li construction – LEGO principle: a constructive algorithm to approximate an arbitrary function
• Back propagation algorithm: a training set and an iterative algorithm to obtain the information from the training set
• Bias–variance dilemma, VC dimension
• Applications