Development of Complex Curricula for Molecular Bionics and Infobionics Programs within a Consortial Framework
Consortium leader: PETER PAZMANY CATHOLIC UNIVERSITY
Consortium members: SEMMELWEIS UNIVERSITY, DIALOG CAMPUS PUBLISHER
The project has been realised with the support of the European Union and has been co-financed by the European Social Fund (TÁMOP-4.1.2-08/2/A/KMR-2009-0006).

Peter Pazmany Catholic University, Faculty of Information Technology
Signal processing on digital, neural, and kiloprocessor based architectures
Hopfield network, Hopfield net as associative memory and combinatorial optimizer
J. Levendovszky, A. Olah, G. Treplan, D. Tisza
www.itk.ppke.hu

Outline
• Introduction
• Hopfield net – structure and operation
• Hopfield net – stability and convergence properties
• Hopfield net as an Associative Memory (AM)
• Capacity analysis of the Hopfield net
• Applications of the Hopfield net as AM
• Hopfield net as combinatorial optimizer
• The way towards CNN
• Example problems

Introduction (1)
The Hopfield neural network is a
• recurrent artificial neural network,
• invented by John Hopfield,
• serving as an associative memory system,
• or operating as a combinatorial optimizer (quadratic programming);
• it is a stable dynamic system, guaranteed to converge to a local optimum,
• but convergence to one of the stored patterns is not guaranteed.

Introduction (2)
• Topology of the Hopfield Neural Network: every neuron is fed back to every other neuron (figure omitted).
• The number of connections grows quadratically with the number of neurons: implementation difficulty!

Structure and operation (1) – Notations (1)
• The weight matrix $\mathbf{W}$ contains the synaptic weights $W_{ij}$ of the feedback from neuron $i$ to neuron $j$: $\mathbf{W} = \{W_{ij},\ i,j = 1,\dots,N\}$.

Structure and operation (2) – Notations (2)
• The bias vector $\mathbf{b} = (b_1,\dots,b_N)$ contains the threshold values of the neurons.
• Let the state vector of the system be $\mathbf{y}(k) = (y_1(k),\dots,y_N(k))$, where $y_i(k) \in \{-1,+1\}$.

Structure and operation (3)
• The discrete Hopfield Neural Network can be regarded as a nonlinear recursion given in the form
$y_i(k+1) = \mathrm{sgn}\Big(\sum_{j=1}^{N} W_{ij}\, y_j(k) - b_i\Big)$
for every neuron.
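To make the recursion concrete, here is a minimal NumPy sketch of the update rule above, using the sequential (one neuron per step) updating discussed next; the helper name, the stopping criterion and the random test data are illustrative choices, not part of the slides.

```python
import numpy as np

def hopfield_run(W, b, y0, max_steps=10_000):
    """Iterate y_i <- sgn(sum_j W_ij*y_j - b_i), updating one neuron per step."""
    y = y0.copy()
    N = len(y)
    unchanged = 0                      # consecutive updates that left the state unchanged
    for k in range(max_steps):
        i = k % N                      # sequential neuron selection rule
        new = 1 if W[i] @ y - b[i] >= 0 else -1   # sgn(0) taken as +1
        unchanged = unchanged + 1 if new == y[i] else 0
        y[i] = new
        if unchanged == N:             # a full sweep with no change: fixed point reached
            return y
    return y

# Tiny usage example with a random symmetric weight matrix and zero bias.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 6)); W = (W + W.T) / 2
y_fix = hopfield_run(W, np.zeros(6), rng.choice([-1, 1], size=6))
```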
• If we restrict our attention to the sequential updating rule, only one neuron is updated in each step, and the neuron selection rule becomes (for example) $i = (k \bmod N) + 1$, i.e. the neurons are updated one after the other in a cyclic order.

Structure and operation (4)
• When investigating such a nonlinear recursion as an associative mapping, the following questions arise:
1. How to construct the matrix $\mathbf{W}$ if one wants to store a set of patterns $\mathbf{s}^{(m)},\ m = 1,\dots,M$, as fixed points of the algorithm, such that $\mathbf{s}^{(m)} = \mathrm{sgn}\big(\mathbf{W}\mathbf{s}^{(m)} - \mathbf{b}\big)$ holds.

Structure and operation (5)
2. What is the number of such fixed points $M$ as a function of the dimension (the number of neurons in the Hopfield net)? In other words, we want to reveal the storage capacity of the Hopfield net as a function of the number of neurons $N$.

Structure and operation (6)
3. Is the recursion stable, and if yes, what are its convergence properties?
• Next we will discuss these questions thoroughly. Before getting down to a detailed analysis, we need some tools rooted in classical stability theory, called the Lyapunov technique.

Stability and convergence properties (1)
• Lyapunov¹ functions are widely used in the study of dynamical systems in order to prove the stability of a system.
• This technique can be used to analyze the stability properties of Hopfield Neural Networks.
¹ Aleksandr Mikhailovich Lyapunov (June 6, 1857 - November 3, 1928) was a Russian mathematician, mechanician and physicist.

Stability and convergence properties (2)
• One possible Lyapunov function is the quadratic form $L(\mathbf{y}) = \mathbf{y}^T \mathbf{W} \mathbf{y} - 2\,\mathbf{b}^T \mathbf{y}$.

Stability and convergence properties (3) – Lyapunov's weak theorem (1)
• Let us assume that there is a nonlinear recursion given in the following general form: $\mathbf{y}(k+1) = \boldsymbol{\varphi}\big(\mathbf{y}(k)\big)$.
• If one can define a function (the so-called Lyapunov function) $L: Y \to \mathbb{R}$ over the state space $Y$, for which

Stability and convergence properties (4) – Lyapunov's weak theorem (2)
1. $L(\mathbf{y})$ has a global upper bound over the state space: $L(\mathbf{y}) \le B$ for all $\mathbf{y} \in Y$;
2. the change of $L(\mathbf{y})$, denoted by $\Delta L(k) = L(\mathbf{y}(k+1)) - L(\mathbf{y}(k))$, is positive in each step of the recursion;
then the recursion is stable and converges to one of the local maxima of $L(\mathbf{y})$.

Stability and convergence properties (5) – Lyapunov's weak theorem (3)
• The exact proof, which can be found in numerous books dealing with control and stability theory, is omitted here.
• However, it is easy to see that if in each step $L(\mathbf{y})$ can only increase, and at the same time there exists a global upper bound, then the recursion cannot go on indefinitely but will converge to one of the local maxima.

Stability and convergence properties (6) – Lyapunov's strong theorem (1)
• Let us assume that there is a nonlinear recursion given in the following general form: $\mathbf{y}(k+1) = \boldsymbol{\varphi}\big(\mathbf{y}(k)\big)$.
• If one can define a function (the so-called Lyapunov function) $L: Y \to \mathbb{R}$ over the state space $Y$, for which

Stability and convergence properties (7) – Lyapunov's strong theorem (2)
1. $L(\mathbf{y})$ has a global lower and upper bound over the state space: $A \le L(\mathbf{y}) \le B$ for all $\mathbf{y} \in Y$;
2. the change of $L(\mathbf{y})$, denoted by $\Delta L(k)$, satisfies $\Delta L(k) \ge \varepsilon > 0$ in each step of the recursion;
then the recursion is stable, converges to one of the local maxima of $L(\mathbf{y})$,

Stability and convergence properties (8) – Lyapunov's strong theorem (3)
and its transient time can be upper bounded as $K \le \dfrac{B-A}{\varepsilon}$.
• Again the proof is omitted; however, it is easy to interpret the result as follows: in each step $L(\mathbf{y})$ increases at least by $\varepsilon$, so in the worst case the maximum number of steps needed to cover the distance $B-A$ is $(B-A)/\varepsilon$.
• With these tools at our disposal we can embark on ascertaining the stability and convergence properties of the Hopfield net.

Stability and convergence properties (9)
• To use the Lyapunov technique we have to assign a Lyapunov function to the recursion $y_i(k+1) = \mathrm{sgn}\big(\sum_j W_{ij} y_j(k) - b_i\big)$.
• According to Hopfield, Cohen and Grossberg, we define the corresponding Lyapunov function as $L(\mathbf{y}) = \mathbf{y}^T \mathbf{W} \mathbf{y} - 2\,\mathbf{b}^T \mathbf{y}$.

Stability and convergence properties (10)
• Now we want to apply Lyapunov's strong theorem, therefore we have to check the following three conditions:
1. the existence of a global upper bound;
2. the existence of a global lower bound;
3. that $\Delta L(k) \ge \varepsilon > 0$ holds in each step.

Stability and convergence properties (11) – The existence of a global upper bound
• To derive an upper bound we can use the Cauchy-Schwarz inequality:
$L(\mathbf{y}) = \mathbf{y}^T \mathbf{W} \mathbf{y} - 2\,\mathbf{b}^T\mathbf{y} \le \|\mathbf{W}\|\,\|\mathbf{y}\|^2 + 2\,\|\mathbf{b}\|\,\|\mathbf{y}\|$;
• taking into account that we are dealing with binary state vectors, elements of $\{-1,1\}^N$, for which $\|\mathbf{y}\|^2 = N$, we obtain the global upper bound $B = N\,\|\mathbf{W}\| + 2\sqrt{N}\,\|\mathbf{b}\|$.

Stability and convergence properties (12) – The existence of a global lower bound (1)
• To derive a global lower bound for $L(\mathbf{y})$, let us first broaden the state space from $Y = \{-1,1\}^N$ to the space of $N$-dimensional real vectors and define $L(\mathbf{x}) = \mathbf{x}^T \mathbf{W} \mathbf{x} - 2\,\mathbf{b}^T\mathbf{x}$ over this broadened state space. Therefore $\min_{\mathbf{y}\in Y} L(\mathbf{y}) \ge \min_{\mathbf{x}\in\mathbb{R}^N} L(\mathbf{x})$.
Stability and convergence properties (13) – The existence of a global lower bound (2)
• The minimum of $L(\mathbf{x}) = \mathbf{x}^T \mathbf{W} \mathbf{x} - 2\,\mathbf{b}^T\mathbf{x}$ over $\mathbb{R}^N$ can easily be calculated: since $\mathbf{x}$ varies over a continuum, the gradient of $L(\mathbf{x})$ exists, and from the well-known results related to quadratic forms the location of this extremum is given as $\mathbf{x}_{opt} = \mathbf{W}^{-1}\mathbf{b}$.

Stability and convergence properties (14) – The existence of a global lower bound (3)
• Substituting $\mathbf{x}_{opt}$ into $L(\mathbf{x})$ one obtains the global lower bound
$A = L(\mathbf{x}_{opt}) = \mathbf{b}^T\mathbf{W}^{-1}\mathbf{b} - 2\,\mathbf{b}^T\mathbf{W}^{-1}\mathbf{b} = -\mathbf{b}^T\mathbf{W}^{-1}\mathbf{b}$.

Stability and convergence properties (15) – The change of the Lyapunov function (1)
• Let us define the change of the Lyapunov function as $\Delta L(k) = L(\mathbf{y}(k+1)) - L(\mathbf{y}(k))$.

Stability and convergence properties (16) – The change of the Lyapunov function (2)
• We apply the sequential update rule, which means that only one component, $y_l(k)$, of the state vector $\mathbf{y}(k)$ changes in step $k$.

Stability and convergence properties (17) – The change of the Lyapunov function (3)
• Taking this into consideration, $\Delta L(k)$ takes the following form:
$\Delta L(k) = W_{ll}\,\Delta y_l^2(k) + 2\,\Delta y_l(k)\Big(\sum_j W_{lj}\, y_j(k) - b_l\Big)$,
where $\Delta y_l(k) = y_l(k+1) - y_l(k)$.

Stability and convergence properties (18) – The change of the Lyapunov function (4)
• Let us introduce the quantity $\nu = \sum_j W_{lj}\, y_j(k) - b_l$. To calculate the values of this expression we can consider the two possible state transitions (assuming $W_{ll} \ge 0$):
• Case $y_l(k) = -1 \to y_l(k+1) = +1$ (so $\nu > 0$): $\Delta y_l(k) = +2$, $W_{ll}\,\Delta y_l^2(k) = 4W_{ll}$, $2\,\Delta y_l(k)\,\nu = +4\nu > 0$, hence $\Delta L(k) = 4(W_{ll} + \nu) > 0$.
• Case $y_l(k) = +1 \to y_l(k+1) = -1$ (so $\nu < 0$): $\Delta y_l(k) = -2$, $W_{ll}\,\Delta y_l^2(k) = 4W_{ll}$, $2\,\Delta y_l(k)\,\nu = -4\nu > 0$, hence $\Delta L(k) = 4(W_{ll} - \nu) > 0$.

Stability and convergence properties (19) – The change of the Lyapunov function (5)
• From these cases it can be seen that whenever a state transition occurs, $\Delta L(k) > 0$.
• This means that in each step in which a state transition occurs the energy function increases.
• Given a global upper and lower bound, it follows that there exists a time step at which the algorithm must stop in a local maximum.

Stability and convergence properties (20)
• Theorem 3. The Hopfield type of recursion
1. is stable;
2. converges to a local maximum of the function $L(\mathbf{y}) = \mathbf{y}^T \mathbf{W} \mathbf{y} - 2\,\mathbf{b}^T\mathbf{y}$;
3. its transient time can be upper-bounded as $K \le \dfrac{B-A}{\varepsilon}$, which is a polynomial (of order $N^2$) number of steps.
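A small numerical check of Theorem 3 (a sketch, not part of the slides): along a sequential run with a symmetric weight matrix and non-negative diagonal, the quadratic form L(y) = y^T W y − 2 b^T y never decreases.

```python
import numpy as np

def L(y, W, b):                      # the Lyapunov function of the Hopfield net
    return y @ W @ y - 2 * b @ y

rng = np.random.default_rng(1)
N = 8
W = rng.standard_normal((N, N)); W = (W + W.T) / 2
np.fill_diagonal(W, np.abs(np.diag(W)))     # W_ll >= 0, as the case analysis assumes
b = rng.standard_normal(N)
y = rng.choice([-1, 1], size=N)

values = [L(y, W, b)]
for k in range(50 * N):                      # sequential updates
    i = k % N
    y[i] = 1 if W[i] @ y - b[i] >= 0 else -1
    values.append(L(y, W, b))

# Monotone non-decreasing energy, as guaranteed by the theorem.
assert all(v2 >= v1 - 1e-9 for v1, v2 in zip(values, values[1:]))
```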
Hopfield Neural Network as an AM (1)
• Analyzing the nonlinear state recursion of the Hopfield Neural Network we have come to the conclusion that it is a finite state automaton with binary state vectors, which gives rise to a new application of this network: Associative Mapping (AM).
• We start the HNN from an initial state vector x, which corresponds to a corrupted version of a stored memory item; we call this vector the clue.

Hopfield Neural Network as an AM (2)
• Then we start the iteration $y_i(k+1) = \mathrm{sgn}\big(\sum_j W_{ij}\, y_j(k) - b_i\big)$ of the network, and if the network gets stuck in one of its steady states, we call this vector the recalled memory item.
• This mapping is the so-called Associative Mapping.

Hopfield Neural Network as an AM (3)
• This new computational paradigm gives rise to the model in the figure, where the $N$-dimensional binary vectors are mapped into two dimensions. The box represents the state space $Y = \{-1,1\}^N$, and it contains a couple of fixed points of the HNN: $\mathbf{s}^{(1)}, \mathbf{s}^{(2)}, \mathbf{s}^{(3)}, \mathbf{s}^{(4)}$.
• An Associative Mapping is a partitioning of the space $Y = \{-1,1\}^N$.

Hopfield Neural Network as an AM (4)
• There is a separation of this state space in the sense that if we start the network from a state x which falls into the basin of attraction of the memory pattern $\mathbf{s}^{(4)}$, then the network will finally get stuck in this steady state.
• In general, the same holds for the other memory patterns and their basins of attraction.

Hopfield Neural Network as an AM (5)
• A fairly general demonstration of the working of an Associative Mapping is depicted in the figure (a demonstration of how Associative Mapping works).

Hopfield Neural Network as an AM (6)
• Let us assume that there are some stored memory items, for example a picture of a vase, a cat and a lorry; these are the three patterns.
• An Associative Mapping means that if we have a corrupted and incomplete version of one of the memory patterns, then it will be mapped to the stored item which is closest to it.

Hopfield Neural Network as an AM (7)
• In order to use the Hopfield Neural Network to implement this new computational paradigm, we have to make sure that the network is stable, meaning that the network will converge to a steady state.
• When we investigated stability, we arrived at the Lyapunov concept of stability: the quadratic form $L(\mathbf{y}) = \mathbf{y}^T\mathbf{W}\mathbf{y} - 2\,\mathbf{b}^T\mathbf{y}$ is the Lyapunov function of the HNN, and we have proven that the Hopfield Network is stable.

Hopfield Neural Network as an AM (8)
• Furthermore, we have come to the conclusion that the transient time of this network is $O(N^2)$, which is much faster than exhaustive search, $O(2^N)$.
• When one implements an Associative Memory, there is a given set of items to be stored; this is called the set of stored memory items and is denoted by S.
• Here the number of stored memory items is $M$, and the items are binary vectors of dimension $N$: $S = \{\mathbf{s}^{(m)} \in \{-1,1\}^N,\ m = 1,\dots,M\}$.

Hopfield Neural Network as an AM (9)
• These binary vectors can refer to images, speech patterns or bank account numbers, depending on the actual application.
• The set X represents the observation space; this can be any possible $N$-dimensional binary vector, $X = \{-1,1\}^N$.
• We can see that $S \subseteq X$, because each $\mathbf{s}^{(m)}$ is also an $N$-dimensional binary vector.

Hopfield Neural Network as an AM (10)
• Definition 1. An Associative Mapping is defined as a mapping from the observation space X to the set of stored memory items S, in such a way that it maps any specific observation vector x into the stored memory item $\mathbf{s}^{(\beta)}$ which is closest to the observation vector according to a certain distance criterion d(·).
• Formally, the AM is a mapping $\Phi: X \to S$, $\Phi(\mathbf{x}) = \mathbf{s}^{(\beta)}$ with $\beta = \arg\min_m d\big(\mathbf{x}, \mathbf{s}^{(m)}\big)$.

Hopfield Neural Network as an AM (11)
• We can define the distance in any way which suits our application. However, when we deal with binary vectors it is rather plausible to use the Hamming distance, which measures in how many bits two vectors differ from each other.
• There are two fundamental attributes of an Associative Memory: 1) stability, 2) capacity.
• Capacity boils down to the question of how large the set of stored memory items can be; in our notation $M = |S|$. When we speak of the analysis of the Hopfield Neural Network as an Associative Mapping, we are trying to reveal these two properties.

Capacity Analysis (1)
• When we use the Hopfield Neural Network as an Associative Memory, the threshold vector b is set to the all-zero vector 0, and if we want to store a predefined set of memory items in the network, the components of the weight matrix are chosen as
$W_{ij} = \frac{1}{N}\sum_{m=1}^{M} s_i^{(m)} s_j^{(m)}$,
• which can be written in vector notation using the outer product operation: $\mathbf{W} = \frac{1}{N}\sum_{m=1}^{M} \mathbf{s}^{(m)}\,\mathbf{s}^{(m)T}$.

Capacity Analysis (2)
• This rule was discovered and fully elaborated in the work of D. O. Hebb³, a Canadian-born psychologist.
• That is why it is called the Hebbian Learning Rule. He described this rule as a psychologist, in a textual way, and the learning rule above was mathematically inferred from this description.
³ Donald Olding Hebb (July 22, 1904 - August 10, 1985) was a psychologist who was influential in the area of neuropsychology, where he sought to understand how the function of neurons contributed to psychological processes such as learning. He has been described as the father of neuropsychology and neural networks.
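A short sketch of the Hebbian rule and of an associative recall built on it; the 1/N normalisation follows the slides, while zeroing the diagonal, the pattern sizes and the amount of corruption are illustrative assumptions.

```python
import numpy as np

def hebbian_weights(patterns):
    """patterns: M x N matrix with entries in {-1,+1}."""
    M, N = patterns.shape
    W = (patterns.T @ patterns) / N          # (1/N) * sum_m s^(m) s^(m)^T
    np.fill_diagonal(W, 0.0)                 # common convention: no self-feedback
    return W

def recall(W, clue, sweeps=20):
    y = clue.copy()
    for _ in range(sweeps):
        for i in range(len(y)):              # sequential update, b = 0
            y[i] = 1 if W[i] @ y >= 0 else -1
    return y

rng = np.random.default_rng(2)
N, M = 100, 5
S = rng.choice([-1, 1], size=(M, N))         # stored memory items (coin flips)
W = hebbian_weights(S)
clue = S[0].copy()
flip = rng.choice(N, size=10, replace=False) # corrupt 10 bits of the first item
clue[flip] *= -1
print(np.array_equal(recall(W, clue), S[0])) # usually True for M << N
```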
Capacity Analysis (3)
• By capacity we mean how many stored memory items can be recalled from the Hopfield Neural Network.
• In order to determine it we start with a very elementary investigation and then penetrate deeper and deeper into the capacity analysis, until we arrive at the stage of 1. Statistical Neurodynamics and 2. the Information Theoretical Capacity.

Capacity Analysis (4) – Static Capacity (1)
• The first issue is the so-called Static Capacity (or Fixed Point) Analysis. Recall that there is a set of stored memory items, represented by binary vectors $\mathbf{s}^{(m)} \in \{-1,1\}^N$, $m = 1,\dots,M$,
• and the element $W_{lj}$ of the weight matrix is set according to the Hebbian Learning Rule, $W_{lj} = \frac{1}{N}\sum_{m=1}^{M} s_l^{(m)} s_j^{(m)}$.

Capacity Analysis (5) – Static Capacity (2)
• What we are going to investigate now is the following: if we pick any stored memory vector $\mathbf{s}^{(\beta)} \in S$, then in order to recall this vector we have to make sure that it is a fixed point of the Hopfield Network, which means that the recursion exhibits an equilibrium behaviour: once we have reached $\mathbf{s}^{(\beta)}$ we cannot get out of this state. This can be written in vector notation as
$\mathbf{s}^{(\beta)} = \mathrm{sgn}\big(\mathbf{W}\mathbf{s}^{(\beta)}\big)$.

Capacity Analysis (6) – Static Capacity (3)
• One can see that what we have spelled out here is the fixed point condition of this nonlinear recursion, which is a necessary condition for $\mathbf{s}^{(\beta)}$ being a stored memory item.
• The question is under what condition we can enforce this equality, i.e. what is the upper limit on the capacity for which the equation $\mathbf{s}^{(\beta)} = \mathrm{sgn}\big(\mathbf{W}\mathbf{s}^{(\beta)}\big)$ holds. Since we investigate this equation with respect to the number of stored memory items $M$, we are going to derive a condition on $M$ under which this equation holds.

Capacity Analysis (7) – Static Capacity (4)
• In order to simplify the analysis let us analyze the component-wise equation
$s_l^{(\beta)} = \mathrm{sgn}\Big(\sum_{j=1}^{N} W_{lj}\, s_j^{(\beta)}\Big)$,
which should hold for all components in order to obtain a steady state. We can replace $W_{lj}$ with its definition and write
$s_l^{(\beta)} = \mathrm{sgn}\Big(\sum_{j=1}^{N} \frac{1}{N}\sum_{m=1}^{M} s_l^{(m)} s_j^{(m)} s_j^{(\beta)}\Big)$.

Capacity Analysis (8) – Static Capacity (5)
• However, these are finite sums, so we can exchange the order of the summations and rewrite the expression as
$s_l^{(\beta)} = \mathrm{sgn}\Big(\frac{1}{N}\sum_{m=1}^{M} s_l^{(m)} \sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}\Big)$.
• From the outer summation, where $m$ sweeps from 1 to $M$, we can single out the term $m = \beta$ (as $\mathbf{s}^{(\beta)}$ is an element of the set S, $m$ hits the value $\beta$ once), so we can rewrite the previous expression as
$s_l^{(\beta)} = \mathrm{sgn}\Big(\frac{1}{N}\sum_{j=1}^{N} s_l^{(\beta)} \big(s_j^{(\beta)}\big)^2 + \frac{1}{N}\sum_{m\ne\beta} s_l^{(m)} \sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}\Big)$.
Capacity Analysis (9) – Static Capacity (6)
• In the first part of the expression we have the square of $s_j^{(\beta)}$, which is always 1 because $\mathbf{s}^{(\beta)} \in \{-1,1\}^N$; adding up 1 $N$ times gives $N$, which divided by $N$ yields 1, so the first term is simply $s_l^{(\beta)}$. Let us denote the second part of the previous expression by $\eta_l$, which gives
$s_l^{(\beta)} = \mathrm{sgn}\big(s_l^{(\beta)} + \eta_l\big)$, where $\eta_l = \frac{1}{N}\sum_{m\ne\beta} s_l^{(m)} \sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}$.
• Now we investigate under what condition the above equation holds. We can see that if $\eta_l$ is zero then the equation is indeed satisfied. This criterion can be satisfied if the capacity is smaller than $N$, as shown next.

Capacity Analysis (10) – Static Capacity (7)
• If we investigate the expression of $\eta_l$, what we see is the inner product of the memory vectors: $\eta_l = \frac{1}{N}\sum_{m\ne\beta} s_l^{(m)}\, \mathbf{s}^{(m)T}\mathbf{s}^{(\beta)}$. If $\mathbf{s}^{(m)T}\mathbf{s}^{(\beta)} = 0$ for every $m \ne \beta$, then $\eta_l = 0$.
• This entails that the memory items must be orthogonal to each other, because their inner products should be 0. However, we have $N$-dimensional memory items, and in an $N$-dimensional space the maximum number of mutually orthogonal vectors is $N$, so the static capacity is at most $M = N$.

Capacity Analysis (11) – Dynamic Capacity (1)
• What we have investigated so far was a fixed point analysis, but it does not necessarily entail that the Hopfield Network will converge to this fixed point. Now we pursue a further investigation into this capacity matter: in the second stage we evaluate the Dynamic Capacity of the Hopfield Network, where the notion of Dynamic Capacity implies that we are investigating the steady states.

Capacity Analysis (12) – Dynamic Capacity (2)
• Definition 2 (Steady state). Steady states are the subset of fixed points into which the Hopfield Network converges.
• The stability of the Hopfield Network was proven by using the Lyapunov concept of stability, where the central point of the concept was that there is a Lyapunov function associated with the Hopfield Network.
• Since in the case of applying the HNN as an Associative Mapping the vector b is zero, what remains is $L(\mathbf{y}) = \mathbf{y}^T\mathbf{W}\mathbf{y}$.

Capacity Analysis (13) – Dynamic Capacity (3)
• We have proven that the Hopfield Network is stable and converges to one of the local maxima of this Lyapunov function.
• As a result, if we want to make sure that a vector $\mathbf{s}^{(\beta)}$, taken from the set of stored memory items S, is a steady state, then it is not enough for it to be a fixed point; we also have to make sure that the Lyapunov function has a local maximum at $\mathbf{s}^{(\beta)}$, meaning
$L\big(\mathbf{s}^{(\beta)}\big) > L(\mathbf{y})$ for every state $\mathbf{y}$ in the neighbourhood of $\mathbf{s}^{(\beta)}$ (e.g. at Hamming distance 1 from it).

Capacity Analysis (14) – Dynamic Capacity (4)
• First we deal with the Lyapunov function at the point $\mathbf{s}^{(\beta)}$, which can be fully spelled out as
$L\big(\mathbf{s}^{(\beta)}\big) = \sum_{i=1}^{N}\sum_{j=1}^{N} W_{ij}\, s_i^{(\beta)} s_j^{(\beta)}$.
• We can substitute the definition of $W_{ij}$ (the Hebbian rule) into this expression.
Capacity Analysis (15) – Dynamic Capacity (5)
• These finite summations can be rearranged by collecting the terms which depend on the indices $i$ and $j$:
$L\big(\mathbf{s}^{(\beta)}\big) = \frac{1}{N}\sum_{m=1}^{M}\Big(\sum_{i=1}^{N} s_i^{(m)} s_i^{(\beta)}\Big)\Big(\sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}\Big)$.
• Note that the summation with respect to $i$ is the same as the summation with respect to $j$, so we can write this in a more compact form:
$L\big(\mathbf{s}^{(\beta)}\big) = \frac{1}{N}\sum_{m=1}^{M}\Big(\sum_{i=1}^{N} s_i^{(m)} s_i^{(\beta)}\Big)^2$.

Capacity Analysis (16) – Dynamic Capacity (6)
• This gives rise to the following formula if we notice that the expression contains the inner products of pairs of memory items:
$L\big(\mathbf{s}^{(\beta)}\big) = \frac{1}{N}\sum_{m=1}^{M}\big(\mathbf{s}^{(m)T}\mathbf{s}^{(\beta)}\big)^2$.
• However, in the very first investigation we pointed out that the stored memory items should be orthogonal to each other. When $m$ sweeps from 1 to $M$ it hits $\beta$ once, and this term can be singled out, giving
$L\big(\mathbf{s}^{(\beta)}\big) = \frac{1}{N}\Big(\big(\mathbf{s}^{(\beta)T}\mathbf{s}^{(\beta)}\big)^2 + \sum_{m\ne\beta}\big(\mathbf{s}^{(m)T}\mathbf{s}^{(\beta)}\big)^2\Big)$.

Capacity Analysis (17) – Dynamic Capacity (7)
• Due to the orthogonality the second term is zero, giving $L\big(\mathbf{s}^{(\beta)}\big) = \frac{1}{N}\,N^2 = N$.
• We have evaluated the left hand side of the inequality $L\big(\mathbf{s}^{(\beta)}\big) > L(\mathbf{y})$, and now we evaluate the right hand side in a very similar manner.

Capacity Analysis (18) – Dynamic Capacity (8)
• For an arbitrary state $\mathbf{y}$ the same rearrangement gives $L(\mathbf{y}) = \frac{1}{N}\sum_{m=1}^{M}\big(\mathbf{s}^{(m)T}\mathbf{y}\big)^2$.

Capacity Analysis (19) – Dynamic Capacity (9)
• If we have two binary vectors of dimension $N$ with components -1 and +1, it can be verified that the inner product of two vectors $\mathbf{a}, \mathbf{b} \in \{-1,1\}^N$ can be expressed by means of the Hamming distance as
$\mathbf{a}^T\mathbf{b} = N - 2\,d(\mathbf{a},\mathbf{b})$,
• where the Hamming distance d(·,·) is the number of components in which the two binary vectors differ from each other. Using this equation we can rewrite L(y) as
$L(\mathbf{y}) = \frac{1}{N}\sum_{m=1}^{M}\big(N - 2\,d(\mathbf{s}^{(m)},\mathbf{y})\big)^2$.

Capacity Analysis (20) – Dynamic Capacity (10)
• Now we provide an upper bound on this expression, taking into account that the minimum Hamming distance by which the state vector can differ from any stored memory item is 1. As a result we can write
$L(\mathbf{y}) \le \frac{1}{N}\,M\,(N-2)^2$.
• Then we are done, because requiring $L\big(\mathbf{s}^{(\beta)}\big) = N > \frac{M(N-2)^2}{N}$ yields the condition $M < \dfrac{N^2}{(N-2)^2}$.

Capacity Analysis (21) – Dynamic Capacity (10)
• If this is fulfilled, then the stored memory items are indeed going to be steady states, because the inequality holds, the Hopfield Network converges to one of the local maxima of the Lyapunov function, and this local maximum is at the stored memory item; as a result it is a steady state.
• However, this result is devastating, because this capacity is very small: the number of stored memory items is severely limited by this expression, and asymptotically it converges to 1, giving an unusable associative mapping capable of storing one memory item.
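A numerical check (sketch) of two steps used above: the identity a^T b = N − 2 d(a,b) and L(s^(β)) = N for mutually orthogonal stored patterns. Orthogonal ±1 patterns are taken here from a Sylvester–Hadamard matrix, which is an illustrative assumption; any orthogonal set would do.

```python
import numpy as np

def hadamard(n):                       # Sylvester construction; n must be a power of two
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

N, M = 16, 4
S = hadamard(N)[:M]                    # M orthogonal stored patterns, rows in {-1,+1}^N
W = (S.T @ S) / N                      # Hebbian rule, b = 0

a, c = S[0], S[1]
d_H = np.sum(a != c)
assert a @ c == N - 2 * d_H            # inner product expressed via Hamming distance

def L(y):                              # Lyapunov function with b = 0
    return y @ W @ y

print(L(S[0]))                         # equals N for orthogonal patterns
y = S[0].copy(); y[0] *= -1            # a one-bit neighbour of the first pattern
print(L(y))                            # ((N-2)^2 + 4*(M-1)) / N, strictly below N
```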
Capacity Analysis (22) – Dynamic Capacity (11)
• The Dynamic Capacity of the Hopfield Network (figure omitted: the bound $N^2/(N-2)^2$ as a function of $N$, tending to 1).

Capacity Analysis (23) – Information Theoretical Capacity (1)
• As we have seen, the Dynamic Capacity of the Hopfield Neural Network tends to one as $N$ (the dimension of the stored patterns) increases to infinity. This strongly discourages us from using these networks as associative memories.
• In this section we describe the solution for getting out of this dead-lock.

Capacity Analysis (24) – Information Theoretical Capacity (2)
• The underlying assumption when we investigate the IT capacity of the HNN is that we choose the stored memory patterns as random variables. As a result each component $s_j^{(m)}$ is an independent, identically distributed Bernoulli random variable with
$P\big(s_j^{(m)} = 1\big) = P\big(s_j^{(m)} = -1\big) = \tfrac{1}{2}$, so $E\big\{s_j^{(m)}\big\} = 0$.

Capacity Analysis (25) – Information Theoretical Capacity (3)
• This entails that the stored memory patterns are chosen as random series of coin flips.
• (Figure omitted: orthogonal versus random patterns.)

Capacity Analysis (26) – Information Theoretical Capacity (4)
• What we investigate is whether a memory vector $\mathbf{s}^{(\beta)}$ is a fixed point of the HNN or not. This is a random event, because the components of $\mathbf{s}^{(\beta)}$ are chosen randomly.
• Furthermore, applying the Hebbian Learning Rule, the weight matrix is random as well.
• So we can investigate the probability of $\mathbf{s}^{(\beta)}$ being a fixed point: $P\big(\mathbf{s}^{(\beta)} = \mathrm{sgn}(\mathbf{W}\mathbf{s}^{(\beta)})\big)$.

Capacity Analysis (27) – Information Theoretical Capacity (5)
• Definition 3. The Information Theoretical Capacity is the number of stored memory items $M$ which still guarantees that the probability
$P\big(\mathbf{s}^{(\beta)} = \mathrm{sgn}(\mathbf{W}\mathbf{s}^{(\beta)})\big)$
asymptotically goes to one as $N$ tends to infinity.

Capacity Analysis (28) – Information Theoretical Capacity (6)
• In order to evaluate this probability we have to perform some calculations. It can be rewritten into the form
$P\Big(s_l^{(\beta)} = \mathrm{sgn}\big(\textstyle\sum_j W_{lj}\, s_j^{(\beta)}\big),\ l = 1,\dots,N\Big)$,
• because it is a vector-valued equation, and the original probability holds exactly if the equation holds for every component of the vector.
Capacity Analysis (29) – Information Theoretical Capacity (7)
• Since the components of $\mathbf{s}^{(\beta)}$ are independent random variables, this probability factorizes into the product of identical component-wise probabilities; replacing $W_{lj}$ with its definition coming from the Hebbian Learning Rule, we obtain
$P\Big(s_l^{(\beta)} = \mathrm{sgn}\Big(\frac{1}{N}\sum_{j=1}^{N}\sum_{m=1}^{M} s_l^{(m)} s_j^{(m)} s_j^{(\beta)}\Big)\Big)^N$.
• Restructuring this double summation we get to the following:
$P\Big(s_l^{(\beta)} = \mathrm{sgn}\Big(\frac{1}{N}\sum_{m=1}^{M} s_l^{(m)}\sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}\Big)\Big)^N$.

Capacity Analysis (30) – Information Theoretical Capacity (8)
• Since $\mathbf{s}^{(\beta)}$ is among the stored memory patterns, $m$ hits $\beta$ once, and we can single out that case in the summation, which results in
$P\Big(s_l^{(\beta)} = \mathrm{sgn}\Big(s_l^{(\beta)} + \frac{1}{N}\sum_{m\ne\beta} s_l^{(m)}\sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}\Big)\Big)^N$,
• where the singled-out term is $\frac{1}{N}\sum_j \big(s_j^{(\beta)}\big)^2\, s_l^{(\beta)} = s_l^{(\beta)}$, because we are dealing with binary vectors with values $\{+1,-1\}$: raising these values to the square always gives +1, which added up $N$ times gives $N$.

Capacity Analysis (31) – Information Theoretical Capacity (9)
• Let us introduce a new variable
$\zeta_l = \frac{1}{N}\sum_{m\ne\beta}\sum_{j=1}^{N} s_l^{(\beta)} s_l^{(m)} s_j^{(m)} s_j^{(\beta)}$.
• $\zeta_l$ is a random variable as well, because the product of Bernoulli random variables is again a Bernoulli random variable. What we have in this expression is a double summation of Bernoulli random variables.

Capacity Analysis (32) – Information Theoretical Capacity (10)
• Due to the Central Limit Theorem this approximates a Gaussian random variable:
• since the mean of each Bernoulli term is zero, the mean of the Gaussian random variable is zero as well; and because the standard deviation of each Bernoulli term is one, and we add up $(M-1)N$ such terms and normalize by $N$, the standard deviation of the Gaussian random variable is $\sqrt{\dfrac{M-1}{N}}$.

Capacity Analysis (33) – Information Theoretical Capacity (11)
• Furthermore, since we are performing an asymptotic investigation where $N$ tends to infinity, replacing $M-1$ with $M$ does not make any difference, so finally $\zeta_l$ is a zero-mean Gaussian with standard deviation $\sqrt{M/N}$.
• What we have arrived at is:
$P\big(\mathbf{s}^{(\beta)} = \mathrm{sgn}(\mathbf{W}\mathbf{s}^{(\beta)})\big) = P\big(s_l^{(\beta)} = \mathrm{sgn}\big(s_l^{(\beta)}(1 + \zeta_l)\big)\big)^N$.

Capacity Analysis (34) – Information Theoretical Capacity (12)
• This can be expressed by means of conditional probabilities, namely by conditioning on the value of $s_l^{(\beta)}$, where we know that $P\big(s_l^{(\beta)} = 1\big) = P\big(s_l^{(\beta)} = -1\big) = \tfrac{1}{2}$.

Capacity Analysis (35) – Information Theoretical Capacity (13)
• We can rewrite these probabilities taking into account that the sgn() function returns the sign of $s_l^{(\beta)}$ for $s_l^{(\beta)}(1 + \zeta_l)$ exactly if $(1 + \zeta_l) > 0$ holds, which results in
$P\big(\mathbf{s}^{(\beta)} = \mathrm{sgn}(\mathbf{W}\mathbf{s}^{(\beta)})\big) = P\big(1 + \zeta_l > 0\big)^N = P\big(\zeta_l > -1\big)^N$.
• Now we can take advantage of the fact that $\zeta_l$ is Gaussian:
$P\big(\zeta_l > -1\big) = 1 - \Phi\Big(\frac{-1}{\sqrt{M/N}}\Big) = \Phi\Big(\sqrt{\frac{N}{M}}\Big)$.

Capacity Analysis (36) – Information Theoretical Capacity (14)
• $\Phi(u)$ is the cumulative distribution function of the standardized Gaussian variable, with the point symmetry property $1 - \Phi(-u) = \Phi(u)$.
• What we have arrived at is the following:
$P\big(\mathbf{s}^{(\beta)} = \mathrm{sgn}(\mathbf{W}\mathbf{s}^{(\beta)})\big) = \Phi\Big(\sqrt{\frac{N}{M}}\Big)^N$,
• which is a simple formula for investigating this probability.

Capacity Analysis (37) – Information Theoretical Capacity (15)
• In Definition 3 we are investigating under what condition this probability tends to one. This is equivalent to investigating when the logarithm of this probability tends to zero:
$\lim_{N\to\infty} N\,\ln \Phi\Big(\sqrt{\frac{N}{M}}\Big) = 0$.

Capacity Analysis (38) – Information Theoretical Capacity (16)
• Recall the following facts from asymptotic analysis: for large $u$, $1 - \Phi(u) \approx \dfrac{1}{u\sqrt{2\pi}}\,e^{-u^2/2}$, and for small $x$, $\ln(1-x) \approx -x$.
• Having these approximations at hand, we can see that as $N$ tends to infinity the argument $\sqrt{N/M}$ of $\Phi(\cdot)$ becomes very large, so we can approximate $1 - \Phi\big(\sqrt{N/M}\big)$ with the formula above.

Capacity Analysis (39) – Information Theoretical Capacity (17)
• Now $1 - \Phi\big(\sqrt{N/M}\big)$, the quantity subtracted inside the logarithm when we write $\ln\Phi = \ln\big(1 - (1-\Phi)\big)$, tends to zero as $N$ goes to infinity, so we can use $\ln(1-x) \approx -x$ to further rewrite the expression:
$N\,\ln \Phi\Big(\sqrt{\frac{N}{M}}\Big) \approx -N\,\Big(1 - \Phi\big(\sqrt{N/M}\big)\Big) \approx -N\,\sqrt{\frac{M}{2\pi N}}\;e^{-\frac{N}{2M}}$.

Capacity Analysis (40) – Information Theoretical Capacity (18)
• Having this result, we have to find the $M$ for which this expression tends to zero, because we have evaluated the logarithm of the initial probability, and this logarithm should tend to zero if we want the probability to tend to one.
• We can see that this is not difficult to achieve: choosing for example $M = \dfrac{N}{2\ln N}$ makes the exponential factor $e^{-N/(2M)} = 1/N$, so the whole expression behaves as $-1/\sqrt{4\pi\ln N} \to 0$. The information theoretical capacity of the Hopfield net therefore grows essentially as $N/(2\ln N)$.
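A Monte-Carlo sketch of Definition 3 (the parameter values are illustrative): estimate the probability that a randomly drawn pattern is a fixed point under the Hebbian rule, and compare it with the asymptotic expression Φ(√(N/M))^N derived above. For finite N the two agree only roughly, since the formula relies on the CLT and asymptotic approximations.

```python
from math import erf, sqrt
import numpy as np

def Phi(u):                                   # standard normal CDF
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

rng = np.random.default_rng(3)
N, trials = 200, 200

for M in (5, 10, 20, 40):
    hits = 0
    for _ in range(trials):
        S = rng.choice([-1, 1], size=(M, N))          # coin-flip patterns
        W = (S.T @ S) / N                             # Hebbian rule, b = 0
        s = S[0]
        if np.array_equal(np.where(W @ s >= 0, 1, -1), s):
            hits += 1                                 # s is a fixed point of sgn(W s)
    print(M, hits / trials, Phi(sqrt(N / M)) ** N)    # empirical vs asymptotic formula
```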
Capacity Analysis (41) – Statistical Neurodynamics (1)
• When deriving the IT capacity of the HNN we used only a fixed point analysis; we did not pay attention to the dynamics of the network. We still have to investigate the dynamics of the HNN: whether we really converge to the stored memory items if they are chosen to be Bernoulli random variables.
• Statistical Neurodynamics relies on the theory of Statistical Physics, where we investigate very large systems containing a lot of particles (for example the oxygen atoms in a room).

Capacity Analysis (42) – Statistical Neurodynamics (2)
• We can characterize the state of the system by micro-states $\mathbf{y}(k)$, but this description is meaningless, because it contains too much information and does not reveal any important property of the system.
• In order to really characterize the statistical system, the concept of macro-states can be developed.

Capacity Analysis (43) – Statistical Neurodynamics (3)
• If we know the state transition rule between the micro-states, Statistical Physics wants to derive the state transition rule for the macro-states.
• For example, if we take a room, we can observe the positions of the oxygen atoms in it. A micro-state of this system would be characterized by giving the coordinates of each oxygen atom, which would be so much data that we could not retrieve any meaningful information from it.

Capacity Analysis (44) – Statistical Neurodynamics (4)
• However, defining the macro-state as, say, the average spatial distribution of the oxygen atoms gives meaningful information.
• And when we are investigating the state transition, then for example if we turn up the heating in one corner, we know from the basic rules of physics how the micro-states will change, and we then investigate how the macro-state will change: how the temperature will affect the average distribution of the oxygen atoms.

Capacity Analysis (45) – Statistical Neurodynamics (5)
• We define the macro-state of the network as the overlap between the current state and a stored pattern:
$a(k) = \frac{1}{N}\sum_{i=1}^{N} s_i^{(\mu)}\, y_i(k) = \frac{1}{N}\,\mathbf{s}^{(\mu)T}\mathbf{y}(k)$.
• The meaning of this macro-state is represented graphically in the next figure (representation of the macro-state, omitted).

Capacity Analysis (46) – Statistical Neurodynamics (6)
• The macro-state $a(k)$ corresponds to $\cos(\theta(k))$, the cosine of the angle between $\mathbf{y}(k)$ and $\mathbf{s}^{(\mu)}$. If we can prove that $a(k+1) > a(k)$ and that $a(k)$ converges to 1, then the cosine of $\theta(k)$ converges to 1, which means that $\theta(k)$ converges to 0. Consequently, if $\theta(k)$ tends to 0, the micro-state $\mathbf{y}(k)$ converges to $\mathbf{s}^{(\mu)}$.

Capacity Analysis (47) – Statistical Neurodynamics (7)
• Throughout this discussion we restrict ourselves to $\mathbf{s}^{(\mu)} = (1, 1, \dots, 1)$ for the sake of simpler formulas, but the results can also be derived for arbitrary patterns. With this special vector the macro-state becomes
$a(k) = \frac{1}{N}\sum_{i=1}^{N} y_i(k)$.

Capacity Analysis (48) – Statistical Neurodynamics (8)
• This is an empirical average of the random variables $y_i(k)$, which approximates the expected value $E\{y_i(k)\}$ due to the Law of Large Numbers.
• We would like to derive the macro-state transition rule; for this we start by writing down the 0th macro-state, $a(0) = \frac{1}{N}\sum_i y_i(0)$,
• after which we take the 1st macro-state, which is as follows:
$a(1) = \frac{1}{N}\sum_{i=1}^{N} y_i(1) = \frac{1}{N}\sum_{i=1}^{N} \mathrm{sgn}\Big(\sum_{j=1}^{N} W_{ij}\, y_j(0)\Big)$.
Capacity Analysis (49) – Statistical Neurodynamics (9)
• This is an empirical average of random variables: $W_{ij}$ is chosen according to the Hebbian Learning Rule and $s_i^{(\mu)}$ follows a Bernoulli distribution, which is why $y_i(1)$ is a Bernoulli random variable. Elaborating on its expected value we can write
$a(1) \approx E\{y_i(1)\} = E\Big\{\mathrm{sgn}\Big(\sum_{j} W_{ij}\, y_j(0)\Big)\Big\}$.

Capacity Analysis (50) – Statistical Neurodynamics (10)
• After this we can substitute the definition of $W_{ij}$, which comes from the Hebbian Learning Rule:
$\sum_j W_{ij}\, y_j(0) = \sum_{j}\frac{1}{N}\sum_{m=1}^{M} s_i^{(m)} s_j^{(m)} y_j(0)$;
• the two summations can be reordered, which yields
$\frac{1}{N}\sum_{m=1}^{M} s_i^{(m)} \sum_{j=1}^{N} s_j^{(m)} y_j(0)$.

Capacity Analysis (51) – Statistical Neurodynamics (11)
• The first summation sweeps through all possible values of $m$, which includes the index $\mu$ of the recalled pattern, so we can single this term out and write
$s_i^{(\mu)}\,\frac{1}{N}\sum_j s_j^{(\mu)} y_j(0) + \frac{1}{N}\sum_{m\ne\mu} s_i^{(m)} \sum_j s_j^{(m)} y_j(0)$;
• and because we have chosen the pattern in a special way, namely $\mathbf{s}^{(\mu)} = (1,1,\dots,1)$, the first term becomes $\frac{1}{N}\sum_j y_j(0) = a(0)$.

Capacity Analysis (52) – Statistical Neurodynamics (12)
• This is exactly the value of the macro-state at the 0th time instant.
• Elaborating on the second term, we can see that it is a normalized summation of Bernoulli random variables, and due to the Central Limit Theorem it approximates a Gaussian random variable $\zeta_i$ with zero mean and standard deviation $\sqrt{(M-1)/N}$, so $y_i(1) \approx \mathrm{sgn}\big(a(0) + \zeta_i\big)$.

Capacity Analysis (53) – Statistical Neurodynamics (13)
• Considering that we are interested in the asymptotic behaviour of the Hopfield Network, when $N$ tends to infinity writing $M$ instead of $M-1$ does not make any difference, which gives us the macro-state transition in the form
$a(1) \approx E\big\{\mathrm{sgn}\big(a(0) + \zeta\big)\big\}$, with $\zeta$ a zero-mean Gaussian of standard deviation $\sqrt{M/N}$.

Capacity Analysis (54) – Summary of the capacity
• Capacity limits of the Hopfield net (summary figure omitted): static capacity of at most $N$ orthogonal patterns, dynamic capacity tending to 1, and information theoretical capacity on the order of $N/(2\ln N)$ for random patterns.
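A small sketch of the macro-state dynamics: for a zero-mean Gaussian ζ with standard deviation √(M/N), the expectation E{sgn(a + ζ)} equals 2Φ(a√(N/M)) − 1, and iterating this one-step rule (the slides derive only the first step, so repeating it is an assumption here) shows the overlap being driven towards 1 when M is small relative to N.

```python
from math import erf, sqrt

def phi(u):                       # standard normal CDF
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

def macro_trajectory(a0, N, M, steps=20):
    """Iterate a(k+1) = 2*Phi(a(k)*sqrt(N/M)) - 1, i.e. E{sgn(a(k) + zeta)}."""
    a = a0
    traj = [a]
    for _ in range(steps):
        a = 2.0 * phi(a * sqrt(N / M)) - 1.0
        traj.append(a)
    return traj

# With M well below the capacity limit the overlap is driven towards 1
# (the micro-state converges to the stored pattern); with M too large it is not.
print(macro_trajectory(0.3, N=1000, M=50)[-1])    # close to 1
print(macro_trajectory(0.3, N=1000, M=600)[-1])   # stays well below 1
```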
Outline of applying the HNN as a combinatorial optimizer
• Defining the problem set of combinatorial optimization
• Binary quadratic programming as a combinatorial optimization problem
• HNN as a combinatorial optimizer: philosophy, minimization modification, constraint mapping
• Travelling salesman problem: problem formulation, cost function as a quadratic function, constraints as a linear combination of penalty functions
• TSP problem mapped into a quadratic form, solution with HNN
• HNN for ISI corrupted signal detection: problem formulation, mapping the problem into a quadratic form, solution with HNN, performance analysis
• Examples
• Summary

Combinatorial optimization task
• A discrete optimization task is an optimization task where the domain of the objective function and of the constraints is a discrete set, e.g. $\{0,1\}^N$ or $\{-1,1\}^N$.

Combinatorial optimization task
• A combinatorial optimization problem (COP) is a discrete optimization task where the domain of the functions is also a discrete set, but its elements are combinations of simpler elements (e.g. permutations or tours).
• The trivial solution for a discrete optimization problem is exhaustive search.
• Usually combinatorial optimization tasks are NP problems, so for a large problem size exhaustive search is intractable.

Hopfield network as a combinatorial optimizer
• We have learned that the HNN optimizes (maximizes) its energy function, which is $L(\mathbf{y}) = \mathbf{y}^T\mathbf{W}\mathbf{y} - 2\,\mathbf{b}^T\mathbf{y}$.
• Note that if we don't assume any constraints, or we incorporate the constraints into the energy function, and we choose the objective function to be the energy function, then the HNN can perform the combinatorial optimization.
• So if we can express a combinatorial optimization problem as the minimization of a quadratic function, then a HNN can be used.

HNN as a combinatorial optimizer – philosophy
• We can approximate the solution of a traditionally NP problem along the following pipeline: combinatorial optimization problem → data representation (binary representation) → quadratic optimization problem (quadratic form) → HNN → optimum of the quadratic form → de-representation → solution. The exhaustive search would cost $O(2^N)$ or $O(N!)$ steps, while the HNN transient is $O(N^2)$ steps, at the price of possibly ending in a local optimum.

HNN as a combinatorial optimizer – minimization
• We have used the HNN for maximizing the energy function; we will show that with a modification it can also be used for minimization.
• Modify the weight matrix and the bias, e.g. by flipping their signs: $\widehat{\mathbf{W}} = -\mathbf{W}$, $\widehat{\mathbf{b}} = -\mathbf{b}$.
• The new energy function is $\widehat{L}(\mathbf{y}) = \mathbf{y}^T\widehat{\mathbf{W}}\mathbf{y} - 2\,\widehat{\mathbf{b}}^T\mathbf{y} = -L(\mathbf{y})$.
• It can be proven that maximizing $\widehat{L}(\mathbf{y})$ is the same as minimizing $L(\mathbf{y})$.

HNN as a combinatorial optimizer – minimization
• Furthermore, since $y_i^2 = 1$ for binary states, the diagonal of the weight matrix only contributes a constant term to the energy; the two energy functions (with and without the diagonal) only differ in a constant term, so the minimum is at the same location.

HNN as a combinatorial optimizer – minimization
• Since the modified energy function does not alter the place of the minimum, we can define a state transition rule which minimizes the new energy function, e.g. $y_l(k+1) = \mathrm{sgn}\big(b_l - \sum_j W_{lj}\, y_j(k)\big)$ with the diagonal of $\mathbf{W}$ set to zero.
• The modified HNN is capable of minimizing the quadratic form, and the number of steps needed for this minimization is a polynomial function of the dimension of the network.

HNN as a combinatorial optimizer – minimization
• To prove this we have to ensure three properties for this structure:
1. there exists a global upper bound B;
2. there exists a global lower bound A;
3. the change in the Lyapunov function is always larger than some $\varepsilon > 0$.
HNN as a combinatorial optimizer – minimization
• If these properties are shown, we can state that the transient time is $K \le \dfrac{B-A}{\varepsilon}$.
• Following the method shown for the maximization, the lower and upper bounds can be derived similarly.
• We have to elaborate on the change of the energy function to prove property 3.

HNN as a combinatorial optimizer – minimization
• Rewriting the change of the energy function for a single-component (sequential) update gives the same expression as before, $\Delta L(k) = W_{ll}\,\Delta y_l^2(k) + 2\,\Delta y_l(k)\big(\sum_j W_{lj}\, y_j(k) - b_l\big)$, now evaluated under the sign-flipped update rule.
• To show that the change is bounded, we examine the possible events.

HNN as a combinatorial optimizer – minimization
• Listing the possible changes in the energy function (the mirror image of the case analysis given for maximization): in both cases in which a component flips, the energy change is strictly negative.
• So the energy function always decreases, which drives the network to a minimum point.
• We can again give a bound on the transient time: $K \le \dfrac{B-A}{\varepsilon}$.

HNN as a combinatorial optimizer – minimization: local vs. global optima
• The Hopfield network moves along the energy "surface" and decreases it in every step. However, if a valley other than the basin of the desired solution exists in the energy function, the network can get stuck in a suboptimal answer if it is started from a "wrong" basin.
• The bottoms of such valleys are called local minima. Local minima usually have a higher value than the global minimum.

HNN as a combinatorial optimizer – minimization: strategies for escaping local minima
• There are many different strategies to overcome getting stuck in a local minimum. Many of them are heuristics, or are combined with other combinatorial optimization strategies.
• A few examples:
• if stuck, "shake": add noise to the state, which may shake it into a "better" valley – Noisy Hopfield Network;
• Chaotic Hopfield Network;
• Tabu search combined with the HNN;
• EDA + HNN;
• Hysteretic HNN;
• Multi-stage, multi-initialization HNN, etc.

HNN as a combinatorial optimizer – constraint mapping
• We have seen that the HNN is capable of minimizing a quadratic function,
• but we cannot deal with constraints in a straightforward manner.
• The most common way to deal with the constraints is to incorporate them into the energy function as penalizing terms.

HNN as a combinatorial optimizer – constraint mapping
• We construct an objective function as a linear combination of the original cost function and the penalizing terms, $L_{total}(\mathbf{y}) = L_{cost}(\mathbf{y}) + \sum_i \lambda_i\, L_{constraint,i}(\mathbf{y})$.
• Although other constructions could be used, due to its linearity property the linear combination is the most commonly used choice.
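Before turning to the TSP, here is a sketch of the minimizing HNN just described: sequentially flip one component whenever the flip decreases the quadratic form y^T A y − 2 b^T y. The function name and the random test problem are illustrative assumptions; this finds a local minimum, not necessarily the global one.

```python
import numpy as np

def hnn_minimize(A, b, y0, max_sweeps=100):
    """Sequential descent on y^T A y - 2 b^T y over {-1,+1}^N (A symmetric, zero diagonal)."""
    y = y0.copy()
    N = len(y)
    for _ in range(max_sweeps):
        changed = False
        for i in range(N):
            delta = -4 * y[i] * (A[i] @ y - b[i])   # energy change caused by flipping y_i
            if delta < 0:
                y[i] = -y[i]
                changed = True
        if not changed:              # no flip decreases the energy: local minimum reached
            break
    return y

rng = np.random.default_rng(4)
N = 20
A = rng.standard_normal((N, N)); A = (A + A.T) / 2; np.fill_diagonal(A, 0.0)
b = rng.standard_normal(N)
y_opt = hnn_minimize(A, b, rng.choice([-1, 1], size=N))
```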
Travelling salesman COP
• One of the most important COPs is the Travelling Salesman Problem (TSP).
• We have K cities. The TSP is the following: we have an agent at city 1, and we want him to visit all the cities exactly once and arrive back at city 1, while requiring him to travel the least (on the shortest possible route).

Travelling salesman COP
• We have an edge-weighted graph G(V,E), where the vertices represent the cities, the edges are the possible travelling connections between the cities, and the distances between the cities (travelling costs) are the weights of the edges.
• The TSP problem is the same as finding the shortest Hamiltonian cycle in an edge-weighted graph.
• http://imgs.xkcd.com/comics/travelling_salesman_problem.png

Travelling salesman COP
• Let us follow the steps of the philosophy described above and transform the TSP into a quadratic optimization problem tractable by the HNN (combinatorial optimization problem → binary representation → quadratic form → HNN → de-representation → solution).

Travelling salesman COP
• Given a distance matrix D for the graph, we can represent a trip by a matrix V having $V_{xj} = 1$ if city $x$ is the $j$-th stop of the tour and $V_{xj} = 0$ otherwise.
• For example, a concrete route can be described by the corresponding route matrix (example figure omitted).

Travelling salesman COP
• Note that V could be chosen arbitrarily, but for V to describe a valid tour for the agent, V must satisfy several constraints:
a) each row of V must contain exactly one 1, because the agent cannot be in two or more cities at the same time;
b) each column of V must contain exactly one 1, because we must visit all the cities and we must visit each of them only once.
• We can describe these constraints mathematically as
1. row orthogonality: $\sum_j V_{xj} V_{yj} = 0$ for $x \ne y$;
2. column orthogonality: $\sum_x V_{xi} V_{xj} = 0$ for $i \ne j$;
3. the sum of the elements of V must be K: $\sum_x \sum_j V_{xj} = K$.

Travelling salesman COP
• For V to describe a valid tour (to be a feasible solution), V has to be a permutation matrix.
• Having this notation we can formulate the cost of a route as
$C(\mathbf{V}) = \sum_{x}\sum_{y}\sum_{j} D_{xy}\, V_{xj}\, V_{y,j+1}$ (with the column index understood cyclically).
• This cost is the objective function to be minimized.
• Note that the HNN works with state vectors from $\{-1,1\}^{K^2}$.

Travelling salesman COP
• We have to transform V to the domain of y: $y_i = 2V_{xj} - 1$, i.e. $V_{xj} = \frac{1 + y_i}{2}$.
• After this we can write the objective function as a function of y. Is this a quadratic form?

Travelling salesman COP
• This is a quadratic form, which we will write in the traditional matrix-vector notation, thus giving W and b: the route matrix V is flattened into the state vector y, and the coefficients of the objective are collected into the weight matrix W and the bias vector b.

Travelling salesman COP
• We perform an index change: the double index $(x, j)$ of $V_{xj}$ is mapped to the single index $i = (x-1)K + j$ of $y_i$.
• Starting from the original cost function, we thereby arrive at an expression of the form $\mathbf{y}^T\mathbf{W}\mathbf{y} - 2\,\mathbf{b}^T\mathbf{y}$ plus a constant, i.e. a quadratic form in the single-indexed variables.
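A small sketch of the representation above: building a route matrix V, checking the permutation-matrix constraints, evaluating the tour cost both from the visiting order and from V, and flattening V into a ±1 state vector. The random city coordinates are an illustrative assumption.

```python
import numpy as np

K = 5
rng = np.random.default_rng(5)
pts = rng.random((K, 2))
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)   # distance matrix

order = rng.permutation(K)                 # some tour: order[j] is the j-th visited city
V = np.zeros((K, K), dtype=int)
V[order, np.arange(K)] = 1                 # rows: cities, columns: position in the tour

# Constraint check: V is a permutation matrix (each row and column sums to 1).
assert (V.sum(axis=0) == 1).all() and (V.sum(axis=1) == 1).all()

# Tour cost: distances between consecutively visited cities (cyclic tour).
cost = sum(D[order[j], order[(j + 1) % K]] for j in range(K))

# The same cost written with V, as in the quadratic objective above.
cost_V = sum(D[x, y] * V[x, j] * V[y, (j + 1) % K]
             for x in range(K) for y in range(K) for j in range(K))
assert np.isclose(cost, cost_V)

y = 2 * V.flatten() - 1                    # index change (row-major) and mapping to {-1,+1}
```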
Travelling salesman COP
• Once W and b are given, the HNN is run and its stable state y_opt is de-represented (reshaped back) into a matrix V_opt. But is V_opt a permutation matrix? With the cost term alone there is no guarantee, so we need a modified goal function that enforces this property.
• The energy function built from the cost term alone does not guarantee a feasible solution (a valid tour of the cities), so we have to incorporate the constraints on V into the energy function in such a way that the energy is minimal when all the constraints are satisfied and the cost function is minimal.
• We choose a weighted linear combination of the cost function and the constraint terms, so that whenever a constraint is violated the energy function is penalized and pushed farther from its minimum.
• We thus obtain a new energy function as the weighted sum of the cost term and the constraint terms; substituting the vector representation y into all the terms, we get the combined expression.
• This is again a quadratic form, which we write in the traditional matrix-vector notation, thus giving W and b.
• Let us first separate the quadratic, the linear and the constant terms, the same way as we did for the cost term alone.
• After substituting and evaluating the parameters we get separate matrices and vectors for each term, which together form the overall quadratic function.
• These matrices and vectors correspond to the row orthogonality, the column orthogonality, the permutation-matrix property and the pressure of the cost term, respectively.
• The weights of the linear combination can be adjusted during the solution process, in each stage emphasizing one property over another; usually heuristics are applied to change them.
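The idea of turning constraints into penalty terms can be illustrated with a short sketch. The exact penalty terms and weights used on the slides are not reproduced here; the variant below (a common Hopfield-Tank style choice) penalizes row sums and column sums different from 1 and a total sum different from K, and the weights a, c, d are hand-picked, hypothetical values.

import numpy as np

def tsp_energy(V, D, a=500.0, c=500.0, d=1.0):
    # Penalty-augmented TSP energy, a sketch in the spirit of the slides.
    # V[i, x] = 1 if city x is visited at step i (same convention as before).
    K = D.shape[0]
    row_pen = np.sum((V.sum(axis=1) - 1.0) ** 2)    # exactly one city per step
    col_pen = np.sum((V.sum(axis=0) - 1.0) ** 2)    # each city visited exactly once
    sum_pen = (V.sum() - K) ** 2                    # K ones altogether
    cost = sum(D[x, y] * V[i, x] * V[(i + 1) % K, y]
               for i in range(K) for x in range(K) for y in range(K))
    return a * (row_pen + col_pen) + c * sum_pen + d * cost

# hypothetical 4-city example: a valid tour has far lower energy than an invalid V
D = np.array([[0, 2, 9, 4], [2, 0, 6, 3], [9, 6, 0, 5], [4, 3, 5, 0]], dtype=float)
V_valid = np.eye(4)                                 # tour 1 -> 2 -> 3 -> 4 -> 1
V_bad = np.zeros((4, 4)); V_bad[0, :] = 1.0         # violates the row constraint
print(tsp_energy(V_valid, D), tsp_energy(V_bad, D))

For a valid permutation matrix all penalty terms vanish and only the tour cost remains, while an infeasible V is pushed far above the minimum; this is exactly the behaviour the weighted linear combination is meant to produce.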
Signal detection as combinatorial optimization
• Another example of a COP is the compensation and detection of multipath-propagated radio waves in communication.
• We send a block of symbols, but the receiver gets a noisy linear combination of them due to inter-symbol interference (ISI) and additive Gaussian noise: the sent signal passes through the channel and arrives as the received signal.
• Because the channel acts like a linear filter, we receive a noise-added convolution of the sent symbols with the channel's impulse response function; this is the ISI.
• Based on the knowledge of H (the channel matrix) and the received message x, we want to decide what the most probable sent information vector y was.
• We can use a simple decision rule that takes the sign of the received signal; this is the threshold detector: the detected vector is sign(x), taken component-wise.
• But we know more about the underlying phenomenon (we know H, and that noise is added), so we can apply the Bayesian decision rule.
• We know that the received signal is constructed by the channel as x = Hy + ν,
• and that ν is additive white Gaussian noise,
• so the observed signal can be treated as a (conditionally Gaussian) random variable.
• The optimal decision based on the Bayesian rule picks the most probable sent vector; for equally likely symbol vectors and Gaussian noise this means choosing the y ∈ {−1, +1}^N that minimizes ||x − Hy||². We will show that this is indeed a quadratic form.
• Expanding the expression: ||x − Hy||² = yᵀHᵀHy − 2xᵀHy + xᵀx, where the last term does not depend on y and can be dropped.
• So we have constructed a quadratic energy function that can be used as the energy function of the HNN.
• The HNN, with its weight matrix and bias vector derived from H and x in this way, can approximate the optimal detection.
• The performance of this detector was analyzed on an example.
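A small simulation sketch makes the comparison concrete. The channel impulse response, the noise level and the block length below are hypothetical values, not the ones from the slides; the optimal (maximum-likelihood) decision is carried out here by exhaustive search over all ±1 vectors, which is exactly the exponential search that the HNN is meant to approximate with polynomial effort (under the energy convention assumed earlier, the corresponding HNN parameters would be built from HᵀH and Hᵀx).

import numpy as np
from itertools import product

def threshold_detect(x):
    # Simple threshold detector: component-wise sign of the received signal.
    return np.where(x >= 0, 1.0, -1.0)

def ml_detect(x, H):
    # Optimal (maximum-likelihood) detection by exhaustive search:
    # argmin over y in {-1,+1}^N of ||x - H y||^2.
    N = H.shape[1]
    best, best_err = None, np.inf
    for bits in product([-1.0, 1.0], repeat=N):
        y = np.array(bits)
        err = np.sum((x - H @ y) ** 2)
        if err < best_err:
            best, best_err = y, err
    return best

# toy experiment with a hypothetical ISI channel (impulse response [1, 0.6, 0.3])
rng = np.random.default_rng(0)
N, h, sigma = 8, np.array([1.0, 0.6, 0.3]), 0.6
H = sum(np.diag(np.full(N - k, h[k]), -k) for k in range(len(h)))   # convolution (Toeplitz) channel matrix
y_true = rng.choice([-1.0, 1.0], size=N)
x = H @ y_true + sigma * rng.standard_normal(N)                     # x = H y + noise
print("threshold errors:", np.sum(threshold_detect(x) != y_true))
print("ML errors:       ", np.sum(ml_detect(x, H) != y_true))

On channels with strong ISI the maximum-likelihood rule typically decodes with fewer symbol errors than the plain threshold detector, which is the motivation for the HNN-based detector.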
Analog circuit implementation of the HNN
• There are several types of implementation of the HNN. Software packages such as Matlab or LabVIEW contain toolboxes for different neural networks.
• On a DSP one can exploit fast matrix-vector multiplication.
• An optical implementation gives a very fast architecture.
• However, the software implementations are very slow compared to hardware, while the DSP and optical implementations do not scale to large networks because of the quadratically growing number of interconnections between neurons.
• The first step in implementing the Hopfield Neural Network as an analog circuit is to analyze the nonlinear state transition rule of the network, where we set b = 0 for the sake of simpler formulas.
• This is a discrete-time state transition rule over the steps k = 1, 2, …
• An analog circuit, however, cannot handle discrete time, only continuous time. This raises the first question: how do we change the network from discrete to continuous time?
• We need to modify the state transition from a discrete time step to an infinitesimally small time step (turning the difference equation into a differential equation) and use a differentiable activation function instead of the sign function.
• If we choose an arbitrarily small time step, the recursion becomes a system of differential equations (a minimal numerical sketch of this continuous-time update is given after the exercises below).

Exercises – example 1
Given a DHNN by its weight matrix and bias vector.
a) Determine, and draw in the figure, the state transitions and the stable point using the given values of the Lyapunov function, if the network is used for minimization and the initial state is the one given.
b) Verify the solution by applying the state transition rule and computing the successive states.

Exercises – example 1, solution a)

Exercises – example 1, solution b)

Exercises – example 2
Give the stable point of the following HNN:

Exercises – example 3
We want to store the following samples in an HNN used as an associative memory:
• Give the weight matrix and the bias vector of the network!
• Are the samples orthogonal?
• Show a stable point in addition to the stored sample points!
• Mark in the figure the states from which the net converges to the stored samples.

Exercises – example 4
We want to solve the following optimization problem with a Hopfield net:
• Give the concrete recursive state update formula of the Hopfield net used for this minimization!

Exercises – example 5
We want to use a Hopfield net for detection in a digital communication system. The linear channel distortion with AWGN is assumed to be known and stationary. The impulse response of the channel is given, and the SNR is 0 dB. The block representation of the system model is as shown.
• Show that the HNN is the optimal detector for this problem.
• Give the weight matrix and the bias vector of the Hopfield net for the given channel impulse response and noise power if the received signal is the one given.
• What will be the decoded message?
• What would the decoded message be if a threshold detector had been used instead of the HNN detector? Note: the initial state of the HNN is random; use the value given in the example.
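Returning to the discrete-to-continuous question raised in the analog-implementation part, the sketch below integrates a continuous-time Hopfield dynamics numerically. The concrete differential equation of the slides is not reproduced here, so the sketch assumes the widely used convention du/dt = −u + W·g(u) + b with the differentiable activation g(u) = tanh(gain·u), integrated with a simple Euler step; the step size, gain and toy weight matrix are illustrative assumptions.

import numpy as np

def continuous_hopfield(W, b, u0, dt=0.01, steps=5000, gain=5.0):
    # Continuous-time Hopfield dynamics under assumed conventions:
    #   du/dt = -u + W g(u) + b,   g(u) = tanh(gain * u)
    # The Euler step dt stands in for the "arbitrarily small time step" of the slides.
    u = np.array(u0, dtype=float)
    for _ in range(steps):
        y = np.tanh(gain * u)            # differentiable activation instead of sign
        u = u + dt * (-u + W @ y + b)    # Euler step of the differential equation
    return np.tanh(gain * u)             # near-saturated outputs approximate the +/-1 states

# toy usage with a hypothetical symmetric, zero-diagonal weight matrix
rng = np.random.default_rng(2)
W = rng.standard_normal((6, 6)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
print(continuous_hopfield(W, np.zeros(6), rng.standard_normal(6)))

As the gain grows, tanh approaches the sign function and the continuous dynamics approaches the discrete network discussed earlier.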
Summary
• A method was shown for using the HNN to minimize a quadratic cost function.
• This construction was used for solving combinatorial optimization problems, which are traditionally NP-hard; the HNN gives a polynomial-complexity approximation of their solution.
• Examples were shown of mapping combinatorial optimization problems into quadratic programming tasks.
• A possible analog circuit implementation of the HNN was shown.