Development of Complex Curricula for Molecular Bionics and Infobionics Programs within a Consortial Framework
Consortium leader: PETER PAZMANY CATHOLIC UNIVERSITY
Consortium members: SEMMELWEIS UNIVERSITY, DIALOG CAMPUS PUBLISHER
The project has been realised with the support of the European Union and has been co-financed by the European Social Fund (TÁMOP-4.1.2-08/2/A/KMR-2009-0006).

Peter Pazmany Catholic University, Faculty of Information Technology
Signal processing on digital, neural, and kiloprocessor based architectures
Hopfield network, Hopfield net as associative memory and combinatorial optimizer
J. Levendovszky, A. Olah, G. Treplan, D. Tisza
www.itk.ppke.hu

Outline
• Introduction
• Hopfield net – structure and operation
• Hopfield net – stability and convergence properties
• Hopfield net as an Associative Memory (AM)
• Capacity analysis of the Hopfield net
• Applications of the Hopfield net as AM
• Hopfield net as combinatorial optimizer
• The way towards CNN
• Example problems

Introduction (1)
The Hopfield neural network is a
• recurrent artificial neural network,
• invented by John Hopfield,
• serving as an associative memory system,
• or operating as a combinatorial optimizer (quadratic programming);
• it is a stable dynamic system, guaranteed to converge to a local optimum,
• but convergence to one of the stored patterns is not guaranteed.

Introduction (2)
• Topology of the Hopfield Neural Network: every neuron is fed back to every other neuron (figure omitted).
• The number of connections grows quadratically with the number of neurons: implementation difficulty!

Structure and operation (1) – Notations (1)
• The weight matrix $\mathbf{W}$ contains the synaptic weights $W_{ij}$ of the feedback from neuron $i$ to neuron $j$: $\mathbf{W} = \{W_{ij},\ i,j = 1,\dots,N\}$.

Structure and operation (2) – Notations (2)
• The bias vector $\mathbf{b} = (b_1,\dots,b_N)$ contains the threshold values of the neurons.
• Let the state vector of the system be $\mathbf{y}(k) = (y_1(k),\dots,y_N(k))$, where $y_i(k) \in \{-1,+1\}$.

Structure and operation (3)
• The discrete Hopfield Neural Network can be regarded as a nonlinear recursion given in the form
$y_i(k+1) = \mathrm{sgn}\Big(\sum_{j=1}^{N} W_{ij}\, y_j(k) - b_i\Big)$
for every neuron.
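To make the recursion concrete, here is a minimal NumPy sketch of the update rule above, using the sequential (one neuron per step) updating discussed next; the helper name, the stopping criterion and the random test data are illustrative choices, not part of the slides.

```python
import numpy as np

def hopfield_run(W, b, y0, max_steps=10_000):
    """Iterate y_i <- sgn(sum_j W_ij*y_j - b_i), updating one neuron per step."""
    y = y0.copy()
    N = len(y)
    unchanged = 0                      # consecutive updates that left the state unchanged
    for k in range(max_steps):
        i = k % N                      # sequential neuron selection rule
        new = 1 if W[i] @ y - b[i] >= 0 else -1   # sgn(0) taken as +1
        unchanged = unchanged + 1 if new == y[i] else 0
        y[i] = new
        if unchanged == N:             # a full sweep with no change: fixed point reached
            return y
    return y

# Tiny usage example with a random symmetric weight matrix and zero bias.
rng = np.random.default_rng(0)
W = rng.standard_normal((6, 6)); W = (W + W.T) / 2
y_fix = hopfield_run(W, np.zeros(6), rng.choice([-1, 1], size=6))
```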
• If we restrict our attention to the sequential updating rule, only one neuron is updated in each step, and the neuron selection rule becomes (for example) $i = (k \bmod N) + 1$, i.e. the neurons are updated one after the other in a cyclic order.

Structure and operation (4)
• When investigating such a nonlinear recursion as an associative mapping, the following questions arise:
1. How to construct the matrix $\mathbf{W}$ if one wants to store a set of patterns $\mathbf{s}^{(m)},\ m = 1,\dots,M$, as fixed points of the algorithm, such that $\mathbf{s}^{(m)} = \mathrm{sgn}\big(\mathbf{W}\mathbf{s}^{(m)} - \mathbf{b}\big)$ holds.

Structure and operation (5)
2. What is the number of such fixed points $M$ as a function of the dimension (the number of neurons in the Hopfield net)? In other words, we want to reveal the storage capacity of the Hopfield net as a function of the number of neurons $N$.

Structure and operation (6)
3. Is the recursion stable, and if yes, what are its convergence properties?
• Next we will discuss these questions thoroughly. Before getting down to a detailed analysis, we need some tools rooted in classical stability theory, called the Lyapunov technique.

Stability and convergence properties (1)
• Lyapunov¹ functions are widely used in the study of dynamical systems in order to prove the stability of a system.
• This technique can be used to analyze the stability properties of Hopfield Neural Networks.
¹ Aleksandr Mikhailovich Lyapunov (June 6, 1857 - November 3, 1928) was a Russian mathematician, mechanician and physicist.

Stability and convergence properties (2)
• One possible Lyapunov function is the quadratic form $L(\mathbf{y}) = \mathbf{y}^T \mathbf{W} \mathbf{y} - 2\,\mathbf{b}^T \mathbf{y}$.

Stability and convergence properties (3) – Lyapunov's weak theorem (1)
• Let us assume that there is a nonlinear recursion given in the following general form: $\mathbf{y}(k+1) = \boldsymbol{\varphi}\big(\mathbf{y}(k)\big)$.
• If one can define a function (the so-called Lyapunov function) $L: Y \to \mathbb{R}$ over the state space $Y$, for which

Stability and convergence properties (4) – Lyapunov's weak theorem (2)
1. $L(\mathbf{y})$ has a global upper bound over the state space: $L(\mathbf{y}) \le B$ for all $\mathbf{y} \in Y$;
2. the change of $L(\mathbf{y})$, denoted by $\Delta L(k) = L(\mathbf{y}(k+1)) - L(\mathbf{y}(k))$, is positive in each step of the recursion;
then the recursion is stable and converges to one of the local maxima of $L(\mathbf{y})$.

Stability and convergence properties (5) – Lyapunov's weak theorem (3)
• The exact proof, which can be found in numerous books dealing with control and stability theory, is omitted here.
• However, it is easy to see that if in each step $L(\mathbf{y})$ can only increase, and at the same time there exists a global upper bound, then the recursion cannot go on indefinitely but will converge to one of the local maxima.

Stability and convergence properties (6) – Lyapunov's strong theorem (1)
• Let us assume that there is a nonlinear recursion given in the following general form: $\mathbf{y}(k+1) = \boldsymbol{\varphi}\big(\mathbf{y}(k)\big)$.
• If one can define a function (the so-called Lyapunov function) $L: Y \to \mathbb{R}$ over the state space $Y$, for which

Stability and convergence properties (7) – Lyapunov's strong theorem (2)
1. $L(\mathbf{y})$ has a global lower and upper bound over the state space: $A \le L(\mathbf{y}) \le B$ for all $\mathbf{y} \in Y$;
2. the change of $L(\mathbf{y})$, denoted by $\Delta L(k)$, satisfies $\Delta L(k) \ge \varepsilon > 0$ in each step of the recursion;
then the recursion is stable, converges to one of the local maxima of $L(\mathbf{y})$,

Stability and convergence properties (8) – Lyapunov's strong theorem (3)
and its transient time can be upper bounded as $K \le \dfrac{B-A}{\varepsilon}$.
• Again the proof is omitted; however, it is easy to interpret the result as follows: in each step $L(\mathbf{y})$ increases at least by $\varepsilon$, so in the worst case the maximum number of steps needed to cover the distance $B-A$ is $(B-A)/\varepsilon$.
• With these tools at our disposal we can embark on ascertaining the stability and convergence properties of the Hopfield net.

Stability and convergence properties (9)
• To use the Lyapunov technique we have to assign a Lyapunov function to the recursion $y_i(k+1) = \mathrm{sgn}\big(\sum_j W_{ij} y_j(k) - b_i\big)$.
• According to Hopfield, Cohen and Grossberg, we define the corresponding Lyapunov function as $L(\mathbf{y}) = \mathbf{y}^T \mathbf{W} \mathbf{y} - 2\,\mathbf{b}^T \mathbf{y}$.

Stability and convergence properties (10)
• Now we want to apply Lyapunov's strong theorem, therefore we have to check the following three conditions:
1. the existence of a global upper bound;
2. the existence of a global lower bound;
3. that $\Delta L(k) \ge \varepsilon > 0$ holds in each step.

Stability and convergence properties (11) – The existence of a global upper bound
• To derive an upper bound we can use the Cauchy-Schwarz inequality:
$L(\mathbf{y}) = \mathbf{y}^T \mathbf{W} \mathbf{y} - 2\,\mathbf{b}^T\mathbf{y} \le \|\mathbf{W}\|\,\|\mathbf{y}\|^2 + 2\,\|\mathbf{b}\|\,\|\mathbf{y}\|$;
• taking into account that we are dealing with binary state vectors, elements of $\{-1,1\}^N$, for which $\|\mathbf{y}\|^2 = N$, we obtain the global upper bound $B = N\,\|\mathbf{W}\| + 2\sqrt{N}\,\|\mathbf{b}\|$.

Stability and convergence properties (12) – The existence of a global lower bound (1)
• To derive a global lower bound for $L(\mathbf{y})$, let us first broaden the state space from $Y = \{-1,1\}^N$ to the space of $N$-dimensional real vectors and define $L(\mathbf{x}) = \mathbf{x}^T \mathbf{W} \mathbf{x} - 2\,\mathbf{b}^T\mathbf{x}$ over this broadened state space. Therefore $\min_{\mathbf{y}\in Y} L(\mathbf{y}) \ge \min_{\mathbf{x}\in\mathbb{R}^N} L(\mathbf{x})$.
Stability and convergence properties (13) – The existence of a global lower bound (2)
• The minimum of $L(\mathbf{x}) = \mathbf{x}^T \mathbf{W} \mathbf{x} - 2\,\mathbf{b}^T\mathbf{x}$ over $\mathbb{R}^N$ can easily be calculated: since $\mathbf{x}$ varies over a continuum, the gradient of $L(\mathbf{x})$ exists, and from the well-known results related to quadratic forms the location of this extremum is given as $\mathbf{x}_{opt} = \mathbf{W}^{-1}\mathbf{b}$.

Stability and convergence properties (14) – The existence of a global lower bound (3)
• Substituting $\mathbf{x}_{opt}$ into $L(\mathbf{x})$ one obtains the global lower bound
$A = L(\mathbf{x}_{opt}) = \mathbf{b}^T\mathbf{W}^{-1}\mathbf{b} - 2\,\mathbf{b}^T\mathbf{W}^{-1}\mathbf{b} = -\mathbf{b}^T\mathbf{W}^{-1}\mathbf{b}$.

Stability and convergence properties (15) – The change of the Lyapunov function (1)
• Let us define the change of the Lyapunov function as $\Delta L(k) = L(\mathbf{y}(k+1)) - L(\mathbf{y}(k))$.

Stability and convergence properties (16) – The change of the Lyapunov function (2)
• We apply the sequential update rule, which means that only one component, $y_l(k)$, of the state vector $\mathbf{y}(k)$ changes in step $k$.

Stability and convergence properties (17) – The change of the Lyapunov function (3)
• Taking this into consideration, $\Delta L(k)$ takes the following form:
$\Delta L(k) = W_{ll}\,\Delta y_l^2(k) + 2\,\Delta y_l(k)\Big(\sum_j W_{lj}\, y_j(k) - b_l\Big)$,
where $\Delta y_l(k) = y_l(k+1) - y_l(k)$.

Stability and convergence properties (18) – The change of the Lyapunov function (4)
• Let us introduce the quantity $\nu = \sum_j W_{lj}\, y_j(k) - b_l$. To calculate the values of this expression we can consider the two possible state transitions (assuming $W_{ll} \ge 0$):
• Case $y_l(k) = -1 \to y_l(k+1) = +1$ (so $\nu > 0$): $\Delta y_l(k) = +2$, $W_{ll}\,\Delta y_l^2(k) = 4W_{ll}$, $2\,\Delta y_l(k)\,\nu = +4\nu > 0$, hence $\Delta L(k) = 4(W_{ll} + \nu) > 0$.
• Case $y_l(k) = +1 \to y_l(k+1) = -1$ (so $\nu < 0$): $\Delta y_l(k) = -2$, $W_{ll}\,\Delta y_l^2(k) = 4W_{ll}$, $2\,\Delta y_l(k)\,\nu = -4\nu > 0$, hence $\Delta L(k) = 4(W_{ll} - \nu) > 0$.

Stability and convergence properties (19) – The change of the Lyapunov function (5)
• From these cases it can be seen that whenever a state transition occurs, $\Delta L(k) > 0$.
• This means that in each step in which a state transition occurs the energy function increases.
• Given a global upper and lower bound, it follows that there exists a time step at which the algorithm must stop in a local maximum.

Stability and convergence properties (20)
• Theorem 3. The Hopfield type of recursion
1. is stable;
2. converges to a local maximum of the function $L(\mathbf{y}) = \mathbf{y}^T \mathbf{W} \mathbf{y} - 2\,\mathbf{b}^T\mathbf{y}$;
3. its transient time can be upper-bounded as $K \le \dfrac{B-A}{\varepsilon}$, which is a polynomial (of order $N^2$) number of steps.
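A small numerical check of Theorem 3 (a sketch, not part of the slides): along a sequential run with a symmetric weight matrix and non-negative diagonal, the quadratic form L(y) = y^T W y − 2 b^T y never decreases.

```python
import numpy as np

def L(y, W, b):                      # the Lyapunov function of the Hopfield net
    return y @ W @ y - 2 * b @ y

rng = np.random.default_rng(1)
N = 8
W = rng.standard_normal((N, N)); W = (W + W.T) / 2
np.fill_diagonal(W, np.abs(np.diag(W)))     # W_ll >= 0, as the case analysis assumes
b = rng.standard_normal(N)
y = rng.choice([-1, 1], size=N)

values = [L(y, W, b)]
for k in range(50 * N):                      # sequential updates
    i = k % N
    y[i] = 1 if W[i] @ y - b[i] >= 0 else -1
    values.append(L(y, W, b))

# Monotone non-decreasing energy, as guaranteed by the theorem.
assert all(v2 >= v1 - 1e-9 for v1, v2 in zip(values, values[1:]))
```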
Hopfield Neural Network as an AM (1)
• Analyzing the nonlinear state recursion of the Hopfield Neural Network we have come to the conclusion that it is a finite state automaton with binary state vectors, which gives rise to a new application of this network: Associative Mapping (AM).
• We start the HNN from an initial state vector x, which corresponds to a corrupted version of a stored memory item; we call this vector the clue.

Hopfield Neural Network as an AM (2)
• Then we start the iteration $y_i(k+1) = \mathrm{sgn}\big(\sum_j W_{ij}\, y_j(k) - b_i\big)$ of the network, and if the network gets stuck in one of its steady states, we call this vector the recalled memory item.
• This mapping is the so-called Associative Mapping.

Hopfield Neural Network as an AM (3)
• This new computational paradigm gives rise to the model in the figure, where the $N$-dimensional binary vectors are mapped into two dimensions. The box represents the state space $Y = \{-1,1\}^N$, and it contains a couple of fixed points of the HNN: $\mathbf{s}^{(1)}, \mathbf{s}^{(2)}, \mathbf{s}^{(3)}, \mathbf{s}^{(4)}$.
• An Associative Mapping is a partitioning of the space $Y = \{-1,1\}^N$.

Hopfield Neural Network as an AM (4)
• There is a separation of this state space in the sense that if we start the network from a state x which falls into the basin of attraction of the memory pattern $\mathbf{s}^{(4)}$, then the network will finally get stuck in this steady state.
• In general, the same holds for the other memory patterns and their basins of attraction.

Hopfield Neural Network as an AM (5)
• A fairly general demonstration of the working of an Associative Mapping is depicted in the figure (a demonstration of how Associative Mapping works).

Hopfield Neural Network as an AM (6)
• Let us assume that there are some stored memory items, for example a picture of a vase, a cat and a lorry; these are the three patterns.
• An Associative Mapping means that if we have a corrupted and incomplete version of one of the memory patterns, then it will be mapped to the stored item which is closest to it.

Hopfield Neural Network as an AM (7)
• In order to use the Hopfield Neural Network to implement this new computational paradigm, we have to make sure that the network is stable, meaning that the network will converge to a steady state.
• When we investigated stability, we arrived at the Lyapunov concept of stability: the quadratic form $L(\mathbf{y}) = \mathbf{y}^T\mathbf{W}\mathbf{y} - 2\,\mathbf{b}^T\mathbf{y}$ is the Lyapunov function of the HNN, and we have proven that the Hopfield Network is stable.

Hopfield Neural Network as an AM (8)
• Furthermore, we have come to the conclusion that the transient time of this network is $O(N^2)$, which is much faster than exhaustive search, $O(2^N)$.
• When one implements an Associative Memory, there is a given set of items to be stored; this is called the set of stored memory items and is denoted by S.
• Here the number of stored memory items is $M$, and the items are binary vectors of dimension $N$: $S = \{\mathbf{s}^{(m)} \in \{-1,1\}^N,\ m = 1,\dots,M\}$.

Hopfield Neural Network as an AM (9)
• These binary vectors can refer to images, speech patterns or bank account numbers, depending on the actual application.
• The set X represents the observation space; this can be any possible $N$-dimensional binary vector, $X = \{-1,1\}^N$.
• We can see that $S \subseteq X$, because each $\mathbf{s}^{(m)}$ is also an $N$-dimensional binary vector.

Hopfield Neural Network as an AM (10)
• Definition 1. An Associative Mapping is defined as a mapping from the observation space X to the set of stored memory items S, in such a way that it maps any specific observation vector x into the stored memory item $\mathbf{s}^{(\beta)}$ which is closest to the observation vector according to a certain distance criterion d(·).
• Formally, the AM is a mapping $\Phi: X \to S$, $\Phi(\mathbf{x}) = \mathbf{s}^{(\beta)}$ with $\beta = \arg\min_m d\big(\mathbf{x}, \mathbf{s}^{(m)}\big)$.

Hopfield Neural Network as an AM (11)
• We can define the distance in any way which suits our application. However, when we deal with binary vectors it is rather plausible to use the Hamming distance, which measures in how many bits two vectors differ from each other.
• There are two fundamental attributes of an Associative Memory: 1) stability, 2) capacity.
• Capacity boils down to the question of how large the set of stored memory items can be; in our notation $M = |S|$. When we speak of the analysis of the Hopfield Neural Network as an Associative Mapping, we are trying to reveal these two properties.

Capacity Analysis (1)
• When we use the Hopfield Neural Network as an Associative Memory, the threshold vector b is set to the all-zero vector 0, and if we want to store a predefined set of memory items in the network, the components of the weight matrix are chosen as
$W_{ij} = \frac{1}{N}\sum_{m=1}^{M} s_i^{(m)} s_j^{(m)}$,
• which can be written in vector notation using the outer product operation: $\mathbf{W} = \frac{1}{N}\sum_{m=1}^{M} \mathbf{s}^{(m)}\,\mathbf{s}^{(m)T}$.

Capacity Analysis (2)
• This rule was discovered and fully elaborated in the work of D. O. Hebb³, a Canadian-born psychologist.
• That is why it is called the Hebbian Learning Rule. He described this rule as a psychologist, in a textual way, and the learning rule above was mathematically inferred from this description.
³ Donald Olding Hebb (July 22, 1904 - August 10, 1985) was a psychologist who was influential in the area of neuropsychology, where he sought to understand how the function of neurons contributed to psychological processes such as learning. He has been described as the father of neuropsychology and neural networks.
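A short sketch of the Hebbian rule and of an associative recall built on it; the 1/N normalisation follows the slides, while zeroing the diagonal, the pattern sizes and the amount of corruption are illustrative assumptions.

```python
import numpy as np

def hebbian_weights(patterns):
    """patterns: M x N matrix with entries in {-1,+1}."""
    M, N = patterns.shape
    W = (patterns.T @ patterns) / N          # (1/N) * sum_m s^(m) s^(m)^T
    np.fill_diagonal(W, 0.0)                 # common convention: no self-feedback
    return W

def recall(W, clue, sweeps=20):
    y = clue.copy()
    for _ in range(sweeps):
        for i in range(len(y)):              # sequential update, b = 0
            y[i] = 1 if W[i] @ y >= 0 else -1
    return y

rng = np.random.default_rng(2)
N, M = 100, 5
S = rng.choice([-1, 1], size=(M, N))         # stored memory items (coin flips)
W = hebbian_weights(S)
clue = S[0].copy()
flip = rng.choice(N, size=10, replace=False) # corrupt 10 bits of the first item
clue[flip] *= -1
print(np.array_equal(recall(W, clue), S[0])) # usually True for M << N
```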
Capacity Analysis (3)
• By capacity we mean how many stored memory items can be recalled from the Hopfield Neural Network.
• In order to determine it we start with a very elementary investigation and then penetrate deeper and deeper into the capacity analysis, until we arrive at the stage of 1. Statistical Neurodynamics and 2. the Information Theoretical Capacity.

Capacity Analysis (4) – Static Capacity (1)
• The first issue is the so-called Static Capacity (or Fixed Point) Analysis. Recall that there is a set of stored memory items, represented by binary vectors $\mathbf{s}^{(m)} \in \{-1,1\}^N$, $m = 1,\dots,M$,
• and the element $W_{lj}$ of the weight matrix is set according to the Hebbian Learning Rule, $W_{lj} = \frac{1}{N}\sum_{m=1}^{M} s_l^{(m)} s_j^{(m)}$.

Capacity Analysis (5) – Static Capacity (2)
• What we are going to investigate now is the following: if we pick any stored memory vector $\mathbf{s}^{(\beta)} \in S$, then in order to recall this vector we have to make sure that it is a fixed point of the Hopfield Network, which means that the recursion exhibits an equilibrium behaviour: once we have reached $\mathbf{s}^{(\beta)}$ we cannot get out of this state. This can be written in vector notation as
$\mathbf{s}^{(\beta)} = \mathrm{sgn}\big(\mathbf{W}\mathbf{s}^{(\beta)}\big)$.

Capacity Analysis (6) – Static Capacity (3)
• One can see that what we have spelled out here is the fixed point condition of this nonlinear recursion, which is a necessary condition for $\mathbf{s}^{(\beta)}$ being a stored memory item.
• The question is under what condition we can enforce this equality, i.e. what is the upper limit on the capacity for which the equation $\mathbf{s}^{(\beta)} = \mathrm{sgn}\big(\mathbf{W}\mathbf{s}^{(\beta)}\big)$ holds. Since we investigate this equation with respect to the number of stored memory items $M$, we are going to derive a condition on $M$ under which this equation holds.

Capacity Analysis (7) – Static Capacity (4)
• In order to simplify the analysis let us analyze the component-wise equation
$s_l^{(\beta)} = \mathrm{sgn}\Big(\sum_{j=1}^{N} W_{lj}\, s_j^{(\beta)}\Big)$,
which should hold for all components in order to obtain a steady state. We can replace $W_{lj}$ with its definition and write
$s_l^{(\beta)} = \mathrm{sgn}\Big(\sum_{j=1}^{N} \frac{1}{N}\sum_{m=1}^{M} s_l^{(m)} s_j^{(m)} s_j^{(\beta)}\Big)$.

Capacity Analysis (8) – Static Capacity (5)
• However, these are finite sums, so we can exchange the order of the summations and rewrite the expression as
$s_l^{(\beta)} = \mathrm{sgn}\Big(\frac{1}{N}\sum_{m=1}^{M} s_l^{(m)} \sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}\Big)$.
• From the outer summation, where $m$ sweeps from 1 to $M$, we can single out the term $m = \beta$ (as $\mathbf{s}^{(\beta)}$ is an element of the set S, $m$ hits the value $\beta$ once), so we can rewrite the previous expression as
$s_l^{(\beta)} = \mathrm{sgn}\Big(\frac{1}{N}\sum_{j=1}^{N} s_l^{(\beta)} \big(s_j^{(\beta)}\big)^2 + \frac{1}{N}\sum_{m\ne\beta} s_l^{(m)} \sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}\Big)$.
Capacity Analysis (9) – Static Capacity (6)
• In the first part of the expression we have the square of $s_j^{(\beta)}$, which is always 1 because $\mathbf{s}^{(\beta)} \in \{-1,1\}^N$; adding up 1 $N$ times gives $N$, which divided by $N$ yields 1, so the first term is simply $s_l^{(\beta)}$. Let us denote the second part of the previous expression by $\eta_l$, which gives
$s_l^{(\beta)} = \mathrm{sgn}\big(s_l^{(\beta)} + \eta_l\big)$, where $\eta_l = \frac{1}{N}\sum_{m\ne\beta} s_l^{(m)} \sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}$.
• Now we investigate under what condition the above equation holds. We can see that if $\eta_l$ is zero then the equation is indeed satisfied. This criterion can be satisfied if the capacity is smaller than $N$, as shown next.

Capacity Analysis (10) – Static Capacity (7)
• If we investigate the expression of $\eta_l$, what we see is the inner product of the memory vectors: $\eta_l = \frac{1}{N}\sum_{m\ne\beta} s_l^{(m)}\, \mathbf{s}^{(m)T}\mathbf{s}^{(\beta)}$. If $\mathbf{s}^{(m)T}\mathbf{s}^{(\beta)} = 0$ for every $m \ne \beta$, then $\eta_l = 0$.
• This entails that the memory items must be orthogonal to each other, because their inner products should be 0. However, we have $N$-dimensional memory items, and in an $N$-dimensional space the maximum number of mutually orthogonal vectors is $N$, so the static capacity is at most $M = N$.

Capacity Analysis (11) – Dynamic Capacity (1)
• What we have investigated so far was a fixed point analysis, but it does not necessarily entail that the Hopfield Network will converge to this fixed point. Now we pursue a further investigation into this capacity matter: in the second stage we evaluate the Dynamic Capacity of the Hopfield Network, where the notion of Dynamic Capacity implies that we are investigating the steady states.

Capacity Analysis (12) – Dynamic Capacity (2)
• Definition 2 (Steady state). Steady states are the subset of fixed points into which the Hopfield Network converges.
• The stability of the Hopfield Network was proven by using the Lyapunov concept of stability, where the central point of the concept was that there is a Lyapunov function associated with the Hopfield Network.
• Since in the case of applying the HNN as an Associative Mapping the vector b is zero, what remains is $L(\mathbf{y}) = \mathbf{y}^T\mathbf{W}\mathbf{y}$.

Capacity Analysis (13) – Dynamic Capacity (3)
• We have proven that the Hopfield Network is stable and converges to one of the local maxima of this Lyapunov function.
• As a result, if we want to make sure that a vector $\mathbf{s}^{(\beta)}$, taken from the set of stored memory items S, is a steady state, then it is not enough for it to be a fixed point; we also have to make sure that the Lyapunov function has a local maximum at $\mathbf{s}^{(\beta)}$, meaning
$L\big(\mathbf{s}^{(\beta)}\big) > L(\mathbf{y})$ for every state $\mathbf{y}$ in the neighbourhood of $\mathbf{s}^{(\beta)}$ (e.g. at Hamming distance 1 from it).

Capacity Analysis (14) – Dynamic Capacity (4)
• First we deal with the Lyapunov function at the point $\mathbf{s}^{(\beta)}$, which can be fully spelled out as
$L\big(\mathbf{s}^{(\beta)}\big) = \sum_{i=1}^{N}\sum_{j=1}^{N} W_{ij}\, s_i^{(\beta)} s_j^{(\beta)}$.
• We can substitute the definition of $W_{ij}$ (the Hebbian rule) into this expression.
Capacity Analysis (15) – Dynamic Capacity (5)
• These finite summations can be rearranged by collecting the terms which depend on the indices $i$ and $j$:
$L\big(\mathbf{s}^{(\beta)}\big) = \frac{1}{N}\sum_{m=1}^{M}\Big(\sum_{i=1}^{N} s_i^{(m)} s_i^{(\beta)}\Big)\Big(\sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}\Big)$.
• Note that the summation with respect to $i$ is the same as the summation with respect to $j$, so we can write this in a more compact form:
$L\big(\mathbf{s}^{(\beta)}\big) = \frac{1}{N}\sum_{m=1}^{M}\Big(\sum_{i=1}^{N} s_i^{(m)} s_i^{(\beta)}\Big)^2$.

Capacity Analysis (16) – Dynamic Capacity (6)
• This gives rise to the following formula if we notice that the expression contains the inner products of pairs of memory items:
$L\big(\mathbf{s}^{(\beta)}\big) = \frac{1}{N}\sum_{m=1}^{M}\big(\mathbf{s}^{(m)T}\mathbf{s}^{(\beta)}\big)^2$.
• However, in the very first investigation we pointed out that the stored memory items should be orthogonal to each other. When $m$ sweeps from 1 to $M$ it hits $\beta$ once, and this term can be singled out, giving
$L\big(\mathbf{s}^{(\beta)}\big) = \frac{1}{N}\Big(\big(\mathbf{s}^{(\beta)T}\mathbf{s}^{(\beta)}\big)^2 + \sum_{m\ne\beta}\big(\mathbf{s}^{(m)T}\mathbf{s}^{(\beta)}\big)^2\Big)$.

Capacity Analysis (17) – Dynamic Capacity (7)
• Due to the orthogonality the second term is zero, giving $L\big(\mathbf{s}^{(\beta)}\big) = \frac{1}{N}\,N^2 = N$.
• We have evaluated the left hand side of the inequality $L\big(\mathbf{s}^{(\beta)}\big) > L(\mathbf{y})$, and now we evaluate the right hand side in a very similar manner.

Capacity Analysis (18) – Dynamic Capacity (8)
• For an arbitrary state $\mathbf{y}$ the same rearrangement gives $L(\mathbf{y}) = \frac{1}{N}\sum_{m=1}^{M}\big(\mathbf{s}^{(m)T}\mathbf{y}\big)^2$.

Capacity Analysis (19) – Dynamic Capacity (9)
• If we have two binary vectors of dimension $N$ with components -1 and +1, it can be verified that the inner product of two vectors $\mathbf{a}, \mathbf{b} \in \{-1,1\}^N$ can be expressed by means of the Hamming distance as
$\mathbf{a}^T\mathbf{b} = N - 2\,d(\mathbf{a},\mathbf{b})$,
• where the Hamming distance d(·,·) is the number of components in which the two binary vectors differ from each other. Using this equation we can rewrite L(y) as
$L(\mathbf{y}) = \frac{1}{N}\sum_{m=1}^{M}\big(N - 2\,d(\mathbf{s}^{(m)},\mathbf{y})\big)^2$.

Capacity Analysis (20) – Dynamic Capacity (10)
• Now we provide an upper bound on this expression, taking into account that the minimum Hamming distance by which the state vector can differ from any stored memory item is 1. As a result we can write
$L(\mathbf{y}) \le \frac{1}{N}\,M\,(N-2)^2$.
• Then we are done, because requiring $L\big(\mathbf{s}^{(\beta)}\big) = N > \frac{M(N-2)^2}{N}$ yields the condition $M < \dfrac{N^2}{(N-2)^2}$.

Capacity Analysis (21) – Dynamic Capacity (10)
• If this is fulfilled, then the stored memory items are indeed going to be steady states, because the inequality holds, the Hopfield Network converges to one of the local maxima of the Lyapunov function, and this local maximum is at the stored memory item; as a result it is a steady state.
• However, this result is devastating, because this capacity is very small: the number of stored memory items is severely limited by this expression, and asymptotically it converges to 1, giving an unusable associative mapping capable of storing one memory item.
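A numerical check (sketch) of two steps used above: the identity a^T b = N − 2 d(a,b) and L(s^(β)) = N for mutually orthogonal stored patterns. Orthogonal ±1 patterns are taken here from a Sylvester–Hadamard matrix, which is an illustrative assumption; any orthogonal set would do.

```python
import numpy as np

def hadamard(n):                       # Sylvester construction; n must be a power of two
    H = np.array([[1]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

N, M = 16, 4
S = hadamard(N)[:M]                    # M orthogonal stored patterns, rows in {-1,+1}^N
W = (S.T @ S) / N                      # Hebbian rule, b = 0

a, c = S[0], S[1]
d_H = np.sum(a != c)
assert a @ c == N - 2 * d_H            # inner product expressed via Hamming distance

def L(y):                              # Lyapunov function with b = 0
    return y @ W @ y

print(L(S[0]))                         # equals N for orthogonal patterns
y = S[0].copy(); y[0] *= -1            # a one-bit neighbour of the first pattern
print(L(y))                            # ((N-2)^2 + 4*(M-1)) / N, strictly below N
```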
Capacity Analysis (22) – Dynamic Capacity (11)
• The Dynamic Capacity of the Hopfield Network (figure omitted: the bound $N^2/(N-2)^2$ as a function of $N$, tending to 1).

Capacity Analysis (23) – Information Theoretical Capacity (1)
• As we have seen, the Dynamic Capacity of the Hopfield Neural Network tends to one as $N$ (the dimension of the stored patterns) increases to infinity. This strongly discourages us from using these networks as associative memories.
• In this section we describe the solution for getting out of this dead-lock.

Capacity Analysis (24) – Information Theoretical Capacity (2)
• The underlying assumption when we investigate the IT capacity of the HNN is that we choose the stored memory patterns as random variables. As a result each component $s_j^{(m)}$ is an independent, identically distributed Bernoulli random variable with
$P\big(s_j^{(m)} = 1\big) = P\big(s_j^{(m)} = -1\big) = \tfrac{1}{2}$, so $E\big\{s_j^{(m)}\big\} = 0$.

Capacity Analysis (25) – Information Theoretical Capacity (3)
• This entails that the stored memory patterns are chosen as random series of coin flips.
• (Figure omitted: orthogonal versus random patterns.)

Capacity Analysis (26) – Information Theoretical Capacity (4)
• What we investigate is whether a memory vector $\mathbf{s}^{(\beta)}$ is a fixed point of the HNN or not. This is a random event, because the components of $\mathbf{s}^{(\beta)}$ are chosen randomly.
• Furthermore, applying the Hebbian Learning Rule, the weight matrix is random as well.
• So we can investigate the probability of $\mathbf{s}^{(\beta)}$ being a fixed point: $P\big(\mathbf{s}^{(\beta)} = \mathrm{sgn}(\mathbf{W}\mathbf{s}^{(\beta)})\big)$.

Capacity Analysis (27) – Information Theoretical Capacity (5)
• Definition 3. The Information Theoretical Capacity is the number of stored memory items $M$ which still guarantees that the probability
$P\big(\mathbf{s}^{(\beta)} = \mathrm{sgn}(\mathbf{W}\mathbf{s}^{(\beta)})\big)$
asymptotically goes to one as $N$ tends to infinity.

Capacity Analysis (28) – Information Theoretical Capacity (6)
• In order to evaluate this probability we have to perform some calculations. It can be rewritten into the form
$P\Big(s_l^{(\beta)} = \mathrm{sgn}\big(\textstyle\sum_j W_{lj}\, s_j^{(\beta)}\big),\ l = 1,\dots,N\Big)$,
• because it is a vector-valued equation, and the original probability holds exactly if the equation holds for every component of the vector.
Capacity Analysis (29) – Information Theoretical Capacity (7)
• Since the components of $\mathbf{s}^{(\beta)}$ are independent random variables, this probability factorizes into the product of identical component-wise probabilities; replacing $W_{lj}$ with its definition coming from the Hebbian Learning Rule, we obtain
$P\Big(s_l^{(\beta)} = \mathrm{sgn}\Big(\frac{1}{N}\sum_{j=1}^{N}\sum_{m=1}^{M} s_l^{(m)} s_j^{(m)} s_j^{(\beta)}\Big)\Big)^N$.
• Restructuring this double summation we get to the following:
$P\Big(s_l^{(\beta)} = \mathrm{sgn}\Big(\frac{1}{N}\sum_{m=1}^{M} s_l^{(m)}\sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}\Big)\Big)^N$.

Capacity Analysis (30) – Information Theoretical Capacity (8)
• Since $\mathbf{s}^{(\beta)}$ is among the stored memory patterns, $m$ hits $\beta$ once, and we can single out that case in the summation, which results in
$P\Big(s_l^{(\beta)} = \mathrm{sgn}\Big(s_l^{(\beta)} + \frac{1}{N}\sum_{m\ne\beta} s_l^{(m)}\sum_{j=1}^{N} s_j^{(m)} s_j^{(\beta)}\Big)\Big)^N$,
• where the singled-out term is $\frac{1}{N}\sum_j \big(s_j^{(\beta)}\big)^2\, s_l^{(\beta)} = s_l^{(\beta)}$, because we are dealing with binary vectors with values $\{+1,-1\}$: raising these values to the square always gives +1, which added up $N$ times gives $N$.

Capacity Analysis (31) – Information Theoretical Capacity (9)
• Let us introduce a new variable
$\zeta_l = \frac{1}{N}\sum_{m\ne\beta}\sum_{j=1}^{N} s_l^{(\beta)} s_l^{(m)} s_j^{(m)} s_j^{(\beta)}$.
• $\zeta_l$ is a random variable as well, because the product of Bernoulli random variables is again a Bernoulli random variable. What we have in this expression is a double summation of Bernoulli random variables.

Capacity Analysis (32) – Information Theoretical Capacity (10)
• Due to the Central Limit Theorem this approximates a Gaussian random variable:
• since the mean of each Bernoulli term is zero, the mean of the Gaussian random variable is zero as well; and because the standard deviation of each Bernoulli term is one, and we add up $(M-1)N$ such terms and normalize by $N$, the standard deviation of the Gaussian random variable is $\sqrt{\dfrac{M-1}{N}}$.

Capacity Analysis (33) – Information Theoretical Capacity (11)
• Furthermore, since we are performing an asymptotic investigation where $N$ tends to infinity, replacing $M-1$ with $M$ does not make any difference, so finally $\zeta_l$ is a zero-mean Gaussian with standard deviation $\sqrt{M/N}$.
• What we have arrived at is:
$P\big(\mathbf{s}^{(\beta)} = \mathrm{sgn}(\mathbf{W}\mathbf{s}^{(\beta)})\big) = P\big(s_l^{(\beta)} = \mathrm{sgn}\big(s_l^{(\beta)}(1 + \zeta_l)\big)\big)^N$.

Capacity Analysis (34) – Information Theoretical Capacity (12)
• This can be expressed by means of conditional probabilities, namely by conditioning on the value of $s_l^{(\beta)}$, where we know that $P\big(s_l^{(\beta)} = 1\big) = P\big(s_l^{(\beta)} = -1\big) = \tfrac{1}{2}$.

Capacity Analysis (35) – Information Theoretical Capacity (13)
• We can rewrite these probabilities taking into account that the sgn() function returns the sign of $s_l^{(\beta)}$ for $s_l^{(\beta)}(1 + \zeta_l)$ exactly if $(1 + \zeta_l) > 0$ holds, which results in
$P\big(\mathbf{s}^{(\beta)} = \mathrm{sgn}(\mathbf{W}\mathbf{s}^{(\beta)})\big) = P\big(1 + \zeta_l > 0\big)^N = P\big(\zeta_l > -1\big)^N$.
• Now we can take advantage of the fact that $\zeta_l$ is Gaussian:
$P\big(\zeta_l > -1\big) = 1 - \Phi\Big(\frac{-1}{\sqrt{M/N}}\Big) = \Phi\Big(\sqrt{\frac{N}{M}}\Big)$.

Capacity Analysis (36) – Information Theoretical Capacity (14)
• $\Phi(u)$ is the cumulative distribution function of the standardized Gaussian variable, with the point symmetry property $1 - \Phi(-u) = \Phi(u)$.
• What we have arrived at is the following:
$P\big(\mathbf{s}^{(\beta)} = \mathrm{sgn}(\mathbf{W}\mathbf{s}^{(\beta)})\big) = \Phi\Big(\sqrt{\frac{N}{M}}\Big)^N$,
• which is a simple formula for investigating this probability.

Capacity Analysis (37) – Information Theoretical Capacity (15)
• In Definition 3 we are investigating under what condition this probability tends to one. This is equivalent to investigating when the logarithm of this probability tends to zero:
$\lim_{N\to\infty} N\,\ln \Phi\Big(\sqrt{\frac{N}{M}}\Big) = 0$.

Capacity Analysis (38) – Information Theoretical Capacity (16)
• Recall the following facts from asymptotic analysis: for large $u$, $1 - \Phi(u) \approx \dfrac{1}{u\sqrt{2\pi}}\,e^{-u^2/2}$, and for small $x$, $\ln(1-x) \approx -x$.
• Having these approximations at hand, we can see that as $N$ tends to infinity the argument $\sqrt{N/M}$ of $\Phi(\cdot)$ becomes very large, so we can approximate $1 - \Phi\big(\sqrt{N/M}\big)$ with the formula above.

Capacity Analysis (39) – Information Theoretical Capacity (17)
• Now $1 - \Phi\big(\sqrt{N/M}\big)$, the quantity subtracted inside the logarithm when we write $\ln\Phi = \ln\big(1 - (1-\Phi)\big)$, tends to zero as $N$ goes to infinity, so we can use $\ln(1-x) \approx -x$ to further rewrite the expression:
$N\,\ln \Phi\Big(\sqrt{\frac{N}{M}}\Big) \approx -N\,\Big(1 - \Phi\big(\sqrt{N/M}\big)\Big) \approx -N\,\sqrt{\frac{M}{2\pi N}}\;e^{-\frac{N}{2M}}$.

Capacity Analysis (40) – Information Theoretical Capacity (18)
• Having this result, we have to find the $M$ for which this expression tends to zero, because we have evaluated the logarithm of the initial probability, and this logarithm should tend to zero if we want the probability to tend to one.
• We can see that this is not difficult to achieve: choosing for example $M = \dfrac{N}{2\ln N}$ makes the exponential factor $e^{-N/(2M)} = 1/N$, so the whole expression behaves as $-1/\sqrt{4\pi\ln N} \to 0$. The information theoretical capacity of the Hopfield net therefore grows essentially as $N/(2\ln N)$.
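A Monte-Carlo sketch of Definition 3 (the parameter values are illustrative): estimate the probability that a randomly drawn pattern is a fixed point under the Hebbian rule, and compare it with the asymptotic expression Φ(√(N/M))^N derived above. For finite N the two agree only roughly, since the formula relies on the CLT and asymptotic approximations.

```python
from math import erf, sqrt
import numpy as np

def Phi(u):                                   # standard normal CDF
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

rng = np.random.default_rng(3)
N, trials = 200, 200

for M in (5, 10, 20, 40):
    hits = 0
    for _ in range(trials):
        S = rng.choice([-1, 1], size=(M, N))          # coin-flip patterns
        W = (S.T @ S) / N                             # Hebbian rule, b = 0
        s = S[0]
        if np.array_equal(np.where(W @ s >= 0, 1, -1), s):
            hits += 1                                 # s is a fixed point of sgn(W s)
    print(M, hits / trials, Phi(sqrt(N / M)) ** N)    # empirical vs asymptotic formula
```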
Capacity Analysis (41) – Statistical Neurodynamics (1)
• When deriving the IT capacity of the HNN we used only a fixed point analysis; we did not pay attention to the dynamics of the network. We still have to investigate the dynamics of the HNN: whether we really converge to the stored memory items if they are chosen to be Bernoulli random variables.
• Statistical Neurodynamics relies on the theory of Statistical Physics, where we investigate very large systems containing a lot of particles (for example the oxygen atoms in a room).

Capacity Analysis (42) – Statistical Neurodynamics (2)
• We can characterize the state of the system by micro-states $\mathbf{y}(k)$, but this description is meaningless, because it contains too much information and does not reveal any important property of the system.
• In order to really characterize the statistical system, the concept of macro-states can be developed.

Capacity Analysis (43) – Statistical Neurodynamics (3)
• If we know the state transition rule between the micro-states, Statistical Physics wants to derive the state transition rule for the macro-states.
• For example, if we take a room, we can observe the positions of the oxygen atoms in it. A micro-state of this system would be characterized by giving the coordinates of each oxygen atom, which would be so much data that we could not retrieve any meaningful information from it.

Capacity Analysis (44) – Statistical Neurodynamics (4)
• However, defining the macro-state as, say, the average spatial distribution of the oxygen atoms gives meaningful information.
• And when we are investigating the state transition, then for example if we turn up the heating in one corner, we know from the basic rules of physics how the micro-states will change, and we then investigate how the macro-state will change: how the temperature will affect the average distribution of the oxygen atoms.

Capacity Analysis (45) – Statistical Neurodynamics (5)
• We define the macro-state of the network as the overlap between the current state and a stored pattern:
$a(k) = \frac{1}{N}\sum_{i=1}^{N} s_i^{(\mu)}\, y_i(k) = \frac{1}{N}\,\mathbf{s}^{(\mu)T}\mathbf{y}(k)$.
• The meaning of this macro-state is represented graphically in the next figure (representation of the macro-state, omitted).

Capacity Analysis (46) – Statistical Neurodynamics (6)
• The macro-state $a(k)$ corresponds to $\cos(\theta(k))$, the cosine of the angle between $\mathbf{y}(k)$ and $\mathbf{s}^{(\mu)}$. If we can prove that $a(k+1) > a(k)$ and that $a(k)$ converges to 1, then the cosine of $\theta(k)$ converges to 1, which means that $\theta(k)$ converges to 0. Consequently, if $\theta(k)$ tends to 0, the micro-state $\mathbf{y}(k)$ converges to $\mathbf{s}^{(\mu)}$.

Capacity Analysis (47) – Statistical Neurodynamics (7)
• Throughout this discussion we restrict ourselves to $\mathbf{s}^{(\mu)} = (1, 1, \dots, 1)$ for the sake of simpler formulas, but the results can also be derived for arbitrary patterns. With this special vector the macro-state becomes
$a(k) = \frac{1}{N}\sum_{i=1}^{N} y_i(k)$.

Capacity Analysis (48) – Statistical Neurodynamics (8)
• This is an empirical average of the random variables $y_i(k)$, which approximates the expected value $E\{y_i(k)\}$ due to the Law of Large Numbers.
• We would like to derive the macro-state transition rule; for this we start by writing down the 0th macro-state, $a(0) = \frac{1}{N}\sum_i y_i(0)$,
• after which we take the 1st macro-state, which is as follows:
$a(1) = \frac{1}{N}\sum_{i=1}^{N} y_i(1) = \frac{1}{N}\sum_{i=1}^{N} \mathrm{sgn}\Big(\sum_{j=1}^{N} W_{ij}\, y_j(0)\Big)$.
Capacity Analysis (49) – Statistical Neurodynamics (9)
• This is an empirical average of random variables: $W_{ij}$ is chosen according to the Hebbian Learning Rule and $s_i^{(\mu)}$ follows a Bernoulli distribution, which is why $y_i(1)$ is a Bernoulli random variable. Elaborating on its expected value we can write
$a(1) \approx E\{y_i(1)\} = E\Big\{\mathrm{sgn}\Big(\sum_{j} W_{ij}\, y_j(0)\Big)\Big\}$.

Capacity Analysis (50) – Statistical Neurodynamics (10)
• After this we can substitute the definition of $W_{ij}$, which comes from the Hebbian Learning Rule:
$\sum_j W_{ij}\, y_j(0) = \sum_{j}\frac{1}{N}\sum_{m=1}^{M} s_i^{(m)} s_j^{(m)} y_j(0)$;
• the two summations can be reordered, which yields
$\frac{1}{N}\sum_{m=1}^{M} s_i^{(m)} \sum_{j=1}^{N} s_j^{(m)} y_j(0)$.

Capacity Analysis (51) – Statistical Neurodynamics (11)
• The first summation sweeps through all possible values of $m$, which includes the index $\mu$ of the recalled pattern, so we can single this term out and write
$s_i^{(\mu)}\,\frac{1}{N}\sum_j s_j^{(\mu)} y_j(0) + \frac{1}{N}\sum_{m\ne\mu} s_i^{(m)} \sum_j s_j^{(m)} y_j(0)$;
• and because we have chosen the pattern in a special way, namely $\mathbf{s}^{(\mu)} = (1,1,\dots,1)$, the first term becomes $\frac{1}{N}\sum_j y_j(0) = a(0)$.

Capacity Analysis (52) – Statistical Neurodynamics (12)
• This is exactly the value of the macro-state at the 0th time instant.
• Elaborating on the second term, we can see that it is a normalized summation of Bernoulli random variables, and due to the Central Limit Theorem it approximates a Gaussian random variable $\zeta_i$ with zero mean and standard deviation $\sqrt{(M-1)/N}$, so $y_i(1) \approx \mathrm{sgn}\big(a(0) + \zeta_i\big)$.

Capacity Analysis (53) – Statistical Neurodynamics (13)
• Considering that we are interested in the asymptotic behaviour of the Hopfield Network, when $N$ tends to infinity writing $M$ instead of $M-1$ does not make any difference, which gives us the macro-state transition in the form
$a(1) \approx E\big\{\mathrm{sgn}\big(a(0) + \zeta\big)\big\}$, with $\zeta$ a zero-mean Gaussian of standard deviation $\sqrt{M/N}$.

Capacity Analysis (54) – Summary of the capacity
• Capacity limits of the Hopfield net (summary figure omitted): static capacity of at most $N$ orthogonal patterns, dynamic capacity tending to 1, and information theoretical capacity on the order of $N/(2\ln N)$ for random patterns.
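A small sketch of the macro-state dynamics: for a zero-mean Gaussian ζ with standard deviation √(M/N), the expectation E{sgn(a + ζ)} equals 2Φ(a√(N/M)) − 1, and iterating this one-step rule (the slides derive only the first step, so repeating it is an assumption here) shows the overlap being driven towards 1 when M is small relative to N.

```python
from math import erf, sqrt

def phi(u):                       # standard normal CDF
    return 0.5 * (1.0 + erf(u / sqrt(2.0)))

def macro_trajectory(a0, N, M, steps=20):
    """Iterate a(k+1) = 2*Phi(a(k)*sqrt(N/M)) - 1, i.e. E{sgn(a(k) + zeta)}."""
    a = a0
    traj = [a]
    for _ in range(steps):
        a = 2.0 * phi(a * sqrt(N / M)) - 1.0
        traj.append(a)
    return traj

# With M well below the capacity limit the overlap is driven towards 1
# (the micro-state converges to the stored pattern); with M too large it is not.
print(macro_trajectory(0.3, N=1000, M=50)[-1])    # close to 1
print(macro_trajectory(0.3, N=1000, M=600)[-1])   # stays well below 1
```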
Outline of applying the HNN as a combinatorial optimizer
• Defining the problem set of combinatorial optimization
• Binary quadratic programming as a combinatorial optimization problem
• HNN as a combinatorial optimizer: philosophy, minimization modification, constraint mapping
• Travelling salesman problem: problem formulation, cost function as a quadratic function, constraints as a linear combination of penalty functions
• TSP problem mapped into a quadratic form, solution with HNN
• HNN for ISI corrupted signal detection: problem formulation, mapping the problem into a quadratic form, solution with HNN, performance analysis
• Examples
• Summary

Combinatorial optimization task
• A discrete optimization task is an optimization task where the domain of the objective function and of the constraints is a discrete set, e.g. $\{0,1\}^N$ or $\{-1,1\}^N$.

Combinatorial optimization task
• A combinatorial optimization problem (COP) is a discrete optimization task where the domain of the functions is also a discrete set, but its elements are combinations of simpler elements (e.g. permutations or tours).
• The trivial solution for a discrete optimization problem is exhaustive search.
• Usually combinatorial optimization tasks are NP problems, so for a large problem size exhaustive search is intractable.

Hopfield network as a combinatorial optimizer
• We have learned that the HNN optimizes (maximizes) its energy function, which is $L(\mathbf{y}) = \mathbf{y}^T\mathbf{W}\mathbf{y} - 2\,\mathbf{b}^T\mathbf{y}$.
• Note that if we don't assume any constraints, or we incorporate the constraints into the energy function, and we choose the objective function to be the energy function, then the HNN can perform the combinatorial optimization.
• So if we can express a combinatorial optimization problem as the minimization of a quadratic function, then a HNN can be used.

HNN as a combinatorial optimizer – philosophy
• We can approximate the solution of a traditionally NP problem along the following pipeline: combinatorial optimization problem → data representation (binary representation) → quadratic optimization problem (quadratic form) → HNN → optimum of the quadratic form → de-representation → solution. The exhaustive search would cost $O(2^N)$ or $O(N!)$ steps, while the HNN transient is $O(N^2)$ steps, at the price of possibly ending in a local optimum.

HNN as a combinatorial optimizer – minimization
• We have used the HNN for maximizing the energy function; we will show that with a modification it can also be used for minimization.
• Modify the weight matrix and the bias, e.g. by flipping their signs: $\widehat{\mathbf{W}} = -\mathbf{W}$, $\widehat{\mathbf{b}} = -\mathbf{b}$.
• The new energy function is $\widehat{L}(\mathbf{y}) = \mathbf{y}^T\widehat{\mathbf{W}}\mathbf{y} - 2\,\widehat{\mathbf{b}}^T\mathbf{y} = -L(\mathbf{y})$.
• It can be proven that maximizing $\widehat{L}(\mathbf{y})$ is the same as minimizing $L(\mathbf{y})$.

HNN as a combinatorial optimizer – minimization
• Furthermore, since $y_i^2 = 1$ for binary states, the diagonal of the weight matrix only contributes a constant term to the energy; the two energy functions (with and without the diagonal) only differ in a constant term, so the minimum is at the same location.

HNN as a combinatorial optimizer – minimization
• Since the modified energy function does not alter the place of the minimum, we can define a state transition rule which minimizes the new energy function, e.g. $y_l(k+1) = \mathrm{sgn}\big(b_l - \sum_j W_{lj}\, y_j(k)\big)$ with the diagonal of $\mathbf{W}$ set to zero.
• The modified HNN is capable of minimizing the quadratic form, and the number of steps needed for this minimization is a polynomial function of the dimension of the network.

HNN as a combinatorial optimizer – minimization
• To prove this we have to ensure three properties for this structure:
1. there exists a global upper bound B;
2. there exists a global lower bound A;
3. the change in the Lyapunov function is always larger than some $\varepsilon > 0$.
HNN as a combinatorial optimizer – minimization
• If these properties are shown, we can state that the transient time is $K \le \dfrac{B-A}{\varepsilon}$.
• Following the method shown for the maximization, the lower and upper bounds can be derived similarly.
• We have to elaborate on the change of the energy function to prove property 3.

HNN as a combinatorial optimizer – minimization
• Rewriting the change of the energy function for a single-component (sequential) update gives the same expression as before, $\Delta L(k) = W_{ll}\,\Delta y_l^2(k) + 2\,\Delta y_l(k)\big(\sum_j W_{lj}\, y_j(k) - b_l\big)$, now evaluated under the sign-flipped update rule.
• To show that the change is bounded, we examine the possible events.

HNN as a combinatorial optimizer – minimization
• Listing the possible changes in the energy function (the mirror image of the case analysis given for maximization): in both cases in which a component flips, the energy change is strictly negative.
• So the energy function always decreases, which drives the network to a minimum point.
• We can again give a bound on the transient time: $K \le \dfrac{B-A}{\varepsilon}$.

HNN as a combinatorial optimizer – minimization: local vs. global optima
• The Hopfield network moves along the energy "surface" and decreases it in every step. However, if a valley other than the basin of the desired solution exists in the energy function, the network can get stuck in a suboptimal answer if it is started from a "wrong" basin.
• The bottoms of such valleys are called local minima. Local minima usually have a higher value than the global minimum.

HNN as a combinatorial optimizer – minimization: strategies for escaping local minima
• There are many different strategies to overcome getting stuck in a local minimum. Many of them are heuristics, or are combined with other combinatorial optimization strategies.
• A few examples:
• if stuck, "shake": add noise to the state, which may shake it into a "better" valley – Noisy Hopfield Network;
• Chaotic Hopfield Network;
• Tabu search combined with the HNN;
• EDA + HNN;
• Hysteretic HNN;
• Multi-stage, multi-initialization HNN, etc.

HNN as a combinatorial optimizer – constraint mapping
• We have seen that the HNN is capable of minimizing a quadratic function,
• but we cannot deal with constraints in a straightforward manner.
• The most common way to deal with the constraints is to incorporate them into the energy function as penalizing terms.

HNN as a combinatorial optimizer – constraint mapping
• We construct an objective function as a linear combination of the original cost function and the penalizing terms, $L_{total}(\mathbf{y}) = L_{cost}(\mathbf{y}) + \sum_i \lambda_i\, L_{constraint,i}(\mathbf{y})$.
• Although other constructions could be used, due to its linearity property the linear combination is the most commonly used choice.
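Before turning to the TSP, here is a sketch of the minimizing HNN just described: sequentially flip one component whenever the flip decreases the quadratic form y^T A y − 2 b^T y. The function name and the random test problem are illustrative assumptions; this finds a local minimum, not necessarily the global one.

```python
import numpy as np

def hnn_minimize(A, b, y0, max_sweeps=100):
    """Sequential descent on y^T A y - 2 b^T y over {-1,+1}^N (A symmetric, zero diagonal)."""
    y = y0.copy()
    N = len(y)
    for _ in range(max_sweeps):
        changed = False
        for i in range(N):
            delta = -4 * y[i] * (A[i] @ y - b[i])   # energy change caused by flipping y_i
            if delta < 0:
                y[i] = -y[i]
                changed = True
        if not changed:              # no flip decreases the energy: local minimum reached
            break
    return y

rng = np.random.default_rng(4)
N = 20
A = rng.standard_normal((N, N)); A = (A + A.T) / 2; np.fill_diagonal(A, 0.0)
b = rng.standard_normal(N)
y_opt = hnn_minimize(A, b, rng.choice([-1, 1], size=N))
```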
Travelling salesman COP
• One of the most important COPs is the Travelling Salesman Problem (TSP).
• We have K cities. The TSP is the following: we have an agent at city 1, and we want him to visit all the cities exactly once and arrive back at city 1, while requiring him to travel the least (on the shortest possible route).

Travelling salesman COP
• We have an edge-weighted graph G(V,E), where the vertices represent the cities, the edges are the possible travelling connections between the cities, and the distances between the cities (travelling costs) are the weights of the edges.
• The TSP problem is the same as finding the shortest Hamiltonian cycle in an edge-weighted graph.
• http://imgs.xkcd.com/comics/travelling_salesman_problem.png

Travelling salesman COP
• Let us follow the steps of the philosophy described above and transform the TSP into a quadratic optimization problem tractable by the HNN (combinatorial optimization problem → binary representation → quadratic form → HNN → de-representation → solution).

Travelling salesman COP
• Given a distance matrix D for the graph, we can represent a trip by a matrix V having $V_{xj} = 1$ if city $x$ is the $j$-th stop of the tour and $V_{xj} = 0$ otherwise.
• For example, a concrete route can be described by the corresponding route matrix (example figure omitted).

Travelling salesman COP
• Note that V could be chosen arbitrarily, but for V to describe a valid tour for the agent, V must satisfy several constraints:
a) each row of V must contain exactly one 1, because the agent cannot be in two or more cities at the same time;
b) each column of V must contain exactly one 1, because we must visit all the cities and we must visit each of them only once.
• We can describe these constraints mathematically as
1. row orthogonality: $\sum_j V_{xj} V_{yj} = 0$ for $x \ne y$;
2. column orthogonality: $\sum_x V_{xi} V_{xj} = 0$ for $i \ne j$;
3. the sum of the elements of V must be K: $\sum_x \sum_j V_{xj} = K$.

Travelling salesman COP
• For V to describe a valid tour (to be a feasible solution), V has to be a permutation matrix.
• Having this notation we can formulate the cost of a route as
$C(\mathbf{V}) = \sum_{x}\sum_{y}\sum_{j} D_{xy}\, V_{xj}\, V_{y,j+1}$ (with the column index understood cyclically).
• This cost is the objective function to be minimized.
• Note that the HNN works with state vectors from $\{-1,1\}^{K^2}$.

Travelling salesman COP
• We have to transform V to the domain of y: $y_i = 2V_{xj} - 1$, i.e. $V_{xj} = \frac{1 + y_i}{2}$.
• After this we can write the objective function as a function of y. Is this a quadratic form?

Travelling salesman COP
• This is a quadratic form, which we will write in the traditional matrix-vector notation, thus giving W and b: the route matrix V is flattened into the state vector y, and the coefficients of the objective are collected into the weight matrix W and the bias vector b.

Travelling salesman COP
• We perform an index change: the double index $(x, j)$ of $V_{xj}$ is mapped to the single index $i = (x-1)K + j$ of $y_i$.
• Starting from the original cost function, we thereby arrive at an expression of the form $\mathbf{y}^T\mathbf{W}\mathbf{y} - 2\,\mathbf{b}^T\mathbf{y}$ plus a constant, i.e. a quadratic form in the single-indexed variables.
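A small sketch of the representation above: building a route matrix V, checking the permutation-matrix constraints, evaluating the tour cost both from the visiting order and from V, and flattening V into a ±1 state vector. The random city coordinates are an illustrative assumption.

```python
import numpy as np

K = 5
rng = np.random.default_rng(5)
pts = rng.random((K, 2))
D = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)   # distance matrix

order = rng.permutation(K)                 # some tour: order[j] is the j-th visited city
V = np.zeros((K, K), dtype=int)
V[order, np.arange(K)] = 1                 # rows: cities, columns: position in the tour

# Constraint check: V is a permutation matrix (each row and column sums to 1).
assert (V.sum(axis=0) == 1).all() and (V.sum(axis=1) == 1).all()

# Tour cost: distances between consecutively visited cities (cyclic tour).
cost = sum(D[order[j], order[(j + 1) % K]] for j in range(K))

# The same cost written with V, as in the quadratic objective above.
cost_V = sum(D[x, y] * V[x, j] * V[y, (j + 1) % K]
             for x in range(K) for y in range(K) for j in range(K))
assert np.isclose(cost, cost_V)

y = 2 * V.flatten() - 1                    # index change (row-major) and mapping to {-1,+1}
```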
Travelling salesman COP
• Once W and b are given, the HNN is run and its stable state y_opt is de-represented (reshaped back) into a matrix V_opt. But is V_opt a permutation matrix? With the cost term alone there is no guarantee, so we need a modified goal function that enforces this property.
• The energy function built from the cost term alone does not guarantee a feasible solution (a valid tour of the cities), so we have to incorporate the constraints on V into the energy function in such a way that the energy is minimal when all the constraints are satisfied and the cost function is minimal.
• We choose a weighted linear combination of the cost function and the constraint terms, so that whenever a constraint is violated the energy function is penalized and pushed farther from its minimum.
• We thus obtain a new energy function as the weighted sum of the cost term and the constraint terms; substituting the vector representation y into all the terms, we get the combined expression.
• This is again a quadratic form, which we write in the traditional matrix-vector notation, thus giving W and b.
• Let us first separate the quadratic, the linear and the constant terms, the same way as we did for the cost term alone.
• After substituting and evaluating the parameters we get separate matrices and vectors for each term, which together form the overall quadratic function.
• These matrices and vectors correspond to the row orthogonality, the column orthogonality, the permutation-matrix property and the pressure of the cost term, respectively.
• The weights of the linear combination can be adjusted during the solution process, in each stage emphasizing one property over another; usually heuristics are applied to change them.
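The idea of turning constraints into penalty terms can be illustrated with a short sketch. The exact penalty terms and weights used on the slides are not reproduced here; the variant below (a common Hopfield-Tank style choice) penalizes row sums and column sums different from 1 and a total sum different from K, and the weights a, c, d are hand-picked, hypothetical values.

import numpy as np

def tsp_energy(V, D, a=500.0, c=500.0, d=1.0):
    # Penalty-augmented TSP energy, a sketch in the spirit of the slides.
    # V[i, x] = 1 if city x is visited at step i (same convention as before).
    K = D.shape[0]
    row_pen = np.sum((V.sum(axis=1) - 1.0) ** 2)    # exactly one city per step
    col_pen = np.sum((V.sum(axis=0) - 1.0) ** 2)    # each city visited exactly once
    sum_pen = (V.sum() - K) ** 2                    # K ones altogether
    cost = sum(D[x, y] * V[i, x] * V[(i + 1) % K, y]
               for i in range(K) for x in range(K) for y in range(K))
    return a * (row_pen + col_pen) + c * sum_pen + d * cost

# hypothetical 4-city example: a valid tour has far lower energy than an invalid V
D = np.array([[0, 2, 9, 4], [2, 0, 6, 3], [9, 6, 0, 5], [4, 3, 5, 0]], dtype=float)
V_valid = np.eye(4)                                 # tour 1 -> 2 -> 3 -> 4 -> 1
V_bad = np.zeros((4, 4)); V_bad[0, :] = 1.0         # violates the row constraint
print(tsp_energy(V_valid, D), tsp_energy(V_bad, D))

For a valid permutation matrix all penalty terms vanish and only the tour cost remains, while an infeasible V is pushed far above the minimum; this is exactly the behaviour the weighted linear combination is meant to produce.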
Signal detection as combinatorial optimization
• Another example of a COP is the compensation and detection of multipath-propagated radio waves in communication.
• We send a block of symbols, but the receiver gets a noisy linear combination of them due to inter-symbol interference (ISI) and additive Gaussian noise: the sent signal passes through the channel and arrives as the received signal.
• Because the channel acts like a linear filter, we receive a noise-added convolution of the sent symbols with the channel's impulse response function; this is the ISI.
• Based on the knowledge of H (the channel matrix) and the received message x, we want to decide what the most probable sent information vector y was.
• We can use a simple decision rule that takes the sign of the received signal; this is the threshold detector: the detected vector is sign(x), taken component-wise.
• But we know more about the underlying phenomenon (we know H, and that noise is added), so we can apply the Bayesian decision rule.
• We know that the received signal is constructed by the channel as x = Hy + ν,
• and that ν is additive white Gaussian noise,
• so the observed signal can be treated as a (conditionally Gaussian) random variable.
• The optimal decision based on the Bayesian rule picks the most probable sent vector; for equally likely symbol vectors and Gaussian noise this means choosing the y ∈ {−1, +1}^N that minimizes ||x − Hy||². We will show that this is indeed a quadratic form.
• Expanding the expression: ||x − Hy||² = yᵀHᵀHy − 2xᵀHy + xᵀx, where the last term does not depend on y and can be dropped.
• So we have constructed a quadratic energy function that can be used as the energy function of the HNN.
• The HNN, with its weight matrix and bias vector derived from H and x in this way, can approximate the optimal detection.
• The performance of this detector was analyzed on an example.
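A small simulation sketch makes the comparison concrete. The channel impulse response, the noise level and the block length below are hypothetical values, not the ones from the slides; the optimal (maximum-likelihood) decision is carried out here by exhaustive search over all ±1 vectors, which is exactly the exponential search that the HNN is meant to approximate with polynomial effort (under the energy convention assumed earlier, the corresponding HNN parameters would be built from HᵀH and Hᵀx).

import numpy as np
from itertools import product

def threshold_detect(x):
    # Simple threshold detector: component-wise sign of the received signal.
    return np.where(x >= 0, 1.0, -1.0)

def ml_detect(x, H):
    # Optimal (maximum-likelihood) detection by exhaustive search:
    # argmin over y in {-1,+1}^N of ||x - H y||^2.
    N = H.shape[1]
    best, best_err = None, np.inf
    for bits in product([-1.0, 1.0], repeat=N):
        y = np.array(bits)
        err = np.sum((x - H @ y) ** 2)
        if err < best_err:
            best, best_err = y, err
    return best

# toy experiment with a hypothetical ISI channel (impulse response [1, 0.6, 0.3])
rng = np.random.default_rng(0)
N, h, sigma = 8, np.array([1.0, 0.6, 0.3]), 0.6
H = sum(np.diag(np.full(N - k, h[k]), -k) for k in range(len(h)))   # convolution (Toeplitz) channel matrix
y_true = rng.choice([-1.0, 1.0], size=N)
x = H @ y_true + sigma * rng.standard_normal(N)                     # x = H y + noise
print("threshold errors:", np.sum(threshold_detect(x) != y_true))
print("ML errors:       ", np.sum(ml_detect(x, H) != y_true))

On channels with strong ISI the maximum-likelihood rule typically decodes with fewer symbol errors than the plain threshold detector, which is the motivation for the HNN-based detector.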
Analog circuit implementation of the HNN
• There are several types of implementation of the HNN. Software packages such as Matlab or LabVIEW contain toolboxes for different neural networks.
• On a DSP one can exploit fast matrix-vector multiplication.
• An optical implementation gives a very fast architecture.
• However, the software implementations are very slow compared to hardware, while the DSP and optical implementations do not scale to large networks because of the quadratically growing number of interconnections between neurons.
• The first step in implementing the Hopfield Neural Network as an analog circuit is to analyze the nonlinear state transition rule of the network, where we set b = 0 for the sake of simpler formulas.
• This is a discrete-time state transition rule over the steps k = 1, 2, …
• An analog circuit, however, cannot handle discrete time, only continuous time. This raises the first question: how do we change the network from discrete to continuous time?
• We need to modify the state transition from a discrete time step to an infinitesimally small time step (turning the difference equation into a differential equation) and use a differentiable activation function instead of the sign function.
• If we choose an arbitrarily small time step, the recursion becomes a system of differential equations (a minimal numerical sketch of this continuous-time update is given after the exercises below).

Exercises – example 1
Given a DHNN by its weight matrix and bias vector.
a) Determine, and draw in the figure, the state transitions and the stable point using the given values of the Lyapunov function, if the network is used for minimization and the initial state is the one given.
b) Verify the solution by applying the state transition rule and computing the successive states.

Exercises – example 1, solution a)

Exercises – example 1, solution b)

Exercises – example 2
Give the stable point of the following HNN:

Exercises – example 3
We want to store the following samples in an HNN used as an associative memory:
• Give the weight matrix and the bias vector of the network!
• Are the samples orthogonal?
• Show a stable point in addition to the stored sample points!
• Mark in the figure the states from which the net converges to the stored samples.

Exercises – example 4
We want to solve the following optimization problem with a Hopfield net:
• Give the concrete recursive state update formula of the Hopfield net used for this minimization!

Exercises – example 5
We want to use a Hopfield net for detection in a digital communication system. The linear channel distortion with AWGN is assumed to be known and stationary. The impulse response of the channel is given, and the SNR is 0 dB. The block representation of the system model is as shown.
• Show that the HNN is the optimal detector for this problem.
• Give the weight matrix and the bias vector of the Hopfield net for the given channel impulse response and noise power if the received signal is the one given.
• What will be the decoded message?
• What would the decoded message be if a threshold detector had been used instead of the HNN detector? Note: the initial state of the HNN is random; use the value given in the example.
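Returning to the discrete-to-continuous question raised in the analog-implementation part, the sketch below integrates a continuous-time Hopfield dynamics numerically. The concrete differential equation of the slides is not reproduced here, so the sketch assumes the widely used convention du/dt = −u + W·g(u) + b with the differentiable activation g(u) = tanh(gain·u), integrated with a simple Euler step; the step size, gain and toy weight matrix are illustrative assumptions.

import numpy as np

def continuous_hopfield(W, b, u0, dt=0.01, steps=5000, gain=5.0):
    # Continuous-time Hopfield dynamics under assumed conventions:
    #   du/dt = -u + W g(u) + b,   g(u) = tanh(gain * u)
    # The Euler step dt stands in for the "arbitrarily small time step" of the slides.
    u = np.array(u0, dtype=float)
    for _ in range(steps):
        y = np.tanh(gain * u)            # differentiable activation instead of sign
        u = u + dt * (-u + W @ y + b)    # Euler step of the differential equation
    return np.tanh(gain * u)             # near-saturated outputs approximate the +/-1 states

# toy usage with a hypothetical symmetric, zero-diagonal weight matrix
rng = np.random.default_rng(2)
W = rng.standard_normal((6, 6)); W = (W + W.T) / 2; np.fill_diagonal(W, 0.0)
print(continuous_hopfield(W, np.zeros(6), rng.standard_normal(6)))

As the gain grows, tanh approaches the sign function and the continuous dynamics approaches the discrete network discussed earlier.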
Summary
• A method was shown for using the HNN to minimize a quadratic cost function.
• This construction was used for solving combinatorial optimization problems, which are traditionally NP-hard; the HNN gives a polynomial-complexity approximation of their solution.
• Examples were shown of mapping combinatorial optimization problems into quadratic programming tasks.
• A possible analog circuit implementation of the HNN was shown.