
Friday, March 4, 2022

Hopfield networks and energy equations for neural networks

As we discussed in Chapter 3, Building Blocks of Deep Neural Networks, Hebbian learning states that "neurons that fire together, wire together," and many models, including the multi-layer perceptron, made use of this idea to develop learning rules. One of these models was the Hopfield network, developed in the 1970s-80s by several researchers. In this network, each "neuron" is connected to every other neuron by a symmetric weight, but has no connection to itself (there are connections between distinct neurons only, with no self-loops).

Unlike the multi-layer perceptrons and other architectures we studied in Chapter 3, Building Blocks of Deep Neural Networks, the Hopfield network is an undirected graph, since the edges go "both ways."
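As a minimal sketch of this structure (illustrative only, not code from the book), the weight matrix of a Hopfield network can be represented as a symmetric matrix with a zero diagonal:

```python
import numpy as np

n_units = 4
rng = np.random.default_rng(0)

# Start from arbitrary values, then enforce the Hopfield constraints:
weights = rng.normal(size=(n_units, n_units))
weights = (weights + weights.T) / 2   # symmetric: w_ij == w_ji, so edges go "both ways"
np.fill_diagonal(weights, 0.0)        # no self-connections
```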

The neurons in the Hopfield network take on binary values, either {-1, 1} or {0, 1}, computed as a thresholded version of the tanh or sigmoid activation function:
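One standard way to write this threshold rule (a sketch using the symbols defined later in this section, not necessarily the book's exact notation) is:

$$s_i = \begin{cases} +1 & \text{if } \sum_j w_{ij}\, s_j \ge \sigma_i \\ -1 & \text{otherwise} \end{cases}$$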

The threshold values (sigma) never change during training; to update the weights, a "Hebbian" approach is to take a set of n binary patterns (configurations of all the neurons) and update each weight as:
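A common form of this Hebbian update (a sketch; here e_i^p is assumed to denote the activation of neuron i in pattern p, and the 1/n normalization is one convention among several) is:

$$w_{ij} = \frac{1}{n} \sum_{p=1}^{n} e_i^{p}\, e_j^{p}$$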

where n is the number of patterns, and e_i and e_j are the binary activations of neurons i and j in a particular configuration. Looking at this equation, you can see that if the neurons share a configuration, the connection between them is strengthened, while if they have opposite signs (one neuron has a sign of +1, the other -1), it is weakened. Following this rule to iteratively strengthen or weaken connections leads the network to converge to a stable configuration that resembles a "memory" for a particular activation of the network, given some input. This represents a model for associative memory in biological organisms: the kind of memory that links unrelated ideas, just as the neurons in the Hopfield network are linked together.

Besides representing biological memory, Hopfield networks also have an interesting parallel to electromagnetism. If we consider each neuron as a particle or "charge," we can describe the model in terms of a "free energy" equation that represents how the particles in this system mutually repel or attract each other, and where the system lies, relative to equilibrium, on the distribution of potential configurations:
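A standard form of this energy function (a sketch; the sign convention on the threshold term varies across references and may differ from the book's exact expression) is:

$$E = -\frac{1}{2}\sum_{i,j} w_{ij}\, s_i s_j + \sum_i \sigma_i\, s_i$$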


where w is the weights between neurons i and j, s is the "states" of those neurons (either 1, "on," or -1, "off"), and sigma is the threshold of each neuron (that is, the value that its total inputs must exceed to set it to "on"). When the Hopfield network is in its final configuration, it also minimizes the value of the energy function computed for the network, which is lowered when strongly connected units (large w) share an identical state (s). The probability associated with a particular configuration is given by the Gibbs measure:
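The Gibbs measure can be written as follows (a sketch; beta is an inverse-temperature parameter, an assumption since the excerpt does not define it explicitly):

$$P(s) = \frac{e^{-\beta E(s)}}{Z(\beta)}$$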

Here, Z(β) is a normalizing constant that represents all possible configurations of the network, in the same way as the normalizing constant in the Bayesian probability function you saw in Chapter 1, An Introduction to Generative AI: "Drawing" Data from Models.

Also notice in the energy function definition that the state of a neuron is only affected by its local connections (rather than the state of every other neuron in the network, regardless of whether it is connected); this is also known as the Markov property, since the state is "memoryless," depending only on its immediate "past" (neighbors). In fact, the Hammersley-Clifford theorem states that any distribution having this same memoryless property can be represented using the Gibbs measure.
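To make these pieces concrete, here is a minimal NumPy sketch (illustrative only, not code from the book; the helper names are my own) that trains a Hopfield network with the Hebbian rule, runs the threshold update, and evaluates the energy function described above:

```python
import numpy as np

def train_hebbian(patterns):
    """Hebbian rule: w_ij = (1/n) * sum_p e_i^p * e_j^p, symmetric, no self-loops."""
    n_patterns = patterns.shape[0]
    weights = patterns.T @ patterns / n_patterns
    np.fill_diagonal(weights, 0.0)  # no self-connections
    return weights

def energy(state, weights, thresholds):
    """Energy is lowered when strongly connected units share the same state."""
    return -0.5 * state @ weights @ state + thresholds @ state

def update(state, weights, thresholds, n_steps=100, seed=0):
    """Asynchronous threshold updates: s_i = +1 if its total input exceeds sigma_i."""
    rng = np.random.default_rng(seed)
    state = state.copy()
    for _ in range(n_steps):
        i = rng.integers(len(state))
        state[i] = 1.0 if weights[i] @ state >= thresholds[i] else -1.0
    return state

# Store two binary patterns, then recover one from a corrupted copy.
patterns = np.array([[1, -1, 1, -1, 1, -1],
                     [1, 1, 1, -1, -1, -1]], dtype=float)
weights = train_hebbian(patterns)
thresholds = np.zeros(patterns.shape[1])

noisy = patterns[0].copy()
noisy[0] *= -1  # flip one unit
recovered = update(noisy, weights, thresholds)

print("energy before:", energy(noisy, weights, thresholds))
print("energy after:", energy(recovered, weights, thresholds))
print("matches stored pattern:", np.array_equal(recovered, patterns[0]))
```

With these toy patterns, the corrupted state settles back to the stored pattern and the computed energy drops, which is exactly the "memory" behavior described above.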

