1Cademy - Location-Based Addressing (NTM Architecture)

When content-based addressing is not well suited to a problem, location-based addressing is used instead. It addresses memory purely by location, through a rotational shift of the weighting. Before the shift is applied, each read and write head emits a scalar interpolation gate $g_t$ (a value between 0 and 1). The gate blends $w_{t-1}$ (the weighting the head produced at the previous time step) with $w_t^c$ (the weighting produced by content-based addressing at the current time step) to give the gated weighting $w_t^g$:

$$w_t^g \leftarrow g_t w_t^c + (1 - g_t) w_{t-1}$$
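The interpolation above can be sketched in a few lines of plain Python. This is only an illustration: the function name `interpolate` and the example vectors are hypothetical, and weightings are assumed to be plain lists of floats over the same memory locations.

```python
def interpolate(w_content, w_prev, g):
    """Blend the content weighting with the previous weighting.

    Hypothetical helper: computes g * w_content + (1 - g) * w_prev
    element-wise, mirroring the gated-weighting formula above.
    """
    if not 0.0 <= g <= 1.0:
        raise ValueError("interpolation gate g must lie in [0, 1]")
    if len(w_content) != len(w_prev):
        raise ValueError("both weightings must cover the same memory locations")
    return [g * wc + (1.0 - g) * wp for wc, wp in zip(w_content, w_prev)]


# Made-up example over four memory locations.
w_c = [0.0, 1.0, 0.0, 0.0]      # content-based weighting at time t
w_prev = [0.5, 0.0, 0.5, 0.0]   # head weighting from time t - 1
w_g = interpolate(w_c, w_prev, 0.25)
print(w_g)  # [0.375, 0.25, 0.375, 0.0]
```

Because both inputs sum to 1 and the gate is a convex combination, the gated weighting also sums to 1.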

Depending on $g_t$, the head can completely ignore either the content-based weighting or its own previous weighting: if $g_t = 0$, the content-based weighting is ignored, and if $g_t = 1$, the previous weighting is ignored. After interpolation, a shift weighting $s_t$ is applied. $s_t$ is a normalized distribution over the allowed integer shifts, and the simplest way to produce it is with a softmax. The rotation applied to the gated weighting is a circular convolution, with all index arithmetic taken modulo $N$ (the number of memory locations):

$$\tilde{w}_t(i) \leftarrow \sum_{j=0}^{N-1} w_t^g(j) \, s_t(i-j)$$
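A minimal sketch of this circular convolution in plain Python follows. The function name `circular_shift` is hypothetical, and representing $s_t$ as a dictionary from integer shift offsets to probabilities is an assumption made for readability; offsets that are absent from the dictionary are treated as having zero weight.

```python
def circular_shift(w_gated, shift_dist):
    """Rotate a gated weighting by a distribution over integer shifts.

    Hypothetical helper: shift_dist maps an integer offset to its
    probability s_t(offset). For each location i, the term with
    i - j = offset (mod N) comes from location j = (i - offset) mod N.
    """
    n = len(w_gated)
    shifted = [0.0] * n
    for i in range(n):
        for offset, prob in shift_dist.items():
            shifted[i] += w_gated[(i - offset) % n] * prob
    return shifted


w_g = [0.0, 1.0, 0.0, 0.0]  # made-up gated weighting focused on location 1

# A sharp shift of +1 moves the focus from location 1 to location 2.
print(circular_shift(w_g, {-1: 0.0, 0: 0.0, 1: 1.0}))     # [0.0, 0.0, 1.0, 0.0]

# A soft shift distribution spreads the focus over neighbouring locations.
print(circular_shift(w_g, {-1: 0.25, 0: 0.5, 1: 0.25}))   # [0.25, 0.5, 0.25, 0.0]
```

The modulo in the index is what makes the shift rotational: a focus on the last location shifted by +1 wraps around to location 0.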

If the shift weighting is not sharp, this convolution can cause the weighting to leak or disperse over time. To counter this, each head emits an additional scalar $\gamma_t \ge 1$ that sharpens the final weighting: