Location-Based Addressing (NTM Architecture)
When content-based addressing is not well suited to a problem, location-based addressing is used instead. It focuses purely on memory location, using interpolation followed by a rotational shift of the weight. Before the rotational shift is performed, each read and write head produces a scalar interpolation gate $g_t$ (between 0 and 1). This value blends $w_{t-1}$ (the weight produced by the head at the previous time step) with $w_t^c$ (the weight generated by the content-based system) to return a gated weight, $w_t^g$:
$$w_t^g \leftarrow g_t w_t^c + (1 - g_t) w_{t-1}$$
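The interpolation step can be sketched in a few lines of NumPy. This is only an illustration: the function name `interpolate` and the example vectors are assumptions made for this sketch, not part of any NTM library.

```python
import numpy as np

def interpolate(w_content, w_prev, g):
    """Blend content-based weights with the previous weights using a gate g in [0, 1]."""
    w_content = np.asarray(w_content, dtype=float)
    w_prev = np.asarray(w_prev, dtype=float)
    return g * w_content + (1.0 - g) * w_prev

# Illustrative values (hypothetical, chosen so each vector sums to 1).
w_c = np.array([0.1, 0.7, 0.2])      # content-based weight w_t^c
w_prev = np.array([0.6, 0.3, 0.1])   # previous head weight w_{t-1}

w_g0 = interpolate(w_c, w_prev, 0.0)    # gate closed: content system ignored
w_g1 = interpolate(w_c, w_prev, 1.0)    # gate open: previous weight ignored
w_half = interpolate(w_c, w_prev, 0.5)  # equal blend of both
```

Because the result is a convex combination of two normalized vectors, the gated weight still sums to 1 for any gate value between 0 and 1.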
Depending on the value of $g_t$, we might completely ignore the weight produced by the content-based system or the weight produced by the head at the previous time step. More precisely, if $g_t$ is 0, the content-based weight is ignored, and if $g_t$ is 1, the previous head weight is ignored. After this interpolation, a shift weighting $s_t$ is applied. The simplest way to define the shift weighting is as a softmax distribution over the allowed integer shifts. The rotation applied to the gated weight is a circular convolution, with index arithmetic taken modulo $N$ (the number of memory locations):
$$\tilde{w}_t(i) \leftarrow \sum_{j=0}^{N-1} w_t^g(j)\, s_t(i - j)$$
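A minimal NumPy sketch of this circular convolution follows. It assumes the shift weighting is stored as a length-N vector `s` in which `s[k]` is the probability of shifting by `k` positions modulo N (so a shift of -1 lives at `s[N-1]`); that layout, the function name `circular_shift`, and the example values are choices made for this illustration.

```python
import numpy as np

def circular_shift(w_gated, s):
    """Compute w_tilde(i) = sum_j w_gated(j) * s((i - j) mod N)."""
    w_gated = np.asarray(w_gated, dtype=float)
    s = np.asarray(s, dtype=float)
    n = len(w_gated)
    w_tilde = np.zeros(n)
    for i in range(n):
        for j in range(n):
            # Indices wrap around, which is what makes the shift rotational.
            w_tilde[i] += w_gated[j] * s[(i - j) % n]
    return w_tilde

w_g = np.array([0.0, 1.0, 0.0, 0.0, 0.0])   # gated weight focused on location 1

# A sharp shift of +1 moves the focus to location 2.
s_sharp = np.array([0.0, 1.0, 0.0, 0.0, 0.0])
shifted = circular_shift(w_g, s_sharp)

# A soft shift over {-1, 0, +1} spreads the focus across neighbours.
s_soft = np.array([0.1, 0.8, 0.0, 0.0, 0.1])
blurred = circular_shift(w_g, s_soft)
```

With the soft shift, a weight that was fully focused on one location ends up spread over three, which is the dispersion effect that a subsequent sharpening step is meant to counteract.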
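If the shift weighting is not sharp, this convolution can cause the weight to leak or disperse across neighbouring locations over time. To counter this, each head produces an additional scalar $\gamma_t \geq 1$ that sharpens the shifted weight:
$$w_t(i) \leftarrow \frac{\tilde{w}_t(i)^{\gamma_t}}{\sum_{j} \tilde{w}_t(j)^{\gamma_t}}$$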
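The sharpening step can be sketched as follows, assuming the usual NTM rule in which each shifted weight is raised to the power of the sharpening scalar (at least 1) and the result is renormalized to sum to 1. The function name `sharpen` and the example vector are hypothetical.

```python
import numpy as np

def sharpen(w_shifted, gamma):
    """Raise each weight to the power gamma (>= 1) and renormalize to sum to 1."""
    w_shifted = np.asarray(w_shifted, dtype=float)
    powered = w_shifted ** gamma
    return powered / powered.sum()

# A dispersed weight, e.g. the result of a soft rotational shift (illustrative values).
dispersed = np.array([0.1, 0.1, 0.8, 0.0, 0.0])

unchanged = sharpen(dispersed, 1.0)   # gamma = 1 leaves the weight as it is
sharp = sharpen(dispersed, 2.0)       # gamma = 2 concentrates mass on the peak
```

Larger values of the sharpening scalar push more of the mass onto the largest entry, so the head can recover a focused weight after a blurry shift.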
Tags
Data Science