Layer Normalization
Layer Normalization refers to a neural network layer that normalizes the values of the data it receives, i.e. shifts them to mean 0 and scales them to standard deviation 1, across the channel and spatial dimensions (but not the batch dimension), and then applies a scale and a shift, each parameterized by a learnable tensor. For example, in a convolutional neural network for N-dimensional images with M channels, a layer norm (LN) layer placed immediately after the input layer would compute one mean and one standard deviation per sample in the batch, regardless of the number of spatial dimensions N or channels M.
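As a minimal sketch of this computation in NumPy (the function name `layer_norm` is illustrative, and the epsilon added for numerical stability is a common convention, not part of the definition above):

```python
import numpy as np

def layer_norm(x, gamma, beta, eps=1e-5):
    # x has shape (batch, channels, *spatial). Normalize over every
    # axis except the batch axis, so each sample in the batch gets
    # its own mean/standard-deviation pair.
    axes = tuple(range(1, x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)
    # Apply the learnable scale (gamma) and shift (beta).
    return gamma * x_hat + beta

# Example: a batch of 4 two-dimensional images (N=2) with 3 channels (M=3).
x = np.random.randn(4, 3, 8, 8)
gamma = np.ones((3, 8, 8))   # learnable scale, initialized to 1
beta = np.zeros((3, 8, 8))   # learnable shift, initialized to 0
y = layer_norm(x, gamma, beta)
print(y.mean(axis=(1, 2, 3)))  # one per-sample mean each, all ~0
```

Note that exactly one mean and one variance are computed per sample (four pairs here), as described above; the shapes of `gamma` and `beta` match a single sample, since the scale and shift are applied element-wise.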