einsum applies the Einstein summation convention to its operands. Implemented in most tensor libraries. It saves explicit transposes, reductions, and broadcasting.
| operation | array | einsum |
|---|---|---|
| return the vector unchanged | a | einsum('i', a) |
| inner product | dot(a, b), inner(a, b) | einsum('i,i', a, b) |
| elementwise product | multiply(a, b), a * b | einsum('i,i->i', a, b) |
| outer product | outer(a, b) | einsum('i,j->ij', a, b) |
| transpose | A.T | einsum('ji', A) |
| diagonal | diag(A) | einsum('ii->i', A) |
| trace | trace(A) | einsum('ii', A) |
| sum of all elements | sum(A) | einsum('ij->', A) |
| column sum | sum(A, axis=0) | einsum('ij->j', A) |
| matrix product | matmul(A, B), A @ B | einsum('ij,jk->ik', A, B) |
| product with a transpose | matmul(A, B.T), A @ B.T | einsum('ij,kj->ik', A, B) |
| batched matrix product | matmul(A, B) on 3-D arrays | einsum('bij,bjk->bik', A, B) |
| each value of A times each value of B | A[:, :, None, None] * B | einsum('ij,kl->ijkl', A, B) |
| bilinear transformation | | einsum('ik,jkl,il->ij', A, B, C) |
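A quick NumPy check of a few rows from the table (a minimal sketch; the shapes below are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(size=5), rng.normal(size=5)
A, B = rng.normal(size=(3, 4)), rng.normal(size=(4, 2))
S = rng.normal(size=(4, 4))  # trace needs a square matrix

# inner product of two vectors
assert np.allclose(np.einsum('i,i', a, b), np.dot(a, b))
# matrix product
assert np.allclose(np.einsum('ij,jk->ik', A, B), A @ B)
# trace
assert np.allclose(np.einsum('ii', S), np.trace(S))
# each value of A times each value of B
assert np.allclose(np.einsum('ij,kl->ijkl', A, B), A[:, :, None, None] * B)
# bilinear transformation: out[i, j] = x_i @ W_j @ y_i over a stack of j matrices
x, W, y = rng.normal(size=(3, 5)), rng.normal(size=(2, 5, 4)), rng.normal(size=(3, 4))
assert np.einsum('ik,jkl,il->ij', x, W, y).shape == (3, 2)
```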
Omitting the arrow '->' takes the labels that appear exactly once and arranges them in alphabetical order as the output labels. For example, 'ij,jk' is equivalent to 'ij,jk->ik'.
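For instance, in NumPy:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)
B = np.arange(12).reshape(3, 4)

# 'j' appears twice, so it is summed over; 'i' and 'k' each appear once,
# so the implicit output is their alphabetical arrangement 'ik'.
assert np.array_equal(np.einsum('ij,jk', A, B), np.einsum('ij,jk->ik', A, B))

# The same rule makes einsum('ji', A) a transpose: the implicit
# output 'ij' reverses the input labels 'ji'.
assert np.array_equal(np.einsum('ji', A), A.T)
```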
einsum allows the ellipsis syntax '...' for axes we’re not particularly interested in, like batch dimensions. For example, einsum('...ij,ji...->...', A, B) contracts the last two axes of A against the first two axes of B, broadcasting the remaining axes together.
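A sketch of that example, assuming a single shared batch axis:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(7, 3, 4))  # '...' matches the batch axis 7; 'ij' the last two
B = rng.normal(size=(4, 3, 7))  # 'ji' matches the first two axes; '...' the batch

# 'i' and 'j' are absent from the output '...', so they are summed over,
# while the batch axes of A and B broadcast together.
out = np.einsum('...ij,ji...->...', A, B)
assert out.shape == (7,)

# Equivalent computation for one batch element q.
q = 2
assert np.isclose(out[q], np.sum(A[q] * B[..., q].T))
```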
Transposes such as einsum('ijk...->kji...', A) are the same as swapaxes(A, 0, 2).
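A quick check of the equivalence:

```python
import numpy as np

A = np.arange(120).reshape(2, 3, 4, 5)

# 'ijk...->kji...' swaps the first and third axes and leaves the rest alone.
assert np.array_equal(np.einsum('ijk...->kji...', A), np.swapaxes(A, 0, 2))
```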
Further resources from Tim Rocktäschel, ajcr and tensorchiefs. For a visual understanding of what is going on under the hood, see Olexa Bilaniuk’s post.
Similar notation has been used for shapes in einshape.