Einstein summation convention on the operands, implemented in most tensor libraries. It saves the time otherwise spent on explicit transposes and broadcasting bookkeeping. Several of the equivalences in the table below are verified in the sketch that follows it.
operation | array | einsum
---|---|---
vector (view) | a | einsum('i', a)
dot product (1-D) | dot(a, b), inner(a, b) | einsum('i,i', a, b)
element-wise product (1-D) | multiply(a, b), a * b | einsum('i,i->i', a, b)
outer product | outer(a, b) | einsum('i,j->ij', a, b)
transpose | A.T | einsum('ji', A)
diagonal | diag(A) | einsum('ii->i', A)
trace | trace(A) | einsum('ii', A)
sum | sum(A) | einsum('ij->', A)
column sum | sum(A, axis=0) | einsum('ij->j', A)
matrix multiplication | matmul(A, B), A @ B | einsum('ij,jk->ik', A, B)
matrix multiplication, second operand transposed | matmul(A, B.T), A @ B.T | einsum('ij,kj->ik', A, B)
batched matrix multiplication | matmul(A, B) | einsum('bij,bjk->bik', A, B)
each value of A multiplied by each value of B | A[:, :, None, None] * B | einsum('ij,kl->ijkl', A, B)
bilinear transformation | | einsum('ik,jkl,il->ij', A, B, C)
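Several of these equivalences can be checked directly. A minimal NumPy sketch, with array shapes chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b = rng.normal(size=5), rng.normal(size=5)
A, B = rng.normal(size=(3, 4)), rng.normal(size=(4, 5))

assert np.allclose(np.einsum('i,i', a, b), np.dot(a, b))        # dot product
assert np.allclose(np.einsum('i,j->ij', a, b), np.outer(a, b))  # outer product
assert np.allclose(np.einsum('ij->j', A), A.sum(axis=0))        # column sum
assert np.allclose(np.einsum('ij,jk->ik', A, B), A @ B)         # matmul

# Batched matrix multiplication over a shared leading batch axis.
Ab, Bb = rng.normal(size=(7, 3, 4)), rng.normal(size=(7, 4, 5))
assert np.allclose(np.einsum('bij,bjk->bik', Ab, Bb), Ab @ Bb)
```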
Omitting the arrow '->' takes the labels that appear only once and arranges them in alphabetical order as the output. For example, 'ij,jk' is equivalent to 'ij,jk->ik'.
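A quick check of the implicit-output rule (shapes are my own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.normal(size=(3, 4)), rng.normal(size=(4, 5))

# 'i' and 'k' each appear once, so the implicit output is 'ik'.
assert np.allclose(np.einsum('ij,jk', A, B), np.einsum('ij,jk->ik', A, B))

# Any other output order requires the explicit arrow form.
assert np.allclose(np.einsum('ij,jk->ki', A, B), (A @ B).T)
```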
einsum allows the ellipsis syntax '...' for axes we're not particularly interested in, such as batch dimensions. For example, einsum('...ij,ji...->...', A, B) contracts just the last two axes of A against the first two axes of B, broadcasting over the remaining axes.
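To make the ellipsis example concrete, here is one way the shapes can line up (the batch size and matrix dimensions below are my own choice):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(7, 3, 4))  # '...' matches the leading batch axis
B = rng.normal(size=(4, 3, 7))  # '...' matches the trailing batch axis

out = np.einsum('...ij,ji...->...', A, B)  # shape (7,)

# Per batch element b, the last two axes of A[b] are contracted against
# the first two axes of B[..., b]: out[b] = trace(A[b] @ B[:, :, b]).
expected = np.array([np.trace(A[b] @ B[:, :, b]) for b in range(7)])
assert np.allclose(out, expected)
```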
Transposes such as einsum('ijk...->kji...', A) are the same as swapaxes(A, 0, 2).
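Checked on a 4-D array (the shape is arbitrary):

```python
import numpy as np

A = np.random.default_rng(0).normal(size=(2, 3, 4, 5))
assert np.allclose(np.einsum('ijk...->kji...', A), np.swapaxes(A, 0, 2))
```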
Further resources from Tim Rocktäschel, ajcr and tensorchiefs. For a visual understanding of what is going on under the hood, see Olexa Bilaniuk's post.
Similar notation has been used for shapes in einshape.