Brief Einsum Tutorial
What does `np.einsum('ij,jk->ik', A, B)` mean?
Basically, we have two input matrices $A$ and $B$, and an output matrix $C$ is defined such that: \(C_{ik} = \sum_{j = 1}^n A_{ij} B_{jk}\) Note that this is the formula for standard matrix multiplication.
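To make the formula concrete, here is a minimal sketch (shapes chosen arbitrarily for illustration) that computes $C$ both with explicit loops and with `np.einsum`; the two results should agree.

```python
import numpy as np

A = np.random.rand(3, 4)  # arbitrary example shapes
B = np.random.rand(4, 5)

# Explicit form of C_ik = sum_j A_ij * B_jk
C_loop = np.zeros((3, 5))
for i in range(3):
    for k in range(5):
        for j in range(4):
            C_loop[i, k] += A[i, j] * B[j, k]

C_einsum = np.einsum('ij,jk->ik', A, B)
assert np.allclose(C_loop, C_einsum)
```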
Rules (from here)
- Repeated letters on either side of the `,` mean that values along those axes are multiplied together.
- Letters on the left of the `->` that aren't on the right of the `->` mean that values along those axes are summed together.

Both rules are illustrated in the short sketch below.
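Here's a quick sketch of both rules at once, using an `'ij,ij->i'` signature chosen just for illustration: the repeated letters `i` and `j` cause element-wise multiplication, and `j`, which is missing on the right of the `->`, is summed away.

```python
import numpy as np

A = np.random.rand(3, 4)
B = np.random.rand(3, 4)

# 'ij,ij->i': i and j are repeated, so matching entries are multiplied;
# j is absent after '->', so it is summed over, leaving one value per row.
rowwise = np.einsum('ij,ij->i', A, B)
assert np.allclose(rowwise, (A * B).sum(axis=1))
```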
Examples
Vectors
Assuming we have vectors $A, B \in \mathbb{R}^n$, then…
np.einsum('i', A)
=> returns a vector $C \in \mathbb{R}^n$ identical to $A$
\(C_i = A_i\)
np.einsum('i->', A)
=> sums all elements of $A$ into a scalar $C \in \mathbb{R}$
\(C = \sum_{i = 1}^n A_i\)
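The two calls above can be checked against plain NumPy (a small sanity check, with an arbitrary vector length):

```python
import numpy as np

A = np.random.rand(5)

assert np.allclose(np.einsum('i', A), A)         # 'i' kept in the output: identity
assert np.isclose(np.einsum('i->', A), A.sum())  # 'i' dropped from the output: summed
```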
np.einsum('i,i->i', A, B)
=> element-wise multiplication of $A$ and $B$ into a vector $C \in \mathbb{R}^n$
\(C_i = A_i B_i\)
np.einsum('i,i', A, B)
=> dot product of $A$ and $B$ into a scalar $C \in \mathbb{R}$, i.e. $C = A^TB$
\(C = \sum_{i = 1}^n A_i B_i\)
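Again, a small sanity check of the last two calls against the equivalent NumPy operations:

```python
import numpy as np

A = np.random.rand(5)
B = np.random.rand(5)

assert np.allclose(np.einsum('i,i->i', A, B), A * B)     # element-wise product
assert np.isclose(np.einsum('i,i', A, B), np.dot(A, B))  # repeated 'i' is summed: dot product
```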
np.einsum('i,j->ij', A, B)
=> outer product of $A$ and $B$ into a 2-dimensional matrix $C \in \mathbb{R}^{n \times n}$, i.e. $C = A \otimes B = A B^T$
\(C_{ij} = A_i B_j\)
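Since `i` and `j` are distinct letters, nothing is summed here, and the result should match `np.outer`:

```python
import numpy as np

A = np.random.rand(5)
B = np.random.rand(5)

# No repeated letters and no letters dropped: every pair (A_i, B_j) is multiplied.
assert np.allclose(np.einsum('i,j->ij', A, B), np.outer(A, B))
```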
Matrices
Assuming we have matrices $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{m \times p}$, with the dimensions adjusted where an example requires it (e.g. $A$ square, or $B$ the same shape as $A$), then…
np.einsum('ij', A)
=> returns a matrix $C \in \mathbb{R}^{n \times m}$ identical to $A$
\(C_{ij} = A_{ij}\)
np.einsum('ji', A)
=> returns a matrix $C \in \mathbb{R}^{m \times n}$ identical to $A^T$
\(C_{ij} = A_{ji}\)
np.einsum('ij->', A)
=> sums all elements of $A$ into a scalar $C \in \mathbb{R}$
\(C = \sum_{i=1}^n \sum_{j=1}^m A_{ij}\)
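A quick check of the three calls above (shapes chosen arbitrarily):

```python
import numpy as np

A = np.random.rand(3, 4)

assert np.allclose(np.einsum('ij', A), A)         # identity
assert np.allclose(np.einsum('ji', A), A.T)       # transpose
assert np.isclose(np.einsum('ij->', A), A.sum())  # both letters dropped: total sum
```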
np.einsum('ii->i', A)
=> returns the diagonal elements of $A$ as a vector $C \in \mathbb{R}^n$, i.e. $C = \text{diag}(A)$
\(C_{i} = A_{ii}\)
np.einsum('ii', A)
=> returns the trace of $A$ as a scalar $C \in \mathbb{R}$, i.e. $C = \text{trace}(A)$
\(C = \sum_{i = 1}^n A_{ii}\)
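Both `'ii->i'` and `'ii'` only make sense for a square matrix; they should agree with `np.diag` and `np.trace`:

```python
import numpy as np

A = np.random.rand(4, 4)  # 'ii' requires a square matrix

assert np.allclose(np.einsum('ii->i', A), np.diag(A))  # diagonal kept as a vector
assert np.isclose(np.einsum('ii', A), np.trace(A))     # diagonal summed: trace
```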
np.einsum('ij->i', A)
=> sums rows of $A$ into a vector $C \in \mathbb{R}^n$, i.e. $C = \text{np.sum}(A, \text{axis}=1)$
\(C_{i} = \sum_{j = 1}^m A_{ij}\)
np.einsum('ij->j', A)
=> sums columns of $A$ into a vector $C \in \mathbb{R}^m$, i.e. $C = \text{np.sum}(A, \text{axis}=0)$
\(C_{j} = \sum_{i = 1}^n A_{ij}\)
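These two reductions should match `np.sum` along the corresponding axis:

```python
import numpy as np

A = np.random.rand(3, 4)

assert np.allclose(np.einsum('ij->i', A), A.sum(axis=1))  # j summed away: one value per row
assert np.allclose(np.einsum('ij->j', A), A.sum(axis=0))  # i summed away: one value per column
```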
np.einsum('ij,ij->ij', A, B)
=> element-wise product of $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{n \times m}$ into matrix $C \in \mathbb{R}^{n \times m}$, i.e. $C = A \odot B$
\(C_{ij} = A_{ij} B_{ij}\)
np.einsum('ij,jk->ik', A, B)
=> matrix multiplication of $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{m \times p}$ into matrix $C \in \mathbb{R}^{n \times p}$, i.e. $C = AB$
\(C_{ik} = \sum_{j = 1}^{m} A_{ij} B_{jk}\)
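Finally, a check of the last two calls against the Hadamard product and ordinary matrix multiplication (shapes chosen arbitrarily):

```python
import numpy as np

A = np.random.rand(3, 4)
B = np.random.rand(3, 4)   # same shape as A for the element-wise product
assert np.allclose(np.einsum('ij,ij->ij', A, B), A * B)

B2 = np.random.rand(4, 5)  # inner dimensions must match for matrix multiplication
assert np.allclose(np.einsum('ij,jk->ik', A, B2), A @ B2)
```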