
Brief Einsum Tutorial

Posted on July 26, 2024 • Tags: torch numpy einsum math linear algebra

What does np.einsum('ij,jk->ik', A, B) mean?

Basically, we have two input matrices $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{m \times p}$, and an output matrix $C \in \mathbb{R}^{n \times p}$ is defined such that: \(C_{ik} = \sum_{j = 1}^m A_{ij} B_{jk}\) Note that this is the formula for standard matrix multiplication.
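As a quick check (a minimal sketch with arbitrary shapes), the einsum call above agrees with NumPy's built-in matrix product:

```python
import numpy as np

A = np.random.rand(3, 4)  # n x m
B = np.random.rand(4, 5)  # m x p

C = np.einsum('ij,jk->ik', A, B)  # C[i, k] = sum_j A[i, j] * B[j, k]
assert np.allclose(C, A @ B)      # identical to standard matrix multiplication
```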

Rules

  1. Repeated letters on either side of the , mean that values along those axes are multiplied together.
  2. Letters on the left of -> that don't appear on the right of -> mean that values along those axes are summed over (see the loop sketch after this list).
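To make the two rules concrete, here is what 'ij,jk->ik' expands to as explicit loops (a minimal sketch; shapes chosen arbitrarily). The repeated j multiplies values together (rule 1), and since j is missing from the output it is summed over (rule 2):

```python
import numpy as np

A = np.random.rand(3, 4)
B = np.random.rand(4, 5)

# 'ij,jk->ik' written out as loops:
C = np.zeros((A.shape[0], B.shape[1]))
for i in range(A.shape[0]):
    for k in range(B.shape[1]):
        for j in range(A.shape[1]):        # j is repeated -> multiply (rule 1)
            C[i, k] += A[i, j] * B[j, k]   # j absent from the output -> sum (rule 2)

assert np.allclose(C, np.einsum('ij,jk->ik', A, B))
```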

Examples

Vectors

Assuming we have vectors $A, B \in \mathbb{R}^n$, then…

np.einsum('i', A) => returns a vector $C \in \mathbb{R}^n$ identical to $A$ \(C_i = A_i\)

np.einsum('i->', A) => sums all elements of $A$ into a scalar $C \in \mathbb{R}$ \(C = \sum_{i = 1}^n A_i\)

np.einsum('i,i->i', A, B) => elementwise multiplication of $A$ and $B$ into a vector $C \in \mathbb{R}^n$ \(C_i = A_i B_i\)

np.einsum('i,i', A, B) => dot product of $A$ and $B$ into a scalar $C \in \mathbb{R}$, i.e. $C = A^T B$ \(C = \sum_{i = 1}^n A_i B_i\)

np.einsum('i,j->ij', A, B) => outer product of $A$ and $B$ into a 2-dimensional matrix $C \in \mathbb{R}^{n \times n}$, i.e. $C = A \otimes B = A B^T$ \(C_{ij} = A_i B_j\)
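A minimal sketch verifying these vector identities against their plain NumPy equivalents (shapes chosen arbitrarily):

```python
import numpy as np

A = np.random.rand(5)
B = np.random.rand(5)

assert np.allclose(np.einsum('i', A), A)                        # identity
assert np.allclose(np.einsum('i->', A), np.sum(A))              # sum of elements
assert np.allclose(np.einsum('i,i->i', A, B), A * B)            # elementwise product
assert np.allclose(np.einsum('i,i', A, B), np.dot(A, B))        # dot product
assert np.allclose(np.einsum('i,j->ij', A, B), np.outer(A, B))  # outer product
```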

Matrices

Assuming we have matrices $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{m \times p}$, with shapes adjusted where an example requires it (a square $A$ for the diagonal and trace, and $B \in \mathbb{R}^{n \times m}$ for the elementwise product), then…

np.einsum('ij', A) => returns a matrix $C \in \mathbb{R}^{n \times m}$ identical to $A$ \(C_{ij} = A_{ij}\)

np.einsum('ji', A) => returns a matrix $C \in \mathbb{R}^{m \times n}$ identical to $A^T$ \(C_{ij} = A_{ji}\)

np.einsum('ij->', A) => sums all elements of $A$ into a scalar $C \in \mathbb{R}$ \(C = \sum_{i=1}^n \sum_{j=1}^m A_{ij}\)

np.einsum('ii->i', A) => returns the diagonal elements of a square $A$ as a vector $C \in \mathbb{R}^n$, i.e. $C = \text{diag}(A)$ \(C_{i} = A_{ii}\)

np.einsum('ii', A) => returns the trace of a square $A$ as a scalar $C \in \mathbb{R}$, i.e. $C = \text{trace}(A)$ \(C = \sum_{i = 1}^n A_{ii}\)

np.einsum('ij->i', A) => sums each row of $A$ into a vector $C \in \mathbb{R}^n$, i.e. $C = \text{np.sum}(A, \text{axis}=1)$ \(C_{i} = \sum_{j = 1}^m A_{ij}\)

np.einsum('ij->j', A) => sums each column of $A$ into a vector $C \in \mathbb{R}^m$, i.e. $C = \text{np.sum}(A, \text{axis}=0)$ \(C_{j} = \sum_{i = 1}^n A_{ij}\)

np.einsum('ij,ij->ij', A, B) => elementwise product of $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{n \times m}$ into a matrix $C \in \mathbb{R}^{n \times m}$, i.e. $C = A \odot B$ \(C_{ij} = A_{ij} B_{ij}\)

np.einsum('ij,jk->ik', A, B) => matrix multiplication of $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{m \times p}$ into a matrix $C \in \mathbb{R}^{n \times p}$, i.e. $C = AB$ \(C_{ik} = \sum_{j = 1}^{m} A_{ij} B_{jk}\)
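And a matching sketch for the matrix identities (again with arbitrary shapes; M is a separate square matrix for the diagonal and trace cases):

```python
import numpy as np

A = np.random.rand(3, 4)
B = np.random.rand(3, 4)  # same shape as A, for the elementwise product
B2 = np.random.rand(4, 5)  # compatible inner dimension, for matrix multiplication
M = np.random.rand(3, 3)  # square, for diagonal and trace

assert np.allclose(np.einsum('ij', A), A)                     # identity
assert np.allclose(np.einsum('ji', A), A.T)                   # transpose
assert np.allclose(np.einsum('ij->', A), np.sum(A))           # total sum
assert np.allclose(np.einsum('ii->i', M), np.diag(M))         # diagonal
assert np.allclose(np.einsum('ii', M), np.trace(M))           # trace
assert np.allclose(np.einsum('ij->i', A), np.sum(A, axis=1))  # row sums
assert np.allclose(np.einsum('ij->j', A), np.sum(A, axis=0))  # column sums
assert np.allclose(np.einsum('ij,ij->ij', A, B), A * B)       # elementwise product
assert np.allclose(np.einsum('ij,jk->ik', A, B2), A @ B2)     # matrix product
```

The same subscript strings work unchanged with torch.einsum on PyTorch tensors.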
