
Brief Einsum Tutorial

Posted on July 26, 2024 • Tags: torch numpy einsum math linear algebra

What does np.einsum('ij,jk->ik', A, B) mean?

Basically, we have two input matrices $A$ and $B$, and an output matrix $C$ is defined such that:

$$C_{ik} = \sum_{j = 1}^n A_{ij} B_{jk}$$

Note that this is the formula for standard matrix multiplication.
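To make this concrete, here is a quick sanity check (a minimal sketch; the shapes are arbitrary) showing that this einsum call matches the @ operator:

```python
import numpy as np

A = np.random.rand(3, 4)
B = np.random.rand(4, 5)

# 'ij,jk->ik': multiply along the shared index j, then sum it out
C = np.einsum('ij,jk->ik', A, B)
assert np.allclose(C, A @ B)  # identical to ordinary matrix multiplication
```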

Rules (from here)

  1. Repeated letters on either side of the , mean that values along those axes are multiplied together.
  2. Letters on the left of -> that aren’t on the right of -> mean that values along that axis are summed over.
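For intuition, here is a rough sketch of the explicit loops that these two rules imply for 'ij,jk->ik' (illustrative only, and the helper name is made up; np.einsum itself is implemented far more efficiently):

```python
import numpy as np

def naive_einsum_ij_jk_ik(A, B):
    """Spell out 'ij,jk->ik' as loops.

    Rule 1: the repeated letter j means A[i, j] and B[j, k] get multiplied.
    Rule 2: j does not appear right of '->', so it is summed over.
    """
    n, m = A.shape
    _, p = B.shape
    C = np.zeros((n, p))
    for i in range(n):
        for k in range(p):
            for j in range(m):
                C[i, k] += A[i, j] * B[j, k]
    return C

A, B = np.random.rand(2, 3), np.random.rand(3, 4)
assert np.allclose(naive_einsum_ij_jk_ik(A, B), np.einsum('ij,jk->ik', A, B))
```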

Examples

Vectors

Assuming we have vectors $A, B \in \mathbb{R}^n$, then…

np.einsum('i', A) => returns a vector $C \in \mathbb{R}^n$ identical to $A$: $C_i = A_i$

np.einsum('i->', A) => sums all elements of $A$ into a scalar $C \in \mathbb{R}$: $C = \sum_{i = 1}^n A_i$

np.einsum('i,i->i', A, B) => element-wise multiplication of $A$ and $B$ into a vector $C \in \mathbb{R}^n$: $C_i = A_i B_i$

np.einsum('i,i', A, B) => dot product of $A$ and $B$ into a scalar $C \in \mathbb{R}$, i.e. $C = A^T B$: $C = \sum_{i = 1}^n A_i B_i$

np.einsum('i,j->ij', A, B) => outer product of $A$ and $B$ into a 2-dimensional matrix $C \in \mathbb{R}^{n \times n}$, i.e. $C = A \otimes B = A B^T$: $C_{ij} = A_i B_j$
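A minimal sketch checking the vector examples above against their plain NumPy equivalents (the vector length is arbitrary):

```python
import numpy as np

A = np.random.rand(5)
B = np.random.rand(5)

assert np.allclose(np.einsum('i', A), A)                        # identity
assert np.allclose(np.einsum('i->', A), A.sum())                # sum of elements
assert np.allclose(np.einsum('i,i->i', A, B), A * B)            # element-wise product
assert np.allclose(np.einsum('i,i', A, B), np.dot(A, B))        # dot product
assert np.allclose(np.einsum('i,j->ij', A, B), np.outer(A, B))  # outer product
```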

Matrices

Assuming we have matrices $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{m \times p}$, with dimensions compatible where appropriate (e.g. $A$ square for the diagonal and trace examples), then…

np.einsum('ij', A) => returns a matrix $C \in \mathbb{R}^{n \times m}$ identical to $A$: $C_{ij} = A_{ij}$

np.einsum('ji', A) => a matrix $C \in \mathbb{R}^{m \times n}$ identical to $A^T$: $C_{ij} = A_{ji}$

np.einsum('ij->', A) => sums all elements of $A$ into a scalar $C \in \mathbb{R}$: $C = \sum_{i=1}^n \sum_{j=1}^m A_{ij}$

np.einsum('ii->i', A) => returns the diagonal elements of $A$ as a vector $C \in \mathbb{R}^n$, i.e. $C = \text{diag}(A)$: $C_i = A_{ii}$

np.einsum('ii', A) => returns the trace of $A$ as a scalar $C \in \mathbb{R}$, i.e. $C = \text{trace}(A)$: $C = \sum_{i = 1}^n A_{ii}$

np.einsum('ij->i', A) => sums rows of $A$ into a vector $C \in \mathbb{R}^n$, i.e. $C = \text{np.sum}(A, \text{axis}=1)$: $C_i = \sum_{j = 1}^m A_{ij}$

np.einsum('ij->j', A) => sums columns of $A$ into a vector $C \in \mathbb{R}^m$, i.e. $C = \text{np.sum}(A, \text{axis}=0)$: $C_j = \sum_{i = 1}^n A_{ij}$

np.einsum('ij,ij->ij', A, B) => element-wise product of $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{n \times m}$ into matrix $C \in \mathbb{R}^{n \times m}$, i.e. $C = A \odot B$: $C_{ij} = A_{ij} B_{ij}$

np.einsum('ij,jk->ik', A, B) => matrix multiplication of $A \in \mathbb{R}^{n \times m}$ and $B \in \mathbb{R}^{m \times p}$ into matrix $C \in \mathbb{R}^{n \times p}$, i.e. $C = AB$: $C_{ik} = \sum_{j = 1}^{m} A_{ij} B_{jk}$
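And the matrix examples, checked the same way (shapes are arbitrary; S is a square matrix for the diagonal and trace cases):

```python
import numpy as np

A = np.random.rand(3, 4)
B = np.random.rand(4, 5)
S = np.random.rand(3, 3)  # square matrix for 'ii->i' and 'ii'

assert np.allclose(np.einsum('ij', A), A)                 # identity
assert np.allclose(np.einsum('ji', A), A.T)               # transpose
assert np.allclose(np.einsum('ij->', A), A.sum())         # sum of all elements
assert np.allclose(np.einsum('ii->i', S), np.diag(S))     # diagonal
assert np.allclose(np.einsum('ii', S), np.trace(S))       # trace
assert np.allclose(np.einsum('ij->i', A), A.sum(axis=1))  # row sums
assert np.allclose(np.einsum('ij->j', A), A.sum(axis=0))  # column sums
assert np.allclose(np.einsum('ij,ij->ij', A, A), A * A)   # element-wise product
assert np.allclose(np.einsum('ij,jk->ik', A, B), A @ B)   # matrix multiplication
```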
