*See the previous post for Weeks 3 and 4*.

This week I’ve been mostly doing background reading. This post is mostly a summary of what I learned.

## Unification

In short, unification is the process of finding substitutions of variables within two terms two terms to make them identical. For example, if we have the expressions $x + 2y$ and $a + 3b$ , the substitution $\{x \mapsto a, y \mapsto 3, b \mapsto 2\}$ is a unifier, since applying the substitution to both expressions makes gives us the identical expression of $a + 3 \cdot 2$ . While this particular substitution includes variables from both expressions, we’re mostly interested in rules involving substitutions of variables from just one expression (a case of unification known as matching). Several well-known algorithms for unification already exist.

### Unification in SymPy

SymPy also has an implementation of a unification algorithm that is able to take the commutativity of operations into account. Suppose we wanted to unify the matrix expressions $A^{T}B^2C^{-1}$
and $XY^{-1}$
. This is essentially the problem of finding a substitution that makes these two expressions equal. Using the `sympy.unify.usympy`

module, we can discover what this substitution is:

```
>>> from sympy.unify.usympy import *
>>> from sympy.abc import N
>>> m = lambda x: MatrixSymbol(x, N, N)
>>> A, B, C, X, Y = map(m, ['A', 'B', 'X', 'Y'])
>>> e1 = A.T * B**2 * C.I
>>> e2 = X * Y **(-1)
>>> next(unify(e1, e2, variables=[X, Y]))
{X: A.T*B**2, Y: C}
```

We’ve reduced this to a matching problem in which the variables are specified only in `e2`

. What’s important to note here is that the matching rule within `e2`

we specified ($XY^{-1}$
) was a compound expression. This is something that is currently not possible for non-commutative expressions (such as matrix multiplication) using SymPy’s `Wild`

interface. `unify`

allows use to express substitution rules that are able to match across sub-expressions in matrix multiplication.

Through unification, we can express substitution rules for optimization as a simple term-rewriting rule. In my previous blog post, I mentioned rewriting the matrix multiplication $Ax$
as a solving operation of `MatSolve(A, x)`

under certain assumptions. The actual implementation is restricted to cases where both the `A`

and the `x`

are matrix symbols, and the optimization can’t identify cases where either the `A`

or the `x`

is a compound expression. With unification, we can identify the same pattern in more complex subexpressions. If we’re given the matrix expression $A^T (A B)^{-1} x y$
, a unification based transformer can produce `MatSolve(AB, x)`

, provided that the shapes of the matrices match the given rule.

## Codegen Tensors

I also looked into generating C and Fortran code from SymPy matrix expressions. For the purposes of code generation, SymPy has a relatively new `array_utils`

module. The AST nodes in this module express generalizations of operations on matrices, which require a bit of background in tensors.

Many array operations (including matrix multiplication) involve *contraction* along an axis. Contractions are a combination of multiplication and summation along certain axis of a tensor^{1}. In assigning the matrix multiplication $AB$
to the $n \times n$
matrix $C$
, we can explicitly write the summations (using subscripts for indexing matrix elements) as

$C_{ik} = \sum_{j = 1}^{n} A_{ij} B_{jk}$

The index $j$
is contracted, as it is shared between both $A$
and $B$
, and describing this summation operation as a whole boils down to which indices are shared between the matrices. This is essentially what the `array_utils`

classes do. This is what happens when we use `array_utils`

to convert the matrix multiplication to an equivalent contraction operation:

```
>>> from sympy.codegen.array_utils import CodegenArrayContraction
>>> from sympy.abc import N
>>> A = MatrixSymbol('A', N, N)
>>> B = MatrixSymbol('B', N, N)
>>> CodegenArrayContraction.from_MatMul(A * B)
CodegenArrayContraction(CodegenArrayTensorProduct(A, B), (1, 2))
```

We’re given a new`CodegenArrayContraction`

object that stores, along with the variables `A`

and `B`

, tuples of integers representing contractions along certain indices. Here, the `(1, 2)`

means that the variable at index 1 and index 2 (indices start at 0) are shared. We can confirm this by looking at the above summation, since both the second and third indices out of all indices that appear in the expression are $j$
.

## Next Steps

For next week, I’ll try to re-implement the rewriting optimization in terms of `unify`

. This will both make it easier to express rules and extend to sub-expressions as well. I’ll also start with implementing additional printers for the C and Fortran printers. The printer will probably just print naive `for`

loops to keep things simple (and it would probaly be better to use something like Theano for highly optimized code).

For our purposes, we can think of tensors as just $n$ -dimensional arrays. Most of my reading on tensors was Justin C. Feng’s The Poor Man’s Introduction to Tensors.↩