Google Summer of Code Week 5: Unification and Tensors

Posted on June 28, 2019 by Ankit Pandey

See the previous post for Weeks 3 and 4.

This week I’ve been mostly doing background reading. This post is mostly a summary of what I learned.

Unification

In short, unification is the process of finding substitutions of variables within two terms two terms to make them identical. For example, if we have the expressions x+2yx + 2y and a+3ba + 3b , the substitution {xa,y3,b2}\{x \mapsto a, y \mapsto 3, b \mapsto 2\} is a unifier, since applying the substitution to both expressions makes gives us the identical expression of a+32a + 3 \cdot 2 . While this particular substitution includes variables from both expressions, we’re mostly interested in rules involving substitutions of variables from just one expression (a case of unification known as matching). Several well-known algorithms for unification already exist.

Unification in SymPy

SymPy also has an implementation of a unification algorithm that is able to take the commutativity of operations into account. Suppose we wanted to unify the matrix expressions ATB2C1A^{T}B^2C^{-1} and XY1XY^{-1} . This is essentially the problem of finding a substitution that makes these two expressions equal. Using the sympy.unify.usympy module, we can discover what this substitution is:

We’ve reduced this to a matching problem in which the variables are specified only in e2. What’s important to note here is that the matching rule within e2 we specified (XY1XY^{-1} ) was a compound expression. This is something that is currently not possible for non-commutative expressions (such as matrix multiplication) using SymPy’s Wild interface. unify allows use to express substitution rules that are able to match across sub-expressions in matrix multiplication.

Through unification, we can express substitution rules for optimization as a simple term-rewriting rule. In my previous blog post, I mentioned rewriting the matrix multiplication AxAx as a solving operation of MatSolve(A, x) under certain assumptions. The actual implementation is restricted to cases where both the A and the x are matrix symbols, and the optimization can’t identify cases where either the A or the x is a compound expression. With unification, we can identify the same pattern in more complex subexpressions. If we’re given the matrix expression AT(AB)1xyA^T (A B)^{-1} x y , a unification based transformer can produce MatSolve(AB, x), provided that the shapes of the matrices match the given rule.

Codegen Tensors

I also looked into generating C and Fortran code from SymPy matrix expressions. For the purposes of code generation, SymPy has a relatively new array_utils module. The AST nodes in this module express generalizations of operations on matrices, which require a bit of background in tensors.

Many array operations (including matrix multiplication) involve contraction along an axis. Contractions are a combination of multiplication and summation along certain axis of a tensor1. In assigning the matrix multiplication ABAB to the n×nn \times n matrix CC , we can explicitly write the summations (using subscripts for indexing matrix elements) as

Cik=j=1nAijBjkC_{ik} = \sum_{j = 1}^{n} A_{ij} B_{jk}

The index jj is contracted, as it is shared between both AA and BB , and describing this summation operation as a whole boils down to which indices are shared between the matrices. This is essentially what the array_utils classes do. This is what happens when we use array_utils to convert the matrix multiplication to an equivalent contraction operation:

We’re given a newCodegenArrayContraction object that stores, along with the variables A and B, tuples of integers representing contractions along certain indices. Here, the (1, 2) means that the variable at index 1 and index 2 (indices start at 0) are shared. We can confirm this by looking at the above summation, since both the second and third indices out of all indices that appear in the expression are jj .

Next Steps

For next week, I’ll try to re-implement the rewriting optimization in terms of unify. This will both make it easier to express rules and extend to sub-expressions as well. I’ll also start with implementing additional printers for the C and Fortran printers. The printer will probably just print naive for loops to keep things simple (and it would probaly be better to use something like Theano for highly optimized code).


  1. For our purposes, we can think of tensors as just nn -dimensional arrays. Most of my reading on tensors was Justin C. Feng’s The Poor Man’s Introduction to Tensors.