I spent a large part of last week travelling, so I’m combining the blog posts for the last two weeks.

I’m finished with the pull request for the LFortran code printer for now, though it’s definitely way too incomplete to be merged. The code passes *most* of the rudimentary tests I’ve added.

Here’s a simple example of one of the failing LFortran tests. Suppose we want to generate Fortran code (using LFortran) from the mathematical expression − *x*. SymPy sees this expression as a multiplication by -1, since it implements only addition and multiplication in its arithmetic operations. Converting the same mathematical expression directly into Fortran gives `-x`, and we can see that LFortran instead parses it as a unary subtraction:

```
>>> from lfortran import *
>>> src_to_ast("-x", False)
<lfortran.ast.ast.UnaryOp object at 0x7f9027f1aba8>
```
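On the SymPy side, this representation is easy to confirm with `srepr`:

```python
from sympy import srepr
from sympy.abc import x

# SymPy stores -x as a multiplication by -1, not as a unary operation
print(srepr(-x))  # Mul(Integer(-1), Symbol('x'))
```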

This is a major problem for the tests, which right now check whether the LFortran-parsed output of `fcode` (SymPy’s current Fortran code generator) on an expression matches the directly translated AST of the same expression. This won’t hold for − *x*, since the translated expression is a multiplication `BinOp` while the parsed expression is a `UnaryOp`.

One solution might be to not parse `fcode`’s output and instead just check for equivalence between strings. This would mean dealing with the quirks of the code printers (such as their tendency to produce excessive parentheses), and it would take away some of the advantages of direct translation. The more likely solution is to introduce substitution rules within the LFortran AST.
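A rough sketch of what such a substitution rule could look like, using stand-in dataclasses rather than LFortran's real AST classes (all names here are made up for illustration):

```python
from dataclasses import dataclass

@dataclass
class Num:
    n: str

@dataclass
class UnaryOp:  # e.g. -x, as parsed by LFortran
    op: str
    operand: object

@dataclass
class BinOp:    # e.g. (-1) * x, as produced by direct translation
    left: object
    op: str
    right: object

def normalize(node):
    """Rewrite unary minus as multiplication by -1 so that parsed and
    translated trees can be compared structurally."""
    if isinstance(node, UnaryOp) and node.op == "USub":
        return BinOp(Num("-1"), "Mul", normalize(node.operand))
    if isinstance(node, BinOp):
        return BinOp(normalize(node.left), node.op, normalize(node.right))
    return node

parsed = UnaryOp("USub", Num("x"))          # what the parser produces
translated = BinOp(Num("-1"), "Mul", Num("x"))  # what translation produces
assert normalize(parsed) == normalize(translated)
```

The real rules would have to walk LFortran's actual node types, but the shape of the pass would be the same.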

I filed issue #17006, in which `lambdify` misinterpreted identity matrices as the imaginary unit. The fix in #17022 is pretty simple: just generate identity matrices with `np.eye` when we can.
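A quick check of the fixed behaviour (assuming a SymPy release with #17022 merged, and NumPy installed):

```python
import numpy as np
from sympy import Identity, lambdify

# With the fix, the printer emits an explicit identity matrix (np.eye)
# rather than the bare name `I`, which NumPy would read as the
# imaginary unit.
f = lambdify((), Identity(3), modules="numpy")
print(f())
```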

I also went through the matrix expression classes to see which ones weren’t supported by the NumPy code printer and filed issue #1703. These are addressed by another contributor in #17029.

Most of this week was spent on implementing an optimization for the NumPy generator suggested by Aaron: given the expression *A*^{ − 1}*b*, where *A* is a square matrix and *b* a vector, generate the expression `np.linalg.solve(A, b)` instead of `np.linalg.inv(A) * b`. While both `solve` and `inv` use the same LU-decomposition-based LAPACK `?gesv` functions ^{1}, `solve` is applied to a vector while `inv` operates on a (much larger) matrix. In addition to cutting down on the number of operations, this optimization should also reduce the numerical error introduced by explicitly calculating the inverse.
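The two forms should agree numerically, which is easy to check in NumPy; the difference is that the solve form factorizes once and back-substitutes on a single vector rather than materializing the full inverse:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((200, 200))
b = rng.standard_normal(200)

# Optimized form: one LU factorization, back-substitution on a vector
x_solve = np.linalg.solve(A, b)

# Naive form: invert the whole matrix, then a matrix-vector product
x_inv = np.linalg.inv(A) @ b

assert np.allclose(x_solve, x_inv)
```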

My pull request for this optimization is available at #17041; it uses SymPy’s assumption system to make sure that *A* is full-rank (a constraint imposed by `solve`). My initial approach was to embed these optimizations directly in the code printing base classes. After some discussion with Björn, we decided it would be better to separate optimization from printing as much as possible, leading to the representation of the solving operation as its own distinct AST node. This approach is much better than the original, since it made it fairly easy to extend the optimization to the Octave/Matlab code printer.

For this week, I’ll be continuing with the matrix optimization PR. I’ll try to find other optimizations that can be applied (such as choosing the evaluation order of complicated matrix expressions) and look into using SymPy’s unification capabilities to simplify the expression of optimization rules.

You can find the C definitions for the functions eventually called by `inv` and `solve`. These are written in a special templated version of C, but you can find the template variable definitions a bit higher up in the source.↩

For this week, I’ve continued working on adding support for LFortran to SymPy’s code generation capabilities. This week mostly involved getting the infrastructure for testing the functionality of the new code generator working. I also extended the number of expressions the generator can handle, in addition to adding to LFortran’s ability to parse numbers upstream.

I’ve added support for four more expression types that the generator can handle: `Float`, `Rational`, `Pow` and `Function`. Since our base translation class was already in place from last week, implementing these was relatively straightforward and involved just defining the node visitors for each expression type (the commit that implements this can be found here). Here’s a demonstration showing the abstract syntax tree generated from translating the expression $\left(\frac{4}{3}\right)^{x}$:

```
>>> from sympy import Rational
>>> from sympy.abc import x
>>> from sympy.codegen.lfort import sympy_to_lfortran
>>> from lfortran.asr.pprint import pprint_asr
>>> pprint_asr(sympy_to_lfortran(Rational(4, 3) ** x))
expr.BinOp
├─left=expr.BinOp
│ ├─left=expr.Num
│ │ ├─n='4_dp'
│ │ ╰─type=ttype.Real
│ │ ├─kind=4
│ │ ╰─dims=[]
│ ├─op=operator.Div
│ ├─right=expr.Num
│ │ ├─n='3_dp'
│ │ ╰─type=ttype.Real
│ │ ├─kind=4
│ │ ╰─dims=[]
│ ╰─type=ttype.Real
│ ├─kind=4
│ ╰─dims=[]
├─op=operator.Pow
├─right=x
╰─type=ttype.Real
├─kind=4
╰─dims=[]
```
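The visitor dispatch itself is simple. Here's an illustrative sketch in plain Python; the node constructors are made-up tuples standing in for LFortran's builder calls, not the real API:

```python
from sympy import Rational, Symbol

class SketchTranslator:
    """Illustrative only: emits nested tuples instead of LFortran nodes."""

    def visit(self, expr):
        # Dispatch on the SymPy class name, e.g. Pow -> visit_Pow
        method = getattr(self, "visit_" + type(expr).__name__, None)
        if method is None:
            raise NotImplementedError(type(expr).__name__)
        return method(expr)

    def visit_Symbol(self, expr):
        return ("Name", expr.name)

    def visit_Rational(self, expr):
        # A Rational becomes a real division, mirroring the tree above
        return ("BinOp", ("Num", str(expr.p)), "Div", ("Num", str(expr.q)))

    def visit_Pow(self, expr):
        base, exp = expr.args
        return ("BinOp", self.visit(base), "Pow", self.visit(exp))

x = Symbol("x")
tree = SketchTranslator().visit(Rational(4, 3) ** x)
print(tree)
```

The real implementation builds LFortran `BinOp`/`Num` nodes in the visitor methods, but the dispatch structure is the same.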

However, the translator fails for expressions that should in theory work. Right now, we can’t add an integer to a symbol, because symbols default to real numbers, resulting in a type mismatch. Fortran allows the implicit conversion of an integer to a real, so the expression shouldn’t generate an error. This is functionality that will hopefully be implemented by the time I come back to this project close to the end of the summer.

I also added the initial infrastructure for testing the new code generation functions, with the starting commit available here. As Aaron mentioned in one of our meetings, the plan right now is for code generated by the LFortran backend to be equivalent, at the AST level, to the output generated by the existing `fcode`. Each test should be an assertion that checks the (parsed) output of `fcode` applied to a SymPy expression against the AST generated for the same expression by our newly implemented `sympy_to_lfortran`. The LFortran project already has code to check generated ASTs against expected values, so I adapted this for the testing library of our code generator (I’m also not sure how this works in terms of licensing, since both SymPy and LFortran use the BSD-3 license).

One problem that immediately became apparent was the way that LFortran represents numbers. Looking at the expression tree above, the real numbers are actually stored as strings. On the parser side, LFortran stores a real number as the string used to represent that number. This means that the ASTs of two expressions that represent the same number in different ways are not identical (for example, `1.0_dp` and `1.d0` both represent the same double precision floating point number, but the strings stored by LFortran will be different). It’s only at the “annotation” stage of evaluation that LFortran canonicalizes floating point representations. For now, the tests use the annotation function of this stage, and I filed a merge request on the LFortran project to add support for parsing numbers in the way that `fcode` generates them.
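As an illustration of the kind of canonicalization involved (this is a toy sketch, not LFortran's actual annotation logic), a handful of literal forms can be reduced to a common string by parsing them as Python floats:

```python
import re

def canonicalize_real(literal):
    """Reduce a Fortran real literal to a canonical string, so that
    e.g. '1.d0' and '1.0_dp' compare equal. Toy sketch: the suffix
    handling here is made up for demonstration."""
    s = literal.lower()
    s = re.sub(r"_\w+$", "", s)  # drop a kind suffix like _dp
    s = s.replace("d", "e")      # a d-exponent marks double precision
    return repr(float(s))

assert canonicalize_real("1.d0") == canonicalize_real("1.0_dp")
```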

While the initial infrastructure is in place, I haven’t added any tests yet. Since the LFortran project is still in early alpha, the functionality needed to compare the syntax tree made by the builder API against the syntax tree parsed from the output of `fcode` hasn’t been implemented yet. Again, this is something that will hopefully be implemented in LFortran near the end of the summer, when I start on this portion of the project again.

After I filed the merge request to add the functionality I needed to LFortran, Ondřej (the creator of LFortran and one of my mentors) mentioned that he was planning on eventually removing the module I contributed to. The merge request I filed actually wasn’t the one I had in mind at first: I had thought about adding support for canonicalizing number nodes right after they’re created in the builder API, but I decided against this because I felt that any changes I made should be minimally invasive. In retrospect, this was probably a misplaced concern, since it’s important to consider the development stage of a project when deciding how much of it should be changed. Because of this, LFortran will probably end up implementing something I chose not to at the time.

There’s still some work left to be done with LFortran, such as filing the issues I encountered and preparing the pull request for a merge (though it’ll probably remain a work in progress for some time). After that, I’ll be finished with LFortran for the time being and move on to extending support for matrix expressions in the Python code generator. The Python code generator can already convert (most) matrix expressions through NumPy, though there are still some bugs owing to an incomplete implementation. For next week, I’ll have to figure out what this missing functionality is and how it can be implemented.

This week, I started work on an equivalent of `fcode`, which converts a SymPy expression to an equivalent expression in Fortran, utilizing only LFortran as a backend. This post is an outline of what I’ve done (and learned) over the last week.

LFortran is a compiler from Fortran (with some extensions) to LLVM. One advantage that this design provides is that it enables interactive execution of Fortran code. LFortran can also be used as a Jupyter kernel, which means it can be used in a Jupyter notebook environment (you can even find an online interactive demo here).

In addition to being able to parse code, LFortran also provides the functionality of traversing a parse tree and generating the equivalent Fortran code. This means that if we want to generate Fortran code from a SymPy expression, the only work that we have to do is convert the SymPy expression tree to its LFortran equivalent.

LFortran provides a number of convenience functions for building a Fortran AST. Since LFortran is still in early alpha, there are currently only about a dozen builder functions. However, these few basic functions are enough for constructing simple expressions in the Fortran AST. As an example, if we wanted to construct the expression represented by `c = a + b`, where each variable involved is an integer, we could do something like:

```
>>> import lfortran.asr.builder as builder
>>> import lfortran.asr.asr as asr
>>> integer = builder.make_type_integer()
>>> a = asr.Variable(name="a", type=integer)
>>> b = asr.Variable(name="b", type=integer)
>>> c = asr.Variable(name="c", type=integer)
>>> sum = builder.make_binop(a, asr.Add(), b)
>>> expr = asr.Assignment(c, sum)
```

LFortran also provides functionality to visualize what the expression tree looks like:

```
>>> import lfortran.asr.pprint as pprint
>>> pprint.pprint_asr(expr)
stmt.Assignment
├─target=c
╰─value=expr.BinOp
├─left=a
├─op=operator.Add
├─right=b
╰─type=ttype.Integer
├─kind=4
╰─dims=[]
```

I’ve started with the implementation of a basic SymPy to LFortran converter utilizing the AST builder described above, with the current pull request available on the SymPy GitHub. The converter follows the same node visitor class structure as all of the other code printers (it even inherits the `CodePrinter` class, despite its methods producing AST nodes rather than strings). Here’s an example that demonstrates the conversion of a simple expression to its LFortran equivalent:

```
>>> from sympy.abc import x
>>> from sympy.codegen.lfort import sympy_to_lfortran
>>> import lfortran
>>> e = x + 1
>>> e_converted = sympy_to_lfortran(e)
>>> lfortran.ast_to_src(lfortran.asr_to_ast(e_converted)).replace('\n', '')
'(x) + (1)'
```

There are two things to notice here. The first is that I had to replace all the newlines in the generated expression, since a bug in LFortran causes too many newlines to be printed. The second is that there are a number of redundant parentheses in the printed expression. While this isn’t an outright bug, it’s another aspect of LFortran that is currently being improved upon.

I’ve also added another function, `sympy_to_lfortran_wrapped`, which wraps an expression in a function definition, (poorly) emulating the wrapping part of `autowrap`:

```
>>> from sympy.codegen.lfort import sympy_to_lfortran_wrapped
>>> e_wrapped = sympy_to_lfortran_wrapped(e)
>>> print(lfortran.ast_to_src(lfortran.asr_to_ast(e_wrapped)))
integer function f(x) result(ret)
integer, intent(in) :: x
ret = 1 + x
end function
```

Since LFortran can directly compile the AST to an LLVM intermediate representation, a future version of `autowrap` might work by compiling the output of this function directly (instead of first generating the complete code and then feeding it to `gfortran`, as is done right now).

For the next couple of days, I will try to extend the types of SymPy expressions that may be converted. One thing to note is that there isn’t a perfect correspondence between SymPy and LFortran AST nodes. LFortran has nodes for operations like unary subtraction and division, which SymPy instead represents with multiplication (by -1) and with powers (a negative exponent), respectively. On top of this, I’ll also add some tests for the functionality that I have implemented so far. After that, I’ll start with work on SymPy’s matrix expression code generation (the second part of my GSoC project) and pick LFortran up again close to the end of the summer.
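SymPy's side of this mismatch is easy to see with `srepr`:

```python
from sympy import srepr, symbols

x, y = symbols("x y")

# SymPy has no division node: a quotient is stored as a
# multiplication by a negative power
print(srepr(x / y))
```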
