GSoC’24 - LPython: Project Completion

GSoC
Code
Internship
Python
LPython
Author

Vipul Cariappa

Published

August 25, 2024

This is the final report of my work done during the Google Summer of Code 2024 period with the Python Software Foundation, LPython sub-organisation.

LPython is a statically typed, compiled variant of Python. It is much faster than CPython (the most common Python interpreter) because it uses static types instead of Python’s dynamic types and because it is compiled rather than interpreted. Our compiler can also be used to trans-compile Python code to C, C++ and Fortran. LFortran is a sister project of LPython; both use the same back-end for target code generation and other compiler optimisations.
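As a minimal sketch of what statically typed Python looks like, the function below annotates every variable it uses. The `i32` name comes from LPython's `lpython` runtime module; the fallback to `int` and the function `sum_to` are my own assumptions, added so the snippet also runs under plain CPython where that module is not installed.

```python
try:
    from lpython import i32  # LPython's 32-bit integer type
except ImportError:
    i32 = int  # assumption: fallback so the sketch runs on plain CPython


def sum_to(n: i32) -> i32:
    """Return 0 + 1 + ... + (n - 1), with every variable annotated."""
    total: i32 = 0
    i: i32
    for i in range(n):
        total += i
    return total


print(sum_to(10))
```

Because every type is known up front, a compiler can map `total` and `i` to machine integers instead of boxed Python objects.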

I implemented a REPL (read-evaluate-print loop) shell for LPython. This required modifying the way we compile: I used LLVM’s JIT compiler to compile the code just in time. Work on the REPL laid the groundwork for implementing a Jupyter kernel. A Jupyter kernel had already been implemented for LFortran, and I only had to adapt it to work with LPython. Following that, I worked on interoperability of LPython with CPython. We intend to provide first-class support for using CPython libraries within LPython.

Detailed Description

Read Evaluate Print Loop

I faced a lot of issues initially. Our code base was designed for ahead-of-time compilation, and I had to adapt it to work for just-in-time compilation. I was using LLVM 11 for development, but the project intends to support all LLVM versions from 10 onwards, so changes in LLVM’s API across versions required additional care.
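To illustrate the read-evaluate-print cycle itself, here is a small sketch in plain Python. It is not how LPython does it: LPython compiles each input with LLVM's JIT, whereas this sketch uses CPython's own `compile`, `exec` and `eval`. The helper names `evaluate_cell` and `repl` are hypothetical. What it does show is the part that ahead-of-time compilation lacks: state in `env` persists between inputs, and the value of a trailing expression is reported back.

```python
import ast


def evaluate_cell(source: str, env: dict):
    """Run one REPL input in env; return the value of a trailing expression."""
    tree = ast.parse(source, mode="exec")
    last_expr = None
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        # Split off a trailing expression so its value can be reported.
        last_expr = ast.Expression(tree.body.pop().value)
    # Execute all statements; definitions land in env and persist.
    exec(compile(tree, "<repl>", "exec"), env)
    if last_expr is not None:
        return eval(compile(last_expr, "<repl>", "eval"), env)
    return None


def repl(inputs, env=None):
    """Feed each input through evaluate_cell and collect printable results."""
    env = {} if env is None else env
    outputs = []
    for source in inputs:
        value = evaluate_cell(source, env)
        if value is not None:
            outputs.append(repr(value))
    return outputs


print(repl(["x = 2", "x * 21", "def f(): return x + 1", "f()"]))
```

Each input sees the names defined by earlier inputs, which is the behaviour a JIT-backed REPL has to reproduce across separately compiled units.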

Pull Requests

There are a few missing features. Error messages produced in the REPL are still vague. Top-level printing of a few aggregate datatypes is not yet implemented.

Jupyter Kernel

This work landed as a single pull request, “Jupyter Kernel”, with more than 1000 lines of code changes. We are using the xeus library to build LPython’s Jupyter kernel. xeus is a C++ library used to create Jupyter kernels.

Interoperability with CPython

Previously, LPython supported using CPython libraries only with the C back-end (i.e. when trans-compiling LPython code to C source code). I have written an ASR pass that works on the syntax tree to generate the additional logic for type conversions between CPython and LPython types, and for function resolution to find and call CPython functions from LPython. As this implementation works on the syntax tree directly, all the back-ends can use it out of the box without any back-end-specific changes.
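The two responsibilities described above, function resolution and type conversion, can be sketched conceptually in plain Python. This is only an illustration under my own assumptions, not the code the ASR pass generates: the helper `call_cpython` and its parameters are hypothetical, and the real logic is emitted by the compiler rather than run through `importlib`. The error branches show the kind of checks that matter when resolution or conversion fails.

```python
import importlib


def call_cpython(module_name: str, func_name: str, args: tuple, return_type: type):
    """Hypothetical helper: resolve a CPython function, call it, check the result."""
    try:
        # Function resolution: find the module, then the function inside it.
        module = importlib.import_module(module_name)
        func = getattr(module, func_name)
    except (ImportError, AttributeError) as exc:
        raise LookupError(f"cannot resolve {module_name}.{func_name}") from exc

    result = func(*args)

    # Type conversion: the statically declared return type must match
    # what the dynamically typed CPython function actually returned.
    if not isinstance(result, return_type):
        raise TypeError(
            f"{module_name}.{func_name} returned {type(result).__name__}, "
            f"expected {return_type.__name__}"
        )
    return result


print(call_cpython("math", "gcd", (12, 18), int))
print(call_cpython("math", "sqrt", (16.0,), float))
```

A call such as `call_cpython("math", "no_such_function", (), int)` raises `LookupError` instead of continuing with an invalid function reference.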

Pull Requests

There is a small bug in this implementation: there is no error handling, which is required when type conversions are not possible or function resolution fails. Presently, such cases are undefined behaviour. I will be fixing this within a week.

Pull requests that were not merged

  • Support to redefine of function in REPL
    Implementation of function redefinition is tricky with compiled languages. There are many questions regarding the implementation details. Should the previous definition of the function be kept in memory or deallocated? What if there is a pointer to the old definition? Should it be updated? If g calls f, and we redefine f, should g now call the new definition of f or the old one? I implemented it according to what felt correct to me, but when we compared it to the behaviour of CPython, it was decided to hold off further work. You can find the detailed explanation in this blog.
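For reference, the CPython behaviour that the questions above are compared against can be shown with a short snippet. It uses the same names `f` and `g` as the text; `old_f` is a name I introduce to stand for a pointer to the old definition.

```python
def f():
    return "old"


def g():
    return f()  # the name f is looked up each time g is called


old_f = f  # keep a reference to the first definition


def f():  # redefine f
    return "new"


print(g())      # g now reaches the new definition
print(old_f())  # the old definition is still alive through old_f
```

In CPython, `g` picks up the new `f` because the lookup happens by name at call time, while the old function object stays in memory for as long as something still refers to it.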

Issues Opened

Acknowledgement

I would like to thank Google Summer of Code for providing this opportunity, and my mentors Ubaid Shaikh and Ondřej Čertík for their guidance and help throughout the project.