Microsoft recently announced a new Office feature, Python in Excel. Currently only available to Windows Insiders, this new feature adds the ability to run Python code in Excel.
This is a long asked for feature, and something that has been possible for many years via the third party Excel add-in PyXLL. Seemingly Microsoft's new Python in Excel feature will make this more widely available, but how does it work and what are the advantages to using it?
We have been using Microsoft's Python in Excel for several days now to get to grips with what it can and can't do. On the whole, if you have never tried writing a line of Python before then it could be an interesting experience for you to try! You don't need to install anything, you can simply start writing Python code in Excel. As of right now though, once you get through the initial excitement of being able to run a bit of Python code its uses seem somewhat limited.
The main contender to Microsoft's Python in Excel, PyXLL, is much more mature and may appeal to a slightly different audience. PyXLL takes a very different approach and allows Python developers to expose their own Python code, written outside of Excel, to be called from within Excel. This separation of code and data (having the code outside of Excel rather than saved in the workbook) keeps the complexity out of Excel and enables the same code to be deployed across entire organizations. PyXLL requires some setup to get started, but is likely to be of more interest to professional Excel users for whom consistency and performance is important.
In the rest of this article we will look at some of the features of Microsoft's Python in Excel and dive into some detail of how it all works.
Table of Contents
Run Python code in the Azure Cloud
The first thing you should know about this new feature is that it relied on the Microsoft Azure Cloud platform to run your Python code. This means that you don't need to have Python installed locally or manage your own Python environment, but it also means that currently you can't install or use any Python packages that aren't already included.
Using Python in Excel allows you to write Python snippets in Excel cells, in the same way you can enter a formula. With a Python formula you can do much more complex tasks in a single cell. Your Python code is uploaded to the Azure cloud service and is run there, rather than on your PC. Your Python snippet can access values in other cells from the Excel workbook, but for security reasons it cannot access the internet, files, databases, or other services.
Although this severely limits what you can do, you can access ranges of Excel data as DataFrame objects and use the Python pandas package to manipulate them before returning the results to Excel. You can also use the Python seaborn package to plot charts in Excel from Python.
How does PyXLL compare?
PyXLL runs in your local Excel process. All of your code and data stays on your PC. You need to have a local Python environment to use, and you can import and use any packages installed in that environment.
Python code running in PyXLL runs on your local PC with all of your local user permissions. You can access any resources you require through normal Python code, just in the same way you would if writing a Python script.
If you are building Python tools to deploy to multiple computers then packaging and deployment requires a little thought, but PyXLL has specific features and guides to help with exactly that.
Testing, Code Quality and Debugging
Python developers have plenty of tools available to help them write reliable code. Among those are packages like pytest for unit testing, pylint for linting (or static code analysis), and black for consistent code formatting. Professional Python developers depend on tools like this to reduce bugs before deploying any code.
Most modern Python IDEs like VS Code, PyCharm and WingIDE have excellent support for all of these types of tools. So far, they are all glaring omissions for Python in Excel however. It is worth noting that Python in Excel is still in beta and so it is highly likely that some of these will be added in the future.
Python developers typically arrange their code into packages, modules, functions and classes. This allows the same code to be reused within a project, or even between projects. It's not unique to Python, virtually all programming languages work in similar ways. Regardless of language, developers are taught to structure their code to allow for reuse and to allow testing of individual functions. Without this sort of structure, code becomes increasingly hard to reason about and impossible to maintain.
Bugs when the Python code is written directly in the Excel sheet can be confusing.
When writing code in an Excel cell at the moment there is no access to any of these tools. In the current iteration there is nothing to help you write good Python code, and there are code checks or testing tools. Debugging, or stepping through code, like you can in the VBA editor and most Python IDEs is something that is also sorely missing.
Python code cannot be reused between workbooks, and writing code in cells encourages the same copy and paste style of building spreadsheets that is common when using traditional formulas. In the same way that if you discover a bug in a formula that's been filled down or copied around it's hard to fix all instances, now with Python cells that same problem exists for Python code in your spreadsheets too.
The Microsoft Python in Excel feature is still in beta, but since it will introduce a new wave of people to Python giving them tools to make learning good quality Python easier seems like an obvious thing for Microsoft to address before opening it to a wider audience. At the very least they need to add a way to debug Python code otherwise a lot of people will simply become too frustrated and fed up to persevere.
How does PyXLL compare?
With PyXLL, your Python code lives outside of the Excel workbook and the PyXLL add-in simply imports it. This means you can use your preferred Python development environment to develop, test and debug your Python code.
Have you ever had a workbook become corrupted? Or what about a change that you can't track down, despite a folder of files named "workbook_v1/2/3/etc.xlsx"? Software developers use Version Control Systems (Git is a popular one) to track code changes. With PyXLL the Python code is separate from the workbook so you can use the same tools when developing your Python Excel functions.
As your Python code is deployed separately from the workbook, any fixes or updates to Excel functions written with PyXLL can be deployed across your entire organization to ensure that everyone is working with the same version of the code.
Develop PyXLL functions using proper Python development tools and expose them to Excel as easy to use Excel functions.
Faster Spreadsheets?
Using Python you can perform more operations in a few lines of Python code than would be possible using a traditional Excel formula. Once the data is in Python it can be faster to operate on that data using packages like pandas and numpy. This should result in faster spreadsheets, but there are a couple of major caveats to that.
Usually in Excel, a formula takes various inputs. When you enter a formula such as =SUM(A1:A10) the range A1:A10 is the input to that formula, and the result is recalculated when the input changes.
With the new Python cells, these cells don't take inputs and instead your Python code references ranges using the new Python function "xl". Python cells are run in order from left to right and top to bottom, regardless of what inputs have changed.
This means that if you have multiple Python cells in your workbook all of those will calculate every time you recalculate the workbook, even if they don't need to. This can slow down your workbooks to the point where they are unusable.
Microsoft have anticipated this problem and have added a mode so that Excel will recalculate everything excluding Python cells. This gets around the problem of the Python code taking a long time to run, but it introduces a potentially dangerous problem of inconsistent results in your spreadsheet until you perform a lengthy full recalculation.
Why Microsoft decided to Excel Python cells from its usual calculation model is one of the most confusing aspects of this feature, and one that makes it more of an interesting toy than the killer feature it might have been.
How does PyXLL compare?
PyXLL embeds Python into Excel directly. It doesn't send data between your local PC and the cloud and can therefore be a lot faster. It's been used by the most demanding professional developers over the last 14 years and is fine tuned to be the most performant way of calling Python code from Excel.
Rather than having Python cells that execute arbitrary Python code in order of where those cells are located on the sheet, PyXLL exposes Python functions to Excel as User Defined Functions (or UDFs). These UDFs are exactly the same as Excel's built in functions (like SUM). They take inputs and return results. When an input changes, the function is called. If an input hasn't changed, the function doesn't need to be called.
Modern spreadsheets all use a dependency graph to track dependencies between cells so that when something changes only the things that are dependent on that change need to be recalculated. PyXLL functions fit in with this well understood model so that you can get the best performance from your spreadsheets possible. Oh, and Python profiling tools can also be used to find out exactly where the time is going if things aren't fast enough!
PyXLL
- Uses your local Python environment for incredible performance, with access to all of your installed packages.
- Write custom Excel functions that look and behave like native Excel functions. No special syntax or confusing and slow recalculations.
- Python code is managed outside of Excel. Develop and debug Python code using your preferred Python IDE.
Microsoft's "Python In Excel"
- Runs in the Azure Cloud with no network access. Unable to install packages.
- No custom functions, Python code is written directly in Excel cells. All Python code cells are executed in the cloud each time any change occurs.
- All Python code must be developed and saved in the workbook. No tools available for version control, testing or debugging.
Microsoft adding Python to Excel is huge and is going to dramatically increase the demand for people learning Python. At the same time, these professionals have demanding work ahead of them, and some of the design choices Microsoft has moved forward with are going to move people to products that work efficiently, and pass through the robust features that can be created in a Python environment. Pyxll is a superior product. Click here to get started today, with a risk free 30 day trial! Use code PYFI10 for 10% off your first year!