Using autoreload to speed up IPython and Jupyter work

I try to do all of my interactive Python development with either Jupyter notebooks or an IPython session. One of the main reasons I like these environments is the %autoreload magic. What’s so special about %autoreload and why does it often make development faster and simpler? 

Why IPython and Jupyter?

Before going further, if you haven’t yet used either IPython or Jupyter, check out the IPython interactive tutorial first. It does a good job of explaining why IPython is superior to the default Python interpreter. IPython has a host of useful features, but in this article I will only be talking about one feature (magics) and specifically one of those magics (%autoreload). Jupyter notebooks running a Python kernel support most of the same magics as IPython, so much of this article will work in either an interactive IPython session or a Jupyter notebook session. One thing to note is that I’m talking about Python here, not other languages running in a Jupyter notebook.

What is a magic?

Magics are just special functions that you can call in your IPython or Jupyter session. They come in two forms: line and cell. A line magic is prefixed with one %, and a cell magic is prefixed with two (%%). A line magic consumes one line, whereas a cell magic consumes the lines below the magic, allowing for more input. For this article, we’ll look at just one of the line magics, the %autoreload magic.

Why autoreload?

The %autoreload magic changes the Python session so that modules are automatically reloaded in that session before entering the execution of code typed at the IPython prompt (or the Jupyter notebook cell). What this means is that modules loaded into your session can be modified (outside your session), and the changes will be detected and reloaded without you having to restart your session.
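To make the idea of reloading concrete, here is a minimal sketch in plain Python using the standard library’s importlib.reload. It is not how the autoreload extension is implemented (the extension does more, such as detecting changed files and updating existing objects), and the module name demo_mod and its greet function are made up for this illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip bytecode caching so the quick edit below is always re-read from source.
sys.dont_write_bytecode = True

# Create a throwaway module on disk (demo_mod is a hypothetical name).
module_dir = Path(tempfile.mkdtemp())
module_path = module_dir / "demo_mod.py"
module_path.write_text("def greet():\n    return 'version 1'\n")

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
demo_mod = importlib.import_module("demo_mod")
first = demo_mod.greet()

# Simulate editing the file in an external editor.
module_path.write_text("def greet():\n    return 'version 2, edited'\n")

# Without a reload, the session still runs the code it loaded earlier.
stale = demo_mod.greet()

# Reloading re-executes the module source inside the existing module object.
importlib.reload(demo_mod)
fresh = demo_mod.greet()

print(first, stale, fresh, sep=" | ")
```

With %autoreload turned on, the manual importlib.reload step is what you no longer have to remember to do yourself.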

This can be tremendously useful. Let me describe a typical scenario. Let’s say you have a Jupyter notebook that you’ve created and are enhancing, and you require data from several sources. You get the data by executing functions in modules you import at the beginning of your session, and those modules are Python code that you control. This will be a very typical use case for many users. Furthermore, let’s say in your notebook you load all the data into memory and this takes a full 5 minutes. You then start to work with the data and soon realize that you need slightly different data from one of the functions in one of the modules you control, so you need to add another parameter to query the data differently. How do you:

  1. Make this change
  2. Test this change
  3. Continue your work

In most cases you will open the underlying code in your editor or IDE, modify it, test it in another session (or with unit tests), then optionally install the changes locally. But what about the notebook that already has some of the data loaded? One way to continue your work is to restart your Jupyter kernel to pick up the changes you just made, reload all the data into memory (taking at least 5 minutes), and then continue your work.

But there’s a better way, using autoreload. In your Jupyter session, you first load the autoreload extension, using the %load_ext magic.

%load_ext autoreload

Now, the %autoreload magic is available in your session. It can take a single argument that specifies how autoreloading of modules will behave. The extension also provides another magic, %aimport, which allows for fine-grained control of which modules are affected by the autoreload. If no arguments are given to %autoreload, then it will reload all modules immediately (except those excluded by %aimport as seen below). You can run it once and then use your updated code.

The optional argument for %autoreload has three core values (newer IPython releases add further modes, so check the docs for your version):

  • 0 – disable automatic reloading
  • 1 – reload all the modules imported by %aimport every time before executing Python code that has been typed
  • 2 – reload all modules (except those excluded by %aimport) every time before executing Python code that has been typed

To regulate the modules affected by autoreload, use the %aimport magic. It works as follows:

  • no arguments – lists the modules marked for automatic reloading and the modules marked to be skipped
  • with one argument – the module provided will be imported and marked for reloading under %autoreload 1
  • with comma separated arguments – all modules in the list will be imported and marked for reloading under %autoreload 1
  • with a - before the argument – that module will not be autoreloaded

For me, the most common way I use %autoreload is to just include everything during my initial development work when I’m likely to be changing Python modules and notebook code (i.e. to run %autoreload 2), and to not use it at all otherwise. But having the control can be useful, especially if you are loading a lot of modules.
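If you want the same setup at the top of every notebook, the two magics can also be issued from ordinary Python code. The sketch below assumes IPython’s get_ipython() and the shell’s run_line_magic method; the helper name enable_autoreload is my own, and it simply does nothing when run outside an IPython or Jupyter session.

```python
def enable_autoreload(mode="2"):
    """Load the autoreload extension and set its mode, if running in IPython.

    Returns True when the magics were issued, False otherwise.
    (enable_autoreload is a hypothetical helper, not part of IPython.)
    """
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so there is nothing to configure.
        return False

    shell = get_ipython()
    if shell is None:
        # Plain Python interpreter or script: no active IPython shell.
        return False

    # Equivalent to typing "%load_ext autoreload" then "%autoreload 2".
    shell.run_line_magic("load_ext", "autoreload")
    shell.run_line_magic("autoreload", mode)
    return True


enabled = enable_autoreload()
print("autoreload enabled:", enabled)
```

Typing the two magics directly is just as good; a helper like this is only a convenience when the same setup code is shared between notebooks and scripts.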

Example

For a concrete example that you can use to follow along, make two Python files, auto.py and auto2.py, and save them alongside a Jupyter notebook with the imports below. Each of the Python files should have a simple function in them, as follows:

# in auto.py
def my_api(model, year):
    # dummy result
    return { 'model': model, 'year': year, }

# in auto2.py
def my_api2(model, year):
    # dummy result
    return { 'model': model, 'year': year, }

Now, let’s import both modules and inspect the API functions using the IPython/Jupyter help by appending a ? to the function name. You should see that the imported module matches your code in the Python file.

import auto
import auto2

auto.my_api?
Signature: auto.my_api(model, year)
Docstring: <no docstring>
File:      ~/projects/python_blogposts/tools/auto.py
Type:      function

Now, in a separate editor, add a third argument (a color argument, for example) to the auto.my_api function. Save the file. Do we see it? Re-run the help cell to check.

No, not yet. Let’s turn on autoreload.

%autoreload 2

Now, when I inspect auto.my_api, I see the new argument. It worked!
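For reference, the edited function might look something like the sketch below. The color parameter and its None default are only one possible choice, and inspect.signature is used here as a plain-Python stand-in for the ? help.

```python
import inspect


# in auto.py, after the edit (color and its default are illustrative)
def my_api(model, year, color=None):
    # dummy result, now including the new argument
    return {'model': model, 'year': year, 'color': color}


# Plain-Python way to check the signature that the ? help would show.
signature_text = str(inspect.signature(my_api))
print(signature_text)
```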

Now I can modify settings so that only the auto2 module is reloaded, not auto. But first, let’s see the modules to reload and skip. By default, it includes all modules and skips none (because I used 2 as the initial argument).

%aimport
Modules to reload:


Modules to skip:

Let’s turn off auto.

%aimport -auto
%aimport
Modules to reload:


Modules to skip:
auto

Now, if I modify the code in auto, I shouldn’t see the changes in this session. Using %aimport you can restrict which code is being reloaded.

Caveats

It’s important to note that module reloading is not perfect. You should not leave it on for production code, because checking for changes and reloading modules before every execution will slow things down. Also, if you are live editing your code and leave it in a broken state, the most recent successfully loaded code will be the code running in your session, which can be confusing. This is probably not the way you want to modify large amounts of code, but when making incremental changes, it can work well.

To observe what broken code will look like, open a module that is being autoreloaded (auto2.py) and add a syntax error (for example, mismatched parentheses somewhere), save the file, then execute the function from that module in a notebook cell. You should see autoreload report a traceback of the syntax error in the cell. You’ll only see this error once. If you re-execute the cell, it won’t show the same error again, but will use the version of the code last loaded.
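The same "last good version wins" behavior can be seen with the standard library alone. This is a sketch using importlib.reload rather than the autoreload extension itself, and the module name auto2_demo is made up so that it does not clash with the auto2.py file from the example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip bytecode caching so each edit is re-read from source.
sys.dont_write_bytecode = True

# Write a working module to a temporary directory and import it.
module_dir = Path(tempfile.mkdtemp())
module_path = module_dir / "auto2_demo.py"
module_path.write_text(
    "def my_api2(model, year):\n"
    "    return {'model': model, 'year': year}\n"
)
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
auto2_demo = importlib.import_module("auto2_demo")

# Save a broken edit: the opening parenthesis is never closed.
module_path.write_text("def my_api2(model, year:\n    return {}\n")

try:
    importlib.reload(auto2_demo)
    reload_failed = False
except SyntaxError as exc:
    reload_failed = True
    print(f"reload failed: {exc.msg}")

# The module object still holds the last successfully loaded function.
result = auto2_demo.my_api2("sedan", 2020)
print(result)
```

Because the broken source never compiles, the module is left untouched and the old function keeps working, which is exactly why a stale result can be easy to miss.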

Also, note that there are a few things that don’t work all the time, like removing functions from a module, changing a @property in a class to an ordinary method, or reloading C extensions. In those cases, you’ll need to restart your session. You can see more details in the docs.

Summary

If you’ve never used %autoreload before, give it a try next time you have an IPython or Jupyter session with a lot of data in it and want to make a small change to a local module. Hopefully it will save you some time.

You may want to check out this article on how you can use other magics to view your variables in a Jupyter or IPython session.
