Using Jupyter Notebooks with Hugo

Convert Jupyter notebooks to markdown, and hide or collapse your cells/inputs/outputs.

This site is built with Hugo, a framework for building static websites. One issue, however, is that Hugo does not natively support Jupyter notebook (.ipynb) files. While searching for a way to use Jupyter notebooks with my site, I found and tried many options but ultimately settled on nbconvert.

Want to see what this will look like at the end? Head over to my predicting recipe website traffic project.

Things I tried or looked at

I tried to use over half a dozen options before landing on nbconvert. Some of these were:

hugo_jupyter, CLI converter
Academic File Converter, CLI converter
Hugo Blox, a theme for Hugo
Shortcodes, such as:
- reading the .ipynb directly
- using nbconvert to convert the .ipynb to HTML and then displaying it in an iframe
Mercury, hosting an interactive notebook
Quarto, a different framework to build a site or use it within Hugo

I ran into issues with some of these options, and others were aimed at needs different from mine. While digging through them, though, I realized that several relied on nbconvert under the hood, so using nbconvert directly seemed like the natural choice.

Using nbconvert with Hugo

Here are the steps I took to use nbconvert with Hugo:

1. Install nbconvert

See the nbconvert installation instructions. Out of the box, nbconvert works well:

jupyter nbconvert --to markdown path/to/notebook.ipynb

This produces a notebook.md, but to make it a page that Hugo renders we’ll need to do a bit more.

2. Add Frontmatter

In your notebook, create a markdown cell as the first cell and add your frontmatter:

---
date: 2024-02-02T04:14:54-08:00
draft: false
title: Example
---
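A missing or misplaced frontmatter cell is easy to overlook, so a quick check before converting can help. The following is only a minimal sketch and is not part of nbconvert: `has_front_matter` is a hypothetical helper name, and it assumes the notebook has been loaded as standard nbformat v4 JSON (for example with `json.load`), where a cell's `source` may be a single string or a list of lines.

```python
def has_front_matter(notebook: dict) -> bool:
    """Return True if the first cell is a markdown cell wrapped in '---' lines."""
    cells = notebook.get("cells", [])
    if not cells or cells[0].get("cell_type") != "markdown":
        return False
    source = cells[0].get("source", "")
    # nbformat stores source either as one string or as a list of lines.
    if isinstance(source, list):
        source = "".join(source)
    lines = source.strip().splitlines()
    return len(lines) >= 2 and lines[0].strip() == "---" and lines[-1].strip() == "---"


if __name__ == "__main__":
    # Hypothetical in-memory notebook used only for illustration.
    example = {
        "cells": [
            {"cell_type": "markdown", "metadata": {},
             "source": ["---\n", "title: Example\n", "draft: false\n", "---"]},
            {"cell_type": "code", "metadata": {}, "source": "print('hi')", "outputs": []},
        ]
    }
    print(has_front_matter(example))
```

The check is deliberately loose: it only looks at the delimiters and does not validate the YAML in between.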

3. Check Folder Structure

Put your notebook inside the folder where you want your page to live, and rename the notebook to index.ipynb.

This assumes the following folder structure:

path/to/project
└── content
    └── category-name
        └── page-name
            └── index.ipynb <--- your notebook titled 'index'

Now jupyter nbconvert --to markdown path/to/index.ipynb will produce an index.md within your folder.

Run hugo server and your page should render your notebook fully. And since it is in markdown, it will render with the proper colors for your theme (which was one issue I had with some options that relied on HTML output).

Alternatively, you can keep your notebooks in some other folder and use the --output index --output-dir path/to/dir arguments to generate an index.md within your specified directory. The first argument changes the output filename to index.md and the second argument changes the output folder.
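If you end up calling nbconvert from a script, the same arguments can be assembled in Python. This is a small sketch under my own assumptions: `build_nbconvert_command` is a hypothetical helper, and the paths in the example are placeholders; only the `--to`, `--output`, and `--output-dir` flags come from the commands shown above.

```python
def build_nbconvert_command(notebook, output_dir=None, output_name="index"):
    """Build the argument list that converts a notebook to <output_name>.md."""
    command = ["jupyter", "nbconvert", "--to", "markdown", "--output", output_name]
    if output_dir is not None:
        # Write the markdown somewhere other than the notebook's own folder.
        command += ["--output-dir", str(output_dir)]
    command.append(str(notebook))
    return command


if __name__ == "__main__":
    # Placeholder paths; substitute your own notebook and page folder.
    print(build_nbconvert_command("notebooks/analysis.ipynb",
                                  "content/category-name/page-name"))
```

The resulting list can be handed to `subprocess.run(command, check=True)` so that a failed conversion raises instead of passing silently.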

Hiding Jupyter Cells, Inputs, and/or Outputs

You can now publish your Jupyter notebook as a Hugo webpage, but sometimes you do not want all of your cells to be displayed.

1. Create config file

path/to/project/python/jupyter/nbconvert_config.py:

c.TagRemovePreprocessor.enabled = True
c.TagRemovePreprocessor.remove_cell_tags = ['hide_cell']
c.TagRemovePreprocessor.remove_input_tags = ['hide_input']
c.TagRemovePreprocessor.remove_all_outputs_tags = ['hide_output']

2. Tag Cells

Open your notebook and add the tags hide_cell, hide_input, or hide_output to your cells as appropriate.
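Tags are stored in each cell's metadata under a `tags` list, so they can also be added programmatically if clicking through a large notebook is tedious. This is a minimal sketch operating on the notebook's JSON as a plain dict; `add_tag` is a hypothetical helper, not something provided by Jupyter.

```python
import json


def add_tag(notebook: dict, cell_index: int, tag: str) -> None:
    """Add a tag to one cell's metadata in place, without duplicating it."""
    metadata = notebook["cells"][cell_index].setdefault("metadata", {})
    tags = metadata.setdefault("tags", [])
    if tag not in tags:
        tags.append(tag)


if __name__ == "__main__":
    # Hypothetical one-cell notebook used only for illustration.
    nb = {"cells": [{"cell_type": "code", "metadata": {},
                     "source": "import pandas as pd", "outputs": []}]}
    add_tag(nb, 0, "hide_input")
    print(json.dumps(nb["cells"][0]["metadata"]))
```

For a real file you would load it with `json.load`, call `add_tag`, and write it back with `json.dump`.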

3. Run nbconvert with config

Use the following command to run nbconvert with your config file:

jupyter nbconvert --to markdown --config path/to/project/python/jupyter/nbconvert_config.py path/to/index.ipynb

Now the cells/inputs/outputs that were tagged will be removed (not collapsed), meaning they will not appear in your markdown at all.
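To make the effect of the three tags concrete, here is a rough illustration on plain cell dicts. It only mimics the visible result described above and is not how nbconvert implements TagRemovePreprocessor; `apply_hide_tags` is a hypothetical name.

```python
def apply_hide_tags(cells):
    """Mimic the visible effect of hide_cell, hide_input, and hide_output."""
    kept = []
    for cell in cells:
        tags = cell.get("metadata", {}).get("tags", [])
        if "hide_cell" in tags:
            continue  # the whole cell (input and output) is dropped
        cell = dict(cell)  # shallow copy so the caller's cells are left untouched
        if "hide_input" in tags:
            cell["source"] = ""  # the code disappears, its output stays
        if "hide_output" in tags and cell.get("cell_type") == "code":
            cell["outputs"] = []  # the code stays, its output disappears
        kept.append(cell)
    return kept
```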

Alternatively, you can store your config file in .venv/etc/jupyter as jupyter_nbconvert_config.py, where it will be picked up automatically when you run nbconvert without the --config argument. Note that this config will apply even if you are using nbconvert for some other purpose within your environment. If the file is not being picked up, add --debug to your arguments and look for the paths nbconvert is checking for the config file.
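Inside an activated virtual environment, `sys.prefix` points at the environment folder, so the expected location of the automatically loaded config can be computed rather than guessed. This is a small sketch assuming the etc/jupyter layout described above; `env_config_path` is a hypothetical helper.

```python
import sys
from pathlib import Path


def env_config_path(prefix=None):
    """Return where the environment-level nbconvert config is expected to live."""
    prefix = Path(prefix if prefix is not None else sys.prefix)
    return prefix / "etc" / "jupyter" / "jupyter_nbconvert_config.py"


if __name__ == "__main__":
    path = env_config_path()
    print(path, "(exists)" if path.is_file() else "(not found)")
```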

Collapsing Jupyter Cells, Inputs, and Outputs

Removing cells is nice in some cases, but I want to be able to collapse sections of my notebook so they are still there for anyone inclined to read through them. Collapsed sections let you structure a notebook as a report while still containing all the relevant info. This is a little more involved than removing cells/inputs/outputs, but we’ll take it one step at a time. Note that I assume you are not using the alternative approaches I listed above; if you are, you will need to adjust things slightly.

The issue I ran into is that when collapsing inputs/outputs within the preprocessor, nbconvert expects your inputs and outputs to look a certain way and formats them accordingly. Specifically, it formats even our details shortcode, so we get something like this for inputs:

    Input collapsed. Click to expand:

    print("collapse_input's output")

collapse_input's output

And then the output looks like this:

print("collapse_output")

    Output collapsed. Click to expand:

    collapse_output

While this works, it isn’t saving any space because the details shortcodes end up inside code blocks, which is a peculiarity of how nbconvert interacts with what we’re trying to do. To solve this we will write our own postprocessing script.

1. Enable collapsing sections in markdown with a shortcode.

First, we need a way to collapse sections of text. In HTML this is done with the <details> and <summary> tags, but Hugo does not have a way to utilize these within markdown directly. I have written a guide on how to enable collapsing sections of text in Hugo via a shortcode here.

This guide assumes your shortcode can be used like this:

{{< details summary="Input collapsed:" altSummary="Input expanded:" >}}
Text goes here
{{< /details >}}

Which renders as:

Input collapsed:

Text goes here

UPDATE: Hugo has added a details shortcode in version 0.140.0. This guide assumes you can pass an altSummary to the details shortcode that toggles the display when clicked. If you are using the default details shortcode then the code provided will need to be modified to remove references to altSummary.
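If you are on the built-in shortcode, one way to adapt the generated markdown is to strip the altSummary attribute after the fact. This is only a sketch: `strip_alt_summary` is a hypothetical helper, and it assumes the built-in details shortcode accepts the same summary parameter used in this guide.

```python
import re

# Matches a leading space plus altSummary="..." inside a shortcode call.
ALT_SUMMARY = re.compile(r'\s+altSummary="[^"]*"')


def strip_alt_summary(text: str) -> str:
    """Remove altSummary attributes so only the summary parameter remains."""
    return ALT_SUMMARY.sub("", text)


if __name__ == "__main__":
    line = '{{< details summary="Input collapsed:" altSummary="Input expanded:" >}}'
    print(strip_alt_summary(line))
```

The same function can be applied to the whole generated index.md, or the altSummary parts can simply be deleted from the strings in the scripts.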

2. Create collapse_preprocessor.py:

Create the following file: path/to/project/python/jupyter/collapse_preprocessor.py:

from nbconvert.preprocessors import Preprocessor
import uuid

class CollapsePreprocessor(Preprocessor):
    def preprocess(self, nb, resources):
        grouped_cells = []
        collapse_group = []

        def generate_cell_id(id_length=8):
            return uuid.uuid4().hex[:id_length]
        
        def append_collapsed():
            # add details shortcode to beginning and end of collapse group.
            if len(collapse_group) == 1:
                grouped_cells.append({'cell_type': 'markdown', 'id': generate_cell_id(), 'metadata': {'tags': []}, 'source': f'{{{{< details summary="1 cell collapsed:" altSummary="1 cell expanded:" >}}}}'})
            else:
                grouped_cells.append({'cell_type': 'markdown', 'id': generate_cell_id(), 'metadata': {'tags': []}, 'source': f'{{{{< details summary="{len(collapse_group)} cells collapsed:" altSummary="{len(collapse_group)} cells expanded:" >}}}}'})

            for c_cell in collapse_group:
                grouped_cells.append(c_cell)
            grouped_cells.append({'cell_type': 'markdown', 'id': generate_cell_id(), 'metadata': {'tags': []}, 'source': '{{< /details >}}'})

        for cell in nb.cells:
            # check for cell.id
            if not hasattr(cell, 'id') or cell.id is None:
                cell.id = generate_cell_id()

            # check for collapse_cell tag and add to collapse group
            if 'collapse_cell' in cell.metadata.get('tags', []):
                collapse_group.append(cell)
            else:

                # format and add collapse group to grouped_cells
                if collapse_group:
                    append_collapsed()
                    collapse_group = []

                # collapse input/output
                if cell.cell_type == "code":
                    if 'collapse_input' in cell.metadata.get('tags', []):
                        cell.source = f'{{{{< detailsInput >}}}}\n```python\n{cell.source}\n```\n{{{{< /detailsInput >}}}}'
                    if 'collapse_output' in cell.metadata.get('tags', []):
                        new_outputs = []
                        for output in cell.outputs:
                            if 'text' in output:
                                output['text'] = f'{{{{< detailsOutput >}}}}\n{output["text"]}\n{{{{< /detailsOutput >}}}}'
                            new_outputs.append(output)
                        cell.outputs = new_outputs
                
                # add cell to grouped cells
                grouped_cells.append(cell)

        # format and append last cells
        if collapse_group:
            append_collapsed()

        nb.cells = grouped_cells
        return nb, resources

At a high level, this iterates through each cell in your notebook and checks whether it has the collapse_cell tag. If it does, the cell is grouped with adjacent cells that have the same tag, and the group is wrapped in our details shortcode. It also looks for collapse_input and collapse_output and adds placeholder text that will be picked up in postprocessing. Lastly, to ensure compatibility, it adds a cell id wherever one is not present. This is necessary because the preprocessor adds extra cells for formatting, and without a cell.id nbconvert will throw soft errors that may become hard errors in the future.
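The grouping step on its own can be sketched without nbconvert, which may make the logic easier to follow or to test. This is an alternative formulation using `itertools.groupby` on plain cell dicts, not the preprocessor itself; `group_collapsed` and `markdown_cell` are hypothetical names, and cell ids are left out for brevity.

```python
from itertools import groupby


def markdown_cell(text):
    """Build a bare markdown cell dict (no id, unlike the real preprocessor)."""
    return {"cell_type": "markdown", "metadata": {"tags": []}, "source": text}


def is_collapsed(cell):
    return "collapse_cell" in cell.get("metadata", {}).get("tags", [])


def group_collapsed(cells):
    """Wrap each run of adjacent collapse_cell-tagged cells in details shortcodes."""
    grouped = []
    for collapsed, run in groupby(cells, key=is_collapsed):
        run = list(run)
        if not collapsed:
            grouped.extend(run)
            continue
        noun = "cell" if len(run) == 1 else "cells"
        opener = (
            f'{{{{< details summary="{len(run)} {noun} collapsed:" '
            f'altSummary="{len(run)} {noun} expanded:" >}}}}'
        )
        grouped.append(markdown_cell(opener))
        grouped.extend(run)
        grouped.append(markdown_cell("{{< /details >}}"))
    return grouped
```

Because `groupby` only merges consecutive items with the same key, two tagged runs separated by an untagged cell become two separate collapsed sections, matching the adjacency behavior described above.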

3. Update nbconvert_config.py

You will then need to update your nbconvert_config.py:

from python.jupyter.collapse_preprocessor import CollapsePreprocessor

c = get_config()

c.TagRemovePreprocessor.enabled = True
c.TagRemovePreprocessor.remove_cell_tags = ['hide_cell']
c.TagRemovePreprocessor.remove_input_tags = ['hide_input']
c.TagRemovePreprocessor.remove_all_outputs_tags = ['hide_output']
c.MarkdownExporter.preprocessors = [CollapsePreprocessor]

This adds your preprocessor to your config so nbconvert knows to use it. Now when you run your conversion with your config it will utilize the collapse_preprocessor.py.

If you are only planning to use collapse_cell (and not collapse inputs/outputs), then you can stop here. Using the same command as above, nbconvert will run your preprocessing script and group collapsed cells together, which looks like this:

3 cells collapsed:

print("collapse_cell_1")

collapse_cell_1

print("collapse_cell_2")

collapse_cell_2

print("collapse_cell_3")

collapse_cell_3

4. Create collapse_postprocessor.py:

We now will be creating a postprocessor to fix some of the formatting issues. Create the following file: path/to/project/python/jupyter/collapse_postprocessor.py:

import sys

class CollapsePostprocessor:
    def __init__(self, filepath):
        self.filepath = filepath

    def process(self):
        try:
            with open(self.filepath, 'r', encoding='utf-8') as file:
                content = file.read()

                # collapsed inputs/outputs
                content = content.replace('```python\n{{< detailsInput >}}', '{{< details summary="Input collapsed:" altSummary="Input expanded:" >}}')
                content = content.replace('{{< /detailsInput >}}\n```', '{{< /details >}}\n') 
                content = content.replace('    {{< detailsOutput >}}', '{{< details summary="Output collapsed:" altSummary="Output expanded:" >}}')
                content = content.replace('    {{< /detailsOutput >}}', '{{< /details >}}\n')               

            with open(self.filepath, 'w', encoding='utf-8') as file:
                file.write(content)
        except FileNotFoundError:
            sys.exit(f"Could not locate {self.filepath}")

This postprocessor will search for the beginning and end of the detailsInput and detailsOutput placeholders and then format them as our expected details shortcode.
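To see the replacements in isolation, the input-related pair can be run on a small string. This is a sketch: the sample text is my assumption of what the converted markdown looks like for a collapse_input cell (the placeholder-wrapped source nested inside the code fence nbconvert adds), and `fix_collapsed_input` is a hypothetical function that repeats the two replace calls from the class above.

```python
FENCE = "`" * 3  # built this way so the example contains no literal code fence


def fix_collapsed_input(markdown: str) -> str:
    """Apply the two input-related replacements from CollapsePostprocessor."""
    markdown = markdown.replace(
        FENCE + "python\n{{< detailsInput >}}",
        '{{< details summary="Input collapsed:" altSummary="Input expanded:" >}}',
    )
    return markdown.replace("{{< /detailsInput >}}\n" + FENCE, "{{< /details >}}\n")


if __name__ == "__main__":
    # Assumed shape of the converted markdown for one collapse_input cell.
    converted = "\n".join([
        FENCE + "python",
        "{{< detailsInput >}}",
        FENCE + "python",
        'print("collapse_input")',
        FENCE,
        "{{< /detailsInput >}}",
        FENCE,
    ])
    print(fix_collapsed_input(converted))
```

After the replacements, the outer fence is gone and the inner fenced code sits between the opening and closing details shortcodes.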

5. Putting it all together:

Now we can tie everything together for ease of use. We will create a script that runs nbconvert and then collapse_postprocessor.py.

path/to/project/python/jupyter/ipynb_to_md.py:

import sys
import os.path
import subprocess
from collapse_postprocessor import CollapsePostprocessor

def main():

    # check arguments
    if len(sys.argv) == 1:
        notebook_filepath = input("filepath argument not provided. Please provide: ")
    elif len(sys.argv) == 2:
        notebook_filepath = sys.argv[1]
    elif len(sys.argv) >= 3:
        raise IOError("Invalid # of arguments.  Usage: script.py <filepath/filename.ipynb>")
    
    # check config + pre/post processing filepaths
    directory_path = os.path.dirname(os.path.abspath(__file__))
    config_filepath = os.path.join(directory_path, "nbconvert_config.py")
    preprocessor_filepath = os.path.join(directory_path, "collapse_preprocessor.py")
    postprocessor_filepath = os.path.join(directory_path, "collapse_postprocessor.py")

    for filepath in [notebook_filepath, config_filepath, preprocessor_filepath, postprocessor_filepath]:
        if not os.path.isfile(filepath):
            raise IOError(f"Could not locate {filepath}")
    
    
    # nbconvert w/ preprocessing; check=True stops the script if the conversion fails
    subprocess.run(['jupyter', 'nbconvert', '--to', 'markdown', '--config', config_filepath, '--output', 'index', notebook_filepath], check=True)

    # run postprocessor
    output_directory = os.path.dirname(os.path.abspath(notebook_filepath))
    output_filepath = os.path.join(output_directory, "index.md")
    postprocessor = CollapsePostprocessor(output_filepath)
    postprocessor.process()

if __name__ == "__main__":
    main()

To use: python python/jupyter/ipynb_to_md.py path/to/notebook.ipynb

The script checks that the notebook file exists, along with the config and pre/postprocessor files listed above. It then runs nbconvert, this time specifying the --output index argument so your notebook doesn’t need to be called index.ipynb (it will generate an index.md regardless of the notebook name). Once the markdown conversion is finished, the script runs the postprocessor to fix our shortcode placement. The end result is that our input and output collapsed cells will now look like this:

Input collapsed:

print("collapse_input")

collapse_input

And

print("collapse_output")

Output collapsed:

collapse_output

Summary

And that should be it! You can now convert your Jupyter notebooks to markdown for use within Hugo or other static site generators. You can use the tags hide_cell, hide_input, and hide_output to keep content out of your markdown entirely. Or you can instead collapse cells with collapse_cell (adjacent collapsed cells are grouped) and collapse inputs/outputs with collapse_input and collapse_output. This lets you use notebooks for data science or some other purpose, giving a clean report while still preserving the details.
