My Jupyter notebook is getting long, which makes it difficult to navigate.
I want to save each chapter (a cell starting with Heading 1) to a different file. How can I do that? Cutting and pasting multiple cells between notebooks does not seem to be possible.
This is the method I use - it is a little awkward, but it works:
I believe that the developers may be working on a better solution for a future release.
The easiest way might be to edit the .ipynb file in a text editor. Below I have listed the content of a very simple notebook.
The notebook looks like this:
Chapter 1
In [1]: 1+1
Out[1]: 2
Chapter 2
In [2]: 2+2
Out[2]: 4
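To take out chapter 1 and place it behind chapter 2, cut the cell objects that make up chapter 1 (its heading cell and its code cell) out of the "cells" list and paste them after the last cell of chapter 2, keeping the commas between cell objects valid.
You can manipulate multiple notebooks in a similar fashion.
This is the .ipynb file for the example: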
{
 "metadata": {
  "name": "",
  "signature": ""
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Chapter 1"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "1+1"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 1,
       "text": [
        "2"
       ]
      }
     ],
     "prompt_number": 1
    },
    {
     "cell_type": "heading",
     "level": 1,
     "metadata": {},
     "source": [
      "Chapter 2"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "2+2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 2,
       "text": [
        "4"
       ]
      }
     ],
     "prompt_number": 2
    }
   ],
   "metadata": {}
  }
 ]
}
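If hand-editing the JSON feels error-prone, the same move can be scripted. The following is only a minimal sketch, assuming the old nbformat 3 layout shown above (cells under "worksheets", chapters marked by level-1 "heading" cells). The function name `move_chapter_to_end` and the tiny in-memory notebook are made up for illustration; for a real file you would load it with `json.load` and write it back with `json.dump`.

```python
def move_chapter_to_end(nb, title):
    """Move the level-1 heading `title` and the cells below it to the end."""
    cells = nb["worksheets"][0]["cells"]

    def is_chapter_start(cell):
        return cell["cell_type"] == "heading" and cell.get("level") == 1

    # Find the heading cell that opens the chapter.
    start = next(
        i for i, cell in enumerate(cells)
        if is_chapter_start(cell) and "".join(cell["source"]).strip() == title
    )
    # The chapter runs until the next level-1 heading (or the end of the list).
    end = start + 1
    while end < len(cells) and not is_chapter_start(cells[end]):
        end += 1
    nb["worksheets"][0]["cells"] = cells[:start] + cells[end:] + cells[start:end]
    return nb


# Stripped-down stand-in for the example notebook (most fields omitted).
nb = {"worksheets": [{"cells": [
    {"cell_type": "heading", "level": 1, "source": ["Chapter 1"]},
    {"cell_type": "code", "input": ["1+1"]},
    {"cell_type": "heading", "level": 1, "source": ["Chapter 2"]},
    {"cell_type": "code", "input": ["2+2"]},
]}]}

move_chapter_to_end(nb, "Chapter 1")
order = ["".join(c.get("source") or c.get("input"))
         for c in nb["worksheets"][0]["cells"]]
```

Slicing the list instead of editing text means the commas and brackets stay valid automatically.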
A notebook file is in JSON format, so I load the whole file as JSON and split it into several files automatically.
This is the code I wrote.
The code may look complex, but it is simple once you read through it. An example of a split-out file, which I also converted to HTML after splitting, is at http://www.fun-coding.org/DS&AL4-1.html
import json
import re

def notebook_spliter(FILENAME, chapter_num):
    # Load the source notebook (nbformat 4: cells live at the top level).
    with open(FILENAME + '.ipynb', encoding='utf-8') as data_file:
        data = json.load(data_file)
    copy_cell, chapter_in = list(), False
    # A chapter heading is a markdown cell starting with "## <number>. ".
    regx = re.compile(r"## ([0-9]+)\. ")
    for cell in data['cells']:
        if cell['cell_type'] != 'markdown':
            # Non-markdown cells are copied only while inside the chapter.
            if chapter_in:
                copy_cell.append(cell)
            continue
        regx_result = regx.match(cell['source'][0]) if cell['source'] else None
        if regx_result:
            current = int(regx_result.group(1))
            if not chapter_in:
                if current == chapter_num:
                    chapter_in = True
                    copy_cell.append(cell)
            elif current != chapter_num:
                # Heading of a different chapter reached: stop copying.
                break
        elif chapter_in:
            copy_cell.append(cell)
    # Build the new notebook locally instead of relying on a global dict.
    copy_data = {
        "cells": copy_cell,
        "metadata": data["metadata"],
        "nbformat": data["nbformat"],
        "nbformat_minor": data["nbformat_minor"],
    }
    with open(FILENAME + '-' + str(chapter_num) + '.ipynb', 'w', encoding='utf-8') as fd:
        json.dump(copy_data, fd, ensure_ascii=False)
The function above extracts one chapter from a notebook file. I mark chapters in the notebook with '## 1. chapter name' in a markdown cell, so it only has to match the "## digit." pattern with a regular expression. It copies the cells that belong to the given chapter number and saves only those cells, plus the rest (metadata, nbformat, and nbformat_minor), to a separate file.
The next piece of code collects all chapter numbers in the notebook and calls the function once per chapter.
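# Module-level dict that notebook_spliter may write into.
copy_data = dict()
FILENAME = 'DS&AL1'
CHAPTERS = list()
# Chapter headings look like "## 1. chapter name". The pattern is defined
# here because the one inside notebook_spliter is local to that function.
regx = re.compile(r"## [0-9]+\. ")
regx2 = re.compile("[0-9]+")
with open(FILENAME + '.ipynb', encoding='utf-8') as data_file:
    data = json.load(data_file)
for cell in data['cells']:
    if cell['cell_type'] == 'markdown' and cell['source']:
        regx_result = regx.match(cell['source'][0])
        if regx_result:
            number = int(regx2.search(regx_result.group()).group())
            # Record each chapter number once, even if a heading repeats.
            if number not in CHAPTERS:
                CHAPTERS.append(number)
print(CHAPTERS)
for chapternum in CHAPTERS:
    notebook_spliter(FILENAME, chapternum)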
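For comparison, the same idea can be done in a single pass over the cells rather than re-reading the file once per chapter. This is just a sketch, not a drop-in replacement for the code above: it assumes an nbformat 4 notebook already loaded into a dict, and it assumes chapters start at markdown cells beginning with "# " (a level-1 heading, as in the question). The name `split_by_heading` and the `prefix` parameter are made up for illustration.

```python
import copy


def split_by_heading(nb, prefix="# "):
    """Return one notebook dict per chapter.

    A chapter starts at a markdown cell whose text begins with `prefix`.
    Cells before the first such heading end up in their own leading part.
    """
    chapters = []
    for cell in nb["cells"]:
        text = "".join(cell.get("source", ""))
        starts_chapter = cell["cell_type"] == "markdown" and text.startswith(prefix)
        if starts_chapter or not chapters:
            chapters.append([])
        chapters[-1].append(cell)
    # Each part keeps the original metadata and format version.
    return [
        {
            "cells": copy.deepcopy(cells),
            "metadata": copy.deepcopy(nb.get("metadata", {})),
            "nbformat": nb["nbformat"],
            "nbformat_minor": nb["nbformat_minor"],
        }
        for cells in chapters
    ]


# Small in-memory notebook standing in for a real file.
nb = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Chapter 1"]},
        {"cell_type": "code", "metadata": {}, "source": ["1+1"],
         "outputs": [], "execution_count": None},
        {"cell_type": "markdown", "metadata": {}, "source": ["# Chapter 2"]},
        {"cell_type": "code", "metadata": {}, "source": ["2+2"],
         "outputs": [], "execution_count": None},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

parts = split_by_heading(nb)
```

Each element of `parts` can then be written with `json.dump` to its own .ipynb file. Passing `prefix="## "` would roughly match the '## 1. chapter name' convention used above, although it does not check for the chapter number.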
Some years later, luckily there is a library that can do such things for you:
pip install nbmanips
nb select has_html_tag h1 | nb split -s nb.ipynb
The first part of the command (nb select has_html_tag h1) will tell nbmanips on which cells to perform the split.
The second part (nb split -s nb.ipynb) will split the notebook based on the piped selection. The -s flag tells nbmanips to use the selection instead of a cell index.
Source: https://towardsdatascience.com/split-your-jupyter-notebooks-in-2-lines-of-code-de345d647454
Library: https://pypi.org/project/nbmanips/