Branching experiments

A worked example. Fork a notebook, develop two approaches, compare them, and converge.

This walks one branching workflow end to end. For the ideas behind it (checkpoints, the branch graph, convergence), read Branches first.

The setup: you've loaded and cleaned a dataset and fit a logistic-regression baseline. You want to compare it against a gradient-boosted model without redoing any of that.

1. Get to the fork point

Work the notebook up to the cell where the two paths split. Here that's right after the train/test split, with X_train, y_train, and the rest in the kernel. Everything above is shared setup you don't want to repeat.

2. Fork

Fork from that cell. Clusy checkpoints the kernel state, so the new branch starts with your loaded and split data already in memory. Name it something you'll recognize, since "gbm" beats "branch 2."

You now have two branches sharing the setup and diverging from here. The baseline branch is untouched.

3. Build the new branch

On the new branch, build the gradient-boosted approach: fit, evaluate, and plot whatever you'll compare on. Ask the agent or write it yourself. Because the setup state came along in the checkpoint, the first new cell can use X_train right away.

4. Compare

Switch between the branches, or open the tree view to see how they relate. Line up the numbers you care about, like accuracy, the confusion matrices, and the ROC curves. Both branches ran from the same checkpoint, so you know they saw the same data and the comparison is honest.

5. Converge

Once you've decided:

Winner carries that branch's result and kernel state into the cells downstream, and you keep building from there.
Comparison keeps both, when the comparison itself is the deliverable.
Manual lets you take pieces of each.

The point where the branches rejoin is a convergence point, a diamond in the graph that says these split back at the train/test cell and came together here.

Tips

Fork early. It's cheap, because it snapshots state instead of recomputing. Don't agonize over whether an idea is worth a branch. Just try it.
Name branches. A tree of "branch 1, 2, 3" is its own puzzle.
Set dead ends to dormant instead of deleting them. You keep the record of what you tried without the clutter.
Branching isn't committing. When you've got a result worth versioning, commit the notebook to GitHub.