@dspreadbury I’ve been working on optimal line breaking for Byzantine neume notation in my open-source scorewriter, Neanes, and I wanted to reach out because I think the approach may be relevant more broadly to Western staff notation software, including Dorico.
One core problem is that music notation with lyrics has break-dependent geometry: when two notes appear on the same line, their lyric syllables can often tuck underneath one another, reducing the horizontal space required between them. When a line break separates them, each note needs its full width. In other words, the effective spacing between notes depends on whether they end up on the same line, which is exactly what the line-breaking algorithm is trying to decide.
The Knuth-Plass algorithm, which produces optimal paragraph breaks in TeX, is a natural fit for music notation, and Hegazy and Gourlay proposed this approach in 1987. In practice, though, there has been a significant obstacle: the algorithm has no straightforward way to represent break-dependent widths. TeX addresses a related issue with discretionary nodes. This is acceptable for local text substitutions involving small widths, such as discretionary ligatures, but it is not well suited to music notation because they cannot contain flexible spacing (glue).
Break-dependent widths can, in principle, be encoded within Knuth’s original model by representing the difference between the break and no-break cases as negative widths in the standard box/glue/penalty algebra, even when the effect is split across the end of one line and the start of the next. The difficulty is that this violates one of the algorithm’s core assumptions, that widths are never negative. This causes performance to degrade from linear to quadratic time, which is prohibitively expensive even on modern hardware.
I found a way to relax this assumption while preserving efficiency. The key idea is to use a precomputed array (in data structures and algorithms terms, a prefix sum) that allows the algorithm to continue pruning its search space effectively. The result is an algorithm that finds provably optimal line breaks in time that scales linearly with the number of notes rather than quadratically.
I’ve implemented this approach and validated it on hundreds of real Byzantine neume notation scores. The technique is both optimal and general, applying to any situation in which a line break changes the effective width of neighboring elements. Western staff notation has the same class of interactions, where lyric syllables, dynamics, articulations, and other markings that extend beyond their parent note and interact differently with neighboring objects depending on line assignment.
I’m planning to prepare a paper on this over the coming months. I’d be very interested to know whether this resonates with challenges you’ve encountered in Dorico’s layout engine. I’m open to discussing how to apply this approach to efficiently implement optimal Hegazy and Gourlay style line breaking in Western staff notation.