I hope this doesn’t come across as too much of a stream of consciousness. I have used music scanning software for well over a decade. There are many perfectly legal reasons for wanting to do this. Sometimes I want to base a new project, in part, on some public domain music. Often, I have purchased arrangements that either contain mistakes, or I may need a different transposition for a substitute instrument. Sometimes I need to produce a decent rendering of the music. Sometimes I need to alter an arrangement to a different key, or perhaps a different length, to match the needs of a service. There are many legitimate reasons for wanting to turn printed music into MusicXML.
Sadly, the state of the art is very poor. There are few products out there. None ever work completely accurately on anything but the simplest of pieces. None of the products show much commitment to development and support. This becomes a vicious cycle. If the tools don’t work very well, then there is not much market, and therefore no money to justify further development.
It seems to me that in a world swimming with AI, music recognition may be the perfect case for applying it. It should be noted that what passes for “AI” today is, in many cases, simply a developer invoking an existing large language model (LLM). The developer isn’t doing any “AI development” per se, just using a “black box.” In other cases, specialized neural nets are developed for specific applications. We see that in SpectraLayers, and now in Cubase 15 with the stem separation features. Stem separation is completely unrelated to LLMs and required people to create new nets from scratch. I doubt that Steinberg had a direct hand in developing these stem separation AIs, but certainly Steinberg is becoming familiar with the process.
To develop a new neural net, one needs training data, where each example pairs a source with a known outcome. We have that by the millions in music notation. We have a practically infinite supply of printed music, and much of it has a digital equivalent that could be used to train the nets. I am not saying that is easy. But I am saying that if anybody ever gets this right, it will be transformational in how we composers, arrangers, songwriters, soundtrack builders, and engravers go about our jobs.
I would think Yamaha and Steinberg should have more than a passing interest in this. A really effective scanner/converter could feed right into Dorico, Cubase, and probably other Yamaha products.
Given what we have seen of other AI developments, I have no doubt that this could be accomplished if sufficient resources were available. And that comes down to the business case. What we are seeing with many products (look at the new Canva/Affinity announcement, for example) is a base product that might be available on a perpetual license, but all the AI stuff requires a subscription.
Many of us absolutely deplore monthly subscriptions. I curse myself every month when I pay the cable TV bill (I should be cutting the cable; long story). OTOH, I am willing to pay for valuable functions. I only object to being charged every month if I am not using the service.
So, FWIW, I believe an AI-based music scanning product could be offered using a “credit” model instead of a subscription model. That is, maybe we could pay 50 cents for each page successfully scanned, and we might buy 100 pages of credits at a time. I would find that perfectly acceptable because I can value my time saved for each page I do not have to enter by hand.
Anyway, all of this is to say that I think somebody ought to be doing this. We have lived with the existing half-baked tools far too long. I believe somebody will eventually do it, and it seems to fit the Yamaha/Steinberg business better than most.
Any thoughts?