I’m on the ChatGPT+ plan and I tested whether the Codex Apple Silicon app could generate a 5-string double bass .doricolib file for me. I placed the instruments.xml file from the Dorico 6.1.1 app bundle into the project folder and described what I wanted. Codex analysed Dorico’s original 3 MB instruments.xml file and produced a .doricolib file containing a 5-string double bass tuned in fourths: B–E–A–D–G.
On the first attempt, the instrument ended up in the Custom category, and as a result the string section bracket didn’t automatically extend to include it. I was able to fix that with another analysis and a new file, which I’m attaching here as well.
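For reference, the tuning itself is easy to sanity-check independently of the generated file. Here’s a minimal sketch; the MIDI numbers and the interval check are my own illustration (the actual .doricolib encodes tuning in Dorico’s own format, which may differ):

```python
# Sanity-check the 5-string double bass tuning B0–E1–A1–D2–G2 (sounding pitches).
# These MIDI numbers are illustrative, not taken from the generated .doricolib.

NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def midi_number(name: str, octave: int) -> int:
    """MIDI note number with the convention C-1 = 0 (so C4 = 60)."""
    return 12 * (octave + 1) + NOTE_OFFSETS[name]

strings = [("B", 0), ("E", 1), ("A", 1), ("D", 2), ("G", 2)]
midi = [midi_number(n, o) for n, o in strings]
print(midi)  # [23, 28, 33, 38, 43]

# Each adjacent open string should be a perfect fourth (5 semitones) apart.
intervals = [b - a for a, b in zip(midi, midi[1:])]
assert intervals == [5, 5, 5, 5]
```

The low B sits a fourth below the standard 4-string E, which is exactly what the extra string is for.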
So, you find the category mismatch is something that needs the by now proverbial ‘human in the loop’, but you have no objections to the chatbot – as ever, confidently – spelling the instrument name with two b’s in each and every instance in the file?
When a comment is patronising and at the same time based on incorrect facts, there’s always a chance the person who wrote it ends up making a fool of themselves.
ChatGPT-5.3-Codex, which runs in the Codex app, seems to be the first one that doesn’t hallucinate, even when writing simple code. I’ve had the same experience with Home Assistant add-ons and with Dorico.
Speaking of incorrect facts, though – and I am aware of the implicit condescension, but I think it can’t be said enough: an LLM’s fundamental mode of operation is to hallucinate, always. The fact that, more often than not, the output matches what you wanted to get does not change that.
Yes – LLMs do always hallucinate, that’s true. I’ve listened to some quite long podcasts with people involved in their development (Lex Fridman), and I’ve also come to understand that today’s systems are called “AI” because the name is sexier and sells subscriptions more effectively. “Intelligence” – think about it!
That’s why I let the LLM do the grunt work and I always double-check to make sure there aren’t any major hallucination errors. Even so, 5.3-Codex is the first one where those errors are actually fixable — and where successive code edits don’t drift further and further away from a working solution. That was a surprise to me as well.
And point doubly taken, because had I checked the language flag in the file, I most probably would not have written the smug response. Good-faith question, though: if you wanted an Italian-named instrument, why didn’t you task the chatbot with that right away?
That’s because I did all of this on the bus, on my way to the concert venue to perform, and I forgot to include the language in the prompt. It was simpler and quicker to change those names manually in the XML file afterwards.
That’s where most of the important work gets done.
I’ll be interested to see how this works when it filters down to the ‘free’ tier. Using ChatGPT as help for writing software is an exercise in patience, as it frequently makes up properties and methods for system objects.
Remember that endpoints contain VST saved-state data, which isn’t text-readable code, so it won’t be able to generate those easily.
I certainly wouldn’t call it a conversation. Creating the .doricolib file for a five-string double bass tuned B–E–A–D–G, and testing it in Dorico, took maybe half an hour in total. And because I was in a moving vehicle, gaps in mobile coverage in a sparsely populated rural area also played a part.
P.S. For anyone who’s interested: I also managed to get a pretty good result with a LilyPond → MusicXML conversion. Even the lyrics ended up under the notes, although I had to use the Codex app to copy them separately from the .ly file and inject them into the MusicXML (since python-ly doesn’t support that).
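The injection step can be sketched roughly like this. The syllable list stands in for whatever gets pulled out of the .ly file’s \lyricmode block (real LilyPond lyrics have hyphens, extenders, and melismas that need more care), and the one-syllable-per-note matching is naive, but the `<lyric>` element structure follows the MusicXML spec:

```python
import xml.etree.ElementTree as ET

# A toy MusicXML fragment with two notes and no lyrics. This sketch assumes
# every <note> gets exactly one syllable, in order; real files also contain
# rests and melismas that would need to be skipped or handled.
musicxml = """<part id="P1"><measure number="1">
  <note><pitch><step>C</step><octave>4</octave></pitch><duration>1</duration></note>
  <note><pitch><step>D</step><octave>4</octave></pitch><duration>1</duration></note>
</measure></part>"""

# Syllables as they might be extracted from a \lyricmode block in the .ly file.
syllables = ["Hel", "lo"]
syllabics = ["begin", "end"]  # MusicXML word position: single/begin/middle/end

root = ET.fromstring(musicxml)
for note, syl, kind in zip(root.iter("note"), syllables, syllabics):
    lyric = ET.SubElement(note, "lyric", number="1")
    ET.SubElement(lyric, "syllabic").text = kind
    ET.SubElement(lyric, "text").text = syl

result = ET.tostring(root, encoding="unicode")
assert "<syllabic>begin</syllabic>" in result and "<text>Hel</text>" in result
```

Dorico then picks the lyrics up on import like any other MusicXML lyrics.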
This opens up some entirely new possibilities for importing freely available LilyPond files into Dorico and then making further changes/edits there.