In previous posts here I’ve talked about the idea of computers and AI composing music. Today I’d like to consider the possibility of computers aiding in human composition. Specifically, I ask: can we envision software that works with humans, particularly non-musicians, to create music?
One might ask, “why would we even try?” I think an obvious use case would center around enabling videographers, film directors and producers at ad agencies to produce music for their work without hiring a composer. (I, as an occasionally paid composer, am not particularly happy about this scenario but that doesn’t mean it won’t happen.)
Let’s envision this process. A non-musician sits down at a computer and requests, “30 seconds of happy music that rises to a climax 22 seconds in and then gradually lowers the dynamics.” This might map to a specific piece of footage that person intends to set the music to.
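To make that request concrete, here’s a minimal sketch of how such software might represent it internally. Everything here is illustrative: the class and field names are my own invention, not any real product’s API.

```python
from dataclasses import dataclass

# Hypothetical internal representation of the user's request.
# All names are illustrative assumptions, not a real system's schema.
@dataclass
class MusicRequest:
    duration_seconds: float   # total length of the cue
    mood: str                 # e.g. "happy"
    climax_seconds: float     # where the music should peak
    fade_after_climax: bool   # gradually lower the dynamics afterward

# The request from the scenario above:
request = MusicRequest(
    duration_seconds=30.0,
    mood="happy",
    climax_seconds=22.0,
    fade_after_climax=True,
)
```

The point is simply that a natural-language request boils down to a handful of parameters the composing engine can act on.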
How would the computer handle this? It would need to know what “happy music” really means. Generally speaking, that could be translated into music set in a major key, probably using chord progressions common to pop music. It would involve melodies that likely don’t use much dissonance, etc.
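That translation could be pictured as a lookup from a mood word to rough musical parameters. The table below is a toy sketch with values I’ve chosen for illustration; a real system would learn these associations rather than hard-code them.

```python
# Toy mapping from mood words to musical parameters.
# The specific values are illustrative assumptions, not learned data.
MOOD_PARAMETERS = {
    "happy": {
        "mode": "major",
        "progression": ["I", "V", "vi", "IV"],  # a common pop progression
        "dissonance": "low",
        "tempo_bpm": 120,
    },
    "sad": {
        "mode": "minor",
        "progression": ["i", "VI", "III", "VII"],
        "dissonance": "moderate",
        "tempo_bpm": 70,
    },
}

params = MOOD_PARAMETERS["happy"]
print(params["mode"])  # major
```

Even this crude version shows the shape of the problem: emotional language on one side, concrete compositional choices on the other.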
So the computer composes and spits out some possibilities. The human listens to them, chooses one and says, “Actually, I’d like more of a Caribbean sound.” So the computer adjusts the rhythms to have a more reggae flavor. Maybe it changes some of the instruments to sound like steel drums and the like.
How would this work on the back end? Obviously the software would need to “understand” what all these terms mean. It would need to map terminology about various emotional cues, genres, geographic music styles (e.g. “Brazilian music” or “Detroit rock”) and such to actual music. How would it do this? My guess is that deep learning, in which neural networks are trained to learn semantic meanings, is well suited to such a process.
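One common way to learn semantic meanings is to place terms in a shared vector space and match a user’s word to the nearest known style. Here’s a toy sketch of that idea; the vectors are made up for illustration, whereas a real system would learn them from large amounts of data.

```python
import math

# Made-up style vectors for illustration only; a trained model would
# learn embeddings like these from audio and text data.
STYLE_VECTORS = {
    "reggae":       [0.9, 0.1, 0.2],
    "steel-drum":   [0.6, 0.3, 0.6],
    "detroit-rock": [0.1, 0.9, 0.4],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_style(query_vector):
    """Return the known style whose vector is closest to the query."""
    return max(STYLE_VECTORS, key=lambda s: cosine(STYLE_VECTORS[s], query_vector))

# A request for a "Caribbean sound" might embed near reggae:
print(nearest_style([0.85, 0.15, 0.25]))  # reggae
```

The attraction of this approach is that vague requests (“more Caribbean,” “grittier”) land somewhere sensible in the space even when they don’t name a genre exactly.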
Ideally such software would allow users to go “under the hood” as they see fit. For instance, the user might get his or her 30 seconds of happy reggae and then say, “I just want to tweak that one note in bar 12,” and be able to do so.
In a way, this sort of thing is a logical extension of digital audio workstations (DAWs) like Pro Tools and GarageBand that are already used to mix and match various prerecorded audio and MIDI files. The main difference here is that the software is composing the music on the fly.
Of course, this kind of tool could be used not just by people looking to score visual material but by bands and songwriters. A music group might gather at the computer screen to use the software to write at least rough forms of tunes they would then edit and tweak. (As I understand it, something like this is already going on.)
Now, some people will be absolutely outraged by this idea. Some listeners will doubtless charge that computer-aided compositions are inferior. They will say the tool is simply a way for amateurs to cheat. And those complaints will have some validity. But I doubt any of that will stop this sort of thing from happening.