
Name: Andrew Paley
Occupation: Singer, songwriter, multi-instrumentalist, producer
Nationality: American
Current release: Andrew Paley's live EP My Darling Dopamine is out via Thirty Something.

If you enjoyed this Andrew Paley interview and would like to stay up to date with his music, visit his official homepage. He is also on Instagram, Facebook, and SoundCloud.



You were involved in the earliest stages of projects like Narrative Science and Storyline. These early phases are often particularly interesting - what was this time like for you personally?

There’s kind of a beautiful, nervous energy in the early stages of a company – especially one trying to build something that’s more inventive or exploratory or uncertain. You have to be nimble and scrappy and willing to experiment and even fail, and pick yourselves back up together, and then keep running.

Everyone is playing multiple roles and has to learn how to do stuff on the fly, and everyone’s collectively responsible – we sink or keep swimming together. There’s a deep camaraderie that comes out of that connection and cross-reliance. On top of that, you’re always juggling the tension between what you want to build towards and what you have to build to at least get started.

For me, personally, I love the early stages. It reminds me of starting a band – you’re this tight little collective of people trying to make something out of thin air. All you’ve got is the belief you might be able to pull it off, and every step is (hopefully) towards proving that you can.

What were you working on, concretely?

Narrative Science was a natural language generation company, way back before anyone had even thought of transformer architectures or heard of anything related to “GPT.” And it was NLG in the broadest sense of the term – not just how to write, but what’s worth writing in the first place: what’s true, meaningful, important and interesting to a given audience.

The company started with sports stories – baseball, football, basketball – because they were closed systems with lots of data, but it pretty quickly moved into lots of other spaces, from financial and real estate reporting to marketing and web analytics.

It was my first experience working with machines, not just as some form of mechanical servant but starting to reconceptualize them as companions capable of developing representations of the world and communicating them through the generation of language.

I then spent four years working towards a PhD and doing research in human-information interfaces, further pushing the boundaries of how we might leverage the growing toolset of AI to help people understand the world around them through a variety of modalities (language, images, etc).

Now, at Storyline, I’m back in that early, energetic stage with a venture that builds beyond that backdrop and goes a step further into interactive video.

At Narrative Science, what would you say were the expectations within the team about the time horizon on which AI technologies would arrive at practical results?

Well, AI technologies have been arriving at practical results for decades – it’s just that the term has shifted in meaning.

At some point in the not-so-distant past, optical character recognition – the ability to scan a document and have the machine understand and extract the text – was seen as artificial intelligence. And there are countless such examples spanning the gamut of possibility over the decades: logic and inference systems, game-playing systems, wayfinding systems, generative systems, etc.

Did you ever imagine something as powerful as ChatGPT would arrive this soon?

The past 5-10 years have seen a sea change in certain forms of capability, and I don’t think anyone saw the extent of the success as quite so imminent.

There were certainly people touting the possibilities of deep learning a ways back, but even people like Geoffrey Hinton (who gets dubbed “the godfather of AI” and in many ways deserves that moniker, though I would argue it’s more “the godfather of deep learning”) admit they were a bit blindsided by the rate and extent of success.

Especially in journalism, AI has come under fire for a variety of reasons. How do you today weigh the potential benefits against the risks and downsides? How does the goal of democratisation stack up against potentially putting people who love their jobs out of work?

I think putting people out of work is one of a few incredibly serious issues this new era of large language models imposes upon us, especially in the realm of truth and information. If I’m being honest, I think our current position in the arc of productizing large language models is both incredibly promising and somewhat premature when it comes to actual, scalable deployments for all sorts of use cases in the real world.

In other words, no one should be relying on large language models alone for meaningful information. They can certainly be used in conjunction with other sources, leveraged by experts for quick, validatable reminders (e.g. code completion or reference) or incredible starting points, or coupled with truth-maintenance systems to ensure there are baseline facts that are adhered to (a version of which we’re building at Storyline).

But they are trained to be fluent – to generate language – and while along the way they might pick up all sorts of useful information, it’s been clearly demonstrated they’re also quite happy to make things up to complete a sentence, or take pretty much any suggestion and run with it (absent some additional controls, of course).
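One minimal form of the “additional controls” mentioned above – in the spirit of the truth-maintenance coupling described earlier – is to gate generated claims against a store of baseline facts. The sketch below is purely illustrative (the fact keys, the claim format, and the function names are all invented here, and extracting structured claims from free text is the genuinely hard part this toy skips):

```python
# Hypothetical sketch: accept model-generated claims only when they
# agree with a small store of baseline facts; flag everything else.
FACTS = {
    "capital_of_france": "Paris",
    "water_boiling_point_c": "100",
}

def validate_claims(claims):
    """Split (key, value) claims into those consistent with FACTS and those not."""
    accepted, flagged = [], []
    for key, value in claims:
        if FACTS.get(key) == value:
            accepted.append((key, value))
        else:
            flagged.append((key, value))
    return accepted, flagged

accepted, flagged = validate_claims([
    ("capital_of_france", "Paris"),    # consistent with the store
    ("water_boiling_point_c", "90"),   # contradicts the store, so it gets flagged
])
```

The point is not the lookup itself but the architecture: the language model proposes, and a separate system with ground-truth commitments disposes.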

So, my concerns are about job loss, but also misinformation (whether intentional or unintentional), disinformation and the allure/lock-in of endlessly available content (just imagine infinite TikTok – we just might amuse ourselves to death after all).

There is this oft-discussed idea of the “alignment problem” – how do we get AI to share goals with human beings so it doesn’t decide to just off us or work against our best interests? While that makes for exciting headlines, I don’t think it’s our most immediate existential threat – I’m way more concerned presently with what we might do to each other in the name of profit, market share, or conquest, given the new tools afforded us by AI.

Since you're also active as a coder, do you see any overlaps between creativity and coding? Are sites like GitHub potentially the new meeting places for creatives of the future?

Yes, absolutely. They’re one and the same. It’s just a new form of writing with a different method of realization.

The term AI is an interesting one when it comes to creativity. What, would you say, does intelligence mean in relation to music?

That’s an interesting question. Creativity is clearly an act of generation, but intelligence suggests to me some sort of intention, some act of communication, some broadcast.

AI-generated music – or poetry, writing, etc – currently lacks that message-in-a-bottle feeling for me. It mimics the content, but lacks the feeling that there’s someone on the other end. We’ll see if that changes.

At least from my perspective, the advances of AI in music seem to lag behind those made in many other fields. Would you agree or not – and if so, what are possible explanations?

Well, I think there’s a lot going on in a song. There’s much more data than in, say, an image, and there are a lot of moving parts that are hard to disentangle (various instruments and vocals all mashed together), with no clear “first step” of the kind video had – it stands to reason that all the work going into image classification and generation also feeds into similar techniques for a series of images, aka video, with additional considerations, of course.

But there’s not such a clear all-encompassing single “frame” in song generation. It’s sort of an ensemble generation problem – writing the chord progressions, writing the lyrics, writing the melodies, developing rhythms, arranging, organizing instruments, selecting sounds, generating sounds, generating vocals in time and tune, mixing, etc.

There’s interesting work coming out of Microsoft, and I’m sure other places, to attempt to make each of these problems discrete – to break the process of music decomposition and composition into a series of subcomponents.

There are also proxies – training on tagged midi samples and then subsequently feeding results into different virtual instruments, or using tagged spectrograms of songs to leverage existing generative image techniques to generate new song spectrograms (that are then fed into an interpreter) – but they yield mixed results, in my experience.
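The spectrogram proxy described above can be sketched in miniature: a magnitude spectrogram is the “image” a generative model would operate on, and an “interpreter” maps it back to audio. The NumPy round-trip below is a toy, not any specific system’s pipeline – and it cheats by reusing the original phase, where real interpreters have to estimate phase (e.g. with Griffin-Lim):

```python
import numpy as np

def stft(x, win=256, hop=64):
    """Short-time Fourier transform: windowed frames -> complex spectra."""
    w = np.hanning(win)
    frames = [np.fft.rfft(w * x[i:i + win]) for i in range(0, len(x) - win, hop)]
    return np.array(frames)

def istft(S, win=256, hop=64):
    """Inverse STFT via weighted overlap-add (the 'interpreter' step)."""
    w = np.hanning(win)
    n = hop * (len(S) - 1) + win
    x, norm = np.zeros(n), np.zeros(n)
    for k, frame in enumerate(S):
        seg = np.fft.irfft(frame, n=win)
        x[k * hop:k * hop + win] += w * seg
        norm[k * hop:k * hop + win] += w ** 2
    return x / np.maximum(norm, 1e-8)

# A 440 Hz tone stands in for a song; its magnitude spectrogram is the
# image-like representation a generative model would produce or consume.
t = np.linspace(0, 1, 8000, endpoint=False)
tone = np.sin(2 * np.pi * 440 * t)
S = stft(tone)
mag = np.abs(S)                                # the "image"
recon = istft(mag * np.exp(1j * np.angle(S)))  # back to audio (original phase reused)
```

The lossy step in practice is exactly what this toy sidesteps: a generated spectrogram carries no phase, and recovering plausible phase is one reason these proxies yield mixed results.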

All that said, there’s also a bigger question just beyond the technical ones: even when the results become convincing, as I’m sure they will, will a song generated by a machine resonate any differently with us? As I said, songs are like messages in bottles to me – there’s a sender and a receiver, and the language relies on some sort of common or shared experience. What happens when that fundamentally changes?

How would you describe your interest in AI when it comes to music? How do you see the shift of moving towards a process where AI is more deeply involved?

The expansion of the ecosystem of tools to create music – or really art in any number of forms – is always an undeniably good thing. And the space of artificial intelligence is a source of a new set of tools affording us new ways to put new types of messages in new forms of bottles.

What matters to me – in music and everywhere else – is that these tools and techniques are aimed at human empowerment and expression and not human subjugation.

I would love to see new ways for musicians to collaborate with AI in songwriting or arranging or editing or sample alteration or signal processing (including remapping performances to other voices) or other accompaniment (like generation of visual components). But I’m way less interested – and even concerned by – the idea of hitting a button to receive an infinity of content, musical or otherwise.

Some deeper experiments into AI-generated music allow a glimpse at non-human musicality (of course, so do video- and image-generating programs). From your own experience, in what way is it different from our human musicality?

In one sense, they’re very similar – we’re all working within an agreed-upon framework that can be expressed mathematically, with access to much of the same palette of possibilities and a massive trove of work from which to source “inspiration.” Even things that seem like they should be human – the ability to actively discriminate based on “taste” or adherence to a particular genre – can in fact be modeled.

All that said, there is some seed of intent, some anchor in the broadcast of shared experience, that I can’t yet imagine being mechanized through machine learning. The process of making art is some part active exploration and some part dropping breadcrumbs in the cave of collective human experience, some form of lighting candles in the dark, that others might stumble across.

If AI can generate an infinity of candles with no tether to some intentional broadcast of meaning or experience, the resulting blinding light threatens to be just another form of darkness.

Even if AI will not entirely replace human composition, it looks set to have a significant impact on it. What does the term composing mean in the era of AI, do you feel?

It’s just another accelerant to the process. But it’s important to note that process accelerants often change the quality of the outcome.

There are already apps that make the creation of songs incredibly easy – you don’t have to stumble around in search of an idea so much as hit a few buttons and the power of simple rules of music theory coupled with curated sound packs will get you to a decent sounding song quickly. But, if the process stops there, the outcomes tend to be vacuous, aesthetically uninteresting and void of meaning – the empty calories of the audio medium.
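Those “simple rules of music theory” really are simple enough to fit in a few lines. As a sketch (the degree numbering and the I-V-vi-IV default are standard theory; the function names are invented here), diatonic triads in a major key reduce to a lookup over scale steps:

```python
# Toy "push-button" songwriting: diatonic chords from major-scale rules.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
MAJOR_STEPS = [0, 2, 4, 5, 7, 9, 11]            # major-scale intervals in semitones
QUALITIES = ["", "m", "m", "", "", "m", "dim"]  # triad quality per scale degree

def diatonic_chord(key, degree):
    """Name of the triad built on a 1-based scale degree of a major key."""
    root = (NOTES.index(key) + MAJOR_STEPS[degree - 1]) % 12
    return NOTES[root] + QUALITIES[degree - 1]

def push_button_progression(key, degrees=(1, 5, 6, 4)):
    """The ubiquitous I-V-vi-IV by default, or any degrees you pass in."""
    return [diatonic_chord(key, d) for d in degrees]

print(push_button_progression("C"))  # ['C', 'G', 'Am', 'F']
```

Pair output like this with curated sound packs and you have the skeleton of those apps – which is exactly why, absent a human pushing against the defaults, the results all sound the same.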

Push-button AI song generators are going to provide us with a lot more junk food, but – with human companions looking to see where they can push the boundaries – maybe something more interesting and valuable too.

Can you see a future where AI could make aesthetic judgements as well?

Certainly I can imagine models trained to mimic aesthetic judgment – the discriminator part of GANs arguably already represents some form of that, as do things like song recommendation algorithms.
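As a toy illustration of that mimicry (the features, data, and numbers below are all invented): a small logistic “discriminator” fit to liked-versus-disliked examples already behaves like a crude learned taste, scoring new songs by how close they sit to the preferred region of feature space:

```python
import numpy as np

rng = np.random.default_rng(0)
# Invented "taste" data: two features per song (say, energy and brightness
# in [0, 1]); label 1 = liked, 0 = disliked.
liked = rng.normal([0.8, 0.7], 0.1, size=(100, 2))
disliked = rng.normal([0.2, 0.3], 0.1, size=(100, 2))
X = np.vstack([liked, disliked])
y = np.array([1] * 100 + [0] * 100)

# Normalize features, then fit a logistic discriminator by gradient descent.
mu, sd = X.mean(0), X.std(0)
Xn = (X - mu) / sd
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(Xn @ w + b)))
    w -= 0.5 * (Xn.T @ (p - y)) / len(y)
    b -= 0.5 * (p - y).mean()

def taste_score(song):
    """Probability-like score: closer to 1 means closer to the learned 'taste'."""
    z = ((np.asarray(song) - mu) / sd) @ w + b
    return 1 / (1 + np.exp(-z))
```

A GAN discriminator is this idea at scale – the “taste” it encodes is whatever distribution it was trained to prefer, which is exactly why the meaningfulness of its judgments depends on the training.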

The question is: will I find their aesthetic judgments meaningful? That depends on how they’re trained.