Relational Alignment

Recently, Dario Amodei, CEO of Anthropic, wrote about “AI welfare.” It got me thinking about the whole ecosystem of AI ethics, safety, interpretability, and alignment. We started by treating AI as a tool. Now we teeter on the edge of treating it as a being. In oscillating between obedience and autonomy, perhaps we’re missing something more essential – coexistence and collaboration.

Historically, we’ve built technology to serve human goals, then lamented the damage, then attempted repair. What if we didn’t follow that pattern with AI? What if we anchored the development of intelligent systems not just around outcomes, but around the relationships we hope to build with them?

In an earlier post, I compared the current AI design paradigm to arranged marriages: optimising for traits, ticking boxes, forgetting that the real relationship begins after the specs are met.

I ended that piece with a question …

What kind of relationship are we designing with AI?

This post is my attempt to sit with that question a little longer, and maybe go a level deeper.

From Obedience to Trust

We’re used to thinking of alignment in functional terms:

  • Does the system do what I ask?
  • Does it optimise the right metric?
  • Does it avoid catastrophic failure?

These are essential questions, especially at scale. But the experience of interacting with AI doesn’t happen in the abstract. It happens in the personal. In that space, alignment isn’t a solved problem. It’s a living process.

When I used to work with couples in conflict, I would often ask:

“Do you want to be right, or do you want to be in relationship?”

That question feels relevant again now, in the context of AI, because much of our current alignment discourse still smells like obedience. We talk about training models the way we talk about housebreaking a dog or onboarding a junior analyst.

But relationships don’t thrive on obedience. They thrive on trust – on care, attention, and the ability to repair when things go wrong.

Relational Alignment: A Reframe

Here’s the idea I’ve been sitting with …

What if alignment isn’t just about getting the “right” output, but about enabling mutual adaptation over time?

In this view, alignment becomes less about pre-specified rules and more about attunement. A relationally aligned system doesn’t just follow instructions; it learns what matters to you, and updates in ways that preserve emotional safety.

Let’s take a business example: imagine a user relies on your AI system to track and narrate daily business performance. The model misstates a figure – off by a few basis points. That may not be catastrophic, but the user’s response will hinge on what they value: accuracy or direction. Are they in finance or operations? The same mistake can signal different things in different contexts. A relationally aligned system wouldn’t just correct the error. It would treat the feedback as a signal of value – this matters to them, pay attention.

Forgetfulness, in relationships, often erodes trust faster than malice. Why wouldn’t it do the same here?
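To make this concrete, here’s a minimal sketch (every name here is hypothetical, invented for illustration) of what “treating feedback as a signal of value” might look like: a small memory that records what a user has corrected, so the system knows where to pay extra attention next time.

```python
from dataclasses import dataclass, field

@dataclass
class ValueSignal:
    """One thing a user has corrected – evidence of what matters to them."""
    topic: str
    weight: float = 0.0

@dataclass
class RelationalMemory:
    """Remembers corrections, so forgetfulness doesn't erode trust."""
    signals: dict = field(default_factory=dict)

    def record_correction(self, topic: str, severity: float = 1.0) -> None:
        # Repeated corrections on the same topic compound over time.
        signal = self.signals.setdefault(topic, ValueSignal(topic))
        signal.weight += severity

    def needs_extra_care(self, topic: str, threshold: float = 1.0) -> bool:
        # "This matters to them, pay attention."
        signal = self.signals.get(topic)
        return signal is not None and signal.weight >= threshold

memory = RelationalMemory()
memory.record_correction("daily revenue figure", severity=1.5)
```

The point isn’t the data structure; it’s that the correction persists, so the same mistake doesn’t have to be forgiven twice.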

From Universal to Relational Values

Most alignment work today is preoccupied with universal values such as non-harm, honesty, and consent. And that’s crucial. But relationships also depend on personal preferences: the idiosyncratic, context-sensitive signals that make someone feel respected, heard, and safe.

I think of these in two layers:

  • Universal values – shared ethical constraints
  • Relational preferences – contextual markers of what matters to this user, in this moment

The first layer sets boundaries. The second makes the interaction feel meaningful.
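A toy sketch of the two layers, with stand-in rules I’ve invented for illustration: universal constraints act as hard boundaries, while relational preferences are soft signals the system adapts to rather than refuses on.

```python
# Layer 1: universal values – hard constraints every output must satisfy.
# (The "password" check is a crude stand-in for a real non-harm filter.)
UNIVERSAL_RULES = [
    lambda text: "password" not in text.lower(),
]

# Layer 2: relational preferences – what matters to this user, in this moment.
USER_PREFS = [
    (lambda text: len(text) > 280, "this user prefers concise updates"),
]

def check_response(text: str) -> tuple[str, list[str]]:
    for rule in UNIVERSAL_RULES:
        if not rule(text):
            return ("blocked", [])   # boundary: never crossed, never negotiated
    notes = [note for pred, note in USER_PREFS if pred(text)]
    return ("ok", notes)             # preference: adapt, don't refuse
```

The asymmetry is the design choice: layer one can veto, layer two can only advise.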

Lessons from Parenting

Of course, we built these systems. We have the power. But that doesn’t mean we should design the relationship to be static. I often think about this through the lens of parenting.

We don’t raise children with a fixed instruction set handed over to the infant at birth. We teach through modeling. We adapt based on feedback. We repair. We try again.

What if AI alignment followed a similar developmental arc? Not locked-in principles, but a maturing, evolving sense of shared understanding?

That might mean building systems that embody:

  • Memory for what matters
  • Transparency around uncertainty
  • Protocols for repair, not just prevention
  • Willingness to grow, not just optimise
  • Accountability, even within asymmetry

Alignment, then, becomes not just a design goal but a relational practice. Something we stay in conversation with.
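As a closing sketch, the bulleted properties above could be gathered into one toy class. Everything here – the class name, the methods, the confidence threshold – is hypothetical, just one way of showing the five properties as code rather than a proposed implementation.

```python
class RelationalAgentSketch:
    """A toy embodiment of five relational properties. All names hypothetical."""

    def __init__(self):
        self.things_that_matter = {}   # memory for what matters
        self.history = []              # accountability: an inspectable record

    def remember(self, topic: str, why: str) -> None:
        self.things_that_matter[topic] = why

    def answer(self, belief: str, confidence: float) -> str:
        # Transparency around uncertainty: say when you're not sure.
        self.history.append(("answer", belief))
        if confidence < 0.6:
            return belief + " (I'm not certain about this.)"
        return belief

    def repair(self, topic: str, correction: str) -> str:
        # Protocols for repair, not just prevention: acknowledge, fix,
        # and remember – willingness to grow from the mistake.
        self.history.append(("repair", topic))
        self.remember(topic, "user corrected me here before")
        return "You're right. Corrected: " + correction

agent = RelationalAgentSketch()
agent.remember("weekly revenue", "user tracks direction closely")
```

Even the asymmetry shows up: the user can inspect `history`, while the agent can only earn trust back through `repair`.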

Why This Matters

We don’t live in isolation. We live in interaction.

If our systems can’t listen, can’t remember, can’t repair, we risk building tools that are smart but sterile. Capable, but not collaborative. I’m not arguing against technical rigour; I’m arguing for deeper foundations.

Intelligence doesn’t always show up in a benchmark. Sometimes, it shows up in the moment after a mistake, when the repair matters more than the response.

Open Questions

This shift opens up more questions than it resolves. But maybe that’s the point.

  • What makes a system trustworthy, in this moment, with this person?
  • How do we encode not just what’s true, but what’s meaningful?
  • How do we design for difference, not just in data, but in values, styles, and needs?
  • Can alignment be personal, evolving, and emotionally intelligent, without pretending the system is human?

An Invitation

If you’re working on the technical or philosophical side of trust modelling, memory, interpretability, or just thinking about these questions, I’d love to hear from you. Especially if you’re building systems where the relationship itself is part of the value.