as it was
Getting Rid of Your Voice

Teasers

“I don’t care about what the experts say about the notes of the wine,” our wine tasting host, Tyler, railed passionately. “I’m only interested in what happens in your palate.” Music to our ears, my husband and I settled back into our seats, mildly pleased with ourselves as the unwashed masses with minds of our own.

As we discovered our taste for French wines, one guy in the audience (let’s call him Sanjay) was rising to the occasion, stepping up to be the born French connoisseur, spitting out acronyms like “GSM” and names of remote French villages that produced $8000 bottles of wine. Between Tyler’s subtle salesmanship and the varying shades of our wine appreciation, Sanjay’s loud-mouthed commentary briefly threatened to drown out the room.

But thanks to Tyler, I ended up enjoying all the teasers of the event. All for $20.

The Job

After it was over, I marvelled at Tyler and how well he’d handled Sanjay. Dealing with such a crowd can be hard work, but Tyler poured us wine after wine with rare sensitivity and proficiency. He knew his job, I thought, as I opened up to some self-reflection…

How often do I confuse the job with work? Honestly, all the time. When I’m cooking, I’m grading myself on following the recipe. When I’m hiking, I’m crushing it on the trail ahead of everyone else. When I meet a defect, I want to squarely fix it. When I find a problem, I want to solve it. Work might have a way to pump up the adrenaline. And I’m an adrenaline junkie, it would seem.

Does it make for good work? Sure. Does “facing it squarely” deliver instant gratification? Maybe. Does it get the job done? Unclear. But if I were an artist, I’d wager that this is the voice that I’ve come to recognize as mine.

The Voice

“The real issue is not how to find your voice, but… getting rid of the damn thing,” Philip Glass once quipped. I’ve tried to get rid of my pestering “face it squarely” voice, I promise you, but the only thing that’s worked for me is to be in the moment. To fill the moment with attention and openness.

Tyler seems to have mastered this, presumably from hosting dozens of wine tastings and dealing with all kinds of characters. I wonder if the wine helps too?

Like the notes of wine wafting to our noses, there’s no intellectual sophistication in the moment, just the experience that no one else can have. And like the flavors dissolving on our tongues, there was no judgement, just a sense of what must be done… attend to your experience and let others attend to theirs.

MCP Transport Inspection

The First Two Laps

A couple of weeks ago, I wrote up a scrappy MCP (Model Context Protocol) Server and Client. Beyond the fact that everyone is creating MCP servers, there were two reasons I wanted to write these:

  1. I wasn’t able to find a way to test the newly introduced Streamable HTTP transport in the Model Context Protocol, either on the server or the client. At the time of writing, the MCP Inspector didn’t seem to support this transport; it supported the SSE (Server-Sent Events) and stdio transport mechanisms. I was also not aware of a server that supported Streamable HTTP, though this was relatively easy to write. The hard part seemed to be the client app, though it’s entirely possible that I missed something obvious.
  2. There were even fewer pointers to MCP clients that required authorization support from servers, which had been running mostly on local desktops. Now with Streamable HTTP transport, remote servers were beginning to emerge.

So I prototyped an MCP client that used device-flow OAuth and Streamable HTTP transport to communicate with servers. As a command line client, it required a custom MCP server with OAuth that was slightly different from anything I could find publicly, either via the MCP Inspector or CloudFlare Remote MCP Servers.
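
For the curious, here’s a minimal sketch of what the device-flow leg looked like from the client’s side, assuming Node 18+ and a generic authorization server. The endpoint paths and response fields follow RFC 8628, but the URLs and client id are placeholders, not the server I actually wrote.

```typescript
// Minimal sketch of the OAuth device flow (RFC 8628) from a CLI client.
// AUTH_BASE and CLIENT_ID are placeholders, not the actual server I used.
const AUTH_BASE = "https://auth.example.com";
const CLIENT_ID = "mcp-cli-client";

async function deviceFlowToken(): Promise<string> {
  // Step 1: ask the authorization server for a device code and a user code
  const dev = await fetch(`${AUTH_BASE}/device/code`, {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({ client_id: CLIENT_ID }),
  }).then((r) => r.json());

  // Step 2: the user completes sign-in in a browser, on any device
  console.log(`Visit ${dev.verification_uri} and enter code ${dev.user_code}`);

  // Step 3: poll the token endpoint until the user approves or the code expires
  while (true) {
    await new Promise((resolve) => setTimeout(resolve, (dev.interval ?? 5) * 1000));
    const tok = await fetch(`${AUTH_BASE}/token`, {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body: new URLSearchParams({
        client_id: CLIENT_ID,
        device_code: dev.device_code,
        grant_type: "urn:ietf:params:oauth:grant-type:device_code",
      }),
    }).then((r) => r.json());
    if (tok.access_token) return tok.access_token; // done; attach as a Bearer token
    if (tok.error !== "authorization_pending") throw new Error(tok.error);
  }
}
```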

Last week, I prototyped an MCP client that mimicked a browser-based experience similar to the MCP Inspector, but one that supported Streamable HTTP transport. This allowed me to drop down one level and work with the end-to-end sequence of messages exchanged between the client and server, from authorization to tool execution and follow-up notifications.
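
To make that concrete, here’s a rough sketch of that message sequence as I understand the Streamable HTTP transport: every JSON-RPC message is POSTed to a single MCP endpoint, the server may answer with plain JSON or an SSE stream, and a session id rides along in a header. The server URL and the example tool are placeholders, and the exact shapes may lag the current spec revision.

```typescript
// Sketch of the client-side message sequence over Streamable HTTP.
// MCP_URL and the example tool are placeholders; header and method names
// follow my reading of the MCP spec and may differ from the latest revision.
const MCP_URL = "https://mcp.example.com/mcp";
let sessionId: string | undefined;

async function send(message: object, accessToken: string): Promise<Response> {
  const res = await fetch(MCP_URL, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      // The server may reply with plain JSON or open an SSE stream
      "Accept": "application/json, text/event-stream",
      "Authorization": `Bearer ${accessToken}`,
      ...(sessionId ? { "Mcp-Session-Id": sessionId } : {}),
    },
    body: JSON.stringify(message),
  });
  // The server assigns a session id on the initialize response; echo it back afterwards
  sessionId = res.headers.get("Mcp-Session-Id") ?? sessionId;
  return res;
}

async function inspect(accessToken: string) {
  await send({ jsonrpc: "2.0", id: 1, method: "initialize", params: {
    protocolVersion: "2025-03-26",
    clientInfo: { name: "scrappy-inspector", version: "0.1.0" },
    capabilities: {},
  }}, accessToken);
  await send({ jsonrpc: "2.0", method: "notifications/initialized" }, accessToken);
  await send({ jsonrpc: "2.0", id: 2, method: "tools/list" }, accessToken);
  await send({ jsonrpc: "2.0", id: 3, method: "tools/call", params: {
    name: "get_weather", arguments: { city: "Seattle" }, // hypothetical tool
  }}, accessToken);
}
```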

This has been a lot of fun and after the first two laps of this marathon, I’ve been reflecting on what I learned.

Takeaways

The most interesting takeaway was that vibe coding is awesome for learning a new language or programming paradigm. It may also be good enough to build a prototype. However, in this journey, I quickly hit a wall when I could no longer fully follow what was happening and couldn’t get myself unstuck. For example, one day I couldn’t debug why the authorization flow was going into a loop. That day I ran out of my model call limit just trying different guesses. Then I had to fall back on my own understanding and it frustrated me that I hadn’t come to understand what I was doing.

The next day, I built the whole app from scratch myself, one step at a time. I used the code from the prior iteration to fill in the gaps as I went along. When I had it running in a day, I realized this alternative journey had been far more satisfying than the imaginary one, where I’d have been happy to simply have something working one time.

My sense is that most programmers building something meaningful that continuously evolves will run into this performance wall sooner or later. They will come to value learning for its own sake, as learning can help us get a lot more stuff done.

My growing conviction is that we have a “fast learning” button at our fingertips today, not a “generate code” button. Software by definition is meant to evolve… and writing software that’s not been written before will likely remain in our own hands.

Next Steps

Could I create a shopping agent that plugs into my Amazon account? Or a Midjourney agent that might help me reimagine my living room? Or a personal agent that could do my taxes? How might multi-modal data travel on the MCP wires?

Devaluing the U.S. Dollar

It was Apr 2, 2025 when they chose to declare a national emergency. Over a decade in, I’m still adjusting to being an immigrant. At least it wasn’t Apr 1. That would’ve been truly next level.

In this whole emergency thing, I’m trying to put some order to the chaos that is the new U.S. tariffs. Getting past the formulaic comedy is one thing. The tariffs have some math. But what’s the reasoning, I’ve spent the whole day wondering.

You want a reserve currency?

After WWII, the Bretton Woods system established the U.S. dollar as the world’s reserve currency. As the dollar took over the role of gold in the international financial system, the U.S. agreed to link the dollar to gold (at the rate of $35 per oz), with all other currencies pegged to the dollar. After the war, most countries had transferred their gold to the U.S., which now came closest to offering the most trusted currency, also backed by gold. A key motivation for the Bretton Woods system was to avoid floating, independently managed national currencies run by isolationist regimes, while enabling lightly managed, stable currencies with free convertibility and free trade.

Little did they know how this would work out.

When the system became operational in 1958, countries settled their international dues in dollars, which could theoretically be converted to gold. After WWII, as Japan and Germany recovered, their share of world production increased at the expense of the U.S. As the U.S. further extended itself with military spending and foreign aid, the world grew less confident about its ability to convert dollars to gold. As U.S. inflation rose and the risk of a gold run grew, the U.S. unilaterally ended convertibility to gold in 1971, effectively devaluing the dollar.

Devaluing the dollar is a repeating trope, you’ll soon see.

Turns out, gold-backed currencies aren’t the easiest to manage. Moreover, trust can be built through perpetuating belief systems (arguably the greatest human invention). Even without gold convertibility, the dollar has remained the world’s reserve currency. But this still has its downsides.

The Downsides

There are at least two downsides that an issuer of reserve currency might bear.

One, each country typically maintains reserves of the ‘reserve currency’. These reserves can act as insurance if bad players try to manipulate the country’s domestic currency. They also allow the country to manage its domestic currency’s exchange rate to make domestic goods more price-competitive in the international market. To maintain such reserves, countries buy dollar-denominated U.S. treasuries, which causes the dollar to appreciate. Unfortunately, a rising dollar makes U.S. goods relatively less competitive. This is the curse of a reserve currency.

Two, high demand for treasuries encourages the U.S. government to borrow with abandon. As countries purchase U.S. debt, where does all this borrowed money go? At the risk of gross reductionism, it goes to feed the U.S. consumption beast. Broadly, U.S. consumer habits are funded by savings from developing countries that buy U.S.-issued debt. Run long enough, this scheme tends to make the U.S. addicted (entitled?) to debt, making the periodic visit to rehab almost predictable. First world problems, eh?

Devaluation

To quickly recap, holding the reserve currency gives the U.S. geopolitical power (e.g. through trade restrictions), though it also results in a perpetually appreciating dollar. This happens to enable other countries to improve their competitive strength while undermining investment and production in the U.S.

Apparently, you can’t have a competitive currency and be the reserve currency.

But what if we throw in periodic devaluation? Now, can you do both?

Specifically, can you:

  1. Depreciate the dollar
  2. Reduce US debt servicing costs
  3. Maintain the US dollar as the reserve currency

Yes, we can. Sorry, Barack Obama!

#1 effectively delivers #2 by reducing the value and cost of servicing the current national debt.

#2 increases the chances of #3 by avoiding insolvency.

Bonus: A depreciated dollar makes U.S. production more competitive internationally, inviting capital investment. You like that?

But inflation? And tariffs? Omg, you’re killing me.

Tariffs

Generally, tariffs cause the national currency to appreciate. Specifically, U.S. tariffs would invite countries to re-baseline their currencies to make their exports attractive again. For example, during the trade war in 2018-19, a JP Morgan report recounts:

The PBOC allowed the CNY, which operates within a semi-fixed exchange rate regime, to devalue through controlled FX mechanisms. A weaker CNY partially cushioned the impact of tariffs by making Chinese exports relatively cheaper and preserving their competitiveness in the global market.

Trade uncertainty alone can also lead to depreciation of national currencies. Not making this up; it’s in the same JP Morgan report. This would lead to the U.S. dollar appreciating. Wrong direction, you say?

While devaluing individual currencies can partially absorb inflation in the U.S., in the mid-to-long term these countries need to lower their domestic interest rates. This would reduce local production costs, making their exports attractive again. This would also effectively cap inflation in the U.S. Controlling inflation in the U.S. would in turn open the door to lower U.S. interest rates and a lower cost of servicing the U.S. national debt. By the way, as individual national currencies depreciate, the dollar also has room to climb down from its overvalued state.

Are we back to our original series of goals or what?

Alternatively, companies from these countries can invest in building production capacity in the U.S.

The U.S. would benefit either way.

But why don’t countries just lower their domestic interest rates without all this drama? Lower interest rates (aka monetary easing) can stimulate growth, but can be risky. Among other things, it can lead to (1) foreign capital outflows, and (2) unwanted inflation, which would hurt the local population. No politician likes this part, especially if you’re a non-elected Chinese official.

When a country isn’t willing to naturally take on these risks, it can enter trade negotiations with the U.S. It’s definitely war time, and negotiations are good, right?

Dare we say that the U.S. would benefit either way?1

Overall, I can’t defend the reciprocal tariff rate math by any means, but maybe that’s not the point. Could there still be some order behind the chaos? After all, yesterday’s executive order contains…

the modification authority, allowing President Trump to increase the tariff if trading partners retaliate or decrease the tariffs if trading partners take significant steps to remedy non-reciprocal trade arrangements and align with the United States on economic and national security matters.



1

In 2003, to address a similar problem, Warren Buffett proposed issuing Import Certificates to U.S. exporters, allowing them to earn, say, 10% on their exports. This would allow U.S. exporters to lower the price of their exported goods by 10%, making them more competitive internationally. If U.S. entities imported goods, they would need to buy Import Certificates for 10% of the imported value. This would make imports more expensive. It is essentially a tariff dressed as a certificate.

Vibe Coding

Everyone’s calling it “vibe coding”, and I’m quite enjoying it. It’s fast and fun, like a game where you create your own levels and keep going till you exhaust the model. I go from “working” to “working” and unlock new challenges every day. For example, I’ve never coded in TypeScript, but I’ve now built two apps, one for personal finance using bank transactions and a second one to estimate income taxes. Next, I want to fine-tune a model and evaluate it using model-generated datasets, things I’d only dreamt of doing last year…

Surprisingly, it’s fun because it’s not always smooth sailing. People are comparing generative AI coding skills with human coding skills, which I think misses the point. When I understand what the model is doing and why it’s doing it, it feels like moving at the speed of logic rather than being constrained by language and syntax.

I must admit that part of the fun is discovering the model’s limits. Maybe these are Cursor’s limits rather than the model’s limits? For the record, I’ve been using Cursor with the default model, Claude 3.5 Sonnet. I’ve also used Claude Code, but that was a different experience; I might write about it at a later point. Yesterday, someone asked me what it was like to use Cursor. So here’s a short roundup of what I discovered.

Always be vigilant

Confabulation: Even when I provided reference documentation from Stripe, the model (Cursor?) confidently responded with something along the lines of “Looking at the documentation, this is correct.” It only relented when I asked it to provide a reference. If I hadn’t read the documentation myself, it may have taken me much longer to debug situations where it generated handlers for completely made-up webhook events.

Blindness: The model uses its previously generated code as ground truth. When I manually fixed code, it didn’t seem to affect the model’s subsequent generations, though asking it to fix the mistakes helped it recognize the “fix” before I could continue. For example, when I renamed a field in a TypeScript interface (e.g. short_term_capital_gains instead of just capital_gains), it would not pick this up in the code it generated afterwards. At least a couple of times, it created new files without taking the repo structure into account. For example, it created a new scripts folder for database migration at the root level when it had previously created similar scripts under src/app/db.

Specificity is no antidote to eagerness

Overreaching: When I asked it to add a new button and selector to a web interface, it added half a dozen new packages and four new files. We already had Tailwind CSS and basic components to add these features, and I was not expecting it to make the task 10x more complex than it needed to be. Fortunately, reverting was simple.

Overassuming: On another occasion, I prompted it to add specific new fields to one Typescript interface, but it also added to another interface for no reason. This worried me about what else it had assumed… maybe most of its assumptions were good?

Don’t bank on its stamina

Anti-endurance: Ironically, sessions that don’t go well last longer. The session where it confidently misinterpreted Stripe’s documentation was also the session that dragged on for a couple of hours. At some point, it froze in the middle of responses, leaving the repo code in an intermediate state. Rerunning the prompt sometimes resolved the issue. After a point, it stopped responding altogether, answering only short and simple questions after that.

Anti-recall: On a few occasions, it added dependencies even after we had explicitly decided against them. For example, I began the project by asking for options and trade-offs, and directed it to proceed without taking a dependency on the Prisma ORM. When I tried to add new features, it would forget this instruction and add code requiring Prisma. This occurred often enough that I asked it to find a way to remember it. It proposed creating a project plan without realizing that we had created one at the beginning of the project. I wasn’t sure if updating it would help it remember better, but the issue hasn’t reappeared since.

Intelligence != Ownership

No Reflections: On many occasions, it would leave behind spurious code or files, making no effort to clean up after I had redirected its suggested solution. Sometimes this clean-up became difficult as I lost track of all the changes that had to be reversed. Maybe this is a good thing, because I wouldn’t want it getting too eager overthinking its past results!

Inventing Factories

“If I had asked people what they wanted, they would have said faster horses,” Henry Ford is famously quoted as saying. Yet Ford didn’t invent cars to replace horses. He invented factories to make cars cheaper and more accessible. Factories that would make the car, according to Ford, ‘so low in price that no man making a good salary will be unable to own one.’

When Ford developed the Model T, cars had been around for decades, made ‘artisanally’, if you wish. They were expensive and unreliable. Ford’s goals were affordability and reliability. His first idea was to introduce consistently interchangeable auto parts. His second idea was the moving assembly line, which reduced the time workers spent walking around the shop floor to procure and fit components into a car. ‘Progress through cautious, well-founded experiments,’ is a real quote from Ford. The 1908 Model T was 20th in the line of models that began with the Model A in 1903.

Borrowing a leaf from the top of the last century, more than inventing new foundation models (FMs), we need to invent the factories that make such models trivially affordable and reliable.

Billions of Models

If FMs are the new ‘central processing units’, technically there’s no limit to the number and variety of ‘programmable’ CPUs that we can now produce. A typical CPU chip requires $100-500M in research and design, $10-20B in building out fabrication capacity, and 3-5 years from concept to market. Today, FMs can be pretrained within months (if not weeks) for much less. On this path, Google and DeepSeek, more than OpenAI, have accelerated affordability.

| Model Provider | Small Model | Mid-Size Model | Reasoning Model |
| --- | --- | --- | --- |
| Anthropic | Haiku 3.5: $0.8/MIT, $4/MOT | Sonnet 3.7: $3/MIT, $15/MOT | N/A |
| OpenAI | GPT 4o-mini: $0.15/MIT, $0.60/MOT | GPT 4o: $2.5/MIT, $10/MOT | o3-mini: $1.1/MIT, $4.4/MOT |
| DeepSeek | N/A | V3: $0.27/MIT, $1.10/MOT | R1: $0.55/MIT, $2.19/MOT |
| Google | Gemini 2.0 Flash: $0.10/MIT, $0.40/MOT | Gemini 1.5 Pro: $1.25/MIT, $5/MOT | Gemini 2.0 Flash Thinking: Pricing N/A |

Note: MIT: Million Input Tokens, MOT: Million Output Tokens
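
For a sense of scale, here’s a quick back-of-the-envelope calculation of what a single request costs at these rates, using GPT 4o as the example.

```typescript
// Cost of one request given the per-million-token prices from the table above.
function requestCost(inputTokens: number, outputTokens: number,
                     pricePerMIT: number, pricePerMOT: number): number {
  return (inputTokens / 1e6) * pricePerMIT + (outputTokens / 1e6) * pricePerMOT;
}

// A 2,000-token prompt with a 500-token completion on GPT 4o ($2.5/MIT, $10/MOT)
// comes to about one cent.
console.log(requestCost(2000, 500, 2.5, 10).toFixed(4)); // "0.0100"
```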

Affordability

Affordability is admittedly relative. Software developers may be willing to pay more than marketing content creators. The value of research for knowledge workers depends on the decisions it enables them to make. Customer support may be valuable, but no more than the total cost of employing and managing human support representatives. When the returns (or savings) are clear, there is quantifiable demand.

OpenAI loses more than twice what it makes, and it needs to cut costs by roughly an order of magnitude to be sustainably profitable. If Nvidia’s Blackwell chips deliver the promised 40x price-to-performance improvement, this will be the year of non-absurd business models. More power to them.

It’s possible that DeepSeek is already there. More importantly, DeepSeek might represent the price level of an API provider who doesn’t have an application business to cannibalize. Is it ironic that OpenAI is facing an innovator’s dilemma of its own?

Meanwhile, Anthropic charges a premium over OpenAI’s application-level rates. They also need an order of magnitude reduction. They might already be there with TPUs, or with Trainium 2’s 75% price drop they’re likely getting there. It’s unclear if they have a cannibalization issue yet, though their CPO definitely wants their product teams to iterate faster.

Training and adapting the model to meet specific and evolving customer expectations is the business need. On this point, popular applications such as Perplexity and Cursor/Windsurf are arguably underrated. Just as Midjourney provides a delightful experience by getting the combination of the model and user experience just right, these applications are taking their shot. After all, the model is a software component, and application developers want to shape it endlessly for their end users. The faster these developers iterate with their models based on feedback from their applications, the faster they’ll see product-market fit. They can then figure out how to grow more efficient. Finding product-market fit is the only path to affordability.

People mistake such applications to be ‘wrappers’ around the model or ‘just’ interface engineering. That’s a bit like saying Google is just interface engineering over PageRank.

Reliability

For a given use case, reliability is a function of: How often does the model ‘break’? How easy and/or expensive is it to detect and fix?

In creative work, there’s often no wrong answer. And checking the result is generally easier than generating the result.

For automation, it’s more of a spectrum ranging from infeasible to non-compliant, to varying degrees of risky, to safe & expensive, to safe & cost-effective.

What makes the application risky vs. safe? And who underwrites the risk?

One answer is tool use. Multiple tool use protocols such as Model Context Protocol want to make FMs more aware of available tools in addition to making tool use more effective and efficient. However, there’s no significant reason (yet) for any major model provider to use another’s protocol. I expect protocols to emerge from most if not all model providers, and feel that standardization is at least a year or two away. Even then, new standards may usurp older ones, and different economic and geopolitical agendas could shape these in weird ways.

However, a sophisticated ‘tool’ or service really wants to be an agent. When multiple agents need to work together, we need distributed ownership, separation of concerns, authentication, authorization, auditability, interoperability, control, non-repudiation, and a lot more. Much of this plumbing already exists with OAuth 2.0 and can be repurposed for service agents, but a lot still needs to be built. Whoever builds the most reliable multi-agent collaboration systems will likely grow to become the most trusted.
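
As one example of repurposing that plumbing, here’s a sketch of a service agent obtaining a scoped token via the standard OAuth 2.0 client credentials grant before calling another agent. The token endpoint, client id, and scopes are hypothetical, not a proposed standard.

```typescript
// Sketch: an agent authenticates as itself (client credentials grant) and
// receives a token scoped to what it's allowed to ask another agent to do.
// The endpoint, client id, and scopes below are made up for illustration.
async function getAgentToken(): Promise<string> {
  const res = await fetch("https://auth.example.com/token", {
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: new URLSearchParams({
      grant_type: "client_credentials",
      client_id: "shopping-agent",                 // the calling agent's identity
      client_secret: process.env.AGENT_SECRET ?? "",
      scope: "orders:read orders:write",           // the authority it's delegated
    }),
  });
  const { access_token } = await res.json();
  return access_token; // downstream agents can verify, authorize, and audit this
}
```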

The Industrial Revolution

Unlike fantasy AI factories that spew pure intelligence as tokens, these factories will ship affordable and reliable engines that can safely power software applications. While we urgently kick off the next manufacturing industrial build-out in the U.S., my guess is that these software factories will take years to build. We need to have started yesterday…