Everyone’s calling it “vibe coding”, and I’m quite enjoying it. It’s fast and fun, like a game where you create your own levels, and keep going till you exhaust the model. I go from “working” to “working” and unlock new challenges everyday. For example, I’ve never coded in Typescript, but I’ve now built two apps, one for personal finance using bank transactions and a second one to estimate income taxes. Next, I want to fine-tune a model and evaluate it using model-generated datasets, things I’d only dreamt of doing last year…

Surprisingly, it’s fun because it’s not always smooth sailing. People are comparing generative AI coding skills with human coding skills, which I think misses the point. When I understand what the model is doing and why it’s doing it, it feels like moving at the speed of logic rather than being constrained by language and syntax.

I must admit that part of the fun is discovering the model’s limits. Maybe these are Cursor’s limits rather than the model’s limits? For the record, I’ve been using Cursor with the default model, Claude 3.5 Sonnet. I’ve also used Claude Code, but that was a different experience. I might write about it at a later point. Yesterday, someone asked me what was it like to use Cursor. So here’s a short roundup of what I discovered.

Always be vigilant

Confabulation: Even when I provided reference documentation from Stripe, the model (Cursor?) confidently responded with something along the lines of “Looking at the documentation, this is correct.” It only relented when I asked it to provide a reference. If I hadn’t read the documentation myself, it may have taken me much longer to debug situations where it generated handlers for completely made up wehbook events.

Blindness: The model uses its previously generated code as ground truth. When I manually fixed code, it didn’t seem to effect the model’s subsequent generations, though asking it to fix the mistakes helped it recognize the “fix” before I could continue. For example, when I renamed a field in a Typescript interface (e.g. short_term_capital_gains instead of just capital_gains), it would not pick this up for the code it generated after this. At least a couple of times, it created new files without taking the repo structure into account. For example, it created a new scripts folder for database migration at the root level when it had previously created similar scripts in the src/app/db level.

Specificity is no antidote eagerness

Overreaching: When I asked to add a new button and selector to a web interface, it added half dozen new packages and four new files. We already had Tailwind CSS and basic components to add these features, and I was not expecting that it would make the task to be 10x more complex than I wanted it to be. Fortunately, reverting was simple.

Overassuming: On another occasion, I prompted it to add specific new fields to one Typescript interface, but it also added to another interface for no reason. This worried me about what else it had assumed… maybe most of its assumptions were good?

Don’t bank on its stamina

Anti-endurance: Ironically, sessions that don’t go well last longer. The session where it confidently misinterpreted Stripe’s documentation was also the session that drew on for a couple of hours. At some point, it froze in the middle of responses, leaving the repo code in intermediate state. Rerunning the prompt sometimes resolved the issue. After a point, it stopped responding altogether. It would respond to short and simple questions after that.

Anti-recall: On a few occasions, it added dependencies even after we explicitly decided against these. For example, I began the project by asking for options and trade offs, and directed to proceed without taking a dependency on the Prisma ORM. When I tried to add new features, it would forget this instruction, and add code requiring Prisma. This occurred enough that I asked it to find a way to remember it. It proposed to create a project plan without realizing that we had created one at the beginning of the project. I wasn’t sure if updating it would help it remember better, but the issue hasn’t reappeared since.

Intelligence != Ownership

No Reflections: On many occasions, it would leave behind spurious code or files, making no effort to clean up after I had redirected its suggested solution. Sometimes this clean up became difficult as I lost track of all the changes that had to reversed. Maybe this is a good thing, because I wouldn’t want it getting too eager overthinking its past results!