My experience with Cursor and Clojure-MCP
Today, there are many ways to use generative AI for coding: tools running in the terminal like Claude Code, Codex, Kira, Ampcode, Cline…, chatting in the browser with Anthropic, ChatGPT, Gemini, Mistral…, or a more integrated experience with Cursor or VS Code with Copilot. The list is not exhaustive; there is an explosion of tooling implementing the same OODA loop. A build-from-scratch description is here, with its naive Clojure implementation here.
What I found missing are developer reports on using these tools in real-life situations. I get it: it's hard to compare these tools with each other. The non-deterministic nature of generative AI, the various models, the various clients using them with tools or MCP, the quality of the prompt written by the developer, the state of your codebase… and so on. Add to that the hype train claiming that everything needs AI, and it becomes hard to challenge what AI actually does today.
So here is my report after using Cursor and Clojure-MCP for a few months. I hope it inspires others to do the same, and that we can all grow together with this wave of tools.

Cursor without clojure-mcp
Clojure being a niche language, the initial experience was frustrating. Cursor could not get the parentheses right: more than half the time one was missing, and the more edits happened, the more functions ended up mismatched with each other. Even for the simplest tasks, it was often wrong. This was pre-Claude-4 (today's best coding model along with GPT-5, as far as I know); maybe it would figure it out better now? Maybe not.
Cursor's agent mode made it even worse: the clj-kondo hint about unbalanced parentheses confuses the agent, which starts to move parentheses here and there until it decides to either give up or rewrite the entire file…
Even simple things took way too long, and debugging parenthesis imbalances in Lisp is not something I particularly enjoy.
Enter clojure-mcp
https://github.com/bhauman/clojure-mcp
You might recognize the author’s name because he also created figwheel.
From the README:
Clojure MCP connects AI models to your Clojure development environment, enabling a remarkable REPL-driven development experience powered by large language models (LLMs).
With it, the agent experience got a lot better. No more unbalanced parentheses, thanks to dedicated tools for editing s-expressions, and using the REPL speeds up the agent loop considerably. The agent can try a solution directly (i.e., without a JVM restart) and see whether it works, pretty much like a human would. After a few tries, it writes the solution to the file. I don't have metrics comparing Clojure-MCP with vanilla Cursor edits, but my impression is that it gets to good code solutions a LOT faster.
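To illustrate what that loop looks like, here is the kind of evaluation the agent performs in the REPL before touching any file (a made-up example; `slugify` is an invented helper, not from my project):

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper the agent might draft and test interactively.
(defn slugify
  "Turns a title into a URL-friendly slug."
  [title]
  (-> title
      str/lower-case
      (str/replace #"[^a-z0-9]+" "-")
      (str/replace #"^-|-$" "")))

;; Quick check in the REPL; only once this looks right does it get written to a file.
(slugify "Hello, Clojure MCP!")
;; => "hello-clojure-mcp"
```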
Context of my ongoing project
To give some context (which matters a lot, but more on that later) on what I'm working on: I started a new project from scratch using this great template: clojure-stack-lite. It comes with everything you want for starting a web project: Integrant, deps.edn, Babashka tasks, Tailwind, CI, clj-kondo, Reitit… Kudos to Andrey; it is great work and makes it easy to start something new.
Why does that matter for my AI setup? With a small codebase, it's harder for the agent to get lost, and with fewer tokens in context, the code proposals are better. With a working reload workflow, Clojure-MCP can iterate easily until it finds a working solution, and if it goes in the wrong direction, you can just (user/reset) and you are good to go. With a different codebase maturity, you will get a different experience.
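For readers unfamiliar with that reload workflow, here is a minimal sketch of a user namespace built on integrant.repl (the template's actual setup may differ; my-app.system is an invented namespace):

```clojure
(ns user
  (:require [integrant.repl :refer [go halt reset set-prep!]]
            [my-app.system :as system])) ; hypothetical ns holding the Integrant config map

;; Tell integrant.repl how to build the system configuration.
(set-prep! (fn [] system/config))

(comment
  (go)     ; start the system
  (reset)  ; reload changed namespaces and restart the system
  (halt))  ; stop the system
```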
Retrospectively, another interesting aspect of this codebase is that with HTMX, my templates live in the codebase like the rest of the code. Because HTML (i.e., Hiccup) and Tailwind are so verbose, the codebase quickly gets dominated by views, which significantly increases the number of tokens (and therefore the cost and latency of every model call). I would be curious to compare it with a setup where the templates sit in separate files, like Mustache/Handlebars or something similar, making it clear where the views are and where the logic is. I guess I could enforce that separation in the code, but a less powerful templating language would somehow serve me better here.
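To give an idea of that verbosity, a single styled button in Hiccup with Tailwind already looks like this (an invented snippet, not from my codebase), and a full page is dozens of these nested elements:

```clojure
;; Invented Hiccup + Tailwind snippet, for illustration only.
[:button
 {:type  "submit"
  :class "inline-flex items-center rounded-md bg-indigo-600 px-3 py-2
          text-sm font-semibold text-white shadow-sm hover:bg-indigo-500"}
 "Save"]
```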
Worth mentioning: I started from a few HTML mockups that I built with Claude 4. I use these mockups in my prompts, so the AI has a good idea of the expected visuals, in a syntax close to Hiccup.
Building full features
I built the first few features entirely with the agent, using Claude 4 as the model (let's start with the best, right?), and oh boy, it felt like magic. What I thought would take me a few days to implement was done in a few hours, with some minor changes and follow-up prompting to steer the agent in the direction I wanted. I imagine it matches the "vibe coding" experience. You still need to know and explain clearly what you want, but a lot of things are figured out by the model through trial and error with the REPL.
I forgot to mention that I'm also using playwright-mcp, which lets the agent reproduce what I see in my browser and fix the errors that appear in the console. It feels like magic when you describe something and the agent changes the layout according to your instructions. As a side effect, I rarely open the Tailwind documentation, despite not being familiar with it.
Over time, the code accumulates small problems
Nothing major, but I realized that the agent does not refactor the code on its own, like a human would; I had to actively do it. If I don't, then since the agent takes inspiration from my existing codebase to produce new code, my codebase will compound exponentially in the wrong direction.
An example of a missing refactor that a human would naturally have done: introducing UI components such as buttons and titles. Instead, the agent keeps repeating the same code over and over, which inflates the code and makes it hard to read (1k lines of Hiccup is not handy to browse; if a function doesn't fit on my screen, it's way too long). A sketch of that kind of extraction follows below.
All your tech debt will be repeated over and over forever.
No tech debt will stay hidden.
The view in the handler that was small enough that you didn't want to create a namespace for it, the function in a let that you have already repeated a few times instead of making it a defn: the agent will reproduce your mistakes continuously until you actively instruct it to refactor, or (often faster) you refactor things yourself as soon as you see something that looks remotely like a code smell.
Another way to think about it: if you put garbage in, you get garbage out. With a smile, because the AI models always approve of whatever you're doing!
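To make the button example above concrete, this is the kind of component extraction I end up doing by hand (a sketch with invented names and classes, not my actual code):

```clojure
;; Before: the same Hiccup/Tailwind markup repeated in every view.
;; After: a small component namespace, reused everywhere (hypothetical names).
(ns my-app.ui.components)

(defn button
  "Primary action button."
  [label & {:keys [type] :or {type "submit"}}]
  [:button {:type  type
            :class "rounded-md bg-indigo-600 px-3 py-2 text-sm font-semibold text-white"}
   label])

(defn title
  "Page-level heading."
  [text]
  [:h1 {:class "text-2xl font-bold tracking-tight"} text])

;; Usage in a view:
;; [:div (title "Settings") (button "Save")]
```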

Over time, namespaces become larger, as the agent does not make architectural changes on its own.
My project has become sizable, something like a small SaaS built by a company with full-time employees. At that point, it becomes impossible to build full features in one prompt. Instead, I have to target prompts at specific areas of the feature I want to build. I think the agent is now reading too many tokens to iterate effectively, something I could observe in the Cursor analytics: prompting for "simple" things now takes a few million tokens. Note that starting from 1 September, I cannot observe it anymore… the feature now comes bundled with Cursor 1.5+, whatever that means 🤦
Anyway, back to building features. Large namespaces full of code smells make my codebase not AI-ready. Cursor rules and memories are one way to help the agent, but complementary to that, you need to refactor and keep a clear architecture. Just as for a human developer, your namespaces need to make sense, stay small, and have comments and docstrings where it matters (models have a tendency to add useless comments like get-name: returns the name. Yeah, thank you, Sherlock… I remove these comments as soon as I see them). Stay ahead on the upkeep of the codebase; it makes it work better with AI (and for humans too!). Think of every new agent session as a new junior hire who knows nothing about your codebase. The junior is relentless and has the entire internet in its brain.
Cursor auto mode
I get Cursor's intention here, but unfortunately, since it's a black box, it's hard to know when to use it. Sometimes it produces good code; sometimes it's pure garbage with the linter screaming at you. Clojure being a niche language, I'm guessing that Auto's routing between models does not match the complexity of the task accurately. For example, refactoring the copy/paste in a test file into fixtures and reusable helpers is not complex per se, but it is apparently too complex for the model Auto picks. The fact that Auto does not show which model it ends up using makes it a hard tool to use. I hope the Cursor team will improve it, because it does make sense not to always use Claude 4/GPT-5 for everything: reading a file with a top model is just a waste of tokens, money, and infrastructure.
Related to that, Cursor is updated frequently, but there is no clear changelog. With all the non-deterministic factors of having an agent code for you, this adds an extra random factor: am I doing things better today, or did Cursor change something? 🤔 Coding with AI means accepting a lot of experimentation with fast-moving tools anyway, so what's the problem with another vector of randomness? Still, it's somewhat frustrating not to know what you are even using. Hopefully, tools will get better over time at tracing your AI-assisted work and suggesting ways to improve your technique (is there already a company working on that?). Some of the companies building AI agents are already working on thread sharing, which will bring explainability as a side effect.
Cursor auto-complete
This was the headline feature when Cursor first came out.
I disable it by default. I find the current UX distracting, with the constant highlights and the mixing of normal auto-completion (powered by the LSP, or VS Code itself) with AI suggestions. The only cases where it can be useful are writing docstrings (beware of hallucinations) and repetitive changes that cannot be done quickly with the IDE or that are difficult to explain in a prompt: the case where you have to edit 20 function calls to add an argument, or tedious stuff like that.
Clojure-MCP learning curve
The MCP comes with a lot of tools; some are used by the model itself, and others need to be explicitly mentioned in your prompting. I have not tried them all yet, but one I discovered a few weeks ago, and now use often, is code_critique. I think it's the easiest feature to start with if you have no prior experience with AI-assisted coding. The prompt I use is a variation of "Use code_critique to review a list of files". It works extremely well, making great suggestions that follow Clojure best practices, and it can implement them correctly most of the time. This makes me think I should improve my prompting to drive the agent in the right direction. For reference, the code_critique prompt.
Using multiple agents at the same time
I experimented a bit with that, but not as much as I would like.
How it works: you can start multiple Cursor threads (or use the cursor-cli) to work on different features at the same time. Each agent iterates just as it does with a single thread.
I think it's a different game compared to single-agent coding. There are a few requirements to start:
- Your codebase needs to be organized enough that you are confident the two agents are not going to write to the same files. Of course, you also need good documentation so that each agent knows where everything is.
- You need to be in a deep flow state. Writing software often requires stacking information on top of information until you get to the piece you need to work on. Imagine stacking X times that amount of information to follow the construction of several features at once. This is where a childhood spent playing StarCraft will make you shine!
I do it occasionally for small things like updating dependencies, refactoring a specific file, or writing unit tests for a consolidated feature, but I have not built two distinct features at the same time. I think this setup is not great for that, because the filesystem with git is shared, as is the JVM. Theoretically, you can have multiple clients interacting with the same REPL, but in practice they might mislead each other, adding more non-deterministic elements to the party.
I think a more practical setup is to have multiple clones of the same project, each with its own REPL: essentially replicating two developers working on different features and merging with the latest branch when they are done.
Another practical scenario brings us back to making architectural choices to get the best out of AI agents: for example, a better separation of frontend and backend. With HTMX, the boundaries are blurred. With a SPA and a backend, once you have a clear input/output contract between the two, two agents can work in parallel to build them.
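As a sketch of what such a contract could look like (assuming Reitit with malli coercion; the route and fields are invented), the route data itself can act as the shared spec that both agents build against:

```clojure
(ns my-app.api.routes
  (:require [reitit.coercion.malli :as malli-coercion]))

;; Hypothetical contract: the backend agent implements the handler,
;; the frontend agent codes against the declared parameters and response shape.
;; (Coercion middleware from reitit.ring.coercion is assumed to be wired in elsewhere.)
(def routes
  [["/api/projects/:id"
    {:coercion malli-coercion/coercion
     :get {:parameters {:path [:map [:id :int]]}
           :responses  {200 {:body [:map
                                    [:id :int]
                                    [:name :string]
                                    [:archived? :boolean]]}}
           :handler    (fn [{{{:keys [id]} :path} :parameters}]
                         {:status 200
                          :body   {:id id :name "demo" :archived? false}})}}]])
```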
Still, it's something I want to experiment with in the future, when the tooling becomes more mature and my codebase gets big enough for a real productivity gain.
Last words
I see AI agents combined with REPL-driven development as a powerful tool to significantly increase productivity. I feel I'm only scratching the surface of what is possible today, and the tooling is still really young. In the near future, software development will be vastly different from what it used to be.
Other Clojure-friendly tools I have not used (yet) but that could fit your workflow better than Cursor and clojure-mcp:
- https://github.com/PEZ/backseat-driver: a ChatGPT-like assistant in VS Code
- https://github.com/karthink/gptel: a simple LLM client for Emacs
- https://github.com/editor-code-assistant/eca: editor-agnostic AI pair programming
For the Clojurists not using agentic coding, I hope this gives you a reality check on it. For those who do, please share what is working for you 🙂