I’ve been experimenting with AI coding tools, mostly Claude, for the last few months. Here are some thoughts on my experience so far.
Things I have tried
So far I have tried using Claude for many different tasks:
- Solving a coding problem, such as creating a Python script to scan header files (see the sketch after this list)
- Writing tests for existing code
- Checking patches before I send them
- Writing drafts for blog posts
- Figuring out bugs
- Adjusting code to make tests pass
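To give a flavour of the first item, here is a minimal sketch of the kind of header-scanning script I mean. It is illustrative only: the goal shown (listing function declarations under a directory) and the regex are my own assumptions, not what Claude actually produced.

```python
#!/usr/bin/env python3
"""Illustrative sketch: list function declarations found in C header files."""

import re
import sys
from pathlib import Path

# Deliberately rough pattern for a one-line C function declaration
DECL_RE = re.compile(r'^\w[\w \t*]+\([^;{)]*\)\s*;', re.MULTILINE)

def scan_headers(root):
    """Yield (path, declaration) for each match in *.h files under root."""
    for path in sorted(Path(root).rglob('*.h')):
        for match in DECL_RE.finditer(path.read_text(errors='ignore')):
            yield path, match.group().strip()

if __name__ == '__main__':
    root = sys.argv[1] if len(sys.argv) > 1 else '.'
    for path, decl in scan_headers(root):
        print(f'{path}: {decl}')
```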
 
First, a caveat. I am a beginner at using this technology and do not spend much time following leaders in this space. I started using Claude based on a family recommendation and am far from being any sort of expert.
The good
When I first tried creating a new command with Claude, it produced code which compiled and looked like U-Boot code. This seemed quite magical to me and I left the computer with a feeling of awe and bewilderment. I have a basic understanding of how AI operates, but it is amazing to see it producing code like this. Inference is also very fast, sometimes producing hundreds of lines of code in a minute or less.
Claude works within your project directory. It can read and write commits, run tests, etc. For a code base like U-Boot with lots of tests, it is relatively easy to get it to make changes while ensuring that the tests still pass. It can execute most commands that you can. It asks first but you can add rules to a JSON file to give it permission to run things itself.
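For example, a rule like this in the project's .claude/settings.json lets it run the build and tests without asking each time (this follows Claude Code's permission-rule format as I understand it; the commands are just examples):

```json
{
  "permissions": {
    "allow": [
      "Bash(make:*)",
      "Bash(pytest:*)"
    ]
  }
}
```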
In most cases, Claude is able to solve the problem. For example, I wanted to build U-Boot as a library. I actually had some old patches for this, but I decided to give the task to Claude. I provided a lot of guidance on exactly how it should work, but Claude figured out the Makefile rules, iterating until things worked and fixing bugs that I found. I was able to make changes myself, tell Claude, and it would build on them.
The easiest task for Claude is extending an existing feature and its associated tests. The hardest is refactoring code in non-trivial ways.
One interesting case where Claude does well is dealing with conflicts when rebasing a series or applying patches. I have only tried this a few times but it has not made any mistakes yet. It is fairly slow, though.
The bad
Claude is a long way from perfect. It feels to me like a very junior but knowledgeable software engineer. It happily writes 600 lines of code when 300 would do. For larger tasks, the code feels extremely ‘flabby’ and a bit painful to read. Variable names are too long. Useless comments are added to obvious code. It readily creates highly redundant code instead of setting up a helper function first.
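Here is a contrived Python illustration of that last habit (not real Claude output):

```python
boot_error = None          # example step results
load_error = 'no device'

# Redundant: the same check-and-report logic, repeated per step
if boot_error:
    print(f'boot failed: {boot_error}')
else:
    print('boot: OK')
if load_error:
    print(f'load failed: {load_error}')
else:
    print('load: OK')

# Tighter: set up a helper first, then use it
def report(step, error):
    """Print a one-line pass/fail result for a step."""
    print(f'{step} failed: {error}' if error else f'{step}: OK')

report('boot', boot_error)
report('load', load_error)
```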
But with all of these problems, you can guide it to perform better. You can review the code and ask it to make changes and it happily does so. It might not be able to conceive of a refactoring that would improve code, but it can execute it.
For one task, Claude was able to create a 1200-line Python script in about an hour (with lots of prompting, etc.) that would likely have taken me 1-2 days (it is hard to be sure, since I didn’t do it). I then spent about 6 hours refactoring and rewriting the script, after which I felt that the code wasn’t too bad. The end result was likely not as good as if I had written it myself from scratch, but it was fine.
Sometimes Claude gets stuck. This tends to happen when I am trying to get it to do something I’m not too sure about myself. It sometimes just gets the wrong end of the stick, or solves a slightly different problem. This is most common when first firing it up. After a bit of back and forth it settles in and makes progress.
The ugly
What is Claude bad at? When running programs it can see the console output, but it cannot interact with the program, or at least I’m not sure how to make it do that. So, for example, it can run U-Boot sandbox with a timeout and it can even pass it commands to run. But it doesn’t know how to ‘type’ commands interactively.
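To show what I mean by ‘interactively’, here is roughly what driving the sandbox from a script looks like using the pexpect library; the binary path and prompt are assumptions for a standard sandbox build. This back-and-forth is the part Claude doesn’t seem able to do on its own:

```python
import pexpect  # third-party: pip install pexpect

# Assumes a sandbox build in the current directory with the usual '=> ' prompt
child = pexpect.spawn('./u-boot', encoding='utf-8', timeout=10)
child.expect('=> ')         # wait for the U-Boot prompt
child.sendline('version')   # 'type' a command interactively
child.expect('=> ')         # wait for it to finish
print(child.before)         # output printed before the next prompt
child.close()               # terminate the sandbox
```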
Claude can sometimes change the test to fit the code, when the task was to fix the code. It also tends to write non-deterministic tests, such as asserting that an integer is greater than zero rather than checking for an exact value.
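A made-up example of the difference, where scan_entries() stands in for whatever is under test:

```python
def test_scan_entries():
    entries = scan_entries()   # hypothetical function under test

    # Loose: passes for almost any non-empty result
    assert len(entries) > 0

    # Exact: pins down the expected behaviour
    assert len(entries) == 4
```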
Claude can produce commit messages. It tends to be verbose by default but you can ask it to be brief. But I have not had much success in getting Claude to create its own commits after making a whole lot of changes.
Getting things done can sometimes just be very slow. I have not really compared the time to create high-quality, finished patches myself versus with Claude. That is something I must try, now that I am learning more about it.
My workflow
How do I actually use Claude? Well, first I figure out a design for what I want to do, typically a diagram or some notes. Then I ask Claude to perform an initial change to make a start. Then I build on that piece by piece until I have something working. I have not tried giving it a large challenge.
In some cases I create commits as I go, or even ask Claude to do this, but mostly I just create the whole thing and then manually build the commits afterwards. For code that Claude created in whole or in part, I add a Co-developed-by tag.
Breaking things into chunks saves time, I think. Cleaning up hundreds of lines of AI-generated code is very time-consuming and tedious. It is easier to tweak it a bit as I go.
A note on Gemini
I have tried Gemini CLI and sadly so far it has not been useful except in a few small areas. It is quite slow and seems to go into the weeds when given any sort of moderate challenge. I expect this to change rapidly, so I keep trying it every now and then. I also use Gemini to create most of the featured images.
I have not tried OpenAI so far.
What do you think?
Leave a message in the comments if you have any thoughts or suggestions!


