Time to cloc out?
Building a faster line counter with AI in two hours
Last week I wrote about using Claude and Rust to build tools to do a thing. But the thing was quite simple. This week I experimented with building a more complex thing to get a feel for the limits of current tech.
cloc
This time I’m going to build my own version of cloc - the classic source code line counting tool. It’s a moderately complex tool - the canonical version is written in Perl and is about 2.5kloc.
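The heart of any cloc-style tool is classifying each line of a source file as blank, comment or code. Here's a minimal sketch of that core logic in Rust - my own illustration, not the code Claude generated - handling only `//` line comments and `/* */` block comments, where the real tool needs per-language rules for dozens of comment syntaxes:

```rust
// Classify each line of a source file as blank, comment or code.
// Handles only "//" line comments and "/* */" block comments;
// a real cloc clone needs per-language comment syntax tables.
fn classify(source: &str) -> (usize, usize, usize) {
    let (mut blank, mut comment, mut code) = (0, 0, 0);
    let mut in_block = false;
    for line in source.lines() {
        let t = line.trim();
        if in_block {
            comment += 1;
            if t.contains("*/") {
                in_block = false;
            }
        } else if t.is_empty() {
            blank += 1;
        } else if t.starts_with("//") {
            comment += 1;
        } else if t.starts_with("/*") {
            comment += 1;
            if !t.contains("*/") {
                in_block = true;
            }
        } else {
            code += 1;
        }
    }
    (blank, comment, code)
}

fn main() {
    let src = "// header\n\nfn main() {\n    /* block\n       comment */\n    println!(\"hi\");\n}\n";
    let (blank, comment, code) = classify(src);
    println!("blank={blank} comment={comment} code={code}");
}
```

Most of cloc's 2.5kloc is the long tail around this core: language detection, mixed comment/code lines, duplicate files and report formatting.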
Here’s my initial prompt. I’ve attempted to set concise, clear requirements. I’ve also asked Claude to use Rust and use comprehensive UTs.
And then Claude generated code. But I didn’t download it straight away. As usual, I ran the prompt a few times.
Each time Claude found bugs and fixed them. Eventually it seemed to be scraping the bottom of the barrel for bugs, so I stopped, downloaded the file, renamed it to main.rs and copied it over the original main.rs in c:\language\mynewtool\src.
Then I ran cargo test to compile and test the code. But, as is often the way, the code didn’t work first time:
This turned out to be my fault: Claude had hit the limit on how much it can generate at once. As before, I told it to continue:
With the two parts glued together, I tried again. But it still failed. It took another five rounds of trying the code, reporting the error to Claude and applying diffs before the code finally compiled. Clean compilation, but one test failure.
I just copied and pasted the error into Claude. No fancy prompting - just the raw, unfiltered error. Claude knew what to do.
And it fixed it. Success.
Improving cloc
Then I decided to improve the code to add a few extra features:
Support for Perl extensions.
Support to scan a specific sub-directory.
Better progress reporting - particularly while scanning large directories.
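The Perl feature is mostly a table change. Here's a hypothetical sketch of how extension-to-language mapping might look in a tool like this - the names are illustrative, not the identifiers Claude actually produced:

```rust
use std::collections::HashMap;

// Hypothetical extension-to-language table; adding Perl support
// is mostly a matter of adding entries like "pl" and "pm".
fn language_map() -> HashMap<&'static str, &'static str> {
    let mut m = HashMap::new();
    m.insert("rs", "Rust");
    m.insert("c", "C");
    m.insert("h", "C/C++ Header");
    m.insert("py", "Python");
    // New entries for Perl:
    m.insert("pl", "Perl");
    m.insert("pm", "Perl");
    m
}

// Look up a file's language from its extension, if any.
fn detect_language(path: &str) -> Option<&'static str> {
    let ext = path.rsplit('.').next()?;
    language_map().get(ext).copied()
}

fn main() {
    println!("{:?}", detect_language("lib/Regexp/Common.pm"));
}
```

The comment-syntax rules for each new language are the harder part; the mapping above only decides which rules to apply.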
This turned out to be more tedious. By default, LLMs provide deltas, not full source files. Why? I suspect it’s to minimize compute. Providing a complete new source file for every iteration uses a lot of tokens and takes time - it can take 10-20 seconds for Claude to generate the output. That costs Anthropic money.
It also uses up the context window. There are about 7.5 tokens per loc, so 500 loc is ~4k tokens. Claude has a maximum context window of 200k tokens - enough for roughly 50 full-file iterations. And by the time you include error messages, questions and other text, there’s only really capacity for 20-25 iterations.
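The back-of-the-envelope arithmetic, spelled out under the same assumptions (~7.5 tokens per loc, a 200k-token window):

```rust
// Context-window budget for repeatedly regenerating a full source file.
// Assumptions: ~7.5 tokens per line of code, 200k-token window.
fn main() {
    let tokens_per_loc = 7.5_f64;
    let file_loc = 500.0;
    let window = 200_000.0;

    let tokens_per_iteration = tokens_per_loc * file_loc; // 3,750 (~4k)
    let max_iterations = window / tokens_per_iteration;   // ~53, i.e. roughly 50
    println!("{tokens_per_iteration} tokens/iteration, ~{max_iterations:.0} iterations max");
}
```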
So I was now in a loop of applying diffs, attempting to compile and getting errors:
After feeding those back to Claude I got to a point where everything compiled cleanly. Even better, the tests passed too.
Testing
Fortunately it’s easy to test this tool. Run it on the same codebase as cloc and compare the results!
Running cloc on the cloc codebase…
Let’s compare our new tool:
Hurrah! They agree! The new tool is more verbose. And it’s also faster - it can process 83 files/sec versus 26 for cloc. I guess that’s the advantage of Rust over Perl.
Let’s try something bigger. How about OpenSSL?
cloc says:
So that’s 617,200 of C (source & headers), 227,930 of Perl and 197 of Python. Let’s see what the new tool says:
The new tool doesn’t support as many languages. Nor does it remove duplicate files (cloc removed 60 duplicate files). And the output isn’t quite as nice.
But the results match well: 617,300 lines of C (a 0.02% difference), 226,191 lines of Perl (0.8%) and 195 lines of Python.
The new tool is also much faster. cloc took 4mins 30secs. The new tool? 10 seconds :). Of course the new tool isn’t scanning for duplicates, but even so.
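Spelling out that timing comparison:

```rust
// The OpenSSL timing comparison from the text: cloc took 4 mins 30 secs,
// the new tool took 10 seconds.
fn main() {
    let cloc_secs = 4.0 * 60.0 + 30.0; // 270 s
    let new_secs = 10.0;
    println!("speedup: {:.0}x", cloc_secs / new_secs); // 27x
}
```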
And the size of the new tool? 488 lines of Rust.
Conclusion
Building something more complex requires more iteration. But I didn’t have to write any code. Nor debug anything. Realistically, cloc is probably about the limit of what the best LLMs can single-handedly cope with at present. For now, anything bigger will require some human oversight.
But it’s impressive. I’ve spent a couple of hours and got a pretty good tool. Thirty years ago the company I worked for had built its own line counting tool. A common new hire activity was to get them to enhance the tool - maybe adding support for a new language or enhancing the output. Those tasks often took the best part of a week. Now I can do it in less time than it would take to brief the new hire.
What's more exciting is how this process will evolve: the manual iteration steps will be automated by AI agents, and these agents will handle increasingly complex development tasks. We're moving toward a future where developers can focus on system design and innovation while AI handles the implementation details. For teams looking to experiment with AI-assisted development, tools like cloc provide an ideal starting point - complex enough to be meaningful, yet contained enough to be manageable.