
There has never – literally never – been a better time to build software as an individual developer.
Not during the open-source boom.
Not during the mobile app gold rush.
Not even during the early cloud era.
Right now, in 2026, we are living through something fundamentally different:
Software development has shifted from being effort-limited to being imagination-limited.
Developers, especially seasoned ones, carry a backlog of projects they haven’t touched in years. It takes real effort to sit up all night completing a module and then show up at the office the next day. I have a OneNote full of ideas I could have implemented for personal automation, or projects I once believed could turn into startups.
For a while, I found some peace in self-hosting tools. I still do this using old laptops and Raspberry Pis. Many of the open-source tools I hosted were built primarily for the regions they catered to, which meant significant customization on my end. I spent weeks of off-hours work on them. Many of those projects sit abandoned today, either because I moved on from the need or because the problem itself became moot.
Those OneNote-jotted ideas now become part of my prompts when I build things with LLM agents.
With locally hosted models, free-tier services, and basic subscription plans, one can now build a solid coding automation setup that allows multiple projects to be completed quickly and effectively.
Over the last couple of months, I have been spending evenings with Claude, Codex, and Antigravity building many of these old ideas and unfinished projects in hours. No all-nighters. Just in the past month, I completed eight different projects across multiple languages:
| Project | Language | Time | Key Feature |
|---|---|---|---|
| Non-linear Editor | Go | 4 hours | A non-linear text editor where text is arranged in a grid with contextual notes |
| Subscriptions Tracker | Go | 2 days | Intelligent email scanning and categorisation to identify enrolled subscriptions and their costs |
| Ebook Summarizer | Python | 2 hours | Celebrity voice synthesis (Freeman/Attenborough) that reads out summaries of technical articles and ebooks, both fiction and non-fiction |
| DJ Workflow | Python | 1 day | A workflow for downloaded music: proper metadata creation, automatic STEMS separation, and RAG over your library |
This is not to gloat; it simply demonstrates that developers can now build such tools easily with a few targeted prompts. Getting the prompts right plays a crucial role, and that understanding only comes from building more projects. I rarely even look closely at the code the agents generate.
This is where I want to make one distinction very clear: these are truly vibe-coded projects. They are not production-grade or enterprise money-making products. These are tools to automate personal workflows – projects many of us wished we could prototype faster.
Productising something and making it enterprise-grade is still slightly beyond what agentic AI can fully solve, as it requires significant human involvement. Turning an idea into a revenue-generating product introduces hosting, support, maintenance, upgrades, and a wide range of operational concerns that affect cost and investment. It is best to treat these projects as stepping-stone prototypes toward something more meaningful.
The Stack Is Becoming Accessible
But let’s be honest — a decent setup still requires either a paid subscription or reasonably capable hardware to run open models effectively.
Building a solid AI development setup requires some investment that may not be easily accessible to junior developers or undergraduates; the hardware entry point has shifted higher. Development workloads that once ran comfortably on low-powered laptops now increasingly assume at least an entry-level gaming laptop or better. One could argue that cloud compute reduces hardware requirements, but it often increases token and access costs instead.
Serious AI coding capability is now accessible through predictable monthly subscriptions rather than enterprise budgets. For roughly the cost of a streaming subscription, you can realistically complete multiple projects each week within token limits. Claude’s paid plans, for example, begin around $20 per month and provide sustained usage for individual developers. Similar token-based subscription models exist across tools like Codex and Antigravity.
This changes the economics of experimentation.
Bigger Models vs Local Models
A 7B model that can iterate dozens of times on a small function is often more useful than a 400B model you can only afford to run a few times.
An important observation while working with agentic tools is that larger parameter models, such as Claude or OpenAI’s flagship models, do not necessarily outperform smaller locally hosted models by a dramatic margin for many coding tasks.
They may be better but often not proportionally better.
With iterative agents, careful prompting, and tooling around the workflow, the performance gap narrows significantly. Since agents operate iteratively, model quality differences tend to even out over multiple refinement cycles.
For example, GPT-OSS 20B can run on a MacBook Pro or a modest RTX 3060 (12GB VRAM) setup and performs well for coding workflows. Similarly, Qwen Coder 7B runs on even more modest hardware and delivers surprisingly strong results for structured development tasks. While these models may not match proprietary frontier models in every scenario, experimentation and disciplined prompting often compensate for the difference.
A Practical Solo Developer Setup
A solo developer today can assemble a powerful AI coding stack with relatively modest investment.
Paid Coding Models (Primary Engine)
A Claude subscription provides access to Claude Code tokens, which are often sufficient to build one or two moderately complex applications per week depending on scope. Codex and Antigravity offer comparable usage models. Using multiple agents on the same codebase increases iteration speed and expands the effective context window.
Free and Open Models (Cloud & Local)
Ollama enables running open-source models locally with minimal friction. Cloud offerings also provide limited free usage tiers. Larger open models can sometimes be accessed via cloud providers at low or no cost, depending on allocation policies.
Local execution remains an option for those with decent hardware. Tools like vLLM allow efficient model hosting, though setup is more manual and operationally involved.
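One convenience of this setup is that Ollama and vLLM both expose an OpenAI-compatible `/v1/chat/completions` endpoint, so the same client code can talk to either. A minimal sketch, assuming default ports and a hypothetical local model name:

```python
import json

def build_chat_request(prompt: str, model: str = "qwen2.5-coder:7b") -> bytes:
    """Build a JSON body for the OpenAI-compatible chat endpoint that
    both Ollama and vLLM serve. The model name here is an assumption."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for a single complete response
    }
    return json.dumps(body).encode()

# POST this to http://localhost:11434/v1/chat/completions for Ollama,
# or http://localhost:8000/v1/chat/completions for a default vLLM server.
```

Because the schema is shared, switching from a local model to a cloud one is mostly a matter of changing the base URL and model name.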
Use the cloud heavyweights (Claude/Antigravity/Codex) to design the system architecture and solve the impossible bugs. Use the local setup (Ollama/vLLM) for the 80% of development that is boilerplate, unit tests, and UI polish.
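The split above can be sketched as a simple routing function. This is a hypothetical illustration – the task taxonomy and model names are assumptions, not any tool's real API:

```python
# Hypothetical local-first routing: routine work goes to a local model,
# architecture and hard debugging escalate to a frontier model.
ROUTES = {
    "boilerplate": "ollama/qwen2.5-coder:7b",
    "unit_tests": "ollama/qwen2.5-coder:7b",
    "ui_polish": "ollama/qwen2.5-coder:7b",
    "architecture": "claude-sonnet",
    "hard_bug": "claude-sonnet",
}

FALLBACK = "claude-sonnet"  # escalate when the local model is down or stuck

def pick_model(task_type: str, local_available: bool = True) -> str:
    """Return the model to try first for a task, falling back to cloud."""
    model = ROUTES.get(task_type, FALLBACK)
    if model.startswith("ollama/") and not local_available:
        return FALLBACK
    return model
```

The design choice is cost-driven: the local model handles high-volume, low-stakes iterations for free, while paid tokens are reserved for the work that actually benefits from a bigger model.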
How Do I Set Up?
Prompt it and start hacking on your project! Use ChatGPT, Codex, or Claude Code – or any good local LLM tool you have installed.
Generate an install script that sets up LiteLLM with Claude, Codex, and Antigravity (keys via env vars), installs Ollama, pulls gpt-oss:20b, configures local-first routing with fallbacks, tests everything, and exits only if all checks pass. No Docker.
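For a sense of what such a prompt might produce, here is one plausible shape of a LiteLLM proxy config. The model names and the fallback schema are assumptions – verify the exact keys against the LiteLLM documentation before relying on this:

```yaml
# Sketch of a local-first LiteLLM proxy config (field names approximate).
model_list:
  - model_name: local-coder
    litellm_params:
      model: ollama/gpt-oss:20b            # served by a local Ollama instance
      api_base: http://localhost:11434
  - model_name: claude
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514   # assumed model id
      api_key: os.environ/ANTHROPIC_API_KEY       # key via env var, as the prompt asks

litellm_settings:
  # local-first: try the Ollama model, fall back to Claude on failure
  fallbacks:
    - local-coder: ["claude"]
```

The point is less the exact syntax than the pattern: one gateway, env-var keys, and a cheap local default with a paid fallback.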


















