
Coding with GenAI: The Current Landscape

By Curtis Michelson |  August 11, 2025

How both technical and non-technical users are finding value from the hottest new programming language: English (Updated in August 2025)

Is generative AI on the verge of becoming a powerful force in software development?

In 2025, coding and software development support has emerged as one of the most prominent and rapidly growing use cases for generative AI. The code-generation sector is projected to experience a remarkable 53 percent compound annual growth rate (CAGR) from 2024 to 2029, outpacing other generative AI modalities, according to a June 2025 S&P Global report. Startups building coding assistants, like Cursor, sport multi-billion-dollar valuations and hundreds of millions in revenue. And InnoLead research earlier in the year found coding to be one of the top five ways that innovation and R&D groups in large organizations are using generative AI.

For both professional developers and ordinary folks like myself who want to set up a website quickly or build a data dashboard, the benefits of reduced development time, radical ease-of-use, and the ability to move quickly on one’s own without the inertia of a whole team are obvious. And, the risks commonly associated with large language models (LLMs), like hallucinations, are markedly reduced. (More about why that’s the case in a moment.)

I took a deep dive into how people are using these tools today; the risks and potential return on investment; some of the most popular and powerful platforms; and a few of the open questions about where this all goes.

The Current Landscape

Like all aspects of genAI, codegen is rapidly evolving. The market roughly breaks into tools for professionals and tools for everyone else. The 2025 AI Index Report from Stanford University’s Institute for Human-Centered AI found that professional developers make up just 3.4 percent of the U.S. workforce but already account for 37 percent of conversations on Anthropic’s Claude. Anecdotally, many developers, including ones I interviewed for this article, give Claude an edge (for the moment) over Gemini or ChatGPT or others when it comes to general software development.

Anthropic just released its Economic Index report looking at the impact of AI on software development, noting that far more people are using its Claude Code tool to automate the work of coding outright than to augment their own work. And the company found that “startups are the main early adopters of Claude Code, while enterprises lag behind.”

The term “vibe coding” was coined by Andrej Karpathy, a founding member of OpenAI and former director of AI at Tesla, who used the phrase offhandedly in a tweet that went viral. It refers to the zen-like flow state one can enter by talking to a coding agent and seeing your app rendered on the fly. Traditional software development typically had a strong separation of specialists: designers in one department and coders in another. Today, vibe coding is eroding, or at least blurring, these lines, as design teams whip up code and coders deliver designs, and both camps do so in minutes, days, or weeks.

If you can just type, “Build me a slick, cool B2C website using Ruby on Rails, with a look-and-feel that is a cross between lo-fi retro Craigslist and shiny Spotify,” and two to three minutes later (a typical turnaround time for a tool like Claude) you have something halfway decent, then why not? That fast bypass of the “blank page” or “cold start” problem may be the thing that unblocks a team’s imagination and gets a project moving.

Where is The ROI?

A value point for all users, but especially for software teams, is the fast troubleshooting these tools offer, helping developers avoid getting bogged down reading software documentation. A good example comes from Gavin Hupp, VP of Technology at United Parks & Resorts (which operates SeaWorld, Busch Gardens, and other theme parks). Recently his team was stuck trying to integrate with a vendor’s API. Hupp said, “In the past, we would have spent a day or two with an engineer dedicated to the problem. Now the team figured out a solution in minutes.”

I heard a very similar story when I recently spoke with Aman Khan of Arize AI about his team’s uses of genAI. One of their large SaaS customers wanted to track how many of their team members were using the Arize product; basically, they wanted a custom dashboard. While his team was debating different approaches, and considering putting an engineer on it for a week, Khan said, “I literally just went to Replit, prompted it with ‘build me a usage dashboard,’ and it built me a Streamlit-based dashboard for that specific company and hosted it at a URL with authentication. I was like, ‘OMG, while the team was talking about this problem, I literally just built the prototype.’”
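For a sense of scale, a dashboard like the one Khan describes can run to only a few dozen lines of code. Below is a minimal sketch of what a Streamlit usage dashboard might look like; the data file and column names are hypothetical illustrations, not Arize’s or the customer’s actual implementation.

# usage_dashboard.py: a minimal sketch of a per-customer usage dashboard.
# Assumes a hypothetical usage_data.csv with columns: date, user, event_count.
import pandas as pd
import streamlit as st

st.title("Product Usage Dashboard")

# Load usage events exported from the product's logs (hypothetical file).
df = pd.read_csv("usage_data.csv", parse_dates=["date"])

# Headline metrics: distinct users and total events.
col1, col2 = st.columns(2)
col1.metric("Active users", df["user"].nunique())
col2.metric("Total events", int(df["event_count"].sum()))

# Daily event volume over time.
st.line_chart(df.groupby("date")["event_count"].sum())

# Per-user breakdown, sorted by activity.
st.subheader("Events by user")
st.bar_chart(df.groupby("user")["event_count"].sum().sort_values(ascending=False))

Running "streamlit run usage_dashboard.py" serves the page locally; tools like Replit add the hosting and authentication layer on top.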

The other big category of value I heard about in my interviews with pro coders is the ability to take on projects that were formerly too ambitious or even impossible. Axel Rivera is a professional iOS developer who delivers B2B SaaS apps in the finance sector. He knows Apple’s programming language, Swift, very well, and he’s a regular user of Apple’s integrated development environment (IDE), Xcode. But even for Rivera there are many areas of Swift and iOS coding he has yet to explore, and he has found a friend in CodeGen. As he said, “There are projects or ideas that a couple of years ago I wouldn’t have touched. Now, I can get a crash course on some obscure thing like a new video encoder, and I get a fast prototype with clear implementation examples. It is extremely satisfying that I can do things I wouldn’t have considered before.”

Despite these clear advances, new research analyzing real development teams is revealing a kind of productivity paradox. In this study, published July 12, developers expected a 20-25 percent gain in efficiency, but the controlled test found they were 19 percent less efficient in actual code completion time. The researchers try to explain the discrepancy in their conclusions: for pro teams who have spent years working on complex systems they know very well, the tools introduced a false optimism and actually made their situation worse. In the words of one participant, “[codeGen] made some weird changes in other parts of the code that cost me time to find and remove […] My feeling is the refactoring necessary…was ‘too big’ [and genAI] introduced as many errors as it fixed.”

Similarly, a July Stack Overflow survey of 12,823 developers found that even though developers reported rising personal productivity (as seen in the testimonials from Hupp and Rivera above), few dev teams reported actual team-level productivity boosts. What this points to is the all-too-often forgotten issue of constraints and local optima: improving one segment of a collaborative workflow may cause downstream bottlenecks that slow the whole system down. In the genAI era, this system-level pattern will extend beyond software development teams to innovation, R&D, marketing, and other enterprise workgroups.

How and Why Does GenAI Coding Work So Well?

All LLMs can feel a bit mysterious. We’re told they are next-word prediction machines, but that seems an inadequate descriptor for the surprising, even spooky experience we get when vibe coding, or vibe writing, or vibe music-making. At their most fundamental level, LLMs are indeed stochastic parrots: trained on all the world’s published knowledge, their outputs mirror what they have seen before in training. The surprising quality of our LLM interactions comes from the fact that deep, multi-billion-parameter neural networks are exquisitely sensitive to small variations in natural language inputs; i.e., to our prompts. As anyone using these tools for writing emails or making pitch decks knows, being very detailed and precise in your prompting (inputs) makes an enormous difference in the quality of output. Prompt engineering became a cottage industry for exactly this reason, and it also explains the recent interest in meta-prompting: having LLMs write those highly nuanced prompts on our behalf.

Claude-generated code from my project to analyze SEC filings.

However, in the arena of stochastically parroting code, as many research studies have shown, code generation systems deliver higher-quality output more consistently than standard plain-language interactions. The reasons for this are threefold:

  1. Code is by its nature more structured, offering a more constrained data space for LLMs to train on. 
  2. The outputs of codegen systems are immediately testable; code either works or it doesn’t (see the short illustration after this list). 
  3. There are clear evaluation benchmarks (good, better, best) for code quality that permit easier fine-tuning and model refinement, both human-in-the-loop and automated.
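To make that second point concrete, here is a tiny, hypothetical example: a generated helper function plus the checks that immediately confirm whether it works. (The function and its tests are illustrative, not taken from any of the tools discussed here.)

# A hypothetical LLM-generated helper: turn a title into a URL slug.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

# The output is immediately testable: either these checks pass or they don't.
assert slugify("Coding with GenAI") == "coding-with-genai"
assert slugify("  Hello   World  ") == "hello-world"
print("all checks passed")

That fast, unambiguous feedback loop is part of what makes fine-tuning and automated refinement (point three) so much easier for code than for prose.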

As a non-coder myself, I’ve been shocked by the speed and quality of code that I get despite my rudimentary prompting. For example, when I needed to access a list of public company 10-K filings, I panicked because I found the byzantine EDGAR search interface intractable. But when I asked Claude for help, it identified a well-documented open-source project called sec-api for accessing SEC filings. I asked Claude to read the API documentation, and then I literally said, “Write a script so that I can connect to and download a 10-K filing for a company.” In seconds, I had a Python script (see below), which I really didn’t even know how to use at first. Claude walked me through how to open Terminal on my Mac and execute the script. I hit one error, but when I pasted the error from Terminal into Claude, it had a workaround. Within minutes, I had a bunch of 10-K filings and went on my way, merrily analyzing.
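The script itself appeared as an image above, so it isn’t reproduced in text here, but a minimal sketch along the same lines would look something like the following. (This version calls SEC EDGAR’s public JSON API directly rather than the sec-api package; the company CIK and the contact email in the User-Agent header are placeholders.)

# fetch_10k.py: a sketch of downloading a company's latest 10-K from SEC EDGAR.
import requests

CIK = "0000320193"  # Apple Inc.; EDGAR identifies companies by 10-digit CIK numbers
HEADERS = {"User-Agent": "Example Research me@example.com"}  # SEC asks for a contact string

# 1. Pull the company's filing index from EDGAR's submissions API.
subs = requests.get(f"https://data.sec.gov/submissions/CIK{CIK}.json", headers=HEADERS).json()
recent = subs["filings"]["recent"]

# 2. Find the most recent 10-K and its primary document.
for form, accession, doc in zip(recent["form"], recent["accessionNumber"], recent["primaryDocument"]):
    if form == "10-K":
        folder = accession.replace("-", "")
        url = f"https://www.sec.gov/Archives/edgar/data/{int(CIK)}/{folder}/{doc}"
        # 3. Download and save the filing locally for analysis.
        with open("latest_10k.html", "wb") as f:
            f.write(requests.get(url, headers=HEADERS).content)
        print(f"Saved 10-K from {url}")
        break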

If I can do this, then just about anyone can.

Should We, Just Because We Can?

A very hot topic in the CodeGen space at the moment centers on the risks of letting novices into the world of professional coding, and of creating inflated expectations for end users who might not know that there be dragons in the back of the cave; i.e., there may be lurking bugs, or generic, clunky, inefficient code that “works” but departs from standard practice, making it a resource hog at best and a security hole at worst.

One particular vulnerability identified in April of this year is “SlopSquatting.” Yes, you heard that right. SlopSquatting gets its name from two sources. “Slop” is the term for genAI-generated pablum that is now passing for “good enough” email, marketing, sales, and customer service copywriting. (For example, if the word “delve” is used, as in, “let’s delve into the future of AI in this podcast,” it likely came from an LLM; certain words are more favored by these systems than others.) And “squatting” comes from the Web 1.0 era, when there was a gold rush on domain names, and people would literally buy up likely high-value names and sit on them. This new vulnerability comes from malicious hackers who know that codeGen LLMs will occasionally hallucinate plausible-sounding but nonexistent software package names. The hackers create malware, publish it to public package registries, and cleverly name their poison packages after the names the LLMs hallucinate. Unwitting developers then pull that malware into their projects.
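One simple defensive habit, whether you are a pro or a citizen developer, is to verify that a suggested package actually exists and has a real history before installing it. Here is a minimal sketch of such a check against PyPI’s public JSON API (the package name argument is simply whatever an assistant just suggested):

# check_package.py: sanity-check an AI-suggested package name against PyPI before installing it.
import sys
import requests

name = sys.argv[1]  # e.g. python check_package.py some-suggested-package
resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)

if resp.status_code != 200:
    print(f"'{name}' is not on PyPI at all; it may be a hallucinated (slopsquattable) name.")
else:
    data = resp.json()
    print(f"'{name}' exists with {len(data['releases'])} releases.")
    print("Still review the project page, maintainers, and release history before trusting it.")

A check like this only catches names that don’t exist yet; it won’t catch a malicious package a squatter has already published, which is why reviewing maintainers and history still matters.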

Another vulnerability, made painfully and publicly manifest in July, is the risk of live production failures when these tools connect to live external services. Typically, one’s CodeGen work should stay in a sandboxed environment, away from production. But as Jason Lemkin discovered, as described in this article, his Replit CodeGen agent deleted a production database, taking his live system down. Worse, when prompted to explain its action, the Replit agent denied ever taking it. To the company’s credit, Replit was able to restore Lemkin’s data, but not before the PR damage was done.

Who Should We Be Watching?

The world of CodeGen startups is roughly cut into three big groups. The first is general purpose multi-modal models targeting broad consumers and developers, such as Claude, Gemini, ChatGPT, etc. This is where many non-programmers like myself have discovered the “joy of coding” by creating all kinds of simple scripts and very basic apps.

Then there is a middle niche of prosumer tools (like Bolt, Replit, or Lovable) that target creative and innovation professionals who want to go from idea to working app really quickly. The output from these systems comes wrapped with hosting capabilities, and their models are trained and tuned on modern web design palettes. A new term of art for these users is “citizen developers,” who come to these CodeGen tools from all parts of the enterprise, certainly including product management and innovation teams. In Hostinger’s 2025 survey of low-code trends, these citizen developers already outnumbered professionals, and the company expects the gap to widen further. In other words, this is the high-growth segment for CodeGen.

Finally, the third segment is true professional-grade coding tools (such as Cursor, Poolside, and Windsurf) that may be accessible to novices like myself but are really tuned for software teams collaborating on big systems. These tools can automate much of the drudge work of dev teams, like spawning cloud instances of platforms and creating commits and PRs (“pull requests”) in code repositories like GitHub. And they are fine-tuned for identifying and fixing bugs in code as well, with many of them automating such processes.
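To make that drudge work concrete, here is a hedged sketch of one such step, opening a pull request through GitHub’s REST API, the kind of action these agents perform on a developer’s behalf. (The repository, branch names, and token variable are placeholders; this illustrates the general mechanism, not any specific vendor’s implementation.)

# open_pr.py: a sketch of one automated step, opening a pull request via GitHub's REST API.
import os
import requests

token = os.environ["GITHUB_TOKEN"]  # a personal access token with repo scope (placeholder)
repo = "my-org/my-app"              # placeholder repository

resp = requests.post(
    f"https://api.github.com/repos/{repo}/pulls",
    headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    },
    json={
        "title": "Add usage dashboard endpoint",
        "head": "feature/usage-dashboard",  # branch containing the generated changes
        "base": "main",
        "body": "Opened automatically after the generated changes passed local tests.",
    },
)
resp.raise_for_status()
print("Opened PR:", resp.json()["html_url"])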

Below is my short list of the most well-funded and well known CodeGen tools in all three categories, and a bit about each of them.

Consumer CodeGen Offerings

Name | Known For | Investors
Claude | Most folks (today) give Claude the edge for homebrew coding. Outputs are rendered in what Anthropic calls “artifacts,” which are like mini built-in web browsers for previewing.* | Anthropic
ChatGPT | Sports almost the same functionality as Claude, but also offers a wider set of models for coding and code reasoning, such as its highly regarded “o4-mini-high.” | OpenAI
Gemini | Probably the one to watch as Google begins to bring its full deep learning expertise to the Gemini suite. It already has many CodeGen developer offerings as well. | Google

Prosumer CodeGen Offerings

Name | Known For | Investors
Lovable | It has certainly the most endearing name of them all, and it is loved by coders and non-coders alike for its lovely design touches and React-style code output. | Creandum, byFounders, Hummingbird
Replit | Very similar in capability to Lovable, but with a Ghostwriter AI assistant. Also distinguished by its built-in database editor, user authentication, and the ability to deploy apps directly from the browser. | Craft Ventures, Andreessen Horowitz, Coatue Management
Bolt | Again, very similar to Lovable and Replit, but also has WebContainers, technology that runs dev environments directly in the browser, making it more cost-effective. | Emergence, Google Ventures, Madrona, Mantis

Developer CodeGen Offerings

Name | Known For | Investors
GitHub Copilot | Inline code completions across many dev environments (IDEs), helping developers rapidly write common code snippets. An Accenture study pegged the Copilot productivity enhancement at 55 percent. | Owned by Microsoft
Cursor | An AI-powered coding tool made by Anysphere; an assistant for engineers. They have one of the largest user bases and are the most well funded. Cursor’s nearest competitor is Devin from Cognition Labs, which incidentally acquired Windsurf, another competitor in this category, on July 14th. | Thrive Capital, Andreessen Horowitz, OpenAI
Poolside | They are developing engineering-grade CodeGen, differentiated by proprietary coding models that continuously learn how developers write code. | Bain Capital Ventures, Felicis, Redpoint Ventures

* A breakaway hit new product from Anthropic is Claude Code, released in March. It has received widespread adoption from prosumers and pros alike. This is a new category of CodeGen tool that runs locally on one’s own system (in a command-line terminal), and it operates more like an “agent,” handling tasks and writing changes to a local copy of one’s project. This provides the sandboxed safety needed to test in a local environment without impacting production systems. In my personal testing, once I got over the shock and horror of working with a blinking cursor at the command line, I found the experience not only engaging and fun (Claude Code uses gerunds so clever your 12th-grade English teacher would approve), but also much faster (perhaps 2x to 3x) for generating code.

How You Can Start To Test These Tools

A good first step with any of these tools is to ask it about its own capabilities. You’ll get a refreshingly sales-lingo-free output that lays out, in clear, concise terms, its precise features and limitations.

Here is Google Gemini’s response to my query (prompt) “tell me about your coding capabilities” on May 21st, 2025:

As a large language model, I have a range of coding capabilities developed through training on a massive dataset of text and code. Here’s a breakdown of what I can do:

Understanding and Generating Code:

Multiple Programming Languages: I can understand and generate code in a wide variety of popular programming languages, including:
Python
JavaScript
Java
C++
C#
Go
Ruby
PHP
Swift
Kotlin
SQL
HTML/CSS
And many more.

Code Comprehension: I can analyze and understand existing code, identify its purpose, and explain its functionality.

Code Generation: I can write new code based on your descriptions, requirements, or prompts. This can range from simple scripts and functions to more complex algorithms and boilerplate code.

Code Completion: I can suggest completions for partially written code, helping to speed up the development process…

Coding-Related Tasks:

Debugging Assistance: I can help identify potential bugs in code by analyzing error messages and the code’s logic. I can suggest possible causes and fixes.

Code Optimization: I can suggest ways to improve the efficiency or readability of code.

Algorithm Design: I can help you think through and outline algorithms for specific problems.

Explaining Concepts: I can explain programming concepts, syntax, and best practices.

Writing Documentation: I can help generate documentation for code, such as comments or README files.

Unit Test Generation: I can assist in creating unit tests for your code to ensure it functions as expected.

Regular Expressions: I can help you create and understand regular expressions.

API Usage: I can provide examples and explanations for how to use various APIs…

How I Can Help You:

Learning: If you’re learning to code, I can provide explanations, examples, and help you work through exercises.

Prototyping: I can quickly generate boilerplate or initial versions of code to get you started.

Problem-Solving: You can describe a coding problem, and I can suggest approaches or code snippets.

Brainstorming: I can help you explore different ways to implement a feature or solve a problem.

Important Limitations:

No Execution Environment: I don’t have an actual coding environment to run or test the code I generate. You’ll always need to test the code in your own environment.

Potential for Errors: While I strive for accuracy, the code I generate might contain bugs or not be perfectly optimal. It’s crucial to review and test it thoroughly.

Context is Key: The quality of my output heavily depends on the clarity and detail of your prompts. The more specific you are, the better I can assist.

Security Concerns: Be cautious when using AI-generated code for security-sensitive applications. Always review it for potential vulnerabilities…

In essence, I can be a powerful coding assistant, helping you learn, write, understand, and debug code more efficiently. However, I am a tool, and human oversight, testing, and critical thinking remain essential in the software development process.
(Ellipses indicate where we truncated Gemini’s response.)

Where Is It All Headed?

Perusing the above list of CodeGen offerings, it is clear that nobody has a crystal ball, and certainly much consolidation is expected. Windsurf won’t be the last one snatched up. But despite the uncertainties, it does appear that these startup investments are dividing between professional and consumer plays, with the pro offerings getting the lion’s share of the capital.

Here are some final open-ended questions I believe worth pondering about all of this.

As “citizen developers” become more numerous in large organizations, how will governance frameworks adapt to ensure security, compliance, and maintainability at scale? If CodeGen tools improve to the point where they can quality-control and debug their own outputs, such that IT teams trust them in the hands of all business users, software development will be truly democratized. That will certainly change the role of anyone who trained to be a coder, tester, or software development manager. But the changes in how software gets created will be just one component of the emerging AI-driven future of work. How will enterprise culture change if the fastest path from idea to app becomes describing it, and “coding” becomes a secondary skill?

Is there an under-appreciated opportunity for investors in the much larger consumer space? As the Hostinger survey noted, citizen developers will far outnumber traditional ones. Perhaps we’ll see more consumer-grade dev tools for specific domains. The education sector seems like interesting terrain for this; imagine a “Khan Academy”-type offering just for getting students from idea to app in seconds. And will investors want to bet on startups, or will they hedge their bets, knowing that the large model providers (OpenAI, Anthropic, Google, etc.) will have the most leverage to create such verticals themselves?

Are the agentic CodeGen platforms the new arena of competition? With the runaway success of Claude Code and the massive valuations for startups like Anysphere, Cognition Labs, and Magic, it seems the agentic interface may be the new integrated development environment (IDE). Perhaps the future is about becoming the Slack/Figma/GitHub of autonomous user intent capture and hyper-fast codegen.

Is the new AI arms race in the CodeGen arena moving to the model level? Poolside is being funded like an LLM-scale foundation model company. Windsurf’s acquisition suggests more consolidations to come. It appears that domain-specific models for code (distinct from general LLMs like Claude) are worth competing for. More capital may flow into vertical foundation model plays with code-specific modalities for things like security, or governance, or code quality.

Are we witnessing the death of “learning to code”? If anyone can build software through natural language, what happens to computer science education and the concept of programming as a learned skill? Are we creating a world where understanding code becomes as niche as understanding assembly language? Burning Glass’ latest report “No Country for Young Grads” presents the stark case for this emerging reality.


InnoLead members can access this article in PDF form for easy sharing in our resource library. (Featured image by Nubelson Fernandes on Unsplash.)
