
Most online tutorials for AI coding tools focus on trivial applications: snake games, calculators, or simple to-do lists. While these examples are interesting demonstrations, they provide little guidance for building complex enterprise software with multiple interacting components.
We decided to put AI coding to a more rigorous test by embarking on a complex project: a comprehensive personal finance application with budgeting, transaction management, and investment portfolio features. This wasn’t a demo app. It was a system with multiple aggregates and real-world complexity, the kind of application enterprises build every day. Aggregates, as self-contained clusters of domain objects, provide an excellent measure of system complexity since they represent complete business capabilities that must function cohesively.
When we started, we were quite skeptical about AI coding, but decided to go all-in to test the limits. As we progressed, our skepticism diminished. We discovered the limits were far beyond what we had supposed.
Throughout our journey, we learned valuable lessons about what makes AI coding truly effective. The key insight we want to share with you is that the most critical factor in successfully building with AI isn’t clever prompting techniques or understanding AI internals, but something much more traditional: detailed software requirements and specifications.
This article is the first in a series exploring AI-augmented software engineering practices. In this post, we focus on what may be the most important aspect: the role of structured requirements. Check out our next article in the series, ‘AI Software Quality Assurance: Testing Strategies for AI-Generated Code at Scale’.
The Vibe Coding Trap
When we think about AI coding, it exists on a spectrum. You can start with no AI at all (traditional coding), add AI code completion (GitHub Copilot), use AI to support your design process (ChatGPT for debugging or answering questions), or go full agentic AI coding. We chose the latter route.
We started with what’s commonly known as “vibe coding” – an approach that involves giving AI general directions and letting it implement solutions without detailed specifications. This approach works reasonably well for simple applications, and initially, we did make progress on basic functionality.
The problems arose when we needed to add more complex features to our enterprise application. For example, implementing AWS Cognito security became a significant challenge. We spent eight hours and over $200 on multiple attempts with different prompts, and still ended up with nothing functional. With Claude Code, you pay per token (essentially per word) processed, so complex conversations with multiple iterations become expensive quickly.
Why did this approach fail for complex features? Unlike experienced human developers who can ask clarifying questions and fill in gaps, AI tools operate within the boundaries of what they’ve been explicitly told. What works for a simple snake game doesn’t scale to an enterprise application with multiple aggregates, complex business rules, and integrated components. Without clear specifications, AI makes assumptions, often incorrect ones, about your intentions and requirements.
Going Back to Waterfall (But Faster)
After our frustrating experience with complex features, we changed our approach dramatically. We went back to waterfall, betting that good practices still matter. This might seem almost antiquated in the age of agile, but we wrote detailed documentation, including:
- Domain descriptions using Domain-Driven Design principles
- System architecture with clear separation of concerns
- Feature specifications with user journeys
- API contracts between frontend and backend
The key difference from traditional waterfall? The cycle from writing the requirements documentation to delivery can now take as little as 15 minutes to an hour. We’re not talking about the Niagara Falls of development processes anymore – this is more like a quick garden waterfall feature.
Importantly, we don’t create these specifications in isolation. We co-create them with AI, asking it to review, enhance, and add more details to our initial drafts. The AI becomes a partner in the specification process, but we always remain in the loop to verify everything. This verification step is crucial, as the AI occasionally introduces assumptions or details that don’t align with our vision.
The results were transformative. Suddenly, Claude Code went from generating disconnected, non-functional code to producing cohesive, working features that integrated properly with our codebase. It was amazing how well it worked.
This led us to a counter-intuitive realization: AI tools don’t reduce the need for thorough requirements and specifications. They amplify it. The more structured and detailed your requirements, the more powerful AI becomes as a development partner. And this principle extends to other established good practices in software engineering, which we’ll explore in future articles of this series.
Building Your AI Development Framework
The foundation of our successful approach centered around several key elements:
The Central Context File: AI’s Minimal Brain
We created a central context-loading file that our AI tool reads every time it starts. This file serves as a minimal “brain” for the AI, containing:
- Project overview
- Core domain concepts
- Development requirements
- References to detailed documentation
Inside this file, we defined the context that would be available in every prompt. We kept it short, focused only on the main rules that must always be followed. For more complex tasks, we explicitly instructed the AI to “Read doc @docs/functionality_overview-budget.md and @docs/plan_budget.md and respond OK” to preload additional context.
Interestingly, we initially tried to add more context to this central brain, but discovered it was unnecessary for minor bugfixes and small feature additions. For these cases, a minimal context was actually more effective, allowing the AI to focus only on what was relevant to the immediate task.
# Project Overview
Our application is a comprehensive personal finance management system integrating:
- Zero-based budgeting
- Transaction tracking and categorization
- Account management with import capabilities
- Investment portfolio management
# Domain Concepts
- Budget: Zero-based monthly budget with categories
- Transaction: Financial movements with budget assignment
- Account: Banking accounts with transaction import
- Portfolio: Investment tracking with allocation
# Development Requirements
- Test-First Development: Create tests first, verify they fail (red), then write code to make them pass (green)
- No mocks unless absolutely necessary
- Small functions: keep focused (<10 lines)
- Small files: maintain files under 200 lines when possible for better readability and maintainability
Clean Architecture and Consistent Naming
We decided to follow Clean Architecture (or Onion Architecture, if you prefer the alternative name) to structure our application. This approach, with its clear separation of concerns and dependency rules, provided a solid foundation that AI could easily understand and maintain.
All services, controllers, DTOs, and domain objects are named consistently, typically using suffixes (e.g., BudgetService, TransactionController, CategoryDTO). This naming consistency helps tremendously when instructing the AI to find the right classes to modify. Tools like ArchUnit for Java can enforce these architectural boundaries in your test suite, ensuring long-term consistency as your application grows.
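To illustrate, here is a minimal TypeScript sketch of that layering and naming convention. The class names and the Postgres adapter are our illustration, not the project’s actual code:

// Domain layer: pure business objects with no outward dependencies.
class Transaction {
  constructor(
    public readonly id: string,
    public readonly amount: number,
    public readonly categoryId: string | null,
  ) {}
}

// Application layer: services depend only on abstractions (ports).
interface TransactionRepository {
  save(transaction: Transaction): Promise<void>;
}

class TransactionService {
  constructor(private readonly repository: TransactionRepository) {}

  async recordTransaction(id: string, amount: number): Promise<Transaction> {
    const transaction = new Transaction(id, amount, null);
    await this.repository.save(transaction);
    return transaction;
  }
}

// Infrastructure layer: concrete adapters live at the edge.
class PostgresTransactionRepository implements TransactionRepository {
  async save(transaction: Transaction): Promise<void> {
    // Database access would go here; omitted in this sketch.
  }
}

Because every service and repository follows the same suffix convention, a prompt like “update TransactionService” points the AI at exactly one place in the codebase.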
We found that AI handles all the boilerplate code without any issues, producing clean, well-structured code for controllers, services, and infrastructure components. The situation becomes more complex when dealing with domain logic, which is expected since the domain is what makes each application unique. This is where human oversight remains most crucial, reviewing and refining the AI’s understanding of our specific business rules.
Domain-Driven Design: A Language for Humans and AI
Building on our clean architecture foundation, we found that domain-driven design principles were extraordinarily effective when working with AI.
Who would have thought that software engineering practices from the early 2000s would become cutting-edge again in the age of AI? DDD was formalized by Eric Evans in 2003, yet it feels like it was designed specifically for helping AI understand complex domains.
By creating clear bounded contexts, aggregates, and naming conventions, we established a shared vocabulary that the AI could apply. Clear, consistent domain naming helps the LLM use the same names throughout the code it generates.
For our project, we defined subdomains, bounded contexts, actors, and properties of all aggregates. And by “we,” I mean myself, Claude, and my coworkers – it’s 2025, and AI is now part of the team that defines your domain model! This detailed domain model gave Claude a clear understanding of our system’s structure and ensured consistent implementation across features.
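As a concrete illustration (our sketch, not the project’s actual model), an aggregate root in TypeScript can encode the zero-based budgeting invariant directly, giving both humans and the AI an unambiguous rule to preserve:

// Budget aggregate root: all changes to categories go through it,
// so the zero-based invariant can never be bypassed.
class BudgetCategory {
  constructor(public readonly name: string, public readonly allocated: number) {}
}

class Budget {
  private readonly categories: BudgetCategory[] = [];

  constructor(public readonly id: string, public readonly income: number) {}

  allocate(name: string, amount: number): void {
    if (this.allocatedTotal() + amount > this.income) {
      throw new Error("Zero-based budget: allocations cannot exceed income");
    }
    this.categories.push(new BudgetCategory(name, amount));
  }

  allocatedTotal(): number {
    return this.categories.reduce((sum, c) => sum + c.allocated, 0);
  }

  toAllocate(): number {
    return this.income - this.allocatedTotal();
  }
}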
AI-Friendly Technology Choices
When selecting technologies for our project, we discovered what works best with AI coding tools. The key insight: follow the most common practices and popular technologies, as the AI’s training data was built on existing codebases found across the internet.
TypeScript worked noticeably better for us than Java, likely because of its massive adoption in open-source projects that would have been in the training data. On the frontend, despite React being more verbose and consuming more of our token budget than Vue would have, it produced higher-quality code. The AI simply had more exposure to React patterns and conventions in its training data, making it more fluent in React despite the verbosity disadvantage.
One important caveat: AI tools tend to import older versions of libraries and reference outdated documentation. Always ask the AI to check for newer versions and the latest API documentation! This small step can save hours of debugging deprecated features.
By choosing technologies with widespread adoption, we created an environment where the AI could leverage familiar patterns, recognized APIs, and established conventions in its generated code.
From Requirements to Implementation
With our framework in place, we developed a systematic process for implementing new features in our complex finance application:
- Create functionality overviews documenting the user journey and business logic
- Define API contracts between frontend and backend
- Break implementation into focused tasks
What we found essential was describing functionality from a user journey point of view. Giving examples proved particularly helpful.
For instance, when implementing the “add income category” feature, we created a detailed specification with user steps and API contracts:
# Add Income Category
## User Journey
1. User clicks "Add Income Category" button
2. Modal displays with name and initial amount fields
3. User enters data and submits form
4. Backend creates new income category
5. Frontend refreshes budget view
6. Modal closes
## API Contract
POST /api/v1/budgets/{budgetId}/income-categories
Request Body: {
  "name": "string",
  "initialAmount": "number"
}
Response: 201 Created
With this clear specification, we simply instructed Claude to “read functionality overview and implement add income category functionality”, and it would usually complete the task successfully, though not always on the first attempt.
One critical insight we gained was the importance of meticulously separating frontend from backend in all examples. By clearly defining API contracts, we prevented Claude from placing business logic in the frontend, a common issue we encountered with less structured approaches.
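To show what that separation looks like in code, the contract above maps onto shared types that the frontend consumes without any business logic of its own. A minimal TypeScript sketch follows; the response body shape and names are our assumptions, since the contract only specifies the status code:

// DTO types derived from the API contract (response fields are assumed).
interface CreateIncomeCategoryRequest {
  name: string;
  initialAmount: number;
}

interface IncomeCategoryResponse {
  id: string;
  name: string;
  initialAmount: number;
}

// Frontend client: it only posts the DTO and reports the result;
// all business rules stay behind the API.
async function createIncomeCategory(
  budgetId: string,
  request: CreateIncomeCategoryRequest,
): Promise<IncomeCategoryResponse> {
  const response = await fetch(`/api/v1/budgets/${budgetId}/income-categories`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(request),
  });
  if (response.status !== 201) {
    throw new Error(`Unexpected status: ${response.status}`);
  }
  return response.json();
}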
Managing Complexity and Large Changes
While small features could be implemented with focused specifications, our enterprise-scale finance application required handling large, system-wide changes. When adding multi-currency support, a feature that touched nearly every part of our codebase, we created a comprehensive plan.
For such tasks, you run into the problem of context management. Context is the conversation history that the LLM keeps in memory as you work. The more you type (and the more responses you get), the more of this limited resource you consume. LLMs tend to focus most strongly on the beginning and end of this context, sometimes losing track of details in the middle. With complex features, you’ll quickly exceed what the AI can effectively process in a single conversation.
We asked Claude to generate a detailed implementation plan with discrete, trackable tasks:
# Multi-Currency Implementation Plan
## 1. Domain Model Updates
- [ ] 1.1 Update Money value object to include currency
- [ ] 1.2 Modify Transaction to support currency
...
## 2. Backend Implementation
- [ ] 2.1 Add currency conversion service
...
## 3. Frontend Implementation
- [ ] 3.1 Update transaction form to include currency selection
...
This plan-based approach allowed us to:
- Break the complex change into manageable pieces
- Track progress through the “done” checkboxes
- Keep Claude focused on specific tasks to avoid context overflow
- Maintain consistency even after clearing the context between sessions
For each step, we instructed Claude to execute specific tasks from the plan and to work only on pending ones. This helps manage the context window, letting you focus on one small piece at a time.
One essential practice we discovered was to clear the context between tasks. As you work through the plan, the context window fills up with previous tasks and conversations. Claude Code, like other LLMs, has a maximum context size. Once it’s exceeded, the system uses context compression, which currently produces subpar results. Explicitly clearing the context and then reloading relevant documentation for the current task ensures the AI stays focused and doesn’t get confused by earlier discussions. The task list itself becomes a valuable tool for rebuilding the proper context when following the plan.
This structured approach transformed what would have been an overwhelming change into a series of achievable steps.
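To give a sense of what hides behind a single checkbox, task 1.1 might look roughly like this; a sketch under our assumptions, not the project’s actual Money implementation:

// Currency-aware Money value object (task 1.1, illustrative).
class Money {
  constructor(
    public readonly amount: number,   // production code would use a decimal type, not floats
    public readonly currency: string, // ISO 4217 code such as "EUR" or "USD"
  ) {}

  add(other: Money): Money {
    if (other.currency !== this.currency) {
      throw new Error(`Cannot add ${other.currency} to ${this.currency} without conversion`);
    }
    return new Money(this.amount + other.amount, this.currency);
  }
}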
We also used a more direct approach for smaller bug fixes. When fixing a portfolio value calculation bug, we provided specific context about the issue and asked Claude to “fix it. Remember to first write a failing test (test first approach)”. This approach worked well for isolated issues that didn’t require extensive changes across the codebase.
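For illustration, the red phase of such a fix might start with a test like this one (vitest-style; Portfolio, Position, and the module paths are hypothetical names, not the project’s actual code):

// Hypothetical failing test, written before touching the production code.
import { describe, expect, it } from "vitest";
import { Money } from "../domain/money";                // hypothetical path
import { Portfolio, Position } from "../domain/portfolio"; // hypothetical path

describe("Portfolio value calculation", () => {
  it("sums shares multiplied by price across all positions", () => {
    const portfolio = new Portfolio([
      new Position("ACME", 2, new Money(10, "USD")), // worth 20 USD
      new Position("INIT", 1, new Money(5, "USD")),  // worth 5 USD
    ]);
    // Red: fails until the calculation bug is fixed, then stays green.
    expect(portfolio.totalValue().amount).toBe(25);
  });
});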
Practical Tips and Conclusion
From “What’s the vibe?” to “Here’s the spec!” – our journey from loosely-defined AI coding to structured development has taught us valuable lessons that can help any team looking to leverage AI coding tools effectively for enterprise applications.
Start with solid foundations
- Create clear domain models and ubiquitous language
- Document your architecture and design principles
- Establish a central reference file (like claude.md)
Structure your requirements effectively
- Break features into clear user journeys
- Define explicit API contracts
- Separate frontend and backend concerns
- Use plenty of examples to clarify expectations
Manage complexity strategically
- Create detailed plans for large changes
- Focus AI on specific, manageable tasks
- Review and validate generated code
- Implement a test-first approach for critical functionality
The most powerful insight from our experience is that AI doesn’t eliminate the need for software engineering discipline. It amplifies it. Teams with strong requirements engineering skills will extract far more value from AI coding tools than those relying on vague directions.
With our structured approach, we’ve transformed Claude Code from an interesting but limited tool into a powerful development partner. Our finance application now spans over 500 commits built in two months, with multiple aggregates including budget, transaction, and portfolio.
Good software engineering never goes out of style—AI just makes it faster.
Interested in moving your development process to the next level with AI-augmented engineering? Let’s connect and discuss how your team can implement these practices to build complex enterprise software faster while maintaining quality and consistency.

