
I Used Claude Code for 30 Days — Here's My Honest Review

Sarah Kim

Quantitative researcher turned AI writer. Specializes in financial AI agents.

February 10, 2026 · 14 min read


30 Days with Claude Code: A Brutally Honest Review

The Experiment

On day one, I committed to using Anthropic's Claude Code — their CLI-based agentic coding tool — as my primary development assistant for 30 consecutive days across real projects: a Next.js SaaS dashboard, a Go microservice, and a Python data pipeline. I wanted to know if this tool justified the hype, the cost, and the workflow disruption.

Here's what happened.


Setup: Frictionless, Almost Suspiciously So

Getting Claude Code running is genuinely simple:

npm install -g @anthropic-ai/claude-code
claude

That's it. You authenticate through the browser via Anthropic's console, and you're dropped into an interactive terminal session. No Docker containers, no config files to wrangle, no API key environment variables to manage manually.

$ claude
╭──────────────────────────────────────╮
│ ✻ Welcome to Claude Code!            │
│                                      │
│   /help for commands                 │
│   /config to set preferences         │
╰──────────────────────────────────────╯

> 

The tool immediately begins indexing your working directory. On my Next.js project (~340 files), the initial scan took about 8 seconds. On the Go microservice (~120 files), it was nearly instant.

One early gotcha: Claude Code operates within your current working directory by default. If you cd into a subdirectory before launching, it scopes to that subtree. This seems obvious but tripped me up twice during the first week when I was getting "file not found" errors for files that clearly existed — just outside the working directory.

Configuration is minimal but useful. The /config command lets you set a system prompt (called a "memory" file), model preferences, and tool permissions. I created a CLAUDE.md file at the root of each project:

# CLAUDE.md

## Project Context
This is a Next.js 14 app with App Router, TypeScript, Tailwind CSS, 
and Supabase for the backend.

## Conventions
- Use server components by default; add "use client" only when needed
- Prefer Zod for validation
- All database queries go through the `lib/supabase/queries.ts` module
- Follow the existing file naming convention: kebab-case

## Testing
- Run tests with `pnpm test`
- We use Vitest + Testing Library

This context file is the single highest-leverage configuration step. Without it, Claude Code gives competent but generic output. With it, the output aligns with your actual project from the first response.


Daily Workflow: How It Actually Fits Into Development

My typical day with Claude Code fell into three usage patterns:

Pattern 1: Feature Implementation (High Value)

I'd describe a feature in natural language, and Claude Code would plan the implementation, create files, and wire things together.

> Add a new billing page at /dashboard/billing that shows the user's 
  current plan, usage metrics from Supabase, and a button to upgrade 
  via Stripe Checkout. Follow the existing page layout pattern.

Claude Code would:

  1. Read existing page files to understand the layout pattern
  2. Check the Supabase schema for relevant tables
  3. Create the page component, a server action for Stripe, and the necessary types
  4. Update the navigation sidebar to include the new route

This worked shockingly well for boilerplate-heavy features. The billing page — which would have taken me 2-3 hours of scaffolding — was done in about 15 minutes, including three rounds of refinement.

Pattern 2: Debugging and Refactoring (Mixed Value)

For debugging, I'd paste error messages or describe unexpected behavior:

> The /api/webhooks/stripe endpoint is returning 500 intermittently 
  in production. Here's the error log: [paste]. The signature 
  verification seems to fail randomly.

Claude Code would read the webhook handler, trace the logic, and often identify real issues — in this case, a race condition where the raw body was being parsed before the signature check. It correctly suggested buffering the raw body.

For refactoring, the results were more variable. When I asked it to refactor a 400-line component into smaller pieces, it did a competent job but made choices I disagreed with — extracting components at boundaries that didn't match my mental model of the feature. The refactoring was correct but not elegant. I ended up rewriting about 30% of what it produced.

Pattern 3: Quick Questions and Exploration (High Value, Low Cost)

> How does the auth middleware in middleware.ts decide which routes 
  to protect?
> Find all places where we're making direct Supabase calls instead 
  of going through the query module.

This "codebase Q&A" mode is where Claude Code's large context window genuinely shines. It reads the relevant files, synthesizes the information, and gives you a clear answer with file paths and line numbers. This replaced a significant chunk of my time spent scrolling through unfamiliar code.


Strengths

1. Reasoning About Complex Systems

Claude Code's ability to hold a large codebase in context and reason about interactions between files is its defining strength. During week three, I asked it to add rate limiting to our API routes:

> Add rate limiting to all API routes. We need a sliding window 
  approach, 100 requests per minute per user. Use Redis (we have 
  Upstash already configured). Make sure it works with our existing 
  auth middleware.

It correctly:

  • Identified all 14 API route files
  • Created a shared rate-limiting utility using Upstash Redis
  • Integrated it into the middleware layer rather than adding it to each route individually
  • Handled the edge case of unauthenticated requests (IP-based limiting)
  • Added proper Retry-After headers

This wasn't a copy-paste from Stack Overflow. It understood the architecture and made an architectural decision (middleware vs. per-route) that I agreed with. That's genuinely impressive.
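For the curious, the core of a sliding-window limiter is small. The version Claude Code wrote used Upstash Redis so all server instances share state; this is a dependency-free, single-process sketch of the same logic (the class name and key format are mine, not from the generated code).

```typescript
// In-memory sliding-window rate limiter: allow `limit` requests per
// `windowMs` per key. Production would back this with Redis so every
// server instance shares the counts; this sketch is single-process.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  // Returns true if the request is allowed, false if rate-limited.
  check(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(key, recent);
    return true;
  }
}

// 100 requests per minute per user, as in the prompt above.
const limiter = new SlidingWindowLimiter(100, 60_000);
// Key by user ID when authenticated; fall back to IP otherwise.
const key = "user:123";
for (let i = 0; i < 100; i++) limiter.check(key, 1_000 + i);
console.log(limiter.check(key, 1_200)); // false: 101st request inside the window
console.log(limiter.check(key, 62_000)); // true: the earliest hits have expired
```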

2. Extended Thinking for Hard Problems

When facing a complex task, Claude Code uses extended thinking — an internal chain-of-thought process that's invisible by default but dramatically improves output quality on hard problems. You can see it with /config set to show thinking.

I tested this by giving it a gnarly bug: a React component that re-rendered infinitely only in production builds. Claude Code spent 45 seconds in its thinking phase, traced through the component tree, identified that a useEffect dependency was a non-memoized object created inline in a parent component, and suggested the fix. That diagnosis would have taken me 30+ minutes of console.log debugging.
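This bug class is reproducible outside React: `useEffect` compares dependency arrays element-by-element with `Object.is`, so an object literal created inline in the parent is a brand-new reference every render. A minimal sketch of that comparison (the helper below mimics React's check; it is not React code):

```typescript
// React compares useEffect dependency arrays element-by-element with
// Object.is. An object created inline in a parent component is a new
// reference on every render, so the effect re-fires each time.
function depsChanged(prev: readonly unknown[], next: readonly unknown[]): boolean {
  return prev.length !== next.length || prev.some((v, i) => !Object.is(v, next[i]));
}

// Simulate two consecutive renders of a parent passing `options` down.
const render = () => ({ options: { pageSize: 20 } }); // inline, non-memoized
const first = render();
const second = render();

// Same shape, different reference: deps "changed" on every render, which
// (combined with a state update inside the effect) loops forever.
console.log(depsChanged([first.options], [second.options])); // true

// The fix: hoist or memoize so the reference stays stable across renders,
// e.g. useMemo(() => ({ pageSize: 20 }), []).
const stable = { pageSize: 20 };
console.log(depsChanged([stable], [stable])); // false
```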

3. Multi-File Coherent Changes

Unlike tools that operate on a single file, Claude Code maintains coherence across a changeset. When I asked it to rename a database column and update all references, it:

# It actually ran these kinds of operations:
# 1. Searched for all references to the old column name
# 2. Updated the Supabase migration
# 3. Updated the TypeScript types
# 4. Updated every query file that referenced the column
# 5. Updated the Zod schemas
# 6. Ran the TypeScript compiler to verify no broken references

This is table-stakes for a coding agent, but the coherence of the changeset matters. It didn't just find-and-replace; it understood that some references were in type definitions (where the rename was straightforward) and others were in raw SQL strings (where it needed to be more careful).

4. Git Integration

Claude Code can stage, commit, and create PRs. The commit messages it generates are genuinely good — descriptive, conventional-commit formatted, and scoped to the actual changes. I stopped writing my own commit messages entirely by week two.

> Commit these changes with a good message.

# Claude Code output:
git add -A
git commit -m "feat(billing): add Stripe Checkout integration for plan upgrades

- Create billing page with current plan display and usage metrics
- Add Stripe Checkout session creation server action
- Integrate billing link in dashboard navigation
- Add webhook handler for checkout.session.completed events"

Weaknesses

1. Cost: The Elephant in the Room

Let's talk numbers. Over 30 days, I tracked my usage:

| Week  | Sessions | Avg. Session Length | Est. Cost |
|-------|----------|---------------------|-----------|
| 1     | 22       | 25 min              | ~$48      |
| 2     | 18       | 20 min              | ~$35      |
| 3     | 15       | 30 min              | ~$52      |
| 4     | 14       | 25 min              | ~$40      |
| Total | 69       | ~25 min             | ~$175     |

These are estimates based on token usage tracking. The cost model is per-token, and complex tasks that require reading many files burn tokens fast. A single session where I asked it to audit our entire auth flow cost nearly $8 by itself.

For a solo developer, $175/month is significant. For a team, it scales quickly. Compare this to GitHub Copilot at $19/month — the value proposition has to be substantially higher to justify the cost, and for simple autocomplete-style tasks, it isn't.

The cost problem is structural: Claude Code reads files into context to understand your codebase. Large codebases mean large contexts mean high costs. There's no way around this with the current architecture.
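The arithmetic makes the structure concrete. A back-of-envelope sketch for one "read the whole feature" task — the per-million-token prices below are illustrative placeholders, not Anthropic's actual rates:

```typescript
// Back-of-envelope token cost for one moderately sized agent task.
// The prices are assumed placeholders; substitute current pricing.
const filesRead = 40;            // files pulled into context for one task
const avgTokensPerFile = 1_500;  // rough average for source files
const outputTokens = 8_000;      // generated plan + diffs
const inputPricePerMTok = 3;     // assumed $ per 1M input tokens
const outputPricePerMTok = 15;   // assumed $ per 1M output tokens

const inputTokens = filesRead * avgTokensPerFile;
const cost =
  (inputTokens / 1e6) * inputPricePerMTok +
  (outputTokens / 1e6) * outputPricePerMTok;

console.log(inputTokens);        // 60000
console.log(cost.toFixed(2));    // "0.30"
```

A single pass is cents — but multi-turn sessions re-send the accumulated context on every turn, which is how one long audit session climbs to several dollars.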

2. Speed: Patience Required

Claude Code is slow compared to local tools. Response times for complex tasks:

| Task Type          | Typical Response Time |
|--------------------|-----------------------|
| Simple question    | 5-10 seconds          |
| Single-file edit   | 10-20 seconds         |
| Multi-file feature | 30-90 seconds         |
| Large refactoring  | 2-5 minutes           |

During the multi-file feature implementations, I frequently found myself tabbing away to do something else, breaking my flow. This is the opposite of what a good developer tool should do. GitHub Copilot's inline suggestions feel instantaneous by comparison.

The speed issue is particularly painful during debugging, where rapid iteration is essential. When Claude Code suggests a fix, I apply it, test it, and it doesn't work — the 15-second round-trip for the next suggestion adds up fast.

3. Hallucinated APIs and Outdated Patterns

This was my biggest frustration. Claude Code confidently generates code using APIs that don't exist or are deprecated:

Real example from week two:

// Claude Code generated this:
import { createServerComponentClient } from '@supabase/auth-helpers-nextjs';
import { cookies } from 'next/headers';

export async function createClient() {
  const cookieStore = cookies();
  return createServerComponentClient({ cookies: () => cookieStore });
}

This import path and API are from an older version of the Supabase auth helpers. The current approach uses @supabase/ssr with a different API. I had to correct this three separate times across different files before I added a note to my CLAUDE.md about the correct import.

Another example: It generated fetch calls with the signal parameter placed incorrectly in the options object, which silently fails rather than throwing an error. These are subtle bugs that pass linting and basic tests but break in production.

Mitigation: Adding specific library versions and correct import patterns to CLAUDE.md reduced but didn't eliminate these issues. You still need to verify generated code against actual documentation.
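The note I added was along these lines — a sketch, with the version placeholder left for you to fill in from your own `package.json` (the `@supabase/ssr` import is the current replacement for the deprecated auth helpers):

```markdown
## Library Versions (do not use deprecated APIs)
- Supabase: use `@supabase/ssr`, NOT `@supabase/auth-helpers-nextjs`
  - Server client: `import { createServerClient } from '@supabase/ssr'`
- Stripe: check the pinned version in `package.json` before generating code
```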

4. Context Window Limits Are Real (Even at 200K)

Despite the large context window, Claude Code doesn't read your entire codebase. It reads files on-demand as it works through a task. This means it sometimes misses relevant context:

During week three, I asked it to add a new field to a form. It correctly updated the form component, the Zod schema, and the Supabase insert call — but missed that there was a second form in a modal component that also needed the field. It didn't know to look for it because I didn't mention it, and it doesn't do exhaustive searches by default.

You can explicitly tell it to search comprehensively:

> Find ALL forms that submit to the users table and add the 
  new 'phone' field to each one.

But this requires you to know what to ask for, which partially defeats the purpose.

5. No Real-Time Collaboration or IDE Integration

Claude Code lives in the terminal. There's no VS Code extension, no JetBrains plugin, no way to highlight code in your editor and ask about it. The workflow is:

  1. Switch to terminal
  2. Describe what you want (including file paths and context)
  3. Wait for the response
  4. Switch back to editor to review changes
  5. Switch back to terminal to iterate

This context-switching tax is non-trivial. Tools like Cursor and GitHub Copilot Chat, which live inside the editor, have a fundamentally smoother interaction model for many tasks.


Tasks It Handled Well vs. Poorly

Excelled At

Database migration generation:

> We need to add a comments table with proper RLS policies. 
  Comments belong to users and posts. Include soft delete.

It generated a complete, correct migration with RLS policies, indexes, and foreign key constraints. It even added a comment_count trigger on the posts table. This was production-ready on the first try.

Test generation: Given a utility function, it wrote thorough tests covering edge cases I hadn't considered — empty arrays, null inputs, Unicode strings, very large inputs. The test quality was consistently high.

Explaining unfamiliar code: When I pointed it at a complex recursive algorithm in our codebase, it provided a clear, step-by-step explanation with a concrete example traced through the recursion. Better than any code review comment I've received.

Struggled With

Visual/CSS work: Asking it to fix a layout issue resulted in a cascade of Tailwind class changes that made the component worse. It doesn't "see" the rendered output, so it's guessing at visual problems from code alone. This is a fundamental limitation.

Performance optimization: When I asked it to optimize a slow database query, it suggested adding indexes (correct) but also suggested restructuring the query in ways that changed the semantics. It didn't understand the business requirements behind the query — that certain joins were intentional for data integrity reasons.

Anything requiring external service knowledge: It doesn't know the current state of third-party services. When I asked about Stripe's latest API version and webhook events, it gave me information that was 6-12 months out of date. Always verify against current documentation.


Who It's Best For

Best fit:

  • Senior developers who can evaluate and correct generated code quickly
  • Developers working on greenfield projects where they're defining patterns (Claude Code is excellent at following patterns once established)
  • Teams doing a lot of boilerplate-heavy work (CRUD APIs, form handling, data pipelines)
  • Developers who are strong at directing AI but want to reduce typing and file-navigation time

Poor fit:

  • Junior developers who can't distinguish correct code from plausible-but-wrong code
  • Developers who primarily do visual/UI work
  • Budget-conscious individual developers (Copilot is 10x cheaper for 80% of the value)
  • Anyone who needs fast, inline autocomplete-style assistance (use Copilot or Cursor for this)

Final Verdict

After 30 days, Claude Code earned a permanent place in my toolkit — but not as my primary tool. It's a specialist, not a generalist.

What I use it for now:

  • Scaffolding new features (2-3x faster than doing it manually)
  • Codebase exploration and Q&A (replaces a lot of grep-and-read)
  • Generating tests (consistently good quality)
  • Git operations and PR creation
  • Multi-file refactors where I can clearly describe the desired outcome

What I still do myself or use other tools for:

  • Inline code completion (Copilot)
  • Quick single-line fixes (Copilot)
  • Anything visual (my own eyes + browser DevTools)
  • Debugging tight loops where I need rapid iteration (manual)

The honest summary: Claude Code is the most capable CLI coding agent available today. Its reasoning ability and context handling are genuinely ahead of the competition. But it's expensive, it's slow, and it requires a skilled operator to get value from it. It's a power tool, not an autopilot.

Rating: 7.5/10 — Excellent for what it does, but the cost-speed-capability triangle hasn't been fully solved yet. When Anthropic brings the price down and the speed up, this becomes an easy 9.

Keywords

AI agents · coding agents