Anthropic sent an email offering $1,000 in credit for their Claude service. At first glance, the generous credit amount caught my attention. A moment later, I realized it was specifically for Claude Code for Web, a cloud-based version of the coding agent that runs remotely rather than on local machines. The amount of credit on offer made it worth learning what Claude Code for Web does differently from its CLI equivalent.

Here’s a video record of the Claude Code for Web experiment that was the source for this blog post. It’s a naff video in many regards, but it’s included to give a visual feel for what working with Claude Code for Web is like, including my initial misunderstandings of various steps in the process and how they were corrected.

Initial Setup

Logging into Claude and accessing the code section presented the first configuration decision: network access level. The system offers three options: no network access, trusted network access constrained to specific sources, or full network access. Given that I intended to use this primarily for my own projects rather than customer work, full network access made sense for this evaluation.

The research preview came with $1,000 in credit, valid only until November 18. That deadline was just 12 days away: a short window of opportunity. The system had already detected my GitHub account and various repositories, though not all of them appeared in the initial list. After I installed the Claude GitHub app on my private repositories and completed the authentication process, my website repository appeared as expected.

The interface revealed options for different network access environments and environment variables. One limitation became apparent: there’s no direct way to specify a particular branch when setting up a session. You can only select a repository. Branch selection is handled through prompts to the agent.

The Documentation Challenge

After considering various tasks to test the system, I settled on a documentation challenge rather than a traditional coding task, and one I genuinely wanted done. The goal was straightforward but labor-intensive: see whether I could use my historical daily report emails to fill the gap in my professional website’s blog content after 2012.

I had recently updated my professional website and reinstated older blog posts, revealing a significant chronological gap. This gap represented work that would be valuable to search indexes for people seeking technical expertise in areas I’ve worked on. The challenge was that I didn’t want to create all this content from scratch manually. Instead, I wanted to leverage the documentation I had already created.

Throughout much of my later work life, I’ve maintained detailed daily reports. For a particular internal R&D project called Visual Touchscreens (which ultimately didn’t reach commercialization), I had sent daily emails to myself, then used a .NET utility to scrape these from Outlook and generate Word documents. Two years’ worth of these documents were available. I could copy them into a GitHub branch and make them available to Claude Code for Web, without retaining them in that form in the long run.
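The original utility was .NET and isn’t reproduced here, but to give a feel for what that scraping step involves, here is a rough Python sketch of the same idea, assuming Outlook automation via COM (pywin32). The subject filter, folder, and output naming are hypothetical, and it writes plain text rather than Word documents for simplicity.

```python
# Rough sketch of scraping daily-report emails out of Outlook into files.
# Assumes a local Outlook install and pywin32; the "Daily Report" subject
# convention and output naming are hypothetical.
import re
from pathlib import Path

import win32com.client  # pip install pywin32

out_dir = Path("daily-reports")
out_dir.mkdir(exist_ok=True)

outlook = win32com.client.Dispatch("Outlook.Application")
inbox = outlook.GetNamespace("MAPI").GetDefaultFolder(6)  # 6 = olFolderInbox

for item in inbox.Items:
    subject = getattr(item, "Subject", "") or ""
    if "Daily Report" not in subject:
        continue
    received = item.ReceivedTime.strftime("%Y-%m-%d")
    safe_subject = re.sub(r"[^\w\- ]", "", subject).strip()
    (out_dir / f"{received} {safe_subject}.txt").write_text(
        item.Body, encoding="utf-8"
    )
```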

The task for Claude Code for Web was to read these Word documents and generate blog posts matching my existing content structure: organized by year, following specific naming conventions, with images placed in designated locations. I already had styling rules, stream-of-consciousness rewriting guidelines, and agent rules for categories and tags. The system had substantial context to work with.
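To make that target concrete, the structure the agent had to reproduce looks roughly like the following. The paths and file names here are purely illustrative, not the site’s actual conventions:

```
blog/
  2014/
    2014-03-05-depth-sensing-experiments.md
    images/
      2014-03-05-depth-sensing-experiments-rig.jpg
  2015/
    2015-06-12-visual-touchscreens-wrap-up.md
```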

This was deliberately not a typical coding task. It was a documentation-generation challenge that tested the agent’s ability to process unstructured content and produce structured output in accordance with existing patterns.

The Execution Process

I prepared a detailed prompt specifying the branch to work on, the location of the Word documents, what the documents contained, the desired blog post format, which images to include or exclude, general approaches, and writing context. This being my first time using the system, I wasn’t certain how it would handle the complexity.
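The exact prompt isn’t worth reproducing, but its shape was roughly the following; the wording and placeholders below are a reconstruction, not the prompt I actually used:

```
Work from the branch visual-touchscreens-blog.
The Word documents in <folder> are daily report emails from the Visual
Touchscreens R&D project, covering roughly two years of work.
Generate blog posts matching the existing content structure: organized by
year, following the existing naming conventions, with images copied to the
designated locations and referenced in the markdown.
Include or exclude images according to <criteria>.
Follow the existing styling rules, stream-of-consciousness rewriting
guidelines, and agent rules for categories and tags.
```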

The first challenge emerged immediately: the agent couldn’t locate the files. Despite multiple attempts and confirmation that the files existed in the branch (visual-touchscreens-blog), the system struggled to find the branch I had specified. After correcting what appeared to be a branch path issue, the agent finally found the documents and began formulating a plan. It created its own working branch and started processing.

The agent began converting the Word documents to text and extracting the images. However, I noticed it was including details about investor discussions from the daily reports. I hadn’t anticipated this; investor-related content wasn’t appropriate for technical blog posts. I instructed the agent to review all generated posts and remove any commentary on investor interactions, focusing strictly on technical content.
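As an aside, the conversion step itself is mechanically simple. I don’t know exactly how the agent did it, but a minimal Python sketch of the same step, using python-docx for the text and the fact that a .docx file is a zip archive for the images, would look something like this (file names and output locations are illustrative):

```python
# Minimal sketch: convert a .docx daily report to plain text and pull out
# its embedded images. Requires python-docx (pip install python-docx).
import zipfile
from pathlib import Path

from docx import Document

def extract_report(docx_path: Path, out_dir: Path) -> None:
    out_dir.mkdir(parents=True, exist_ok=True)

    # Plain text: concatenate the document's paragraphs.
    doc = Document(str(docx_path))
    text = "\n".join(p.text for p in doc.paragraphs)
    (out_dir / f"{docx_path.stem}.txt").write_text(text, encoding="utf-8")

    # Images: a .docx is a zip archive with embedded media under word/media/.
    with zipfile.ZipFile(docx_path) as archive:
        for name in archive.namelist():
            if name.startswith("word/media/"):
                target = out_dir / f"{docx_path.stem}-{Path(name).name}"
                target.write_bytes(archive.read(name))

for path in Path("daily-reports").glob("*.docx"):
    extract_report(path, Path("extracted"))
```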

The system adapted to the investor-content correction mid-process and began editing the existing blog posts to remove the investment-related material. This raised another issue I hadn’t considered: the Visual Touchscreens work stopped in 2015, but the daily reports didn’t document why. A blog post series covering this work would need some explanation for the abrupt conclusion.

I provided the agent with context:

I stopped the Visual Touchscreens work because it was too expensive, the hardware wasn’t mature enough for commercialization, and some patents had emerged that could potentially conflict with the approach. I then instructed the agent to research the challenges of hardware development in early-stage ventures and create an additional blog post explaining why the project concluded.

The Image Challenge

After the agent reported completion of 15 blog posts, I reviewed the output and discovered the images hadn’t been properly integrated. The agent had noted the images but hadn’t completed the full scope of work: copying images to the correct locations (not temporary folders), renaming them according to existing conventions, and inserting them at appropriate points in the markdown content.

I clarified that my expectation was for the task to be fully executed, not just notes on what needed to be done. The agent acknowledged the oversight and completed the image work, ultimately delivering 16 blog posts spanning 2014 and 2015 with 59 images referenced.

Results and Evaluation

Rather than creating a pull request immediately, I pulled the changes locally to review the actual output. The agent had generated 10 blog posts for 2014 and 6 for 2015, including a conclusion post that wasn’t in the original daily reports.

The conclusion post was overly verbose and required manual editing. The other posts looked reasonable but would need a thorough review. This is the fundamental reality of working with AI-generated content: you cannot rely on the system to avoid errors or hallucinations. The goal was never fully automated content generation. The goal was bulk initial work that reduces manual effort, with human review and editing as the essential final step.

The entire process consumed approximately $4 worth of credit. With $1,000 available over roughly two weeks, that’s substantial spare capacity for additional work and experiments.

Key Insights

The experiment proved valuable for understanding both capabilities and limitations. Claude Code for Web successfully processed unstructured Word documents, extracted relevant technical content, generated structured markdown following existing patterns, handled images, and even conducted web research to create contextual content that wasn’t in the source material.

The limitations were equally instructive. The agent initially struggled to locate the branch and files I had specified. It required a mid-process correction to filter out investor-related content. And the scope of work needed explicit reinforcement to ensure complete execution rather than partial completion with manual work left over.

The most important insight remains the same for any AI tool: the output requires human verification. The system dramatically reduces the initial labor of content creation, but publishing AI-generated content without thorough review would be irresponsible. The generated blog posts now make historical technical work discoverable through search indexes, demonstrating expertise in depth sensing technology, industrial visualization, and hardware development that would otherwise remain hidden in archived daily reports.

That represents real-world value for me from the experiment. There is a whole category of work that would not be undertaken because the initial hurdle is too large or intimidating, and AI tooling helps with that. Yes, the wording wouldn’t be the same if I wrote it myself, but in this case, it doesn’t matter to me. What matters to me is that my content, which shows my experience and expertise in a given area, is now more likely to be brought to the attention of people interested in that subject.

Here are links to the results of the exercise, after a read-through and additional editing on my part:

2014:

2015:

Related (not part of Claude Code experiment):