Why Not Just Use fetch()?
The Honest Answer
You've probably looked at StripFeed and thought: "Why would I pay for this? I can just fetch() the URL myself."
That's a fair question. And honestly, for simple cases, you're right. fetch() works. You hit a URL, you get HTML back, you feed it to your LLM. Job done.
Until it isn't.
The gap between "fetch a URL" and "give my AI agent clean, token-efficient content" is wider than it looks. It's not one problem. It's nine problems stacked on top of each other. And each one is just annoying enough that you'd rather not solve it yourself.
What fetch() Actually Returns
Let's say your AI agent needs to read a blog post. Here's what you get:
const response = await fetch("https://example.com/blog/great-article");
const html = await response.text();
// Now what?
That html variable contains the article, sure. But it also contains:
- <nav> with 40+ navigation links
- <script> tags for analytics, ads, and tracking
- <style> blocks with hundreds of lines of CSS
- Cookie consent banners
- Social sharing widgets
- Related article sidebars
- Footer with sitemap links
- <meta>, <link>, and <head> noise
- Inline SVGs for icons
Here's what that looks like in tokens:
| What you get | Tokens | Useful for your LLM? |
|---|---|---|
| Navigation + header | ~2,400 | No |
| Scripts + tracking | ~3,800 | No |
| CSS + styles | ~2,200 | No |
| Sidebar + related posts | ~1,600 | No |
| Footer + cookie banner | ~1,200 | No |
| The actual article | ~3,100 | Yes |
| Total raw HTML | ~14,300 | |
That's 78% noise. Your agent is spending tokens reading <div class="ad-wrapper"> and onclick="gtag('event', 'click')" instead of the content it actually needs.
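Where does that 78% come from? It's just the table above (illustrative estimates, not live measurements):

```python
# Noise ratio from the example token breakdown above.
total_tokens = 14_300   # full raw HTML
useful_tokens = 3_100   # the actual article
noise_ratio = (total_tokens - useful_tokens) / total_tokens
print(f"{noise_ratio:.0%} of the raw HTML is noise")  # prints "78% of the raw HTML is noise"
```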
Why This Matters (Even on a Subscription)
"But I'm on Claude Pro / ChatGPT Plus. I don't pay per token." Fair. But wasted tokens still hurt you in three ways:
- Pay-per-token (API): You're paying for every token. At 78% noise, your bill is roughly 4-5x higher than it needs to be. This is the most obvious case.
- Subscription plans: You don't pay per token, but you have usage limits. Fewer wasted tokens per request means your agent runs longer before hitting that limit. Instead of pausing every few hours, it keeps going.
- Context window quality: This one matters regardless of how you pay. When your agent's context is full of navigation HTML, cookie banners, and tracking scripts, it has less room for actual content. The model's responses get worse because signal-to-noise ratio drops. Clean Markdown means your agent understands the page better and produces better results.
The 9-Step DIY Pipeline
To go from fetch() to clean, LLM-ready content, here's what you'd need to build:
1. HTML Parsing. You need a DOM parser. In Node.js that means jsdom (heavy, 2MB+) or linkedom (lighter). In Python, BeautifulSoup or lxml. Each has quirks with malformed HTML.
2. Content Extraction. The hard part. You need to identify which part of the page is "the content" and which is chrome. Mozilla's Readability algorithm does this, but it's a non-trivial dependency to integrate and configure. It doesn't work on every site.
3. Noise Removal. Even after extraction, you'll find leftover junk: empty links, tracking pixels disguised as images, inline scripts, hidden elements. You need custom rules for these.
4. Markdown Conversion. HTML to Markdown sounds simple until you handle nested lists, code blocks with language hints, tables, and edge cases like <pre> inside <blockquote>. Libraries like Turndown help, but need custom rules to produce clean output.
5. Token Counting. Different models use different tokenizers. If you care about costs (and you should), you need tiktoken or js-tiktoken with the right encoder for your model.
6. Smart Truncation. If the content exceeds your token budget, you can't just slice the string. You'll cut words, sentences, or even Markdown syntax in half. You need paragraph-boundary or sentence-boundary truncation.
7. Caching. You don't want to re-fetch and re-process the same URL every time your agent encounters it. That means Redis, Memcached, or at minimum a filesystem cache with TTL expiry.
8. Rate Limiting. Hammering a site with rapid requests gets you blocked. You need backoff logic, request queuing, or rate limiting on your own code.
9. Edge Cases. Timeouts (some sites take 10+ seconds). Redirects (HTTP to HTTPS, www to non-www, paywall redirects). Sites that already serve Markdown (like some Cloudflare-enabled sites responding to Accept: text/markdown). Sites with JavaScript-rendered content. 403 responses. Gzipped responses. Character encoding issues.
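Most of these steps are a dozen lines each once you know what to write. Step 8, for example, might be sketched like this in Python (the function name and defaults are illustrative, not any library's API):

```python
import random
import time

def fetch_with_backoff(fetch_fn, url, max_attempts=4, base_delay=0.5):
    """Retry a flaky fetch with exponential backoff plus jitter.

    `fetch_fn` is any callable that raises on failure
    (e.g. a requests.get wrapper). Delays grow 0.5s, 1s, 2s, ...
    """
    for attempt in range(max_attempts):
        try:
            return fetch_fn(url)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of retries; let the caller handle it
            # Exponential backoff with jitter to avoid hammering the site
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```

A real version would also distinguish retryable failures (timeouts, 429s, 503s) from permanent ones (404s) instead of catching everything.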
Each of these is solvable. None of them is hard in isolation. But together, they're a week of work that has nothing to do with your actual product.
The Code Comparison
Here's the DIY approach:
TypeScript (DIY)
import { JSDOM } from "jsdom";
import { Readability } from "@mozilla/readability";
import TurndownService from "turndown";
import { encodingForModel } from "js-tiktoken";

const encoder = encodingForModel("gpt-4o");
const turndown = new TurndownService({ headingStyle: "atx" });

async function fetchAndStrip(url: string, maxTokens?: number) {
  const response = await fetch(url, {
    signal: AbortSignal.timeout(9000),
    headers: { "User-Agent": "MyAgent/1.0" },
  });
  if (!response.ok) throw new Error(`HTTP ${response.status}`);

  const html = await response.text();
  const dom = new JSDOM(html, { url });
  const article = new Readability(dom.window.document).parse();
  if (!article?.content) throw new Error("Extraction failed");

  let markdown = turndown.turndown(article.content);

  // Basic noise cleanup
  markdown = markdown
    .replace(/\[]\(.*?\)/g, "") // empty links
    .replace(/!\[.*?\]\(data:.*?\)/g, "") // data-uri images
    .replace(/\n{3,}/g, "\n\n"); // excess newlines

  const tokens = encoder.encode(markdown);
  if (maxTokens && tokens.length > maxTokens) {
    // Naive truncation (won't respect paragraph boundaries)
    markdown = encoder.decode(tokens.slice(0, maxTokens));
  }

  return {
    markdown,
    tokens: tokens.length,
    title: article.title,
  };
}
That's ~35 lines, four dependencies, no caching, no rate limiting, and the truncation cuts mid-sentence.
TypeScript (StripFeed)
import StripFeed from "stripfeed";
const sf = new StripFeed("sf_live_your_key");
const result = await sf.fetch("https://example.com/article", {
maxTokens: 3000,
});
console.log(result.markdown);
A handful of lines. Caching, rate limiting, smart truncation, and token counting included.
Python (DIY)
import requests
from readability import Document
from markdownify import markdownify
import tiktoken

encoder = tiktoken.encoding_for_model("gpt-4o")

def fetch_and_strip(url: str, max_tokens: int | None = None) -> dict:
    resp = requests.get(url, timeout=9, headers={"User-Agent": "MyAgent/1.0"})
    resp.raise_for_status()

    doc = Document(resp.text)
    markdown = markdownify(doc.summary(), heading_style="ATX")

    tokens = encoder.encode(markdown)
    if max_tokens and len(tokens) > max_tokens:
        # Naive truncation (won't respect paragraph boundaries)
        markdown = encoder.decode(tokens[:max_tokens])

    return {
        "markdown": markdown,
        "tokens": len(tokens),
        "title": doc.title(),
    }
Python (StripFeed)
from stripfeed import StripFeed
sf = StripFeed("sf_live_your_key")
result = sf.fetch("https://example.com/article", max_tokens=3000)
print(result.markdown)
Same story: a few lines versus a custom pipeline.
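For comparison, the "smart truncation" both DIY versions skip isn't complicated in principle, just fiddly. A minimal paragraph-boundary version might look like this (a sketch: it uses a character budget as a stand-in for a token budget, and the function name is made up):

```python
def truncate_at_paragraph(markdown: str, max_chars: int) -> str:
    """Cut at the last paragraph break that fits the budget,
    so no sentence or Markdown construct is sliced in half."""
    if len(markdown) <= max_chars:
        return markdown
    cut = markdown.rfind("\n\n", 0, max_chars)
    if cut <= 0:
        return markdown[:max_chars]  # no boundary in range: hard cut
    return markdown[:cut]
```

A production version would work in token space (re-encoding each candidate cut point) and fall back to sentence boundaries inside very long paragraphs.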
What You Don't Get with fetch()
Beyond the basic extraction pipeline, StripFeed handles things that are genuinely hard to build yourself:
| Feature | DIY with fetch() | StripFeed |
|---|---|---|
| Content extraction | Build it yourself (Readability + custom rules) | Built-in |
| Markdown conversion | Configure Turndown/markdownify + edge cases | Built-in |
| Token counting | Add tiktoken, manage encoders | Every response includes token count |
| Smart truncation | Build paragraph/sentence boundary logic | max_tokens parameter, truncates cleanly |
| Caching | Set up Redis or similar | Built-in, configurable TTL (up to 24hr) |
| Batch processing | Write parallel fetch + error handling | POST /api/v1/batch, up to 10 URLs |
| CSS selector extraction | Integrate a DOM query layer | selector parameter |
| Multiple output formats | Build format converters | format=markdown\|json\|text\|html |
| Usage analytics | Build a logging pipeline + dashboard | Dashboard with per-key, per-model stats |
| Cost tracking per model | Manual calculation | Pass model parameter, see costs in dashboard |
When fetch() Is Actually Fine
Let's be real. You don't always need StripFeed.
fetch() is enough when:
- You're scraping specific sites you know well. A custom pipeline tuned to their HTML structure will always beat a generic solution. You know exactly where the content lives, what to strip, what to keep.
- You're fetching a handful of pages per day from sites you control
- The content is simple and well-structured (like an API docs page)
- You don't care about token optimization
- You're in a prototype phase and just need something working
StripFeed earns its keep when:
- Your agent receives arbitrary URLs from users or other systems and needs to handle whatever it finds. You can't write a custom scraper for every site on the internet.
- Your agent reads diverse sources (news sites, blogs, documentation, forums). Each has different HTML structure.
- You're processing hundreds or thousands of pages. Token savings compound fast. Using the example breakdown above (~11,200 noise tokens per page), 1,000 pages/day means saving on the order of 11M tokens daily.
- You care about cost tracking and want to know exactly what each URL costs per model.
- You need reliable extraction across the messy web, not just clean sites.
- You'd rather spend your engineering time on your actual product instead of maintaining a content extraction pipeline.
Try It Yourself
The fastest way to see the difference is the live demo. Paste any URL and compare the raw HTML tokens against the clean Markdown output.
Or grab a free API key and try it in your pipeline. 200 requests/month, no credit card:
curl "https://www.stripfeed.dev/api/v1/fetch?url=https://example.com" \
-H "Authorization: Bearer sf_live_your_key"
Sign up free and see what your agent has been missing.