AI & Automation · 9 min read · May 28, 2026 · Updated July 17, 2026

How the Mochify MCP Server Works: Hosted vs Local, with Worked Examples

Q: Does using the Mochify MCP server cost more tokens than the web app?

The web app costs no agent tokens at all. Among agent-driven workflows: hosted MCP URL input is now nearly free in both directions because compressed results return as a download URL rather than inline bytes. Hosted MCP upload input is expensive on the way in but cheap on the way back (a URL string). The local install remains the cheapest of all because no image bytes ever enter the agent's context in either direction.

The bridge between your AI agent and our compression engine comes in two distinct flavours. Here's how each one works, where each one belongs, and the part nobody else writes down: where the MCP server actually saves you tokens, and where it doesn't.

This guide covers the hosted MCP server at mcp.mochify.app and the local MCP server via mochify serve - their architecture, setup, token-cost profiles, retention behavior, and four concrete worked examples.

30-Second Cheat Sheet

If your goal is the lowest possible token cost, use either local mode or hosted-MCP URL input. The hosted MCP now returns compressed results as a short-lived download URL rather than inline binary - which fixes the old return-side problem but means the compressed result is briefly held in Mochify-controlled storage.

Workflow	Image bytes go to	In agent context?	Best for
Hosted MCP, URL input	`api.mochify.app`	Neither (URLs in, URL out)	Public or CDN-hosted images you want as AVIF, WebP, or JXL
Hosted MCP, uploaded image	`api.mochify.app`	Input only (output returns as URL)	Small images already in a conversation
Local MCP (`mochify serve`)	`api.mochify.app` via local binary	Neither (paths only)	Claude Desktop, Cursor, file work on disk
Direct CLI (`mochify`)	`api.mochify.app` via local binary	Neither (paths only)	Build pipelines, Claude Code, batch jobs, scripts

The Two Surfaces of the Mochify MCP Server

Mochify exposes its compression engine to AI agents through two separate surfaces. Both ultimately call the same hosted API at api.mochify.app/v1/squish, where the encoding work happens in RAM and the original is discarded immediately. They differ in where the agent's request originates, whether image bytes ever pass through the agent's context window, and whether the compressed result is briefly held server-side for pickup.

The hosted MCP server lives at https://mcp.mochify.app. Any MCP-compatible client - Claude Desktop, Cursor, any agent runtime that supports remote connectors - can register it, authenticate via OAuth, and call its tools. The agent sends a request, Mochify's hosted MCP forwards it to the API, and the processed image comes back as a short-lived download URL on files.mochify.app.

The local MCP server is a single Rust binary (mochify) that ships via Homebrew, raw download, or cargo install. It runs as a direct CLI in your shell, and the same binary runs as a local MCP server when launched with mochify serve and registered in your Claude Desktop config. In both local modes, the agent sees only file paths and metadata - image bytes flow from your disk to the API via the local binary and never enter the agent's context window.

You don't pick one over the other for life. You pick the right one for the job at hand. The rest of this guide gets specific about which is which.

How the Hosted MCP Server Works

The hosted server is a connector that any MCP-compatible AI assistant can register and call. Once OAuth is complete, the agent has two tools available: squish for image processing, and check_usage for quota checks.

Setup, in one paragraph

In Claude Desktop, go to Settings → Connectors, add https://mcp.mochify.app, then complete an OAuth flow that authorises Mochify to process images on your behalf. Cursor and other clients follow the same shape. The full step-by-step is in our MCP setup guide. Authentication is OAuth-based, the connector talks to our servers over HTTPS, and the agent doesn't need to manage an API key directly.

The `squish` tool

The squish tool accepts either a public HTTPS URL to fetch the image from, or raw base64 image bytes plus a media type. It returns a short-lived download URL for the processed image - a five-minute TTL on files.mochify.app, with an explicit "expires in ~5 minutes" note in the response so the agent and any downstream tool know exactly when the link goes dead.

JPEG is the default output (encoded with Google's jpegli encoder, which delivers roughly 35% better compression than libjpeg-turbo at matched quality). AVIF, JPEG XL, WebP, and PNG are also supported. Options include resize, smart-crop to a subject, EXIF stripping, brightness, clarity, smartCompress, HDR gain-map preservation, and AI background removal.

Retention, honestly

The original image you send to Mochify is streamed into the encoder in RAM, processed, and discarded. There are no disk writes of the source, and no logs containing image data.

What URL passback adds is a brief pickup window for the compressed result. After encoding, Mochify holds the processed bytes in a pickup store keyed by an unguessable hash, with a five-minute TTL. After five minutes the result is evicted regardless of whether anything fetched it.
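For scripts that pick up the result some time after the tool call returns, it can help to check the pickup window before fetching. Here is a minimal sketch, assuming the caller recorded the moment the response arrived as epoch seconds. The function name and the 30-second safety margin are illustrative choices, not part of Mochify's API; the 300-second default is simply the five-minute TTL described above.

```shell
# Hypothetical helper: succeed only while a pickup link is still worth fetching.
#   $1 = epoch seconds when the tool response arrived
#   $2 = current epoch seconds (defaults to now)
#   $3 = TTL in seconds (defaults to 300, the five-minute window)
#   $4 = safety margin in seconds (assumed default: 30)
link_still_live() {
  [ $(( ${2:-$(date +%s)} - $1 )) -lt $(( ${3:-300} - ${4:-30} )) ]
}
```

A caller might write `link_still_live "$issued" || echo "link expired, re-run squish"`. The margin is there because the response only promises "~5 minutes", so treating the last few seconds as already expired is the safer reading.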

This is a softening of the original "wiped immediately" claim on the compressed-output side. We'd rather describe it accurately than carry a phrase that no longer matches what the server does. The two local install workflows don't use the pickup store at all - the local binary receives the compressed bytes from the API and writes them straight to your disk. If you need the strongest possible end-to-end retention story, use a local mode or self-host via our Docker guide.

Where the agent's tokens go

Now that the hosted MCP returns a download URL instead of an inline image, the token-cost story has shifted. URL input is genuinely cheap on both sides: a URL in, a URL out. Upload input is still expensive on the way in because the image is base64-encoded into the tool-call payload, but the return is a short text URL rather than a binary blob - a one-way cost rather than two-way.

When you use the hosted MCP through Claude, the only thing that comes back through the agent provider's systems is the download URL string, unless the agent itself fetches the file and brings the bytes into the conversation. The compressed image enters the chat provider's infrastructure only if the agent decides it needs it.

How the Local Install Works

The local install is a single Rust binary that runs as either a direct CLI or as a local MCP server that Claude Desktop spawns as a subprocess. In either mode, the agent sees only file paths and compression metadata - never image bytes, and nothing held server-side after the compressed result is written to your disk.

Installation and auth

macOS (Homebrew):

brew tap getmochify/mochify
brew install mochify

One auth step covers both CLI and MCP modes:

mochify auth login

This opens your browser, you sign in with your Mochify account, and credentials are written to ~/.config/mochify/credentials.toml. No environment variables to manage, no API key to copy and paste.

Direct CLI mode

The CLI takes one or more file paths and a set of options. The -p flag is the headline feature - describe the goal in plain English:

mochify photo.jpg -p "optimize for eBay"
mochify *.heic -p "convert to WebP, 1200px wide, strip EXIF" -o ./out

It also accepts file paths on stdin, which is what makes it slot cleanly into any Unix pipeline:

find . -name "*.jpg" | mochify -t webp -o ./out
ls *.heic | mochify -t jpg

Flag	What it does
`-t, --type <FORMAT>`	Output format: `jpg`, `png`, `webp`, `avif`, `jxl`
`-w, --width <N>`	Target width in pixels
`-H, --height <N>`	Target height in pixels
`--crop`	Saliency-guided crop to exact dimensions
`-o, --output <DIR>`	Output directory (defaults to input dir)
`-n, --name <NAME>`	Base name for the output file
`-r, --rotation <DEG>`	Rotate by 0, 90, 180, or 270 degrees
`--clarity`	Midtone contrast enhancement
`-p, --prompt <TEXT>`	Natural-language prompt that resolves all params automatically
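Because the CLI reads paths on stdin, any filter that prints one path per line can decide which files get processed. The sketch below is one such filter, not a Mochify feature: it lists only JPEGs above a size threshold so that already-small files are skipped. The function name and the 200 KiB default are assumptions made for illustration.

```shell
# Print paths of JPEG files under DIR that are larger than MIN_KIB kibibytes,
# one per line, ready to pipe into a tool that reads paths on stdin.
# usage: large_jpegs DIR [MIN_KIB]
large_jpegs() {
  find "$1" -type f \( -name '*.jpg' -o -name '*.jpeg' \) -size +"${2:-200}"k
}

# Example (not run here), using only flags from the table above:
# large_jpegs ./uploads 200 | mochify -t webp -o ./out
```

Keeping the selection logic in `find` means the same `mochify` invocation works unchanged whether it receives three paths or three thousand.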

Local MCP server mode

Same binary, different invocation. Add this to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "mochify": {
      "command": "mochify",
      "args": ["serve"]
    }
  }
}

Restart Claude Desktop. The squish tool now appears in the connections panel and uses the credentials from your earlier mochify auth login. No OAuth round-trip, no remote connector to refresh, no token expiry to worry about.

When to Use Which

Pick by what's true about your task, not by which tool you've heard of.

If…	Use
Someone shared an image URL and you want it as AVIF or WebP	Hosted MCP (URL input)
You want to drag a small image into a chat and have it compressed	Hosted MCP (upload) - see Worked Example 2
You're in Claude Desktop and images live on your laptop	Local MCP server (`mochify serve`)
You're in Claude Code working in a project repo full of images	Direct CLI
You're building a static site and want compression as a build step	Direct CLI (no agent needed)
Your agent is generating images and needs to compress before commit	Direct CLI from the script
You want zero image bytes in any agent's context window	Either local mode
No developer environment, want the simplest possible AI workflow	Hosted MCP with a URL

Ready to connect Mochify to your AI workflow?

MCP and API access are included on every plan, starting at Free. No paywall on developer or agent features.


Worked Example 1: Hosted MCP with a Public URL

URL-input mode is the cheapest hosted MCP workflow: no image bytes touch the agent's context window in either direction - a URL string in, a URL string out.

A blogger is in Claude Desktop with the Mochify connector registered. They say: "Convert this hero image to AVIF and resize to 1200px wide for my blog: https://example.com/photo.jpg"

1 Claude maps the request via Magic Flow to squish({ url: "https://example.com/photo.jpg", type: "avif", width: 1200 }).
2 Mochify's hosted server fetches photo.jpg from example.com, runs it through the AVIF encoder in RAM, and parks the compressed result in its pickup store.
3 The tool response carries a short-lived download URL on files.mochify.app (valid for five minutes). Claude fetches the URL to show you the result or hands you the link directly.

Token cost: just the URL strings and options in both directions. The original image bytes never touch the agent's context, and the compressed bytes only reach the chat if Claude chooses to fetch the result for display. This is the cheapest of the hosted-MCP workflows by a wide margin.

For format context: web.dev's AVIF compression study places AVIF at 40–50% smaller files than JPEG at matched visual quality. Browser support sits at over 95% globally per caniuse.com/avif, so the output is safe to serve as the primary format on most modern sites.

Worked Example 2: Hosted MCP with an Uploaded Image

This was historically the workflow most people tried first, and it disappointed in three independent ways. One of those three is now fixed. We're going to be just as direct about what's solid as we were about what wasn't.

Failure 1 (still applies): the image you uploaded isn't the image Mochify receives.

When you attach a file to a chat, the client's vision pipeline resizes it - typically to around 1568 pixels on the long edge - and frequently re-encodes it before the model can see anything. The bytes the model packages into the tool call are that vision-processed copy, not your original. For a screenshot you're sending around at thumbnail size this is fine. For a hero image where pixel-level quality matters, it isn't.

Failure 2 (still applies): the base64 round-trip is size-capped.

The full image, base64-encoded into JSON, has to fit inside the MCP tool-call payload, and base64 inflates bytes by roughly a third. Anything much larger than a small screenshot is at risk of being truncated or rejected outright.
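The "roughly a third" figure follows from how base64 works: every three input bytes become four output characters, with the final group padded. A small sketch of that arithmetic is below; the function name is ours, and the actual payload cap is not stated here, so the sketch only estimates the encoded size.

```shell
# Length in characters of the padded base64 encoding of N raw bytes:
# every 3 input bytes (rounded up to a full group) become 4 output characters.
# usage: base64_len N_BYTES
base64_len() {
  echo $(( ($1 + 2) / 3 * 4 ))
}

base64_len 300000   # prints 400000
```

So a 300,000-byte screenshot occupies 400,000 characters of tool-call payload before any JSON overhead is counted.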

Failure 3 (fixed): returning the compressed image inline used to be unreliable.

Mochify's hosted MCP previously returned the processed image inline as binary content, which forced the chat client to render or save a binary blob from a tool response. Support varied wildly, and in practice the tool call frequently hung or produced a result the user couldn't save. We've now switched to URL passback: the response is a text URL on files.mochify.app, valid for five minutes. Chat clients handle a URL string without difficulty.

Where this leaves us: the drag-into-chat workflow is now genuinely usable for small images where the quality hit from chat-client upload pre-processing doesn't matter - a screenshot you're prepping for a Notion page, a thumbnail you want as AVIF, a quick conversion of a small graphic. It is not the right workflow for a hero image, a product photo, or anything where you'd want pixel-accurate output. For those, use Worked Example 1 (URL input) or Worked Example 3 (local MCP server).

A worked run: drag a small PNG screenshot into Claude with "Make this web-ready." Claude calls squish({ data: "<base64...>", mediaType: "image/png", optimizeForWeb: true }). Magic Flow infers AVIF output with sensible web-ready defaults and EXIF stripping. Mochify processes the vision-processed copy of your screenshot and returns:

Image processed successfully (image/avif, 12.7 KB).
Download URL (expires in ~5 minutes): https://files.mochify.app/629b...d46.avif

Claude either fetches that URL and shows you the result, or hands you the link directly. The compressed file lives on files.mochify.app for the rest of the five-minute window and then disappears.
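If a script rather than a chat client receives that response text, the link has to be pulled out before it can be fetched. The sketch below assumes the response looks like the sample above; the pattern is inferred from that sample rather than from a documented format, and the function name is hypothetical.

```shell
# Read a tool response on stdin and print the first files.mochify.app link in it.
# Prints nothing if the response contains no such link.
extract_download_url() {
  grep -oE 'https://files\.mochify\.app/[^[:space:]]+' | head -n 1
}

# Hypothetical usage (not run here): save the result before the link expires.
# url=$(printf '%s\n' "$response" | extract_download_url)
# curl -fsSL -o result.avif "$url"
```

Fetching promptly matters more than parsing cleverly here: once the five-minute window closes, the only recovery is to run the tool call again.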

Worked Example 3: Local MCP Server in Claude Desktop

The local MCP server is the right surface for chat-driven file work: image bytes flow directly from disk to the API and back to disk, no agent context touched, no pickup store involved.

"Convert all the JPEGs in /Users/sam/Desktop/product-shoot/ to AVIF at 1200px wide and save them to /Users/sam/Desktop/product-shoot/web/"

1 Claude picks up that this is an image-processing task and finds the squish tool on the locally-registered mochify MCP server.
2 Claude calls squish once per file, passing the full file path plus the resolved parameters from Magic Flow.
3 The local mochify binary, running as a subprocess, opens each file from disk, sends the bytes to api.mochify.app/v1/squish, receives the compressed result, and writes it to the requested output directory.
4 The MCP tool response back to Claude contains the saved path, original size, and new size. No image bytes.
5 Claude summarises: "Compressed 24 product photos. Originals 38.7 MB total; AVIF outputs 9.1 MB total. Average saving 77%. Files in ~/Desktop/product-shoot/web/"

Token cost: the agent never holds image bytes - paths and metadata only. No pickup store on Mochify's side either: the compressed bytes flow straight from the API back to disk via the local binary. Setup was a one-time brew install mochify, mochify auth login, and a short JSON entry in the config file.

Worked Example 4: Direct CLI in Claude Code or a Build Pipeline

The direct CLI is the cheapest workflow per image by orders of magnitude - zero image bytes in any agent context, works with no agent at all, and slots cleanly into any Unix pipeline or CI step.

A developer working on a Next.js site opens Claude Code and asks: "Optimize everything in public/images to AVIF and WebP, then add a preload tag for the hero image."

1 Claude reads the directory listing (just filenames and sizes, not pixels).

2 Claude runs two shell commands to produce both formats:

mochify ./public/images/*.{jpg,png} -p "web-ready AVIF, max 1600px wide" -o ./public/images
mochify ./public/images/*.{jpg,png} -t webp -w 1600 -o ./public/images

3 The CLI processes the folder in place. Each invocation prints a summary line per file plus a totals line.
4 Claude parses the summaries, then edits app/page.tsx to add the preload tag and the <picture> element with the AVIF / WebP / JPEG fallback chain.
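The fallback chain in step 4 depends on source order: a browser uses the first `<source>` whose type it supports and otherwise falls back to the `<img>`. The sketch below prints that markup as plain HTML for a given image base path. The function name, the paths, and the assumption that all three files share one base name are illustrative; in a .tsx file the tags would need JSX syntax (self-closing tags, `srcSet`).

```shell
# Print a <picture> element that offers AVIF first, then WebP, then JPEG.
# usage: picture_markup BASE_PATH_WITHOUT_EXTENSION ALT_TEXT
# Note: ALT_TEXT is inserted as-is, so it must not contain double quotes.
picture_markup() {
  cat <<EOF
<picture>
  <source srcset="$1.avif" type="image/avif">
  <source srcset="$1.webp" type="image/webp">
  <img src="$1.jpg" alt="$2">
</picture>
EOF
}
```

Calling `picture_markup /images/hero "Hero image"` emits the five-line element with AVIF listed first, which is what lets supporting browsers pick the smallest file.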

Token cost: zero image bytes anywhere in the agent's context. Only paths, filenames, sizes, and the CLI's text output.

The same pattern works without an agent at all. A content engineer building an unattended publishing pipeline:

# Generate the hero, optimize to AVIF + WebP, then commit.
# Stop at the first failing step so a failed encode is never committed.
set -euo pipefail
generate-hero-image "$DRAFT_PATH" /tmp/hero.png
mochify /tmp/hero.png -p "AVIF web-ready hero, 1600 wide" -o ./public/heroes/
mochify /tmp/hero.png -t webp -w 1600 -o ./public/heroes/
git add ./public/heroes/ && git commit -m "Add hero for $DRAFT_PATH"

Or the stdin pipe pattern for batch jobs:

find ./uploads -name "*.heic" | mochify -t jpg -o ./out

No upload step, no download step, no chat round-trip, no per-image agent token cost. For the full Claude-Code-specific walkthrough, see our guide to compressing images inside Claude Code. For container-based pipelines, see our self-hosting image optimization with Docker guide.

The Honest Answer on Token Cost

The hosted MCP server used to be a workflow saver but not a token saver, because compressed results returned inline as binary. With URL passback, that's no longer true on the return side. The local install remains the cheapest workflow per image, because no image bytes ever enter the agent's context in either direction. For the underlying numbers behind these tradeoffs, see our breakdown, "How many tokens does an image use?"

The hosted MCP server with URL input is now genuinely cheap both ways - a URL string in, a URL string out. Upload input is still expensive on the way in because the image is base64-encoded into the tool-call payload, but the return is a short text URL rather than a binary blob; it's a one-way cost rather than a two-way one.

The local install (CLI or mochify serve) is still the cheapest workflow per image. Nothing about the image ever enters the agent's context - not on the way in, not on the way back. For batch workflows the difference is dramatic: compressing 100 images via the hosted-MCP upload path would push 100 base64 payloads through your agent's context; compressing them via the local install puts a handful of summary lines in.

The same paths-not-bytes principle carries over to documents: you can extract, split, and convert PDF pages in an agent pipeline without ever putting page bytes in the agent's context either.

The practical rule

A few chat-driven image edits per week with no install - use the hosted MCP and don't worry about it.
Chat-driven work where the files are already on your laptop - install the local MCP server (mochify serve).
Anything resembling a pipeline (build step, content workflow, batch processing, repo-wide cleanup, agentic content generation) - use the direct CLI.

Benchmark: 1600px PNG source, three formats

Output format	Quality setting	File size	Reduction vs. source
PNG (source)	—	3.1 MB	—
JPEG (jpegli)	80	193 KB	−94%
WebP	80	122 KB	−96%
AVIF	60	80 KB	−97%

Source is a lossless 1600px PNG - the large reductions reflect PNG→lossy format conversion as well as compression. For a JPEG-to-JPEG comparison, Google's jpegli announcement reports 35% better compression than libjpeg-turbo at matched quality, and web.dev's AVIF benchmarks place AVIF 40–50% below comparable JPEG.
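The reduction column can be reproduced from the file sizes with one line of arithmetic. The sketch below assumes the 3.1 MB source is 3,100 KB; the function name is ours.

```shell
# Percentage reduction from ORIGINAL to COMPRESSED (same unit), rounded to a whole number.
# usage: reduction_pct ORIGINAL COMPRESSED
reduction_pct() {
  awk -v o="$1" -v n="$2" 'BEGIN { printf "%.0f\n", (o - n) / o * 100 }'
}

reduction_pct 3100 193   # JPEG row: prints 94
reduction_pct 3100 122   # WebP row: prints 96
reduction_pct 3100 80    # AVIF row: prints 97
```

The same function works on the byte counts a batch run reports, as long as both arguments use the same unit.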

FAQ

Does using the Mochify MCP server cost more tokens than the web app?

The web app costs no agent tokens at all - it's just you and a browser. Among agent-driven workflows: hosted MCP URL input is now nearly free in both directions. Hosted MCP upload input is expensive on the way in (base64-encoded image bytes in the tool-call payload) but cheap on the way back (a URL string). The local install remains the cheapest of all because no image bytes ever enter the agent's context in either direction.

Can I use the MCP server on the Free tier?

Yes. MCP and API access are included on all tiers: Free (25 images/month), Seller, and Pro. The monthly image limit applies equally across the web app, the MCP server, and the API.

Does the hosted MCP server store my images?

The original you send is processed in RAM and discarded immediately after encoding - no disk writes, no logs containing image data. The hosted MCP does briefly hold the compressed result in a pickup store with a five-minute TTL so it can return a download URL. After five minutes the result is evicted. The local install workflows skip the pickup store entirely.

What's the difference between Magic Flow and the squish tool?

squish is the low-level tool the MCP server exposes - it takes explicit parameters: format, quality, dimensions, options. Magic Flow is the natural-language layer that lets you describe the goal in plain English and have the agent resolve the parameters automatically. Most users only ever interact with Magic Flow.

Can I run mochify without an agent?

Yes. mochify is a standalone Rust binary. It works from the shell, in scripts, in CI, in Docker - anywhere you can run a single static binary. Many users run the CLI directly as a build step with no agent involved at all.

Which formats does squish actually output?

JPEG (encoded with jpegli for ~35% smaller files at matched quality), AVIF, JPEG XL, WebP, and PNG. AVIF is our default recommendation for web delivery; jpegli-encoded JPEG is the broad-compatibility fallback.

Does the MCP server support background removal and smart crop?

Yes. squish exposes removeBackground, crop (smart crop to the subject), smartCompress (saliency-guided quality selection), brightness, clarity, stripExif, rotate, and HDR gain-map preservation. Background removal requires a Seller or Pro plan; the rest are available on all tiers.

Can I use the MCP server through Cursor or other MCP clients?

Yes. For the hosted server, any client that supports remote MCP connectors over HTTP with OAuth can register https://mcp.mochify.app. For the local server, any client that supports the standard stdio MCP pattern - Cursor, Continue, Cline, Claude Code with custom config, and others - can spawn mochify serve as a subprocess. Mochify is also listed on the Smithery and Glama MCP marketplaces.

