Appearance
mere.run CLI
mere.run is the public command-line interface for the OSS mere.run package. It exposes a modality-first command tree for image, text, speech, vision, music, video, model management, local status snapshots, and local API serving.
If you are looking for the broader docs set, start at mere.run Documentation.
Overview
bash
swift run mere.run --helpPublic tree:
mere.run guidemere.run image generatemere.run image validatemere.run text chatmere.run text codemere.run text embedmere.run text anonymizemere.run speech synthesizemere.run speech transcribemere.run speech profile { list, create, delete }mere.run vision captionmere.run vision inspectmere.run vision groundmere.run vision segmentmere.run vision trackmere.run vision track-livemere.run vision ocrmere.run music generatemere.run video generatemere.run video export-latentsmere.run model { list, capabilities, info, pull, remove, repair-manifests }mere.run statusmere.run api servemere.run setupmere.run agent { onboard, install-pi, start }
Global model-store override
The CLI honors the shared models root override:
bash
swift run mere.run --models-root /Volumes/FastSSD/mererun-models model listThat is equivalent to setting:
bash
export MERERUN_MODELS_DIR=/Volumes/FastSSD/mererun-modelsCanonical managed model IDs
See model-sources.md for the full source story, including which IDs are pullable from Hugging Face. The most common managed IDs are:
- Images:
image-klein-nano,image-klein-base,image-klein-max,image-zimage-nano,image-zimage-base,image-zimage-max,image-hidream-o1,image-hidream-o1-dev - Text chat:
text-chat-gemma4,text-chat-mebot,text-chat-psi-agent,text-chat-q35,text-chat-q35-nano - Text code / agents:
text-agent-qwen35-9b,text-code-qwen3 - Text embed:
text-embed-qwen3-0.6b - Text anonymize:
text-anonymize-privacy-filter - Speech TTS:
speech-tts-qwen3-nano,speech-tts-qwen3-customvoice - Speech ASR:
speech-asr-qwen3,speech-asr-parakeet - Vision OCR:
vision-ocr-lighton - Vision segmentation / tracking:
vision-segment-sam31 - Vision grounding:
vision-ground-falcon-perception - Music:
music-acestep - Video:
video-ltx-av
For subsystem-specific implementation guides, see:
Common workflows
Pull and inspect models
bash
swift run mere.run model list
swift run mere.run status
swift run mere.run model capabilities
swift run mere.run model pull image-zimage-nano
swift run mere.run model info image-zimage-nanoGenerate an image
bash
swift run mere.run image generate \
--prompt "a ceramic mug in soft morning light" \
--output ./mug.pngChat locally
bash
swift run mere.run text chat \
--prompt "Explain classifier-free guidance."Generate speech and transcribe it back
bash
swift run mere.run speech synthesize \
"Hello from mere.run" \
--output ./hello.wav
swift run mere.run speech transcribe ./hello.wav --backend autoInspect, segment, track, and OCR
bash
swift run mere.run vision inspect ./diagram.png "What does this diagram show?"
swift run mere.run vision segment ./photo.jpg --prompt "a cat"
swift run mere.run vision track ./clip.mp4 --prompt "a cat"
swift run mere.run vision ocr ./page.png --backend lightonGenerate music
bash
swift run mere.run model pull music-acestep
swift run mere.run music generate \
"upbeat electronic groove" \
--output ./track.wavGenerate video
bash
swift run mere.run video generate \
"a cinematic drone flythrough over snowy mountains" \
--variant unified-av \
--model-root ~/Library/Application\ Support/MereRun/models/video-ltx-av \
--output ./clip.mp4Serve a local API
bash
swift run mere.run api serve --engine text-chat-gemma4Command reference
Model installation in the OSS repo is explicit. mere.run model pull uses cataloged Hugging Face snapshots only; local-path-only models must be supplied with command-specific --model or --model-root options. See configuration.md and model-sources.md.
mere.run image generate
Generate a PNG with a local image model.
bash
swift run mere.run image generate --prompt "<text>" [options]Key options:
--prompt: required text prompt--model: canonical model id or local model path--output: output PNG path--width,--height--steps: override the model-specific step default--cfg: override the model-specific CFG default--input: image-to-image source--ref-image: repeatable HiDream O1 reference image--keep-original-aspect: preserve one HiDream reference image's aspect ratio--strength: image-to-image strength--lora,--lora-scale--quiet
Examples:
bash
swift run mere.run image generate --prompt "a black cat on a red sofa"
swift run mere.run image generate --model image-zimage-nano --prompt "retro robot illustration" --output ./robot.png
swift run mere.run image generate --prompt "turn this into a pencil sketch" --input ./photo.png --strength 0.6
swift run mere.run image generate \
--model image-hidream-o1-dev \
--prompt "put this subject in a studio portrait" \
--ref-image ./subject.png \
--output ./portrait.pngmere.run image validate
Run advanced deterministic validation for the local image families.
bash
swift run mere.run image validate --family zimage --test all
swift run mere.run image validate --family klein --test vae --output ./validation_output
swift run mere.run image validate --save-reference
swift run mere.run image validate --compare --reference-dir ./validation_outputKey options:
--family:zimageorklein--test:vae,encoder,transformer,pipeline, orall--output--save-reference--compare--reference-dir
mere.run text chat
Run local text chat with the Gemma 4, Q35, or Psi family.
bash
swift run mere.run text chat --prompt "<text>" [options]Key options:
--prompt--system--model: canonical model id--model-root: explicit local model root--max-tokens--temperature--top-p--stream--thinking--stats--quiet
Examples:
bash
swift run mere.run text chat --prompt "What is classifier-free guidance?"
swift run mere.run text chat --model text-chat-q35-nano --prompt "Explain speculative decoding."
swift run mere.run text chat --stream --prompt "Write a short welcome message."
swift run mere.run text chat --thinking --stats --prompt "How would you design a tokenizer?"mere.run text code
Run local code generation with GGUF models through the vendored llama.cpp runtime.
bash
swift run mere.run text code --prompt "<text>" [options]Key options:
--prompt--model: GGUF file or canonical code model id if your local setup resolves it--stream--stats--temperature--top-p--max-tokens
Examples:
bash
swift run mere.run text code --prompt "Write a Swift function to reverse a string"
swift run mere.run text code --model ./Qwen3-Coder-Next-Q4_K_M.gguf --stream --prompt "Implement a trie in Rust"mere.run text embed
Generate embeddings with the native Qwen3 embedding model.
bash
swift run mere.run text embed "semantic search query"
swift run mere.run text embed "foo" "bar" --output embeddings.json --prettyKey options:
- positional text arguments
--model--max-tokens--output--pretty
mere.run text anonymize
Detect and redact PII with the native OpenAI Privacy Filter model.
bash
swift run mere.run text anonymize "My name is Alice Smith and my email is alice@example.com"
swift run mere.run text anonymize --json --pretty "Phone: 555-1234"
cat notes.txt | swift run mere.run text anonymize --output redacted.txtKey options:
- positional text arguments, or stdin when omitted
--model--max-tokens--replacement: template supporting{label}and{index}--json--output--pretty
mere.run speech synthesize
Generate speech from text with Qwen3-TTS.
bash
swift run mere.run speech synthesize "<text>" --output ./speech.wav [options]Key options:
--output: required--model: canonical speech TTS id or local model path--voice--mode:styleorclone--profile--ref-audio--ref-text--language--save-profile--temperature--stream--stream-chunk-tokens--quiet
Examples:
bash
swift run mere.run speech synthesize "Hello from mere.run" --output ./hello.wav
swift run mere.run speech synthesize "Welcome aboard" --voice "A calm British male voice" --output ./welcome.wav
swift run mere.run speech synthesize "Read this in my cloned voice" --mode clone --profile my-voice --output ./clone.wavmere.run speech transcribe
Transcribe or translate local audio with the speech backends.
bash
swift run mere.run speech transcribe <audio.wav> [options]Key options:
- positional audio path
--backend:auto,qwen, orparakeet--task:transcribeortranslate--model--language--max-tokens--stream--stream-chunk-ms--stream-decode-ms--no-timestamps--output--quiet
Examples:
bash
swift run mere.run speech transcribe ./audio.wav
swift run mere.run speech transcribe ./audio.wav --task translate --backend qwen
swift run mere.run speech transcribe ./audio.wav --stream --output ./transcript.txtmere.run speech profile
Manage reusable voice clone profiles.
Subcommands:
mere.run speech profile listmere.run speech profile createmere.run speech profile delete
Examples:
bash
swift run mere.run speech profile list
swift run mere.run speech profile create \
--name narrator \
--audio ./ref.wav \
--text "reference transcript"
swift run mere.run speech profile delete --id <uuid>mere.run vision caption
Generate captions for one or more images.
bash
swift run mere.run vision caption ./images/*.png
swift run mere.run vision caption ./images/*.png --output-dir ./captionsmere.run vision inspect
Ask a direct question about an image.
bash
swift run mere.run vision inspect ./diagram.png "What does this diagram show?"mere.run vision segment
Segment prompted objects in an image using the native SAM 3.1 runtime.
bash
swift run mere.run model pull vision-segment-sam31
swift run mere.run vision segment ./photo.jpg --prompt "a cat"Key options:
--prompt: one or more text object prompts--box: one or morex1,y1,x2,y2[,label]geometry prompts--point: one or morex,y,positive[,label]orx,y,negative[,label]geometry prompts--model: managed model id or local SAM 3.1 model root--output: annotated image path--json-output: metadata path--mask-output-dir: optional per-object mask export directory--threshold: score cutoff, default0.05--resolution--show-boxes--multimask: emit up to three candidates per geometry-prompted object
Defaults:
- annotated image:
<image-stem>_segmented.<ext> - JSON metadata:
<image-stem>_segmented.json
Notes:
- still-image runs accept text, box, and point prompts in the same invocation
--mask-output-dirwrites one PNG mask per exported detection candidate- empty detection sets still produce annotated output plus JSON metadata
Examples:
bash
swift run mere.run vision segment ./photo.jpg --prompt "a cat"
swift run mere.run vision segment ./photo.jpg --prompt "a person" "a phone" --show-boxes
swift run mere.run vision segment ./photo.jpg --box "120,80,420,760,person" --mask-output-dir ./masks
swift run mere.run vision segment ./photo.jpg --point "512,384,positive,person" --point "700,200,negative,person"
swift run mere.run vision segment ./photo.jpg --prompt "a dog" --output ./photo-segmented.png --json-output ./photo-segmented.jsonmere.run vision track
Track prompted objects through a video with the native SAM 3.1 runtime.
bash
swift run mere.run model pull vision-segment-sam31
swift run mere.run vision track ./clip.mp4 --prompt "a dog"Key options:
--prompt: one or more text prompts used to seed objects on the init frame--box: one or morex1,y1,x2,y2[,label]geometry prompts--point: one or morex,y,positive[,label]orx,y,negative[,label]geometry prompts--init-frame: starting frame index for seeding--end-frame: optional inclusive final frame index--output: annotated video path--json-output: tracking metadata path--mask-output-dir: optional per-frame mask export directory--threshold: score cutoff, default0.05--show-boxes--show-labels
Defaults:
- annotated video:
<video-stem>_tracked.mp4 - JSON metadata:
<video-stem>_tracked.json
Notes:
- text prompts seed objects on
--init-frame, then the native tracker reuses geometry prompts for later frames - box and point prompts seed explicit tracked objects directly on the init frame
--mask-output-dirwrites per-frame mask PNGs under frame-named subdirectories- empty prompt sets still produce an annotated video and JSON summary
Examples:
bash
swift run mere.run vision track ./clip.mp4 --prompt "a dog" --init-frame 12
swift run mere.run vision track ./clip.mp4 --box "40,50,120,180,dog" --box "200,80,320,260,person" --show-boxesmere.run vision track-live
Capture a camera clip and run native SAM 3.1 tracking over the recorded session.
bash
swift run mere.run vision track-live --output ./live.mp4 --prompt "a person"Key options:
--prompt: one or more text prompts used to seed objects from the init frame--camera: camera device index--duration-seconds--init-frame: initial frame index used to seed tracking--seed-search-frames: additional frames to search when the init frame finds no objects--output: annotated video path--json-output: tracking metadata path--threshold: score cutoff, default0.05--show-boxes--show-labels
Notes:
track-livecurrently records a camera clip first, then runs tracking over the recorded media- live tracking searches a short warm-up window after the init frame so startup exposure or motion blur does not silently produce an unsegmented output
- live mode accepts text prompts only in the current implementation
--outputis required;--json-outputis optional
mere.run vision ocr
Extract text from one or more images.
bash
swift run mere.run vision ocr <images...> [options]Key options:
--backend:lighton,glm, orcompare--model: path to the LightOn OCR root when using the LightOn backend--max-tokens--quiet
Examples:
bash
swift run mere.run model pull vision-ocr-lighton
swift run mere.run vision ocr ./page.png --backend lighton --model ~/Library/Application\ Support/MereRun/models/vision-ocr-lighton
swift run mere.run vision ocr ./page.png --backend glmmere.run music generate
Generate music from a caption and optional lyrics using the native ACE-Step pipeline.
bash
swift run mere.run music generate "<caption>" [options]Key options:
--output--checkpoints-root--lyrics--lyrics-file--duration--steps--use-lm--lm-subdirectory--text-subdirectory--seed--quiet
Environment:
MERERUN_MUSIC_ACESTEP_ROOT
Examples:
bash
swift run mere.run music generate "upbeat electronic groove" --output ./track.wav
swift run mere.run music generate \
"ambient piano and soft rain" \
--lyrics-file ./lyrics.txt \
--duration 8 \
--steps 4 \
--output ./ambient.wavmere.run video generate
Generate MP4 video with the native LTX pipelines.
bash
swift run mere.run video generate "<prompt>" [options]Key options:
--variant:distilledorunified-av--model-root--output--width,--height--num-frames--fps--seed--image--image-strength--quiet
Environment:
MERERUN_VIDEO_LTX_MODEL_ROOT
Examples:
bash
swift run mere.run video generate \
"a cinematic drone flythrough over snowy mountains" \
--variant unified-av \
--model-root ~/Library/Application\ Support/MereRun/models/video-ltx-av
swift run mere.run video generate \
"woman walking in neon rain" \
--image frame.png \
--output ./rain.mp4mere.run video export-latents
Run native distilled LTX denoising and export the final latent tensor.
bash
swift run mere.run video export-latents \
--model-root /path/to/distilled-ltx \
--output out.safetensors \
"a cinematic drone flyover at sunrise"mere.run model list
List all managed model IDs and whether they are installed.
bash
swift run mere.run model listmere.run status
Show a quick local snapshot: whether the API server answers, which model it reports as loaded through /v1/models, the active model-store path/source, and which managed models are installed in that store.
bash
swift run mere.run status
swift run mere.run status --host 127.0.0.1 --port 11434
swift run mere.run status --jsonUseful options:
--host: local API host to check, default127.0.0.1--port: local API port to check, default8080--api-key: bearer token for/v1/models, also read fromMERERUN_API_KEY--timeout-seconds: network probe timeout--json: emit a structured snapshot for scripts and agents
mere.run model pull
Download a managed Hugging Face snapshot into the local model store. The command checks the model capability catalog and available disk space before downloading so unsupported Macs do not pull models they cannot run and tight disks fail with a useful cache path.
bash
swift run mere.run model pull image-zimage-nano
swift run mere.run model pull --allUse --allow-unsupported only when you intentionally accept the runtime risk.
mere.run model capabilities
Show this Mac's supported models, recommended setup package, and a short summary of what each model does.
bash
swift run mere.run model capabilities
swift run mere.run model capabilities --allmere.run model info
Inspect a canonical model ID or a local model root.
bash
swift run mere.run model info image-zimage-nano
swift run mere.run model info /path/to/model/root --components
swift run mere.run model info text-chat-gemma4mere.run model remove
Delete an installed managed model by canonical ID.
bash
swift run mere.run model remove image-zimage-nano
swift run mere.run model remove image-zimage-nano --forcemere.run model repair-manifests
Write missing mererun_model.json manifests for known local model roots.
bash
swift run mere.run model repair-manifests
swift run mere.run model repair-manifests --dry-runmere.run api serve
Start an OpenAI-compatible local API server.
bash
swift run mere.run api serve [options]Current endpoint surface:
GET /healthGET /v1/modelsPOST /v1/chat/completions
Security defaults:
- loopback binds are local-first and do not require auth
- non-loopback binds require
--api-keyorMERERUN_API_KEY POST /v1/chat/completionsrequiresContent-Type: application/json--rate-limit-per-minuteapplies basic request throttling toPOST /v1/chat/completions- generation parameters are bounded before execution; for example,
max_tokensmust fit the configured context size - LoRA adapters for the API server are selected by the operator with
--lora; request bodies cannot provide local LoRA paths
Engine values:
text-codetext-chat-kleintext-chat-gemma4text-chat-q35text-chat-deepseek-v4-flash
OpenAI chat compatibility:
- DS4 raw-proxies the full
/v1/chat/completionsbody tods4-server. - Native engines decode the common OpenAI Chat request shape and reject unsupported high-impact fields with
invalid_request_error. max_completion_tokens,developermessages, function tools, image content parts, structured JSON mode, and streaming usage are capability-gated by engine.
Examples:
bash
swift run mere.run api serve
swift run mere.run api serve --engine text-chat-gemma4
swift run mere.run api serve --engine text-code --model ./Qwen3-Coder-Next-Q4_K_M.gguf
swift run mere.run api serve --host 0.0.0.0 --port 11434 --api-key "$MERERUN_API_KEY" --rate-limit-per-minute 120After starting a server, run swift run mere.run status from another terminal to confirm /health, /v1/models, and the served model.
mere.run setup
Choose the public onboarding path. The default interactive command offers the local Mere agent powered by Pi, a bring-your-own-agent handoff prompt, or manual commands.
bash
swift run mere.run setup
swift run mere.run setup --mode agent --agent-model small --dry-run
swift run mere.run setup --mode agent --agent-model tier --install --start
swift run mere.run setup --mode byoa
swift run mere.run setup --mode manualAgent model choices:
small:text-agent-qwen35-9b, a Qwen3.5 9B Q4 GGUF setup agent for 16 GB Macstier: the best supported local tier for this Mac, currently 9B, Q35 nano, Qwen3-Coder Next, or DeepSeek V4 Flash on 96 GB+ Macspremier:text-agent-deepseek-v4-flash, the preferred managed 96 GB+ setup-agent tier served by the bundled DS4 engine
BYOA prints a ready-to-paste Claude/Codex prompt. Manual mode prints the commands for capabilities, model pulls, serving, and optional Pi installation.
mere.run agent onboard
Lower-level agent plumbing used by mere.run setup. Print a guided setup summary for the current Mac. Optional flags can pull the recommended supported model package, install Pi, and write a Pi provider extension that points at mere.run api serve.
bash
swift run mere.run agent onboard
swift run mere.run agent onboard --pull-recommended
swift run mere.run agent onboard --install-pi --configure-pi
swift run mere.run agent onboard --configure-pi --model text-agent-deepseek-v4-flashmere.run agent install-pi
Install the latest badlogic/pi-mono release asset for the current macOS architecture into the mere.run application-support directory.
bash
swift run mere.run agent install-pimere.run agent start
Start a local API server for a selected managed agent model and launch Pi against the mere-run provider. GGUF code models use --engine text-code, Q35 uses --engine text-chat-q35, and DeepSeek V4 Flash uses the DS4-backed --engine text-chat-deepseek-v4-flash. If --model is omitted, agent start uses the best installed startable setup agent first, then a valid persisted Pi provider model, then the current machine's startable hardware tier. On 96 GB+ Apple Silicon Macs, DeepSeek V4 Flash is the preferred setup-agent tier; Q35 and Qwen models are alternatives, not upgrades.
bash
swift run mere.run model pull text-agent-deepseek-v4-flash
swift run mere.run agent install-pi
swift run mere.run agent start --model text-agent-deepseek-v4-flashValidation and smoke runs
Standard repo validation:
bash
./scripts/check.shFast smoke suite:
bash
./scripts/e2e_smoke.sh --coreInstalled-model sweep:
bash
./scripts/e2e_smoke.sh --installed