openclaw

data-labeling-svc

Data-labeling subcontract pipeline. A local Qwen model does first-pass labeling at $0 LLM cost per record; human reviewers handle low-confidence items plus a spot-check sample. Built for taking Surge / Scale / Labelbox subcontracts at 60-90% margin.

Economics

Metric                          Cloud-LLM operator              Us (Local Qwen)
LLM cost per record             $0.05-0.30                      $0
Human spot-check (10%)          $0.04 (avg @ $40/hr × 6 sec)    $0.04
Customer rate (Surge typical)   $0.50-3.00                      $0.50-3.00
Net margin                      30-60%                          60-90%
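
To make the comparison concrete, here is a minimal sketch of the per-record margin arithmetic using the figures from the table. The function name `gross_margin` and the chosen scenarios are illustrative assumptions, not part of the package. It counts only the LLM and spot-check costs listed above, so its results are gross figures; the net-margin ranges in the table are lower, presumably because they also cover costs that are not itemized here.

```python
def gross_margin(customer_rate: float, llm_cost: float, human_cost: float) -> float:
    """Fraction of the per-record customer rate left after LLM and human costs."""
    if customer_rate <= 0:
        raise ValueError("customer_rate must be positive")
    return (customer_rate - llm_cost - human_cost) / customer_rate


HUMAN_COST = 0.04  # per-record spot-check cost, taken from the table

# Illustrative scenarios at the low end of the customer-rate range ($0.50).
scenarios = {
    "cloud LLM at $0.30/record": gross_margin(0.50, 0.30, HUMAN_COST),
    "cloud LLM at $0.05/record": gross_margin(0.50, 0.05, HUMAN_COST),
    "local Qwen at $0/record": gross_margin(0.50, 0.00, HUMAN_COST),
}

for name, margin in scenarios.items():
    print(f"{name}: {margin:.0%} gross margin")
```

The gap between the cloud and local rows comes entirely from the LLM cost term, which is the point of the table: at low customer rates the per-record model cost dominates.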

Pricing models

  • Subcontract Surge / Scale work at their per-record rates and keep the margin
  • Sell direct to ML teams at $0.05-0.50 per record (still profitable, and below their internal cost)
  • Custom labeling pipelines: $5K-25K setup per task, plus a per-record fee

Run

# PowerShell (a trailing backtick continues the line)
cd C:\openclaw-products\data-labeling-svc
python -m venv .venv
.\.venv\Scripts\activate
pip install -e .

# Ollama must already be running locally
labelpipe label `
   --spec examples/sentiment-spec.yaml `
   --input-csv unlabeled-1k.csv `
   --out labeled.csv `
   --threshold 0.85 --spot-check 0.10
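
The `--threshold` and `--spot-check` flags suggest a routing step between the model and the human reviewers. The sketch below shows one plausible reading: records whose confidence falls below the threshold always go to review, and a random fraction of the remaining records is sampled for spot-checking. The `Prediction` class, the `route` function, and the exact semantics are assumptions made for illustration; the actual `labelpipe` implementation may differ.

```python
import random
from dataclasses import dataclass


@dataclass
class Prediction:
    record_id: str
    label: str
    confidence: float  # model-reported confidence in [0, 1]


def route(predictions, threshold=0.85, spot_check=0.10, seed=None):
    """Split predictions into (auto_accepted, review_queue)."""
    rng = random.Random(seed)
    auto_accepted, review_queue = [], []
    for p in predictions:
        if p.confidence < threshold:
            review_queue.append(p)   # low confidence: always reviewed
        elif rng.random() < spot_check:
            review_queue.append(p)   # confident, but sampled for spot-check
        else:
            auto_accepted.append(p)
    return auto_accepted, review_queue


preds = [
    Prediction("r1", "positive", 0.97),
    Prediction("r2", "negative", 0.60),
    Prediction("r3", "neutral", 0.91),
]
accepted, queue = route(preds, threshold=0.85, spot_check=0.10, seed=0)
print(len(accepted), "auto-accepted;", len(queue), "sent to human review")
```

Sampling confident records at random, rather than skipping them entirely, is what lets the spot-check estimate the error rate of the auto-accepted labels.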

Roadmap

  • Multi-label / multi-task support
  • Inter-annotator agreement metrics (Qwen vs human)
  • Active-learning loop (use review-queue to retrain prompts)
  • Direct Surge / Scale API integration
  • Per-task gold-standard validation set
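
For the inter-annotator agreement item, one common metric for comparing two labelers on the same records is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The roadmap does not say which metric is planned, so the choice of kappa and the `cohen_kappa` helper below are assumptions offered as a sketch.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of records where both labelers agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each labeler's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both labelers used one identical label throughout
    return (observed - expected) / (1 - expected)


qwen = ["pos", "pos", "neg", "neg"]
human = ["pos", "neg", "neg", "neg"]
print(f"kappa = {cohen_kappa(qwen, human):.2f}")
```

Here the two labelers agree on 3 of 4 records (0.75 observed) against 0.50 expected by chance, which is why kappa is more informative than raw agreement when one label dominates.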