data-labeling-svc
Data-labeling subcontract pipeline. A locally hosted Qwen model does first-pass labeling at $0 LLM cost per record; human reviewers handle low-confidence items and a spot-check sample. The business model is to take Surge / Scale / Labelbox subcontracts at 60-90% margin.
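The first-pass step can be sketched against Ollama's local `/api/generate` endpoint. This is a minimal illustration, not the `labelpipe` implementation: the model tag `qwen2.5`, the prompt wording, the helper names, and the use of a self-reported `confidence` field are all assumptions made for the example.

```python
from __future__ import annotations

import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "qwen2.5"  # assumption: whichever Qwen tag you have pulled locally


def build_prompt(text: str, labels: list[str]) -> str:
    """Ask for a JSON object so the reply is machine-parseable."""
    return (
        "Classify the text into exactly one of these labels: "
        + ", ".join(labels)
        + '.\nReply with JSON like {"label": "...", "confidence": 0.0}.\n'
        + "Text: " + text
    )


def parse_reply(raw: str, labels: list[str]) -> tuple[str | None, float]:
    """Return (label, confidence). Anything malformed becomes (None, 0.0),
    which falls below any threshold and therefore goes to human review."""
    try:
        data = json.loads(raw)
        label = data["label"]
        confidence = float(data["confidence"])
    except (ValueError, KeyError, TypeError):
        return None, 0.0
    if label not in labels or not 0.0 <= confidence <= 1.0:
        return None, 0.0
    return label, confidence


def first_pass(text: str, labels: list[str]) -> tuple[str | None, float]:
    """Label one record with the local model (needs a running Ollama server)."""
    payload = {
        "model": MODEL,
        "prompt": build_prompt(text, labels),
        "format": "json",  # ask Ollama to constrain the output to valid JSON
        "stream": False,   # one JSON body instead of a token stream
    }
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return parse_reply(body["response"], labels)
```

Treating unparseable replies as zero-confidence keeps the failure mode safe: a bad model output costs a human review rather than producing a wrong label.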
Economics
| Metric | Cloud-LLM operator | Us (Local Qwen) |
|---|---|---|
| LLM cost per record | $0.05-0.30 | $0 |
| Human spot-check (10%) | $0.04 (avg @ $40/hr × 6 sec) | $0.04 |
| Customer rate (Surge typical) | $0.50-3.00 | $0.50-3.00 |
| Net margin | 30-60% | 60-90% |
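The per-record arithmetic behind the table can be sketched as below. The function name is hypothetical, and it computes gross margin only (customer rate minus LLM and human cost). The table's net margin presumably also covers overhead not itemized here, so these figures come out higher than the 60-90% row. The $0.04 human cost is taken from the table as given. For reference, $40/hr × 6 sec is about $0.067 per reviewed record, so that figure appears to rest on assumptions the table does not spell out.

```python
def gross_margin(rate: float, llm_cost: float, human_cost: float) -> float:
    """Per-record gross margin as a fraction of the customer rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return (rate - llm_cost - human_cost) / rate


# Low end of the customer-rate range from the table: $0.50 per record.
local = gross_margin(rate=0.50, llm_cost=0.00, human_cost=0.04)
# Same rate with the high end of the cloud LLM cost range: $0.30 per record.
cloud = gross_margin(rate=0.50, llm_cost=0.30, human_cost=0.04)
print(f"local Qwen: {local:.0%}, cloud LLM: {cloud:.0%}")
```

At the $0.50 rate this works out to roughly 92% gross for the local setup versus 32% for a cloud operator paying $0.30 per record, which is the gap the table summarizes.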
Pricing models
- Subcontract Surge / Scale work at their per-record rates and keep the margin
- Sell direct to ML teams at $0.05-0.50 per record (still profitable, and cheaper than their internal cost)
- Custom labeling pipelines: $5-25K setup per task, plus a per-record fee
Run
cd C:\openclaw-products\data-labeling-svc
python -m venv .venv
.\.venv\Scripts\activate
pip install -e .
# Requires Ollama running locally (PowerShell syntax; a backtick continues the line)
labelpipe label `
  --spec examples/sentiment-spec.yaml `
  --input-csv unlabeled-1k.csv `
  --out labeled.csv `
  --threshold 0.85 --spot-check 0.10
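How `--threshold` and `--spot-check` might interact can be sketched as follows. This is an illustration of the routing idea described above, not the actual `labelpipe` code: the `Prediction` type, the `route` function, and the choice to sample spot-checks only from confident predictions are assumptions.

```python
import random
from dataclasses import dataclass


@dataclass
class Prediction:
    record_id: str
    label: str
    confidence: float  # model confidence in [0, 1]


def route(preds, threshold=0.85, spot_check=0.10, seed=0):
    """Split predictions into (auto_accepted, review_queue)."""
    rng = random.Random(seed)  # seeded so the spot-check sample is reproducible
    accepted, review = [], []
    for p in preds:
        if p.confidence < threshold:
            review.append(p)    # low confidence: always goes to a human
        elif rng.random() < spot_check:
            review.append(p)    # confident, but sampled for quality control
        else:
            accepted.append(p)  # confident and not sampled: accepted as-is
    return accepted, review


preds = [
    Prediction("r1", "positive", 0.97),
    Prediction("r2", "negative", 0.62),
    Prediction("r3", "positive", 0.91),
]
accepted, review = route(preds, threshold=0.85, spot_check=0.10)
print(len(accepted), "auto-accepted;", len(review), "queued for review")
```

Raising the threshold shifts work from the model to reviewers, so it trades cost per record against label quality.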
Roadmap
- Multi-label / multi-task support
- Inter-annotator agreement metrics (Qwen vs human)
- Active-learning loop (use the review queue to refine prompts)
- Direct Surge / Scale API integration
- Per-task gold-standard validation set
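For the inter-annotator agreement item, one common metric is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The roadmap does not say which metric is planned, so the following is only a sketch of one candidate, with a hypothetical function name.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items in order."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items where both labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used the same single label throughout
    return (observed - expected) / (1.0 - expected)


qwen = ["pos", "pos", "neg", "neg"]
human = ["pos", "neg", "neg", "neg"]
print(cohens_kappa(qwen, human))  # 0.5
```

In the example the two annotators agree on 3 of 4 items (0.75 observed) against 0.5 expected by chance, giving kappa = 0.25 / 0.5 = 0.5.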