WordFor Benchmark

Measures scoring speed and end-to-end query latency in your browser.

Model Quality (67-query test set)

Offline evaluation on pre-built embeddings. Updated each time compare_eval.py is run.

Method                    Mode  MRR     Hit@1  Hit@6  Notes
Potion base int4          Lite  0.4290  22/67  38/67  distilled-mxbai base, int4 scoring
Potion fine-tuned int4    Lite  0.4815  26/67  40/67  distilled-mxbai fine-tuned, int4 scoring
Full pure binary (ITQ)    Full  0.5308  28/67  45/67  mdbr-leaf-mt, binary ITQ (mobile)
Full binary+int4 rerank   Full  0.6025  35/67  50/67  mdbr-leaf-mt, binary+int4 rerank (desktop)
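
For readers unfamiliar with the MRR and Hit@k columns, the following is a minimal sketch of how such metrics are commonly computed from per-query ranks. It is not taken from compare_eval.py: the function names, the toy rank list, and the choice to count a missing answer as a reciprocal rank of 0 are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank over all queries.

    ranks[i] is the 1-based position of the expected answer for query i,
    or None if it was not returned at all (assumed to contribute 0).
    """
    if not ranks:
        return 0.0
    return sum(1.0 / r for r in ranks if r is not None) / len(ranks)


def hit_at_k(ranks: Sequence[Optional[int]], k: int) -> int:
    """Number of queries whose expected answer appears in the top k."""
    return sum(1 for r in ranks if r is not None and r <= k)


# Toy data (hypothetical): five queries, one of which misses entirely.
ranks = [1, 3, None, 2, 7]
mrr = mean_reciprocal_rank(ranks)
hits1 = hit_at_k(ranks, 1)
hits6 = hit_at_k(ranks, 6)
print(f"MRR {mrr:.4f}  Hit@1 {hits1}/{len(ranks)}  Hit@6 {hits6}/{len(ranks)}")
```

Under this definition, a Hit@1 of 35/67 means the expected word ranked first for 35 of the 67 test queries.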

Scoring Performance (per-query, all entries)

Benchmarks only the raw vector scoring step (no model inference), using 20 iterations with random query vectors.
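
As a rough illustration of this kind of measurement, the sketch below times a Hamming-distance scorer over every entry for 20 random queries. It is not the page's actual benchmark code, which runs in the browser. The code width (DIM_BITS), the index size (NUM_ENTRIES), and the helper names are assumptions; only the iteration count comes from the description above.

```python
import random
import time
from typing import List

DIM_BITS = 256      # assumed binary code width; not stated in the source
NUM_ENTRIES = 2000  # assumed index size; not stated in the source
ITERATIONS = 20     # matches the benchmark description


def hamming_scores(query: int, entries: List[int]) -> List[int]:
    """Hamming distance from the query code to every entry (lower is closer)."""
    return [bin(query ^ entry).count("1") for entry in entries]


def benchmark_scoring(entries: List[int], iterations: int,
                      rng: random.Random) -> List[float]:
    """Time the scoring step alone, once per random query, in milliseconds."""
    timings_ms = []
    for _ in range(iterations):
        query = rng.getrandbits(DIM_BITS)  # random query, no model inference
        start = time.perf_counter()
        hamming_scores(query, entries)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


rng = random.Random(0)
entries = [rng.getrandbits(DIM_BITS) for _ in range(NUM_ENTRIES)]
timings = benchmark_scoring(entries, ITERATIONS, rng)
avg_ms = sum(timings) / len(timings)
print(f"{ITERATIONS} iterations, avg {avg_ms:.3f} ms per query")
```

Generating the query outside the timed region keeps the measurement limited to scoring, which is the point of separating this test from the end-to-end one.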

End-to-End Query Latency

Full pipeline: query encoding (model inference) + scoring + ranking.
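
The three stages can be timed separately, as in the sketch below. Everything in it is illustrative: toy_encode is a placeholder standing in for real model inference, the dot-product scorer and the tiny word index are assumptions, and none of the names come from the WordFor code.

```python
import heapq
import time
from typing import Callable, Dict, List, Tuple


def toy_encode(text: str, dim: int = 8) -> List[int]:
    """Placeholder encoder (not a real model): text -> vector in the int4 range [-8, 7]."""
    vec = [0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] = (vec[i % dim] + ord(ch)) % 16
    return [v - 8 for v in vec]


def dot(a: List[int], b: List[int]) -> int:
    return sum(x * y for x, y in zip(a, b))


def run_query(query: str, index: List[Tuple[str, List[int]]],
              encode: Callable[[str], List[int]],
              top_k: int = 6) -> Tuple[List[str], Dict[str, float]]:
    """Encode, score every entry, rank the top_k, and report per-stage times in ms."""
    t0 = time.perf_counter()
    qvec = encode(query)                                   # query encoding
    t1 = time.perf_counter()
    scores = [dot(qvec, vec) for _, vec in index]          # scoring
    t2 = time.perf_counter()
    words = (word for word, _ in index)
    top = heapq.nlargest(top_k, zip(scores, words))        # ranking
    t3 = time.perf_counter()
    timings = {
        "encode_ms": (t1 - t0) * 1000.0,
        "score_ms": (t2 - t1) * 1000.0,
        "rank_ms": (t3 - t2) * 1000.0,
        "total_ms": (t3 - t0) * 1000.0,
    }
    return [word for _, word in top], timings


vocab = ["happy", "joyful", "sad", "angry", "calm"]  # hypothetical entries
index = [(word, toy_encode(word)) for word in vocab]
results, timings = run_query("feeling good", index, toy_encode, top_k=3)
print(results, {name: round(ms, 4) for name, ms in timings.items()})
```

Averaging total_ms over a set of queries would give a figure comparable in spirit to a "Query avg (ms)" column, while model load time would be measured once, outside run_query.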

Test  Status  Load (s)  Query avg (ms)  Notes
