Measures scoring speed and end-to-end query latency in your browser.
Offline evaluation on pre-built embeddings. The table is updated each time compare_eval.py is run.
| Method | Mode | MRR | Hit@1 | Hit@6 | Notes |
|---|---|---|---|---|---|
| Potion base int4 | Lite | 0.5104 | 28/67 | 41/67 | distilled-mxbai base, int4 scoring |
| Potion fine-tuned int4 | Lite | 0.5776 | 35/67 | 43/67 | distilled-mxbai fine-tuned, int4 scoring |
| Full pure binary (ITQ) | Full | 0.6337 | 36/67 | 55/67 | mdbr-leaf-mt, 1-bit binary ITQ, 48 bytes/entry |
| Full pure int2 | Full | 0.6111 | 35/67 | 47/67 | mdbr-leaf-mt, pure int2, 96 bytes/entry |
| Full pure int3 | Full | 0.6351 | 35/67 | 53/67 | mdbr-leaf-mt, pure int3, 144 bytes/entry |
| Full pure int4 | Full | 0.6349 | 36/67 | 53/67 | mdbr-leaf-mt, pure int4, 192 bytes/entry |
| Full pure int8 | Full | 0.6431 | 36/67 | 55/67 | mdbr-leaf-mt, pure int8, 384 bytes/entry |
| Full binary+int3 rerank | Full | 0.6339 | 35/67 | 53/67 | mdbr-leaf-mt, binary+int3 rerank, 192 bytes/entry (desktop) |
| Full binary+int4 rerank | Full | 0.6353 | 36/67 | 53/67 | mdbr-leaf-mt, binary+int4 rerank, 240 bytes/entry |
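
For reference, the MRR and Hit@k columns above can be computed from per-query ranks roughly as follows. This is a minimal sketch, not the code in compare_eval.py: `summarize_ranks` and the sample ranks are hypothetical, and it assumes that a query whose correct entry is never retrieved contributes 0 to MRR and counts as a miss.

```python
from typing import Dict, Optional, Sequence, Tuple


def summarize_ranks(
    ranks: Sequence[Optional[int]], ks: Tuple[int, ...] = (1, 6)
) -> Tuple[float, Dict[int, int], int]:
    """Summarize retrieval quality.

    ranks[i] is the 1-based rank of the correct entry for query i,
    or None if it was not retrieved at all (assumed to score 0).
    Returns (MRR, {k: number of queries with rank <= k}, query count).
    """
    n = len(ranks)
    # Mean reciprocal rank: average of 1/rank, with misses counted as 0.
    mrr = sum(1.0 / r for r in ranks if r is not None) / n
    # Hit@k: how many queries placed the correct entry in the top k.
    hits = {k: sum(1 for r in ranks if r is not None and r <= k) for k in ks}
    return mrr, hits, n


# Hypothetical ranks for six queries (the real evaluation uses 67).
sample_ranks = [1, 3, None, 2, 7, 1]
mrr, hits, n = summarize_ranks(sample_ranks)
print(f"MRR={mrr:.4f}  Hit@1={hits[1]}/{n}  Hit@6={hits[6]}/{n}")
```

Read this way, a Hit@1 of 36/67 means that 36 of the 67 queries ranked the correct entry first.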

Raw vector scoring speed for each quantization format, with no model inference. Each format is timed over 20 iterations with random query vectors. Use this alongside the offline MRR table above to choose the best format.
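
A scoring-only benchmark of this kind can be sketched as below for the 1-bit format. It is an illustration in Python rather than the in-browser code: the 384-dimension figure is an assumption inferred from the 48 bytes/entry listed in the table, and `N_ENTRIES`, `hamming_scores`, and `benchmark` are hypothetical names. Only the scoring loop is timed; generating the random query is excluded.

```python
import random
import time

DIMS = 384                # assumption: 384 dims x 1 bit = 48 bytes/entry
ENTRY_BYTES = DIMS // 8   # storage per entry for the 1-bit format
N_ENTRIES = 2000          # hypothetical corpus size
ITERATIONS = 20           # iteration count from the description above


def hamming_scores(query: int, entries: list) -> list:
    """Return the Hamming distance from query to each entry.

    Codes are stored as Python ints used as bit vectors;
    a lower distance means a closer match.
    """
    return [bin(query ^ entry).count("1") for entry in entries]


def benchmark(seed: int = 0) -> float:
    """Return the average scoring time in milliseconds per query."""
    rng = random.Random(seed)
    entries = [rng.getrandbits(DIMS) for _ in range(N_ENTRIES)]
    timings_ms = []
    for _ in range(ITERATIONS):
        query = rng.getrandbits(DIMS)  # random query, no model inference
        start = time.perf_counter()
        hamming_scores(query, entries)
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return sum(timings_ms) / len(timings_ms)


if __name__ == "__main__":
    avg_ms = benchmark()
    print(f"binary, {ENTRY_BYTES} bytes/entry: {avg_ms:.3f} ms avg over {ITERATIONS} runs")
```

Absolute timings from a sketch like this are not comparable to the browser numbers; it only shows the shape of the measurement.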
Full pipeline: model load + tokenization + inference + scoring + ranking.
| Test | Status | Load (s) | Query avg (ms) | Notes |
|---|---|---|---|---|