
bench: Comparison with pgvectorscale #125

Open
gaocegege opened this issue Dec 6, 2024 · 5 comments

@gaocegege

No description provided.

@cutecutecat

cutecutecat commented Dec 12, 2024

Comparison with pgvectorscale

Dataset: laion-5m-768dim
Arguments: defaults from the pgvectorscale README

Advantages of pgvectorscale

✅ Double the capacity

To store the whole dataset, pgvectorscale costs 17 GB on disk while VectorChord costs 34 GB; this might be related to:

num_bits_per_dimension: Number of bits used to encode each dimension when using SBQ, 2 for less than 900 dimensions, 1 otherwise

Our scaler8 type might solve it.
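For context, a minimal sketch of where that parameter lives when building a pgvectorscale index; the table and column names are hypothetical, and the exact storage-parameter syntax should be double-checked against the pgvectorscale README:

```sql
-- Hypothetical table/column; num_bits_per_dimension is the SBQ encoding
-- width quoted above (2 for < 900 dimensions by default, 1 otherwise).
CREATE INDEX laion_embedding_diskann_idx ON laion_items
USING diskann (embedding vector_cosine_ops)
WITH (num_bits_per_dimension = 1);
```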

✅ Fantastic cold start

VectorChord and pgvectorscale have similar query speed in the warm state.

For VectorChord, cold start is much slower than the warm state: QPS goes from 29 (cold) to 201 (warm) at recall 0.95, about a 7x speedup. For that reason, prewarming is really important for us.

However, we observed only about a 2x speedup from cold to warm for pgvectorscale, so it hardly needs prewarming at all.
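Since prewarming matters so much for VectorChord here, a minimal sketch of one way to warm an index using the stock pg_prewarm contrib module; the index name is hypothetical, and VectorChord may also provide its own prewarm helper:

```sql
-- pg_prewarm ships with PostgreSQL as a contrib extension.
CREATE EXTENSION IF NOT EXISTS pg_prewarm;

-- Read the whole index into the buffer cache before benchmarking
-- so queries start from a warm state.
SELECT pg_prewarm('laion_embedding_idx');
```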

Disadvantages of pgvectorscale

❌ Slower index build speed

  • VectorChord external build: 1240s on 4 cores
  • VectorChord internal build: 9239s on 4 cores (parallel-build settings are sketched after this list)
  • pgvectorscale build: 11540s on 1 core; it cannot use multiple cores
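A sketch of the session settings typically used to give a PostgreSQL index build multiple cores, assuming the internal build goes through the standard parallel maintenance path; the values are illustrative:

```sql
-- Allow up to 4 parallel workers for CREATE INDEX in this session.
SET max_parallel_maintenance_workers = 4;
-- A large maintenance memory budget also helps big index builds
-- (value is illustrative).
SET maintenance_work_mem = '8GB';
```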

❌ No better query performance

With the default arguments and both the dot and L2 metrics on our dataset, we could not reach recall > 0.8 by tuning the query-time parameter diskann.query_search_list_size.

Even though the dataset is dot-based, the dot metric performs even worse than the L2 metric.
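For reference, the query-time tuning described above amounts to something like the following; the table name and query are hypothetical, while diskann.query_search_list_size is the parameter named above:

```sql
-- Widen the DiskANN search list for this session (default is 100).
SET diskann.query_search_list_size = 200;

-- Hypothetical top-10 query; $1 is the 768-dim query vector.
-- <#> is negative inner product, <-> would be L2 distance.
SELECT id
FROM laion_items
ORDER BY embedding <#> $1
LIMIT 10;
```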

L2 metric:

|             | top 10 cold | top 10 warm | top 100 cold | top 100 warm |
|-------------|-------------|-------------|--------------|--------------|
| Recall      | 0.7744      | 0.7744      | 0.6838       | 0.6838       |
| QPS         | 159.72      | 242.71      | 73.54        | 148.10       |
| P50 latency | 5.13ms      | 4.04ms      | 12.12ms      | 6.64ms       |
| P99 latency | 19.15ms     | 7.30ms      | 24.91ms      | 10.93ms      |

Dot metric:

|             | top 10 cold | top 10 warm | top 100 cold | top 100 warm |
|-------------|-------------|-------------|--------------|--------------|
| Recall      | 0.6569      | 0.6569      | 0.6723       | 0.6723       |
| QPS         | 245.06      | 265.28      | 73.58        | 145.48       |
| P50 latency | 3.93ms      | 3.75ms      | 13.52ms      | 6.82ms       |
| P99 latency | 9.89ms      | 6.25ms      | 23.74ms      | 10.77ms      |

For VectorChord, the typical recall target is 0.95; the full results can be found at #42 (comment).

Update: Dot metric with more rerank (default = 500):

Changing only diskann.query_search_list_size has little effect; increasing diskann.query_rescore helps much more.

The defaults are diskann.query_search_list_size=100 and diskann.query_rescore=50; rerank=300 means setting both diskann.query_search_list_size and diskann.query_rescore to 300.
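Concretely, rerank=300 in the table below corresponds to setting both session parameters to 300:

```sql
-- rerank=300: raise both values from their defaults (100 and 50) to 300.
SET diskann.query_search_list_size = 300;
SET diskann.query_rescore = 300;
```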

|                         | Recall | QPS    | P50 latency | P99 latency |
|-------------------------|--------|--------|-------------|-------------|
| top 10 rerank=200       | 0.9471 | 144.79 | 6.69ms      | 12.82ms     |
| top 10 rerank=250       | 0.9611 | 117.41 | 8.10ms      | 17.00ms     |
| top 10 rerank=250 cold  | /      | 61.03  | 11.87ms     | 84.18ms     |
| top 100 rerank=300      | 0.9088 | 67.17  | 14.13ms     | 29.73ms     |
| top 100 rerank=350      | 0.9402 | 54.93  | 16.90ms     | 41.26ms     |
| top 100 rerank=400      | 0.9601 | 37.06  | 24.30ms     | 71.01ms     |
| top 100 rerank=400 cold | /      | 26.16  | 30.71ms     | 116.25ms    |

@xieydd

xieydd commented Dec 12, 2024

What is the memory usage after prewarm?

@VoVAllen

Without the recall numbers, the speed figures alone are meaningless.

@VoVAllen

Does more rerank help in pgvectorscale?

@cutecutecat

cutecutecat commented Dec 13, 2024

Does more rerank help in pgvectorscale?

It helps a lot; I have updated the results above.

What is the memory usage after prewarm?

About 8 GB for top 10 and 11 GB for top 100.
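For what it's worth, a hedged sketch of one way to approximate how much of an index sits in PostgreSQL's shared buffers after prewarming; the index name is hypothetical, and this does not count the OS page cache:

```sql
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Approximate shared-buffer footprint of a given index.
SELECT pg_size_pretty(count(*) * current_setting('block_size')::bigint)
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE c.relname = 'laion_embedding_idx';
```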
