error in request_metrics dictionary implementation #49

Durga2Dash · 2024-05-14T20:38:20Z

Team,

While running "token_benchmark_ray.py" against a model hosted on Google Cloud Vertex AI, I noticed that the division on line 111 fails because it ends up dividing an empty list by an integer. The statement evaluates as follows:

request_metrics[common_metrics.INTER_TOKEN_LAT] /= num_output_tokens
or, request_metrics[common_metrics.INTER_TOKEN_LAT] = request_metrics[common_metrics.INTER_TOKEN_LAT] / num_output_tokens
or, request_metrics[common_metrics.INTER_TOKEN_LAT] = [] / 1

where:
common_metrics.INTER_TOKEN_LAT = "inter_token_latency_s"
inter_token_latency_s = [] (as the sample response below shows)
num_output_tokens = 1
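
The failure can be reproduced outside the benchmark with a plain dictionary. This is only a minimal sketch: the dictionary stands in for request_metrics and carries just the one relevant key, with the values listed above.

```python
# Minimal reproduction; this dict mimics only the relevant key of request_metrics.
request_metrics = {"inter_token_latency_s": []}
num_output_tokens = 1

try:
    request_metrics["inter_token_latency_s"] /= num_output_tokens
    error = None
except TypeError as exc:
    # Python lists do not support division by an int.
    error = exc

print(type(error).__name__, error)
```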

For example, below is a sample of "request_metrics" obtained during the API call:

{'error_code': 200, 'error_msg': "'dict' object has no attribute 'split'", 'inter_token_latency_s': [], 'ttft_s': 0, 'end_to_end_latency_s': 1.4629004680000435, 'request_output_throughput_token_per_s': 0, 'number_total_tokens': 538, 'number_output_tokens': 0, 'number_input_tokens': 538}

File: token_benchmark_ray.py
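
One possible guard, sketched under assumptions: normalize_inter_token_latency is a hypothetical helper, not something that exists in the script, and it assumes that a list value holds per-token latencies whose sum is the total. It also returns 0.0 when there are no output tokens, which is a design choice rather than confirmed behavior of the benchmark.

```python
from numbers import Real


def normalize_inter_token_latency(value, num_output_tokens):
    """Return the average inter-token latency in seconds, or 0.0 if unavailable.

    Hypothetical helper, not part of the benchmark script.
    """
    # Assumption: a list holds per-token latencies, so its sum is the total.
    if isinstance(value, (list, tuple)):
        total = float(sum(value))
    elif isinstance(value, Real):
        total = float(value)
    else:
        raise TypeError(f"unexpected latency type: {type(value).__name__}")

    # Avoid dividing by zero when the request produced no output tokens.
    if num_output_tokens <= 0:
        return 0.0
    return total / num_output_tokens


# Usage with the values from the sample above.
request_metrics = {"inter_token_latency_s": [], "number_output_tokens": 0}
request_metrics["inter_token_latency_s"] = normalize_inter_token_latency(
    request_metrics["inter_token_latency_s"], num_output_tokens=1
)
print(request_metrics["inter_token_latency_s"])  # 0.0
```

With a guard like this, an empty list yields 0.0 instead of raising, and a numeric total is still divided as before.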

Thanks
