diff --git a/docs/evaluation/how-to-evals/evaluating-phoenix-traces.md b/docs/evaluation/how-to-evals/evaluating-phoenix-traces.md
index 4e0ef65de3..ca19a0ab16 100644
--- a/docs/evaluation/how-to-evals/evaluating-phoenix-traces.md
+++ b/docs/evaluation/how-to-evals/evaluating-phoenix-traces.md
@@ -160,7 +160,7 @@ We now have a DataFrame with a column for whether each joke is a repeat of a pre
 Our evals\_df has a column for the span\_id and a column for the evaluation result. The span\_id is what allows us to connect the evaluation to the correct trace in Phoenix. Phoenix will also automatically look for columns named "label" and "score" to display in the UI.
 
 ```python
-eval_df["score"] = eval_df["label"].astype(int)
+eval_df["score"] = eval_df["score"].astype(int)
 eval_df["label"] = eval_df["label"].astype(str)
 ```
 
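
Note on the change: the old line derived `score` from the `label` column, while the new line casts the existing `score` column in place. The following is a minimal pandas-only sketch of the two casts after the fix. The DataFrame contents, the `span_id` values, and the starting dtypes are assumptions made up for illustration; the real `eval_df` in the guide may look different.

```python
import pandas as pd

# Hypothetical evaluation results; span ids and values are invented for this sketch.
eval_df = pd.DataFrame(
    {
        "span_id": ["span-a", "span-b", "span-c"],
        "label": [True, False, True],   # assumed boolean result of the eval
        "score": [1.0, 0.0, 1.0],       # assumed float score before casting
    }
)

# The two lines from the patched snippet:
eval_df["score"] = eval_df["score"].astype(int)  # numeric score as integers
eval_df["label"] = eval_df["label"].astype(str)  # label as text

print(eval_df)
```

With this input, `score` ends up as integers and `label` as the strings `"True"` and `"False"`, which matches the two column names the doc says Phoenix looks for.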