docs: update dspy notebook to reflect new instrumentation for v2.5 and above (#4946)
axiomofjoy authored Oct 10, 2024
1 parent 030426e commit 61d8a1c
Showing 1 changed file with 75 additions and 37 deletions.
112 changes: 75 additions & 37 deletions tutorials/tracing/dspy_tracing_tutorial.ipynb
@@ -2,7 +2,9 @@
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "ugOiyLQRScii"
},
"source": [
"<center>\n",
" <p style=\"text-align:center\">\n",
@@ -21,7 +23,7 @@
"\n",
"- Composable and declarative APIs that allow developers to describe the architecture of their LLM application in the form of a \"module\" (inspired by PyTorch's `nn.Module`),\n",
"- Compilers known as \"teleprompters\" that optimize a user-defined module for a particular task. The term \"teleprompter\" is meant to evoke \"prompting at a distance,\" and could involve selecting few-shot examples, generating prompts, or fine-tuning language models.\n",
" \n",
"\n",
"Phoenix makes your DSPy applications *observable* by visualizing the underlying structure of each call to your compiled DSPy module and surfacing problematic spans of execution based on latency, token count, or other evaluation metrics.\n",
"\n",
"In this tutorial, you will:\n",
@@ -34,7 +36,9 @@
},
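For readers skimming this diff, a minimal sketch of the module concept described in the introduction above may help; the `BasicQA` and `SimpleQA` names and field descriptions below are illustrative assumptions rather than code from this notebook.

```python
import dspy

# A signature declares the inputs and outputs of an LLM call.
class BasicQA(dspy.Signature):
    """Answer questions with short factoid answers."""

    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

# A module composes LLM calls, analogous to layers in a PyTorch nn.Module.
class SimpleQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate_answer = dspy.Predict(BasicQA)

    def forward(self, question):
        return self.generate_answer(question=question)
```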
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "9PzyTdXkScij"
},
"source": [
"## 1. Install Dependencies and Import Libraries\n",
"\n",
@@ -47,20 +51,14 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install \"regex~=2023.10.3\" dspy-ai # DSPy requires an old version of regex that conflicts with the installed version on Colab\n",
"!pip install arize-phoenix openinference-instrumentation-dspy opentelemetry-exporter-otlp"
"!pip install arize-phoenix \"dspy-ai>=2.5.0\" \"openinference-instrumentation-dspy>=0.1.13\" openinference-instrumentation-litellm opentelemetry-exporter-otlp"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"⚠️ DSPy conflicts with the default version of the `regex` module that comes pre-installed on Google Colab. If you are running this notebook in Google Colab, you will likely need to restart the kernel after running the installation step above and before proceeding to the rest of the notebook, otherwise, your instrumentation will fail."
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "--ju_0z3Scik"
},
"source": [
"Import libraries."
]
@@ -82,7 +80,9 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "skhq25K-Scil"
},
"source": [
"## 2. Configure Your OpenAI API Key\n",
"\n",
@@ -103,11 +103,13 @@
},
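The key-configuration cell is collapsed in this diff; a typical sketch using only the standard library (the prompt text is an assumption) looks like:

```python
import os
from getpass import getpass

# Prompt for the key only if it is not already present in the environment.
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")
```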
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "9koncSAzScil"
},
"source": [
"## 3. Configure Module Components\n",
"\n",
"A module consists of components such as a language model (in this case, OpenAI's GPT 3.5 turbo), akin to the layers of a PyTorch module and a retriever (in this case, ColBERTv2)."
"A module consists of components such as a language model (in this case, OpenAI's GPT-4), akin to the layers of a PyTorch module and a retriever (in this case, ColBERTv2)."
]
},
{
Expand All @@ -116,17 +118,19 @@
"metadata": {},
"outputs": [],
"source": [
"turbo = dspy.OpenAI(model=\"gpt-3.5-turbo\")\n",
"lm = dspy.LM(\"openai/gpt-4\", cache=False)\n",
"colbertv2_wiki17_abstracts = dspy.ColBERTv2(\n",
" url=\"http://20.102.90.50:2017/wiki17_abstracts\" # endpoint for a hosted ColBERTv2 service\n",
")\n",
"\n",
"dspy.settings.configure(lm=turbo, rm=colbertv2_wiki17_abstracts)"
"dspy.settings.configure(lm=lm, rm=colbertv2_wiki17_abstracts)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "iN-4OqMyScil"
},
"source": [
"## 4. Load Data\n",
"\n",
@@ -154,7 +158,9 @@
},
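The data-loading cell is collapsed in this diff. Assuming the standard `HotPotQA` helper from `dspy.datasets` (the split sizes and seeds below are illustrative, not necessarily the notebook's exact values), the step looks roughly like:

```python
from dspy.datasets import HotPotQA

# Load a small slice of HotPotQA; sizes and seeds are illustrative.
dataset = HotPotQA(train_seed=1, train_size=20, eval_seed=2023, dev_size=50, test_size=0)

# Mark which field DSPy should treat as the input for each example.
trainset = [example.with_inputs("question") for example in dataset.train]
devset = [example.with_inputs("question") for example in dataset.dev]

print(len(trainset), len(devset))
```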
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "VgIFspM0Scil"
},
"source": [
"Each example in our training set has a question and a human-annotated answer."
]
@@ -171,7 +177,9 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "T3ylXcAdScil"
},
"source": [
"Examples in the dev set have a third field containing titles of relevant Wikipedia articles."
]
@@ -188,7 +196,9 @@
},
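As a quick sketch of the two cells above (the `gold_titles` field name follows the usual DSPy HotPotQA examples and is an assumption here, as are the `trainset`/`devset` names from the earlier sketch):

```python
# Train examples carry a question and a human-annotated answer.
train_example = trainset[0]
print(train_example.question)
print(train_example.answer)

# Dev examples additionally carry titles of relevant Wikipedia articles.
dev_example = devset[0]
print(dev_example.gold_titles)
```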
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "3m301eQXScil"
},
"source": [
"## 5. Define Your RAG Module\n",
"\n",
@@ -215,7 +225,9 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "Do8HeVZ3Scim"
},
"source": [
"Define your module by subclassing `dspy.Module` and overriding the `forward` method."
]
@@ -240,21 +252,27 @@
},
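The cell defining the module is collapsed in this diff; a representative sketch of such a `dspy.Module` subclass follows (the field descriptions and the `num_passages` default are assumptions, not necessarily the notebook's exact code):

```python
import dspy

class GenerateAnswer(dspy.Signature):
    """Answer questions with short factoid answers."""

    context = dspy.InputField(desc="may contain relevant facts")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="often between 1 and 5 words")

class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()
        # Fetch passages from the configured ColBERTv2 retriever.
        self.retrieve = dspy.Retrieve(k=num_passages)
        # Answer with chain-of-thought prompting over the retrieved context.
        self.generate_answer = dspy.ChainOfThought(GenerateAnswer)

    def forward(self, question):
        context = self.retrieve(question).passages
        prediction = self.generate_answer(context=context, question=question)
        return dspy.Prediction(context=context, answer=prediction.answer)
```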
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "mGWLBPeIScim"
},
"source": [
"This module uses retrieval-augmented generation (using the previously configured ColBERTv2 retriever) in tandem with chain of thought in order to generate the final answer to the user."
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "fh71fUzPScim"
},
"source": [
"## 6. Compile Your RAG Module"
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "6BlkKG1RScim"
},
"source": [
"In this case, we'll use the default `BootstrapFewShot` teleprompter that selects good demonstrations from the the training dataset for inclusion in the final prompt."
]
@@ -283,14 +301,18 @@
},
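The compilation cell is collapsed in this diff; a hedged sketch of a typical `BootstrapFewShot` compilation follows (the metric shown is one common choice, and the `RAG`/`trainset` names come from the earlier sketches):

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# Keep a bootstrapped demonstration only if the predicted answer matches the
# gold answer and appears in one of the retrieved passages.
def validate_context_and_answer(example, pred, trace=None):
    answer_match = dspy.evaluate.answer_exact_match(example, pred)
    passage_match = dspy.evaluate.answer_passage_match(example, pred)
    return answer_match and passage_match

teleprompter = BootstrapFewShot(metric=validate_context_and_answer)
compiled_rag = teleprompter.compile(RAG(), trainset=trainset)
```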
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "1-vGu_aDScim"
},
"source": [
"## 7. Instrument DSPy and Launch Phoenix"
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "teUVpEjtScim"
},
"source": [
"Now that we've compiled our RAG program, let's see what's going on under the hood.\n",
"\n",
@@ -308,9 +330,13 @@
},
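The launch cell is collapsed in this diff; starting Phoenix from a notebook is a short sketch along these lines:

```python
import phoenix as px

# Start a local Phoenix server; traces will stream into this session's UI.
session = px.launch_app()
print(session.url)
```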
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "-fIfJ9AOScim"
},
"source": [
"Then instrument your application with [OpenInference](https://github.com/Arize-ai/openinference/tree/main/spec), an open standard build atop [OpenTelemetry](https://opentelemetry.io/) that captures and stores LLM application executions. OpenInference provides telemetry data to help you understand the invocation of your LLMs and the surrounding application context, including retrieval from vector stores, the usage of external tools or APIs, etc."
"Then instrument your application with [OpenInference](https://github.com/Arize-ai/openinference/tree/main/spec), an open standard build atop [OpenTelemetry](https://opentelemetry.io/) that captures and stores LLM application executions. OpenInference provides telemetry data to help you understand the invocation of your LLMs and the surrounding application context, including retrieval from vector stores, the usage of external tools or APIs, etc.\n",
"\n",
"DSPy uses LiteLLM under the hood to invoke LLMs. We add the `LiteLLMInstrumentor` here so we can get token counts for LLM spans."
]
},
{
@@ -320,23 +346,29 @@
"outputs": [],
"source": [
"from openinference.instrumentation.dspy import DSPyInstrumentor\n",
"from openinference.instrumentation.litellm import LiteLLMInstrumentor\n",
"\n",
"from phoenix.otel import register\n",
"\n",
"register(endpoint=\"http://127.0.0.1:6006/v1/traces\")\n",
"DSPyInstrumentor().instrument(skip_dep_check=True)"
"DSPyInstrumentor().instrument(skip_dep_check=True)\n",
"LiteLLMInstrumentor().instrument(skip_dep_check=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "quSMbL5DScim"
},
"source": [
"## 8. Run Your Application"
]
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "XmXs-qUWScim"
},
"source": [
"Let's run our DSPy application on the dev set."
]
@@ -366,7 +398,9 @@
},
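The run cell is collapsed in this diff; a minimal sketch of invoking the compiled module over the dev set (assuming the `compiled_rag` and `devset` names from the earlier sketches):

```python
# Each call emits spans that are exported to Phoenix via OpenInference.
for example in devset[:10]:
    prediction = compiled_rag(question=example.question)
    print(f"Question:  {example.question}")
    print(f"Predicted: {prediction.answer}")
    print(f"Gold:      {example.answer}\n")
```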
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "RBye2l4EScin"
},
"source": [
"Check the Phoenix UI to inspect the architecture of your DSPy module."
]
@@ -382,7 +416,9 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "deaC5uIsScin"
},
"source": [
"A few things to note:\n",
"\n",
@@ -395,7 +431,9 @@
},
{
"cell_type": "markdown",
"metadata": {},
"metadata": {
"id": "lIqk4MMXScin"
},
"source": [
"Congrats! You've used DSPy to bootstrap a multishot prompt with hard negative passages and chain of thought, and you've used Phoenix to observe the inner workings of DSPy and understand the internals of the forward pass."
]
@@ -407,5 +445,5 @@
}
},
"nbformat": 4,
"nbformat_minor": 2
"nbformat_minor": 0
}
