Feature/caching #1922

Open
wants to merge 11 commits into base: v2.6

Conversation

Collaborator

@hmoazam commented Dec 10, 2024

A single caching interface with two levels of cache: an in-memory LRU cache and a FanoutCache on disk.
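For illustration, here is a minimal sketch of what such a two-level cache could look like, using cachetools and diskcache; the class name, defaults, and key scheme are assumptions rather than the PR's exact implementation:

import hashlib
import json
from typing import Any, Dict, Optional

from cachetools import LRUCache
from diskcache import FanoutCache


class TwoLevelCache:
    """Illustrative sketch: an LRU cache in memory backed by a FanoutCache on disk."""

    def __init__(self, directory: str, mem_size_limit: int = 1_000_000,
                 disk_size_limit: int = 30 * 1024**3):
        self.memory_cache = LRUCache(maxsize=mem_size_limit)
        self.fanout_cache = FanoutCache(shards=16, timeout=2, directory=directory,
                                        size_limit=disk_size_limit)

    def cache_key(self, request: Dict[str, Any]) -> str:
        # Stable hash of the request; assumes the request is JSON-serializable.
        return hashlib.sha256(json.dumps(request, sort_keys=True).encode()).hexdigest()

    def get(self, request: Dict[str, Any]) -> Optional[Any]:
        key = self.cache_key(request)
        if key in self.memory_cache:
            return self.memory_cache[key]
        value = self.fanout_cache.get(key)
        if value is not None:
            self.memory_cache[key] = value  # promote disk hits to the in-memory level
        return value

    def put(self, request: Dict[str, Any], value: Any) -> None:
        key = self.cache_key(request)
        self.memory_cache[key] = value
        self.fanout_cache.set(key, value)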

@hmoazam requested a review from okhat December 10, 2024 22:15
dspy/__init__.py Outdated
@@ -64,6 +66,13 @@
configure = settings.configure
context = settings.context

CACHE_DIR = os.environ.get("DSPY_CACHEDIR") or os.path.join(Path.home(), ".dspy_cache")
Collaborator

@okhat Dec 10, 2024

Note to self: .dspy_cache is already used for litellm, so the naming in this PR should be swapped (it uses .dspy_litellm_cache for the existing cache, but that cache doesn't exist at that new path).

Also, the limit should be 30 GB.

    num_retries=num_retries,
)
from dspy import settings
@dspy_cache_decorator(settings.cache)
Collaborator

Hmm, this is a strange pattern; is the inner function redefined every time? Does the decorator still work that way? I guess it does, but it's probably a bit of overkill.

Collaborator Author

It does work but definitely a weird pattern. Not 100% sure if it has any negative side effects.
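For readers following along, here is a self-contained illustration of the pattern under discussion; the decorator and cache below are stand-ins, not the PR's actual dspy_cache_decorator:

import functools

def cache_decorator(cache: dict):
    # Stand-in for a cache decorator parameterized by a shared cache object.
    def wrap(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            key = (fn.__name__, args, tuple(sorted(kwargs.items())))
            if key in cache:
                return cache[key]
            result = fn(*args, **kwargs)
            cache[key] = result
            return result
        return wrapper
    return wrap

shared_cache: dict = {}

def outer(x: int) -> int:
    # The inner function (and its wrapper) is re-created on every call to outer(),
    # but caching still works because the cache object itself is shared and the
    # key does not depend on the identity of the redefined function object.
    @cache_decorator(shared_cache)
    def inner(x: int) -> int:
        return x * x
    return inner(x)

print(outer(3), outer(3))  # the second call is served from shared_cache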

def cached_litellm_text_completion_inner(request: Dict[str, Any], num_retries: int):
    return litellm_text_completion(
        request,
        cache={"no-cache": False, "no-store": False},
Collaborator

This passes cache to the function below. But for some reason we removed the cache kwarg below? Why?

Collaborator Author

The default (both kwargs true) should be added back to the litellm_text_completion function (I forgot to fix this).
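As a hedged sketch of that fix (not necessarily the PR's final code), the restored default could look like this, with both LiteLLM cache-control flags set to True so direct calls bypass LiteLLM's cache unless the cached wrapper overrides them:

from typing import Any, Dict, Optional

def litellm_text_completion(
    request: Dict[str, Any],
    num_retries: int,
    cache: Optional[Dict[str, bool]] = None,
):
    # Default restored as discussed: both flags True means "bypass LiteLLM's cache"
    # unless the caller (e.g. the cached wrapper) passes False/False explicitly.
    if cache is None:
        cache = {"no-cache": True, "no-store": True}
    # Body omitted in this sketch; the real function forwards request, num_retries,
    # and cache to LiteLLM.
    ...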

    key = self.cache_key(request)
except Exception:
    return None
with self.lock:
Collaborator

Hmm do we need to lock just to read?

Collaborator

If we can remove this lock, we may want to do something like if key := self.memory_cache.get() to avoid a double access.

Collaborator Author

I would think reads are thread safe but the docs aren't clear.
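If the lock does turn out to be removable here, the suggested read path might look like the following; this is a sketch extending the illustrative TwoLevelCache above, not the PR's code:

    def get(self, request: Dict[str, Any]) -> Optional[Any]:
        try:
            key = self.cache_key(request)
        except Exception:
            return None
        # One LRU lookup via the walrus operator instead of a membership check
        # followed by a second indexing access. Note that this treats a cached
        # None value as a miss.
        if (value := self.memory_cache.get(key)) is not None:
            return value
        if (value := self.fanout_cache.get(key)) is not None:
            self.memory_cache[key] = value  # promote disk hits to memory
        return value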

Collaborator

@okhat commented Dec 12, 2024

Recording some thoughts here.

  1. We need to make sure that the caching behavior makes sense if the function being cached raises an exception.
  2. We need to make sure that the caching behavior makes sense for LiteLLM. Specifically, LiteLLM's default caching is a bit nuanced: it does not cache api_* keys (which is important) and, as far as I can tell, it does not cache metadata (like cost). (A sketch of both points follows below.)
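A hedged sketch of both points, with assumed helper names rather than the PR's implementation: failures are never written to the cache, and api_* keys are excluded from the cache key.

import functools
import hashlib
import json
from typing import Any, Callable, Dict


def request_cache_key(request: Dict[str, Any]) -> str:
    # Point 2: drop api_* keys (api_key, api_base, ...) so credentials neither
    # affect the key nor leak into it; metadata-like fields could be dropped
    # the same way.
    relevant = {k: v for k, v in request.items() if not k.startswith("api_")}
    return hashlib.sha256(json.dumps(relevant, sort_keys=True).encode()).hexdigest()


def cached(cache: Dict[str, Any]) -> Callable:
    def wrap(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(request: Dict[str, Any], **kwargs: Any) -> Any:
            key = request_cache_key(request)
            if key in cache:
                return cache[key]
            # Point 1: if fn raises, the exception propagates and nothing is
            # stored, so failures are never cached.
            result = fn(request, **kwargs)
            cache[key] = result
            return result
        return wrapper
    return wrap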

@okhat changed the base branch from main to v2.6 December 13, 2024 15:38
"""

self.memory_cache = LRUCache(maxsize=mem_size_limit)
self.fanout_cache = FanoutCache(shards=16, timeout=2, directory=directory, size_limit=disk_size_limit)
Collaborator

@hmoazam As I understand it, part of the motivation for this PR is to enable users to dump the current state of the cache for a particular DSPy session.

Is the FanoutCache introduced for that purpose? If so, since FanoutCache shares the same directory as the litellm.cache, what extra utility is it providing?

Collaborator

Separately, I think we should define the behavior for dumping / loading a cache.

For example, consider the case where I've already run "Program 1" as a Python script / notebook. Now, I'm running "Program 2" in a separate notebook / Python script invocation. I'm using the default DSPy cache directory for both program executions.

If I dump the cache after I run Program 2, do I also want to include cache contents from Program 1? Or do I only want to include cache contents from Program 2, since that's my "current session"? Should users have a choice?

Once we align on the desired semantics, we can figure out how to make adjustments to the implementation.

cc @okhat

Collaborator

@dbczumar For disk caching, we will migrate from LiteLLM's caching to FanoutCache. The former has proved limiting when users need many read/write processes and threads in parallel.

For a short period of time, we might keep the LiteLLM cache as well. This means that highly active users that upgrade versions regularly will have an implicit seamless migration of their caches, because LiteLLM's requests will remain cached and will implicitly transfer to the FanoutCache. (This behavior is not required, and in principle there are other ways to more intentionally allow migration, but explicit migration is almost never worthwhile for caches.)

Cache dumping is fairly simple. It saves the current in-memory Python session, unless the user triggers a reset for the in-memory cache. (The current in-memory session draws on saved caches on disk, as usual.)
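A rough sketch of what dump/load along these lines could look like, reusing the illustrative TwoLevelCache from above; the method names and pickle format are assumptions:

import pickle
from typing import Any, Dict


class CacheDumpMixin:
    """Sketch only: dump and reload the in-memory level of the two-level cache."""

    def dump(self, path: str) -> None:
        # Saves whatever the current Python session has in memory (the LRU),
        # unless the user has reset the in-memory cache beforehand.
        with open(path, "wb") as f:
            pickle.dump(dict(self.memory_cache), f)

    def load(self, path: str) -> None:
        with open(path, "rb") as f:
            entries: Dict[Any, Any] = pickle.load(f)
        for key, value in entries.items():
            self.memory_cache[key] = value
            self.fanout_cache.set(key, value)  # also persist to disk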
