Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add convenience methods to detect if a model is installed #236

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

bmurray
Copy link

@bmurray bmurray commented Oct 31, 2024

This adds convenience functions to detect if a model is installed locally, without querying the server.

@ZachNagengast
Copy link
Contributor

Thanks for adding this @bmurray looks useful! Can you think of a test to add to verify it is working? Can be part of the unit tests here: https://github.com/argmaxinc/WhisperKit/blob/main/Tests/WhisperKitTests/UnitTests.swift

Also, was the need for this derived from needing offline access to the models? Or some other motivation? We definitely want to make sure offline access is always available so if there was a regression there from a recent release we should fix that.

@ZachNagengast ZachNagengast added the enhancement Improves existing code label Oct 31, 2024
@bmurray
Copy link
Author

bmurray commented Oct 31, 2024

Yea I'm working on replacing WhisperCPP in my app with this, and this is a feature that I needed. It's more or less just detecting if the model is already installed or not, so I can provide feedback to users. I'm probably going to add some other additions here in a bit that provide other feedback, such as if the model is downloading. As far as I can tell, theres no visibility into whats happening if it "auto" downloads a model when a transcription starts.

I'll take a look to see if theres a unit test that makes sense.

…lready downloaded, without querying online servers.
@bmurray
Copy link
Author

bmurray commented Nov 1, 2024

I took a look and I'm not sure where to put any tests for this that don't require downloading models just to test if they're installed.

I updated the pull to do offline support. With download: false set, and the model already downloaded (either manually or automatically), the transcriber can run. It probably needs some more testing as theres a bunch of things that silently query websites.

@ZachNagengast
Copy link
Contributor

Ok cool, I think a good test would start in a similar way to this one, where it just loads the model (which would imply its downloaded), then simply make sure the new functions correctly detect that it is installed.

Also I'd recommend keeping this interface the same modelRepo: String?, since folks will have a breaking change when they update to the new version.

@bmurray
Copy link
Author

bmurray commented Nov 1, 2024

Yea it makes sense not to break the API. I figured it was just unifying the API but it could break it for others.

I added a couple tests to ensure it works. Seems to be testing on my end here.

Copy link
Contributor

@ZachNagengast ZachNagengast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving pending the review comments so that the tests will run.

@@ -266,12 +266,87 @@ open class WhisperKit {
throw error
}
}
public static func modelLocation(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add a new line above this for code style. Will want to add docc for all of the functions in this file eventually but not a blocker.

throw error
}
}
public static func modelInstalled(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New line above this as well

"Model was not installed"
)
XCTAssertTrue(
WhisperKit.modelInstalled(variant: "tiny.en"),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll need some method to download this one ahead of time before checking it is installed.

run: make download-model MODEL=tiny
you can do that by either adding a line make download-model MODEL=tiny.en here to download it, or just removing this tiny.en check and only running this test for tiny.

let location2 = try WhisperKit.modelLocation(variant: "openai_whisper-tiny")
XCTAssertEqual(location, location2, "Auto fallback to OpenAI model returned a different result")
}
func testOfflineMode() async throws {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New line above this one

.withPrevious()
.sink { previous, current in
if let previous {
XCTAssertLessThan(previous, current)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improves existing code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants