Evaluating Home LLM and collaboration with core developers #179
-
Hi, that sounds interesting. I currently don't have much time to commit to this project and am mostly just making fixes to the integration as it breaks. If all you're looking for are some local models to benchmark, then all of the models that I've fine-tuned are on HuggingFace here: https://huggingface.co/acon96. If you need them in a different format (e.g. safetensors), just let me know and I can upload them when I get a bit of time.
I have my own eval script in this repo that at least tries to measure how accurate the model is, but it does use a test set similar to the training set (no direct examples, but device names and requests are re-used). I would definitely like to see results from completely held-out examples, since I couldn't find any other datasets online when I built my own dataset.
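(Not part of the original exchange, but for anyone following along, here is a minimal sketch of what pulling one of these models for a quick local smoke test could look like, assuming the `transformers` library. The repo id and the bare prompt are assumptions; a proper benchmark would use the integration's own system prompt with a device list and the model's expected output format.)

```python
# Minimal sketch: load one of the fine-tuned models from the acon96
# HuggingFace namespace and generate a reply to a single request.
# The repo id "acon96/Home-3B-v3" is an assumption; check the namespace
# for the exact model names and formats (GGUF vs. safetensors).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "acon96/Home-3B-v3"  # assumed repo name; substitute the model under test

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# A held-out style request: device names here are not drawn from the training set.
prompt = "Turn off the kitchen light and set the thermostat to 20 degrees."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```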
-
Hi there,
I have been working with a group of other Home Assistant core developers on the LLM APIs and on the evaluations in https://github.com/allenporter/home-assistant-datasets, which aim to set a quality baseline.
Are you interested in collaborating on a quality evaluation for Home LLM? Generally, I am interested in establishing a quality baseline that shows how your fine-tuning or other fixes improve quality over base local LLMs, and in helping the community run their own local LLMs.
If you are interested and/or want to join the Home Assistant LLM group on Discord to discuss, let me know!