-
I think it depends on the approach. I don't know of any popular "small" language models (<7B) that support multiple languages right now. We would need a new foundation model trained from the ground up in the new language, and then this dataset would need to be translated into that language as well for the fine-tuning to be effective. A much larger model should have multiple languages in its base training data and could potentially "transfer" the fine-tuning from English examples to another language. I believe Mixtral-8x7B has shown particularly good capabilities at performing tasks in multiple languages as well as English, but that is a huge model that is still fairly difficult to run on consumer hardware (reasonable quantizations still require about 32GB of VRAM).
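As a rough sanity check on that VRAM figure, here's a back-of-the-envelope estimate (just a sketch; the ~46.7B total parameter count for Mixtral-8x7B, the 5-bit quant level, and the overhead figure are my assumptions, not exact numbers):

```python
# Rough VRAM estimate for a quantized Mixtral-8x7B.
params = 46.7e9          # assumption: total parameters across all experts
bits_per_weight = 5.0    # assumption: a "reasonable" quant like Q5_K_M
overhead_gb = 2.0        # assumption: KV cache + runtime buffers

weights_gb = params * bits_per_weight / 8 / 1e9
print(f"~{weights_gb + overhead_gb:.0f} GB VRAM")  # -> ~31 GB, in line with the 32GB figure
```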
-
Wanted to add an update on this: as I have done more digging, it looks like there might be some resources for putting together a multi-language version of the model. There is a translation of the alpaca dataset that I used into 132 languages: https://huggingface.co/datasets/saillab/taco-datasets. This should help with the fine-tuning step. The biggest hurdle would be translating this project's dataset into other languages. I'm not sure I'm a fan of using machine translation on it, since I have no way to rate the accuracy of its output.
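For anyone who wants to poke at that dataset, loading one language should look roughly like this with the `datasets` library (a sketch; the per-language file path below is a guess, so check the dataset card for the real layout):

```python
from datasets import load_dataset

# Assumption: the repo exposes one JSON file per language; the path below is
# illustrative only -- check the dataset card for actual file names.
ds = load_dataset(
    "saillab/taco-datasets",
    data_files="multilingual-instruction-tuning/german.json",  # hypothetical path
    split="train",
)
print(ds[0])  # expect alpaca-style fields: instruction / input / output
```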
-
After doing some quick experimenting with
To me, this is promising, and I think having proper fine-tuning examples in the desired language should significantly boost the model's performance in that language, much closer to its English performance. The next step is to translate the fine-tuning dataset into other languages. If you want to have this project usable in your language, please take a look at: Adding a column to these CSV files for your language would be very helpful, so we don't need to use machine translation, which will inevitably lead to lower-quality data.
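Contributing a column could look something like this (a sketch only; the CSV file name and column names here are hypothetical, so match them to the actual files in the repo):

```python
import pandas as pd

# Hypothetical file/column names -- match these to the repo's actual CSV files.
df = pd.read_csv("pile_of_responses.csv")

# Seed the new column with the English text, then translate each row by hand.
df["polish"] = df["english"]
df.to_csv("pile_of_responses.csv", index=False)
```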
-
I have fine-tuned a version of StableLM-Zephyr-3B, and it has surprisingly good multi-language understanding given that the base model was not trained on non-English data. It is available here: https://huggingface.co/acon96/Home-3B-v3-GGUF. I was able to get it to respond in English, German, Spanish, and French.
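If you want to try it quickly with `llama-cpp-python`, something like this should work (a sketch; the quant filename pattern below is an assumption, so pick the real file from the model page):

```python
from llama_cpp import Llama

# Assumption: the filename glob below matches one of the quants in the repo --
# check https://huggingface.co/acon96/Home-3B-v3-GGUF for the actual files.
llm = Llama.from_pretrained(
    repo_id="acon96/Home-3B-v3-GGUF",
    filename="*q4_k_m.gguf",  # glob pattern resolved against the repo's files
    n_ctx=2048,
)

# Try a non-English request to exercise the multilingual behavior.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Schalte das Licht im Wohnzimmer ein."}],
)
print(out["choices"][0]["message"]["content"])
```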
-
Hello, thanks for all your hard work. How does this model perform in Chinese?
-
The English version is awesome, nice work! Any updates on implementing Dutch support? I would love to speak to the AI in Dutch.
-
Hi, I see this topic has been dormant for a couple of months now, and that there are currently translations into three languages other than English. I would very much like to help with the translation into Polish. This week I will try to complete the translation of the two data files into Polish. Just let me know what the next steps would be once that's done. Regards
-
An update for those following this thread: I've uploaded a model I trained a little while ago but never got around to fully finalizing. It is a full fine-tune of StableLM Zephyr 3B (instead of just a LoRA) using three languages (German, French, and Spanish) in addition to English. It is available here: https://huggingface.co/acon96/stablehome-multilingual-experimental. Any feedback on how the model performs in languages other than English would be appreciated. I want to do another training run (it involves cloud compute instead of my workstation), but I want to iterate on the translated dataset a bit before attempting that. Make sure you install the
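To try it out with `transformers` (a sketch; I'm assuming the repo loads as a standard causal LM and ships a chat template, so treat those details as assumptions and defer to the model card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "acon96/stablehome-multilingual-experimental"
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

# Try a non-English request to exercise the multilingual fine-tune.
messages = [{"role": "user", "content": "Éteins la lumière de la cuisine."}]
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```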
-
Hi, I finally have some more time to take a look at this, and there has actually been a new release of the Bielik model in Polish. It's now available in v2.2. Can we work on better support for Polish?
-
Hi, what would be the way forward on multi-language support? Is it just a matter of another base model, or is it more complicated than that?