pass rates with current approach
pavel-esir committed Dec 16, 2024
1 parent e749333 commit c7bb795
Showing 6 changed files with 4,450 additions and 4,385 deletions.
100 changes: 50 additions & 50 deletions README.md
@@ -512,17 +512,17 @@ This report is autogenerated and includes tokenizers and detokenizers tests. The
<tbody>
<tr>
<td >BPE</td>
<td >97.18</td>
<td >95.69</td>
<td >4544</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >89.19</td>
<td >86.36</td>
<td >6633</td>
</tr>
<tr>
<td >Tiktoken</td>
<td >96.56</td>
<td >95.42</td>
<td >524</td>
</tr>
<tr>
@@ -548,283 +548,283 @@ This report is autogenerated and includes tokenizers and detokenizers tests. The
<tr>
<td >BPE</td>
<td >EleutherAI/gpt-neox-20b</td>
<td >95.92</td>
<td >94.29</td>
<td >245</td>
</tr>
<tr>
<td >BPE</td>
<td >NousResearch/Meta-Llama-3-8B-Instruct</td>
<td >100.00</td>
<td >99.19</td>
<td >247</td>
</tr>
<tr>
<td >BPE</td>
<td >Salesforce/codegen-16B-multi</td>
<td >96.17</td>
<td >94.64</td>
<td >261</td>
</tr>
<tr>
<td >BPE</td>
<td >Xenova/gpt-4o</td>
<td >100.00</td>
<td >98.47</td>
<td >261</td>
</tr>
<tr>
<td >BPE</td>
<td >ai-forever/rugpt3large_based_on_gpt2</td>
<td >94.64</td>
<td >93.10</td>
<td >261</td>
</tr>
<tr>
<td >BPE</td>
<td >bigscience/bloom</td>
<td >97.55</td>
<td >95.92</td>
<td >245</td>
</tr>
<tr>
<td >BPE</td>
<td >databricks/dolly-v2-3b</td>
<td >95.92</td>
<td >94.29</td>
<td >245</td>
</tr>
<tr>
<td >BPE</td>
<td >deepseek-ai/deepseek-coder-6.7b-instruct</td>
<td >99.24</td>
<td >98.48</td>
<td >263</td>
</tr>
<tr>
<td >BPE</td>
<td >facebook/galactica-120b</td>
<td >95.92</td>
<td >94.29</td>
<td >245</td>
</tr>
<tr>
<td >BPE</td>
<td >facebook/opt-66b</td>
<td >96.73</td>
<td >95.92</td>
<td >245</td>
</tr>
<tr>
<td >BPE</td>
<td >gpt2</td>
<td >95.40</td>
<td >93.87</td>
<td >261</td>
</tr>
<tr>
<td >BPE</td>
<td >koalajun/Gemma-2-9b-it-Ko-Crypto-Translate</td>
<td >100.00</td>
<td >99.19</td>
<td >247</td>
</tr>
<tr>
<td >BPE</td>
<td >laion/CLIP-ViT-bigG-14-laion2B-39B-b160k</td>
<td >100.00</td>
<td >95.40</td>
<td >261</td>
</tr>
<tr>
<td >BPE</td>
<td >microsoft/deberta-base</td>
<td >96.73</td>
<td >95.92</td>
<td >245</td>
</tr>
<tr>
<td >BPE</td>
<td >roberta-base</td>
<td >95.40</td>
<td >94.64</td>
<td >261</td>
</tr>
<tr>
<td >BPE</td>
<td >stabilityai/stablecode-completion-alpha-3b-4k</td>
<td >95.92</td>
<td >94.29</td>
<td >245</td>
</tr>
<tr>
<td >BPE</td>
<td >stabilityai/stablelm-2-1_6b</td>
<td >100.00</td>
<td >98.37</td>
<td >245</td>
</tr>
<tr>
<td >BPE</td>
<td >tiiuae/falcon-7b</td>
<td >93.87</td>
<td >92.34</td>
<td >261</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >NousResearch/Llama-2-13b-hf</td>
<td >97.55</td>
<td >99.18</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >NousResearch/Llama-2-13b-hf_legacy_sp_backend</td>
<td >97.55</td>
<td >98.37</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >NousResearch/Llama-2-13b-hf_sp_backend</td>
<td >94.29</td>
<td >99.18</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >TinyLlama/TinyLlama-1.1B-Chat-v1.0</td>
<td >100.00</td>
<td >99.19</td>
<td >247</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >TinyLlama/TinyLlama-1.1B-Chat-v1.0_legacy_sp_backend</td>
<td >98.38</td>
<td >97.57</td>
<td >247</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >TinyLlama/TinyLlama-1.1B-Chat-v1.0_sp_backend</td>
<td >100.00</td>
<td >99.19</td>
<td >247</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >baichuan-inc/Baichuan2-7B-Chat_legacy_sp_backend</td>
<td >100.00</td>
<td >98.37</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >camembert-base_legacy_sp_backend</td>
<td >75.51</td>
<td >69.80</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >camembert-base_sp_backend</td>
<td >52.24</td>
<td >46.53</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >facebook/musicgen-small_legacy_sp_backend</td>
<td >78.37</td>
<td >72.65</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >facebook/musicgen-small_sp_backend</td>
<td >83.67</td>
<td >77.96</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >microsoft/Phi-3-mini-128k-instruct</td>
<td >100.00</td>
<td >98.38</td>
<td >247</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >microsoft/Phi-3-mini-128k-instruct_legacy_sp_backend</td>
<td >97.57</td>
<td >95.95</td>
<td >247</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >microsoft/Phi-3-mini-128k-instruct_sp_backend</td>
<td >99.19</td>
<td >97.57</td>
<td >247</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >microsoft/deberta-v3-base_legacy_sp_backend</td>
<td >100.00</td>
<td >94.29</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >microsoft/deberta-v3-base_sp_backend</td>
<td >96.73</td>
<td >91.02</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >mlx-community/quantized-gemma-7b-it</td>
<td >97.57</td>
<td >96.76</td>
<td >247</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >mlx-community/quantized-gemma-7b-it_legacy_sp_backend</td>
<td >97.57</td>
<td >96.76</td>
<td >247</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >mlx-community/quantized-gemma-7b-it_sp_backend</td>
<td >96.76</td>
<td >95.95</td>
<td >247</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >rinna/bilingual-gpt-neox-4b_legacy_sp_backend</td>
<td >86.12</td>
<td >85.31</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >rinna/bilingual-gpt-neox-4b_sp_backend</td>
<td >80.41</td>
<td >77.14</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >t5-base_legacy_sp_backend</td>
<td >80.00</td>
<td >74.29</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >t5-base_sp_backend</td>
<td >85.31</td>
<td >79.59</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >xlm-roberta-base_legacy_sp_backend</td>
<td >95.10</td>
<td >89.39</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >xlm-roberta-base_sp_backend</td>
<td >95.10</td>
<td >89.39</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >xlnet-base-cased_legacy_sp_backend</td>
<td >57.96</td>
<td >52.24</td>
<td >245</td>
</tr>
<tr>
<td >SentencePiece</td>
<td >xlnet-base-cased_sp_backend</td>
<td >64.49</td>
<td >58.78</td>
<td >245</td>
</tr>
<tr>
<td >Tiktoken</td>
<td >Qwen/Qwen-14B-Chat</td>
<td >100.00</td>
<td >98.47</td>
<td >261</td>
</tr>
<tr>
<td >Tiktoken</td>
<td >THUDM/glm-4-9b-chat</td>
<td >93.16</td>
<td >92.40</td>
<td >263</td>
</tr>
<tr>