add paddlenlp tokenizer #2706

Open
wants to merge 2 commits into
base: develop

Conversation

@zxcd zxcd commented Dec 20, 2024

add paddlenlp tokenizer

paddle-bot bot commented Dec 20, 2024

Thanks for your contribution!

Collaborator

@TingquanGao TingquanGao left a comment

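A few places need to be confirmed. In addition, please write clear type hints and docstrings for important functions and methods, especially public interfaces. For docstring style, follow the Google Python Style Guide as closely as possible.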
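
As a rough illustration of the style requested above, here is a minimal sketch of a function with type hints and a Google-style docstring. The function `truncate_ids` and its parameters are hypothetical and are not taken from this PR; they only show the layout of the `Args`, `Returns`, and `Raises` sections.

```python
from typing import List


def truncate_ids(token_ids: List[int], max_length: int) -> List[int]:
    """Truncates a token ID sequence to a maximum length.

    Hypothetical helper used only to illustrate the docstring layout.

    Args:
        token_ids: Token IDs produced by a tokenizer.
        max_length: Maximum number of IDs to keep. Must be non-negative.

    Returns:
        A new list holding at most `max_length` leading IDs.

    Raises:
        ValueError: If `max_length` is negative.
    """
    if max_length < 0:
        raise ValueError("max_length must be non-negative")
    # Slicing returns a new list, so the caller's list is left untouched.
    return token_ids[:max_length]
```

The summary line, the indented section headers, and one entry per parameter are the parts of the convention that matter most for public interfaces.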

paddlex/inference/models_new/common/__init__.py (Outdated)
import jieba
import numpy as np
import sentencepiece as spm
from paddle.utils import try_import
Collaborator

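paddle and fastdeploy conflict and cannot be installed at the same time, so paddle must not be imported directly at module level. Either use lazy_paddle instead (import lazy_paddle as paddle; paddle.try_import), or move this import line inside a function.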
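
The second option in the comment above (moving the import inside a function) can be sketched as follows. This is only a generic illustration of a function-local import: the function name `count_vocab_entries` is hypothetical, and the standard-library `json` module stands in for paddle so that the snippet runs without paddle installed. How lazy_paddle itself is implemented is not shown in this PR.

```python
def count_vocab_entries(raw_vocab: str) -> int:
    """Counts entries in a JSON-encoded vocabulary (hypothetical helper).

    Args:
        raw_vocab: JSON object mapping tokens to IDs, as a string.

    Returns:
        The number of tokens in the vocabulary.
    """
    # The import runs at call time, not when this module is imported,
    # so importing the module never requires the dependency to be installed.
    # `json` is a stand-in here for a conflicting package such as paddle.
    import json

    return len(json.loads(raw_vocab))
```

With this pattern, an environment that has only fastdeploy installed could still import the module, and the import error would surface only if the function that needs the missing package is actually called.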

Collaborator

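I am not very familiar with the specific code logic. It is enough to write clear docstrings for the important functions and class methods, especially the externally exposed interfaces.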

import json
import os

# import shutil
Collaborator

Please delete the debugging code.

2 participants