LEKCut (เล็ก คัด) is a Thai tokenization library that ports deep learning models to ONNX.
```
pip install lekcut
```

```python
from lekcut import word_tokenize

word_tokenize("ทดสอบการตัดคำ")
# output: ['ทดสอบ', 'การ', 'ตัด', 'คำ']
```

## API
```python
word_tokenize(text: str, model: str="deepcut", path: str="default", providers: List[str]=None) -> List[str]
```

Parameters:

- `text`: Text to tokenize
- `model`: Model to use (default: `"deepcut"`)
- `path`: Path to a custom model file (default: `"default"`)
- `providers`: List of ONNX Runtime execution providers (default: `None`, which uses the default CPU provider)
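For example, a minimal sketch of calling the API with non-default arguments; the file name `my_model.onnx` below is a placeholder, not a file shipped with LEKCut:

```python
from lekcut import word_tokenize

# Default model ("deepcut") on the default CPU provider
print(word_tokenize("ทดสอบการตัดคำ"))

# Hypothetical: load a custom ONNX model from disk via the path parameter
# ("my_model.onnx" is a placeholder file name)
print(word_tokenize("ทดสอบการตัดคำ", model="deepcut", path="my_model.onnx"))
```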
LEKCut supports GPU acceleration through ONNX Runtime execution providers. To use GPU acceleration:
- Install ONNX Runtime with GPU support:

  ```
  pip install onnxruntime-gpu
  ```

- Use the `providers` parameter to specify GPU execution:

  ```python
  from lekcut import word_tokenize

  # Use CUDA GPU
  result = word_tokenize("ทดสอบการตัดคำ", providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])

  # Use TensorRT (if available)
  result = word_tokenize("ทดสอบการตัดคำ", providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'])
  ```
Available Execution Providers:

- `CPUExecutionProvider` - Default CPU execution
- `CUDAExecutionProvider` - NVIDIA CUDA GPU acceleration
- `TensorrtExecutionProvider` - NVIDIA TensorRT optimization
- `DmlExecutionProvider` - DirectML for Windows GPU
- And more (see the ONNX Runtime documentation)
Note: The providers are tried in order, and the first available one is used. Always include `CPUExecutionProvider` as a fallback.
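Since provider availability depends on how ONNX Runtime was installed, it can help to query it at runtime before building the list. A minimal sketch, using only the standard `onnxruntime.get_available_providers()` call plus `word_tokenize`:

```python
import onnxruntime as ort
from lekcut import word_tokenize

# Ask ONNX Runtime which execution providers this installation supports.
available = ort.get_available_providers()
print(available)  # e.g. ['CUDAExecutionProvider', 'CPUExecutionProvider']

# Prefer CUDA when present, keeping the CPU provider as a fallback.
preferred = ['CUDAExecutionProvider', 'CPUExecutionProvider']
providers = [p for p in preferred if p in available]
result = word_tokenize("ทดสอบการตัดคำ", providers=providers)
```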
- `deepcut` - We ported the deepcut model from tensorflow.keras to ONNX. The model and code come from Deepcut's GitHub. The model is here.
If you have trained a custom model with deepcut, or with another model that LEKCut supports, you can load it by passing its path to `word_tokenize` after porting your model to ONNX.
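As an illustration, one common route for porting a trained tf.keras model to ONNX is tf2onnx. This is a sketch under that assumption, not LEKCut's own conversion script, and both file names are placeholders:

```python
# Sketch: convert a trained tf.keras model to ONNX with tf2onnx.
# Both file names below are hypothetical placeholders.
import tensorflow as tf
import tf2onnx

model = tf.keras.models.load_model("my_deepcut_model.h5")
model_proto, _ = tf2onnx.convert.from_keras(model, output_path="my_deepcut_model.onnx")
```

The resulting file could then be loaded with `word_tokenize(text, path="my_deepcut_model.onnx")`.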
- How to train a custom model on your dataset with deepcut - Notebook (you need to update `deepcut/train.py` before training the model)
See `notebooks/`.