pip install tokenizers==0.10.3

Fast and Customizable Tokenizers

Among the top 1000 packages on PyPI.
Over 11.0M downloads in the last 90 days.

Commonly used with tokenizers

Based on how often these packages appear together in public requirements.txt files on GitHub.

transformers

State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch

t5

Text-to-text transfer transformer

rouge-score

Pure python implementation of ROUGE-1.5.5.

tfds-nightly

tensorflow/datasets is a library of datasets ready to use with TensorFlow.

tensorflow-gan

TF-GAN: A Generative Adversarial Networks library for TensorFlow.

tensorflow-hub

TensorFlow Hub is a library to foster the publication, discovery, and consumption of reusable parts of machine learning models.

tf-slim

TensorFlow-Slim: A lightweight library for defining, training and evaluating complex models in TensorFlow

sacrebleu

Hassle-free computation of shareable, comparable, and reproducible BLEU, chrF, and TER scores

sentencepiece

SentencePiece python wrapper

kfac

K-FAC for TensorFlow

mesh-tensorflow

Mesh TensorFlow

tensorflow-text

TF.Text is a TensorFlow library of text related ops, modules, and subgraphs.

pyarrow

Python library for Apache Arrow

sacremoses

SacreMoses

omegaconf

A flexible configuration library

albumentations

Fast image augmentation library and easy to use wrapper around other libraries

avro-python3

Avro is a serialization and RPC framework.

tensorboard-plugin-wit

What-If Tool TensorBoard plugin.

fastavro

Fast read/write of AVRO files
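
The list above is described as being based on how often packages appear together in requirements.txt files. The site's actual method is not given, so the following is only a minimal sketch of one way such a co-occurrence count could be computed. The helper names `package_names` and `co_occurrence` and the sample file contents are hypothetical and chosen purely for illustration.

```python
import re
from collections import Counter


def package_names(requirements_text):
    """Return the set of normalized package names found in one requirements.txt."""
    names = set()
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and options such as -r or -e
            continue
        # The name is the leading run of characters before any specifier or extra.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # Normalize so that "Rouge_Score" and "rouge-score" compare equal.
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def co_occurrence(files, target):
    """Count how often each other package is listed alongside `target`."""
    counts = Counter()
    for text in files:
        names = package_names(text)
        if target in names:
            counts.update(names - {target})
    return counts


# Hypothetical sample data standing in for requirements.txt files.
sample_files = [
    "tokenizers==0.10.3\ntransformers>=4.0\nsentencepiece\n",
    "tokenizers\ntransformers\n# a comment\nsacrebleu==1.5.0\n",
    "numpy\npandas\n",
]

ranking = co_occurrence(sample_files, "tokenizers")
for name, count in ranking.most_common():
    print(name, count)
```

Files that do not list the target package are ignored entirely, so in this sample numpy and pandas never enter the count.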

Version usage of tokenizers

Proportion of downloaded versions in the last 3 months (only versions over 1%).

Version     Share of downloads
0.10.3      52.22%
0.9.4       17.81%
0.5.2       7.46%
0.7.0       6.81%
0.8.1rc2    4.60%
0.9.3       3.45%
0.8.1rc1    3.43%
0.0.11      1.33%