Tokenizer

class langchain_text_splitters.base.Tokenizer(
chunk_overlap: int,
tokens_per_chunk: int,
decode: Callable[[list[int]], str],
encode: Callable[[str], list[int]],
)[source]

Tokenizer data class. Bundles an encode/decode callable pair with the chunking parameters (tokens_per_chunk, chunk_overlap) used when splitting text on token boundaries.

Methods

__init__(chunk_overlap, tokens_per_chunk, ...)


__init__(
chunk_overlap: int,
tokens_per_chunk: int,
decode: Callable[[list[int]], str],
encode: Callable[[str], list[int]],
) → None
Parameters:
  • chunk_overlap (int) – Overlap in tokens between consecutive chunks.

  • tokens_per_chunk (int) – Maximum number of tokens per chunk.

  • decode (Callable[[list[int]], str]) – Function to decode a list of token ids to a string.

  • encode (Callable[[str], list[int]]) – Function to encode a string to a list of token ids.

Return type:

None
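A minimal sketch of how a Tokenizer is constructed and consumed. To keep it runnable without the library installed, it uses a local dataclass stand-in with the same four fields, a toy character-level encoder (one code point per token), and a sliding-window loop that mirrors how the splitter steps by tokens_per_chunk − chunk_overlap; the helper name split_on_tokens is illustrative, not part of the library API.

```python
from dataclasses import dataclass
from typing import Callable

# Stand-in with the same four fields as langchain_text_splitters.base.Tokenizer,
# so this sketch runs standalone (the real class is interchangeable here).
@dataclass
class Tokenizer:
    chunk_overlap: int                   # overlap in tokens between chunks
    tokens_per_chunk: int                # maximum number of tokens per chunk
    decode: Callable[[list[int]], str]   # token ids -> text
    encode: Callable[[str], list[int]]   # text -> token ids

# Toy character-level "tokenizer": each character's code point is one token.
tok = Tokenizer(
    chunk_overlap=2,
    tokens_per_chunk=5,
    decode=lambda ids: "".join(chr(i) for i in ids),
    encode=lambda s: [ord(c) for c in s],
)

def split_on_tokens(text: str, tokenizer: Tokenizer) -> list[str]:
    # Sliding window over the token ids: each chunk holds up to
    # tokens_per_chunk tokens, and the window advances by
    # tokens_per_chunk - chunk_overlap so adjacent chunks share tokens.
    ids = tokenizer.encode(text)
    step = tokenizer.tokens_per_chunk - tokenizer.chunk_overlap
    chunks: list[str] = []
    for start in range(0, len(ids), step):
        chunks.append(tokenizer.decode(ids[start : start + tokenizer.tokens_per_chunk]))
        if start + tokenizer.tokens_per_chunk >= len(ids):
            break
    return chunks

chunks = split_on_tokens("abcdefghij", tok)
# With 10 tokens, chunk size 5, overlap 2: ["abcde", "defgh", "ghij"]
```

In practice the encode/decode callables would come from a real tokenizer (e.g. a tiktoken encoding's encode and decode methods) rather than per-character code points.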