Tikentoken
Tikentoken is an Elixir library and CLI tool for experimenting with and learning about tokenization using various Ollama models. Besides tokenization, it supports embedding and chat modes. It performs real tokenization when Ollama is available and falls back to a mock tokenizer when Ollama is not running.
Features
- Tokenize text into IDs using various Ollama models
- Automatic fallback to mock tokenization when Ollama is unavailable
- Compute embeddings using various embedding models
- CLI interface for easy use
- Standalone escript executable
Installation
Option 1: Install from Hex (recommended)
Add tikentoken to your list of dependencies in mix.exs:
def deps do
  [
    {:tikentoken, "~> 0.1.0"}
  ]
end
Then run:
mix deps.get
Option 2: Local Development
Ensure you have Elixir installed (version ~> 1.18)
Clone or download this repository
Install dependencies:
mix deps.get
Compilation
Compile the project:
mix compile
To build the standalone executable:
mix escript.build
This creates a tikentoken executable in the project root.
Usage
CLI
Tokenization
Tokenize text using any supported Ollama model (default: embeddinggemma):
echo "Hello world" | ./tikentoken
Or specify a different model:
./tikentoken --model bge-large "Hello world"
Output:
Tokens: 5 5
(Note: With Ollama running, you'll get actual model token IDs. Without Ollama, it falls back to mock IDs based on word lengths.)
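To make the note above concrete, here is a minimal sketch of a length-based mock tokenizer. The module name MockTokenizerSketch is hypothetical, and this is only an assumption about how such a fallback could work, not Tikentoken's actual implementation.

```elixir
defmodule MockTokenizerSketch do
  @moduledoc """
  Hypothetical length-based mock tokenizer (illustration only,
  not Tikentoken's actual code).
  """

  # Split the text on whitespace and use each word's character
  # count as its mock "token ID".
  def tokenize(text) when is_binary(text) do
    text
    |> String.split()
    |> Enum.map(&String.length/1)
  end
end

IO.inspect(MockTokenizerSketch.tokenize("Hello world"))
# prints [5, 5]
```

Under this assumption, "Hello world" yields 5 and 5 because both words have five characters, which matches the mock output shown above.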
Embedding
Compute embeddings using an embedding model such as embeddinggemma (the default, 768 dimensions) or BGE-Large (up to 1024 dimensions). Embeddings are numerical representations of text that capture semantic meaning. Each text input is converted into a list of numbers (a vector) that can be used for similarity comparison, search, or machine learning. The maximum vector size depends on the model used.
echo "Hello world" | ./tikentoken --embed --embed_dim 768
Output:
Embedding (psql vector): [0.123, 0.456, ...] # 768-dimensional vector
Use smaller dimensions (256, 128) for faster processing and less storage, larger dimensions (768, 512) for more semantic detail.
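As an illustration of the "similarity comparison" use mentioned above, the following sketch computes cosine similarity between two vectors represented as plain Elixir lists. The module SimilaritySketch is hypothetical and not part of Tikentoken; the short vectors stand in for real embeddings.

```elixir
defmodule SimilaritySketch do
  # Cosine similarity between two equal-length lists of numbers.
  # Returns 0.0 if either vector has zero length (norm).
  def cosine(a, b) when length(a) == length(b) do
    dot =
      a
      |> Enum.zip(b)
      |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)

    norm_a = norm(a)
    norm_b = norm(b)

    if norm_a == 0.0 or norm_b == 0.0 do
      0.0
    else
      dot / (norm_a * norm_b)
    end
  end

  # Euclidean norm of a vector.
  defp norm(v) do
    :math.sqrt(Enum.reduce(v, 0.0, fn x, acc -> acc + x * x end))
  end
end

# Toy 2-dimensional vectors standing in for real embeddings.
IO.inspect(SimilaritySketch.cosine([1.0, 0.0], [1.0, 0.0]))
# prints 1.0 (same direction)
IO.inspect(SimilaritySketch.cosine([1.0, 0.0], [0.0, 1.0]))
# prints 0.0 (orthogonal)
```

Values close to 1.0 indicate vectors pointing in nearly the same direction, which for embeddings is commonly read as semantic similarity.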
Chat
Generate text responses using chat-capable models. This uses Ollama's text generation capabilities:
tikentoken --chat "Write a haiku about programming"
Output:
Software, code so clear
Infinite possibilities yet to unlock,
While you, the human, work day and night.
To program with ease, be mine!
Or with a specific model:
tikentoken --model tinyllama --chat "Explain recursion in simple terms"
Output:
Recursion is a programming technique where a function calls itself to solve a problem...
Options
- --model <name>: Ollama model name (default: embeddinggemma for tokenize/embed, tinyllama for chat). Supports: embeddinggemma, bge-large, gte-large, nomic-embed-text, tinyllama, mistral, etc.
- --format <id>: Token format (currently only id supported)
- --embed: Compute embedding using the specified model
- --chat: Generate chat/text response using the specified model
- --ollama_url <url>: Ollama API base URL (default: http://localhost:11434)
- --embed_dim <int>: Embedding dimension, i.e. vector size (varies by model). Higher = more detail but more storage (default: 768)
- --help: Show help
Programmatic Usage
After adding the dependency, you can use Tikentoken in your Elixir code:
# Tokenize text (with fallback to mock if Ollama unavailable)
{:ok, tokens} = Tikentoken.tokenize("Hello world")
# tokens: [5, 5] (mock IDs) or [105, 1919] (real Ollama IDs)
# Use a different model
{:ok, tokens} = Tikentoken.tokenize("Hello world", "bge-large")
# tokens: [5, 5] (mock) or actual BGE token IDs
# Compute embeddings (requires Ollama running)
{:ok, embedding} = Tikentoken.compute_embedding("Hello world", 768, "embeddinggemma")
# embedding: [0.123, 0.456, ...] (768-dimensional vector)
# Use BGE-Large for higher dimensional embeddings
{:ok, embedding} = Tikentoken.compute_embedding("Hello world", 1024, "bge-large")
# embedding: [0.123, 0.456, ...] (1024-dimensional vector)
# Chat with AI models (uses tinyllama by default)
{:ok, response} = Tikentoken.chat("Write a short poem about AI")
# response: "In circuits deep, where data streams flow..."
# Chat with custom options
{:ok, response} = Tikentoken.chat("Explain quantum physics", "tinyllama", %{"temperature" => 0.5})
Requirements
- Ollama: For real tokenization, embeddings, and chat, install and run Ollama locally. Pull any supported models:
ollama pull embeddinggemma   # Default for tokenization/embeddings (768 dimensions)
ollama pull bge-large        # Up to 1024 dimensions for embeddings
ollama pull tinyllama        # Default for chat (text generation)
ollama pull mistral          # Alternative chat model
ollama serve
- Without Ollama, the tool falls back to mock tokenization for development/testing.
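The fallback behaviour described above can be sketched as a small wrapper that tries a "real" tokenizer first and switches to a mock when it fails. Everything here is hypothetical: the module FallbackSketch, the injected functions, and the {:error, reason} shape are assumptions for illustration and may differ from how Tikentoken is actually written.

```elixir
defmodule FallbackSketch do
  # Call the real tokenizer; if it reports an error, return the
  # mock tokenizer's result instead. Both tokenizers are passed in
  # as functions so the sketch runs without Ollama.
  def tokenize(text, real_fun, mock_fun) do
    case real_fun.(text) do
      {:ok, tokens} -> {:ok, tokens}
      {:error, _reason} -> {:ok, mock_fun.(text)}
    end
  end
end

# Simulate an unreachable server with a function that always fails.
unavailable = fn _text -> {:error, :econnrefused} end

# Length-based mock, as an assumed stand-in for the real fallback.
mock = fn text -> text |> String.split() |> Enum.map(&String.length/1) end

IO.inspect(FallbackSketch.tokenize("Hello world", unavailable, mock))
# prints {:ok, [5, 5]}
```

Passing the tokenizers in as functions keeps the sketch testable without a running server; a real implementation would call the Ollama API in place of the failing stub.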
Testing
Run the test suite:
mix test
The tests work with or without Ollama running:
- With Ollama: Tests real tokenization
- Without Ollama: Tests fall back to mock tokenization automatically
All tests should pass regardless of Ollama status.
Development
- Format code:
  mix format
- Run linter:
  mix credo (if installed)
- Generate documentation:
  mix docs
Interactive Development
For developers who want to experiment and test the tokenizer interactively, start an IEx session with the project loaded:
iex -S mix
Then you can play with the tokenizer in real-time:
# Test tokenization with default model
{:ok, tokens} = Tikentoken.tokenize("Hello world")
IO.inspect(tokens) # See the token IDs
# Try a different model
{:ok, tokens} = Tikentoken.tokenize("Hello world", "bge-large")
IO.inspect(tokens)
# Test embeddings (requires Ollama running)
{:ok, embedding} = Tikentoken.compute_embedding("Hello world", 1024, "bge-large")
IO.inspect(length(embedding)) # Should be 1024
# Test chat (requires Ollama running with tinyllama)
{:ok, response} = Tikentoken.chat("Hello, how are you?")
IO.inspect(response) # AI response
This is useful for understanding how tokenization works, testing edge cases, and developing new features without rebuilding the CLI each time.
License
This project is licensed under the GNU Affero General Public License v3 (AGPLv3).