Token Counter

An essential tool for AI developers working with Large Language Models (LLMs). This component provides real-time counting of tokens, characters, and words in text, helping you estimate token usage and costs before sending requests to AI APIs.

Features

  • Real-Time Counting - Updates as you type
  • Token Estimation - Estimates token count using a hybrid algorithm
  • Character Count - Total character count including spaces
  • Word Count - Accurate word count
  • Copy to Clipboard - One-click copy of the text
  • Visual Statistics - Beautiful card-based display of metrics
  • TypeScript - Fully typed with TypeScript interfaces
  • Responsive Design - Works on all screen sizes

Installation

pnpm dlx ncdai add token-counter

Basic Usage

import { TokenCounter } from "@/components/token-counter";
 
export function MyCounterApp() {
  return (
    <TokenCounter
      defaultText="Enter your text here..."
      onCopy={(text, tokenCount) => {
        console.log(`Copied text with ${tokenCount} tokens`);
      }}
    />
  );
}

Advanced Usage

import { TokenCounter } from "@/components/token-counter";
import { useState } from "react";
 
export function AdvancedCounterApp() {
  const [lastCopied, setLastCopied] = useState("");
 
  const handleCopy = (text: string, tokenCount: number) => {
    // Calculate cost (example: $0.002 per 1K tokens)
    const cost = (tokenCount / 1000) * 0.002;
    
    console.log({
      text,
      tokens: tokenCount,
      estimatedCost: `$${cost.toFixed(4)}`,
    });
 
    // Keep track of the most recently copied text
    setLastCopied(text);
  };
 
  return (
    <div>
      <TokenCounter defaultText="" onCopy={handleCopy} />
      {lastCopied && <p>Last copied: {lastCopied}</p>}
    </div>
  );
}

Props

TokenCounterProps

Prop         Type                                        Default  Description
defaultText  string                                      ""       Initial text value
onCopy       (text: string, tokenCount: number) => void  -        Callback invoked when the text is copied
className    string                                      -        Additional CSS classes for the container

Token Estimation Algorithm

The component uses a hybrid estimation algorithm that combines:

  1. Word-based estimation - ~1.3 tokens per word (average)
  2. Character-based estimation - ~4 characters per token (average)

The final token count is the average of both methods, providing a more accurate estimate than either method alone.
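This hybrid estimate can be sketched in TypeScript as follows (a hypothetical implementation for illustration; the component's actual code may differ):

```typescript
// Hypothetical sketch of the hybrid estimator described above.
function estimateTokens(text: string): number {
  const trimmed = text.trim();
  const words = trimmed === "" ? 0 : trimmed.split(/\s+/).length;
  const chars = text.length;

  const wordBased = words * 1.3; // ~1.3 tokens per word
  const charBased = chars / 4;   // ~4 characters per token

  // Final estimate: average of both methods
  return Math.round((wordBased + charBased) / 2);
}
```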

Note: This is an estimation. For production use with specific models, consider using dedicated tokenizer libraries:

  • tiktoken for OpenAI models
  • @anthropic-ai/tokenizer for Anthropic models
  • gpt-tokenizer for GPT models

Use Cases

1. Cost Estimation

const handleCopy = (text: string, tokenCount: number) => {
  // Example GPT-4 pricing: $0.03 per 1K input tokens, $0.06 per 1K output tokens
  const inputCost = (tokenCount / 1000) * 0.03;
  
  // Estimate output length (here assumed to be ~50% of the input)
  const outputTokens = tokenCount * 0.5;
  const outputCost = (outputTokens / 1000) * 0.06;
  
  const totalCost = inputCost + outputCost;
  
  console.log(`Estimated cost: $${totalCost.toFixed(4)}`);
};

2. Prompt Length Validation

const handleCopy = (text: string, tokenCount: number) => {
  const MAX_TOKENS = 4000; // Model context limit
  
  if (tokenCount > MAX_TOKENS) {
    alert(`Prompt too long! ${tokenCount} tokens exceeds ${MAX_TOKENS} limit.`);
    return;
  }
  
  // Proceed with API call
};

3. Batch Processing

// estimateTokens: the hybrid word/character estimator described above
const texts = ["Text 1", "Text 2", "Text 3"];
 
texts.forEach((text) => {
  const tokenCount = estimateTokens(text);
  console.log(`${text}: ${tokenCount} tokens`);
});

Integration with Tokenizer Libraries

For more accurate token counting, you can integrate with dedicated tokenizer libraries:

Using tiktoken (OpenAI)

import { encoding_for_model } from "tiktoken";
 
const enc = encoding_for_model("gpt-4");
const tokenCount = enc.encode(text).length;
enc.free(); // release the WASM-backed encoder when done

Using @anthropic-ai/tokenizer (Anthropic)

import { countTokens } from "@anthropic-ai/tokenizer";
 
const tokenCount = countTokens(text);

Best Practices

  1. Use Estimations for Planning - Use this component for quick estimates during development
  2. Use Tokenizers for Production - Switch to dedicated tokenizers for production accuracy
  3. Monitor Costs - Track token usage to monitor API costs
  4. Set Limits - Implement token limits to prevent excessive API usage
  5. Cache Results - Cache token counts for frequently used prompts
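Point 5 (caching) could be sketched like this (a hypothetical helper, not part of the component's API):

```typescript
// Hypothetical in-memory cache for token counts of frequently used prompts.
const tokenCountCache = new Map<string, number>();

function cachedTokenCount(
  text: string,
  count: (t: string) => number, // any token counter, e.g. a tokenizer wrapper
): number {
  const cached = tokenCountCache.get(text);
  if (cached !== undefined) return cached;

  const result = count(text);
  tokenCountCache.set(text, result);
  return result;
}
```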

Statistics Display

The component displays three key metrics:

  • Tokens - Estimated token count (most important for LLM APIs)
  • Characters - Total character count including spaces
  • Words - Word count (useful for readability metrics)
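Unlike the token estimate, the character and word metrics can be computed exactly, e.g. (hypothetical helpers matching the metrics above):

```typescript
// Hypothetical helpers for the exact character and word counts shown above.
function countCharacters(text: string): number {
  return text.length; // includes spaces
}

function countWords(text: string): number {
  const trimmed = text.trim();
  return trimmed === "" ? 0 : trimmed.split(/\s+/).length;
}
```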


Dependencies

  • lucide-react - For icons
  • @ncdai/utils - Utility functions
  • button - Button component (shadcn/ui)
  • textarea - Textarea component (shadcn/ui)

Notes

  • Token estimation is approximate and may vary by model
  • For production use, consider integrating with model-specific tokenizers
  • Character and word counts are exact
  • The component is optimized for real-time updates