how translation cache works
Type: Document
Status: Published
Created: Aug 8, 2025
Updated: Mar 28, 2026
Updated by: Dosu Bot

How Translation Cache Works#

The translation cache in Read Frog optimizes performance and reduces redundant translation requests by storing previously translated results in persistent browser storage.

Implementation#

Read Frog uses Dexie (a wrapper for IndexedDB) to provide persistent storage for translations in the browser extension. The cache is implemented as a table named translationCache in the Dexie database. Each cache entry is represented by the TranslationCache class, which includes:

  • key: a string used as the cache key (see below for details)
  • translation: the translated text
  • createdAt: the timestamp when the entry was created

Cache Usage Flow#

When a translation request is made, the system first checks the cache for an existing translation using a hash key. If a cached translation is found, it is returned immediately, avoiding an external translation API call. If the cache does not contain the requested translation, the system performs the translation using the selected provider (Google, Microsoft, or an AI model). After a successful translation, the result is stored in the cache with the hash key and a timestamp.
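The flow above can be sketched as a read-through cache helper. This is a minimal illustration, not the extension's actual code: an in-memory `Map` stands in for the Dexie `translationCache` table, and the function names are hypothetical.

```typescript
// Hypothetical read-through sketch of the cache usage flow.
// A Map stands in for the Dexie translationCache table.
type Translate = (text: string) => Promise<string>;

const cache = new Map<string, { translation: string; createdAt: Date }>();

async function translateWithCache(
  hashKey: string,
  text: string,
  translate: Translate, // Google, Microsoft, or an AI model
): Promise<string> {
  const hit = cache.get(hashKey);
  if (hit) return hit.translation;           // cache hit: no external API call

  const translation = await translate(text); // cache miss: call the provider
  cache.set(hashKey, { translation, createdAt: new Date() });
  return translation;
}
```

The second lookup with the same hash key returns immediately from the cache, so the provider is only called once per unique key.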

Cache Key Generation#

Article Translation#

The cache key for article translation is generated as a hash of the following values:

  • The original text to be translated
  • The provider configuration
  • The configured source language code (as set by the user or 'auto'; detected language is not included)
  • The target language code
  • For LLM (AI) providers: The resolved translation prompt (including any customizations)
  • For LLM (AI) providers with AI Content Aware enabled: Article context (title and content)

Note: The detected language code (detectedCode) is intentionally excluded from the cache key. This ensures that cache lookups remain stable even if language detection changes after translation. For LLM providers, the prompt used for translation is included in the cache key. This means that if you change the translation prompt (for example, by editing or selecting a different custom prompt), the cache will be invalidated for that prompt, and new translations will be fetched and cached accordingly. For non-LLM providers, the prompt is not included in the cache key.
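The key derivation described above can be sketched as a pure function. The component labels (`text:`, `provider:`, and so on) and the use of SHA-256 are illustrative assumptions, not the extension's exact wire format.

```typescript
import { createHash } from 'node:crypto';

// Hypothetical sketch of article cache key derivation.
// Component labels and hash algorithm are assumptions for illustration.
interface ArticleKeyInput {
  text: string;
  providerConfig: string; // serialized provider settings
  sourceCode: string;     // configured source language ('auto' allowed)
  targetCode: string;
  isLLM: boolean;
  prompt?: string;        // resolved prompt, LLM providers only
}

function buildArticleCacheKey(input: ArticleKeyInput): string {
  const parts = [
    `text:${input.text}`,
    `provider:${input.providerConfig}`,
    `sourceCode:${input.sourceCode}`, // detectedCode is deliberately excluded
    `targetCode:${input.targetCode}`,
  ];
  if (input.isLLM && input.prompt !== undefined) {
    parts.push(`prompt:${input.prompt}`); // prompt changes invalidate LLM keys
  }
  return createHash('sha256').update(parts.join('\n')).digest('hex');
}
```

Note how changing the prompt produces a different key for an LLM provider, while a non-LLM provider's key ignores the prompt entirely.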

Subtitle Translation#

Subtitle translation generates its cache keys with logic separate from article translation. The cache key for subtitle translation includes all the standard components (text, provider, languages, prompt for LLM providers) plus additional subtitle-specific context:

When AI Content Aware is enabled:

  • Summary status flag: Either subtitleSummary=ready or subtitleSummary=missing
  • Summary content: The actual summary text (when available), prepended with summary:

This means that subtitle translations are cached separately based on whether an AI Smart Context summary was available at translation time. The same subtitle text will have different cache entries when:

  1. Translated with AI Smart Context summary available vs. missing
  2. Translated with different summary content (e.g., different videos or context)
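The subtitle-specific components can be sketched as a small helper. The `subtitleSummary=ready` / `subtitleSummary=missing` flags and the `summary:` prefix follow the description above; the function name is hypothetical.

```typescript
// Hypothetical sketch: subtitle-specific key components added when
// AI Content Aware is enabled. Labels follow the documented format.
function subtitleSummaryComponents(summary: string | null): string[] {
  if (summary !== null) {
    return ['subtitleSummary=ready', `summary:${summary}`];
  }
  return ['subtitleSummary=missing'];
}
```

Because the summary text itself is a component, two videos with different summaries produce different keys even for identical subtitle lines.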

Implementation Details:
The subtitle summary context hash is computed using the buildSubtitlesSummaryContextHash() function in translator.ts, which hashes the cleaned subtitle transcript content and provider configuration. The summary is fetched asynchronously in the background via the getSubtitlesSummary message handler and injected into translation requests once available. If the context hash changes (e.g., video navigation), stale summaries are discarded to ensure cache consistency.

Impact on Cache Behavior:

  • Subtitle translations without a summary are cached separately from those with a summary
  • This prevents cache mismatches when AI Smart Context becomes available
  • Cache invalidation occurs when summary content or availability changes
  • The decoupled summary fetch allows translation to start immediately without waiting for summary generation

Article Context in Cache Keys (LLM Providers with AI Content Aware)#

When using LLM providers (not API providers) with AI Content Aware enabled, the cache key includes article context extracted from the current page:

  • Article title: Added to hash components as title:{title}
  • Article text content: First 1000 characters added as content:{textContent.slice(0, 1000)}

This means that the same text will have different cache entries when:

  1. It appears on different pages (different article titles)
  2. It appears in articles with different content
  3. The page title changes
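The two context components can be sketched as follows; the `title:` and `content:` labels match the description above, and the 1000-character truncation bounds the amount of page content that influences the key.

```typescript
// Sketch of the extra hash components contributed by article context
// (LLM providers with AI Content Aware enabled).
interface ArticleContext {
  title: string;
  textContent: string;
}

function articleContextComponents(ctx: ArticleContext): string[] {
  return [
    `title:${ctx.title}`,
    `content:${ctx.textContent.slice(0, 1000)}`, // only the first 1000 chars
  ];
}
```

Truncating to 1000 characters keeps keys stable against edits deep in a long article while still distinguishing different pages.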

Article Context Extraction:

  • Article context is extracted using the article-context.ts module
  • Uses Mozilla Readability to extract clean article content
  • Falls back to document.body.textContent if Readability fails
  • Context is cached per URL to avoid repeated extraction
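The extraction flow can be sketched with injected callbacks, where the extractor stands in for Mozilla Readability and the fallback stands in for `document.body.textContent`. All names here are hypothetical; the real module is `article-context.ts`.

```typescript
// Hypothetical sketch of the extraction flow: try a Readability-style
// extractor, fall back to raw body text, and memoize per URL.
interface Extracted { title: string; textContent: string }

const contextByUrl = new Map<string, Extracted>();

function getArticleContext(
  url: string,
  extract: () => Extracted,   // stands in for Mozilla Readability
  fallbackText: () => string, // stands in for document.body.textContent
): Extracted {
  const cached = contextByUrl.get(url);
  if (cached) return cached;  // per-URL cache avoids repeated extraction

  let ctx: Extracted;
  try {
    ctx = extract();
  } catch {
    ctx = { title: '', textContent: fallbackText() };
  }
  contextByUrl.set(url, ctx);
  return ctx;
}
```

The per-URL memoization means repeated translations on the same page reuse one extraction pass.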

Impact on Cache Behavior:

  • Cache entries are more specific when AI Content Aware is enabled with LLM providers
  • The same sentence on different pages will have separate cache entries
  • This allows context-aware translations to be properly cached without cross-contamination
  • Cache invalidation happens when navigating to a different URL (different article context)

Implementation Details:
The buildHashComponents function in translate-text.ts accepts an optional articleContext parameter with title and textContent fields. These are included in the hash when:

  1. Using an LLM provider (not API provider)
  2. AI Content Aware is enabled
  3. Article context data is available

The articleContext is explicitly passed through the translateTextCore() options and through all translation variant functions (translateTextForPage, translateTextForPageTitle, translateTextForSelection, translateTextForInput).

Example Cache Lookup and Write#

// Check cache first
const cached = await db.translationCache.get(hash);
if (cached) {
  return cached.translation;
}

// ...perform translation...

// Cache the translation result
await db.translationCache.put({
  key: hash,
  translation: result,
  createdAt: new Date(),
});

Cache Invalidation, Cleanup, and Manual Clearing#

To prevent the cache from growing indefinitely, Read Frog periodically deletes cache entries older than 7 days. This cleanup process is triggered by a browser alarm that runs every 24 hours and also executes immediately when the background script starts. The cleanup function deletes all entries in translationCache where createdAt is older than the cutoff date.
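The 7-day rule can be sketched as a pure function over cache entries. In Dexie this would presumably be a range delete on a `createdAt` index rather than an in-memory filter; the function name here is hypothetical.

```typescript
// Sketch of the 7-day pruning rule as a pure function.
// In Dexie this would likely be a range delete on the createdAt index.
interface CacheEntry { key: string; translation: string; createdAt: Date }

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

function pruneExpired(entries: CacheEntry[], now: Date): CacheEntry[] {
  const cutoff = now.getTime() - SEVEN_DAYS_MS;
  return entries.filter(e => e.createdAt.getTime() >= cutoff);
}
```

Running this on every alarm tick (and once on background-script startup) bounds the table to roughly a week of translation history.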

Manual Cache Clearing#

Users can manually clear all translation cache entries via the Options → Translation page in the extension. This feature provides a "Clear Cache" button, which, after confirmation, deletes all entries in the translationCache store. This action is irreversible and immediately frees up storage used by cached translations. The cache clearing is handled by the clearAllCache protocol message and the corresponding background handler.

Benefits#

This caching mechanism improves performance by reducing the number of external translation requests, ensures translations persist across browser sessions, and automatically manages storage by cleaning up old entries. Including the resolved prompt in the cache key for LLM providers ensures that changes to translation prompts are respected and do not result in stale or mismatched translations. Users also have direct control to clear all cached translations if needed.


Summary of Key Points:

  • The cache key includes the resolved prompt for LLM providers; changing the prompt invalidates the cache for those translations.
  • For article translation with LLM providers and AI Content Aware enabled, the cache key includes article context (title and first 1000 characters of content).
  • For subtitle translation with AI Content Aware enabled, the cache key includes a summary status flag and summary content (when available).
  • Subtitle translations are cached separately based on whether an AI Smart Context summary was available at translation time.
  • Only custom prompts are stored in user config; the default prompt is a code constant.
  • Cache entries older than 7 days are automatically deleted.
  • Users can manually clear the cache from the Options page.
  • The cache improves performance and reduces unnecessary API calls.