Bundler-Aware Message Sidecars
This note captures a future-facing idea that likely belongs primarily in Palamedes or another host adapter, but is directly informed by Ferrocat's catalog compilation work.
The central problem is not "how do we parse more PO files?" It is:
- how do we keep client bundle size under control for large applications
- without forcing manual message sharding onto application authors
- while still using stable compiled message IDs
Problem Shape
The naive client-side approach is:
- build one ESM catalog module per locale
- dynamically import that locale module
- return a large in-memory message map to the client runtime
That works, but scales poorly as applications grow:
- the client often loads many more messages than the current UI actually needs
- large locale modules become expensive to ship, parse, and keep resident
- locale variants such as de-CH naturally want overlay semantics rather than another full copy of the world
At the same time, manual message code-splitting is not attractive:
- authors should not need to decide which file or route "owns" a message
- duplicate source strings across shards make conflict management harder
- translators and extraction workflows should not be shaped by bundler internals
Observed Constraints
The most important product constraints for this direction are:
- runtime language switching in the same loaded client is intentionally out of scope
- a fast reload is preferable to trying to live-switch all translated UI state
- top-level translation macro expansion is already considered an anti-pattern because locale selection and i18n bootstrapping can be async
That last point is especially important. If top-level translation lookups are already forbidden, message payloads do not need to exist before JavaScript chunk evaluation. They only need to exist before later render or handler code calls into the translation runtime.
This makes an async sidecar-loading model much more feasible.
Core Idea
Keep normal Vite/Rollup chunking as the primary source of truth, then attach message payloads to those chunks as locale-specific sidecars.
In rough terms:
- translation macros continue to produce stable compiled IDs in application code
- build tooling records which message IDs each module references
- after final bundler chunking is known, the IDs are aggregated from module -> ids into chunk -> ids
- for each (chunk, locale) pair, emit a compact message sidecar
- emit a manifest that maps JS chunks to their locale-specific sidecars
The result is not "split messages by hand." It is:
- let the bundler decide code chunk boundaries
- derive message payload boundaries from those same chunk boundaries
This keeps message distribution aligned with the actual loading behavior of the application.
Why This Is Interesting
This approach potentially gives the best of both worlds:
- application authors keep normal bundler-driven code splitting
- message payloads can be loaded only for chunks that are actually used
- locale-specific payloads stay much smaller than one giant per-locale catalog
- no human-maintained "message shard files" are required
For locale overlays such as de-CH, the same model can later support:
- a small de-CH sidecar
- fallback to a larger de sidecar
- optional fallback to source locale data only when needed
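The overlay lookup described above can be sketched as follows. This is a minimal illustration, assuming each sidecar is a flat id -> string map and that the host supplies the fallback chain. The names `MessageMap` and `resolveWithFallback`, and the sample IDs and strings, are hypothetical and not part of Ferrocat or Palamedes.

```typescript
// Hypothetical shape: one flat id -> message map per locale layer.
type MessageMap = Record<string, string>;

// Walk the layers from most specific to least specific and return the
// first hit. Returns undefined when no layer knows the id.
function resolveWithFallback(
  id: string,
  layers: MessageMap[],
): string | undefined {
  for (const layer of layers) {
    if (Object.prototype.hasOwnProperty.call(layer, id)) {
      return layer[id];
    }
  }
  return undefined;
}

// Illustrative data only: de-CH overrides a single message from de.
const deCH: MessageMap = { a1b2: "Grüezi" };
const de: MessageMap = { a1b2: "Hallo", c3d4: "Abbrechen" };

const greeting = resolveWithFallback("a1b2", [deCH, de]); // found in de-CH
const cancelLabel = resolveWithFallback("c3d4", [deCH, de]); // falls back to de
```

The de-CH layer only carries what differs from de, so the layered lookup resolves everything else from the larger sidecar.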
That is conceptually similar to the new compile_catalog_artifact semantics,
but the delivery unit changes from "entire requested-locale artifact" to
"chunk-addressable locale sidecar."
Suggested Architecture
Build-Time Collection
Do not parse final JavaScript output to discover message usage if it can be avoided.
Prefer a two-stage model:
- the macro/plugin layer records message IDs per source module
- the bundler integration aggregates those IDs after final chunking is known
This is more robust than scraping emitted code and stays compatible with tree-shaking and chunk renaming.
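The two-stage model could look roughly like this. It is a sketch under assumptions: `ChunkLike` models only the `modules` record that Rollup exposes on output chunks, and `ModuleIds` and `aggregateChunkIds` are hypothetical names for the recorded stage-one data and the stage-two fold.

```typescript
// Minimal structural view of a bundler output chunk. Only the file name
// and the module ids contained in the chunk are modeled here.
interface ChunkLike {
  fileName: string;
  modules: Record<string, unknown>;
}

// Stage 1 output (hypothetical): message IDs recorded per source module
// by the macro/plugin layer.
type ModuleIds = Record<string, string[]>;

// Stage 2: fold module -> ids into chunk -> ids once chunking is final.
function aggregateChunkIds(
  chunks: ChunkLike[],
  moduleIds: ModuleIds,
): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const chunk of chunks) {
    const ids = new Set<string>();
    for (const moduleId of Object.keys(chunk.modules)) {
      for (const id of moduleIds[moduleId] ?? []) {
        ids.add(id);
      }
    }
    // Sort so the output is deterministic across builds.
    result[chunk.fileName] = Array.from(ids).sort();
  }
  return result;
}

// Illustrative input only; module paths and IDs are made up.
const chunkIds = aggregateChunkIds(
  [
    { fileName: "app-ABC123.js", modules: { "src/main.ts": {}, "src/nav.ts": {} } },
    { fileName: "checkout-XYZ999.js", modules: { "src/checkout.ts": {} } },
  ],
  {
    "src/main.ts": ["m2", "m1"],
    "src/nav.ts": ["m1"],
    "src/checkout.ts": ["m3"],
  },
);
```

Because IDs are recorded per module, code removed from inside a kept module still contributes its IDs. The result can therefore over-approximate what a chunk really needs.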
Emitted Artifacts
The likely outputs are:
- normal JS chunks from Vite/Rollup
- one locale-specific message sidecar per chunk
- one manifest describing which sidecar belongs to which chunk and locale
Conceptually:
    assets/
      app-ABC123.js
      checkout-XYZ999.js
      i18n-manifest.json
      i18n/de/app-ABC123.messages.json
      i18n/de/checkout-XYZ999.messages.json
      i18n/de-CH/app-ABC123.messages.json

The sidecar format does not need to be ESM. It could be JSON or another compact lookup-oriented artifact.
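One possible manifest shape for the layout above is sketched here. The structure (chunk file name -> locale -> sidecar path) and the helper names `sidecarPath` and `buildManifest` are assumptions for illustration; the note does not fix a manifest format.

```typescript
// Hypothetical manifest shape: chunk file name -> locale -> sidecar path.
type SidecarManifest = Record<string, Record<string, string>>;

// Derive a sidecar path from a chunk file name, following the layout above.
function sidecarPath(locale: string, chunkFileName: string): string {
  const base = chunkFileName.replace(/\.js$/, "");
  return `i18n/${locale}/${base}.messages.json`;
}

// Build manifest entries only for (chunk, locale) pairs that have a payload.
function buildManifest(
  pairs: Array<{ chunk: string; locale: string }>,
): SidecarManifest {
  const manifest: SidecarManifest = {};
  for (const { chunk, locale } of pairs) {
    const entry = manifest[chunk] ?? {};
    entry[locale] = sidecarPath(locale, chunk);
    manifest[chunk] = entry;
  }
  return manifest;
}

const sidecarManifest = buildManifest([
  { chunk: "app-ABC123.js", locale: "de" },
  { chunk: "checkout-XYZ999.js", locale: "de" },
  { chunk: "app-ABC123.js", locale: "de-CH" },
]);
```

In this sketch checkout-XYZ999.js has no de-CH entry, so a runtime reading the manifest would fall back to the de sidecar for that chunk.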
Runtime Flow
At runtime:
- the application decides the locale once during boot
- when a chunk is about to be used, its locale sidecar is loaded
- sidecar messages are registered in the in-memory translation store
- UI code that runs later can resolve t(id) as normal
Because top-level translation usage is intentionally disallowed, the runtime can remain async here without trying to beat chunk evaluation itself.
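A minimal sketch of that runtime flow, assuming sidecars arrive as flat id -> string maps. `MessageStore`, `ensureChunk`, and the injected `fetchSidecar` callback are hypothetical names, and the transport is deliberately left open.

```typescript
type MessageMap = Record<string, string>;

// In-memory translation store. Sidecars register into it before later
// render or handler code calls t(id).
class MessageStore {
  private messages: MessageMap = {};
  private pending: Record<string, Promise<void>> = {};

  // Load and register a chunk's sidecar at most once. `fetchSidecar` is
  // injected so the sketch does not assume a particular transport.
  ensureChunk(
    chunk: string,
    fetchSidecar: (chunk: string) => Promise<MessageMap>,
  ): Promise<void> {
    const existing = this.pending[chunk];
    if (existing) return existing;
    const loading = fetchSidecar(chunk).then((sidecar) => {
      this.register(sidecar);
    });
    this.pending[chunk] = loading;
    return loading;
  }

  // Merge a sidecar's messages into the store.
  register(sidecar: MessageMap): void {
    this.messages = { ...this.messages, ...sidecar };
  }

  // Falls back to the id itself so a missing message is visible, not fatal.
  t(id: string): string {
    return this.messages[id] ?? id;
  }
}

const store = new MessageStore();
store.register({ m1: "Hallo" });
const hello = store.t("m1");
```

A call site would typically await `ensureChunk` alongside the dynamic import of the chunk, so that messages are registered before any code in the chunk renders. This sketch caches a rejected promise, so a real implementation would also need a retry policy for failed sidecar loads.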
Dev Server Flow
The same idea becomes even more interesting in development:
- JS chunk loads can trigger a corresponding sidecar lookup
- the dev server can keep Rust-side catalogs resident and watched
- changed translations can be pushed through hot reload without rebuilding a giant locale module
The development protocol might eventually become batch-oriented rather than key-oriented:
- load all message IDs needed for chunk X
- not one HTTP request per translation key
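A batch-oriented handler might look like the sketch below. `DevState`, `BatchResponse`, and `handleBatch` are hypothetical. The resident catalogs are modeled as plain in-memory maps standing in for whatever the Rust side actually keeps loaded.

```typescript
type MessageMap = Record<string, string>;

// Hypothetical dev-server state: resident catalogs per locale plus the
// chunk -> ids mapping produced by the build integration.
interface DevState {
  catalogs: Record<string, MessageMap>;
  chunkIds: Record<string, string[]>;
}

interface BatchResponse {
  messages: MessageMap;
  missing: string[];
}

// Answer one request per (chunk, locale) with every message the chunk needs.
function handleBatch(
  state: DevState,
  chunk: string,
  locale: string,
): BatchResponse {
  const catalog = state.catalogs[locale] ?? {};
  const messages: MessageMap = {};
  const missing: string[] = [];
  for (const id of state.chunkIds[chunk] ?? []) {
    const text = catalog[id];
    if (text === undefined) {
      missing.push(id); // report instead of failing the whole batch
    } else {
      messages[id] = text;
    }
  }
  return { messages, missing };
}

// Illustrative data only.
const devState: DevState = {
  catalogs: { de: { m1: "Hallo", m3: "Zur Kasse" } },
  chunkIds: { "checkout-XYZ999.js": ["m3", "m4"] },
};
const batch = handleBatch(devState, "checkout-XYZ999.js", "de");
```

Returning the missing IDs alongside the messages lets the dev overlay surface untranslated keys without an extra round trip.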
Where Ferrocat Fits
Ferrocat should probably stay focused on catalog semantics and artifact building, not own the bundler integration itself.
Useful Ferrocat responsibilities for this direction:
- stable compiled message IDs
- locale fallback semantics
- host-neutral catalog artifacts
- compact, reproducible message payload generation
Likely Palamedes or host-adapter responsibilities:
- mapping modules to message IDs
- mapping final chunks to those module-level IDs
- emitting sidecars and manifests in bundler output
- dev-server protocol and runtime chunk/sidecar loading
Current Ferrocat Status
The shape described in this note is no longer purely hypothetical. Ferrocat already exposes the core build-time primitives a host adapter would need to prototype chunk-addressable locale sidecars.
Available today:
- compile_catalog_artifact for full requested-locale host-neutral runtime artifacts with fallback resolution, missing reports, and final ICU strings
- CompiledCatalogIdIndex for deterministic compiled_id -> source_key indexing across one or more normalized catalogs
- compile_catalog_artifact_selected for compiling only a selected subset of compiled runtime IDs into a locale artifact slice
- compile_catalog_artifact_selected_report for the same selected compile flow with structured reporting of unknown or unavailable compiled IDs
- CompiledCatalogIdIndex::describe_compiled_ids for lightweight metadata about requested compiled IDs, including available locales and singular vs plural shape
- CompiledCatalogIdIndex::as_btreemap and into_btreemap for exporting the ordered compiled-ID mapping into host-side caches or manifests
This means the chunk-based ideal is already reachable at build time:
- a host adapter collects compiled_ids per module or final chunk
- Ferrocat builds or reuses a CompiledCatalogIdIndex
- the host adapter calls compile_catalog_artifact_selected per (chunk, locale) pair
- the result is emitted as a locale-specific sidecar payload
What remains outside Ferrocat is mostly orchestration rather than missing catalog semantics:
- module-to-ID collection in macros or build plugins
- aggregation from modules into final bundler chunks
- manifest generation and sidecar emission format
- runtime registration/loading behavior
- dev-server hot reload protocol
One later optimization track still intentionally remains out of scope for now: borrowed or lazy subset compilation that avoids fully materializing normalized catalogs before compiling a selected ID subset. That may become interesting for very large catalogs, but it is not required for the first chunk-sidecar prototype.
Non-Goals
This note is not proposing:
- manual per-route message files
- runtime language switching within the same loaded client
- full locale-permutated builds as the only strategy
- putting Rust directly in the browser
Locale-permutated builds are still interesting for some deployment models, but the sidecar concept is attractive precisely because it preserves normal client chunking while reducing message payload size.
Open Questions
- what should the sidecar wire format be
- how aggressively should shared-chunk message sets be deduplicated
- should sidecars contain final locale-resolved strings or layered overlay data
- how should locale overlays such as de-CH -> de -> en be represented
- how should the runtime coordinate "chunk loaded" vs "messages registered"
- how much over-approximation is acceptable when tree-shaking changes module contents after the macro/plugin layer recorded message IDs
Current Recommendation
Treat this as a serious follow-up direction for Palamedes:
- keep Ferrocat responsible for semantic compile primitives
- explore chunk-addressable locale sidecars in the host integration layer
- prefer bundler-aware message distribution over manual message sharding
The idea appears especially promising for large applications where the current "one large locale catalog module" approach becomes increasingly wasteful.