Patch Notes

A chronicle of Momory's evolution and continuous improvements for a better streaming experience.

3.1.0

2026-02-05

Next-Gen Engine: Media & VC Mode Overhaul

Overhauled Media and VC modes to use the latest Web Speech API standard, passing tab audio directly to the on-device SODA engine for superior accuracy, zero cost, and low latency.
Applied the robust self-healing architecture from Mic Mode to all modes, ensuring continuous transcription even with `no-speech` errors or other interruptions.
Fixed a critical bug where the translation API key could become stale, causing translations to fail across all modes. Also fixed a bug where transcriptions were not displayed correctly in VC mode.
Fully updated all documentation (Terms, Guide, Tech, FAQ) to reflect the new architecture, provide clearer instructions, and add important disclaimers.
Removed the deprecated high-precision (Whisper) AI mode from the UI and deleted all associated files, streamlining the codebase and user experience.
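
The self-healing behavior mentioned in this release can be pictured roughly as follows. This is a minimal sketch, not Momory's actual code: `classifyRecognitionError` and `nextRestartDelayMs` are hypothetical helpers, and the split between recoverable and fatal errors is an assumption. Only the error codes themselves come from the Web Speech API (`SpeechRecognitionErrorEvent.error`).

```typescript
// Error codes defined by the Web Speech API for SpeechRecognitionErrorEvent.error.
type RecognitionErrorCode =
  | "no-speech"
  | "aborted"
  | "audio-capture"
  | "network"
  | "not-allowed"
  | "service-not-allowed"
  | "bad-grammar"
  | "language-not-supported";

type Recovery = "restart" | "stop";

// Assumption: permission and configuration errors cannot be fixed by retrying,
// while silence, aborts, capture hiccups, and network blips are transient.
function classifyRecognitionError(code: RecognitionErrorCode): Recovery {
  switch (code) {
    case "not-allowed":
    case "service-not-allowed":
    case "bad-grammar":
    case "language-not-supported":
      return "stop";
    default:
      return "restart";
  }
}

// Hypothetical backoff: doubles the delay per consecutive failure, capped at maxMs.
function nextRestartDelayMs(consecutiveFailures: number, baseMs = 250, maxMs = 5000): number {
  const n = Math.max(0, consecutiveFailures);
  return Math.min(maxMs, baseMs * Math.pow(2, n));
}
```

In a browser, the `error` handler of a `SpeechRecognition` instance could pass `event.error` to `classifyRecognitionError`, and the `end` handler could call `start()` again after the computed delay.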

3.0.0

2026-02-03

Multi-Mode Support: Discord & Media Audio Translation

Introduced three distinct input modes: 'Microphone' (your voice), 'Media' (Discord, browser tabs, etc.), and 'Collab/VC' (your voice and Discord voices simultaneously).
Implemented 'Direct Media SODA' architecture, leveraging advanced browser features for high-precision, low-latency transcription of media audio.
Optimized conversation history retention to give the AI richer context: the limit is now up to 60 items for the Paid Tier and 20 items for the Free Tier.
Fully updated the Setup Guide, FAQ, and Tech Insights to cover the new input modes and technical background.
Added explicit disclaimers to Terms of Service regarding privacy and copyright when using Media and Collab modes.
Unified transcription handling logic across all modes to improve system stability and response consistency.
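
The history limits listed in this release (60 items for the Paid Tier, 20 for the Free Tier) amount to a bounded buffer that drops the oldest entries. A minimal sketch, assuming history is a plain array of utterances; `pushHistory` and the `Tier` type are hypothetical names, not Momory's API.

```typescript
type Tier = "free" | "paid";

// Limits taken from the release note: 20 items (Free Tier), 60 items (Paid Tier).
const HISTORY_LIMIT: Record<Tier, number> = { free: 20, paid: 60 };

// Hypothetical helper: appends an utterance and keeps only the newest entries.
// Returns a new array and leaves the input untouched.
function pushHistory(history: readonly string[], utterance: string, tier: Tier): string[] {
  const limit = HISTORY_LIMIT[tier];
  const next = history.concat(utterance);
  return next.length > limit ? next.slice(next.length - limit) : next;
}
```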

2.3.0

2026-01-30

Media Translation 2.0 & Advanced VAD

Completely overhauled 'Media Translation Mode' (Tab Audio). It now utilizes an in-browser Whisper AI pipeline, significantly improving accuracy and reducing costs compared to the previous direct-stream method.
Implemented a new 'Spectral VAD' (Voice Activity Detection) engine. It analyzes the spectral flatness measure (SFM) of the audio to distinguish speech from BGM and noise, enabling robust transcription even during gaming or movie playback.
Added automatic VAD tuning for Japanese speech, optimizing sensitivity to capture the soft-spoken phrases and trailing whispers typical of Japanese speech.
Media Mode now supports OBS WebSocket integration, allowing translated subtitles from tab audio to be sent directly to OBS text sources.
Unified the Dashboard Preview UI, enabling real-time verification of both transcription and translation in Media Mode, just like in Microphone Mode.
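
For readers unfamiliar with spectral flatness, it is conventionally defined as the geometric mean of the power spectrum divided by its arithmetic mean: values near 1 indicate noise-like audio, values near 0 indicate energy concentrated in a few frequencies. The sketch below shows only that textbook measure. How the Spectral VAD actually applies it is not described in these notes, so `isNoiseLike` and its threshold are purely illustrative assumptions.

```typescript
// Spectral flatness = geometric mean / arithmetic mean of the power spectrum.
// `power` holds non-negative power values, one per frequency bin.
function spectralFlatness(power: readonly number[]): number {
  if (power.length === 0) return 0;
  const eps = 1e-12; // keeps Math.log finite for empty bins
  let logSum = 0;
  let sum = 0;
  for (let i = 0; i < power.length; i++) {
    const v = Math.max(power[i], 0) + eps;
    logSum += Math.log(v);
    sum += v;
  }
  const geometricMean = Math.exp(logSum / power.length);
  const arithmeticMean = sum / power.length;
  return geometricMean / arithmeticMean;
}

// Hypothetical decision rule with an assumed threshold; not Momory's tuning.
function isNoiseLike(power: readonly number[], threshold = 0.5): boolean {
  return spectralFlatness(power) > threshold;
}
```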

2.2.0

2026-01-22

SEO & Rich Results Optimization

Implemented comprehensive structured data (JSON-LD) for WebSite, Organization, and SoftwareApplication to enable Google Rich Results.
Added FAQPage schema to the FAQ section, allowing questions and answers to appear directly in search results.
Integrated BreadcrumbList across all pages to support hierarchical URL display in search engines.
Automated sitemap.xml and robots.txt generation to improve crawl efficiency and index coverage.
Unified brand identity on Japanese pages by prioritizing the 'ももりー' label for better local search visibility.
Refined the FAQ page layout to perfectly align with other documentation pages for a consistent UI experience.
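
As an illustration of the FAQPage structured data mentioned in this release, the sketch below builds a schema.org `FAQPage` object and serializes it for a `<script type="application/ld+json">` tag. The `FaqItem` shape and the `buildFaqPageJsonLd` name are assumptions made for this example; only the schema.org types (`FAQPage`, `Question`, `Answer`) are standard.

```typescript
// Hypothetical input shape for one FAQ entry.
interface FaqItem {
  question: string;
  answer: string;
}

// Builds FAQPage JSON-LD following the schema.org vocabulary.
function buildFaqPageJsonLd(items: readonly FaqItem[]): string {
  const data = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: items.map((item) => ({
      "@type": "Question",
      name: item.question,
      acceptedAnswer: { "@type": "Answer", text: item.answer },
    })),
  };
  // Escape "<" so the JSON cannot terminate the surrounding <script> element.
  return JSON.stringify(data).replace(/</g, "\\u003c");
}
```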

2.1.0

2026-01-22

Enhanced Documentation & Navigation

Detailed our technical philosophy and architecture on the landing page and FAQ to provide deeper insights into Momory's design.
Added comprehensive explanations of the 'Adaptive Debounce Algorithm' to Tech Insights and the FAQ, clarifying how it balances latency and precision.
Improved cross-page anchor support to ensure smooth navigation to specific sections across different languages.
Refined internal linking structure for a more seamless and consistent multi-language browsing experience.

2.0.0

2026-01-19

OBS WebSocket Integration & Enhanced Streaming

Introduced OBS WebSocket integration, allowing direct subtitle transfer to a text source in OBS. This eliminates the need for window capture, improving streaming performance and stability.
Added a dedicated control panel on the dashboard to configure and manage your connection to OBS. Simply enter your connection details to get started.
Provided a comprehensive, step-by-step setup guide for both the new WebSocket method (recommended) and the traditional overlay window method.
Ensured multiple translation languages are correctly formatted with line breaks when sent to OBS.
Improved internal connection logic for a more robust and reliable experience.
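
To show what sending subtitles to an OBS text source can look like on the wire, here is a minimal sketch that builds an obs-websocket 5.x `SetInputSettings` request (opcode 6) with one translation per line. It assumes the connection has already completed the protocol's Hello/Identify handshake. `buildSubtitleRequest` is a hypothetical helper, and Momory's own message handling may differ.

```typescript
// Shape of an obs-websocket 5.x request message (op 6 = Request).
interface SetInputSettingsMessage {
  op: 6;
  d: {
    requestType: "SetInputSettings";
    requestId: string;
    requestData: {
      inputName: string;
      inputSettings: { text: string };
    };
  };
}

// Hypothetical helper: joins the translations with line breaks and wraps them
// in a SetInputSettings request targeting the named text source.
function buildSubtitleRequest(
  inputName: string,
  translations: readonly string[],
  requestId: string
): SetInputSettingsMessage {
  return {
    op: 6,
    d: {
      requestType: "SetInputSettings",
      requestId: requestId,
      requestData: {
        inputName: inputName,
        inputSettings: { text: translations.join("\n") },
      },
    },
  };
}
```

The resulting object would be passed through `JSON.stringify` and sent over the open WebSocket.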

1.4.1

2026-01-16

Self-Healing Microphone & Stability Fixes

Implemented a 'Self-Healing' architecture for the microphone. It now automatically restarts in the background if interrupted by system resource pressure, ensuring continuous capture.
Removed an unnecessary page reload workaround for mobile Safari, improving the user experience.
Updated FAQ with the latest OBS setup instructions and more specific future development goals.
Fixed an issue preventing the API Usage Monitor from updating in real-time.
Corrected translation keys for Open Graph images to ensure titles and descriptions display correctly on social media.

1.4.0

2026-01-16

FAQ Section & Enhanced Speech Stability

Added a comprehensive, multilingual FAQ page to provide self-service support and troubleshooting.
Enhanced the FAQ page with smooth-scroll navigation and content categorization for a better user experience.
Further improved the stability of the 'Standard' speech recognition engine to prevent unexpected stalls.
Centralized Open Graph (OG) image generation to ensure consistent and beautiful social sharing previews.
Resolved several build errors related to dependencies and CSS configuration to improve platform reliability.

1.3.0

2026-01-15

High-Precision AI Mode & Enhanced Responsiveness

Introduced 'High-Precision AI Mode' using a WebGPU-powered Whisper model for exceptionally accurate transcription.
Re-engineered the AI transcription engine, removing the 1-second buffer previously used for stabilization, resulting in a more immediate transcription experience.
Achieved robust voice activity detection (VAD) by leveraging the browser's native Web Speech API, significantly improving stability.
Implemented 'Power Saving Mode': dashboard UI updates are now paused when the tab is in the background to reduce CPU load.
Optimized dashboard performance by throttling microphone volume meter updates, improving responsiveness.

1.2.0

2026-01-14

Stability & UX Enhancements

Improved speech recognition robustness with a 15-second heartbeat monitor to recover from silent freezes.
Implemented an automatic page reload trigger for mobile Safari when returning from background to address layout issues.
Refined the feedback system: notifications are now unified to orange in the overlay and limited to positive interactions for better stream flow.
Fixed a visual bug where the microphone volume bar showed 100% on zero input.
Strengthened API security with request validation, length limits, and data sanitization.
Improved microphone capture stability with more reliable hardware reset logic at startup.
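
The 15-second heartbeat monitor in this release can be thought of as a simple watchdog. The sketch below is an assumption-laden illustration, not the shipped code: `createHeartbeat` is a hypothetical helper, and what counts as a "beat" (for example any recognition result or audio event) is left to the caller.

```typescript
interface Heartbeat {
  beat(): void;
  isStalled(): boolean;
}

// Hypothetical watchdog: reports a stall when no beat has arrived within
// timeoutMs (15 seconds by default, matching the release note).
// `now` is injectable so the logic can be tested without real timers.
function createHeartbeat(timeoutMs = 15000, now: () => number = Date.now): Heartbeat {
  let lastBeat = now();
  return {
    beat(): void {
      lastBeat = now();
    },
    isStalled(): boolean {
      return now() - lastBeat >= timeoutMs;
    },
  };
}
```

A periodic timer could poll `isStalled()` and restart recognition when it returns `true`.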

1.1.0

2026-01-13

Precision & Real-time Monitoring

Added a real-time API usage monitor (RPM, TPM, and RPD: requests per minute, tokens per minute, and requests per day) and cost estimation to the dashboard for transparent tracking.
Expanded free-tier conversation history limit to 20 lines, providing more context for translations.
Optimized prompt data limits to better utilize available TPM (Tokens Per Minute) for higher translation quality.
Refined AI repair instructions to better handle repetitions, fillers (umm, uh), and STT errors.
Fixed double-encoding issues and CSS specificity conflicts in the overlay settings.
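
Counters such as RPM and TPM are typically computed over a sliding time window. The following is a minimal sketch of that idea, assuming each API call is recorded with its token count; `createUsageWindow` is a hypothetical helper and does not reflect how Momory's monitor is actually implemented.

```typescript
interface UsageWindow {
  record(tokens: number): void;
  requests(): number;
  tokens(): number;
}

// Hypothetical sliding-window counter. With windowMs = 60000, requests()
// approximates RPM and tokens() approximates TPM.
// `now` is injectable so the logic can be tested without real timers.
function createUsageWindow(windowMs: number, now: () => number = Date.now): UsageWindow {
  let events: { at: number; tokens: number }[] = [];

  // Drops events that are older than the window.
  function prune(): void {
    const cutoff = now() - windowMs;
    events = events.filter((e) => e.at > cutoff);
  }

  return {
    record(tokens: number): void {
      events.push({ at: now(), tokens: tokens });
    },
    requests(): number {
      prune();
      return events.length;
    },
    tokens(): number {
      prune();
      return events.reduce((sum, e) => sum + e.tokens, 0);
    },
  };
}
```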

1.0.8

2026-01-13

Branding & Transparency Update

Updated branding including a new logo, mobile-optimized icons, and design refinements across the platform.
Implemented dynamic OGP image generation so that every page has an appropriate social sharing preview.
Improved accessibility with better contrast ratios and reduced flickering animations.
Refined privacy documentation and architecture descriptions to clarify the 'Zero Server Storage' policy.
Corrected layout issues on the Author page and updated favicons for consistent display.

1.0.5

2026-01-13

Privacy Protection: Local Relay

Implemented 'Local Relay' mode using Service Workers: translation data is relayed to OBS entirely within the browser.
Enhanced privacy by ensuring sensitive data remains in the user's local environment.
Introduced adaptive token management to balance quality and speed based on API availability.
Applied security headers including HSTS, X-Frame-Options, and COOP for platform safety.
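
For reference, the three headers named in this release could be set along these lines. Only the header names come from the release note; the values below are common choices picked for illustration, and `applySecurityHeaders` is a hypothetical helper rather than part of any framework.

```typescript
// Assumed values for illustration; the release note names only the headers.
const SECURITY_HEADERS: Readonly<Record<string, string>> = {
  // HSTS: tell browsers to use HTTPS only for one year.
  "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
  // Forbid embedding the site in frames on other pages.
  "X-Frame-Options": "DENY",
  // COOP: isolate the browsing context from cross-origin openers.
  "Cross-Origin-Opener-Policy": "same-origin",
};

// Hypothetical helper: merges the security headers into an existing header map.
function applySecurityHeaders(existing: Record<string, string>): Record<string, string> {
  return { ...existing, ...SECURITY_HEADERS };
}
```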

1.0.0

2026-01-12

Initial Release of Momory

Official launch of Momory: A privacy-focused, low-latency AI subtitle service for streamers.
Modularized dashboard components for better maintainability and reliability.
Enabled vertical display of multiple target languages in the overlay.
Implemented initial interactive learning: the AI improves its style based on user feedback.