Momory

Patch Notes

A chronicle of Momory's evolution and continuous improvements for a better streaming experience.

3.1.0
2026-02-05

Next-Gen Engine: Media & VC Mode Overhaul

  • Overhauled Media and VC modes to use the latest Web Speech API standard, passing tab audio directly to the on-device SODA engine for superior accuracy, zero cost, and low latency.
  • Applied the robust self-healing architecture from Mic Mode to all modes, ensuring continuous transcription even with `no-speech` errors or other interruptions.
  • Fixed a critical bug where the translation API key could become stale, causing translations to fail across all modes. Also fixed a bug where transcriptions were not displayed correctly in VC mode.
  • Fully updated all documentation (Terms, Guide, Tech, FAQ) to reflect the new architecture, provide clearer instructions, and add important disclaimers.
  • Removed the deprecated high-precision (Whisper) AI mode from the UI and deleted all associated files, streamlining the codebase and user experience.
3.0.0
2026-02-03

Multi-Mode Support: Discord & Media Audio Translation

  • Introduced three distinct input modes: 'Microphone' (your voice), 'Media' (Discord, browser tabs, etc.), and 'Collab/VC' (both yours and Discord voices simultaneously).
  • Implemented 'Direct Media SODA' architecture, leveraging advanced browser features for high-precision, low-latency transcription of media audio.
  • Optimized conversation history retention to provide richer context for AI: increased up to 60 items for Paid Tier and 20 items for Free Tier.
  • Fully updated the Setup Guide, FAQ, and Tech Insights to cover the new input modes and technical background.
  • Added explicit disclaimers to Terms of Service regarding privacy and copyright when using Media and Collab modes.
  • Unified transcription handling logic across all modes to improve system stability and response consistency.
2.3.0
2026-01-30

Media Translation 2.0 & Advanced VAD

  • Completely overhauled 'Media Translation Mode' (Tab Audio). It now utilizes an in-browser Whisper AI pipeline, significantly improving accuracy and reducing costs compared to the previous direct-stream method.
  • Implemented a new 'Spectral VAD' (Voice Activity Detection) engine. It analyzes spectral flatness (SFM) to intelligently distinguish speech from BGM and noise, enabling robust transcription even during gaming or movie playback.
  • Added automatic VAD tuning for Japanese speech, optimizing sensitivity to capture soft-spoken phrases and trailing whispers typical in Japanese.
  • Media Mode now supports OBS WebSocket integration, allowing translated subtitles from tab audio to be sent directly to OBS text sources.
  • Unified the Dashboard Preview UI, enabling real-time verification of both transcription and translation in Media Mode, just like in Microphone Mode.
2.2.0
2026-01-22

SEO & Rich Results Optimization

  • Implemented comprehensive structured data (JSON-LD) for WebSite, Organization, and SoftwareApplication to enable Google Rich Results.
  • Added FAQPage schema to the FAQ section, allowing questions and answers to appear directly in search results.
  • Integrated BreadcrumbList across all pages to support hierarchical URL display in search engines.
  • Automated sitemap.xml and robots.txt generation to improve crawl efficiency and index coverage.
  • Unified brand identity on Japanese pages by prioritizing the 'ももりー' label for better local search visibility.
  • Refined the FAQ page layout to perfectly align with other documentation pages for a consistent UI experience.
2.1.0
2026-01-22

Enhanced Documentation & Navigation

  • Detailed our technical philosophy and architecture on the landing page and FAQ to provide deeper insights into Momory's design.
  • Added comprehensive explanations of the 'Adaptive Debounce Algorithm' in Tech Insights and FAQ, clarifying how we achieve balanced latency and precision.
  • Improved cross-page anchor support to ensure smooth navigation to specific sections across different languages.
  • Refined internal linking structure for a more seamless and consistent multi-language browsing experience.
2.0.0
2026-01-19

OBS WebSocket Integration & Enhanced Streaming

  • Introduced OBS WebSocket integration, allowing direct subtitle transfer to a text source in OBS. This eliminates the need for window capture, improving streaming performance and stability.
  • Added a dedicated control panel on the dashboard to configure and manage your connection to OBS. Simply enter your connection details to get started.
  • Provided a comprehensive, step-by-step setup guide for both the new WebSocket method (recommended) and the traditional overlay window method.
  • Ensured multiple translation languages are correctly formatted with line breaks when sent to OBS.
  • Improved internal connection logic for a more robust and reliable experience.
1.4.1
2026-01-16

Self-Healing Microphone & Stability Fixes

  • Implemented a 'Self-Healing' architecture for the microphone. It now automatically restarts in the background if interrupted by system resource pressure, ensuring continuous capture.
  • Removed an unnecessary page reload workaround for mobile Safari, improving the user experience.
  • Updated FAQ with the latest OBS setup instructions and more specific future development goals.
  • Fixed an issue preventing the API Usage Monitor from updating in real-time.
  • Corrected translation keys for Open Graph images to ensure titles and descriptions display correctly on social media.
1.4.0
2026-01-16

FAQ Section & Enhanced Speech Stability

  • Added a comprehensive, multilingual FAQ page to provide self-service support and troubleshooting.
  • Enhanced the FAQ page with smooth-scroll navigation and content categorization for a better user experience.
  • Further improved the stability of the 'Standard' speech recognition engine to prevent unexpected stalls.
  • Centralized Open Graph (OG) image generation to ensure consistent and beautiful social sharing previews.
  • Resolved several build errors related to dependencies and CSS configuration to improve platform reliability.
1.3.0
2026-01-15

High-Precision AI Mode & Enhanced Responsiveness

  • Introduced 'High-Precision AI Mode' using a WebGPU-powered Whisper model for exceptionally accurate transcription.
  • Re-engineered the AI transcription engine, removing the 1-second buffer previously used for stabilization, resulting in a more immediate transcription experience.
  • Achieved robust voice activity detection (VAD) by leveraging the browser's native Web Speech API, significantly improving stability.
  • Implemented 'Power Saving Mode': dashboard UI updates are now paused when the tab is in the background to reduce CPU load.
  • Optimized dashboard performance by throttling microphone volume meter updates, improving responsiveness.
1.2.0
2026-01-14

Stability & UX Enhancements

  • Improved speech recognition robustness with a 15-second heartbeat monitor to recover from silent freezes.
  • Implemented an automatic page reload trigger for mobile Safari when returning from background to address layout issues.
  • Refined feedback system: notifications are now unified to orange in the overlay and limited to positive interactions for better stream flow.
  • Fixed a visual bug where the microphone volume bar showed 100% on zero input.
  • Strengthened API security with request validation, length limits, and data sanitization.
  • Improved microphone capture stability with a more reliable hardware reset logic during start.
1.1.0
2026-01-13

Precision & Real-time Monitoring

  • Added real-time API usage monitor (RPM, TPM, RPD) and cost estimation to the dashboard for transparent tracking.
  • Expanded free-tier conversation history limit to 20 lines, providing more context for translations.
  • Optimized prompt data limits to better utilize available TPM (Tokens Per Minute) for higher translation quality.
  • Refined AI repair instructions to better handle repetitions, fillers (umm, uh), and STT errors.
  • Fixed double-encoding issues and CSS specificity conflicts in the overlay settings.
1.0.8
2026-01-13

Branding & Transparency Update

  • Updated branding including a new logo, mobile-optimized icons, and design refinements across the platform.
  • Implemented dynamic OGP image generation to ensure appropriate social sharing for all pages.
  • Improved accessibility with better contrast ratios and reduced flickering animations.
  • Refined privacy documentation and architecture descriptions to clarify the 'Zero Server Storage' policy.
  • Corrected layout issues on the Author page and updated favicons for consistent display.
1.0.5
2026-01-13

Privacy Protection: Local Relay

  • Implemented 'Local Relay' mode using Service Workers: translation data is transferred to OBS within the browser.
  • Enhanced privacy by ensuring sensitive data remains in the user's local environment.
  • Introduced adaptive token management to balance quality and speed based on API availability.
  • Applied security headers including HSTS, X-Frame-Options, and COOP for platform safety.
1.0.0
2026-01-12

Initial Release of Momory

  • Official launch of Momory: A privacy-focused, low-latency AI subtitle service for streamers.
  • Modularized dashboard components for better maintainability and reliability.
  • Enabled vertical display of multiple target languages in the overlay.
  • Implemented initial interactive learning: AI improves style based on user feedback.