Patch Notes
A chronicle of Momory's evolution and continuous improvements for a better streaming experience.
3.1.0
2026-02-05
Next-Gen Engine: Media & VC Mode Overhaul
- Overhauled Media and VC modes to use the latest Web Speech API standard, passing tab audio directly to the on-device SODA engine for superior accuracy, zero cost, and low latency.
- Applied the robust self-healing architecture from Mic Mode to all modes, ensuring continuous transcription even with `no-speech` errors or other interruptions.
- Fixed a critical bug where the translation API key could become stale, causing translations to fail across all modes. Also fixed a bug where transcriptions were not displayed correctly in VC mode.
- Fully updated all documentation (Terms, Guide, Tech, FAQ) to reflect the new architecture, provide clearer instructions, and add important disclaimers.
- Removed the deprecated high-precision (Whisper) AI mode from the UI and deleted all associated files, streamlining the codebase and user experience.
3.0.0
2026-02-03
Multi-Mode Support: Discord & Media Audio Translation
- Introduced three distinct input modes: 'Microphone' (your voice), 'Media' (Discord, browser tabs, etc.), and 'Collab/VC' (both yours and Discord voices simultaneously).
- Implemented 'Direct Media SODA' architecture, leveraging advanced browser features for high-precision, low-latency transcription of media audio.
- Optimized conversation history retention to provide richer context for AI: increased up to 60 items for Paid Tier and 20 items for Free Tier.
- Fully updated the Setup Guide, FAQ, and Tech Insights to cover the new input modes and technical background.
- Added explicit disclaimers to Terms of Service regarding privacy and copyright when using Media and Collab modes.
- Unified transcription handling logic across all modes to improve system stability and response consistency.
2.3.0
2026-01-30
Media Translation 2.0 & Advanced VAD
- Completely overhauled 'Media Translation Mode' (Tab Audio). It now utilizes an in-browser Whisper AI pipeline, significantly improving accuracy and reducing costs compared to the previous direct-stream method.
- Implemented a new 'Spectral VAD' (Voice Activity Detection) engine. It analyzes spectral flatness (SFM) to intelligently distinguish speech from BGM and noise, enabling robust transcription even during gaming or movie playback.
- Added automatic VAD tuning for Japanese speech, optimizing sensitivity to capture soft-spoken phrases and trailing whispers typical in Japanese.
- Media Mode now supports OBS WebSocket integration, allowing translated subtitles from tab audio to be sent directly to OBS text sources.
- Unified the Dashboard Preview UI, enabling real-time verification of both transcription and translation in Media Mode, just like in Microphone Mode.
2.2.0
2026-01-22
SEO & Rich Results Optimization
- Implemented comprehensive structured data (JSON-LD) for WebSite, Organization, and SoftwareApplication to enable Google Rich Results.
- Added FAQPage schema to the FAQ section, allowing questions and answers to appear directly in search results.
- Integrated BreadcrumbList across all pages to support hierarchical URL display in search engines.
- Automated sitemap.xml and robots.txt generation to improve crawl efficiency and index coverage.
- Unified brand identity on Japanese pages by prioritizing the 'ももりー' label for better local search visibility.
- Refined the FAQ page layout to perfectly align with other documentation pages for a consistent UI experience.
2.1.0
2026-01-22
Enhanced Documentation & Navigation
- Detailed our technical philosophy and architecture on the landing page and FAQ to provide deeper insights into Momory's design.
- Added comprehensive explanations of the 'Adaptive Debounce Algorithm' in Tech Insights and FAQ, clarifying how we achieve balanced latency and precision.
- Improved cross-page anchor support to ensure smooth navigation to specific sections across different languages.
- Refined internal linking structure for a more seamless and consistent multi-language browsing experience.
2.0.0
2026-01-19
OBS WebSocket Integration & Enhanced Streaming
- Introduced OBS WebSocket integration, allowing direct subtitle transfer to a text source in OBS. This eliminates the need for window capture, improving streaming performance and stability.
- Added a dedicated control panel on the dashboard to configure and manage your connection to OBS. Simply enter your connection details to get started.
- Provided a comprehensive, step-by-step setup guide for both the new WebSocket method (recommended) and the traditional overlay window method.
- Ensured multiple translation languages are correctly formatted with line breaks when sent to OBS.
- Improved internal connection logic for a more robust and reliable experience.
1.4.1
2026-01-16
Self-Healing Microphone & Stability Fixes
- Implemented a 'Self-Healing' architecture for the microphone. It now automatically restarts in the background if interrupted by system resource pressure, ensuring continuous capture.
- Removed an unnecessary page reload workaround for mobile Safari, improving the user experience.
- Updated FAQ with the latest OBS setup instructions and more specific future development goals.
- Fixed an issue preventing the API Usage Monitor from updating in real-time.
- Corrected translation keys for Open Graph images to ensure titles and descriptions display correctly on social media.
1.4.0
2026-01-16
FAQ Section & Enhanced Speech Stability
- Added a comprehensive, multilingual FAQ page to provide self-service support and troubleshooting.
- Enhanced the FAQ page with smooth-scroll navigation and content categorization for a better user experience.
- Further improved the stability of the 'Standard' speech recognition engine to prevent unexpected stalls.
- Centralized Open Graph (OG) image generation to ensure consistent and beautiful social sharing previews.
- Resolved several build errors related to dependencies and CSS configuration to improve platform reliability.
1.3.0
2026-01-15
High-Precision AI Mode & Enhanced Responsiveness
- Introduced 'High-Precision AI Mode' using a WebGPU-powered Whisper model for exceptionally accurate transcription.
- Re-engineered the AI transcription engine, removing the 1-second buffer previously used for stabilization, resulting in a more immediate transcription experience.
- Achieved robust voice activity detection (VAD) by leveraging the browser's native Web Speech API, significantly improving stability.
- Implemented 'Power Saving Mode': dashboard UI updates are now paused when the tab is in the background to reduce CPU load.
- Optimized dashboard performance by throttling microphone volume meter updates, improving responsiveness.
1.2.0
2026-01-14
Stability & UX Enhancements
- Improved speech recognition robustness with a 15-second heartbeat monitor to recover from silent freezes.
- Implemented an automatic page reload trigger for mobile Safari when returning from background to address layout issues.
- Refined feedback system: notifications are now unified to orange in the overlay and limited to positive interactions for better stream flow.
- Fixed a visual bug where the microphone volume bar showed 100% on zero input.
- Strengthened API security with request validation, length limits, and data sanitization.
- Improved microphone capture stability with a more reliable hardware reset logic during start.
1.1.0
2026-01-13
Precision & Real-time Monitoring
- Added real-time API usage monitor (RPM, TPM, RPD) and cost estimation to the dashboard for transparent tracking.
- Expanded free-tier conversation history limit to 20 lines, providing more context for translations.
- Optimized prompt data limits to better utilize available TPM (Tokens Per Minute) for higher translation quality.
- Refined AI repair instructions to better handle repetitions, fillers (umm, uh), and STT errors.
- Fixed double-encoding issues and CSS specificity conflicts in the overlay settings.
1.0.8
2026-01-13
Branding & Transparency Update
- Updated branding including a new logo, mobile-optimized icons, and design refinements across the platform.
- Implemented dynamic OGP image generation to ensure appropriate social sharing for all pages.
- Improved accessibility with better contrast ratios and reduced flickering animations.
- Refined privacy documentation and architecture descriptions to clarify the 'Zero Server Storage' policy.
- Corrected layout issues on the Author page and updated favicons for consistent display.
1.0.5
2026-01-13
Privacy Protection: Local Relay
- Implemented 'Local Relay' mode using Service Workers: translation data is transferred to OBS within the browser.
- Enhanced privacy by ensuring sensitive data remains in the user's local environment.
- Introduced adaptive token management to balance quality and speed based on API availability.
- Applied security headers including HSTS, X-Frame-Options, and COOP for platform safety.
1.0.0
2026-01-12
Initial Release of Momory
- Official launch of Momory: A privacy-focused, low-latency AI subtitle service for streamers.
- Modularized dashboard components for better maintainability and reliability.
- Enabled vertical display of multiple target languages in the overlay.
- Implemented initial interactive learning: AI improves style based on user feedback.