<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Automatic Speech Recognition Archives - Gizmochina</title>
	<atom:link href="https://www.gizmochina.com/tag/automatic-speech-recognition/feed/" rel="self" type="application/rss+xml" />
	<link>https://www.gizmochina.com/tag/automatic-speech-recognition/</link>
	<description>Latest Tech News, Product Reviews and Deals</description>
	<lastBuildDate>Tue, 23 May 2023 04:05:29 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=5.9.9</generator>
	<item>
		<title>Meta&#8217;s Massively Multilingual Speech to Redefine Boundaries of Language</title>
		<link>https://www.gizmochina.com/2023/05/23/meta-massively-multilingual-speech-model/</link>
		
		<dc:creator><![CDATA[Anubhav]]></dc:creator>
		<pubDate>Tue, 23 May 2023 04:01:06 +0000</pubDate>
				<category><![CDATA[News]]></category>
		<category><![CDATA[Automatic Speech Recognition]]></category>
		<category><![CDATA[Meta]]></category>
		<guid isPermaLink="false">https://www.gizmochina.com/?p=538818</guid>

					<description><![CDATA[<img width="300" height="200" src="https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-300x200.jpg?x10805" class="webfeedsFeaturedVisual wp-post-image" alt="Meta" style="display: block; margin: auto; margin-bottom: 5px;max-width: 100%;" link_thumbnail="" srcset="https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-300x200.jpg 300w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-1024x683.jpg 1024w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-768x512.jpg 768w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-696x464.jpg 696w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-1068x712.jpg 1068w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-630x420.jpg 630w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2.jpg 1536w" sizes="(max-width: 300px) 100vw, 300px" /><p>Multilingual speech projects represent a significant leap forward in advancing language technology and promoting global linguistic diversity. These projects utilize AI language models to recognize and generate speech in a wide array of languages, often spanning thousands of diverse linguistic backgrounds. By leveraging innovative approaches, such as incorporating unconventional data sources or employing self-supervised speech [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://www.gizmochina.com/2023/05/23/meta-massively-multilingual-speech-model/">Meta&#8217;s Massively Multilingual Speech to Redefine Boundaries of Language</a> appeared first on <a rel="nofollow" href="https://www.gizmochina.com">Gizmochina</a>.</p>
]]></description>
										<content:encoded><![CDATA[<img width="300" height="200" src="https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-300x200.jpg?x10805" class="webfeedsFeaturedVisual wp-post-image" alt="Meta" loading="lazy" style="display: block; margin: auto; margin-bottom: 5px;max-width: 100%;" link_thumbnail="" srcset="https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-300x200.jpg 300w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-1024x683.jpg 1024w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-768x512.jpg 768w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-696x464.jpg 696w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-1068x712.jpg 1068w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-630x420.jpg 630w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2.jpg 1536w" sizes="(max-width: 300px) 100vw, 300px" />
<p>Multilingual speech projects represent a significant leap forward in advancing language technology and promoting global linguistic diversity. These projects utilize <a href="http://gizmochina.com/tag/artificial-intelligence">AI</a> language models to recognize and generate speech in a wide array of languages, often spanning thousands of diverse linguistic backgrounds. By leveraging innovative approaches, such as incorporating unconventional data sources or employing self-supervised speech representation learning, multilingual speech projects aim to break barriers and empower individuals to communicate, learn, and access information in their native languages. </p>



<h3>Meta has decided to put MMS out as an Open Source Project</h3>



<p>Meta has released its latest AI language work, the Massively Multilingual Speech (MMS) project, which sets it apart from mere <a href="http://gizmochina.com/tag/chatgpt">ChatGPT</a> replicas. Meta&#8217;s MMS can identify more than 4,000 spoken languages and can recognize and generate speech in over 1,100 of them, well beyond the coverage of earlier models. Rather than keep the work under wraps, Meta has decided to open-source MMS, inviting researchers to build on its foundation. In doing so, <a href="http://gizmochina.com/tag/meta">Meta</a> aims to help preserve language diversity and encourage collaborative advancement in the field.</p>



<div class="wp-block-image"><figure class="aligncenter size-large"><img loading="lazy" width="1024" height="683" src="https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-1024x683.jpg?x10805" alt="Meta" class="wp-image-538825" srcset="https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-1024x683.jpg 1024w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-300x200.jpg 300w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-768x512.jpg 768w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-696x464.jpg 696w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-1068x712.jpg 1068w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2-630x420.jpg 630w, https://www.gizmochina.com/wp-content/uploads/2023/05/dima-solomin-mr26tQgHGmc-unsplash-2048x1365-1-1536x1024-1-2.jpg 1536w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure></div>



<p>Traditional speech recognition and text-to-speech models require extensive training on vast audio datasets paired with carefully transcribed labels. However, many endangered languages, predominantly found outside industrialized nations, lack such data, placing them at risk of vanishing altogether. To work around this, Meta tapped into translated religious texts. Texts such as the Bible have been translated into many languages, and those translations have been studied extensively in text-based language translation research.</p>



<p>Employing the wav2vec 2.0 model for self-supervised speech representation learning, Meta further refined the data&#8217;s usability by training an alignment model. The synergy between unorthodox data sources and self-supervised speech modeling yielded remarkable results. Comparative evaluations against <a href="http://gizmochina.com/tag/openai">OpenAI</a>&#8217;s Whisper revealed MMS&#8217;s superiority, achieving a 50% reduction in word error rate while surpassing Whisper&#8217;s language coverage by a staggering factor of 11.</p>
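<p>To make the word error rate comparison concrete, here is a minimal Python sketch of how the metric is commonly computed: the word-level edit distance between a reference transcript and a system's output, divided by the number of reference words. The function name and the sample sentences are illustrative assumptions and are not taken from Meta's evaluation.</p>

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical outputs from two systems for the same spoken sentence.
reference = "the cat sat on the mat"
system_a = "the cat sit on mat"   # one substitution and one deletion
system_b = "the cat sat on mat"   # one deletion

wer_a = word_error_rate(reference, system_a)
wer_b = word_error_rate(reference, system_b)
relative_reduction = (wer_a - wer_b) / wer_a
print(f"WER A: {wer_a:.3f}, WER B: {wer_b:.3f}, reduction: {relative_reduction:.0%}")
```

<p>In this toy case the second system makes one error where the first makes two, so its error rate is half as large, which is what a 50% relative reduction means.</p>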



<p>With the release of MMS as an open-source research project, Meta aspires to reverse the concerning trend of technology eroding linguistic diversity, often limiting support to the most common 100 languages favoured by tech giants. Envisioning a world where assistive technology, text-to-speech, and even <a href="http://gizmochina.com/tag/virtual-reality">virtual</a> and <a href="http://gizmochina.com/tag/augmented-reality">augmented reality</a> technologies enable individuals to communicate and learn in their native tongues, Meta hopes to inspire the preservation and vitality of languages worldwide.</p>



<p><strong><span style="text-decoration: underline">RELATED:</span></strong></p>



<ul><li><a href="https://www.gizmochina.com/2023/05/22/meta-license-magic-leap-ar-technology/">Meta reportedly in talks to license Magic Leap’s AR technology</a></li><li><a href="https://www.gizmochina.com/2023/05/20/project-p92-twitter-alternative-instagra/">Twitter Alternative, Codenamed P92, may be Unveiled by Meta Next Month</a></li><li><a href="https://www.gizmochina.com/guides/best-gaming-controllers-of-2023-enhance-your-gaming-experience/">Best Gaming Controllers of 2023: Enhance Your Gaming Experience</a></li></ul>






<p>(<a href="https://www.engadget.com/metas-open-source-speech-ai-recognizes-over-4000-spoken-languages-161508200.html">Via</a>)</p>
<p>The post <a rel="nofollow" href="https://www.gizmochina.com/2023/05/23/meta-massively-multilingual-speech-model/">Meta&#8217;s Massively Multilingual Speech to Redefine Boundaries of Language</a> appeared first on <a rel="nofollow" href="https://www.gizmochina.com">Gizmochina</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Amazon&#8217;s Echo devices get a Live Translation feature</title>
		<link>https://www.gizmochina.com/2020/12/15/amazon-live-translation-feature-echo-devices/</link>
		
		<dc:creator><![CDATA[Jed John Ikoba]]></dc:creator>
		<pubDate>Tue, 15 Dec 2020 16:22:38 +0000</pubDate>
				<category><![CDATA[Amazon]]></category>
		<category><![CDATA[News]]></category>
		<category><![CDATA[Alexa]]></category>
		<category><![CDATA[Amazon Alexa]]></category>
		<category><![CDATA[Amazon Echo]]></category>
		<category><![CDATA[Amazon Echo Live translate]]></category>
		<category><![CDATA[Amazon's Echo Live Translate]]></category>
		<category><![CDATA[Automatic Speech Recognition]]></category>
		<category><![CDATA[Live Translate]]></category>
		<guid isPermaLink="false">https://www.gizmochina.com/?p=360344</guid>

					<description><![CDATA[<img width="300" height="175" src="https://www.gizmochina.com/wp-content/uploads/2019/04/Amazon-Echo-Show-300x175.png?x10805" class="webfeedsFeaturedVisual wp-post-image" alt="Amazon Echo Show" loading="lazy" style="display: block; margin: auto; margin-bottom: 5px;max-width: 100%;" link_thumbnail="" srcset="https://www.gizmochina.com/wp-content/uploads/2019/04/Amazon-Echo-Show-300x175.png 300w, https://www.gizmochina.com/wp-content/uploads/2019/04/Amazon-Echo-Show-696x406.png 696w, https://www.gizmochina.com/wp-content/uploads/2019/04/Amazon-Echo-Show.png 720w" sizes="(max-width: 300px) 100vw, 300px" /><p>Amazon has just announced that its virtual assistant, Alexa, now supports a Live Translation feature, allowing users who speak different languages to communicate with each other. The complex process will involve the virtual assistant translating both sides of such a conversation. At the moment, Alexa can translate English, Spanish, German, Italian, Brazilian Portuguese, and Hindi, [&#8230;]</p>
<p>The post <a rel="nofollow" href="https://www.gizmochina.com/2020/12/15/amazon-live-translation-feature-echo-devices/">Amazon&#8217;s Echo devices get a Live Translation feature</a> appeared first on <a rel="nofollow" href="https://www.gizmochina.com">Gizmochina</a>.</p>
]]></description>
										<content:encoded><![CDATA[<img width="300" height="175" src="https://www.gizmochina.com/wp-content/uploads/2019/04/Amazon-Echo-Show-300x175.png?x10805" class="webfeedsFeaturedVisual wp-post-image" alt="Amazon Echo Show" loading="lazy" style="display: block; margin: auto; margin-bottom: 5px;max-width: 100%;" link_thumbnail="" srcset="https://www.gizmochina.com/wp-content/uploads/2019/04/Amazon-Echo-Show-300x175.png 300w, https://www.gizmochina.com/wp-content/uploads/2019/04/Amazon-Echo-Show-696x406.png 696w, https://www.gizmochina.com/wp-content/uploads/2019/04/Amazon-Echo-Show.png 720w" sizes="(max-width: 300px) 100vw, 300px" /><p><span style="font-weight: 400"><a href="http://gizmochina.com/tag/amazon">Amazon</a> has just announced that its virtual assistant, Alexa, now supports a Live Translation feature, allowing users who speak different languages to communicate with each other. The complex process will involve the virtual assistant translating both sides of such a conversation.</span></p>
<p><figure id="attachment_344559" aria-describedby="caption-attachment-344559" style="width: 1200px" class="wp-caption aligncenter"><a href="https://www.gizmochina.com/wp-content/uploads/2020/09/amazon-echo-dot-3.jpg?x10805"><img loading="lazy" class="size-full wp-image-344559" src="https://www.gizmochina.com/wp-content/uploads/2020/09/amazon-echo-dot-3.jpg?x10805" alt="Amazon Echo" width="1200" height="800" srcset="https://www.gizmochina.com/wp-content/uploads/2020/09/amazon-echo-dot-3.jpg 1200w, https://www.gizmochina.com/wp-content/uploads/2020/09/amazon-echo-dot-3-300x200.jpg 300w, https://www.gizmochina.com/wp-content/uploads/2020/09/amazon-echo-dot-3-768x512.jpg 768w, https://www.gizmochina.com/wp-content/uploads/2020/09/amazon-echo-dot-3-1024x683.jpg 1024w, https://www.gizmochina.com/wp-content/uploads/2020/09/amazon-echo-dot-3-696x464.jpg 696w, https://www.gizmochina.com/wp-content/uploads/2020/09/amazon-echo-dot-3-1068x712.jpg 1068w, https://www.gizmochina.com/wp-content/uploads/2020/09/amazon-echo-dot-3-630x420.jpg 630w" sizes="(max-width: 1200px) 100vw, 1200px" /></a><figcaption id="caption-attachment-344559" class="wp-caption-text">Amazon Echo Dot Kids Edition</figcaption></figure></p>
<p><span style="font-weight: 400">At the moment, Alexa can translate English, Spanish, German, Italian, Brazilian Portuguese, and Hindi, with the Live Translation feature. Amazon said the feature is supported on Echo devices with the location set to English U.S.</span></p>
<p><span style="font-weight: 400">The Live Translation feature draws on existing Amazon systems, including Alexa’s automatic-speech-recognition (ASR) and text-to-speech systems. These work together with Amazon Translate and Amazon&#8217;s machine learning models, which are designed and optimized for conversational-speech translation, to produce the Live Translation feature, Amazon said in a recent statement.</span></p>
<p><div class="su-note"  style="border-color:#e5e5e5;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><div class="su-note-inner su-u-clearfix su-u-trim" style="background-color:#ffffff;border-color:#ffffff;color:#333333;border-radius:3px;-moz-border-radius:3px;-webkit-border-radius:3px;"><strong>Editor&#8217;s Pick: <a title="Nokia 5.4 launched with Snapdragon 662, 48MP quad cameras and punch-hole display" href="https://www.gizmochina.com/2020/12/15/nokia-5-4-launched-with-snapdragon-662-48mp-quad-cameras-and-punch-hole-display/" rel="bookmark">Nokia 5.4 launched with Snapdragon 662, 48MP quad cameras and punch-hole display</a></strong></div></div></p>
<p><span style="font-weight: 400">The arrival of this translation feature opens up new possibilities for real-time communication by lowering the language barrier.</span></p>
<p><span style="font-weight: 400">Like most ASR systems, Amazon said, the live translation system incorporates both an acoustic model and a language model, which together help the ASR system decide between alternative interpretations of the same sequence of phonemes.<a href="https://www.gizmochina.com/wp-content/uploads/2020/02/Amazon-Echo-Show-8.jpg?x10805"><img loading="lazy" class="aligncenter size-full wp-image-304037" src="https://www.gizmochina.com/wp-content/uploads/2020/02/Amazon-Echo-Show-8.jpg?x10805" alt="Amazon Echo Show 8" width="1000" height="1000" srcset="https://www.gizmochina.com/wp-content/uploads/2020/02/Amazon-Echo-Show-8.jpg 1000w, https://www.gizmochina.com/wp-content/uploads/2020/02/Amazon-Echo-Show-8-150x150.jpg 150w, https://www.gizmochina.com/wp-content/uploads/2020/02/Amazon-Echo-Show-8-300x300.jpg 300w, https://www.gizmochina.com/wp-content/uploads/2020/02/Amazon-Echo-Show-8-768x768.jpg 768w, https://www.gizmochina.com/wp-content/uploads/2020/02/Amazon-Echo-Show-8-696x696.jpg 696w, https://www.gizmochina.com/wp-content/uploads/2020/02/Amazon-Echo-Show-8-420x420.jpg 420w" sizes="(max-width: 1000px) 100vw, 1000px" /></a></span></p>
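<p>As a rough illustration of how an acoustic model and a language model can jointly pick between interpretations that sound alike, the Python sketch below adds the two models' log-probabilities for each candidate transcript and keeps the highest total. The candidate phrases, the probability values, the weight parameter and the function name are all hypothetical; this is a toy under those assumptions and not a description of Amazon's system.</p>

```python
import math

# Hypothetical acoustic scores: how well each word sequence matches the audio.
# The two phrases sound nearly identical, so the scores are close.
ACOUSTIC_LOGPROB = {
    "recognize speech": math.log(0.40),
    "wreck a nice beach": math.log(0.42),
}

# Hypothetical language-model scores: how plausible each phrase is as text.
LANGUAGE_LOGPROB = {
    "recognize speech": math.log(0.05),
    "wreck a nice beach": math.log(0.001),
}


def best_transcript(candidates, lm_weight=1.0):
    """Return the candidate with the highest combined log-probability."""
    def combined(text):
        return ACOUSTIC_LOGPROB[text] + lm_weight * LANGUAGE_LOGPROB[text]
    return max(candidates, key=combined)


candidates = ["recognize speech", "wreck a nice beach"]
acoustic_only = best_transcript(candidates, lm_weight=0.0)
with_language_model = best_transcript(candidates, lm_weight=1.0)
print("Acoustic model alone:", acoustic_only)
print("With language model: ", with_language_model)
```

<p>With these made-up numbers the acoustic score alone slightly prefers the less sensible phrase, and the language model tips the decision toward the more plausible one.</p>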
<p><span style="font-weight: 400">To provide users with a near-natural experience, Amazon adapted Alexa to conversational speech by modifying its end-pointer to accommodate longer pauses for the Live Translation feature. Alexa uses the end-pointer to determine when a person has finished speaking. Alexa can also identify pauses both in the middle and at the end of sentences.</span></p>
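<p>The idea of an end-pointer that tolerates longer pauses can be sketched with a simple silence timer. The Python below is a toy under stated assumptions: it takes a list of per-frame speech/silence flags (as a voice activity detector might produce) and declares the utterance finished once silence has lasted a chosen duration. The function name, the frame length and the thresholds are hypothetical, and Amazon has not said its end-pointer works this way.</p>

```python
def find_endpoint(frames_are_speech, frame_ms, max_pause_ms):
    """Return the index of the frame where the speaker is judged finished, or None."""
    silence_ms = 0
    heard_speech = False
    for index, is_speech in enumerate(frames_are_speech):
        if is_speech:
            heard_speech = True
            silence_ms = 0  # any speech resets the pause timer
        elif heard_speech:
            silence_ms += frame_ms
            if silence_ms >= max_pause_ms:
                return index
    return None


# Hypothetical utterance in 100 ms frames: speech, a 400 ms pause, more speech, then silence.
frames = [True] * 5 + [False] * 4 + [True] * 5 + [False] * 10

short_pause_endpoint = find_endpoint(frames, frame_ms=100, max_pause_ms=300)
long_pause_endpoint = find_endpoint(frames, frame_ms=100, max_pause_ms=800)
print("Endpoint with 300 ms tolerance:", short_pause_endpoint)
print("Endpoint with 800 ms tolerance:", long_pause_endpoint)
```

<p>With the shorter tolerance the speaker is cut off during the mid-sentence pause, while the longer tolerance waits until the trailing silence, which is the trade-off a conversational setting calls for.</p>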
<p><span style="font-weight: 400">Search engine giant Google also offers a similar feature called Interpreter Mode. It remains to be seen how the two services compare side by side, but this certainly provides an additional option for live translation and inter-language communication.</span></p>
<p><strong>UP NEXT: <a title="Poll of The Week: Have you received the Android 11 update?" href="https://www.gizmochina.com/2020/12/14/poll-of-the-week-have-you-received-the-android-11-update/" rel="bookmark">Poll of The Week: Have you received the Android 11 update?</a></strong></p>
<p>The post <a rel="nofollow" href="https://www.gizmochina.com/2020/12/15/amazon-live-translation-feature-echo-devices/">Amazon&#8217;s Echo devices get a Live Translation feature</a> appeared first on <a rel="nofollow" href="https://www.gizmochina.com">Gizmochina</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
