Although we’ve been working on speech recognition for several years, every new language requires our engineers and scientists to tackle unique challenges. Our most recent additions - Croatian, Filipino, Ukrainian, and Vietnamese - required creative solutions to reflect how each language is used across devices and in everyday conversations.
For example, since Vietnamese is a tonal language, we had to explore how to take tones into consideration. One simple technique is to model the tone and vowel combinations (tonemes) directly in our lexicons. This, however, has the side effect of a larger phonetic inventory, so we had to come up with special algorithms to handle the increased complexity. Additionally, Vietnamese is a heavily diacritized language, with tone markers on a majority of syllables. Since Google Search is very good at returning valid results even when diacritics are omitted, our Vietnamese users frequently omit the diacritics when typing their queries. This creates difficulties for the speech recognizer, which selects its vocabulary from typed queries. To address this, we created a special diacritic restoration algorithm, which enables us to present properly formatted text to our users in the majority of cases.
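To make the diacritic problem concrete, here is a minimal Python sketch of one simple way to restore diacritics: map each undiacritized word to the diacritized form seen most often in properly written text. This is an illustrative assumption and not the algorithm used in our recognizer; the function names and the tiny corpus are hypothetical.

```python
import unicodedata
from collections import Counter, defaultdict


def strip_diacritics(text: str) -> str:
    """Remove tone and vowel marks, the way a user might type a query."""
    # "đ" and "Đ" have no canonical decomposition, so map them by hand.
    text = text.replace("đ", "d").replace("Đ", "D")
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_restoration_table(corpus):
    """Map each stripped word to its most frequent diacritized form."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for word in sentence.split():
            word = unicodedata.normalize("NFC", word)
            counts[strip_diacritics(word)][word] += 1
    return {plain: forms.most_common(1)[0][0] for plain, forms in counts.items()}


def restore_diacritics(query: str, table) -> str:
    """Replace each known word; leave unknown words untouched."""
    return " ".join(table.get(word, word) for word in query.split())


# Hypothetical, properly diacritized training text.
corpus = ["thời tiết hôm nay", "thời tiết Hà Nội", "hôm nay"]
table = build_restoration_table(corpus)
print(restore_diacritics("thoi tiet hom nay", table))
```

A word-by-word lookup like this ignores context, so it cannot choose between two valid words that share the same undiacritized spelling. A practical system would need to score the alternatives in context.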
Filipino also presented interesting challenges. Much like people in other multilingual societies, such as Hong Kong, India, and South Africa, Filipinos often mix several languages in their daily life. This is called code switching. Code switching complicates the design of pronunciation, language, and acoustic models. Speech scientists are effectively faced with a dilemma: should we build one system per language, or should we combine all languages into one?
In such situations we prefer to model the reality of daily language use in our speech recognizer design. If users mix several languages, our recognizers should do their best to model this behavior. Hence our Filipino voice search system, while mainly focused on the Filipino language, also allows users to mix in English terms.
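One common way to let a mainly single-language recognizer accept terms from a second language is to interpolate two language models. The Python sketch below shows the idea with toy unigram models. It is an assumption for illustration only, not a description of our production system; the example queries and the weight of 0.8 are made up.

```python
from collections import Counter


def unigram_probs(tokens):
    """Estimate unigram probabilities from a list of tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(primary, secondary, primary_weight=0.8):
    """Linear mixture: weight * P_primary(w) + (1 - weight) * P_secondary(w)."""
    vocab = set(primary) | set(secondary)
    return {
        word: primary_weight * primary.get(word, 0.0)
        + (1 - primary_weight) * secondary.get(word, 0.0)
        for word in vocab
    }


# Toy data standing in for per-language query logs (hypothetical).
filipino = unigram_probs("saan ang pinakamalapit na bangko".split())
english = unigram_probs("nearest restaurant open now".split())

mixed = interpolate(filipino, english, primary_weight=0.8)
print(mixed["saan"], mixed["restaurant"])
```

With these weights the mixed model still favors Filipino words, but an English term such as "restaurant" no longer has zero probability, so a query that mixes both languages can be recognized.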
The algorithms we’re using to model how speech sounds are spoken in each language make use of our distributed large-scale neural network learning infrastructure (yes, the same one that spontaneously discovered cats on YouTube!). By partitioning the gigantic parameter set of the model, and by evaluating each partition on a separate computation server, we’re able to achieve unprecedented levels of parallelism in training acoustic models.
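The partitioning idea can be sketched in a few lines of Python. In this toy analogy, the rows of one layer's weight matrix are split into shards, each shard is evaluated by a separate worker, and the partial outputs are concatenated. Threads stand in for computation servers here, and all names are hypothetical; this is not our actual infrastructure.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_rows(weights, num_partitions):
    """Split the weight matrix into contiguous row shards."""
    size = -(-len(weights) // num_partitions)  # ceiling division
    return [weights[i:i + size] for i in range(0, len(weights), size)]


def evaluate_partition(rows, inputs):
    """Compute the outputs owned by one shard (one dot product per row)."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in rows]


def parallel_layer(weights, inputs, num_partitions=2):
    """Evaluate every shard on its own worker and stitch the results together."""
    if not weights:
        return []
    shards = partition_rows(weights, num_partitions)
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partial_outputs = list(
            pool.map(evaluate_partition, shards, [inputs] * len(shards))
        )
    return [value for part in partial_outputs for value in part]


weights = [[1, 0, 2], [0, 1, 0], [3, 1, 1], [0, 0, 1]]
inputs = [1, 2, 3]
print(parallel_layer(weights, inputs))  # prints [7, 2, 8, 3]
```

Each worker needs only its own shard of the parameters plus the shared inputs, which is what allows a model that is too large for one machine to be spread across many.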
The more people use Google speech recognition products, the more accurate the technology becomes. These new neural network technologies will help us bring you lots of improvements and many more languages in the future.