Voice recognition in vehicles

How does voice recognition work in vehicles, and what technologies and expertise drive its success? Let's explore what goes on behind the cockpit.

Since 2019, emagine has supported prominent automotive and software manufacturers in implementing voice recognition in vehicles, with a dedicated team of computational linguists, developers, and data scientists.

Approximately 50% of drivers currently use voice commands in their vehicles to control the navigation system, start playlists, or listen to spoken news updates. As in the smart home environment, voice assistants integrated into vehicles rely on advanced and extensively tested language models to understand and execute voice commands accurately.

From speech to text, from code to action

For a vehicle computer to understand voice commands accurately and execute them reliably, it needs appropriate voice data and data models. Voice input must first be converted into a technically processable format. Through this process, speech becomes actionable commands.

The actions a vehicle should perform are defined in advance, for example initiating a phone call, adjusting the air conditioning, or navigating to a destination. To trigger the action "start a phone call," a driver might use various voice commands such as "Call [person abc]" or "Start call with [person abc]," among others.

Irrespective of how the driver phrases the command, the car is designed to execute the designated action. These different variants of utterances are also predefined to ensure seamless interaction.
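To make the idea concrete, here is a minimal sketch in Python of how several predefined utterance variants could resolve to one action. It is an illustration only: the pattern table `UTTERANCE_PATTERNS`, the action names, and the function `match_action` are assumptions invented for this example, and simple regular expressions stand in for the trained language models a production system would use.

```python
import re

# Hypothetical table: every variant listed under an action triggers that action.
UTTERANCE_PATTERNS = {
    "start_phone_call": [
        r"^call (?P<person>.+)$",
        r"^start (?:a )?call with (?P<person>.+)$",
    ],
    "navigate_to": [
        r"^navigate to (?P<destination>.+)$",
        r"^take me to (?P<destination>.+)$",
    ],
}


def match_action(utterance):
    """Return (action, slots) for the first matching variant, else (None, {})."""
    # Normalize case and drop trailing punctuation before matching.
    text = utterance.strip().lower().rstrip(".!?")
    for action, patterns in UTTERANCE_PATTERNS.items():
        for pattern in patterns:
            found = re.match(pattern, text)
            if found:
                return action, found.groupdict()
    return None, {}
```

With this table, both "Call Anna" and "Start call with Anna" resolve to the same hypothetical `start_phone_call` action, and supporting a new phrasing means adding one more pattern.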

Step 1: Language

The data-driven process then starts: language data is gathered, either collected from spoken interactions or processed by native speakers.

This gradual compilation produces comprehensive corpora that serve as language or data models and form the statistical foundation. Each language needs its own corpora and its own set of rules.

Step 2: Text

Computer-based conversion transforms the voice recordings from audio format into text format. The greater the diversity of variants for a specific command, the stronger the support for voice input, leading to more precise execution of desired actions by the vehicle.

During this process, the text data is categorized and annotated with relevant labels and keywords, for instance, "calls" being mapped to "phone." As a result, the prompt is appropriately associated with the relevant topic, optimizing the system's understanding and response.
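The labeling step can be pictured with a short Python sketch. It assumes a hand-written keyword table; only the "calls" to "phone" mapping comes from the example above, while the other entries, the table name `KEYWORD_TOPICS`, and the function `annotate` are hypothetical. Real annotation is typically far richer and language-specific.

```python
# Hypothetical keyword-to-topic table; "calls" -> "phone" mirrors the example above.
KEYWORD_TOPICS = {
    "call": "phone",
    "calls": "phone",
    "dial": "phone",
    "navigate": "navigation",
    "route": "navigation",
    "playlist": "media",
}


def annotate(transcript):
    """Attach topic labels, and the keywords that triggered them, to a transcript."""
    # Split on whitespace, then strip punctuation and case from each token.
    tokens = [token.strip(".,!?").lower() for token in transcript.split()]
    keywords = [token for token in tokens if token in KEYWORD_TOPICS]
    topics = sorted({KEYWORD_TOPICS[keyword] for keyword in keywords})
    return {"text": transcript, "keywords": keywords, "topics": topics}
```

An annotated record such as the one returned by `annotate("Show my calls")` keeps the original text alongside its labels, so the same data can later be used to train or check a model.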

Step 3: Code

The next step involves generating processable code and scripts in widely used programming languages such as Python and PowerShell. Human language is transformed into computer language, and the large volumes of data culminate in a language model that drives the system's understanding and response capabilities.

Step 4: Action

Utilizing the data stored in the code, the software proceeds to execute the desired command precisely as intended. In our example scenario, the software establishes a telephone connection (without video) to the specified call partner "abc" (not "xyz") without any interruptions, fulfilling the intended action seamlessly.

What happens next?

  • The driver makes a statement.
  • Automatic Speech Recognition (ASR) recognizes speech input and transcribes it into text.
  • The software analyzes this text and classifies the subject by comparing it with the rules written for each language.
  • If the command has been correctly assigned, the dialog management system (also called the Human Machine Interface) reacts either by activating the function (starting the desired call) or by generating a voice response ("Please repeat the command.").
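The sequence above can be sketched end to end in Python. This is a toy under stated assumptions, not the production architecture: `recognize_speech` is a stand-in that treats its input as an already transcribed string, `classify` uses two hard-coded English prefixes in place of per-language rules and models, and all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Interpretation:
    """Result of classification: an action name plus extracted values."""
    action: Optional[str] = None
    slots: dict = field(default_factory=dict)


def recognize_speech(audio):
    """Stand-in for ASR: the 'audio' here is already a transcript string."""
    return audio.strip()


def classify(text):
    """Toy English rule set; real systems use rules and models per language."""
    lowered = text.lower().rstrip(".!?")
    for prefix in ("start call with ", "call "):
        if lowered.startswith(prefix):
            person = lowered[len(prefix):]
            return Interpretation("start_phone_call", {"person": person})
    return Interpretation()


def dialog_manager(interpretation):
    """Activate the function, or fall back to a spoken prompt."""
    person = interpretation.slots.get("person")
    if interpretation.action == "start_phone_call" and person:
        return f"Calling {person}."
    return "Please repeat the command."


def handle_utterance(audio):
    """Chain the stages: ASR, classification, dialog management."""
    return dialog_manager(classify(recognize_speech(audio)))
```

In this sketch, `handle_utterance("Call abc")` returns a confirmation for the call, while an utterance the rules cannot assign, such as "Open the sunroof", produces the fallback prompt.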

 


Speech models from emagine

Since 2019, we have been providing expert guidance and assistance to prominent automotive and software manufacturers, aiding them in integrating cutting-edge voice technology into their vehicles. Through close technical collaboration with our clients' development teams, emagine has successfully crafted all the speech models for the current vehicles of the world's second-largest car manufacturer.

Additionally, we proudly support the world market leader in speech recognition software. The system operates with near-flawless efficiency across an impressive array of 17 languages.

Under the skillful leadership of a technical lead, our team comprises NLU/NLP experts, computational linguists, machine learning specialists, data scientists, and native speakers, specifically assembled to cater to the needs of our valued customers.

At present, our team proficiently handles 17 of the world's most widely used languages. emagine takes full ownership of both the commercial and technical aspects of the project, ensuring absolute accountability throughout the entire process.
