BMW is becoming an annual presence at the Mobile World Congress in Barcelona. Hosted this week in the beautiful Catalan city, the 2019 event gives BMW the opportunity to introduce new tech products. One of them is BMW Natural Interaction, which combines the most advanced voice command technology available with expanded gesture control and gaze recognition to enable genuine multimodal operation for the first time. The first BMW Natural Interaction functions will be available in the BMW iNEXT from 2021.

BMW Natural Interaction allows the driver to use their voice, gestures and gaze at the same time in various combinations to interact with their vehicle. The preferred mode of operation can be selected intuitively, according to the situation and context. The vehicle reliably detects voice commands, gestures and the direction of gaze, combines them and executes the desired operation.

This free, multimodal interaction is made possible by speech recognition, optimized sensor technology and context-sensitive analysis of gestures. Spoken instructions are registered and processed using Natural Language Understanding. An intelligent learning algorithm, which is constantly being refined, combines and interprets the complex information so that the vehicle can respond accordingly. This creates a multimodal interactive experience geared towards the driver’s wishes.

The driver decides how they want to interact with the car, based on their own personal preferences, habits or the current situation. When the driver is engaged in conversation, they would probably choose gesture and gaze control; when their eyes are on the road, it is better to rely on speech and gestures. In this way, for example, car windows or the sunroof can be opened or closed, air vents adjusted or a selection made on the Control Display. If the driver wants to learn more about vehicle functions, they can also point to buttons and ask what they do.

With enhanced gesture recognition and the car’s high level of connectivity, the interaction space is no longer confined to the interior. For the first time, occupants will be able to interact with their direct surroundings, such as buildings or parking spaces. Even complex queries can be answered quickly and easily by pointing a finger and issuing a voice command. “What’s this building? How long is that business open? What is this restaurant called? Can I park here and what does it cost?”

“Customers should be able to communicate with their intelligent connected vehicle in a totally natural way,” explains Christoph Grote, Senior Vice President, BMW Group Electronics. “People shouldn’t have to think about which operating strategy to use to get what they want. They should always be able to decide freely – and the car should still understand them. BMW Natural Interaction is also an important step for the future of autonomous vehicles, when interior concepts will no longer be geared solely towards the driver’s position and occupants will have more freedom.”

The advances in recognition and evaluation of voice commands, gestures and gaze required for natural driver-vehicle interaction are delivered by improved sensor and analysis technologies. Using an infrared light signal, the gesture camera can now capture hand and finger movements in three dimensions throughout the driver’s entire operating environment and determine a precise directional vector. For example, pointing a forefinger at the Control Display and saying a command is sufficient to initiate the desired operation without touching the screen.
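To make the idea of a "directional vector" concrete, here is a minimal sketch of how a pointing direction could be matched to an object in the cabin. It is an illustration only, not BMW's implementation: the function names, the cabin coordinates in `TARGETS` and the 10-degree tolerance are all assumptions made for this example.

```python
import math

# Hypothetical cabin object positions in metres (assumed for illustration).
TARGETS = {
    "control_display": (0.0, 0.6, 0.1),
    "left_window": (-0.8, 0.2, 0.2),
    "sunroof": (0.0, 0.0, 0.9),
}

def direction(origin, tip):
    """Unit vector pointing from origin to tip."""
    d = [t - o for o, t in zip(origin, tip)]
    norm = math.sqrt(sum(c * c for c in d))
    if norm == 0:
        raise ValueError("origin and tip coincide")
    return [c / norm for c in d]

def angle_to(origin, unit_dir, target):
    """Angle in degrees between the pointing ray and the line to target."""
    to_target = direction(origin, target)
    dot = sum(a * b for a, b in zip(unit_dir, to_target))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, dot))))

def pointed_target(origin, tip, targets, max_angle_deg=10.0):
    """Name of the target closest to the pointing ray, or None if none is close."""
    if not targets:
        return None
    unit_dir = direction(origin, tip)
    best = min(targets, key=lambda name: angle_to(origin, unit_dir, targets[name]))
    if angle_to(origin, unit_dir, targets[best]) <= max_angle_deg:
        return best
    return None
```

With the hand at `(0, 0, 0)` and the fingertip at `(0, 0.3, 0.05)`, `pointed_target` selects `"control_display"`, because the finger lies on the line to that object. A real system would work with noisy camera data and would need filtering over time, which this sketch leaves out.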

The high-definition camera integrated into the instrument cluster also registers head and eye direction. The built-in camera technology evaluates the images and uses them to calculate the required vector data, which is then processed in the vehicle. To interpret voice instructions and gestures quickly and reliably, the multimodal information the driver sends to the vehicle is combined and evaluated with the help of artificial intelligence. The algorithm responsible for interpreting the data in-car is continuously optimized and refined using machine learning and evaluation of different operating scenarios.
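The combination step can be pictured with a small rule-based sketch. BMW describes a learning algorithm; the code below is deliberately simpler and only illustrates the principle of pairing a spoken command with whatever the driver pointed or looked at around the same moment. The `Event` type, the `fuse` function, the 1.5-second window and the rule that a gesture outranks gaze are all assumptions made for this example.

```python
from dataclasses import dataclass

@dataclass
class Event:
    modality: str     # "voice", "gesture" or "gaze"
    value: str        # spoken command, or name of the object pointed/looked at
    timestamp: float  # seconds

def fuse(events, window_s=1.5):
    """Pair the latest voice command with a nearby gesture or gaze target.

    Returns (command, target), or None if no pairing is possible.
    """
    voice = [e for e in events if e.modality == "voice"]
    if not voice:
        return None
    command = max(voice, key=lambda e: e.timestamp)

    # Keep only gesture/gaze events close enough in time to the command.
    candidates = [
        e for e in events
        if e.modality in ("gesture", "gaze")
        and abs(e.timestamp - command.timestamp) <= window_s
    ]
    if not candidates:
        return None

    # Assumed rule: an explicit gesture outranks gaze; ties go to the closest in time.
    target = min(
        candidates,
        key=lambda e: (e.modality != "gesture", abs(e.timestamp - command.timestamp)),
    )
    return (command.value, target.value)
```

For example, a gaze at the sunroof at 10.0 s, a pointing gesture at the left window at 10.2 s and the spoken word "open" at 10.5 s yield `("open", "left_window")`. If only the gaze event were present, the result would be `("open", "sunroof")`.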

Thanks to intelligent networking, the area of BMW Natural Interaction extends beyond the vehicle interior. For example, the driver can point a finger at objects in their field of vision and give related voice commands, such as asking for information about opening hours or customer ratings, or reserving a table at a restaurant. Thanks to the vehicle’s depth of connectivity, extensive environmental data and artificial intelligence enable BMW Natural Interaction to transform the vehicle into a well-informed, helpful passenger.

By connecting digital services, it will be possible to expand the scope of interaction in the future. For example, when the driver spots a parking space, they will easily be able to find out whether they are allowed to park there and what it costs, and then reserve and pay for it directly without ever pushing a button.

As part of a sophisticated mixed-reality installation, BMW will immerse visitors to Mobile World Congress 2019 in application scenarios where they can experience the customer benefits of BMW Natural Interaction hands-on. A specially designed spatial concept and virtual-reality goggles are used to create a thoroughly realistic experience that showcases the new possibilities during a virtual ride in the BMW Vision iNEXT.

Visitors discover the previously unknown freedom of gesture control throughout the area detected by the gesture camera, which extends across the entire width of the front vehicle interior. Initially, in training mode, directional detection of the pointing gesture is visualised by a dynamic light pulse that follows the direction. Objects the driver can interact with via pointing are then highlighted. Just how natural this interaction is becomes apparent in the simple combination of gesture and language. For example, if the driver points to a side window, this is visually highlighted with a frame and the voice command “Open” will then open the chosen window. These totally new possibilities for interaction with the immediate environment are revealed during an automated journey through a futuristic city the driver is unfamiliar with. The vehicle takes over driving and the visitor embarks on a sightseeing tour of a very different kind – simply pointing at buildings to obtain all the information they need about events and exhibitions. Towards the end of the ride, the user reserves tickets for a cinema they drive by along their route and streams the trailer for the film directly into the vehicle.

The first BMW Natural Interaction functions will be available in the BMW iNEXT as early as 2021. Development of driver-vehicle interaction will make further advances in parallel. In the future, the system will continue to learn with the help of artificial intelligence, and enhanced sensor technology will be able to take occupants’ emotions into account and integrate them into the interaction in a meaningful fashion. In this way, interaction between driver and vehicle will become even more personalised and tailored to the overall situation. Based on experience and depending on the situation and mood, the intelligent assistant will then be able to decide whether to wait for instructions or proactively make suggestions for interaction.