Automotive AI Agent Product Development: Enhancing the “Cockpit Endorser” with Foundation Models
When it comes to AI in the automotive industry, AI Agents sit at Level 3 of OpenAI's taxonomy of AI development, while today's popular foundation models reach Level 2 (Reasoners) at best, constrained by their interaction modes and tool-usage capabilities. To raise the intelligence level of the automotive cockpit, a more practical approach is to build automotive Agents that offer proactive intelligent features and can call on multiple tools and foundation models.
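As a rough illustration of what such a tool-calling Agent could look like, the Python sketch below routes a recognized intent to one of several registered tools and falls back to a dialogue model for everything else. The tool names (`set_cabin_temperature`, `find_charging_station`) and intent labels are invented for the example and do not come from any specific OEM stack.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical tool library: each entry maps an intent to a vehicle or service call.
def set_cabin_temperature(slots: dict) -> str:
    return f"Cabin temperature set to {slots.get('temperature', 22)} deg C"

def find_charging_station(slots: dict) -> str:
    return f"Route planned to a charger near {slots.get('location', 'current position')}"

def small_talk(slots: dict) -> str:
    # In a real system this would call a conversational foundation model.
    return "Sure, happy to chat while you drive."

@dataclass
class CockpitAgent:
    tools: Dict[str, Callable[[dict], str]]

    def handle(self, intent: str, slots: dict) -> str:
        # Route to a registered tool; fall back to the dialogue model otherwise.
        tool = self.tools.get(intent, small_talk)
        return tool(slots)

agent = CockpitAgent(tools={
    "climate.set_temperature": set_cabin_temperature,
    "navigation.find_charger": find_charging_station,
})

print(agent.handle("climate.set_temperature", {"temperature": 24}))
print(agent.handle("chit_chat", {}))
```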
The concept of an “emotional cockpit” has been around for some time, but truly bringing it to life starts with integrating foundation models into vehicles. By incorporating Agents built on multiple foundation models, the in-vehicle system can accurately recognize the cabin environment and call on a variety of tool-library interfaces, interacting with users in a warmer, more personalized way and ultimately becoming the “cockpit endorser.”
Several OEMs and Tier 1s have introduced technologies and products aimed at enhancing the emotional value of Agents. For instance, Xiaoai Tongxue’s “emotional dialogue system” uses a mixed-strategy emotional guidance model consisting of a mental state-enhanced encoder, a mixed strategy learning module, and a multi-factor-aware decoder to improve emotional interactions. Similarly, the Affectively Framework proposed by the Institute of Digital Games at the University of Malta focuses on building emotional models and incorporating behavior and affective rewards into training so that Agents better understand human emotions.
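As a rough sketch of how such a three-stage pipeline fits together (this is not the actual Xiaoai Tongxue implementation; the heuristics, strategy names, and reply templates below are placeholders), the encoder estimates the user's mental state, the strategy module picks a support strategy, and the decoder conditions the reply on it:

```python
from dataclasses import dataclass

STRATEGIES = ("comfort", "encourage", "suggest")  # illustrative strategy set

@dataclass
class MentalState:
    valence: float   # negative..positive emotion estimate
    arousal: float   # calm..excited estimate

def encode(utterance: str) -> MentalState:
    # Mental state-enhanced encoder, reduced here to a keyword heuristic.
    negative = any(w in utterance.lower() for w in ("tired", "stressed", "stuck"))
    return MentalState(valence=-0.6 if negative else 0.4, arousal=0.5)

def select_strategy(state: MentalState) -> str:
    # Mixed strategy learning module, reduced to a threshold rule:
    # soothe strongly negative users, otherwise nudge or suggest.
    if state.valence < -0.3:
        return "comfort"
    return "suggest" if state.arousal > 0.4 else "encourage"

def decode(strategy: str, utterance: str) -> str:
    # Multi-factor-aware decoder: conditions the reply on the chosen strategy.
    templates = {
        "comfort": "That sounds rough. Let's put on something relaxing.",
        "encourage": "You're doing fine. Want a quick break at the next stop?",
        "suggest": "How about I queue up your favourite playlist?",
    }
    return templates[strategy]

user = "I'm stuck in traffic and really stressed."
print(decode(select_strategy(encode(user)), user))
```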
Despite these advances in automotive AI, challenges remain before the user experience genuinely improves. One key requirement is that Agents accurately perceive and respond to user needs in real-time scenarios. Current Agents often struggle with active intention recognition, leading to wake-up failures, recognition errors, and false wake-ups. These shortcomings significantly degrade the user experience and keep the Agent from acting as a reliable “cockpit endorser.”
To overcome these challenges, a quick-response multi-agent framework can be used to design the Agent service framework across diverse scenarios. Such a framework lets Agents with different functions work together to quickly understand user needs, make decisions, execute tasks, and reflect and iterate. Multi-agent systems, like the one behind NIO’s NOMI, allow Agents to collaborate on complex instructions more effectively, making the automotive AI system more flexible and efficient and improving the overall user experience.
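A minimal sketch of such an orchestration loop is shown below, assuming hypothetical `NavigationAgent` and `CabinAgent` sub-agents; it is not NIO's actual NOMI implementation, only an illustration of the understand–decide–execute–reflect cycle:

```python
from typing import Dict, List

class NavigationAgent:
    """Plans routes; stands in for a navigation-focused sub-agent."""
    def execute(self, task: str) -> str:
        return f"route planned for '{task}'"

class CabinAgent:
    """Controls cabin functions; stands in for a comfort-focused sub-agent."""
    def execute(self, task: str) -> str:
        return f"cabin adjusted for '{task}'"

class Orchestrator:
    """Quick-response loop: understand -> decide -> execute -> reflect."""

    def __init__(self, agents: Dict[str, object]):
        self.agents = agents
        self.history: List[str] = []  # reflection log reused on later turns

    def understand(self, request: str) -> List[str]:
        # Decompose a compound instruction into sub-tasks (a naive split here;
        # a production system would use a foundation model for this step).
        return [part.strip() for part in request.split(" and ")]

    def decide(self, task: str) -> str:
        # Pick which sub-agent should handle the task.
        return "navigation" if task.startswith(("go", "drive", "navigate")) else "cabin"

    def run(self, request: str) -> List[str]:
        results = []
        for task in self.understand(request):
            outcome = self.agents[self.decide(task)].execute(task)
            self.history.append(outcome)  # reflect: record what was done
            results.append(outcome)
        return results

orchestrator = Orchestrator({"navigation": NavigationAgent(), "cabin": CabinAgent()})
for line in orchestrator.run("go to the nearest charger and warm up the seats"):
    print(line)
```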
In conclusion, by combining multi-agent frameworks with integrated foundation models, automotive AI Agents can evolve into truly intelligent “cockpit endorsers” that anticipate user needs, provide personalized interactions, and enhance the driving experience. Through continuous innovation and refinement, automotive AI has the potential to reshape how we interact with vehicles, making that experience more seamless and intuitive.