IPFS News Link • Robots and Artificial Intelligence

Future Tesla FSD Will Merge and Expand with XAI Grok

by Brian Wang

This can be structured so that XAI retains more of the higher-level AI functions.

Large multimodal models and virtual agents are the future of large language models, and Tesla FSD and XAI Grok will have to compete with and try to dominate the other AIs in those areas.

FSD v13 will be a Large Multimodal Model (LMM), or will have one deeply integrated into it. John Gibb and Jim Fan describe how Grok and FSD will merge.

Grok 1.5+ has clearly been trained on FSD data. This won't be a bolt-on arrangement where an LLM simply talks to the passenger; the model will be deeply integrated into the processing pipeline itself. There will be chain-of-thought reasoning, and LMMs will be able to retain conversations in the form of long context windows.

John thinks Grok will be able to form memories of a sort about local driving conditions and driver profiles, allowing the car to adapt well to every driver. FSD should personalize for every driver via a straightforward inference-time conversation with each locality and each driver.
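A minimal sketch of what per-driver memory injected into a long context window might look like. Everything here is hypothetical: the `DriverProfile` structure, field names, and prompt format are illustrative stand-ins, not anything Tesla or xAI has described.

```python
from dataclasses import dataclass, field

@dataclass
class DriverProfile:
    """Hypothetical per-driver memory persisted across trips."""
    name: str
    preferences: dict = field(default_factory=dict)   # e.g. {"following_distance": "relaxed"}
    locality_notes: list = field(default_factory=list)  # e.g. ["avoid Main St at rush hour"]

def build_context(profile: DriverProfile, recent_conversation: list) -> str:
    """Assemble a prompt fragment the driving LMM could condition on.

    The retained conversation turns are what would occupy the model's
    long context window between trips.
    """
    lines = [f"Driver: {profile.name}"]
    lines += [f"Preference: {k} = {v}" for k, v in profile.preferences.items()]
    lines += [f"Local note: {n}" for n in profile.locality_notes]
    lines += recent_conversation
    return "\n".join(lines)

profile = DriverProfile(
    "Alex",
    {"following_distance": "relaxed"},
    ["avoid Main St at rush hour"],
)
ctx = build_context(profile, ["Rider: take the scenic route when time allows"])
```

The point of the sketch is only that personalization can be plain inference-time conditioning on retained text, rather than per-driver fine-tuning.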

Jim Fan says the following:

Tesla FSD v13 will likely be grokking language tokens. What excites me the most about Grok-1.5V is the potential to solve edge cases in self-driving. Using language for "chain of thought" will help the car break down a complex scenario, reason with rules and counterfactuals, and explain its decisions. What Grok-1.5V can help with is lifting the pixel->action mapping to pixel->language->action instead.

With @Tesla_AI's highly mature data pipeline, it is not hard to label tons of edge cases with high-quality human explanation traces, and finetune Grok to be far better than GPT-4V and Gemini for multimodal FSD reasoning.
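The pixel->language->action lifting Jim Fan describes can be sketched as two stages: a multimodal model first produces a language description of the scene, and a reasoning step then maps that description to an action with an attached rationale. All function names and the toy brightness heuristic below are hypothetical stand-ins, not Tesla or xAI APIs.

```python
# Hypothetical two-stage sketch: pixel -> language -> action.
# A direct pixel -> action policy would skip the language step entirely.

def describe_scene(pixels):
    """Stage 1 (stand-in for a multimodal model): camera pixels -> language.

    A real system would run a vision-language model; here a trivial
    brightness heuristic stands in for perception.
    """
    brightness = sum(sum(row) for row in pixels) / (len(pixels) * len(pixels[0]))
    hazard = "pedestrian ahead" if brightness > 128 else "road clear"
    return f"Scene: {hazard}."

def reason_and_act(description):
    """Stage 2 (stand-in for chain-of-thought): language -> action + rationale.

    Because the intermediate representation is text, the decision comes
    with an explanation the car could surface to the rider.
    """
    if "pedestrian" in description:
        return ("slow_down", "yielding to pedestrian per right-of-way rules")
    return ("maintain_speed", "no hazards described")

def pixel_to_action(pixels):
    """The lifted mapping: pixel -> language -> action."""
    return reason_and_act(describe_scene(pixels))

bright_frame = [[200] * 4 for _ in range(4)]  # stand-in for a frame with a hazard
dark_frame = [[10] * 4 for _ in range(4)]     # stand-in for an empty road
```

The design point is that the intermediate language step gives you both a hook for rule-based reasoning and a human-readable explanation, at the cost of an extra inference stage.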