Tesla’s Director of Artificial Intelligence, Andrej Karpathy, spoke at the 2019 PyTorch Developer Conference and shared some of the details around Tesla’s Autopilot Neural Network.
Check out the short 10-minute video here:
Famously, Tesla relies primarily on cameras to perceive its environment (plus a front-facing radar and ultrasonic sensors). Unlike other self-driving automotive players, it does not use LiDAR (see LiDAR vs Cameras for Self-Driving for more). All new Tesla vehicles with AP2 and above (see AP1 vs AP2) have eight cameras surrounding the vehicle.
They process these images with convolutional neural networks. Since Tesla is a vertically integrated company, they own the entire stack.
One of the biggest components centers on analyzing images in the PyTorch distributed-training portion of the stack. In just a single image frame there are numerous elements the network needs to identify and tag.
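As a rough illustration of what "processing a camera frame with a convolutional network" means in PyTorch, here is a toy detector. This is not Tesla's architecture; the layer sizes, class count, and random input are all invented for the sketch:

```python
import torch
import torch.nn as nn

class TinyDetector(nn.Module):
    """Toy convolutional detector; all layer sizes are illustrative only."""
    def __init__(self, num_classes=4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1),  # RGB frame in
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # collapse spatial dimensions
        )
        self.classifier = nn.Linear(32, num_classes)

    def forward(self, x):
        h = self.features(x).flatten(1)  # (batch, 32)
        return self.classifier(h)        # per-class scores

# One 128x128 "camera frame"; random data stands in for a real image
frame = torch.randn(1, 3, 128, 128)
scores = TinyDetector()(frame)
print(scores.shape)  # torch.Size([1, 4])
```

A production network would of course emit bounding boxes, lane geometry, and more per frame rather than a single score vector, but the data flow (image tensor in, task predictions out) is the same.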
The “HydraNet” ties together all the cameras to help identify key road features:
The Tesla HydraNet is also used, of course, in the case of Smart Summon, as shown here:
The Tesla HydraNet is also used for road layout prediction, stitching together a top-down view of the road as shown here:
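The multi-task idea behind a HydraNet can be sketched as a shared convolutional backbone computed once per frame, feeding several task-specific heads. Everything below (the head names, the three tasks, the layer sizes) is a hypothetical sketch, not Tesla's actual code:

```python
import torch
import torch.nn as nn

class HydraSketch(nn.Module):
    """Illustrative shared-backbone, multi-head ('HydraNet'-style) network."""
    def __init__(self):
        super().__init__()
        # Shared trunk: expensive convolutional features, computed once
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),  # (batch, 32)
        )
        # Cheap per-task heads reuse the shared features
        # (task names and output sizes are hypothetical)
        self.heads = nn.ModuleDict({
            "lane_lines": nn.Linear(32, 2),
            "objects": nn.Linear(32, 10),
            "traffic_lights": nn.Linear(32, 3),
        })

    def forward(self, x):
        features = self.backbone(x)
        return {name: head(features) for name, head in self.heads.items()}

frame = torch.randn(1, 3, 128, 128)
outputs = HydraSketch()(frame)
print({name: tuple(t.shape) for name, t in outputs.items()})
```

The design payoff is that the heavy backbone runs once per camera frame while many tasks share its output, which is the efficiency argument for this style of architecture.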
Quite a few of the newer features being rolled out via these neural networks target the new Full Self-Driving Computer (FSD Computer, aka Hardware 3), which is an order of magnitude faster than the original NVIDIA hardware used in Hardware 2 (AP2) Tesla vehicles (see AP1 vs AP2 vs AP3).
Elon Musk has said that he expects Full Self Driving to become available in 2020, based on the rapid progress the machine-learning and training systems are making from the almost quarter-million vehicles on the road helping to train the network.