Upon reading “On the Opportunities and Risks of Foundation Models”, section 2.3 “Robotics”, I predict that manufacturing could, and probably will, be automated without foundation models for robotics. Robotic AI that uses non-foundation models and is supervised by humans will be used in manufacturing until AGI arrives.
Making a foundation model for robotics is already very much like making a general intelligence, and collecting the data to train such a model is harder than for language and vision foundation models because the data has to be multi-modal (perception, actuation, and language).
Tesla has built a generic movement-planning and dynamics-modelling AI for cars and applied it to robots, using a more classical architecture and on-device inference, i.e. without a foundation model.
In autonomous driving, as in manufacturing robotics, it’s easy to specify the robot’s goal. This is not the case for the more general vision of an assistant robot that should be able to complete tasks like “make breakfast” or “clean the house”.
Politics would be an additional obstacle to using foundation models (whose inference happens in the cloud) in manufacturing. Governments likely won’t permit local manufacturing to depend on models owned and run by corporations like Google or OpenAI, which would then have the power to unplug their models and halt a large share of national manufacturing in an instant.
Local inference matters less in manufacturing than in energy infrastructure, as I described here, so mixed local and cloud inference is possible in manufacturing as long as the model is owned by the manufacturing automation service provider. But I don’t see any benefit in such an architecture either: Tesla’s inference fits on a few chips whose cost already is, or soon will be, negligible relative to the cost of typical manufacturing equipment and service.