Illustration by Alex Castro / The Verge
Just two months after releasing its last big AI model, Meta is back with a major update: its first open-source model capable of processing both images and text.
The new model, Llama 3.2, could allow developers to create more advanced AI applications, like augmented reality apps that provide real-time understanding of video, visual search engines that sort images based on content, or document analysis tools that summarize long chunks of text for you.
Meta says it’s going to be easy for developers to get the new model up and running. Developers will have to do little except add this “new multimodality and be able to show Llama images and have it communicate,” Ahmad Al-Dahle, vice president of generative AI at Meta, told The Verge.
Other AI…