French AI start-up Mistral has revealed its first multimodal AI model, Pixtral 12B, which can process both images and text, as OpenAI’s ChatGPT models can.
Built on one of Mistral’s existing models, Nemo 12B, Pixtral can caption images, identify and count objects within them, and answer image-related queries, in addition to generating text-based responses to prompts.
It is free to download under an Apache 2.0 licence and, like several of Mistral’s other AI models, is open source, meaning anyone can download it from GitHub or Hugging Face, then fine-tune and train it for their own needs, subject only to the licence’s terms.
Although there is no functional demo yet, users will be able to access Pixtral via Mistral’s chatbot, Le Chat, or its API platform, La Plateforme, in the next few days.
The launch of Pixtral comes after Mistral raised $645M in June in a round led by General Catalyst, pushing its valuation to $6B in just a year. That pits it against OpenAI, which has had a similar trajectory, and some people are calling Pixtral Europe’s version of ChatGPT.