OpenAI has finally released the real-time vision capabilities for ChatGPT that it first announced in May, marking the 6th day of the ‘12 Days of OpenAI.’
Users can now point their phone camera at any object, and ChatGPT will ‘see’ what it is, understand it, and answer questions about it in real time. For example, if someone is drawing an anatomical representation of the human body, ChatGPT can offer feedback like “the location of the brain is spot on, but the shape is more oval.”
It can also ‘see’ what’s on a device screen and offer advice, such as explaining what a menu setting is or providing the answer to a math problem.
During demos of the feature, however, ChatGPT got the answer to a geometry problem wrong, highlighting that, like many other AI models, no matter how advanced, it is still prone to error and hallucination.