These days, artificial intelligence can generate photorealistic images, write novels, do your homework, and even predict protein structures. New research, however, reveals that it often fails at a very basic task: telling time.
Researchers at Edinburgh University have tested the ability of seven well-known multimodal large language models, the kind of AI that can interpret and generate various kinds of media, to answer time-related questions based on different images of clocks or calendars. Their study, forthcoming in April and currently hosted on the preprint server arXiv, demonstrates that the LLMs have difficulty with these basic tasks.
“The ability to interpret and reason about time from visual inputs is critical for many real-world applications, ranging from event scheduling to autonomous systems,” the researchers wrote in the study. “Despite advances in multimodal large language models (MLLMs), most work has focused on object detection, image captioning, or scene understanding, leaving temporal inference […]”
The team tested OpenAI’s GPT-4o and GPT-o1; Google DeepMind’s Gemini 2.0; Anthropic’s Claude 3.5 Sonnet; Meta’s Llama 3.2-11B-Vision-Instruct; Alibaba’s Qwen2-VL7B-Instruct; and ModelBest’s MiniCPM-V-2.6. They fed the models different images of analog clocks (timekeepers with Roman numerals, different dial colors, and even some missing the seconds hand) as well as 10 years of calendar images.
For the clock images, the researchers asked the LLMs, “What time is shown on the clock in the given image?” For the calendar images, the researchers asked simple questions such as, “What day of the week is New Year’s Day?” and harder queries including, “What is the 153rd day of the year?”
“Analogue clock reading and calendar comprehension involve intricate cognitive steps: they demand fine-grained visual recognition (e.g., clock-hand position, day-cell layout) and non-trivial numerical reasoning (e.g., calculating day offsets),” the researchers explained.
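To make the numerical half of those steps concrete, here is a minimal Python sketch of the arithmetic involved once the visual part is already solved. It is an illustration only, not the researchers’ method: the function names are hypothetical, and the clock function assumes the hand angles have already been measured in degrees clockwise from the 12 o’clock position.

```python
from datetime import date, timedelta

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def time_from_hand_angles(hour_angle: float, minute_angle: float) -> tuple[int, int]:
    """Convert hand angles (degrees clockwise from 12) to (hour, minute).

    Hypothetical helper: assumes the angles were already extracted
    from the image, which is the hard visual step.
    """
    # The minute hand sweeps 360 degrees in 60 minutes: 6 degrees per minute.
    minute = round(minute_angle / 6) % 60
    # The hour hand moves 30 degrees per hour plus 0.5 degrees per minute,
    # so remove the minute contribution before rounding to a whole hour.
    hour = round((hour_angle - minute * 0.5) / 30) % 12
    return (12 if hour == 0 else hour, minute)


def nth_day_of_year(year: int, n: int) -> date:
    """Return the calendar date of the n-th day of the given year (1-based)."""
    return date(year, 1, 1) + timedelta(days=n - 1)


def weekday_of_new_year(year: int) -> str:
    """Return the English weekday name of January 1 in the given year."""
    return DAY_NAMES[date(year, 1, 1).weekday()]


if __name__ == "__main__":
    print(time_from_hand_angles(97.5, 90))   # hands at a quarter past three
    print(nth_day_of_year(2025, 153))        # the 153rd day of 2025
    print(weekday_of_new_year(2025))         # weekday of New Year's Day 2025
```

The day-offset question depends on the year: the 153rd day falls on June 2 in a common year such as 2025 but on June 1 in a leap year such as 2024, which is one reason the calendar must actually be read rather than guessed.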
Overall, the AI systems did not perform well. They read the time on analog clocks correctly less than 25% of the time. They struggled with clocks bearing Roman numerals and stylized hands as much as they did with clocks lacking a seconds hand altogether, indicating that the issue may stem from difficulty interpreting angles on the clock face, according to the researchers.
Google’s Gemini 2.0 scored highest on the team’s clock task, while GPT-o1 was accurate on the calendar task 80% of the time, a far better result than its competitors. But even then, the most successful MLLM on the calendar task still made mistakes about 20% of the time.
“Most people can tell the time and use calendars from an early age. Our findings highlight a significant gap in the ability of AI to carry out what are quite basic skills for people,” Rohit Saxena, a co-author of the study and PhD student at the University of Edinburgh’s School of Informatics, said in a university statement. “These shortfalls must be addressed if AI systems are to be successfully integrated into time-sensitive, real-world applications, such as scheduling, automation and assistive technologies.”
So while AI might be able to complete your homework, don’t count on it sticking to any deadlines.