My Eye AI: A Hybrid Cloud-Mobile Object Detection System for the Visually Impaired Using YOLOv11, OWL-ViT, and BLIP

Authors

DOI:

https://doi.org/10.37965/jait.2025.0908

Keywords:

assistive technology, BLIP, object detection, OWL-ViT, scene description, YOLO

Abstract

My Eye AI is a hybrid cloud-mobile assistive system that delivers real-time object detection and scene description for visually impaired users. The system integrates three AI components: YOLOv11 for object detection, OWL-ViT for zero-shot open-vocabulary recognition, and Bootstrapping Language-Image Pretraining (BLIP) for natural-language scene captioning. Two YOLOv11 variants were trained on custom-curated datasets: the Medium model achieved mAP@0.5 = 0.443 and recall = 0.457, while the X-Large model improved to mAP@0.5 = 0.578 and recall = 0.603, reducing the false-negative rate by 14.6 percentage points. OWL-ViT extended detection to unseen objects with 71.4% zero-shot accuracy. The cloud-based architecture offloads computation from the smartphone, maintaining low latency while supporting Android and iOS without special hardware. My Eye AI demonstrates measurable improvements in detection accuracy, adaptability, and real-time usability, directly benefiting visually impaired individuals through affordable, accessible mobile deployment.
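
The false-negative figure follows from the two recall values reported above. The sketch below is a minimal worked check, assuming the false-negative rate is taken as 1 - recall on the same evaluation set; the function names are hypothetical and are not taken from the paper.

```python
def miss_rate(recall: float) -> float:
    """False-negative rate, assuming FNR = 1 - recall (an assumption, not the paper's stated definition)."""
    return 1.0 - recall


def miss_rate_reduction(recall_before: float, recall_after: float) -> tuple[float, float]:
    """Return (absolute, relative) reduction in miss rate between two models."""
    before = miss_rate(recall_before)
    after = miss_rate(recall_after)
    absolute = before - after      # difference in rate; multiply by 100 for percentage points
    relative = absolute / before   # as a fraction of the original miss rate
    return absolute, relative


# Recall values reported in the abstract: Medium = 0.457, X-Large = 0.603
absolute, relative = miss_rate_reduction(0.457, 0.603)
print(f"absolute reduction: {absolute * 100:.1f} percentage points")
print(f"relative reduction: {relative * 100:.1f} % of the original miss rate")
```

Under this assumption the absolute change matches the 14.6 figure in the abstract, while the same change expressed relative to the Medium model's miss rate is larger, so it matters which of the two a reader has in mind.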

Author Biography

Yinong Chen, School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA

Yinong Chen is a teaching professor in the School of Computing and Augmented Intelligence in the Ira A. Fulton Schools of Engineering at Arizona State University. He received his doctorate from the University of Karlsruhe / Karlsruhe Institute of Technology (KIT), Germany, in 1993. He did postdoctoral research at Karlsruhe and at LAAS-CNRS in France in 1994 and 1995. From 1994 to 2000, he was a lecturer and then senior lecturer in the School of Computer Science at the University of the Witwatersrand, Johannesburg, South Africa. Chen joined Arizona State University in 2001. He has authored or co-authored more than 10 textbooks and over 500 research papers. He is on the editorial boards of several journals, including Journal of Artificial Intelligence and Technology, Journal of Systems and Software, Simulation Modeling Practice and Theory, and International Journal of Simulation and Process Modelling.

Chen's areas of expertise include Software Engineering, Service-Oriented Computing, Visual Programming, Big Data Processing and Machine Learning, Robotics and AI, and Computer Science Education.

Education
  • Ph.D. Computer Science, University of Karlsruhe / Karlsruhe Institute of Technology (KIT), Germany, 1993
  • M.S. Computer Science, Chongqing University, China, 1984
  • B.S. Software Engineering, Chongqing University, China, 1982

Published

2025-11-28

How to Cite

Wahwah, S., & Chen, Y. (2025). My Eye AI: A Hybrid Cloud-Mobile Object Detection System for the Visually Impaired Using YOLOv11, OWL-ViT, and BLIP. Journal of Artificial Intelligence and Technology. https://doi.org/10.37965/jait.2025.0908

Issue

Section

SI:ISADS