Automatic Detection and Description Generation of Drones and Their Actions via YOLOv7 and LLMs

Gustavo Garcia-Vargas

Co-Presenters: Phil Ho Combatir

College: The Dorothy and George Hennings College of Science, Mathematics and Technology

Major: Computer Science

Faculty Research Mentor: Yulia Kumar

Abstract:

The proliferation of drones and the diversity of their applications demand robust real-time detection and description systems to enhance safety, prevent collisions, and optimize autonomy in multi-drone environments. This study introduces an autonomous drone detection and description framework that integrates the YOLOv7 object detection model with advanced Large Language Models (LLMs). The system processes a curated dataset of 1,359 drone images from Kaggle, labeled programmatically using OpenAI's ChatGPT-4o via API. The labeling process employed a novel classification scheme proposed by leading LLMs, including Google's Gemini-1.5-Pro-002. The annotated dataset was divided into training, validation, and testing subsets, which were used to fine-tune the YOLOv7 model. Initial results showed suboptimal performance with the pre-trained weights; fine-tuning significantly improved generalization and accuracy across diverse scenarios. Drone videos were analyzed by extracting frames at 30 frames per second, and the drones detected in those frames were described by the LLM pipeline. These descriptions were overlaid onto the videos, providing an autonomous, real-time analysis of drone activity. Preliminary trials demonstrate high detection accuracy and computational efficiency, and ongoing experiments are exploring alternative YOLO models to improve results further. By combining state-of-the-art object detection with LLM-driven descriptions, this research advances autonomous unmanned aerial vehicle (UAV) detection and situational awareness, and establishes a foundation for AI-driven drone monitoring systems with applications in airspace management, regulatory compliance, and security.
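
As an illustration of the dataset split mentioned in the abstract, the following is a minimal sketch of how 1,359 labeled images could be divided into training, validation, and testing subsets. The abstract does not state the split ratios, the random seed, or the file naming, so the 70/20/10 proportions, the seed, the `split_dataset` helper, and the `drone_0000.jpg` style names are all assumptions made for this example, not the authors' procedure.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.2, seed=0):
    """Shuffle items reproducibly and split them into train/val/test lists.

    The fractions are hypothetical; the abstract does not report them.
    Whatever remains after the train and validation slices becomes the test set.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty test share")

    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable

    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)

    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Placeholder file names standing in for the 1,359 Kaggle images.
image_names = [f"drone_{i:04d}.jpg" for i in range(1359)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))
```

Shuffling with a fixed seed before slicing keeps the three subsets disjoint and reproducible, which matters when comparing pre-trained and fine-tuned weights on the same held-out images.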
