This project implemented a real-time vision system for autonomous drones using YOLOv9 on NVIDIA Jetson hardware, reaching 30 FPS and 95% mAP on custom aerial datasets.
Autonomous drones require robust vision systems that can identify and track objects in real time under varying environmental conditions. This project developed an optimized computer vision system for unmanned aerial vehicles (UAVs) that detects and classifies multiple object categories from aerial perspectives while running on limited onboard computational resources.
The vision system employs several cutting-edge techniques to achieve real-time performance on edge hardware:
One of the major challenges was creating a suitable dataset for training the model. The dataset was compiled from multiple sources:
The vision system was integrated with the drone's flight controller using ROS 2 (Robot Operating System 2), enabling:
The development process encountered several challenges unique to deploying computer vision on aerial platforms:
These challenges were addressed through a combination of hardware and software solutions. On the hardware side, the NVIDIA Jetson Xavier NX was selected for its balance of computational power and energy efficiency. Custom cooling was added to keep performance stable during extended flight operations.
For software optimization, the YOLOv9 model underwent extensive pruning and quantization to reduce its computational footprint while preserving accuracy. A multi-stage detection pipeline was also implemented: an initial lightweight model proposes candidate regions, and a more accurate classifier then runs only on those regions of interest. Because the expensive model never processes the full frame, this approach significantly reduced the computational load while maintaining high detection accuracy.
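The two-stage idea described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the `propose` and `classify` callables, the `Detection` type, and the `proposal_threshold` and `max_regions` parameters are all hypothetical names introduced here, and frames are represented as plain nested lists to keep the example self-contained.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Frame = List[List[float]]        # grayscale frame as rows of pixel values (stand-in type)
Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), end-exclusive


@dataclass
class Detection:
    box: Box
    label: str
    confidence: float


def crop(frame: Frame, box: Box) -> Frame:
    """Return the sub-image covered by box."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in frame[y0:y1]]


def two_stage_detect(
    frame: Frame,
    propose: Callable[[Frame], List[Tuple[Box, float]]],  # hypothetical lightweight model
    classify: Callable[[Frame], Tuple[str, float]],       # hypothetical accurate classifier
    proposal_threshold: float = 0.3,                      # assumed value, for illustration
    max_regions: int = 8,                                 # assumed cap on stage-2 work
) -> List[Detection]:
    # Stage 1: the cheap model scores candidate regions over the whole frame.
    candidates = [
        (box, score) for box, score in propose(frame) if score >= proposal_threshold
    ]
    # Keep only the highest-scoring candidates so stage-2 cost stays bounded.
    candidates.sort(key=lambda item: item[1], reverse=True)

    # Stage 2: the expensive classifier sees only the cropped regions.
    detections = []
    for box, _score in candidates[:max_regions]:
        label, confidence = classify(crop(frame, box))
        detections.append(Detection(box, label, confidence))
    return detections
```

In this sketch the cap on `max_regions` is what bounds per-frame latency: the classifier cost scales with the number of surviving proposals rather than with frame resolution.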
To address variable lighting and motion blur, the training dataset was expanded with extensive data augmentation, including synthetic lighting changes, motion blur simulation, and domain randomization. This improved the model's robustness in real-world conditions.
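As a rough illustration of two of the augmentations named above, the sketch below applies a random lighting change and a simulated horizontal motion blur to a grayscale image. It assumes pixel values in [0, 1] stored as nested lists; the function names, the parameter ranges, and the choice of a simple box kernel are assumptions made for this example, not details from the project. Domain randomization is not covered here.

```python
import random
from typing import List

Image = List[List[float]]  # grayscale image, pixel values assumed to lie in [0, 1]


def adjust_lighting(img: Image, gain: float, bias: float) -> Image:
    """Scale and shift brightness, clamping the result to [0, 1]."""
    return [[min(1.0, max(0.0, gain * p + bias)) for p in row] for row in img]


def horizontal_motion_blur(img: Image, length: int) -> Image:
    """Average each pixel with its row neighbours using a box kernel.

    Pixels beyond the image border are replaced by the nearest edge pixel.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    half = length // 2
    out = []
    for row in img:
        width = len(row)
        blurred = []
        for x in range(width):
            window = [
                row[min(width - 1, max(0, x + dx))]
                for dx in range(-half, length - half)
            ]
            blurred.append(sum(window) / length)
        out.append(blurred)
    return out


def augment(img: Image, rng: random.Random) -> Image:
    """Apply a random lighting change followed by a random blur length."""
    gain = rng.uniform(0.6, 1.4)      # assumed range, for illustration
    bias = rng.uniform(-0.1, 0.1)     # assumed range, for illustration
    length = rng.choice([1, 3, 5, 7])  # a length of 1 leaves the image unblurred
    return horizontal_motion_blur(adjust_lighting(img, gain, bias), length)
```

Passing a seeded `random.Random` instance keeps the augmentation reproducible, which helps when comparing training runs.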