Powering open-source computer vision with Hugging Face's transformers

Dive into open-source computer vision with Hugging Face! Learn about transfer learning, transformers, and explore over 8,000 models. Join Merve Noyan for insights and practical demos, empowering developers to innovate in AI exploration.

Nuvola Ladi
5 min read

As we keep exploring highlights from the YOLO VISION 2023 (YV23) event, let’s meet Merve Noyan, Developer Advocacy Engineer at Hugging Face, the leading NLP platform with pre-trained models for efficient development of language applications. In her talk, Merve shared some incredible insights into the world of open-source computer vision.

Join us as we take you on a journey through the fascinating universe of transfer learning, transformers, and the open-source computer vision ecosystem.

Transfer learning unveiled: A quick recap

Merve kicked things off with a quick primer on transfer learning, the magic wand that lets us carry knowledge over from one task to another. Imagine a model whose early layers have already learned universal features, like edges and corners, and then fine-tuning it for a specific task. This is the essence of transfer learning: it reduces the amount of labeled data you need and often boosts accuracy.

Merve highlighted classical convolutional backbones like ResNet and Inception, setting the stage for the transformational journey ahead.

Enter the transformers: A riddle unveiled

What makes transformers special? Merve framed the question as a riddle, showing how they differ from traditional convolution-based models. Part of the secret sauce, she explained, is how well they lend themselves to self-supervised learning, capturing features without the need for labeled data. Vision Transformer (ViT), Data-efficient Image Transformer (DeiT), CLIP, and Swin Transformer were among the star-studded cast of transformer-based models she introduced.

There is common ground here with Ultralytics, which supports RT-DETR, a transformer-based model designed for object detection. This model features an efficient hybrid encoder, IoU-aware query selection, and adjustable inference speed. Notably, it follows the familiar pattern of the Ultralytics YOLOv8 models, offering modes for prediction, training, validation, and export.
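
As a rough sketch of that familiar pattern, the snippet below uses the `RTDETR` class from the `ultralytics` package, the transformer-based detector matching the description above. The weights file `rtdetr-l.pt`, the `coco8.yaml` dataset config, and the helper names `count_by_class` and `detect` are assumptions chosen for illustration. The import is deferred so nothing is downloaded until `detect` is actually called.

```python
from collections import Counter


def count_by_class(class_ids, names):
    """Count detections per class name, given class ids and an id-to-name map."""
    return Counter(names[int(i)] for i in class_ids)


def detect(source: str, weights: str = "rtdetr-l.pt"):
    """Run RT-DETR on one image and return detection counts per class.

    Hypothetical helper: `source` can be a local path or URL, and the weights
    are downloaded on first use.
    """
    from ultralytics import RTDETR  # deferred import, needs `pip install ultralytics`

    model = RTDETR(weights)
    result = model.predict(source)[0]  # one Results object per input image
    return count_by_class(result.boxes.cls.tolist(), result.names)


# The other modes follow the same pattern on the model object, for example:
#   model.train(data="coco8.yaml", epochs=10, imgsz=640)
#   model.val()
#   model.export(format="onnx")
```

Calling `detect("path/to/image.jpg")` would return something like a mapping from class names to counts. The exact classes depend on the weights and the image.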

Your one-stop-shop

Merve then delved into the treasure trove of Hugging Face's offerings, with over 8,000 models for classical computer vision tasks and 10,000 models for multimodal applications. The Hugging Face Hub boasts a whopping 3,000+ datasets, making it a playground for developers and enthusiasts alike. Merve emphasized the seamless experience, thanks to Hugging Face's consistent API, offering ready-to-use models for various use cases.

Hands-on magic with Hugging Face

The talk transitioned into practical demonstrations, showcasing how effortlessly one can work with models. From instantiating models and processors to fine-tuning with the Trainer API, Merve made it clear that the Hugging Face Transformers library is a developer's best friend. She even introduced the Pipeline API, a personal favorite, simplifying the workflow for users.
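
The two routes mentioned here can be sketched as follows. This is not the code from the talk: the checkpoint `google/vit-base-patch16-224` and the helper names are assumptions chosen for illustration. The Transformers imports are deferred so the model is only downloaded when one of the functions is called.

```python
def top_prediction(predictions):
    """Return (label, score) for the highest-scoring entry in pipeline output."""
    best = max(predictions, key=lambda p: p["score"])
    return best["label"], best["score"]


def classify_with_pipeline(image, model_id="google/vit-base-patch16-224"):
    """Pipeline route: preprocessing, inference, and decoding in one call.

    `image` may be a PIL image, a local path, or a URL.
    """
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return top_prediction(classifier(image))


def classify_manually(image, model_id="google/vit-base-patch16-224"):
    """Processor-plus-model route: the same steps written out by hand.

    `image` is expected to be a PIL image here.
    """
    import torch
    from transformers import AutoImageProcessor, AutoModelForImageClassification

    processor = AutoImageProcessor.from_pretrained(model_id)
    model = AutoModelForImageClassification.from_pretrained(model_id)

    # The processor resizes and normalizes the image into model-ready tensors.
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    probs = logits.softmax(dim=-1)[0]
    idx = int(probs.argmax())
    return model.config.id2label[idx], float(probs[idx])
```

The manual route is the natural starting point for fine-tuning, since the same processor and model objects can be passed on to the Trainer API. The pipeline route is the shorter path when a ready-made model is all you need.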

Fig 1. Merve Noyan presenting at YV23 at the Google for Startups Campus in Madrid.

A glimpse into applications

Merve wrapped up the talk with a glimpse into some fantastic applications, including the Plot model for visual question answering, BLIP for image captioning, and the powerful Segment Anything Model (SAM) for image segmentation. The Pipeline API in the Hugging Face ecosystem took the spotlight again, making it a breeze to use these models without diving deep into the technicalities.

The cherry on top was Merve's showcase of creating optical illusions with Illusion Diffusion, a captivating demo that adds a fun twist to the world of AI.

In a nutshell!

In conclusion, Merve's talk left us inspired and itching to explore the endless possibilities of open-source computer vision. Hugging Face has truly made AI accessible, fun, and exciting, empowering developers to unleash their creativity. Here's to the future of the open-source community and the incredible innovations it holds!

Watch the full Hugging Face computer vision talk!

