Repository Overview



The Agricultural Image Repository (AgIR) serves as a resource for advancing computer vision, artificial intelligence research, and image-based phenotyping within precision agriculture. The need for effective weed management is growing as the demand for food and fiber increases, as the incidence of herbicide resistance rises, and as the call for more sustainable methods grows louder. In response, researchers and developers are turning to computer vision algorithms to help automate the detection and identification of weeds in agricultural fields. AgIR aims to support the development of automated tools for weed management, crop monitoring, and plant trait analysis. By providing a rich and diverse collection of labeled images representing various weeds, cover crops, and cash crops grown in different settings and under diverse conditions, AgIR empowers researchers and developers to create innovative solutions for sustainable and efficient agricultural practices.

A semi-automatic approach that uses photogrammetry, classical digital image processing, and deep learning has been used to generate multiple types of labels, capturing information about the plants present in each image and making the images suitable for a range of machine learning tasks. Here, you will find a comprehensive overview of the dataset, including its contents, annotation process, metadata, characteristics, and availability.


AgIR is made up of three broad plant datasets: Weeds (WIR), Cover Crops (CCIR), and Cash Crops (CIR). Each can be further divided into two main scene types, Semi-Field and real-world Field, each providing a unique set of advantages and use cases.

The datasets also include metadata captured during collection and throughout the processing pipeline. This information can be used to better understand the datasets' diversity, deficiencies, and general makeup, as well as the types of conditions that algorithms trained on them are exposed to.

In summary, the weed image dataset is a comprehensive collection of images and metadata designed to support research in weed management. Its size and diversity give computer vision algorithms a wealth of data to learn from, helping to improve the accuracy and efficiency of automatic weed detection in agricultural fields.

Plant Types

AgIR encompasses a broad range of agricultural imagery, specifically focusing on:

  • Weeds: A diverse collection of over 40 agronomically relevant weed species from different regions of the United States.

  • Cover Crops: Common cover crop species.

  • Cash Crops: Key cash crop species grown in the US.

  • Scene Types: Images capture both semi-field and real-world field settings, providing a comprehensive view of plant growth environments.

Scene Types

AgIR encompasses two distinct scene types, each with unique advantages and applications.

Semi-Field Data

  • Environment: Images captured in semi-controlled environments (e.g., nurseries) using the BenchBot, a gantry-like robotic system.

  • Throughput: High-throughput collection allows for the acquisition of large amounts of data daily (over 500 images across a large nursery potting area) across 3 US locations.

  • Annotation: Facilitated by the plain black background of weed fabric and by hand weeding, which enable automatic annotation through photogrammetry, digital image processing, and deep learning.

  • Trade-offs: While offering scalability and automation, these images may not fully represent real-world plant conditions.
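
To illustrate why a plain dark background makes automatic annotation tractable, here is a minimal sketch of vegetation segmentation by color thresholding. It is not the AgIR pipeline: the Excess Green index is a commonly used stand-in technique, and the function names and the threshold value of 0.1 are assumptions chosen for this example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each 0-255


def excess_green(pixel: Pixel) -> float:
    """Excess Green index (2g - r - b) on chromaticity-normalized channels."""
    r, g, b = pixel
    total = r + g + b
    if total == 0:  # pure black pixel: no color information
        return 0.0
    return (2 * g - r - b) / total


def plant_mask(image: List[List[Pixel]], threshold: float = 0.1) -> List[List[bool]]:
    """Mark pixels whose index exceeds the threshold (0.1 is an assumed value)."""
    return [[excess_green(px) > threshold for px in row] for row in image]


# Tiny 2x3 example: dark fabric pixels plus two green plant pixels.
image = [
    [(10, 10, 10), (40, 160, 30), (12, 11, 10)],
    [(35, 150, 40), (9, 9, 9), (11, 10, 12)],
]
mask = plant_mask(image)
```

Against a uniform dark background, nearly neutral pixels score close to zero while green tissue scores high, so a single threshold separates the two. Real images would likely need lighting normalization and cleanup of small false detections.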


Field Data

  • Environment: Images collected manually by teams in real-world field settings.

  • Real-World Relevance: Images closely reflect natural plant growth and environmental variations.

  • Annotation: Images captured with a grey mat background to aid in segmentation and annotation processes.

  • Trade-offs: Limited by manual collection, resulting in a smaller data pool compared to semi-field data.

[Image: field_collection.png]

Complementary Nature

The semi-field and field data work in tandem to overcome their individual limitations. The extensive, automatically annotated semi-field data can be leveraged to train annotation helper models, which then aid in the annotation of the more limited, real-world field data. This approach addresses one of the most challenging and expensive tasks in deep learning model development: image labeling.

An Agricultural Image Repository for the United States


Weeds

  • Amaranthus spinosus
  • Acalypha ostryifolia
  • Urochloa platyphylla
  • Sida spinosa
  • Amaranthus palmeri
  • Eleusine indica
  • Trianthema portulacastrum
  • Urochloa ramosa
  • Euphorbia hyssopifolia
  • Convolvulus arvensis
  • Solanum elaeagnifolium
  • Amaranthus tuberculatus
  • Ambrosia trifida
  • Cyperus esculentus
  • Xanthium strumarium
  • Cirsium arvense
  • Amaranthus retroflexus
  • Setaria viridis
  • Chenopodium album
  • Abutilon theophrasti
  • Ambrosia artemisiifolia
  • Digitaria spp.
  • Amaranthus hybridus/A. retroflexus
  • Panicum dichotomiflorum
  • Datura stramonium
  • Setaria pumila
  • Setaria faberi
  • Erigeron canadensis
  • Senna obtusifolia
  • Digitaria sanguinalis
  • Echinochloa crus-galli
  • Urochloa texana
  • Echinochloa colona
  • Kochia scoparia
  • Helianthus annuus
  • Parthenium hysterophorus
  • Sorghum halepense

Cover Crops

  • Black Oats
  • Brassica hirta
  • Brassica juncea
  • Brassica napus
  • Brassica rapa
  • Cereal Rye
  • Crimson Clover
  • Hairy Vetch
  • Raphanus sativus
  • Red Clover
  • Winter Pea
  • Winter Wheat

Cash Crops

Semi-Automatic Labeling

A major component of AgIR and its semi-automatic labeling approach is the development of plant segments, or cutouts, that can be used to build temporary datasets of synthetic images for training annotation-assistant detection and segmentation models. Synthetic data, along with weak image labels, are used to iteratively refine whole-image labels.
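
The idea of building synthetic images from cutouts can be sketched as follows. This is an illustrative example only, not the repository's implementation: the names `paste_cutout` and `make_synthetic`, the use of `None` for transparent pixels, and the uniform random placement are all assumptions made for the sketch. It also assumes every cutout fits inside the canvas.

```python
import random
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]
Cutout = List[List[Optional[Pixel]]]  # None marks transparent (non-plant) pixels


def paste_cutout(canvas: Image, labels: List[List[int]], cutout: Cutout,
                 top: int, left: int, class_id: int) -> None:
    """Paste a cutout in place and record its class id in the label map."""
    for dy, row in enumerate(cutout):
        for dx, px in enumerate(row):
            y, x = top + dy, left + dx
            inside = 0 <= y < len(canvas) and 0 <= x < len(canvas[0])
            if px is None or not inside:
                continue
            canvas[y][x] = px
            labels[y][x] = class_id


def make_synthetic(height: int, width: int, background: Pixel,
                   cutouts: List[Tuple[int, Cutout]],
                   rng: random.Random) -> Tuple[Image, List[List[int]]]:
    """Place each (class_id, cutout) at a random position on a plain background."""
    canvas = [[background] * width for _ in range(height)]
    labels = [[0] * width for _ in range(height)]  # 0 = background
    for class_id, cutout in cutouts:
        top = rng.randrange(height - len(cutout) + 1)
        left = rng.randrange(width - len(cutout[0]) + 1)
        paste_cutout(canvas, labels, cutout, top, left, class_id)
    return canvas, labels


# One hypothetical 2x2 leaf cutout with a transparent corner.
leaf = [[None, (40, 160, 30)], [(35, 150, 40), (38, 155, 35)]]
synthetic, label_map = make_synthetic(4, 4, (90, 70, 50), [(1, leaf)], random.Random(0))
```

Because the label map is written at the same time as the pixels, each synthetic image comes with a pixel-accurate segmentation label at no extra annotation cost. Cutouts pasted later overwrite earlier ones, which gives a simple form of occlusion.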
