Repository Overview
...
Purpose
The Agricultural Image Repository (AgIR) is a collection of images that serves as a resource for advancing computer vision and artificial intelligence research in weed management and image-based phenotyping within precision agriculture. The need for effective weed management is growing as the demand for food and fiber increases, as the incidence of herbicide resistance rises, and as the call for more sustainable methods grows louder. In response, researchers and developers are turning to computer vision algorithms to help automate the detection and identification of weeds in agricultural fields.

AgIR was created to support this research by providing a large, diverse collection of labeled images representing various weeds, cover crops, and cash crops grown in different settings and growing conditions. It aims to support the development of automated tools for weed management, crop monitoring, and plant trait analysis. A semi-automatic approach that combines photogrammetry, classical digital image processing, and deep learning is used to generate multiple types of labels, capturing information about the plants present in each image and making the data suitable for a variety of machine learning tasks. Here, you will find a comprehensive overview of the dataset, including its contents, annotation process, metadata, characteristics, and availability.
By capturing weeds, cover crops, and cash crops under diverse conditions, AgIR empowers researchers and developers to create innovative solutions for sustainable and efficient agricultural practices.
Scope
AgIR is made up of three broad plant datasets: Weeds (WIR), Cover Crops (CCIR), and Cash Crops (CIR). Each of these can be further divided into two main scene types, Semi-Field and real-world Field, each providing a unique set of advantages and use cases.
The datasets also include metadata captured during collection and throughout the processing pipeline. This information can be used to further understand the diversity of the dataset, deficiencies, general makeup, and the types of conditions that the algorithms are exposed to.
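Because each image carries metadata from collection and processing, the dataset can be sliced by attributes such as scene type, species, or location to audit its makeup. A minimal sketch of that kind of filtering is shown below; the record structure and field names here are purely illustrative and are not AgIR's actual schema:

```python
# Hypothetical metadata records; the field names are illustrative only --
# the real AgIR metadata schema may differ.
records = [
    {"image_id": "a1", "scene": "semi-field", "species": "Amaranthus palmeri", "location": "NC"},
    {"image_id": "b2", "scene": "field", "species": "Cyperus esculentus", "location": "TX"},
    {"image_id": "c3", "scene": "semi-field", "species": "Cyperus esculentus", "location": "MD"},
]

def filter_records(records, **criteria):
    """Keep only records whose values match every keyword criterion."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]

# e.g. count how much of the pool comes from semi-field collection
semi_field = filter_records(records, scene="semi-field")
```

Aggregating such queries over the full metadata table is one way to expose the diversity and deficiencies the text refers to.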
In summary, AgIR is a comprehensive collection of images and metadata designed to support research in weed management. This large, diverse dataset provides a wealth of data for computer vision algorithms to learn from, helping to improve the accuracy and efficiency of automatic weed detection in agricultural fields.
Plant Types
AgIR encompasses a broad range of agricultural imagery, specifically focusing on:
Weeds: A diverse collection of over 40 agronomically relevant weed species found across different regions of the United States.
Cover Crops: Common cover crop species grown in the US.
Cash Crops: Key cash crop species grown in the US.
Scene Types: Images capture both semi-field and real-world field settings, providing a comprehensive view of plant growth environments.
Scene Types
AgIR encompasses two distinct scene types, each with unique advantages and applications.
Semi-Field Data
Environment: Images captured in semi-controlled environments (e.g., nurseries) using the BenchBot, a gantry-like robotic system.
Throughput: High-throughput collection allows large amounts of data to be acquired daily (over 500 images across a large nursery potting area) at three US locations.
Annotation: Facilitated by the plain black background of the weed fabric and by hand weeding, enabling automatic annotation through photogrammetry, digital image processing, and deep learning.
Trade-offs: While offering scalability and automation, these images may not fully represent real-world plant conditions.
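The dark weed-fabric background is what makes classical image processing viable as a first annotation step: plant pixels stand out sharply from the near-black bench. The sketch below shows the idea with simple per-pixel thresholding; the function name and threshold value are illustrative assumptions, not part of the AgIR pipeline, which also uses photogrammetry and deep learning:

```python
import numpy as np

def segment_on_dark_background(image: np.ndarray, threshold: int = 40) -> np.ndarray:
    """Return a boolean foreground mask for an RGB image shot over a dark
    (black weed-fabric) background.

    A pixel is treated as plant material when any channel rises clearly above
    the near-black background level. A production pipeline would add
    morphological cleanup and vegetation indices; this is a minimal sketch.
    """
    return image.max(axis=2) > threshold

# Tiny synthetic example: a dark 4x4 "bench" with a bright 2x2 green "plant".
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[1:3, 1:3] = [30, 180, 45]  # green-ish patch
mask = segment_on_dark_background(img)
```

Masks like this can then be refined and converted into the cutouts and labels described later in the document.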
...
Field Data
Environment: Images collected manually by teams in real-world field settings.
Real-World Relevance: Images closely reflect natural plant growth and environmental variations.
Annotation: Images captured with a grey mat background to aid in segmentation and annotation processes.
Trade-offs: Limited by manual collection, resulting in a smaller data pool compared to semi-field data.
Complementary Nature
The semi-field and field data work in tandem to overcome their individual limitations. The extensive, automatically annotated semi-field data can be leveraged to train annotation helper models, which then aid in the annotation of the more limited, real-world field data. This approach addresses one of the most challenging and expensive tasks in deep learning model development: image labeling.
An Agricultural Image Repository for the United States
...
Cover Crops | Cash Crops | Weeds |
---|---|---|
Black Oats | Corn | Amaranthus spinosus |
Brassica hirta | Cotton | Acalypha ostryifolia |
Barley | Soybean | Urochloa platyphylla |
Brassica juncea | | Sida spinosa |
Brassica napus | | Amaranthus palmeri |
Brassica rapa | | Eleusine indica |
Cereal Rye | | Trianthema portulacastrum |
Crimson Clover | | Urochloa ramosa |
Hairy Vetch | | Euphorbia hyssopifolia |
Oats | | Amaranthus palmeri |
Raphanus sativus | | Convolvulus arvensis |
Red Clover | | Solanum elaeagnifolium |
Triticale | | Amaranthus tuberculatus |
Winter Pea | | Ambrosia trifida |
Winter Wheat | | Cyperus esculentus |
 | | Xanthium strumarium |
 | | Cirsium arvense |
 | | Amaranthus retroflexus |
 | | Setaria viridis |
 | | Chenopodium album |
 | | Abutilon theophrasti |
 | | Chenopodium album |
 | | Ambrosia artemisiifolia |
 | | Digitaria spp. |
 | | Amaranthus hybridus/A. retroflexus |
 | | Panicum dichotomiflorum |
 | | Datura stramonium |
 | | Setaria pumila |
 | | Setaria faberi |
 | | Erigeron canadensis |
 | | Amaranthus palmeri |
 | | Senna obtusifolia |
 | | Xanthium strumarium |
 | | Digitaria sanguinalis |
 | | Eleusine indica |
 | | Echinochloa crus-galli |
 | | Cyperus esculentus |
 | | Amaranthus tuberculatus |
 | | Urochloa texana |
 | | Echinochloa colona |
 | | Kochia scoparia |
 | | Helianthus annuus |
 | | Parthenium hysterophorus |
 | | Sorghum halepense |
Semi-Automatic Labeling
A major component of AgIR's semi-automatic labeling approach is the development of plant segments, or cutouts, which are used to build temporary datasets of synthetic images for training annotation-assistant detection and segmentation models. Synthetic data, along with weak image labels, is then used to iteratively refine whole-image labels.
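The core of this synthetic-image step is compositing: pasting masked cutouts onto background images, where the paste location itself yields a free bounding-box label for the detection model. A minimal sketch, assuming a simple hard (non-blended) paste; the function name and box format are illustrative, not AgIR's actual implementation:

```python
import numpy as np

def composite_cutout(background, cutout, mask, top, left):
    """Paste a plant cutout onto a background at (top, left) using its binary
    mask, returning the synthetic image and the pasted bounding box
    (x_min, y_min, x_max, y_max) -- a weak label for a detection model."""
    out = background.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]
    region[mask] = cutout[mask]  # copy only the plant pixels, not the fabric
    bbox = (left, top, left + w, top + h)
    return out, bbox

# Tiny example: paste a 2x2 bright cutout onto an 8x8 dark background.
bg = np.zeros((8, 8, 3), dtype=np.uint8)
cut = np.full((2, 2, 3), 200, dtype=np.uint8)
m = np.ones((2, 2), dtype=bool)
synthetic, box = composite_cutout(bg, cut, m, top=3, left=4)
```

Real pipelines typically add random scaling, rotation, and blending at the cutout boundary so the detector does not overfit to paste artifacts.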
...