“The scarcity of public image datasets remains a key bottleneck in developing next-generation computer vision and intelligent systems for precision agriculture.” (Lu and Young 2020)
Why?
Developing image datasets is difficult, especially for agricultural applications. Collecting and annotating images pose major challenges.
Labeling is Time Consuming
Training data-hungry deep learning models requires tens to hundreds of thousands, sometimes millions, of images
Others have noted that manually labeling a single image can take minutes to hours (Table 1)
Table 1: Labeling time from the literature
| Source | Annotation technique | Scene type | Time |
|---|---|---|---|
|  | manual segmentation of cutouts | simple | 5-30 min / real image |
| Skovsen et al. (2019) (in conversation) | manual segmentation of cutouts | complex (field) | hour(s) / real image |
|  | manual segmentation of drone images of 3 classes | complex (field) | 60 min / image |
|  | manual segmentation of 3 classes | complex (field) | 2-4 hrs / image |
Labeling is Expensive
Texas A&M Case Study
Texas A&M used a third-party labeling service to semantically label images of agricultural weeds and plants at early growth stages, roughly 2-6 weeks after emergence. Details are as follows:
Company: Precise BPO Solution
1000 images
11726 segments
$0.125 per segment*
*TAMU was given a discounted rate of $0.095 per segment, but the non-discounted price of $0.125 is used here, because the company is unlikely to offer the same discount on orders of more than 1000 images.
Time to label all images - 2.5 weeks
$1465.75 total cost
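The total follows from the segment count and the per-segment price. A minimal sketch of that arithmetic, using only the figures listed above (the variable names are illustrative, not from the case study):

```python
# Figures from the Texas A&M case study above.
n_images = 1_000
n_segments = 11_726
price_per_segment = 0.125  # USD, non-discounted price

# Total cost is simply segments times price per segment.
total_cost = n_segments * price_per_segment

# Per-image averages, useful when scaling to other dataset sizes.
segments_per_image = n_segments / n_images
cost_per_image = total_cost / n_images

print(f"Total cost: ${total_cost:,.2f}")
print(f"Average segments per image: {segments_per_image:.3f}")
print(f"Average cost per image: ${cost_per_image:.2f}")
```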
Agricultural Scenes are Diverse
High individual diversity from differences in growth stages and effects of biotic and abiotic factors on morphology
Large intra-species variation and similarity with other species
Complex agricultural scenes resulting from the effects of climate, soil, ground residue, and other weed populations
Labeling Cost Projections for SemiField
We can estimate the expected costs and time for SemiField data collection using the Texas A&M numbers.
Here we assume a smaller average of 8 segments per image (instead of TAMU's 11.726)
$0.125 per segment
2.5 weeks (13 working days from 8am - 5pm) = ~104 worker hours
104 hours translates to about 32 seconds per segment (104 hours × 3600 seconds per hour / 11726 segments)
Images | Worker hours (estimate) | Total cost ($) |
---|---|---|
1,000 (TAMU, 11.726 segments per image) | 104.2 | $1,465.75 |
1,000 | 71.1 | $1,000.00 |
10,000 | 710.7 | $10,000.00 |
25,000 | 1,776.7 | $25,000.00 |
100,000 | 7,106.7 | $100,000.00 |
500,000 | 35,533.3 | $500,000.00 |
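The projection can be reproduced with a few lines of Python. This is a rough sketch under the assumptions stated above (8 segments per image, $0.125 per segment, and the TAMU labeling rate of 104 worker hours for 11726 segments); the function name `project` and the constant names are hypothetical. It uses the unrounded labeling rate, so the hours can differ slightly from values computed with the rounded 32-second figure.

```python
# Assumptions taken from the Texas A&M case study and the list above.
TAMU_SEGMENTS = 11_726
TAMU_WORKER_HOURS = 104
PRICE_PER_SEGMENT = 0.125  # USD, non-discounted price
SEGMENTS_PER_IMAGE = 8     # assumed SemiField average

# Labeling rate implied by the TAMU job (roughly 32 seconds per segment).
SECONDS_PER_SEGMENT = TAMU_WORKER_HOURS * 3600 / TAMU_SEGMENTS


def project(n_images, segments_per_image=SEGMENTS_PER_IMAGE):
    """Return (worker_hours, total_cost_usd) for labeling n_images."""
    segments = n_images * segments_per_image
    worker_hours = segments * SECONDS_PER_SEGMENT / 3600
    total_cost = segments * PRICE_PER_SEGMENT
    return worker_hours, total_cost


for n in (1_000, 10_000, 25_000, 100_000, 500_000):
    hours, cost = project(n)
    print(f"{n:>7,} images: {hours:>10,.1f} worker hours, ${cost:>12,.2f}")
```

Both quantities scale linearly with the number of segments, so changing `segments_per_image` or the price rescales every row by the same factor.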