Storage, VM setup, and running the pipeline
BenchBot operators manually upload batches to the "upload" blob container
The blob container is mounted to a VM
A cron job is scheduled to start the pipeline every N hours
The pipeline runs and stores data in temporary locations until complete
The last process moves the data to the appropriate blob container mounted on the VM
Processed batches are recorded
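A minimal sketch of what the cron-triggered run might look like, assuming a Python entry point. The mount points, the `run_pipeline` function, and the `processed_batches.txt` log file are hypothetical names for illustration, not the actual implementation.

```python
# Hypothetical runner invoked by cron, e.g.:
#   0 */6 * * * /usr/bin/python3 /opt/pipeline/run.py
import shutil
import tempfile
from pathlib import Path

UPLOAD_MOUNT = Path("/mnt/blob/upload")        # assumed mount point
OUTPUT_MOUNT = Path("/mnt/blob/processed")     # assumed mount point
PROCESSED_LOG = Path("/opt/pipeline/processed_batches.txt")

def already_processed(batch: Path) -> bool:
    done = PROCESSED_LOG.read_text().splitlines() if PROCESSED_LOG.exists() else []
    return batch.name in done

def run_pipeline(batch: Path, workdir: Path) -> None:
    """Placeholder for the seven pipeline processes; writes results to workdir."""
    ...

def main() -> None:
    for batch in sorted(UPLOAD_MOUNT.iterdir()):
        if not batch.is_dir() or already_processed(batch):
            continue
        # Work in a temporary location until the batch completes...
        with tempfile.TemporaryDirectory() as tmp:
            run_pipeline(batch, Path(tmp))
            # ...then, as the last step, move results onto the blob mount
            shutil.copytree(tmp, OUTPUT_MOUNT / batch.name, dirs_exist_ok=True)
        # Record the batch so the next cron run skips it
        with PROCESSED_LOG.open("a") as f:
            f.write(batch.name + "\n")

if __name__ == "__main__":
    main()
```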
Pipeline
The pipeline includes seven main processes and places data in five blob containers.
Upload and Preprocessing
BenchBot operators manually upload batches to the "upload" blob container. Images are then processed using a color calibration card.
Upload batches include images and metadata from a single location and capture period. Batches are named after the location ID and date, for example, NC_2022-03-22.
Metadata is made up of:
Ground control point locations (csv)
Species map (csv)
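For illustration only, here is one way the two metadata files might be read. The filenames and column names are assumptions, not the project's actual schema.

```python
import csv
from pathlib import Path

batch = Path("/mnt/blob/upload/NC_2022-03-22")  # batch naming from above

# Ground control point locations (assumed columns: gcp_id, x_m, y_m)
with (batch / "gcp_locations.csv").open() as f:   # hypothetical filename
    gcps = list(csv.DictReader(f))

# Species map (assumed columns: row, species)
with (batch / "species_map.csv").open() as f:     # hypothetical filename
    species_map = {r["row"]: r["species"] for r in csv.DictReader(f)}
```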
Mapping and Detection
AutoSfM
The AutoSfM process takes in developed images and ground control point metadata to create a global coordinate reference system (CRS). It generates an orthomosaic (a collage of stitched images) and detailed camera reference information; the latter is used to convert local image coordinates into global potting-area locations.
For example, an image 2000 pixels high and 4000 pixels wide has a local center point at (1000, 2000): half its height and half its width. Camera reference information allows us to convert this local center point to a location on the ground in real-world meters, for example (1.23 m, 4.56 m).
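A toy sketch of that conversion, assuming the camera reference information has been reduced to a single 3x3 image-to-ground homography H. AutoSfM tools expose full camera models; collapsing them to one homography is a simplification that only holds for a flat potting area, and the matrix values below are made up.

```python
import numpy as np

# Hypothetical image-to-ground homography derived from AutoSfM
# camera reference output (values are invented for illustration).
H = np.array([
    [1.23e-3, 0.0,     0.0],
    [0.0,     1.14e-3, 0.0],
    [0.0,     0.0,     1.0],
])

def pixel_to_ground(u: float, v: float) -> tuple[float, float]:
    """Map a pixel (u=column, v=row) to ground coordinates in meters."""
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w

# Local center of a 4000x2000 image -> global potting-area location
print(pixel_to_ground(2000, 1000))   # (2.46, 1.14) with the made-up H
```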
Detection
Object detection is performed to identify plant locations and produce local (image-space) bounding box coordinates.
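As a sketch, the detection step might look like the following, assuming a torchvision-style detector fine-tuned to localize plants. The actual model architecture, weights file, and image path are not specified in this document and are placeholders here.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor
from PIL import Image

# Assumed: a detection model fine-tuned for a single "plant" class.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(num_classes=2)
model.load_state_dict(torch.load("plant_detector.pth"))  # hypothetical weights
model.eval()

image = Image.open("NC_2022-03-22/row1_img3.jpg")        # hypothetical path
with torch.no_grad():
    (pred,) = model([to_tensor(image)])

# Local (image-space) bounding boxes: [x_min, y_min, x_max, y_max] in pixels
boxes = pred["boxes"][pred["scores"] > 0.5]
```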
[Image: Detection results from 2022-03-11]
Remap
The remap step infers global bounding box positions using AutoSfM camera reference information.
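A minimal sketch of the remap under the same flat-ground homography simplification as above: each local box corner is projected to ground coordinates, and the global box is the axis-aligned extent of the projected corners.

```python
import numpy as np

def remap_box(box, H):
    """Project a local [x_min, y_min, x_max, y_max] pixel box to global
    (meter) coordinates using an image-to-ground homography H."""
    x0, y0, x1, y1 = box
    corners = np.array([[x0, y0, 1], [x1, y0, 1], [x0, y1, 1], [x1, y1, 1]]).T
    g = H @ corners
    g = g[:2] / g[2]                      # normalize homogeneous coordinates
    return [g[0].min(), g[1].min(), g[0].max(), g[1].max()]
```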
WHY?
The object detection model alone does not solve all our problems. It only detects plants, not species. Species information is necessary to create accurate and detailed label data.
Species mapping: Species-level detection for this project (24 species) is unrealistic at this early stage. When a user-defined species map and geospatial data are applied, AutoSfM results can provide species-level information: if we know what row or general geographic area each species is located in, then we can label each bounding box appropriately.
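For example, if the species map records which species is planted in each row, a remapped box can be labeled by the row band its global center falls in. The row width, orientation, and species names below are invented for illustration, not the project's actual layout.

```python
# Assumed layout: rows run along the y-axis and are 0.5 m wide (made up).
ROW_WIDTH_M = 0.5
species_map = {0: "amaranthus", 1: "setaria", 2: "ambrosia"}  # hypothetical

def label_box(global_box, species_map, row_width=ROW_WIDTH_M):
    x_min, y_min, x_max, y_max = global_box
    center_x = (x_min + x_max) / 2
    row = int(center_x // row_width)      # which row band the plant sits in
    return species_map.get(row, "unknown")

print(label_box([1.10, 0.40, 1.30, 0.60], species_map))  # -> "ambrosia"
```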
Unique detection result: Provides unique (primary) bounding box information. The BenchBot takes six images along a single row of four pots. These images overlap considerably, so the same plant is often detected, and thus segmented, multiple times at different angles. While multiple angles are useful, it is important to identify the unique, or primary, detection result (when the camera is directly over the plant). Doing so allows us to (see the sketch after this list):
maximize synthetic image diversity and avoid using the same plant segment (albeit at slightly different angles) multiple times;
monitor and understand the distribution of primary vs. non-primary data for training models, since a dataset with many non-unique duplicates, while large, will not be diverse and will lead to poor model performance;
monitor individual plants throughout their growth by identifying unique plant/pot positions.
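A sketch of one way to pick the primary detection, assuming detections of the same plant have already been grouped by overlap in global coordinates: within a group, keep the detection whose local box center is closest to its image center, i.e., where the camera was most nearly overhead. Both the grouping step and the distance criterion are assumptions about the approach, not a description of the actual code.

```python
def primary_detection(group, image_w=4000, image_h=2000):
    """group: list of dicts, each with a local pixel 'box' [x0, y0, x1, y1],
    all believed to show the same plant. Returns the primary detection."""
    cx, cy = image_w / 2, image_h / 2

    def off_center(det):
        x0, y0, x1, y1 = det["box"]
        bx, by = (x0 + x1) / 2, (y0 + y1) / 2
        return (bx - cx) ** 2 + (by - cy) ** 2  # squared distance to center

    return min(group, key=off_center)
```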
Monitoring: Monitor for inconsistencies and errors in image capture across sites using detailed reporting of camera reference information.
Segment Vegetation and Cutout Data
After remapping bounding box coordinates, vegetation is segmented within each bounding box and saved as cutout data.
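As an illustration of the idea (not the pipeline's actual segmentation method), vegetation inside a detection box could be separated from soil and background with a simple Excess Green (ExG) index threshold and saved as a cutout. The threshold value and file paths are placeholders.

```python
import cv2
import numpy as np

def cutout(image_bgr, box, exg_thresh=0.1):
    """Crop a detection box and mask non-vegetation pixels using the
    Excess Green index ExG = 2g - r - b on normalized RGB."""
    x0, y0, x1, y1 = (int(v) for v in box)
    crop = image_bgr[y0:y1, x0:x1].astype(np.float32)
    b, g, r = cv2.split(crop)
    total = b + g + r + 1e-6                 # avoid division by zero
    exg = 2 * (g / total) - (r / total) - (b / total)
    mask = (exg > exg_thresh).astype(np.uint8)
    return crop.astype(np.uint8) * mask[..., None]

img = cv2.imread("row1_img3.jpg")            # hypothetical image path
veg = cutout(img, [120, 80, 560, 470])       # hypothetical local box
cv2.imwrite("cutout_001.png", veg)
```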