Conceptual Design
The pipeline is broken into three general processing steps:
Image preprocessing (blue)
Global coordinate referencing (yellow)
Vegetation segmentation (green)
Upload, running the pipeline, and storage
Batch Upload
Benchbot operators manually upload batch folders to the “uploads” blob container, either with a custom app or by dragging and dropping the batch folder in Azure Storage Explorer.
Upload batches include images and metadata from a single location and capture period. Batches are named after the location ID and date, for example, NC_2022-03-22. Each batch contains:
Images
Potting area and species location metadata
Images are uniquely labeled using UNIX timestamps.
Metadata includes:
Ground control point (GCP) marker locations (.csv), measured in meters from a single point of origin.
Species map (.csv) that lists the species planted in each row of the potting area.
Blob containers are mounted to the VM.
A cron job is scheduled to start the pipeline every N hours.
To avoid processing the same batch multiple times, new batches are selected using a batch log file (see the sketch below).
The pipeline runs and stores data to temporary locations until complete.
While blob containers do not technically use directories or “folders”, this project uses a pseudo-directory structure to organize the multiple data products being created and updated. The temporary storage locations replicate the pseudo-directory structure used in the blob containers.
After the pipeline finishes, the processed batch is logged to the batch log file.
Lastly, temporary batch data is moved permanently to the appropriate blob containers.
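The batch selection and logging step might look like the following minimal sketch, assuming the “uploads” container is mounted at a hypothetical path and the batch log is a plain text file of batch names:

```python
from pathlib import Path

# Hypothetical mount point of the "uploads" blob container and batch log location.
UPLOADS_DIR = Path("/mnt/uploads")
BATCH_LOG = Path("/mnt/logs/processed_batches.txt")

def get_new_batches() -> list[Path]:
    """Return upload batches (e.g. NC_2022-03-22) not yet listed in the batch log."""
    processed = set(BATCH_LOG.read_text().split()) if BATCH_LOG.exists() else set()
    # Each batch is a folder named <location ID>_<YYYY-MM-DD>.
    return [p for p in sorted(UPLOADS_DIR.iterdir())
            if p.is_dir() and p.name not in processed]

def mark_processed(batch: Path) -> None:
    """Append a finished batch to the log so the next cron run skips it."""
    with BATCH_LOG.open("a") as log:
        log.write(batch.name + "\n")
```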
AutoSfM runs in a Docker container
All code is written in Python 3.9
Pipeline
The pipeline includes seven main processes and places data in five temporary storage locations.
Preprocessing
Includes color calibration…
Raw images are preprocessed using a color calibration card (see the sketch below).
Inputs:
Raw images from “uploads” blob container
Outputs:
Processed JPGs
Masks that remove the blue BenchBot area, to reduce AutoSfM alignment error
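One common way to apply a color calibration card, shown here only as a hedged sketch (the patch-extraction step and all names are hypothetical), is to fit an affine correction matrix from the card's measured patch colors to their known reference values:

```python
import numpy as np

def fit_color_correction(measured_rgb: np.ndarray, reference_rgb: np.ndarray) -> np.ndarray:
    """Fit a 4x3 affine matrix mapping measured card patch colors (N x 3, in [0, 1])
    to their known reference values (N x 3, in [0, 1]) by least squares."""
    measured = np.hstack([measured_rgb, np.ones((len(measured_rgb), 1))])  # N x 4
    matrix, *_ = np.linalg.lstsq(measured, reference_rgb, rcond=None)      # 4 x 3
    return matrix

def apply_color_correction(image: np.ndarray, matrix: np.ndarray) -> np.ndarray:
    """Apply the fitted correction to an H x W x 3 float image in [0, 1]."""
    h, w, _ = image.shape
    flat = np.hstack([image.reshape(-1, 3), np.ones((h * w, 1))])
    return np.clip(flat @ matrix, 0.0, 1.0).reshape(h, w, 3)
```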
Mapping and Detection
AutoSfM
The AutoSfM process takes in developed images and ground control point metadata to create a global coordinate reference system (CRS). An orthomosaic (a collage of stitched images) and detailed camera reference information are generated; the latter is used to convert local image coordinates into global potting area locations.
For example, an image 2000 pixels high and 4000 pixels wide has a local center point at (1000, 2000), half its height and half its width, measured in pixels. Camera reference information allows us to project this local center point to a geographical potting area location in meters, (1.23m, 4.56m) for example.
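The full conversion uses the camera matrices produced by AutoSfM; as a simplified illustration, the sketch below assumes each image's field of view is available as an axis-aligned rectangle in potting-area meters (as in the per-image FOV output listed below) and maps a pixel by linear interpolation. All field names and the example FOV extents are assumptions.

```python
def pixel_to_potting_area(px, py, img_w, img_h, fov):
    """Map a pixel location (px, py) to potting-area meters, assuming the image's
    field of view is an axis-aligned rectangle (a simplification of the full
    camera projection). fov uses hypothetical keys x_min, x_max, y_min, y_max."""
    x_m = fov["x_min"] + (px / img_w) * (fov["x_max"] - fov["x_min"])
    y_m = fov["y_min"] + (py / img_h) * (fov["y_max"] - fov["y_min"])
    return x_m, y_m

# The example above: the center of a 4000 x 2000 image. With a (made-up) field of
# view spanning 0-2.46 m in x and 0-9.12 m in y, the center maps to (1.23, 4.56).
print(pixel_to_potting_area(2000, 1000, 4000, 2000,
                            {"x_min": 0.0, "x_max": 2.46, "y_min": 0.0, "y_max": 9.12}))
```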
Inputs:
Developed images
Ground control point information (.csv)
Outputs:
Camera matrix and orientation information for each image (.csv)
Field of view (FOV) for each image (.csv)
Processing accuracy assessment (.csv)
Orthomosaic of the potting area (.tif)
Detection
Object detection is performed to identify plant locations and create local bounding box coordinates. More detailed pixel-wise segmentation is then performed within the bounding box areas.
Model - YOLOv5
Single class - “plant”
Trained and tested on 753 images captured and labeled for the 2021 OpenCV AI competition.
mAP_0.5 = 0.93, mAP_0.5:0.95 = 0.67, recall = 0.9, precision = 0.93
Inputs:
Developed images
Trained model
Outputs:
Local detection results for each image. Detection results are in normalized xyxy format (.csv).
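Detection is a standard YOLOv5 inference run; a minimal sketch using the Ultralytics torch.hub interface (the weights path and image name are hypothetical):

```python
import torch

# Load the custom single-class "plant" detector (weights path is hypothetical).
model = torch.hub.load("ultralytics/yolov5", "custom", path="plant_detector.pt")

results = model("NC_2022-03-22/images/1647962400.jpg")  # image named by UNIX timestamp

# Normalized xyxy detections: one row per box with x1, y1, x2, y2 in [0, 1],
# confidence, and class index (always the "plant" class here).
detections = results.xyxyn[0]
print(detections)
```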
Remap
Local plant detection results are remapped to global potting area coordinates.
Inputs:
Images
Camera reference information (.csv)
Outputs:
Detailed metadata with camera information and detection results (.json)
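A minimal sketch of the remap step, assuming a converter like pixel_to_potting_area above and a per-image camera reference row; the output layout is an assumption, not the pipeline's exact schema:

```python
import json

def remap_detections(image_name, detections_xyxyn, img_w, img_h, fov, camera_row, to_meters):
    """Convert normalized xyxy boxes to potting-area meters and bundle them with the
    image's camera reference information (field names are hypothetical)."""
    records = []
    for x1, y1, x2, y2, conf, cls in detections_xyxyn:
        top_left = to_meters(x1 * img_w, y1 * img_h, img_w, img_h, fov)
        bottom_right = to_meters(x2 * img_w, y2 * img_h, img_w, img_h, fov)
        records.append({"bbox_m": [*top_left, *bottom_right], "confidence": float(conf)})
    return {"image": image_name, "camera": camera_row, "detections": records}

# metadata = remap_detections("1647962400.jpg", detections.tolist(), 4000, 2000,
#                             fov, camera_row, pixel_to_potting_area)
# json.dump(metadata, open("1647962400.json", "w"), indent=2)
```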
WHY?
Image annotations need species-level information, but the detection model only provides the location of “plants”. We also have many overlapping images, resulting in duplicate segments, which can lead to an imbalanced or homogeneous dataset.
Species mapping: We can infer the species of each detection result using a user-defined species map and geospatial data. If we know which row or general area each species occupies, we can label each bounding box appropriately (see the sketch following this list).
Unique detection result: The BenchBot takes 6 images along a single row of 4 pots. These images overlap considerably, so the same plant is often detected, and thus segmented, multiple times at different angles. While multiple angles are useful, it is important to identify the unique, or primary, detection result (when the camera is directly over the plant). Doing so allows us to:
Maximize synthetic image diversity and avoid using the same plant segment (albeit at slightly different angles) multiple times, which could lead to homogeneous data and thus poor model performance.
Identify unique plant/pot positions throughout their growth stages, leading to detailed phenotypic profiles.
Monitoring: Monitor for inconsistencies and errors in image capture across sites using detailed reporting of camera reference information.
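The species-mapping lookup might reduce to the following sketch, assuming the species map boils down to a per-row label and that rows are evenly spaced along the potting area's y axis (the spacing and all names are assumptions):

```python
def assign_species(bbox_m, species_by_row, row_length_m=1.0):
    """Assign a species label to a detection from its potting-area coordinates.

    bbox_m: (x1, y1, x2, y2) in meters (from the remap step).
    species_by_row: mapping of row index -> species, built from the species map CSV.
    row_length_m: assumed spacing between rows along the y axis.
    """
    x1, y1, x2, y2 = bbox_m
    row_index = int(((y1 + y2) / 2) // row_length_m)
    return species_by_row.get(row_index, "unknown")

# Example with hypothetical labels: a box centered at y = 1.5 m falls in row 1.
species_by_row = {0: "species_A", 1: "species_B", 2: "species_C"}
print(assign_species((1.1, 1.4, 1.4, 1.6), species_by_row))  # -> "species_B"
```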
Segment Vegetation and Cutout Data
A combination of digital image processing techniques, including index thresholding, unsupervised classification, and morphological operations, separates vegetation from the background. The resulting plant cutouts will be used for generating synthetic data.
…
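As one concrete (and simplified) illustration of these techniques, the sketch below thresholds an excess green (ExG) index with Otsu's method and cleans the mask with morphological opening and closing; the actual pipeline may combine further indices and an unsupervised classifier:

```python
import cv2
import numpy as np

def segment_vegetation(bgr_image: np.ndarray) -> np.ndarray:
    """Return a binary vegetation mask via an excess green (ExG) index,
    Otsu thresholding, and morphological cleanup (a simplified sketch)."""
    b, g, r = cv2.split(bgr_image.astype(np.float32) / 255.0)
    exg = 2 * g - r - b                                        # excess green index
    exg_8u = cv2.normalize(exg, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    _, mask = cv2.threshold(exg_8u, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)      # remove small specks
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)     # fill small holes
    return mask

# cutout = cv2.bitwise_and(image, image, mask=segment_vegetation(image))
```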
Gallery
Synthetic Data
TODO:
implement species class labels
add transformations to cutouts, pots, and backgrounds
make more pots
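For context, synthetic images are assembled by compositing plant cutouts onto background images; a minimal sketch assuming cutouts are RGBA PNGs with transparent backgrounds (paths and parameters are hypothetical, and the transformations above are still to be added):

```python
import random
from PIL import Image

def composite(background_path, cutout_paths, n_cutouts=5):
    """Paste randomly placed, randomly rotated plant cutouts onto a background
    (a simplified sketch; pots and richer transformations are TODO items above)."""
    canvas = Image.open(background_path).convert("RGBA")
    for path in random.sample(cutout_paths, min(n_cutouts, len(cutout_paths))):
        cutout = Image.open(path).convert("RGBA").rotate(random.uniform(0, 360), expand=True)
        x = random.randint(0, max(0, canvas.width - cutout.width))
        y = random.randint(0, max(0, canvas.height - cutout.height))
        canvas.alpha_composite(cutout, (x, y))
    return canvas.convert("RGB")
```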
Models
TBD