Semantic Segmentation in the SemiF-AnnotationPipeline

Model Documentation: DeepLabV3+ for Semantic Segmentation

Version: v6.0 | Date: 02/09/2025
Author(s): Matthew Kutugata

1. Overview

Model Name: DeepLabV3+ (ResNet-152 Backbone)
Purpose: Classify each pixel in SemiF images as background or plant to support automatic annotation.
Pipeline Integration: Used in the preprocessing stage of the AgIR SemiField pipeline to generate semantic masks.
Status: Deployed


2. Model Details

2.1. Input & Output

  • Input: RGB images (JPEG) at 512x512 resolution

  • Output: Segmentation mask with integer class labels

2.2. Training Data

  • Dataset Source: SemiF Agricultural Image Repository (AgIR) and synthetic data

  • Train Images: 75,399

  • Validation Images: 9,425

  • Test Images: 9,418

  • Annotations Used: Pixel-wise labeled masks for plants and background

  • Preprocessing Steps:

    • Normalization: mean=[0.398, 0.380, 0.293], std=[0.186, 0.183, 0.153]
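The normalization step can be sketched as below. This is a minimal illustration, not the pipeline's actual code: it assumes the statistics are in RGB order and apply to pixel values already scaled to [0, 1], which the magnitude of the means suggests but the document does not state. The `normalize` helper and the dummy image are hypothetical.

```python
import numpy as np

# Channel statistics quoted in this document (RGB order is an assumption).
MEAN = np.array([0.398, 0.380, 0.293], dtype=np.float32)
STD = np.array([0.186, 0.183, 0.153], dtype=np.float32)


def normalize(image_uint8: np.ndarray) -> np.ndarray:
    """Scale an HxWx3 uint8 image to [0, 1], then standardize each channel."""
    scaled = image_uint8.astype(np.float32) / 255.0
    # Broadcasting applies the per-channel mean and std along the last axis.
    return (scaled - MEAN) / STD


# A uniform mid-gray image stands in for a real SemiF frame.
dummy = np.full((4, 4, 3), 128, dtype=np.uint8)
normalized = normalize(dummy)
```

A framework-specific transform would normally replace this helper; the NumPy version only makes the arithmetic explicit.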

2.3. Model Architecture & Hyperparameters

  • Base Model: DeepLabV3+ with a ResNet-152 backbone

  • Fine-tuned or Trained from Scratch? Fine-tuned on agricultural datasets

  • Key Hyperparameters:

    • Learning Rate: 1e-3

    • Batch Size: 32

    • Epochs: 50

    • Optimizer: Adam

    • Loss Function: Dice Loss
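For reference, a soft Dice loss for a single binary mask can be written as follows. This is a sketch of the general technique in NumPy, chosen for readability; the exact variant used in training (smoothing term, multi-class averaging, framework implementation) is not specified in this document, so the `dice_loss` function and its `eps` parameter are assumptions.

```python
import numpy as np


def dice_loss(probs: np.ndarray, target: np.ndarray, eps: float = 1e-6) -> float:
    """Soft Dice loss: 1 - (2 * |P * T| + eps) / (|P| + |T| + eps)."""
    probs = probs.astype(np.float64).ravel()
    target = target.astype(np.float64).ravel()
    intersection = np.sum(probs * target)
    denominator = np.sum(probs) + np.sum(target)
    # eps (an assumed smoothing constant) avoids division by zero on empty masks.
    return float(1.0 - (2.0 * intersection + eps) / (denominator + eps))


target = np.array([[1, 1], [0, 0]])
perfect = dice_loss(target, target)                      # prediction equals target
half = dice_loss(np.array([[1, 0], [0, 1]]), target)     # one of two pixels overlaps
```

A perfect prediction gives a loss of 0, and the loss grows toward 1 as the overlap between prediction and target shrinks.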


3. Model Performance & Evaluation

3.1. Metrics

  • mIoU (Mean Intersection over Union): 92.6%

  • Pixel Accuracy: 92.4%

  • Precision / Recall / F1-score:

    • Crops: P=85.3%, R=80.7%, F1=82.9%

    • Weeds: P=74.2%, R=71.6%, F1=72.9%
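The metrics above can all be derived from a per-class confusion matrix. The sketch below shows one common way to compute them; it is not the evaluation code used to produce the reported numbers. The three-class label set in the example (0 = background, 1 = crop, 2 = weed) and the function names are hypothetical.

```python
import numpy as np


def confusion_matrix(pred, truth, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    pred = np.asarray(pred, dtype=np.int64).ravel()
    truth = np.asarray(truth, dtype=np.int64).ravel()
    idx = truth * num_classes + pred
    return np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)


def segmentation_metrics(cm):
    tp = np.diag(cm).astype(np.float64)
    fp = cm.sum(axis=0) - tp   # predicted as class c, but truth differs
    fn = cm.sum(axis=1) - tp   # truth is class c, but predicted otherwise
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = tp / (tp + fp + fn)
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
    return {
        "mIoU": float(np.nanmean(iou)),          # classes absent everywhere are skipped
        "pixel_accuracy": float(tp.sum() / cm.sum()),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy example with hypothetical labels: 0 = background, 1 = crop, 2 = weed.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
metrics = segmentation_metrics(confusion_matrix(pred, truth, num_classes=3))
```

Accumulating one confusion matrix over the whole test set and computing the metrics once at the end avoids averaging per-image scores, which would weight small images and rare classes differently.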

3.2. Limitations & Known Issues

  • Struggles with shadows and occlusions in dense crop fields

  • Requires high-resolution images for best results


4. Deployment & Integration

4.1. Deployment Details

  • Environment: HPC Cluster (SLURM) and Local GPU Workstation

  • Inference Hardware: NVIDIA A100 (HPC) / RTX 4090 (Local)

  • Model Format: PyTorch .pth file

  • Inference Speed: ~85 ms per 1024x1024 image

4.2. Pipeline Integration

  • Preprocessing Dependencies:

    • Image resizing to 1024x1024

    • Normalization

  • Postprocessing Steps:

    • Convert mask to class-wise labeled PNG

    • Overlay with input image for visualization

  • Where is it Used in the Pipeline?

    • Step 2: Semantic segmentation of crops and weeds

  • Integration Code Snippet:
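The pipeline's actual integration code is not included in this document. As a stand-in, the following is a minimal sketch of the two postprocessing steps listed above: mapping an integer label mask to colors and blending it over the input image. The palette, the class indices, the `alpha` value, and the function names are all assumptions made for illustration; model loading and inference are deliberately left out.

```python
import numpy as np

# Hypothetical palette: row index = class label, value = RGB color.
PALETTE = np.array(
    [[0, 0, 0],      # 0: background
     [0, 255, 0],    # 1: assumed plant/crop class
     [255, 0, 0]],   # 2: assumed weed class
    dtype=np.uint8,
)


def colorize(mask: np.ndarray) -> np.ndarray:
    """Map an HxW integer label mask to an HxWx3 color image."""
    return PALETTE[mask]


def overlay(image: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Alpha-blend class colors over the image, leaving background pixels untouched."""
    base = image.astype(np.float32)
    blended = (1.0 - alpha) * base + alpha * colorize(mask).astype(np.float32)
    out = np.where(mask[..., None] > 0, blended, base)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


# Tiny synthetic inputs stand in for a resized image and a predicted mask.
image = np.full((2, 2, 3), 100, dtype=np.uint8)
mask = np.array([[0, 1], [2, 0]])
vis = overlay(image, mask)
```

Both the colorized mask and the overlay are plain `uint8` arrays, so either can be written to PNG with whichever imaging library the pipeline already uses.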


5. Maintenance & Future Improvements

  • Retraining Requirements: Every 6 months, using newly collected field imagery

  • Planned Improvements:

    • Fine-tune with more diverse lighting conditions

    • Implement real-time inference with TensorRT for Jetson devices

    • Add multi-spectral image support

  • Responsible Team / Contact: [Your Name] ([Your Email])


6. References & Additional Resources

  • Model Training Logs: [Link to TensorBoard logs]

  • Code Repository: [GitHub or internal repo link]
