
This document describes the structure of the weedsimagerepo Azure storage account and how the preprocessing metadata tables relate to each other.

Storage structure

There are 5 tables and 1 blob of interest for the purpose of reporting back to partners. The blob contains the image files, which are saved in two formats, JPG and ARW, whereas the tables contain all the preprocessing metadata associated with those images.

Tables

The metadata is organized as a relational database.

Even though the idea was to avoid data repetition, that is not the case in this storage: several fields are duplicated across tables. It is also important to note that, although there are 3 levels of data organization, the tables are not related only between adjacent levels: the 1st and 3rd level tables are related directly, as are the 1st and 2nd and the 2nd and 3rd.

image-20240906-161152.png

These are the 5 tables of relevance for reporting back to our partners and brief descriptions of their contents:

  • wirmastermeta: first level table. Below are the most relevant fields that it contains:

    • PartitionKey (string, autogenerated): Azure storage name

    • RowKey (string, autogenerated): unique id for each table entry. Used to relate this table to others.

    • UsState (string, user input, dropdown): unique partner code. The partner codes are inconsistent: some states have more than 1 partner, so some codes have 2 characters and others have 4. I would like to change this so that all codes have 4 characters: the 2 letters of the state followed by 2 digits (01, 02, etc.). The back-end partner codes are called affiliations in the front end, where they are formed from the US state initials plus the primary investigator’s last name for that group (e.g. MD-Mirsky).

    • PlantType (string, user input, dropdown): Three plant categories, all upper case, no spaces.

    • CloudCover (string, user input, dropdown):

    • GroundResidue (string, user input, dropdown and type): type of ground residue, e.g. previous crop in the rotation.

    • GroundCover (string, user input, dropdown): 5 ranges from 0 to 100% coverage.

    • Timestamp (date, autogenerated): date and time of upload to this storage.

    • Username (string, user input, type): this one is a free-for-all; we didn’t ask the users to enter anything specific. In some cases they entered a name, in others just a letter or initials. There are also empty cells due to an early version of the app that didn’t require the users to complete this field. There can be multiple usernames per partner code.

    • WeedsOrCrops:

  • wircovercropsmeta: second level table. Contains PlantType = COVERCROP only data. Things to note about the data in this table: PartitionKey and Affiliation both contain the same information, and this information already exists in the higher level table as UsState. CloudCover, GroundResidue and GroundCover are also repeated from the higher level table.

    • FlowerFruitOrSeeds (Boolean, user input, multiple choice): whether or not reproductive organs are present.

    • CoverCropSpecies (string, user input, dropdown): species of cover crops specifically selected for this repository.

    • CoverCropFamily (string, user input, dropdown): category of cover crop.

There is no reason for this table to exist since all the distinct variables that exist here belong in the higher level table.
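The redundancy described above can be checked before any consolidation. The following is a minimal sketch, assuming the table rows have already been exported as Python dictionaries keyed by field name. The function names, the sample rows and the way a child row is paired with its master row are hypothetical, since this page does not document the linking field for wircovercropsmeta.

```python
# Fields that wircovercropsmeta repeats from wirmastermeta, per this page.
REPEATED_FIELDS = ("CloudCover", "GroundResidue", "GroundCover")


def find_affiliation_mismatches(rows):
    """Return the rows whose PartitionKey and Affiliation values differ."""
    return [row for row in rows if row.get("PartitionKey") != row.get("Affiliation")]


def repeated_field_conflicts(master_row, child_row, fields=REPEATED_FIELDS):
    """Return the repeated fields whose values disagree between two rows.

    The caller is responsible for pairing each child row with its master row.
    """
    return [name for name in fields if master_row.get(name) != child_row.get(name)]


# Made-up sample rows, for illustration only.
covercrop_rows = [
    {"PartitionKey": "MD", "Affiliation": "MD", "GroundCover": "A"},
    {"PartitionKey": "MD", "Affiliation": "NC", "GroundCover": "B"},
]
master_row = {"CloudCover": "X", "GroundResidue": "Y", "GroundCover": "A"}
child_row = {"CloudCover": "X", "GroundResidue": "Y", "GroundCover": "B"}

mismatches = find_affiliation_mismatches(covercrop_rows)
conflicts = repeated_field_conflicts(master_row, child_row)
```

If both checks come back empty on the real data, the repeated columns could be dropped from the second level table without losing information.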

  • wircropsmeta: second level table. Contains PlantType = CASHCROPS only data.

    • PartitionKey (string, autogenerated): azure storage name

    • RowKey (string, autogenerated): unique id for each table entry. I don’t see the use for this column: this uid is not used to relate this table to others, nor is it present in the blob.

    • CropName (string, user input, dropdown): cash crop name, 3 categories.

    • MasterRefID (string, autogenerated): unique id for each table entry. Used to relate this table to others.

    • SizeClass (string, user input, dropdown): determined by the size of the target plant. This column was added in the second year of image collection; previously we used Height.

    • Height (string, user input, dropdown): ranges of heights. Determined by the size of the target plant. This field was only used in the first year of image collection and was later replaced by SizeClass.

    • GrowthStage (string, user input, dropdown): growth stages for cotton. This field was meant to be used only for cotton; this still needs to be checked in the app.

    • Timestamp (date, autogenerated): is the date and time of upload to this storage.

    • CottonVariety (string, user input, dropdown): cotton varieties which were specifically selected to be included in this repository.

There is no reason for this table to exist since all the distinct variables that exist here belong in the higher level table.
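The relation between the first and second level tables can be sketched as a simple join. This is a minimal example, assuming that MasterRefID in wircropsmeta holds the RowKey of the parent entry in wirmastermeta (this page only says that both fields are used to relate tables) and that the rows have been exported as Python dictionaries. The function name and the sample values are hypothetical.

```python
def join_crops_to_master(master_rows, crop_rows):
    """Attach each wircropsmeta row to its wirmastermeta parent.

    Assumption: crop_row["MasterRefID"] equals master_row["RowKey"].
    Returns (joined_rows, orphan_crop_rows).
    """
    master_by_key = {row["RowKey"]: row for row in master_rows}
    joined, orphans = [], []
    for crop in crop_rows:
        parent = master_by_key.get(crop.get("MasterRefID"))
        if parent is None:
            # No master entry found for this crop row.
            orphans.append(crop)
            continue
        # Prefix the keys so fields present in both tables
        # (RowKey, PartitionKey, Timestamp) stay distinct.
        merged = {f"master_{key}": value for key, value in parent.items()}
        merged.update({f"crop_{key}": value for key, value in crop.items()})
        joined.append(merged)
    return joined, orphans


# Made-up sample rows, for illustration only.
master_rows = [
    {"RowKey": "m-001", "UsState": "MD", "PlantType": "CASHCROPS"},
    {"RowKey": "m-002", "UsState": "MD", "PlantType": "WEEDS"},
]
crop_rows = [
    {"RowKey": "c-001", "MasterRefID": "m-001", "CropName": "Cotton"},
    {"RowKey": "c-002", "MasterRefID": "m-999", "CropName": "Cotton"},
]

joined, orphans = join_crops_to_master(master_rows, crop_rows)
```

Rows that end up in the orphan list point to a master entry that does not exist, which is worth checking before reporting back to partners.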

  • wirweedsmeta: second level table. Contains PlantType = WEEDS only data.

  • wirimagerefs
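The 4-character partner code proposed for UsState in wirmastermeta (2 state letters followed by 2 digits) could be validated with a short helper. This is only a sketch of the proposal, not something the app currently does. The function names are hypothetical, and giving "01" to every existing 2-character code is an assumption; the real numbering for states with several partners would have to come from a lookup maintained by the team.

```python
import re

# Proposed format: 2 state letters followed by 2 digits, e.g. "MD01".
PROPOSED_CODE = re.compile(r"[A-Z]{2}[0-9]{2}")
STATE_ONLY = re.compile(r"[A-Z]{2}")


def is_proposed_code(code):
    """Return True if the code already follows the proposed 4-character format."""
    return PROPOSED_CODE.fullmatch(code) is not None


def to_proposed_code(code, default_suffix="01"):
    """Convert a back-end partner code to the proposed 4-character format.

    Assumption: a 2-letter code receives default_suffix. Anything that is
    neither a 2-letter nor a valid 4-character code raises ValueError.
    """
    cleaned = code.strip().upper()
    if PROPOSED_CODE.fullmatch(cleaned):
        return cleaned
    if STATE_ONLY.fullmatch(cleaned):
        return cleaned + default_suffix
    raise ValueError(f"Unrecognized partner code: {code!r}")


def split_affiliation(affiliation):
    """Split a front-end affiliation such as "MD-Mirsky" into (state, last name)."""
    state, _, investigator = affiliation.partition("-")
    return state, investigator
```

For example, `to_proposed_code("MD")` yields `"MD01"` under the stated assumption, and `split_affiliation("MD-Mirsky")` yields `("MD", "Mirsky")`.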

Blob:

  • weedsimagerepo: contains all the image files. These files are related to the metadata saved in the tables by the field “Name”.
