Pipeline structure
Each stage needs a hash.
Create a new config entry
- check that the config does not already exist
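The hashing and duplicate check above could be sketched as follows. This is a minimal sketch under assumptions: the hash is taken over the JSON encoding of the config with sorted keys, and run.csv is assumed to have a `config_hash` column (both names are hypothetical, not confirmed by the notes).

```python
import csv
import hashlib
import json

def config_hash(config):
    """Deterministic hash of a config dict (sorted keys, JSON encoding)."""
    blob = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()

def config_already_registered(run_csv_path, cfg_hash):
    """Check run.csv for an existing entry with the same config hash.
    The column name 'config_hash' is an assumption."""
    try:
        with open(run_csv_path, newline="") as fh:
            return any(row.get("config_hash") == cfg_hash
                       for row in csv.DictReader(fh))
    except FileNotFoundError:
        return False  # no run.csv yet, so nothing is registered
```

Sorting the keys makes the hash independent of dict insertion order, so two configs with the same content always collide, which is what the duplicate check needs.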
Segmentation of all features
- register the input-files hash (into run.csv)
- register the config hash (into run.csv)
- register the segmentation hash (into run.csv)
- segment all features
- label the features and record: npixels, width, height, bl (bottom-left), tr (top-right)
- add total_nbr_features to run.csv
- add nbr_features to plate.csv
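The per-feature bookkeeping could look like the sketch below. It assumes the segmentation step already produced a labeled 2D integer array (0 = background, 1..N = features), e.g. as `skimage.measure.label` would return; the convention that row 0 is the top of the image (so `bl` uses the maximum row index) is an assumption.

```python
import numpy as np

def feature_properties(labels):
    """Per-feature npixels, width, height and bounding-box corners
    (bl = bottom-left, tr = top-right) from a labeled 2D array,
    where 0 is background and 1..N are feature labels."""
    props = {}
    for lab in np.unique(labels):
        if lab == 0:
            continue  # skip background
        rows, cols = np.nonzero(labels == lab)
        props[int(lab)] = {
            "npixels": rows.size,
            "width": int(cols.max() - cols.min() + 1),
            "height": int(rows.max() - rows.min() + 1),
            "bl": (int(rows.max()), int(cols.min())),  # bottom-left, row 0 = top
            "tr": (int(rows.min()), int(cols.max())),  # top-right
        }
    return props
```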
Compute properties of selected features
- register the properties hash (into run.csv)
- select the meaningful features: more than 25 and fewer than 1000 pixels (both configurable parameters)
- add nbr_meaningful_features to plate.csv
- compute the properties of the meaningful features (see config) and add them to features.csv
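The size filter above is simple enough to pin down in code. A minimal sketch, assuming features are dicts carrying an `npixels` field and that the 25/1000 bounds are strict (the notes say "more than 25" and "less than 1000"); the function and field names are hypothetical.

```python
def select_meaningful(features, min_pix=25, max_pix=1000):
    """Keep features whose pixel count lies strictly between the bounds.
    min_pix/max_pix are the configurable parameters from the notes."""
    return [f for f in features if min_pix < f["npixels"] < max_pix]
```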
Generate training set (open question: until now this was done by hand)
- register the training-selection hash (into run.csv)
- verify that the config is compatible with the data (filter conditions against the segmentation data...)
- select plates with fewer than 100 (meaningful?) features (good plates)
- ask the AI whether more than 90% of these are spots; if yes, select only the >90% features (?? is that a good idea)
- or select features by circularity and other conditions (set by config)
- select plates with more than 500 features (bad plates)
- ask the AI whether less than 10% of these are spots; if yes, select only the <10% features (?? is that a good idea)
- or select features with mean_intensity above 0.8 and other conditions (set by config)
- add spot_probability to the features.csv data (0.0 for non-spots and 1.0 for spots)
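One plausible reading of the good-plate/bad-plate selection above, as a sketch: features from good plates (fewer than 100 meaningful features) that the AI judges to be mostly spots get spot_probability 1.0, and features from bad plates (more than 500 features) judged to be mostly non-spots get 0.0. `ai_spot_fraction` is a hypothetical stand-in for the AI query, and all field names are assumptions.

```python
def build_training_set(plates, ai_spot_fraction):
    """Assemble training samples from clearly good and clearly bad plates.
    ai_spot_fraction(plate) -> fraction of the plate's features that are
    spots, as judged by the AI (hypothetical interface)."""
    training = []
    for plate in plates:
        n = plate["nbr_meaningful_features"]
        frac = ai_spot_fraction(plate)
        if n < 100 and frac > 0.9:
            label = 1.0   # good plate, dominated by spots
        elif n > 500 and frac < 0.1:
            label = 0.0   # bad plate, dominated by non-spots
        else:
            continue      # ambiguous plate, not used for training
        for feature in plate["features"]:
            training.append({**feature, "spot_probability": label})
    return training
```

Ambiguous plates are simply dropped here; whether that loses too much data is part of the open question flagged in the notes.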
Validate the training set (open question: how?)
Train the AI
- register the training hash (into run.csv)
- register the parent training hash (into run.csv)
- set up the network (according to config)
- preprocess the training set according to config
- separate the training set from the validation set
- train the AI according to config
- save the training history and various metric plots
- save the trained AI
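The training/validation split above could be sketched as a deterministic shuffle-and-split; the fraction and seed would come from the run config (the parameter names here are assumptions, not from the notes).

```python
import random

def split_train_validation(samples, val_fraction=0.2, seed=42):
    """Shuffle deterministically, then split off a validation set.
    A fixed seed keeps the split reproducible across runs that share a config."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_val = int(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]
```

Because the split is a pure function of the config, rerunning a stage with the same hash reproduces the same training and validation sets, which fits the hash-based bookkeeping of the pipeline.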
Run the AI on the full dataset
- register the selection hash (into run.csv)
- set up the network by loading the trained AI
- select the full dataset from the config
- run the AI on the selected dataset
Validate the results
- generate publishable plots of the spots (with and without contour)
- generate publishable tables (.csv) and metadata (.yml)
- generate a graph of the daily average of spots
- generate a graph of the monthly average of spots
- generate a graph of the yearly average of spots
- group the spots into solar groups (labels)
- look at statistics (area/perimeter and other statistical properties of the solar spots)
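The daily/monthly/yearly averaging graphs above share one aggregation step, which could be sketched as below. It assumes the input is a list of (date, spot count) pairs; the function name and data shape are hypothetical.

```python
from collections import defaultdict
from datetime import date

def average_spots(daily_counts, key):
    """Average daily spot counts over buckets given by `key`, e.g.
    key=lambda d: (d.year, d.month) for the monthly graph,
    key=lambda d: d.year for the yearly graph."""
    sums = defaultdict(lambda: [0.0, 0])
    for day, nspots in daily_counts:
        bucket = sums[key(day)]
        bucket[0] += nspots  # running total of spots in this bucket
        bucket[1] += 1       # number of days contributing
    return {k: total / n for k, (total, n) in sums.items()}
```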
Edited by Yori Fournier