Creating Deep Learning Model

Contents:

  1. Introduction
  2. Workflow
  3. Detecting anomalies 1
  4. Detecting anomalies 2
  5. Detecting features
  6. Classifying objects
  7. Segmenting instances
  8. Locating points

Introduction

Deep Learning editors are dedicated graphical user interfaces for DeepModel objects (which represent training results). Each time a user opens such an editor, they can add or remove images, adjust parameters and perform new training.

Since version 4.10 it has also been possible to open a Deep Learning Editor as a stand-alone application, which is especially useful for re-training models with new images in a production environment.

Requirements:

  • A DeepLearning license is required to use Deep Learning editors and filters.
  • Deep Learning Service must be up and running to perform model training.

Currently five deep learning tools are available:

  1. Anomaly Detection – for detecting unexpected object variations; trained with sample images marked simply as good or bad.
  2. Feature Detection – for detecting regions of defects (such as surface scratches) or features (such as vessels on medical images); trained with sample images accompanied with precisely marked ground-truth regions.
  3. Object Classification – for identifying the name or the class of the most prominent object on an input image; trained with sample images accompanied with the expected class labels.
  4. Instance Segmentation – for simultaneous location, segmentation and classification of multiple objects in the scene; trained with sample images accompanied with precisely marked regions of each individual object.
  5. Point Location – for location and classification of multiple key points; trained with sample images accompanied with marked points of expected classes.

Technical details about these tools are available in Machine Vision Guide: Deep Learning, while this article focuses on the training graphical user interface.

Workflow

You can open a Deep Learning Editor via:

  • a filter in Adaptive Vision Studio:
    1. Place the relevant Deep Learning filter in the Program Editor.
    2. Go to its Properties.
    3. Click on the button next to the inDeepModel parameter.

  • a standalone Deep Learning Editor application:
    1. Open the standalone Deep Learning Editor application (available as "DeepLearningEditor.exe" in the Adaptive Vision Studio installation folder, in the Adaptive Vision folder in the Start menu, or in the Tools menu of Adaptive Vision Studio).
    2. Choose whether you want to create a new model or use an existing one:
      • Creating a new model: Select the relevant tool for your model and press OK, then select or create a new folder where files for your model will be contained and press OK.
      • Choosing an existing model: Navigate to the folder containing your model files – either type a path, click the button next to the field to browse to it, or select one of the recent paths if there are any; then press OK.

The Deep Learning model preparation process is usually split into the following steps:

  1. Loading images – load training images from disk
  2. Labeling images – mark features or attach labels on each training image
  3. Setting the region of interest (optional) – select the area of the image to be analyzed
  4. Adjusting training parameters – select training parameters, preprocessing steps and augmentations specific for the application at hand
  5. Training the model and analyzing results

Overview of a Deep Learning Editor

Detecting anomalies 1 (reconstructive approach)

In this tool the user only needs to mark which images represent correct cases (good) and which incorrect ones (bad).

1. Marking Good and Bad samples

Use the Label as 'Good' or Label as 'Bad' buttons to label each image in the training set. Green and red icons on the right side of the training images indicate which set each image belongs to.

Labeled images in Deep Learning Editor

2. Configuring augmentations

It is usually recommended to add some sample augmentations, especially when the training set is small. For example, the user can add additional variations in pixel intensity to prepare the model for varying lighting conditions on the production line. Refer to the "Augmentation" section for a detailed description of parameters: Deep Learning – Augmentation.

3. Reducing region of interest

Reduce the region of interest to focus only on the important part of the image. Reducing the region of interest will speed up both training and inference.

Hint: It is highly recommended to use the same region of interest for training and inference.

By default the region of interest contains the whole image.

4. Setting training parameters

  • Network type – selecting between Global and Local network types.
  • Network depth – predefined network architecture parameter. For more complex problems higher depth might be useful.
  • Approach (for the Local type only) – selecting between Contextual and Denoising approach to defect removal.
  • Feature size (for the Local type only) – the width of the inspection window; it should be greater than a typical anomaly diameter.

For more details read Deep Learning – Setting parameters.

5. Performing training

During training, two figures are successively displayed: training error and validation error. Both charts should have a similar pattern.

More detailed information is displayed below the chart:

  • current training statistics (training and validation),
  • number of processed samples (depends on the number of images and feature size),
  • estimated training time.

The training process may take a long time, depending on the selected stopping criteria and available hardware. It can be stopped manually at any time by clicking the Stop button. If no model is present (the first training attempt), the model with the best validation accuracy will be saved. Consecutive training attempts will prompt the user about replacing the old model.
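The model-saving rule described above can be sketched as follows. This is an illustrative Python sketch, not the actual product code; `run_iteration` and `stop_requested` are hypothetical stand-ins for one training pass and the Stop button:

```python
def train(iterations, run_iteration, stop_requested=lambda: False):
    """Keep the snapshot with the best validation accuracy seen so far,
    even if training is stopped manually before the last iteration."""
    best_model, best_accuracy = None, float("-inf")
    for i in range(iterations):
        model, val_accuracy = run_iteration(i)  # one pass over the training set
        if val_accuracy > best_accuracy:
            best_model, best_accuracy = model, val_accuracy
        if stop_requested():
            break  # manual Stop: the best snapshot so far is still returned
    return best_model, best_accuracy
```

Note that the returned model is the one with the best validation accuracy, which is not necessarily the one from the last iteration.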

6. Analyzing results

The window shows a histogram of sample scores and a heatmap of found defects. The left column contains a histogram of scores computed for each image in the training set. Additional statistics are displayed below the histogram.

The Classify and Classify All buttons can be used to evaluate the trained model. This can be useful after adding new images to the data set or after changing the area of interest.

After training two border values are computed:

  1. Maximum good sample score (T1) – all values from 0 to T1 are marked as Good.
  2. Minimum bad sample score (T2) – all values greater than T2 are marked as Bad.

All scores between T1 and T2 are marked as "Low quality". Results in this range are uncertain and may be incorrect. Filters provide an additional output, outIsConfident, which indicates whether a score lies outside the T1-T2 range.
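The decision rule above can be sketched as follows (an illustrative Python sketch; the function name is hypothetical, and the boolean flag mirrors the outIsConfident idea rather than the actual filter API):

```python
def classify_score(score, t1, t2):
    """Interpret an anomaly score against the two trained thresholds:
    t1 = maximum good sample score, t2 = minimum bad sample score."""
    if score <= t1:
        return "Good", True        # confident result
    if score > t2:
        return "Bad", True         # confident result
    return "Low quality", False    # uncertain band between T1 and T2
```

For example, with t1 = 0.3 and t2 = 0.7, a score of 0.5 falls into the uncertain band and would be reported as not confident.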

In the top left corner of the editor a small rectangle visualizes the selected feature size.

After evaluation, additional filtering options may be used in the list of training images.

Detecting anomalies 2 (classificational approach)

DeepLearning_DetectAnomalies2 is a second filter which can be used for detecting anomalies. It is designed to solve the same problem, but in a different way. Instead of using image reconstruction techniques, Anomaly Detection 2 performs one-class classification of each part of the input image.

As both tools are very similar, the steps of creating a model are the same. There are only a few differences between these filters in the model parameters section. In the case of DeepLearning_DetectAnomalies2 the user does not need to change the iteration count or network type. Instead, it is possible to set the Sampling density, which defines the step of analysis using the inspection window. The higher the sampling density, the more precise the heatmaps, but the longer the training and inference times.
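The effect of the sampling step can be illustrated with a small sketch. This is hypothetical Python; the actual editor derives the step from the Sampling density setting internally:

```python
def window_positions(image_size, window_size, step):
    """Positions of the inspection window along one image axis.
    A smaller step (higher sampling density) yields more positions,
    hence a more precise heatmap but longer training and inference."""
    last = image_size - window_size
    positions = list(range(0, last + 1, step))
    if positions[-1] != last:
        positions.append(last)  # always cover the right/bottom edge
    return positions
```

Halving the step roughly doubles the number of analyzed windows per axis, which is why higher sampling density costs more time.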

The resulting heatmap is usually not as spatially accurate as with the reconstructive anomaly detection, but the score accuracy and group separation on the histogram may be much better.

Detecting features (segmentation)

In this tool the user has to define each feature class and then mark features on each image in the training set. This technique is used to find object defects like scratches or color changes, and to detect image parts matching a trained pattern.

1. Defining feature classes (Marking class)

First, the user has to define classes of defects. Generally, these should be the features which the user would like to detect in images. Multiple classes can be defined, but it is not recommended to use more than a few.

The class editor is available under the sprocket wheel icon on the bottom bar.

To manage classes, use the Add, Remove or Rename buttons. To customize appearance, the color of each class can be changed using the Change Color button.

The current class for editing is displayed on the left; the user can select a different class by clicking on it.

Use the drawing tools to mark features on the input images. Tools such as Brush or Rectangle can be used for selecting features.

In addition, class masks can be imported from external files. The Import and Export buttons make it possible to exchange class masks with external tools, so that mask images can be prepared automatically before building the Deep Learning model.

The image mask should have the same size as the selected image in the input set. When importing an image mask, all non-black pixels will be included in the current mask.
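The import rule can be sketched in a few lines (illustrative Python operating on a plain list of pixel rows, not the actual import code):

```python
def import_mask(mask_pixels):
    """Convert an imported mask image into a boolean class mask:
    every non-black (non-zero) pixel is included in the current mask."""
    return [[pixel != 0 for pixel in row] for row in mask_pixels]
```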

The user can also load multiple images and masks at the same time, using Add images and masks button.

The directory containing the input images should be selected first. Then, directories for each feature class can be selected below. Images and masks are matched automatically by their file names. For example, assume that the "images" directory contains 001.png, 002.png, 003.png; the "mask_class1" directory contains 001.png, 002.png, 003.png; and the "mask_class2" directory contains 001.png, 002.png, 003.png. Then the image "images\001.png" will be loaded together with the masks "mask_class1\001.png" and "mask_class2\001.png".
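The name-based matching described above can be sketched as follows (illustrative Python; directory contents are passed in as plain lists of file names rather than read from disk):

```python
def match_images_with_masks(image_files, mask_dirs):
    """Pair each training image with the mask of the same file name
    from every class directory; missing masks are recorded as None."""
    return {
        name: {cls: (name if name in files else None)
               for cls, files in mask_dirs.items()}
        for name in image_files
    }
```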

2. Reducing region of interest

The user can reduce the input image size to speed up the training process. In many cases the number of features on an image is very large and most of them are alike. In such cases the region of interest can also be reduced.

On the bottom bar there are tools for applying current ROI to all images as well as for resetting the ROI.

3. Setting training parameters

  • Network depth – chooses one of several predefined network architectures varying in their complexity. For bigger and more complex image patterns a higher depth might be necessary.
  • Iteration count – the maximal number of times that a training set of samples is fully processed to train the model.
  • Patch size – the size of an image part that will be analysed with one pass through the neural network. It should be significantly bigger than any feature of interest, but not too big – as the bigger the patch size, the more difficult and time consuming is the training process.

For more details please refer to Deep Learning – Setting parameters and Deep Learning – Augmentation.

4. Model training

The chart contains two series: training score and validation score. A higher score indicates better results.

5. Result analysis

Image scores (heatmaps) are presented in a blue-yellow-red color palette after the model is used to evaluate an image. The color represents the probability that an element belongs to the currently selected feature class.

Classify and Classify All buttons can be used to evaluate images. It can be useful after adding new images to the data set or after changing the area of interest.

Image after classification performed using a reduced region of interest.

In the top left corner of the editor, a small rectangle visualizes the selected patch size.

Classifying objects

In this tool, the user only has to label images with respect to a desired number of classes. The labeled images are used to train a model that determines features used to evaluate new samples and assign them to the proper classes.

1. Editing number of classes

By default two classes are defined. If the problem is more complex, the user can edit the classes and define more if needed. Once the class definitions are ready, images can be labeled.

Using Class Editor.

2. Labeling samples

Labeling samples is possible after adding training images. Each image has a corresponding drop-down list which allows assigning a specific class. It is also possible to assign a single class to multiple images by selecting the desired images in the Deep Learning Editor.

Labeling images with classes.

3. Reducing region of interest

Reduce the region of interest to focus only on the important part of the image. Reducing the region of interest will speed up both training and classification. By default the region of interest contains the whole image.

To get the best classification results, use the same region of interest for training and classification.

Changed region of interest.

4. Setting training parameters

  • Iteration count – defines the maximal number of times that all samples will be used to train the network.
  • Network depth – predefined network architecture parameter. For more complex problems higher depth might be necessary.
  • Detail level – the level of detail needed for a particular classification task. For the majority of cases the default value of 1 is appropriate, but if images of different classes are distinguishable only by small features, increasing the value of this parameter may improve classification results.

For more details please refer to Deep Learning – Setting parameters and Deep Learning – Augmentation.

5. Performing training

During training two series are visible: training accuracy and validation accuracy. Both charts should have a similar pattern.

More detailed information is displayed below the chart:

  • current training statistics (training and validation accuracy),
  • number of processed samples (depends on the number of images),
  • estimated training time.

Training object classification model.

The training process can take a couple of minutes or even longer. It can be finished manually if needed. The final result of a training is the partial model that achieved the highest validation accuracy (not necessarily the last one). Consecutive training attempts will prompt the user whether to save the new model or keep the old one.

6. Analysing results

The window shows a confusion matrix which indicates how well the training samples have been classified.

The image view contains a heatmap which indicates which part of the image contributed the most to the classification result.

Classify and Classify All buttons can be used to classify training images. It can be useful after adding new images to the data set or after changing the area of interest.

Confusion matrix and class assignment after the training.

Sometimes it is hard to guess the right parameters in the first attempt. The picture below shows confusion matrix that indicates inaccurate classification during the training (left).

Confusion matrices for model that needs more training (left) and for model well trained (right).

The confusion matrix may indicate that the trained model is not 100% accurate with respect to the training samples (counts appearing exclusively on the main diagonal represent 100% accuracy). The user needs to analyze this data properly and use it to their advantage.

Confusion matrix indicating good generalization

Too many erroneous classifications indicate poor training. A few of them may indicate that the model is properly focused on generalization rather than on exact matching to the training samples (which could lead to overfitting). Good generalization can be achieved if the images used for training are varied (even within a single class). If the provided data is not varied within classes (the user expects exact matching), and some images are still classified outside the main diagonal after the training, the user can:

  • increase the network depth,
  • prolong training by increasing number of iterations,
  • increase amount of data used for training,
  • use augmentation,
  • increase the detail level parameter.
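Reading the confusion matrix can be sketched as follows (illustrative Python, assuming rows hold true classes and columns predicted classes; the function name is hypothetical):

```python
def diagonal_accuracy(matrix):
    """Share of samples on the main diagonal, i.e. correctly classified.
    Counts appearing exclusively on the diagonal mean 100% accuracy."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

Off-diagonal counts show exactly which class pairs the model confuses, which helps decide whether more data, deeper networks or augmentation is needed.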

Segmenting instances

In this tool the user needs to draw regions (masks) corresponding to the objects in the scene and specify their classes. These images and masks are used to train a model which is then used to locate, segment and classify objects in the input images.

1. Defining object classes

First, the user needs to define the classes of objects that the model will be trained on and will later detect. The instance segmentation model can deal with single as well as multiple classes of objects.

The class editor is available under the Class Editor button.

To manage classes, use the Add, Remove or Rename buttons. To customize appearance, the color of each class can be changed using the Change Color button.

Using Class Editor.

2. Labeling objects

After adding training images and defining classes, the user needs to draw regions (masks) to mark objects in the images.

To mark an object the user needs to select a proper class in the Current Class drop-down menu and click the Add Instance button.

Use the drawing tools to mark objects on the input images. Multiple tools, such as brush and shapes, can be used to draw object masks. Masks have the same color as previously defined for the selected classes.

The Marked Instances list in the top left corner displays the objects defined for the current image. If an object does not have a corresponding mask created in the image, it is marked as "(empty)". When an object is selected, a bounding box is displayed around its mask in the drawing area. A selected object can have its class changed (Change Class button) and its mask modified (by simply drawing new parts or erasing existing ones). The Remove Instance button completely removes a selected object.

Labeling objects.

3. Reducing region of interest

Reduce the region of interest to focus only on the important part of the image. By default the region of interest contains the whole image.

Changing region of interest.

4. Setting training parameters

  • Iteration count – defines the maximal number of times that all samples will be used to train the network.
  • Network depth – predefined network architecture parameter. For more complex problems higher depth might be necessary.

For more details read Deep Learning – Setting parameters.

For details regarding augmentation parameters see Deep Learning – Augmentation.

5. Performing training

During training two main series are visible: training error and validation error. Both charts should follow a similar pattern. If a training was run before, a third series with the previous validation error is also displayed.

More detailed information is displayed below the chart:

  • current iteration number,
  • current training statistics (training and validation error),
  • number of processed samples,
  • estimated training time.

Training instance segmentation model.

Training may be a long process and can be stopped at any time. If no model is present (the first training attempt), the model with the best validation accuracy will be saved. Consecutive training attempts will prompt the user whether to replace the old model.

6. Analysing results

The window shows the results of instance segmentation. Detected objects are displayed on top of the images. Each detection consists of the following data:

  • class (identified by a color),
  • bounding box,
  • model-generated instance mask,
  • confidence score.

Classify and Classify All buttons can be used to perform instance segmentation on the provided images. It can be useful after adding new images to the data set or after changing the area of interest.

Instance segmentation results visualized after the training.

Instance segmentation is a complex task, therefore it is highly recommended to use data augmentations to improve the network's ability to generalize learned information. If the results are still not satisfactory, the following standard methods can be used to improve model performance:

  • providing more training data,
  • increasing number of training iterations,
  • increasing the network depth.

Locating points

In this tool the user defines classes and marks key points in the image. This data is used to train a model which then is used to locate and classify key points in images.

1. Defining classes

First, the user needs to define the classes of key points that the model will be trained on and will later detect. The point location model can deal with single as well as multiple classes of key points.

Class editor is available under the Class Editor button.

To manage classes, use the Add, Remove or Rename buttons. The color of each class can be changed using the Change Color button.

Using Class Editor.

2. Marking key points

After adding training images and defining classes, the user needs to mark points in the images.

To mark a point, the user needs to select the proper class in the Current Class drop-down menu and click the Add Point button. Points have the same color as previously defined for the selected class.

The Selected Points list in the top left corner displays the points defined for the current image. A point can be selected either from the list or directly in the image area. A selected point can be moved, removed (Remove Point button) or have its class changed (Change Class button).

Marking points.

3. Reducing region of interest

Reduce the region of interest to focus only on the important part of the image and to speed up the training process. By default the region of interest contains the whole image.

Changing region of interest.

4. Setting training parameters

  • Iteration count – defines the maximal number of times that all samples will be used to train the network.
  • Network depth – predefined network architecture parameter. For more complex problems higher depth might be necessary.
  • Feature size – the size of a small object or of a characteristic part. If images contain objects of different scales, it is recommended to use a feature size slightly larger than the average object size, although experimenting with different values may be required to achieve optimal results.

For more details read Deep Learning – Setting parameters.

For details regarding augmentation parameters see Deep Learning – Augmentation.

5. Performing training

During training, two main series are visible: training error and validation error. Both charts should follow a similar pattern. If a training was run before, a third series with the previous validation error is also displayed.

More detailed information is displayed below the chart:

  • current iteration number,
  • current training statistics (training and validation error),
  • number of processed samples,
  • estimated training time.

Training point location model.

Training may be a long process and can be stopped at any time. If no model is present (the first training attempt), the model with the best validation accuracy will be saved. Consecutive training attempts will prompt the user whether to replace the old model.

6. Analysing results

The window shows the results of point location. Detected points are displayed on top of the images. Each detection consists of the following data:

  • visualized point coordinates,
  • class (identified by a color),
  • confidence score.

Classify and Classify All buttons can be used to perform point location on the provided images. It may be useful after adding new training or test images to the data set or after changing the area of interest.

Point location results visualized after the training.

It is highly recommended to use data augmentations (appropriate to the task) to improve the network's ability to generalize learned information. If the results are still not satisfactory, the following standard methods can be used to improve model performance:

  • changing the feature size,
  • providing more training data,
  • increasing number of training iterations,
  • increasing the network depth.
