Dataset Editor

Command: Editors > Dataset Editor

To create classifiers or estimators, you need datasets that contain knowledge about individual areas of an image (i.e. class attributes or concentrations). Epina ImageLab provides a tool to select individual pixels of an image and assign them to one of several classes and/or quantities. Class attributes and quantities can be combined in a single dataset.

The dataset editor provides the following controls:

Load spectral descriptors A set of spectral descriptors has to be loaded in order to be able to add data points. Clicking a particular line of the spectral descriptor list displays the corresponding image.
Load a dataset Loads a dataset from the dataset directory.
Store the dataset Stores the current dataset on disk. After clicking this button you have to specify a filename. Please note that all datasets should be saved to the default dataset directory. Storing them outside this folder makes it less convenient to select datasets later on.
Delete all data points Deletes the entire set of data points after confirmation.
No. Classes Defines the number of classes used in the particular dataset. Epina ImageLab supports up to 25 different classes.
No. Quant. Defines the number of user-defined quantities used in the particular dataset. Epina ImageLab supports up to 10 different quantitative values.
Mask Selection The dataset editor provides a tool to create random points either inside or outside a masked region (see the "Random data inside mask" and "Random data outside mask" buttons below). In addition, the random selection of data points with the lasso tool is restricted to non-masked regions.
Sample Size Controls the number of random data points to be generated when either adding points inside or outside of a mask, or within a lasso-enclosed region.
Random data inside mask Creates a set of random data within the masked region. The number of created data points is determined by the "sample size" control.
Random data outside mask Creates a set of random data outside the masked region. The number of created data points is determined by the "sample size" control.
Add selected data to the Spectral Collection Appends selected data points to the spectral collection. Be careful when using this command as it may easily spoil your spectral collection.
Export dataset Stores the current dataset in one of several formats on disk. You can select from CSV (Excel) format, plain text format (.TXT) and ASC format (DataLab). The lines of the exported dataset are the data points, the columns contain the descriptor values and the class flags.
Setup Image Properties Adjust the image settings (colors, pixel display, isolines, contour plot). Please note that the color scale and color palette can be adjusted by direct interaction with the color scale.
Zoom 1:1 Adjusts the zoom factor to display the entire image.
Zoom in Zooms into the image.
Zoom out Zooms out of the image.
Activate pan mode Activates the pan mode, which lets you drag the image across its viewport.
Zoom into a rectangular area Zooms into the image by drawing a rectangular area, which is enlarged to fill the data window after releasing the mouse button.
Spatial cursor Activating this button resets any previously selected function.
Add new points Click the image to add a point of the current class.
Add current location as data point Create a new data point at the current position of the cursor.
Add a random sample of points Fills the region enclosed by a lasso line with a number of randomly selected points. The number of points is controlled by the "Sample Size" control. Please note that the random points are only created in non-masked regions.
Add a random sample of points Fills a rectangular region with a number of randomly selected points. The number of points is controlled by the "Sample Size" control. Please note that the random points are only created in non-masked regions.
Move a point Move a data point by dragging it with the mouse.
Delete a point Click a point to delete it. Please note that deletions cannot be undone. Hint: to delete several points at once, you can use the dataset list. Select several points by clicking the corresponding lines while holding down the Ctrl key, then right-click and select "Delete Selected Data Points".
Change the class number When this option is active you can change the class number of any point to the class number defined by the "current class" control. Click a point in the image to change its class number.
Current Class Determines the class number of new points to be added. It also controls the target class when using the "Change the class number" tool.
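
The random-sampling tools described above (inside mask, outside mask, lasso, rectangle) all draw a fixed number of pixels from a candidate region. The following Python sketch illustrates that idea only; it is not ImageLab's actual implementation. It assumes the mask is a list of rows of booleans (True = masked), that points are drawn without repetition, and that the sample is capped at the number of available pixels. The function name and the seed parameter are hypothetical.

```python
import random

def random_points(mask, sample_size, inside=True, seed=None):
    """Pick up to sample_size random (x, y) pixels inside or outside a mask.

    mask is a list of rows; mask[y][x] is True for masked pixels.
    """
    rng = random.Random(seed)
    # Collect every pixel whose mask state matches the requested side.
    candidates = [
        (x, y)
        for y, row in enumerate(mask)
        for x, is_masked in enumerate(row)
        if is_masked == inside
    ]
    # Assumption: never request more points than there are candidate pixels.
    count = min(sample_size, len(candidates))
    return rng.sample(candidates, count)

mask = [
    [True, True, False],
    [True, False, False],
]
inside_points = random_points(mask, sample_size=2, inside=True, seed=1)
outside_points = random_points(mask, sample_size=10, inside=False, seed=1)
```

Here the request for 10 points outside the mask yields only the 3 unmasked pixels, because the sketch caps the sample at the size of the candidate region.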

 

Furthermore, the dataset editor provides a few commands via the "File" menu 1):

Load Spectral Descriptors A set of spectral descriptors has to be loaded in order to be able to add data points. Clicking a particular line of the spectral descriptor list displays the corresponding image.
Load Dataset Loads a dataset from the dataset directory.
Save DataSet Stores the current dataset on disk. After selecting this command you have to specify a filename. Please note that all datasets should be saved to the default dataset directory. Storing them outside this folder makes it less convenient to select datasets later on.
Export Dataset Stores the current dataset in one of several formats on disk. You can select from CSV (Excel) format, plain text format (.TXT) and ASC format (DataLab). The lines of the exported dataset are the data points, the columns contain the descriptor values and the class flags.
Import Snapshot Spectra Imports the snapshot spectra. To be imported, the snapshot spectra must have class numbers assigned to them that fall into the range of the defined classes (i.e. between 1 and the specified maximum, see the control "No. Classes").
Import from Spectral Collection Imports data from the current spectral collection.
Export to Spectral Collection Adds selected data points to the current spectral collection.
Create Binary Datasets See Create Binary Datasets for details.
Remove Invalid Data Points Removes all data points which are either located outside the currently loaded image or which belong to an undefined class.
Remove Duplicates Removes all duplicate points from the dataset. The criteria for finding the duplicates are the columns 'x' and 'y' of the dataset list. If two points have equal coordinates the point with the higher index (column '##') is removed.
Split Dataset See Split Datasets for details.
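
The two clean-up commands, "Remove Invalid Data Points" and "Remove Duplicates", can be sketched in Python as follows. This is an illustration under stated assumptions, not ImageLab's code: data points are represented as dictionaries with the hypothetical keys "x", "y" and "cls", coordinates are assumed to be zero-based, and an "undefined class" is taken to mean a class number outside 1..num_classes. The list order stands in for the index column '##'.

```python
def remove_invalid(points, width, height, num_classes):
    """Drop points outside the image or with a class outside 1..num_classes."""
    return [
        p for p in points
        if 0 <= p["x"] < width
        and 0 <= p["y"] < height
        and 1 <= p["cls"] <= num_classes
    ]

def remove_duplicates(points):
    """Keep the first (lowest-index) point for each (x, y) coordinate pair."""
    seen = set()
    kept = []
    for p in points:
        key = (p["x"], p["y"])
        if key not in seen:
            seen.add(key)
            kept.append(p)
    return kept

points = [
    {"x": 3, "y": 4, "cls": 1},
    {"x": 3, "y": 4, "cls": 2},   # same coordinates, higher index: removed
    {"x": 99, "y": 0, "cls": 1},  # outside a 10 x 10 image: removed
    {"x": 5, "y": 5, "cls": 7},   # class 7 is undefined with 2 classes: removed
]
cleaned = remove_duplicates(
    remove_invalid(points, width=10, height=10, num_classes=2)
)
```

Note that the duplicate check compares coordinates only, so two points at the same location are treated as duplicates even if their classes differ.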

 

How To: Please follow these steps to create a set of test/training data:
  1. Select a list of descriptors. The descriptors are used to create suitable images for the definition of data points.
  2. Set the number of intended classes and specify the class names and their unique identifiers.
  3. Set the number of quantities and specify their names.
  4. Select "Add Data Points" and click the image at a location where you know which group this location belongs to. Another option to add data points is to use the lasso tool or the random fill tools for masked regions.
  5. If necessary, adjust the assigned class in the list of data points.
  6. Repeat until a sufficient number of data points have been specified.
  7. Enter the corresponding quantities, if you have specified any quantities.
  8. If necessary, you can move or delete existing data points (tools "Move a point" and "Delete a point", respectively).
  9. Finally, store the dataset by clicking the "Save Data" button.

The spectra of the assigned classes are averaged and displayed together with the standard deviation in the "Average Spectra" page. The standard deviation allows you to check the spectral consistency of the grouped data points. Further you may rename the classes by clicking and editing the current class name in the list of defined classes.
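
To illustrate what the "Average Spectra" page summarizes, here is a minimal Python sketch that computes a per-channel mean and standard deviation for each class. It is an assumption-laden illustration: the function name is hypothetical, spectra are plain lists of intensities, and the sample standard deviation is used (the manual does not state which variant ImageLab applies).

```python
from statistics import mean, stdev

def class_statistics(spectra, labels):
    """Return {class: (mean_spectrum, std_spectrum)} computed per channel."""
    grouped = {}
    for spectrum, label in zip(spectra, labels):
        grouped.setdefault(label, []).append(spectrum)
    result = {}
    for label, members in grouped.items():
        # Transpose so that each tuple holds one spectral channel.
        channels = list(zip(*members))
        means = [mean(ch) for ch in channels]
        # stdev needs at least two values; report 0.0 for single-point classes.
        stds = [stdev(ch) if len(ch) > 1 else 0.0 for ch in channels]
        result[label] = (means, stds)
    return result

spectra = [[1.0, 2.0], [3.0, 2.0], [10.0, 20.0]]
labels = [1, 1, 2]
stats = class_statistics(spectra, labels)
```

A large standard deviation in some channel of a class suggests that the points grouped under that class are spectrally inconsistent and may need to be reassigned.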

Please note that each class should have a unique class identifier, which has to be entered in the column "Unique Class IDs". These class IDs are used on several occasions to identify classes unambiguously.
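
If class identifiers are maintained outside the editor, a quick uniqueness check can catch clashes before they are entered. The sketch below is purely illustrative; the identifier strings are made-up examples and the function name is hypothetical.

```python
from collections import Counter

def duplicated_ids(class_ids):
    """Return the class identifiers that occur more than once, sorted."""
    counts = Counter(class_ids)
    return sorted(cid for cid, n in counts.items() if n > 1)

class_ids = ["PET", "PVC", "PET", "PS"]  # hypothetical identifiers
clashes = duplicated_ids(class_ids)
```

An empty result means every class has its own identifier.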

Hint: You can adjust the class colors by opening the Class Color Editor (available via a button on the main toolbar).



1) The "File" menu is available on the form if the form is in non-MDI style (i.e. if the "MDI Form" option is not ticked). In MDI style the menu is merged with the main menu of Epina ImageLab; the menu item is then called "Dataset Editor".