Here we present an image-based Multi Channel Classification and Clustering System (MCCCS). It is a generalized, script-based classification system for processing various kinds of image data. Due to the modular design, individual processing components can easily be adapted, extended or replaced by external commands. The system includes pipeline examples for solving different segmentation, classification and clustering problems. To solve these tasks, we use common machine learning approaches. Converting image pixel data to the common ARFF file format makes it possible to use a wide variety of classification frameworks.
MCCCS is a system utilizing machine learning techniques for image processing and image analysis.
The system is mainly characterized by the following properties:
- It is generalized to handle a diverse set of input data, including RGB images and multi-channel (hyperspectral) datasets.
- It includes different approaches for image feature extraction (color and texture).
- It is able to solve different classification problems by using supervised and unsupervised machine learning methods provided by exchangeable libraries.
- It includes methods for handling multi-channel data to solve multi-label classification problems in an efficient way.
Due to its modular Bash-script-based design, it is also easily adaptable and extensible with common image processing and machine learning libraries or with custom algorithms.
The software is implemented as a set of Bash scripts which have been tested under Linux, Mac and Windows.
The provided commands are mainly implemented in the Java programming language (version 1.8), because of its platform independence and its broad support for libraries and toolboxes such as WEKA and ImageJ.
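As a minimal sketch of the ARFF conversion mentioned above, the following Bash function writes labeled RGB pixel samples in the ARFF layout that WEKA reads (a relation name, attribute declarations and a data section). The function name pixels_to_arff, the attribute names and the two class labels are assumptions chosen for illustration. They are not taken from the MCCCS scripts, whose own converter may work differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of MCCCS: print an ARFF document for
# RGB pixel samples. Input on stdin: one pixel per line as "R G B label".
pixels_to_arff() {
  local relation="$1"
  printf '@relation %s\n\n' "$relation"
  printf '@attribute red numeric\n'
  printf '@attribute green numeric\n'
  printf '@attribute blue numeric\n'
  # Nominal class attribute; the label names are assumed for this sketch.
  printf '@attribute class {foreground,background}\n\n'
  printf '@data\n'
  local r g b label
  while read -r r g b label; do
    [ -z "$r" ] && continue          # skip empty lines
    printf '%s,%s,%s,%s\n' "$r" "$g" "$b" "$label"
  done
}

# Usage: two sample pixels, one per class.
printf '34 139 34 foreground\n120 110 100 background\n' | pixels_to_arff demo
```

Each pixel becomes one data row, so any framework that reads ARFF can train on the file without knowing about the image it came from.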
Acknowledgments for funding
This work was supported by IPK institute funds and project funding of the Federal Ministry of Education and Research (BMBF) (DPPN: 031A053B).
Script and Program Files
The system is based on its main Bash scripts and related (Java) support programs. It uses the OME Bio-Formats as its main program libraries. These required libraries are downloaded automatically into the MCCCS lib folder with the corresponding MCCCS script command.
To give an impression of how the system works, we bundled it with several application examples. It is recommended to download and run these examples. For further information, please see the tabs "Installation" and "Examples".
MCCCS System Download (Scripts, Support Library and Pipeline-Examples)
(~221 KB, zip file)
(~1.9 MB, PDF format)
Compatibility Check and Installation of Support Programs
The system has been tested under CentOS (7) and Mac OS X (Yosemite, 10.10) using the GNU Bash shell, and under Windows 7/8.1/10 using the Cygwin command-line interface. It requires the Java JRE or JDK and, for running the application examples, the Wget command. It may also run under other operating systems that provide Bash script support, the xargs command and Java support.
Additional commands for running the provided application examples:
- Bash support: Start the bash command from the terminal. If the shell does not report "command not found", your system contains the Bash shell.
- Java support: Start the java -version command from the terminal. If the reported version is "1.8.0" or newer and the output states that this is a 64-Bit version, all is fine. Otherwise, install the 64-Bit Java Runtime Environment (JRE) or the JDK for your operating system.
- Wget support: Start the wget command from the terminal. If the shell does not report "command not found", your system contains the Wget program.
- Unzip support: Start the unzip command from the terminal. If the shell does not report "command not found", your system contains the UnZip program.
- Xargs support: Start the xargs command from the terminal. If the shell does not report "command not found", your system contains the xargs command.
- Bc support: Start the bc command from the terminal. If the shell does not report "command not found", your system contains bc (an arbitrary-precision arithmetic language).
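The manual checks above can also be sketched as one small script. This is a hypothetical pre-flight check and not part of the MCCCS distribution; the function names missing_commands and java_major_version are made up for this example. It assumes that java -version prints the version in double quotes on its first line, which is the usual behavior of Oracle and OpenJDK builds.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check; not shipped with MCCCS.

# Print every given command name that cannot be found on the PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
  return 0
}

# Reduce a Java version string to its major version:
# "1.8.0_201" gives 8, "11.0.2" gives 11.
java_major_version() {
  local v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;              # old scheme: drop the leading "1."
  esac
  printf '%s\n' "${v%%[._-]*}"       # keep everything before the first . _ or -
}

missing="$(missing_commands bash java wget unzip xargs bc)"
if [ -n "$missing" ]; then
  echo "Missing commands:" $missing
else
  echo "All required commands were found."
  # java -version writes to stderr; the version is the quoted part of line 1.
  version="$(java -version 2>&1 | head -n 1 | sed 's/.*"\(.*\)".*/\1/')"
  major="$(java_major_version "$version")"
  if [ "$major" -ge 8 ] 2>/dev/null; then
    echo "Java $version is new enough."
  else
    echo "Java $version is older than 1.8 or could not be parsed."
  fi
  if java -version 2>&1 | grep -q '64-Bit'; then
    echo "Java reports a 64-Bit version."
  else
    echo "Java does not report a 64-Bit version."
  fi
fi
```

Using command -v avoids starting each program, so tools that wait for input (such as xargs and bc) do not block the check.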
Most Linux distributions provide the missing packages in their repositories, so they can easily be installed with a package manager (such as yum, apt-get or pacman).
On Mac OS X, the missing support commands can easily be installed by using Rudix. For running the java commands, the Java Development Kit (JDK) is needed.
During the Cygwin installation, please make sure that you include the required commands. You can find them, for example the wget command, by using the search function (the commands are in the following sub-menus: wget -> Web, bc -> Math, unzip -> Archive). It is also recommended to use a user account without any space in the account name.
For further information about system installation and usage, please check the user documentation.
The MCCCS contains support scripts that process the following examples to show the capabilities of the system:
- Three image sets (A1, A2, A3) from the Leaf Segmentation Challenge (LSC) 2014.
- A hyperspectral example from the Purdue Research Foundation.
- Disease classification for detached barley leaves (in preparation, not published yet).
Once the MCCCS system has been downloaded, the example data can be downloaded automatically by running the corresponding script command from the terminal. The script downloads and stores the needed data and libraries using the recommended naming and folder structure. To start the analysis, navigate into an example subfolder and execute the processing script there in a terminal.
Here we show an example of foreground/background segmentation of a top-view plant image using a supervised classifier. The first image is used as input. The second shows the creation of a foreground label, which will be used as training data for the classifier. It is not mandatory to label the whole image; labeling only parts of it is sufficient. The third image shows the classification result.
|Input image (source: LSC 2014)
||Labeling foreground pixels (plant) during ground truth mask creation using GIMP.
Using a Random Forest classifier (tree depth 100), we obtain the following quality for foreground/background segmentation. Here, we use no additional image processing filters, e.g. to remove noise or artifacts, which could improve the results. Especially in the A3 dataset, other plant parts are visible in the images; these are not covered by the provided ground truth labels and are counted as misclassified. This slightly decreases the classification result.
Classification and Clustering Example
The system is also able to process hyperspectral data. Here, a hyperspectral airborne data set is used to perform an unsupervised clustering and a supervised classification. The first image shows an RGB visualization (composite of the 700 nm, 530 nm and 450 nm spectral bands). The second image shows the classes found by clustering with the EM (Expectation Maximization) algorithm. The third image shows the result of the supervised classification. For this approach, crude labels were prepared beforehand and used for classifier training.
|RGB visualization (data source: Purdue Research Foundation)
||Clustering result (using 7 classes)
||Classification result (using 7 ground truth masks for training)
The machine learning approach generally calculates a probability for each class. The following images visualize these probabilities for the classification (black indicates a probability of 0, white a probability of 1).
|Class 1 (vegetation, grass)
||Class 2 (streets, trails)
||Class 3 (buildings)
|Class 4 (shadows)
||Class 5 (streets, trails)
||Class 6 (vegetation, trees)
|Class 7 (water)
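The black-to-white coding of the probability images can be illustrated with a short sketch. It assumes a linear mapping onto 8-bit gray values (0 = black, 255 = white); only the two end points are stated above, so both the linearity and the bit depth are assumptions, and the function name prob_to_gray is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a class probability in [0, 1] to an 8-bit
# gray value, assuming a linear scale between black (0) and white (255).
prob_to_gray() {
  awk -v p="$1" 'BEGIN {
    if (p < 0) p = 0                 # clamp out-of-range input
    if (p > 1) p = 1
    printf "%d\n", p * 255 + 0.5     # round to the nearest integer
  }'
}

# Usage: print the gray value for a few probabilities.
for p in 0 0.25 0.5 1; do
  echo "P=$p -> gray $(prob_to_gray "$p")"
done
```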
Disease rating for detached barley leaves
This work is not published yet. Together with our partners, we are preparing a biological publication.
|Three classes detected: blue - healthy, green - mildew infection, brown - necrosis
Applications by MCCCS users
This section summarizes related publications that use the MCCCS system for different kinds of analysis.
Leaf Segmentation & Leaf Counting Challenges
Both challenges (LSC, LCC) are organized in connection with the Computer Vision Problems in Plant Phenotyping (CVPPP) workshop (official website). Here, the MCCCS is used for the prediction of leaf counts and leaf borders. Details can be found in the following conference paper:
Jean-Michel Pape and Christian Klukas. Utilizing machine learning approaches to improve the prediction of leaf counts and individual leaf segmentation of rosette plant images. In S. A. Tsaftaris, H. Scharr, and T. Pridmore, editors, Proceedings of the Computer Vision Problems in Plant Phenotyping (CVPPP), pages 3.1-3.12. BMVA Press, September 2015.
Results for the testing sets
The ground truth data for the testing data sets was not known to us; the test results (shown below) were supplied by the CVPPP organizers.
The approach using the MCCCS showed the best results in the 2015 challenge. The results were presented at the CVPPP workshop at BMVC 2015 in Swansea.
Results for the LCC
Results for the LSC
Multi Channel Classification and Clustering System
Developed in 2015 at IPK Gatersleben by the Research Group Image Analysis
Development of the presented methods has been performed with equal contribution by:
- Jean-Michel Pape - e-mail
Method development, implementation and documentation.
- Dr. Christian Klukas - homepage, e-mail
Supervision of project, method development and implementation.
(head of group 'Image Analysis' at IPK during the development and from 2010 to April 2015)