tfgRocKeel

Abstract

Building on the algorithms modified in the first part of the project, a module has been developed that creates ROC curves from the outputs of these algorithms. This module will later be added to the KEEL software of the UGR, giving it a new functionality.

This website offers the reader a summary of the work done. The complete work can be found at the following link: Full Work.

ROC Evaluation

ROC evaluation is based on calculating the area under the ROC curve (AUC). Besides solving the problems that other evaluation methods have when evaluating a classifier on a problem with unbalanced classes (imbalanced learning), it offers the following advantages:

The ROC curve is a graphical representation that shows, for each of the examples or observations studied, the relation between sensitivity and the complement of specificity (1 - specificity). In other words, it lets us see the separation between the distributions of sensitivity and specificity for a particular problem.
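
To make the relation between sensitivity and 1 - specificity concrete, the following minimal sketch builds ROC points from classifier scores and integrates them with the trapezoidal rule. It is written in Python purely for illustration and is not the module's code: the function names `roc_points` and `auc`, the score/label input format and the handling of tied scores are all assumptions.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Return (1 - specificity, sensitivity) points, one per distinct score.

    Assumed input format: labels use 1 for positive and 0 for negative,
    and a higher score means "more likely positive".
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    # Visit examples from the highest score down; each step lowers the threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last example of a group of tied scores.
        last_of_group = i == len(ranked) - 1 or ranked[i + 1][0] != score
        if last_of_group:
            points.append((fp / negatives, tp / positives))
    return points


def auc(points: List[Tuple[float, float]]) -> float:
    """Area under the curve, computed with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

For scores 0.9, 0.8, 0.7, 0.6 with labels 1, 0, 1, 0, the sketch yields an AUC of 0.75, since three of the four positive/negative pairs are ranked in the right order.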


Software Development

The following figure shows a simplified version of the class structure used in the project. The simplification highlights the key points of this structure and leaves out elements that are not relevant to the project.

Figure: Simplified class diagram

The RocKeel class implements the program's main method, which creates an object of the Roc class whose methods run the entire program. This follows the code and methodology already used in KEEL, which makes the module easier to understand.

The Roc class incorporates all the necessary code to:

The FileParser class incorporates all the necessary code to:

The RocCoord class is in charge of obtaining the coordinates for the ROC curves. It incorporates all the code needed to generate binary problems from multi-class problems, obtain the ROC curves, and store the coordinates of each curve and its AUC in its attributes. Since practically all the computational weight falls on this class and its methods, the most important ones have been described in depth.
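
As an illustration of how a multi-class problem can be turned into binary problems, the sketch below uses a one-vs-rest decomposition: each class in turn is treated as positive and all others as negative. Whether RocCoord uses one-vs-rest or another scheme is not stated here, so this is an assumption, and the function name `one_vs_rest` is hypothetical. Python is used only for illustration.

```python
from typing import Dict, List, Sequence


def one_vs_rest(labels: Sequence[str]) -> Dict[str, List[int]]:
    """Build one binary label vector per class (assumed one-vs-rest scheme).

    For class c, an example is labelled 1 if it belongs to c and 0 otherwise.
    Each vector can then be paired with the classifier's scores for class c
    to draw one ROC curve per class.
    """
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


if __name__ == "__main__":
    binary = one_vs_rest(["a", "b", "c", "a"])
    for name, vector in binary.items():
        print(name, vector)
```

A problem with k classes produces k binary problems this way, and therefore k ROC curves and k AUC values.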

Pgfplots Package

Pgfplots is a package built on TikZ/PGF, a powerful graphics system developed for LaTeX. TikZ/PGF offers great functionality for drawing graphics directly in LaTeX, but its commands are quite complex, which makes its learning curve quite steep.

Pgfplots therefore acts as a level of abstraction over TikZ/PGF and, like the latter, allows the creation of plots of functions, points, bars and surfaces, among other functionalities. One of its most interesting features is its ease of use, with a convenient interface and scripts that compile directly in LaTeX. The graphics are inserted into the LaTeX document as vector graphics, so their resolution and quality are not affected by scaling them up or down. Pgfplots also supports 2D and 3D plots with many adjustable parameters, such as legends, colors and labels, with which we can give our graphics the appropriate design for each problem. For our problem, we used 2D graphics and the following Pgfplots commands:
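
The exact commands used by the module are not listed here. As a rough sketch of what a generated script might look like, the following Python function writes a standalone LaTeX document with one `\addplot coordinates` entry per ROC curve. The function name `pgfplots_roc`, the axis labels, the `compat` level and the legend placement are assumptions chosen for illustration, not the module's actual output.

```python
from typing import Dict, List, Tuple


def pgfplots_roc(curves: Dict[str, List[Tuple[float, float]]]) -> str:
    """Return LaTeX source for a 2D Pgfplots figure with one line per curve.

    `curves` maps a legend name to its list of (x, y) ROC coordinates.
    Names are inserted verbatim, so LaTeX special characters such as
    underscores would need escaping by the caller.
    """
    lines = [
        r"\documentclass{standalone}",
        r"\usepackage{pgfplots}",
        r"\pgfplotsset{compat=1.9}",
        r"\begin{document}",
        r"\begin{tikzpicture}",
        r"\begin{axis}[",
        r"    xlabel={False positive rate},",
        r"    ylabel={True positive rate},",
        r"    xmin=0, xmax=1, ymin=0, ymax=1,",
        r"    legend pos=south east,",
        r"]",
    ]
    for name, points in curves.items():
        coords = " ".join(f"({x:.4f},{y:.4f})" for x, y in points)
        # Each curve becomes one \addplot with its own legend entry.
        lines.append(rf"\addplot coordinates {{{coords}}};")
        lines.append(rf"\addlegendentry{{{name}}}")
    lines += [r"\end{axis}", r"\end{tikzpicture}", r"\end{document}"]
    return "\n".join(lines)


if __name__ == "__main__":
    print(pgfplots_roc({"class A": [(0.0, 0.0), (0.5, 1.0), (1.0, 1.0)]}))
```

The printed text can be saved to a `.tex` file and compiled with a LaTeX engine that has Pgfplots installed. Because the curves are emitted as coordinates rather than as an image, the resulting figure stays vectorial.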

Output

The following figure shows the output of the developed module working with a real multi-class problem:

Figure: Output of the developed module on a real multi-class problem