Statistical Consultants Ltd


TANAGRA: A free data-mining program

Software Review, Data-Mining


TANAGRA is a free data-mining program. It was developed in France and released in 2004. It is the successor to SIPINA, a classification program.
TANAGRA screenshot 

TANAGRA has three windows: the data mining diagram, the components and the output. It has a ‘drag-and-drop’ interface: the user drags icons from the components window and drops them into a nested diagram that represents a set of processes. The diagrams can be saved.
TANAGRA diagram
A right click on a component in the diagram brings up a small menu. One of the options in that menu is ‘Execute’, which runs each component in turn, from the start of the diagram down the hierarchy to the selected component.
TANAGRA right click menu
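
The way ‘Execute’ walks down the hierarchy can be pictured with a short Python sketch. It is only an illustration of the idea, not TANAGRA's own code: the Component class, the execution_order function and the component labels ‘Dataset’, ‘Regression’ and ‘Scatterplot’ are all hypothetical names chosen for the example.

```python
class Component:
    """One node in a hypothetical diagram tree (not TANAGRA's data model)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def execution_order(selected):
    """List the components that run: start of the diagram first, selected one last."""
    path = []
    node = selected
    while node is not None:
        path.append(node.name)
        node = node.parent
    path.reverse()
    return path


# A small illustrative diagram: one data source with two branches.
dataset = Component("Dataset")
status = Component("Define status", parent=dataset)
regression = Component("Regression", parent=status)
scatter = Component("Scatterplot", parent=dataset)

print(execution_order(regression))  # ['Dataset', 'Define status', 'Regression']
```

In this sketch, selecting ‘Regression’ runs only the components on its own branch; the ‘Scatterplot’ branch is left alone.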

The ‘Define Status’ Component

The ‘Define status’ component is used to define variables as being target, input or illustrative variables, for the process that follows it.  For example:
  • In a regression model, the target variable would be the response variable, and the input variables would be the explanatory variables. 
  • In a principal components analysis (PCA), all variables included would be input variables.
  • When building a classifier, the target variable is what is to be classified, and the input variables would be used to construct the classifier.
TANAGRA Define status 
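
The three examples above can be written down as role assignments. The Python sketch below is a hypothetical stand-in for the choices made in the ‘Define status’ dialog, not an interface that TANAGRA provides; the VariableRoles class and every variable name in it are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VariableRoles:
    """Hypothetical record of the roles chosen in 'Define status'."""
    target: List[str] = field(default_factory=list)
    inputs: List[str] = field(default_factory=list)
    illustrative: List[str] = field(default_factory=list)


# Regression: the response is the target, the explanatory variables are inputs.
regression_roles = VariableRoles(target=["price"], inputs=["area", "rooms"])

# PCA: every variable included in the analysis is an input; there is no target.
pca_roles = VariableRoles(inputs=["height", "weight", "age"])

# Classifier: the class label is the target, the predictors are inputs.
classifier_roles = VariableRoles(
    target=["species"],
    inputs=["petal_length", "petal_width"],
)

for label, roles in [
    ("regression", regression_roles),
    ("PCA", pca_roles),
    ("classifier", classifier_roles),
]:
    print(label, "target:", roles.target, "inputs:", roles.inputs)
```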

Component Categories

TANAGRA has the following categories of components.
Data visualisation
TANAGRA data visualisation components

Statistics
TANAGRA statistics components
Nonparametric statistics
TANAGRA nonparametric statistics components 

Instance selection

TANAGRA instance selection components 

Feature construction
TANAGRA feature construction components

Feature selection
TANAGRA feature selection components 

Regression
TANAGRA regression components

Factorial analysis
TANAGRA factorial analysis components
PLS
TANAGRA PLS components

Clustering
TANAGRA clustering components
Spv learning
TANAGRA Spv learning components
Meta-spv learning
TANAGRA Meta-spv learning components
Spv learning assessment
TANAGRA Spv learning assessment components
Scoring
TANAGRA Scoring components
Association
TANAGRA Association components

French style decimal separation

TANAGRA assumes that the data set follows the French convention for decimal separators, which uses commas instead of dots, e.g. 1000,00 instead of 1000.00.
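
A data set written with dots can be converted before it is loaded. The Python sketch below is one possible approach, assuming a tab-delimited text file with no thousands separators; the function names to_comma_decimal and convert_line are made up for this example and are not part of TANAGRA.

```python
import re

# A whole field that is a number with a dot as decimal separator,
# e.g. 1000.00 or -3.5 (assumes no thousands separators).
DOT_DECIMAL = re.compile(r"^(-?\d+)\.(\d+)$")


def to_comma_decimal(value):
    """Turn '1000.00' into '1000,00'; return any other field unchanged."""
    return DOT_DECIMAL.sub(r"\1,\2", value)


def convert_line(line, delimiter="\t"):
    """Convert every numeric field of one delimited line of text."""
    fields = line.rstrip("\n").split(delimiter)
    return delimiter.join(to_comma_decimal(f) for f in fields)


print(convert_line("A\t1000.00\t-3.5\tv1.2"))
```

Only fields that are entirely numeric are rewritten, so a label such as v1.2 keeps its dot. A tab delimiter is assumed because a comma-delimited file would clash with comma decimal separators.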

TANAGRA website:

See also:

How the world separates its decimals
PSPP: A free alternative to SPSS
Gretl: A free alternative to EViews


Copyright © Statistical Consultants Ltd  2010