Tree Evaluation

Functions that evaluate a tree on a given dataset and calculate its fitness.

Functions

GP_NLS.evaluate - Method

Function that takes any node of a tree (AbstractNode) and a data matrix X (where each row is an observation and each column is a variable), and evaluates the prediction for each observation in X. The function recurses through the tree from the given node and evaluates the expression using the columns of X that correspond to the variables appearing in the tree.

If the node is an InternalNode, the recursive call is made on its children and the results are passed as arguments to the node's function.

If it is a TerminalNode with content Const, a vector of length size(X, 1) filled with the constant is returned.

If it is a TerminalNode with content Var or WeightedVar, the column of X at index Var.var_idx provides the value of the variable for each observation.

evaluate(node::Union{TerminalNode, InternalNode}, X::Matrix{Float64})::Vector{Float64}

Implemented through multiple dispatch, with separate methods for TerminalNode and InternalNode.
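The recursion described above can be illustrated with a minimal, self-contained sketch. It does not use the package's own types: SketchNode, ConstLeaf, VarLeaf, OpNode and evaluate_sketch are hypothetical stand-ins, their field names are assumptions (only var_idx is taken from the description), and WeightedVar is left out.

```julia
# Hypothetical stand-ins for the package's node types; field names are assumptions.
abstract type SketchNode end

struct ConstLeaf <: SketchNode
    value::Float64
end

struct VarLeaf <: SketchNode
    var_idx::Int
end

struct OpNode <: SketchNode
    func::Function
    children::Vector{SketchNode}
end

# Constant leaf: repeat the value once per observation (row of X).
evaluate_sketch(node::ConstLeaf, X::Matrix{Float64}) = fill(node.value, size(X, 1))

# Variable leaf: read the matching column of X.
evaluate_sketch(node::VarLeaf, X::Matrix{Float64}) = X[:, node.var_idx]

# Internal node: evaluate the children first, then apply the function element-wise.
function evaluate_sketch(node::OpNode, X::Matrix{Float64})
    args = [evaluate_sketch(child, X) for child in node.children]
    return node.func.(args...)
end

# Two observations, two variables.
X = [1.0 2.0; 3.0 4.0]

# Expression x1 + 2 * x2.
tree = OpNode(+, [VarLeaf(1), OpNode(*, [ConstLeaf(2.0), VarLeaf(2)])])
predictions = evaluate_sketch(tree, X)
```

Julia selects the method from the concrete type of the node, so the recursive function needs no explicit type checks.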

GP_NLS.fitness - Method

Function that measures the fitness of a given tree with respect to a training data matrix X::Matrix{Float64} and a vector of expected results y::Vector{Float64}.

fitness(tree::AbstractNode, X::Matrix{Float64}, y::Vector{Float64})::Float64

The fitness is the RMSE between the tree's predictions and y. If the tree fails to evaluate, this method returns an infinite fitness, so that selective pressure is likely to eliminate the individual from the population without the need for protected operations.
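A minimal sketch of this behaviour, under stated assumptions: rmse and fitness_sketch are hypothetical helpers, not the package's functions, and the predict argument is a plain callable standing in for evaluating a tree on X. Mapping a non-finite score to Inf is an extra assumption that the description above does not state.

```julia
# Root mean squared error between predictions and targets.
rmse(predicted::Vector{Float64}, y::Vector{Float64}) =
    sqrt(sum((predicted .- y) .^ 2) / length(y))

# `predict` is a hypothetical callable standing in for tree evaluation.
function fitness_sketch(predict, X::Matrix{Float64}, y::Vector{Float64})
    try
        score = rmse(predict(X), y)
        # Assumption: a NaN or infinite score is treated like a failed evaluation.
        return isfinite(score) ? score : Inf
    catch
        # Evaluation threw (for example a DomainError): worst possible fitness.
        return Inf
    end
end

X = reshape([1.0, 4.0, 9.0], 3, 1)
y = [1.0, 2.0, 3.0]

# Predictions match y exactly.
good = fitness_sketch(data -> sqrt.(data[:, 1]), X, y)

# sqrt of a negative number throws a DomainError, so evaluation fails.
bad = fitness_sketch(data -> sqrt.(data[:, 1] .- 5.0), X, y)
```

Because lower RMSE is better, an infinite value ranks the failing individual behind every individual that evaluates successfully.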


Genetic Programming algorithm

Mutation, crossover and GP implementation.

Functions

GP_NLS.GP - Method

GP with control over tree depth and number of nodes. The recommended initialization method is PTC2. The other methods are also available, but they are based on Koza-style GP and do not respect the restriction on the maximum number of nodes. To use canonical GP, disable lm_optimization and choose one of the ["ramped", "grow", "full"] initializations. To use GP-NLS, enable lm_optimization and use "PTC2" as the initialization method.

GP(
    X::Matrix{Float64}, 
    y::Vector{Float64},
    fSet::Vector{Func},
    tSet::Vector{Union{Var, WeightedVar, Const, ERC}},
    minDepth::Int64        = 1,
    maxDepth::Int64        = 5,
    maxSize::Int64         = 25,
    popSize::Int64         = 50,
    gens::Int64            = 50,
    mutationRate::Float64  = 0.25,
    elitism::Bool          = false,
    verbose::Bool          = false,
    init_method::String    = "PTC2", #["ramped", "grow", "full", "PTC2"]
    lm_optimization        = false, 
    keep_linear_transf_box = false
)::AbstractNode