Tree Evaluation
Functions that evaluate a tree on a given dataset and calculate its fitness.
Functions
GP_NLS.evaluate — Method
Function that takes any node of a tree (AbstractNode) and a data matrix X (where each row is an observation and each column is a variable), and evaluates the prediction for each observation in X. The function recurses through the tree from the given node and evaluates the expression using the columns of X that correspond to the variables present in the tree.
If the node is an InternalNode, the recursive call is made on its children and the results are used as the arguments of the node function.
If it is a TerminalNode with content Const, a vector of length size(X, 1) filled with the constant is returned.
If it is a TerminalNode with content Var or WeightedVar, the column of X at index Var.var_idx is used to obtain the values of the variable.
evaluate(node::Union{TerminalNode, InternalNode}, X::Matrix{Float64})::Vector{Float64}
Implements multiple dispatch for the TerminalNode and InternalNode cases.
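The recursive evaluation described above can be illustrated with a minimal sketch. The types and the function below (SketchNode, SketchConst, SketchVar, SketchInternal, sketch_evaluate) are hypothetical stand-ins written for this example; they are not the GP_NLS types, and the package's actual implementation may differ. The sketch also assumes the node function is applied element-wise to the child results.

```julia
# Hypothetical stand-ins for the package's node types; names with the
# `Sketch` prefix are illustrative and are not part of GP_NLS.
abstract type SketchNode end

struct SketchConst <: SketchNode
    value::Float64
end

struct SketchVar <: SketchNode
    var_idx::Int
end

struct SketchInternal <: SketchNode
    func::Function
    children::Vector{SketchNode}
end

# Constant terminal: repeat the constant once per observation (row of X).
sketch_evaluate(node::SketchConst, X::Matrix{Float64}) = fill(node.value, size(X, 1))

# Variable terminal: return the column of X selected by var_idx.
sketch_evaluate(node::SketchVar, X::Matrix{Float64}) = X[:, node.var_idx]

# Internal node: evaluate the children recursively, then apply the node
# function element-wise to their results (an assumption of this sketch).
function sketch_evaluate(node::SketchInternal, X::Matrix{Float64})
    args = [sketch_evaluate(child, X) for child in node.children]
    return node.func.(args...)
end

# The expression x1 * 2.0 + x2, evaluated on two observations.
X = [1.0 2.0; 3.0 4.0]
tree = SketchInternal(+, [
    SketchInternal(*, [SketchVar(1), SketchConst(2.0)]),
    SketchVar(2),
])
predictions = sketch_evaluate(tree, X)
```

Here Julia selects the method from the concrete type of node, so each case of the description maps to one method definition.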
GP_NLS.fitness — Method
Function that measures the fitness of a given tree with respect to a training data matrix X::Matrix{Float64} and a vector of expected results y::Vector{Float64}.
fitness(tree::AbstractNode, X::Matrix{Float64}, y::Vector{Float64})::Float64
The fitness is calculated using the RMSE. This method returns an infinite fitness if the tree fails to evaluate, so selective pressure is likely to eliminate the individual from the population without the need for protected operations.
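As a rough illustration of the "infinite fitness on failure" idea, the sketch below wraps an RMSE computation in a try/catch block. The names rmse, sketch_fitness and predict are hypothetical and are not part of GP_NLS; predict stands in for evaluating a tree on X. Mapping NaN results to Inf is an extra assumption of this sketch and is not stated in the documentation above.

```julia
# Root-mean-square error between predictions and expected values.
rmse(y_pred::Vector{Float64}, y::Vector{Float64}) =
    sqrt(sum((y_pred .- y) .^ 2) / length(y))

# `predict` is a hypothetical stand-in for evaluating a tree on X.
function sketch_fitness(predict::Function, X::Matrix{Float64}, y::Vector{Float64})::Float64
    try
        value = rmse(predict(X), y)
        # Treat NaN results as failures too (an assumption of this sketch).
        return isnan(value) ? Inf : value
    catch err
        # A failed evaluation (e.g. a DomainError) gets the worst possible fitness.
        return Inf
    end
end

X = reshape([1.0, 4.0, 9.0], 3, 1)
y = [1.0, 2.0, 3.0]

# sqrt is valid on this data, so the RMSE is computed normally.
ok = sketch_fitness(M -> sqrt.(M[:, 1]), X, y)

# sqrt of a negative Float64 throws a DomainError, so the fitness is Inf.
failed = sketch_fitness(M -> sqrt.(-M[:, 1]), X, y)
```

Because lower RMSE is better, an individual with fitness Inf always loses any comparison against an individual that evaluated successfully.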
Genetic Programming algorithm
Mutation, crossover and GP implementation.
Functions
GP_NLS.GP — Method
GP with control over tree depth and number of nodes. The recommended initialization method is PTC2. The other methods are also available, but they are based on Koza's GP and do not respect the restriction on the maximum number of nodes. To use canonical GP, disable lm_optimization and choose one of the ["ramped", "grow", "full"] initializations. To use GP-NLS, enable lm_optimization and use "PTC2" as the initialization method.
GP(
    X::Matrix{Float64},
    y::Vector{Float64},
    fSet::Vector{Func},
    tSet::Vector{Union{Var, WeightedVar, Const, ERC}},
    minDepth::Int64 = 1,
    maxDepth::Int64 = 5,
    maxSize::Int64 = 25,
    popSize::Int64 = 50,
    gens::Int64 = 50,
    mutationRate::Float64 = 0.25,
    elitism::Bool = false,
    verbose::Bool = false,
    init_method::String = "PTC2", #["ramped", "grow", "full", "PTC2"]
    lm_optimization = false,
    keep_linear_transf_box = false
)::AbstractNode