arfpy package
Submodules
arfpy.arf module
- class arfpy.arf.arf(x, num_trees=30, delta=0, max_iters=10, early_stop=True, verbose=True, min_node_size=5, **kwargs)
Bases: object
Implements Adversarial Random Forests (ARF) in Python. Usage:
1. Fit an ARF model with arf().
2. Estimate the density with arf.forde().
3. Generate data with arf.forge().
- Parameters:
x (pandas.DataFrame) – Input data.
num_trees (int, optional) – Number of trees to grow in each forest, defaults to 30
delta (float, optional) – Tolerance parameter. Algorithm converges when OOB accuracy is < 0.5 + delta, defaults to 0
max_iters (int, optional) – Maximum iterations for the adversarial loop, defaults to 10
early_stop (bool, optional) – Terminate loop if performance fails to improve from one round to the next?, defaults to True
verbose (bool, optional) – Print discriminator accuracy after each round?, defaults to True
min_node_size (int, optional) – Minimum number of samples in a terminal node, defaults to 5
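To make the interplay of delta, max_iters and early_stop concrete, the following is a minimal sketch of a stopping rule consistent with the parameter descriptions above. It is not the arfpy implementation: the helper name adversarial_loop_should_stop is hypothetical, "fails to improve" is assumed to mean that the discriminator's OOB accuracy did not decrease compared with the previous round, and every entry in acc_history is assumed to count as one iteration.

```python
def adversarial_loop_should_stop(acc_history, delta=0.0, max_iters=10, early_stop=True):
    """Return the reason for stopping, or None if the loop should continue.

    acc_history holds the discriminator's OOB accuracy after each completed
    round (hypothetical bookkeeping, not an arfpy attribute).
    """
    if not acc_history:
        return None  # nothing fitted yet, keep going
    latest = acc_history[-1]
    # Convergence as documented: OOB accuracy < 0.5 + delta
    if latest < 0.5 + delta:
        return "converged"
    # Hard cap on the number of adversarial rounds
    if len(acc_history) >= max_iters:
        return "max_iters"
    # Assumption: "no improvement" means accuracy did not move down toward 0.5
    if early_stop and len(acc_history) >= 2 and latest >= acc_history[-2]:
        return "early_stop"
    return None


if __name__ == "__main__":
    history = []
    for acc in [0.91, 0.74, 0.62, 0.66]:
        history.append(acc)
        reason = adversarial_loop_should_stop(history)
        print(f"round {len(history)}: accuracy={acc:.2f}, stop={reason}")
```

A larger delta makes the loop stop sooner, because the discriminator only has to get within delta of chance-level accuracy.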
- forde(dist='truncnorm', oob=False, alpha=0)
Performs density estimation (FORDE) with the fitted ARF.
- Parameters:
dist (str, optional) – Distribution to use for density estimation of continuous features. Distributions implemented so far: “truncnorm”, defaults to “truncnorm”
oob (bool, optional) – Only use out-of-bag samples for parameter estimation? If True, x must be the same dataset used to train arf, defaults to False
alpha (float, optional) – Optional pseudocount for Laplace smoothing of categorical features. This avoids zero-mass points when test data fall outside the support of training data. Effectively parametrizes a flat Dirichlet prior on multinomial likelihoods, defaults to 0
- Returns:
Parameters of the estimated density.
- Return type:
dict
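The effect of the alpha pseudocount can be illustrated with a small standalone sketch of Laplace smoothing for one categorical feature. This is not arfpy code: the function smoothed_probs and its arguments are hypothetical, and the set of possible levels is assumed to be known in advance.

```python
from collections import Counter


def smoothed_probs(values, levels, alpha=0.0):
    """Estimate level probabilities with an additive pseudocount alpha.

    values: observed categories (e.g. the samples falling into one leaf)
    levels: all categories the feature can take
    alpha:  pseudocount added to every level (0 means plain relative frequencies)
    """
    counts = Counter(values)
    total = len(values) + alpha * len(levels)
    if total == 0:
        raise ValueError("no observations and alpha == 0: probabilities undefined")
    return {lvl: (counts[lvl] + alpha) / total for lvl in levels}


if __name__ == "__main__":
    observed = ["a", "a", "b"]
    levels = ["a", "b", "c"]
    # With alpha=0 the unseen level "c" receives zero mass
    print(smoothed_probs(observed, levels, alpha=0))
    # With alpha=1 every level receives positive mass
    print(smoothed_probs(observed, levels, alpha=1))
```

With alpha=0, a test point whose category never appeared in the training data has zero probability; any alpha greater than 0 gives it a small positive probability instead.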