Start¶

Load¶

Data_handling import functions :

get_delimiter : identify delimiter for a .csv/.txt file
import_data : import a dataset file into a DataFrame

AutoMxL.Start.Load.get_delimiter(file)[source]¶

Identify the delimiter for a csv/txt file

Parameters:	file (string) – Path and name of the file (Ex : “data/file.csv”)
Returns:	identified delimiter
Return type:	string
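
Delimiter detection of this kind can be sketched with the standard library's csv.Sniffer. This is an illustration only, not AutoMxL's implementation: the helper name sniff_delimiter and the set of candidate delimiters are assumptions, and the sketch works on a text sample rather than a file path.

```python
import csv


def sniff_delimiter(sample, candidates=",;\t|"):
    """Guess the delimiter of a CSV/TXT text sample (hypothetical helper)."""
    # csv.Sniffer compares per-line character frequencies and returns a Dialect;
    # restricting it to a few candidates keeps the guess predictable.
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter


sample = "id;name;amount\n1;alice;10.5\n2;bob;7.0\n"
print(sniff_delimiter(sample))
```

In practice the sample would be the first few lines read from the file.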

AutoMxL.Start.Load.import_data(file, index_col=None, verbose=False)[source]¶

Import dataset as a DataFrame (identify delimiter for txt and csv files)

Supported file types : .txt, .csv, .xlsx, .xls

Parameters:

file (string) – Path and name of the file (Ex : “data/file.csv”). If file is .csv, automatically identify delimiter
index_col (int, str, sequence of int / str, or False (Default None)) – Column(s) to use as the row labels of the DataFrame, either given as string name or column index. If a sequence of int / str is given, a MultiIndex is used.
verbose (boolean (Default False)) – Get logging information

Returns:	imported dataset
Return type:	DataFrame

Encode_Target¶

Target encoding functions :

category_to_target : create a target variable (1/0) from a selected category
range_to_target : create a target variable (1/0) from a selected range

AutoMxL.Start.Encode_Target.category_to_target(df, var, cat)[source]¶

Create a target variable (1/0) from a selected category

Parameters:

df (DataFrame) – input dataset
var (string) – variable containing the target category
cat (string) – target category

Returns:

DataFrame (modified dataset)
string (new target name (var+’_’+cat))
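
The encoding rule can be sketched in plain Python, using a list of row dicts as a stand-in for the DataFrame. The helper name category_to_target_sketch is hypothetical, and keeping the original column in the output is an assumption.

```python
def category_to_target_sketch(rows, var, cat):
    """Add a 1/0 column named var + '_' + cat to a list of row dicts."""
    target_name = var + "_" + cat
    # Copy each row so the input rows are left untouched
    new_rows = [dict(row, **{target_name: int(row[var] == cat)}) for row in rows]
    return new_rows, target_name


rows = [{"color": "red"}, {"color": "blue"}, {"color": "red"}]
encoded, name = category_to_target_sketch(rows, "color", "red")
print(name, [r[name] for r in encoded])
```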

AutoMxL.Start.Encode_Target.range_to_target(df, var, min=None, max=None, verbose=False)[source]¶

Create a target variable (1/0) from a selected range

Parameters:

df (DataFrame) – input dataset
var (string) – variable containing the target range
min (float) – lower limit. If None, no min
max (float) – upper limit. If None, no max
verbose (boolean (Default False)) – Get logging information

Returns:

DataFrame (modified dataset)
string (new target name (var+’_’+lower+’_’+upper))
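
A minimal sketch of the range rule on a plain list of values. The helper name range_flags is hypothetical, and treating both limits as inclusive is an assumption, since the boundary behaviour is not specified above.

```python
def range_flags(values, lower=None, upper=None):
    """Return 1 when lower <= value <= upper, else 0; a None limit is ignored."""
    flags = []
    for v in values:
        above = lower is None or v >= lower  # no lower limit when None
        below = upper is None or v <= upper  # no upper limit when None
        flags.append(int(above and below))
    return flags


print(range_flags([1, 5, 10], lower=2, upper=8))
```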

Explore¶

Global dataset information functions :

explore (func): identify variable types and give global information about the dataset (NA, low variance features)
low_variance_features (func): identify features with low variance
get_features_type (func): get all features per type

AutoMxL.Explore.Explore.explore(df, verbose=False)[source]¶

Identify variable types and give global information about the dataset

Variable types :
- date
- identifier
- verbatim
- boolean
- categorical
- numerical

Additional information :
- variables containing NA values
- low variance and unique values variables

See get_features_type function doc for type identification heuristics

Parameters:

df (DataFrame) – input dataset
verbose (boolean (Default False)) – Get logging information

Returns:

{x : variables names list }

date : date features
identifier : identifier features
verbatim : verbatim features
boolean : boolean features
categorical : categorical features
numerical : numerical features
NA : features which contain NA values
low_variance : list of the features with low variance

Return type:

dict

AutoMxL.Explore.Explore.get_features_type(df, l_var=None, th=0.95)[source]¶

Get all features per type :

date : try to apply to_datetime
identifier :
- #(unique values)/#(total values) > threshold (default 0.95)
- AND length is the same for all values (for non NA)
verbatim :
- #(unique values)/#(total values) >= threshold (default 0.95)
- AND length is NOT the same for all values (for non NA)
boolean : #(distinct values) = 2
categorical :
- not a date
- #(unique values)/#(total values) < threshold (default 0.95)
- AND #(unique values) > 2
- AND for num values #(unique values)<30
numerical : others

Parameters:

df (DataFrame) – input dataset
l_var (list (Default : None)) – variable names
th (float (Default : 0.95)) – threshold used to identify identifiers/verbatims variables

Returns:	{ type : variables name list}
Return type:	dict
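
The identifier, verbatim and boolean heuristics above can be sketched on a plain list of values. All helper names are hypothetical. The order of the checks, the exclusion of missing values (None) from the ratio, and measuring length on the string form of each value are assumptions. Date and categorical detection are left out.

```python
def unique_ratio(values):
    """#(unique values) / #(total values), ignoring missing entries (None)."""
    present = [v for v in values if v is not None]
    return len(set(present)) / len(present) if present else 0.0


def same_length(values):
    """True when every non-missing value has the same string length."""
    return len({len(str(v)) for v in values if v is not None}) <= 1


def guess_type(values, th=0.95):
    """Return 'boolean', 'identifier', 'verbatim' or 'other' (sketch only)."""
    present = [v for v in values if v is not None]
    if len(set(present)) == 2:
        return "boolean"
    ratio = unique_ratio(values)
    if ratio > th and same_length(values):
        return "identifier"
    if ratio >= th and not same_length(values):
        return "verbatim"
    return "other"


print(guess_type(["A001", "A002", "A003", "A004"]))
```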

AutoMxL.Explore.Explore.low_variance_features(df, var_list=None, threshold=0, rescale=True, verbose=False)[source]¶

Identify numerical features with low variance (variance < threshold). Features can be rescaled before the variance is computed.

Parameters:

df (DataFrame) – input DataFrame
var_list (list (default : None)) – names of the variables to check variance. If None : all the numerical features
threshold (float (default : 0)) – variance threshold
rescale (bool (default : True)) – enable MinMaxScaler before computing variance
verbose (boolean (Default False)) – Get logging information

Returns:	Names of the variables with low variance
Return type:	list
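
A minimal sketch of the variance check, with min-max rescaling done by hand on plain lists. The helper names are hypothetical. Two assumptions are made: population variance is used, and the comparison is "<=" so that the default threshold of 0 still flags constant features.

```python
from statistics import pvariance


def min_max_rescale(values):
    """Rescale values to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def low_variance_columns(columns, threshold=0.0, rescale=True):
    """columns: {name: list of numbers}. Return names with variance <= threshold."""
    flagged = []
    for name, values in columns.items():
        scaled = min_max_rescale(values) if rescale else values
        if pvariance(scaled) <= threshold:
            flagged.append(name)
    return flagged


columns = {
    "const": [3, 3, 3, 3],
    "spread": [1, 2, 3, 4],
    "tiny": [100.0, 100.0, 100.0, 100.1],
}
print(low_variance_columns(columns, threshold=0.01, rescale=False))
print(low_variance_columns(columns, threshold=0.01, rescale=True))
```

Rescaling matters here: the "tiny" column has a small raw variance, but once mapped to [0, 1] its single deviating value dominates and it is no longer flagged.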

Features_Type¶

Variables type identification functions :

features_from_type (func): get all features for a selected type
is_date (func): test if a variable is a date
is_identifier (func): test if a variable is an identifier
is_verbatim (func): test if a variable is a verbatim
is_boolean (func): test if a variable is a boolean
is_categorical (func): test if a variable is a categorical one (with more than 2 categories)

AutoMxL.Explore.Features_Type.features_from_type(df, typ, l_var=None, th=0.95)[source]¶

Get features of a selected type :

date : try to apply to_datetime
identifier :
- #(unique values)/#(total values) > threshold (default 0.95)
- AND length is the same for all values (for non NA)
verbatim :
- #(unique values)/#(total values) >= threshold (default 0.95)
- AND length is NOT the same for all values (for non NA)
boolean : #(distinct values) = 2
categorical :
- not a date
- #(unique values)/#(total values) < threshold (default 0.95)
- AND #(unique values) > 2
- AND for num values #(unique values)<30

Parameters:

df (DataFrame) – input dataset
typ (string) – selected type to get features : ‘date’, ‘identifier’, ‘verbatim’, ‘boolean’, ‘categorical’
l_var (list (Default : None)) – variables names. If None, all dataset columns
th (float (Default : 0.95)) – threshold used to identify identifiers/verbatims variables

Returns:	identified variables names
Return type:	list

AutoMxL.Explore.Features_Type.is_boolean(df, col)[source]¶

Test if a variable is a boolean.

#(distinct values) = 2

Parameters:

df (DataFrame) – input dataset
col (string) – variable name

Returns:	res – test result
Return type:	boolean

AutoMxL.Explore.Features_Type.is_categorical(df, col, th=0.95)[source]¶

Test if a variable is a categorical one (with more than 2 categories).

not a date
#(unique values)/#(total values) < threshold (default 0.95)
AND #(unique values) > 2
AND for num values #(unique values)<30

Parameters:

df (DataFrame) – input dataset
col (string) – variable name
th (float (Default : 0.95)) – threshold

Returns:	res – test result
Return type:	boolean

AutoMxL.Explore.Features_Type.is_date(df, col)[source]¶

Test if a variable is a date.

Method : try to apply to_datetime

Parameters:

df (DataFrame) – input dataset
col (string) – variable name

Returns:	res – test result
Return type:	boolean

AutoMxL.Explore.Features_Type.is_identifier(df, col, th=0.95)[source]¶

Test if a variable is an identifier.

#(unique values)/#(total values) > threshold (default 0.95)
AND length is the same for all values (for non NA)
AND not date

Parameters:

df (DataFrame) – input dataset
col (string) – variable name
th (float (Default : 0.95)) – threshold rate

Returns:	res – test result
Return type:	boolean

AutoMxL.Explore.Features_Type.is_verbatim(df, col, th=0.95)[source]¶

Test if a variable is a verbatim.

#(unique values)/#(total values) >= threshold (default 0.95)
AND length is NOT the same for all values (for non NA)

Parameters:

df (DataFrame) – input dataset
col (string) – variable name
th (float (Default : 0.95)) – threshold rate

Returns:	res – test result
Return type:	boolean

Preprocessing¶

Missing_Values¶

Missing values handling functions :

NAEncoder (class): encoder that replaces missing values
fill_numerical (func): replace missing values for numerical features
fill_categorical (func): replace missing values for categorical features
get_NA_features (func): get features containing NA values

class AutoMxL.Preprocessing.Missing_Values.NAEncoder(replace_num_with='median', replace_cat_with='NR', track_num_NA=True)[source]¶

Missing values filling

Available methods to replace missing values

num : median/mean/zero
cat : ‘NR’

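Parameters:

replace_num_with (string) – method used to replace numerical missing values
replace_cat_with (string) – method used to replace categorical missing values
track_num_NA (boolean (Default : True)) – If True, create a boolean column to keep track of numerical missing values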

fit(df, l_var, verbose=False)[source]¶

fit encoder

Parameters:

df (DataFrame) – input dataset
l_var (list) – features to encode. If None, all features
verbose (boolean (Default False)) – Get logging information

fit_transform(df, l_var=None, verbose=False)[source]¶

fit and transform dataset with encoder

Parameters:

df (DataFrame) – input dataset
l_var (list) – features to encode. If None, all features
verbose (boolean (Default False)) – Get logging information

transform(df, verbose=False)[source]¶

transform dataset features using the encoder. Can be done only if encoder has been fitted

Parameters:

df (DataFrame) – dataset to transform
verbose (boolean (Default False)) – Get logging information

AutoMxL.Preprocessing.Missing_Values.fill_categorical(df, l_var=None, method='NR', verbose=False)[source]¶

Fill missing values for selected/all categorical features.

Parameters:

df (DataFrame) – Input dataset
l_var (list (Default : None)) – list of the features to fill. If None, contains all the categorical features
method (string (Default : 'NR')) – Method used to fill the NA values :
- NR : replace NA with ‘NR’
verbose (boolean (Default False)) – Get logging information

Returns:	Modified dataset
Return type:	DataFrame

AutoMxL.Preprocessing.Missing_Values.fill_numerical(df, l_var=None, method='median', track_num_NA=True, verbose=False)[source]¶

Fill missing values for selected/all numerical features. The track_num_NA parameter creates a variable that keeps track of missing values.

Available methods : replace with zero, median or mean (Default = median)

Parameters:

df (DataFrame) – Input dataset
l_var (list (Default : None)) – names of the features to fill. If None, all the numerical features
method (string (Default : 'median')) – Method used to fill the NA values :
- zero : replace with zero
- median : replace with median
- mean : replace with mean
track_num_NA (boolean (Default : True)) – If True, create a boolean column to keep track of missing values
verbose (boolean (Default False)) – Get logging information

Returns:	Modified dataset
Return type:	DataFrame
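
The three filling methods and the tracking column can be sketched on a single list, with None standing in for NA. The helper name fill_numerical_sketch is hypothetical, and returning the tracking flags as a separate list (rather than as a new DataFrame column) is a simplification.

```python
from statistics import mean, median


def fill_numerical_sketch(values, method="median", track_na=True):
    """Replace None with zero, median or mean; optionally return NA flags."""
    present = [v for v in values if v is not None]
    if method == "zero":
        fill = 0
    elif method == "median":
        fill = median(present)
    elif method == "mean":
        fill = mean(present)
    else:
        raise ValueError(f"unknown method: {method}")
    filled = [fill if v is None else v for v in values]
    # 1 where the original value was missing, 0 otherwise
    na_flags = [int(v is None) for v in values] if track_na else None
    return filled, na_flags


print(fill_numerical_sketch([1, None, 3, 10]))
```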

AutoMxL.Preprocessing.Missing_Values.get_NA_features(df)[source]¶

identify features containing NA values

Parameters:	df (DataFrame) – input dataset
Returns:	features containing missing values
Return type:	list

Categorical Data¶

Categorical features processing

CategoricalEncoder (class) : Encode categorical features
dummy_all_var (func) : get one hot encoded vector for each category of a categorical features list
get_embedded_cat (func) : get embedding representation with NN
mca (func) : to do

class AutoMxL.Preprocessing.Categorical.CategoricalEncoder(method='deep_encoder')[source]¶

Encode categorical features

Available encoding methods :

one_hot : one hot encoding
deep_encoder : Build and train a Neural Network for the creation of embeddings for categorical variables.

(https://www.fast.ai/2018/04/29/categorical-embeddings/)

Default NN model parameters are stored in param_config.py file

Parameters:	method (string (Default : deep_encoder)) – method used to get categorical encoding Available methods : “one_hot”, “deep_encoder”

fit(df, l_var=None, target=None, verbose=False)[source]¶

Fit encoder on dataset following method

Parameters:

df (DataFrame) – input dataset
l_var (list (Default None)) – names of the variables to encode. If None, all the categorical and boolean features
target (string (Default None)) – name of the target for deep_encoder method
verbose (boolean (Default False)) – Get logging information

fit_transform(df, l_var=None, target=None, verbose=False)[source]¶

fit and transform dataset categorical features

Parameters:

df (DataFrame) – input dataset
l_var (list (Default None)) – names of the variables to encode. If None, all the categorical and boolean features
target (string (Default None)) – name of the target for deep_encoder method
verbose (boolean (Default False)) – Get logging information

Returns:	modified dataset
Return type:	DataFrame

transform(df, verbose=False)[source]¶

transform dataset categorical features using the encoder. Can be done only if encoder has been fitted

Parameters:

df (DataFrame) – dataset to transform
verbose (boolean (Default False)) – Get logging information

Returns:	modified dataset
Return type:	DataFrame

AutoMxL.Preprocessing.Categorical.dummy_all_var(df, var_list=None, prefix_list=None, keep=False, verbose=False)[source]¶

Get one hot encoded vector for selected/all categorical features

Parameters:

df (DataFrame) – Input dataset
var_list (list (Default : None)) – Names of the features to dummify. If None, all the categorical features
prefix_list (list (default : None)) – Prefix to add before new features name (prefix+’_’+cat). If None, prefix=variable name
keep (boolean (Default = False)) – If True, delete the original feature
verbose (boolean (Default False)) – Get logging information

Returns:	Modified dataset
Return type:	DataFrame
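
One hot encoding with the prefix+'_'+cat naming can be sketched for a single column. The helper name one_hot is hypothetical, and sorting the categories to fix the column order is an assumption.

```python
def one_hot(values, prefix):
    """Return {prefix + '_' + category: list of 0/1 flags} for one column."""
    categories = sorted(set(values))
    return {
        f"{prefix}_{cat}": [int(v == cat) for v in values]
        for cat in categories
    }


print(one_hot(["red", "blue", "red"], "color"))
```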

AutoMxL.Preprocessing.Categorical.get_embedded_cat(df, var_list, target, batchsize, n_epochs, lr, verbose=False)[source]¶

Get embedded representation for categorical features using NN encoder

Parameters:

df (DataFrame) – input Dataset
var_list (list of strings) – features names
target (string) – target name
batchsize (int) – batch size for encoder training
n_epochs (int) – number of epochs for encoder training
lr (float) – encoder learning rate
verbose (boolean (Default False)) – Get logging information

Returns:	modified dataset
Return type:	DataFrame

Date Data¶

Date Features processing functions:

DateEncoder (class) : encode date features
all_to_date (func): detect dates from num/cat features and transform them to datetime format.
date_to_anc (func): transform datetime features to timedelta according to a ref date

class AutoMxL.Preprocessing.Date.DateEncoder(method='timedelta', date_ref=None)[source]¶

Encode date features

Available methods :

timedelta : compute time between date feature and parameter date_ref

Parameters:

method (string (Default : timedelta)) – method used to encode dates. Available methods : “timedelta”
date_ref (string '%d/%m/%y' (Default : None)) – Date to compute timedelta. If None, today date

fit(df, l_var=None, verbose=False)[source]¶

fit encoder

Parameters:

df (DataFrame) – input dataset
l_var (list) – features to encode. If None, contains all features identified as dates (see Features_Type module)
verbose (boolean (Default False)) – Get logging information

fit_transform(df, l_var=None, verbose=False)[source]¶

fit and transform dataset with encoder

Parameters:

df (DataFrame) – input dataset
l_var (list) – features to encode. If None, all features identified as dates (see Features_Type module)
verbose (boolean (Default False)) – Get logging information

transform(df, verbose=False)[source]¶

transform dataset date features using the encoder. Can be done only if encoder has been fitted

Parameters:

df (DataFrame) – dataset to transform
verbose (boolean (Default False)) – Get logging information

AutoMxL.Preprocessing.Date.all_to_date(df, l_var=None, verbose=False)[source]¶

Detect dates from selected/all features and transform them to datetime format.

Parameters:

df (DataFrame) – Input dataset
l_var (list (Default : None)) – Names of the features. If None, all the features
verbose (boolean (Default False)) – Get logging information

Returns:	Modified dataset
Return type:	DataFrame

AutoMxL.Preprocessing.Date.date_to_anc(df, l_var=None, date_ref=None, verbose=False)[source]¶

Transform selected/all datetime features to timedelta according to a ref date

Parameters:

df (DataFrame) – Input dataset
l_var (list (Default : None)) – List of the features to analyze. If None, contains all the datetime features
date_ref (string '%d/%m/%y' (Default : None)) – Date to compute timedelta. If None, today date
verbose (boolean (Default False)) – Get logging information

Returns:

DataFrame – Modified dataset
list – New timedelta features names
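
The timedelta computation can be sketched with the standard datetime module. The helper name days_since is hypothetical. Expressing the result in whole days and computing it as (reference date - feature date) are assumptions, since the unit and sign are not stated above.

```python
from datetime import datetime


def days_since(dates, date_ref=None, fmt="%d/%m/%y"):
    """Days elapsed between each date string and date_ref (today if None)."""
    ref = datetime.strptime(date_ref, fmt) if date_ref else datetime.today()
    return [(ref - datetime.strptime(d, fmt)).days for d in dates]


print(days_since(["01/01/20", "31/12/19"], date_ref="02/01/20"))
```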

Process Outliers¶

Outliers handling functions

OutliersEncoder (class) : identify and replace outliers
get_cat_outliers (func): identify categorical features containing outliers
get_num_outliers (func): identify numerical features containing outliers
replace_category (func): replace categories of a categorical variable
replace_extreme_values (func): replace extreme values

class AutoMxL.Preprocessing.Outliers.OutliersEncoder(cat_threshold=0.02, num_xstd=4)[source]¶

Identify and replace outliers for categorical and numerical features

num : x outlier <=> abs(x - mean) > xstd * var
cat : category is an outlier <=> its frequency < cat_threshold (Default 2%)

Parameters:

cat_threshold (float (default 0.02)) – Minimum modality frequency
num_xstd (int (Default : 4)) – Variance gap coef

fit(df, l_var, verbose=False)[source]¶

Fit encoder

Parameters:

df (DataFrame) – input dataset
l_var (list) – features to encode. If None, all features
verbose (boolean (Default False)) – Get logging information

fit_transform(df, l_var=None, verbose=False)[source]¶

Fit and transform dataset with encoder

Parameters:

df (DataFrame) – input dataset
l_var (list) – features to encode. If None, all features
verbose (boolean (Default False)) – Get logging information

transform(df, verbose=False)[source]¶

Transform dataset features using the encoder. Can be done only if encoder has been fitted

Parameters:

df (DataFrame) – dataset to transform
verbose (boolean (Default False)) – Get logging information

AutoMxL.Preprocessing.Outliers.get_cat_outliers(df, l_var=None, threshold=0.05, verbose=False)[source]¶

Outliers detection for selected/all categorical features.

Method : Modalities with frequency <x% (Default 5%)

Parameters:

df (DataFrame) – Input dataset
l_var (list (Default : None)) – Names of the features. If None, all the categorical features
threshold (float (Default : 0.05)) – Minimum modality frequency
verbose (boolean (Default False)) – Get logging information

Returns:	{variable : list of categories considered as outliers}
Return type:	dict
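
The frequency rule can be sketched for one categorical column with collections.Counter. The helper name rare_categories is hypothetical, and the strict "<" comparison follows the method description above.

```python
from collections import Counter


def rare_categories(values, threshold=0.05):
    """Return the categories whose relative frequency is below threshold."""
    counts = Counter(values)
    total = len(values)
    return sorted(cat for cat, n in counts.items() if n / total < threshold)


values = ["a"] * 60 + ["b"] * 38 + ["c"] * 2
print(rare_categories(values))
```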

AutoMxL.Preprocessing.Outliers.get_num_outliers(df, l_var=None, xstd=3, verbose=False)[source]¶

Outliers detection for selected/all numerical features.

Method : x outlier <=> abs(x - mean) > xstd * var

Parameters:

df (DataFrame) – Input dataset
l_var (list (Default : None)) – Names of the features. If None, all the num features
xstd (int (Default : 3)) – Variance gap coef
verbose (boolean (Default False)) – Get logging information

Returns:	{variable : [lower_limit, upper_limit]}
Return type:	dict
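
A minimal sketch of the [lower_limit, upper_limit] computation for one numerical column. The helper names are hypothetical. The sketch measures the spread with the population standard deviation, which is an assumption suggested by the parameter name xstd; the formula above writes "var", so the library may use a different spread measure.

```python
from statistics import mean, pstdev


def outlier_limits(values, xstd=3):
    """Return [mean - xstd * std, mean + xstd * std] for a list of numbers."""
    m, s = mean(values), pstdev(values)
    return [m - xstd * s, m + xstd * s]


def flag_outliers(values, xstd=3):
    """Return the values lying strictly outside the limits."""
    lower, upper = outlier_limits(values, xstd)
    return [v for v in values if v < lower or v > upper]


values = [10] * 20 + [1000]
print(flag_outliers(values))
```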

AutoMxL.Preprocessing.Outliers.replace_category(df, var, categories, replace_with='outliers', verbose=False)[source]¶

Replace categories of a categorical variable

Parameters:

df (DataFrame) – Input dataset
var (string) – variable to modify
categories (list(string)) – categories to replace
replace_with (string (Default : 'outliers')) – word to replace categories with
verbose (boolean (Default False)) – Get logging information

Returns:	Modified dataset
Return type:	DataFrame

AutoMxL.Preprocessing.Outliers.replace_extreme_values(df, var, lower_th=None, upper_th=None, verbose=False)[source]¶

Replace extreme values : > upper threshold or < lower threshold

Parameters:

df (DataFrame) – Input dataset
var (string) – variable to modify
lower_th (int/float (Default=None)) – lower threshold
upper_th (int/float (Default=None)) – upper threshold
verbose (boolean (Default False)) – Get logging information

Returns:	Modified dataset
Return type:	DataFrame

Features Selection¶

Features selection

FeatSelector (class) : features selection following method
select_features (func) : features selection following method

class AutoMxL.Select_Features.Select_Features.FeatSelector(method='pca')[source]¶

features selection following method

pca : use pca to reduce dataset dimensions
no_rescale_pca : use pca without rescaling data

Parameters:	method (string (Default pca)) – method used to select features

fit(df, l_var=None, verbose=False)[source]¶

fit selector

Parameters:

df (DataFrame) – input dataset
l_var (list) – features to encode. If None, all features identified as numerical
verbose (boolean (Default False)) – Get logging information

fit_transform(df, l_var, verbose=False)[source]¶

fit and apply features selection

Parameters:

df (DataFrame) – input dataset
l_var (list) – features to encode. If None, all features identified as numerical
verbose (boolean (Default False)) – Get logging information

Returns:	modified dataset
Return type:	DataFrame

transform(df, verbose=False)[source]¶

apply features selection on a dataset

Parameters:

df (DataFrame) – dataset to transform
verbose (boolean (Default False)) – Get logging information

Returns:	modified dataset
Return type:	DataFrame

AutoMxL.Select_Features.Select_Features.select_features(df, target, method='pca', verbose=False)[source]¶

features selection following method

pca : use pca to reduce dataset dimensions
no_rescale_pca : use pca without rescaling data

Parameters:

df (DataFrame) – input dataset containing features
target (string) – target name
method (string (Default pca)) – method used to select features
verbose (boolean (Default False)) – Get logging information

Returns:	modified dataset
Return type:	DataFrame

Modelisation¶

Bagging¶

Bagging algorithm class. Methods :

Bagging (class) : generate new, more balanced training sets and train a model on each
create_sample (func) : generate a bagging sample

class AutoMxL.Modelisation.Bagging.Bagging(clf=RandomForestClassifier(n_estimators=100, max_leaf_nodes=100), n_sample=5, pos_sample_size=1.0, replace=True)[source]¶

Meta-algorithm designed to improve the stability and accuracy of ML classification/regression algorithms, or to handle an imbalanced target distribution.

Bagging generates n_sample new, more balanced training sets. A model is then fitted on each sample and the outputs are combined by averaging (for regression) or voting (for classification).

Available classifiers : Random Forest and XGBoost

Parameters:

clf (model (Default : RandomForestClassifier(n_estimators=100, max_leaf_nodes=100))) – Model fitted on the samples
n_sample (int (Default : 5)) – number of samples
pos_sample_size (int/float (Default : 1.0)) –
Number/rate of target=1 observations in each sample (filled with 3 times more target=0)
- if int : number of target=1
- if float : rate of total target=1
replace (Boolean (Default : True)) – Enable sampling with replacement
list_model (list (Default : None)) – Fitted models (created with fit method)

bag_feature_importance(X)[source]¶

Get features importance of the model by averaging importance of models fitted on the samples

Parameters:	X (DataFrame) – Input Dataset
Returns:	{feature : importance}
Return type:	dict

fit(df_train, target)[source]¶

Create bagging samples from a DataFrame and fit the model (self.clf) on each sample

Parameters:

df_train (DataFrame) – Training dataset
target (String) – Target name

Returns:	self.list_model – Fitted models
Return type:	list

get_params()[source]¶

Get bagging object parameters

Returns:	{param : value}
Return type:	dict

predict(df)[source]¶

Apply models fitted on sample to a dataset. Combine models by averaging the outputs (for regression) or voting (for classification)

Parameters:	df (DataFrame) – Dataset to apply the model
Returns:

numpy.ndarray (float) – Averaged classification probabilities
numpy.ndarray (int) – Predictions for each observation
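
How per-model probabilities might be combined into the two outputs can be sketched on plain lists. The helper name combine_probabilities is hypothetical, and turning the averaged probability into a class with a 0.5 cut-off is an assumption; the library's voting rule may differ.

```python
def combine_probabilities(probas_per_model, threshold=0.5):
    """Average per-model probabilities and derive 0/1 predictions.

    probas_per_model: one list of P(target=1) per fitted model,
    all of the same length (one entry per observation).
    """
    n_models = len(probas_per_model)
    # zip(*...) groups the probabilities of each observation across models
    averaged = [sum(col) / n_models for col in zip(*probas_per_model)]
    predictions = [int(p >= threshold) for p in averaged]
    return averaged, predictions


print(combine_probabilities([[0.25, 1.0], [0.5, 0.5]]))
```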

AutoMxL.Modelisation.Bagging.create_sample(df, target, pos_target_nb, replace=False)[source]¶

Generate a DataFrame sample with selected number of target=1

Parameters:

df (DataFrame) – Input dataset
target (String) – Target name
pos_target_nb (int) – Number of target=1 observations in the sample
replace (Boolean (Default : False)) – If True, create samples with replacement

Returns:	sample dataset
Return type:	DataFrame
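
A minimal sketch of drawing one such sample from a list of row dicts. The helper name create_sample_sketch and the seed parameter are hypothetical. The 3-to-1 ratio of target=0 to target=1 rows is taken from the Bagging class description; applying it inside this function is an assumption.

```python
import random


def create_sample_sketch(rows, target, pos_target_nb, neg_ratio=3,
                         replace=False, seed=None):
    """Draw pos_target_nb positives plus neg_ratio times as many negatives."""
    rng = random.Random(seed)
    pos = [r for r in rows if r[target] == 1]
    neg = [r for r in rows if r[target] == 0]
    n_neg = neg_ratio * pos_target_nb
    if replace:
        # Sampling with replacement: the same row may be drawn several times
        return rng.choices(pos, k=pos_target_nb) + rng.choices(neg, k=n_neg)
    # Without replacement, never ask for more negatives than are available
    return rng.sample(pos, k=pos_target_nb) + rng.sample(neg, k=min(n_neg, len(neg)))


rows = [{"y": 1, "i": i} for i in range(10)] + [{"y": 0, "i": i} for i in range(100)]
sample = create_sample_sketch(rows, "y", 5, seed=0)
print(len(sample), sum(r["y"] for r in sample))
```

Without replacement, random.sample raises ValueError when pos_target_nb exceeds the number of target=1 rows.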

Hyperoptimisation¶

Model hyper-optimisation with random search :

HyperOpt (class) : model hyper-optimisation with random search

class AutoMxL.Modelisation.HyperOpt.HyperOpt(classifier='RF', grid_param=None, n_param_comb=10, bagging=False, bagging_param={'n_sample': 5, 'pos_sample_size': 1.0, 'replace': False}, comb_seed=None)[source]¶

Model hyper-optimisation with random search :

- from a hyper-parameters grid, create random HP combinations
- train a model for each combination
- apply the models

Parameters:

classifier (string (Default : 'RF')) – classifier for modelisation
grid_param (dict (Default : Default_RF_grid_param)) – HP grid
n_param_comb (int (Default : 10)) – number of HP combinations
bagging (Boolean (Default = False)) – use bagging method
bagging_param (dict) – bagging parameters (Default : default_bagging_param (Bagging module))
train_model_dict (dict, created with fit method) – {model_index : {‘HP’, ‘probas’, ‘model’, ‘features_importance’, ‘train_metrics’}}
bagging_object (Bagging) – bagging object
comb_seed (int) – seed for randomized HP combinations
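
Drawing random HP combinations from a grid can be sketched with the standard random module. The helper name random_combinations is hypothetical, the grid values below are illustrative, and drawing each parameter independently (so duplicate combinations are possible) is an assumption.

```python
import random


def random_combinations(grid_param, n_param_comb=10, comb_seed=None):
    """Draw n_param_comb random HP combinations from {param: candidate values}."""
    rng = random.Random(comb_seed)  # a fixed seed makes the draw reproducible
    return [
        {name: rng.choice(values) for name, values in grid_param.items()}
        for _ in range(n_param_comb)
    ]


grid = {"n_estimators": [50, 100, 200], "max_depth": [3, 5, None]}
for combination in random_combinations(grid, n_param_comb=3, comb_seed=1):
    print(combination)
```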

fit(df_train, target, verbose=False)[source]¶

Fit a model for each HP combination

Parameters:

df_train (DataFrame) – Training dataset
target (string) – Target name
verbose (boolean (Default False)) – Get logging information

Returns:	self.train_model_dict (created with fit method) – {model_index : {‘HP’, ‘probas’, ‘model’, ‘features_importance’, ‘train_metrics’}}
Return type:	dict

get_best_model(d_model_info, metric='F1', delta_auc_th=0.03, verbose=False)[source]¶

Identify valid models according to the AUC gap between train and test. Among the valid models, get the best one with respect to the selected metric.

Parameters:

d_model_info (dict) – {model_index : {‘HP’, ‘probas’, ‘model’, ‘features_importance’, ‘train_metrics’, ‘metrics’, ‘output’}}
metric (string (default = F1-score)) – Metric used to get the best model
delta_auc_th (float) – Threshold for valid models, applied to abs(auc(train) - auc(test))
verbose (boolean (Default False)) – Get logging information

Returns:

int – Best model index
list – Valid model indexes
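
The two-step selection can be sketched on a dict shaped like d_model_info. The helper name pick_best_model is hypothetical. The 'AUC' key inside ‘train_metrics’ and ‘metrics’ is an assumed name, the validity test uses a strict "<", and returning None when no model is valid is also an assumption.

```python
def pick_best_model(d_model_info, metric="F1", delta_auc_th=0.03):
    """Return (best model index, valid model indexes)."""
    # A model is valid when its train/test AUC gap stays under the threshold
    valid = [
        idx for idx, info in d_model_info.items()
        if abs(info["train_metrics"]["AUC"] - info["metrics"]["AUC"]) < delta_auc_th
    ]
    if not valid:
        return None, valid
    best = max(valid, key=lambda idx: d_model_info[idx]["metrics"][metric])
    return best, valid


models = {
    0: {"train_metrics": {"AUC": 0.95}, "metrics": {"AUC": 0.80, "F1": 0.70}},
    1: {"train_metrics": {"AUC": 0.85}, "metrics": {"AUC": 0.84, "F1": 0.60}},
    2: {"train_metrics": {"AUC": 0.82}, "metrics": {"AUC": 0.81, "F1": 0.65}},
}
print(pick_best_model(models))
```

Model 0 has the highest F1 here but is discarded first, because its train/test AUC gap suggests overfitting.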

get_params()[source]¶

Return Hyperopt object parameters

Returns:	{param : value}
Return type:	dict

model_res_to_df(d_model_infos, sort_metric='F1')[source]¶

Store models summary in DataFrame

Parameters:

d_model_infos (dict) – {model_index : {‘HP’, ‘probas’, ‘model’, ‘features_importance’, ‘train_metrics’, ‘metrics’, ‘output’}}
sort_metric (string (default = 'F1')) – metric used to sort models (descending)

Returns:	model infos and metrics
Return type:	DataFrame

predict(df, target, delta_auc, verbose=False)[source]¶

Apply the models

Parameters:

df (DataFrame) – Dataset to apply the models
target (string) – Target name
delta_auc (float) – Threshold for valid models, applied to abs(auc(train) - auc(test))
verbose (boolean (Default False)) – Get logging information

Returns:	{model_index : {‘HP’, ‘probas’, ‘model’, ‘features_importance’, ‘train_metrics’, ‘metrics’, ‘output’}}
Return type:	dict