convst.classifiers.R_DST_Ridge
- class convst.classifiers.R_DST_Ridge(transform_type='auto', phase_invariance=False, alpha=0.5, normalize_output=False, n_samples=None, n_shapelets=10000, shapelet_lengths=[11], shapelet_lengths_bounds=None, lengths_bounds_reduction=0.5, prime_dilations=False, proba_norm=0.8, percentiles=[5, 10], n_jobs=1, random_state=None, min_len=None, class_weight=None, fit_intercept=True, alphas_ridge=[9.999999999999999e-05, 0.0002636650898730358, 0.0006951927961775606, 0.0018329807108324356, 0.004832930238571752, 0.012742749857031334, 0.033598182862837805, 0.08858667904100824, 0.23357214690901212, 0.615848211066026, 1.623776739188721, 4.281332398719395, 11.288378916846883, 29.76351441631313, 78.47599703514607, 206.913808111479, 545.5594781168514, 1438.4498882876599, 3792.690190732246, 10000.0])
Bases: BaseEstimator, ClassifierMixin
A wrapper class which uses R_DST as a transformer, followed by a Ridge classifier.
- Parameters:
- transform_type : str, optional
Type of transformer to use. Different transformer classes are used depending on the characteristics of the input time series. For example, for run-time optimization reasons, the transformer for univariate series is not the same as the one for multivariate series. The default is ‘auto’, which automatically selects the transformer based on the data passed to the fit method.
- phase_invariance : bool, optional
Whether to use phase invariance for shapelet sampling and distance computation. The default is False.
- alpha : float, optional
The alpha similarity parameter. The higher the value, the lower the allowed number of common indexes with previously sampled shapelets when sampling a new one with similar parameters. It can cause the number of sampled shapelets to be lower than n_shapelets if the whole search space has been covered. The default is 0.5.
- normalize_output : bool, optional
Whether to normalize the argmin and shapelet occurrence features by the length of the series from which they were extracted. This is mostly useful for variable length time series. The default is False.
- n_samples : float, optional
Proportion (in ]0,1]) of samples to consider for the shapelet extraction. The default is None, meaning that all samples are used.
- n_shapelets : int, optional
The maximum number of shapelets to be sampled. The default is 10_000.
- shapelet_lengths : array, optional
The set of possible lengths for shapelets. Each value can be an integer, to specify an absolute length, or a float, to specify a length relative to the input time series length. The default is [11].
- shapelet_lengths_bounds : array, optional
A 1D array with two elements containing the min and max possible lengths for shapelet candidates. Values can be int or float. The default is None, meaning that the shapelet_lengths parameter is used.
- lengths_bounds_reduction : float, optional
A float in ]0,1], quantifying the proportion of lengths to explore between the min and max bounds of shapelet_lengths_bounds. The default is 0.5. For example, with bounds as [4,10], and a reduction of 0.5, only [4,6,8,10] will be considered as possible lengths.
- prime_dilations : bool, optional
If True, only dilations with prime values will be considered for shapelet candidates. This will greatly speed up the algorithm for long time series and/or short shapelet lengths, possibly at the cost of some accuracy. The default is False.
- proba_norm : float, optional
The proportion of shapelets that will use a normalized distance function, which induces scale invariance. The default is 0.8.
- percentiles : array, optional
The two percentiles used to select the lambda threshold used to compute the Shapelet Occurrence feature. The default is [5,10].
- n_jobs : int, optional
The number of threads used to sample and compute the distance vectors. The default is 1; -1 means all available cores.
- random_state : object, optional
The seed for the random state. The default is None.
- max_channels : int, optional
The maximum number of features possibly considered by a multivariate shapelet. The default is None, meaning max_channels=n_features.
- min_len : int, optional
The minimum length of an input time series for variable length input. The default is None, meaning min_len=min(n_timestamps) on the training data. This can cause an error if a shorter series is present in the test set.
- class_weight : object, optional
Class weight option of the Ridge classifier: either None, “balanced”, or a custom dictionary of weights for each class. The default is None.
- fit_intercept : bool, optional
If True, the intercept term will be fitted during the ridge regression. The default is True.
- alphas_ridge : array, optional
Array of alpha values to try, which control the regularization strength. Each value must be a positive float. The default is np.logspace(-4,4,20).
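The interplay between shapelet_lengths_bounds and lengths_bounds_reduction can be sketched as follows. This is a minimal illustration, not the library's code: the helpers `resolve_length` and `candidate_lengths` are hypothetical, and both the rounding of relative (float) lengths and the even spacing rule are assumptions chosen so that the documented example (bounds [4,10], reduction 0.5, lengths [4,6,8,10]) is reproduced.

```python
import math


def resolve_length(length, n_timestamps):
    """Map an absolute (int) or relative (float) length to a number of points."""
    if isinstance(length, float):
        # Assumption: relative lengths are scaled by the series length and truncated.
        return max(1, int(length * n_timestamps))
    return int(length)


def candidate_lengths(bounds, reduction, n_timestamps):
    """Pick evenly spaced lengths between the two bounds (hypothetical rule)."""
    lo, hi = (resolve_length(b, n_timestamps) for b in bounds)
    n_all = hi - lo + 1                              # every integer length in [lo, hi]
    n_keep = max(2, math.ceil(n_all * reduction))    # proportion of lengths to explore
    step = (hi - lo) / (n_keep - 1)
    # A set removes duplicates that rounding may produce for narrow bounds.
    return sorted({lo + round(i * step) for i in range(n_keep)})


print(candidate_lengths([4, 10], 0.5, n_timestamps=100))
```

With a reduction of 1.0 the same sketch returns every integer length between the bounds.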
- Attributes:
- classifier : object
A sklearn pipeline for RidgeClassifierCV with L2 regularization.
- transformer : object
An instance of R_DST.
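The default alphas_ridge grid shown in the class signature is documented as np.logspace(-4,4,20). The sketch below rebuilds that grid with the standard library only, to make the spacing of the candidate regularization strengths explicit. The helper name `log_grid` is hypothetical and is not part of convst.

```python
import math


def log_grid(start_exp=-4.0, stop_exp=4.0, num=20):
    """Return `num` values evenly spaced on a log10 scale, endpoints included."""
    step = (stop_exp - start_exp) / (num - 1)
    return [10.0 ** (start_exp + i * step) for i in range(num)]


alphas = log_grid()
# Consecutive values differ by a constant factor of 10 ** (8 / 19).
ratio = alphas[1] / alphas[0]
print(len(alphas), alphas[0], alphas[-1], ratio)
```

Because the grid is geometric, each candidate is roughly 2.64 times the previous one, so the 20 values span eight orders of magnitude.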
Methods
- __init__([transform_type, phase_invariance, ...])
- fit(X, y): Fit method.
- get_params([deep]): Get parameters for this estimator.
- predict(X): Transform the input time series with R_DST and predict their classes using the fitted Ridge Classifier.
- score(X, y): Perform the prediction on input time series and return the accuracy score based on the class information.
- set_params(**params): Set the parameters of this estimator.
- fit(X, y)
Fit method. Random shapelets are generated using the parameters supplied during initialisation. Then, input time series are transformed using R_DST before classification with a Ridge classifier.
- Parameters:
- X : array, shape=(n_samples, n_features, n_timestamps)
Input time series.
- y : array, shape=(n_samples)
Class of the input time series.
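As a rough illustration of the expected input layout, the sketch below builds a toy univariate dataset shaped (n_samples, 1, n_timestamps) and shows one way the classifier might then be fitted and scored. The helpers `make_toy_dataset` and `run_example` are hypothetical. `run_example` is not called here, since it assumes convst and numpy are installed; the small n_shapelets value and the train/test split are arbitrary choices for the example.

```python
import random


def make_toy_dataset(n_samples=20, n_timestamps=50, seed=0):
    """Return (X, y) as nested lists, with X shaped (n_samples, 1, n_timestamps)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        series = [rng.gauss(0.0, 0.1) for _ in range(n_timestamps)]
        if label == 1:
            # Class 1 carries a bump in the middle third of the series.
            for t in range(n_timestamps // 3, 2 * n_timestamps // 3):
                series[t] += 1.0
        X.append([series])  # a single feature (channel) per sample
        y.append(label)
    return X, y


def run_example():
    """Fit and score R_DST_Ridge on the toy data (assumes convst and numpy)."""
    import numpy as np
    from convst.classifiers import R_DST_Ridge

    X, y = make_toy_dataset()
    X, y = np.asarray(X, dtype=float), np.asarray(y)
    clf = R_DST_Ridge(n_shapelets=100, random_state=0)
    clf.fit(X[:14], y[:14])            # first 14 samples for training
    return clf.score(X[14:], y[14:])   # accuracy on the remaining 6
```

For multivariate series, the second axis would hold one list per feature instead of a single one.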
- get_params(deep=True)
Get parameters for this estimator.
- Parameters:
- deep : bool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- params : dict
Parameter names mapped to their values.
- predict(X)
Transform the input time series with R_DST and predict their classes using the fitted Ridge Classifier.
- Parameters:
- X : array, shape=(n_samples, n_features, n_timestamps)
Input time series.
- Returns:
- array, shape=(n_samples)
Predicted class for each input time series.
- score(X, y)
Perform the prediction on input time series and return the accuracy score based on the class information.
- Parameters:
- X : array, shape=(n_samples, n_features, n_timestamps)
Input time series.
- y : array, shape=(n_samples)
Class of the input time series.
- Returns:
- float
Accuracy score on the input time series.
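The accuracy returned by score is the fraction of input series whose predicted class matches the given class. A minimal sketch of that computation, using a hypothetical `accuracy` helper rather than the library's own code:

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Three of the four predictions match the labels.
print(accuracy(["a", "b", "b", "a"], ["a", "b", "a", "a"]))
```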
- set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **params : dict
Estimator parameters.
- Returns:
- self : estimator instance
Estimator instance.
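The <component>__<parameter> naming convention can be illustrated with a small standalone sketch that separates top-level parameters from nested ones by splitting each key on its first double underscore. The helper `group_nested_params` and the component name `ridge` are hypothetical and only serve the illustration.

```python
def group_nested_params(params):
    """Split flat keys such as 'ridge__cv' into top-level and per-component dicts."""
    top_level, nested = {}, {}
    for key, value in params.items():
        # partition splits on the first '__' only, so deeper nesting stays in sub_key.
        name, delim, sub_key = key.partition("__")
        if delim:
            nested.setdefault(name, {})[sub_key] = value
        else:
            top_level[name] = value
    return top_level, nested


top_level, nested = group_nested_params(
    {"n_jobs": 2, "ridge__fit_intercept": False, "ridge__cv": 5}
)
print(top_level)
print(nested)
```

Here `n_jobs` would be set on the estimator itself, while the two `ridge__` entries would be forwarded to the component named `ridge`.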