Reproducibility

convst provides time series classification algorithms that have been published in the literature. To allow users to verify those results, we provide instructions and scripts that reproduce the experimental setup used to obtain them.

Obtain the UEA & UCR Datasets

The UEA & UCR Time Series Classification Repository is an ongoing project to develop a comprehensive repository for research into time series classification providing datasets as well as code and results for many algorithms.

Convenience functions are provided in convst to download a dataset from this repository by simply specifying its name:

  • Original Train & Test splits for a dataset: convst.utils.load_sktime_dataset_split(),

  • Full dataset: convst.utils.load_sktime_dataset().

Obtain the UEA & UCR Resamples

The UEA & UCR Time Series Classification Repository also hosts published results, obtained with a validation over 30 resamples. Using the same resamples lets you compare your results directly against state-of-the-art algorithms.

The published results are obtained with the tsml Java implementation, and the function used to generate the resamples is resampleTrainAndTestInstances. To reproduce them, download tsml and use the DataHandling example to generate the resamples from the Train and Test arff files of each dataset that you previously downloaded from the UCR archive.

We provide two classes that can be used with sklearn cross validation tools. Note that if you use the random one, you won't need to perform the previous steps with tsml, but the comparison to published results will not be valid.

  • Using resamples from tsml: convst.utils.UCR_stratified_resample(),

  • Using random resamples: convst.utils.stratified_resample().
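To illustrate what a random stratified resample does, here is a minimal sketch of a splitter that follows the split / get_n_splits protocol expected by sklearn cross validation tools. It is not the convst implementation: the class name StratifiedResampleSketch and its parameters (n_splits, n_train_per_class, seed) are hypothetical, and it does not reproduce the tsml random number generator, so its splits will differ from the published resamples. The assumption is that each resample keeps a fixed number of training samples per class, as in the original Train split.

```python
import random
from collections import defaultdict


class StratifiedResampleSketch:
    """Hypothetical splitter for illustration; not part of convst or tsml."""

    def __init__(self, n_splits, n_train_per_class, seed=0):
        self.n_splits = n_splits
        # Number of training samples to draw for each class label.
        self.n_train_per_class = dict(n_train_per_class)
        self.seed = seed

    def get_n_splits(self, X=None, y=None, groups=None):
        return self.n_splits

    def split(self, X, y, groups=None):
        # Group sample indices by class label.
        by_class = defaultdict(list)
        for idx, label in enumerate(y):
            by_class[label].append(idx)
        for i in range(self.n_splits):
            # One seeded generator per resample keeps each split repeatable.
            rng = random.Random(self.seed + i)
            train, test = [], []
            for label in sorted(by_class):
                indices = by_class[label][:]
                rng.shuffle(indices)
                k = self.n_train_per_class[label]
                train.extend(indices[:k])
                test.extend(indices[k:])
            yield sorted(train), sorted(test)


# Toy labels: 6 samples of class "a" and 4 of class "b".
y = ["a"] * 6 + ["b"] * 4
splitter = StratifiedResampleSketch(n_splits=3, n_train_per_class={"a": 3, "b": 2})
splits = list(splitter.split(None, y))
```

Each element of splits is a (train indices, test indices) pair with the same class counts in every training set. Passing an object of this shape as the cv argument of sklearn tools such as cross_validate should work in the same way as the classes listed above, although that usage is not tested here.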

Running the cross validation script

WIP