K-mean alignment and variants for functional data

kma(
  x,
  y,
  n_clusters = 1L,
  warping_class = c("affine", "dilation", "none", "shift", "srsf"),
  seeds = NULL,
  maximum_number_of_iterations = 100L,
  centroid_type = c("mean", "medoid"),
  distance = c("l2", "pearson"),
  warping_options = c(0.15, 0.15),
  number_of_threads = 1L,
  parallel_method = 0L,
  distance_relative_tolerance = 0.001,
  use_fence = FALSE,
  check_total_dissimilarity = TRUE,
  use_verbose = TRUE,
  compute_overall_center = FALSE
)
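
Several arguments in the signature above (warping_class, centroid_type, distance) list all their choices as the default value, and the documented default is the first choice. This matches the behavior of base R's match.arg(). The sketch below illustrates that convention with a hypothetical stand-in function, toy_kma(), which is not part of the package; whether kma() itself relies on match.arg() internally is an assumption.

```r
# Hypothetical stand-in for kma(); it only illustrates how choice-vector
# defaults resolve and performs no alignment or clustering.
toy_kma <- function(warping_class = c("affine", "dilation", "none", "shift", "srsf"),
                    centroid_type = c("mean", "medoid")) {
  # match.arg() returns the first choice when the argument is not supplied
  # and validates any value the caller supplies.
  warping_class <- match.arg(warping_class)
  centroid_type <- match.arg(centroid_type)
  list(warping_class = warping_class, centroid_type = centroid_type)
}

# No arguments supplied: the first choice of each vector is used.
defaults <- toy_kma()

# Explicit choices override the defaults.
custom <- toy_kma(warping_class = "shift", centroid_type = "medoid")
```

Under this convention, passing a value outside the listed choices raises an error rather than silently falling back to the default.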

Arguments

x: A numeric matrix of shape nObs x nPts specifying the evaluation grid of each observation.
y: A numeric array of shape nObs x nDim x nPts specifying the observation values.
n_clusters: An integer value specifying the number of clusters. Defaults to 1L.
warping_class: A string specifying the warping class. Choices are "affine", "dilation", "none", "shift" or "srsf". Defaults to "affine". The SRSF class is the only class which is boundary-preserving.
seeds: An integer vector of length n_clusters specifying the indices of the initial templates. Defaults to NULL, in which case the indices are sampled at random.
maximum_number_of_iterations: An integer specifying the maximum number of iterations before the algorithm stops (default: 100L).
centroid_type: A string specifying the type of centroid to compute. Choices are "mean" or "medoid". Defaults to "mean". This is used only when warping_class != "srsf". When warping_class = "srsf", the mean is systematically used.
distance: A string specifying the distance used to compare curves. Choices are "l2" or "pearson". Defaults to "l2". This is used only when warping_class != "srsf".
warping_options: A numeric vector supplied as a helper to the chosen warping_class to decide on warping parameter bounds. This is used only when warping_class != "srsf".
number_of_threads: An integer value specifying the number of threads used for parallelization. Defaults to 1L. This is used only when warping_class != "srsf".
parallel_method: An integer value specifying the type of desired parallelization for template computation. If 0L, templates are computed in parallel. If 1L, parallelization occurs within a single template computation (only for the medoid method as of now). Defaults to 0L. This is used only when warping_class != "srsf".
distance_relative_tolerance: A numeric value specifying a relative tolerance on the distance update between two iterations. If all observations have not sufficiently improved in that sense, the algorithm stops. Defaults to 1e-3. This is used only when warping_class != "srsf".
use_fence: A boolean specifying whether the fence algorithm should be used to robustify the algorithm against outliers. Defaults to FALSE. This is used only when warping_class != "srsf".
check_total_dissimilarity: A boolean specifying whether an additional stopping criterion based on improvement of the total dissimilarity should be used. Defaults to TRUE. This is used only when warping_class != "srsf".
use_verbose: A boolean specifying whether the algorithm should output details of the steps to the console. Defaults to TRUE. This is used only when warping_class != "srsf".
compute_overall_center: A boolean specifying whether the overall center should be also computed. Defaults to FALSE. This is used only when warping_class != "srsf".
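
The shapes expected for x and y can be easy to mix up, so here is a minimal base R sketch that builds inputs with the documented layout (nObs x nPts for x, nObs x nDim x nPts for y). The toy data (four phase-shifted sine curves on a shared grid) and the variable names are made up for illustration only.

```r
# Hypothetical toy sizes: 4 observations, 1 dimension, 50 grid points.
n_obs <- 4L
n_dim <- 1L
n_pts <- 50L

# Evaluation grids: one row per observation. Here every curve shares the
# same grid, so the grid is repeated row by row.
grid <- seq(0, 2 * pi, length.out = n_pts)
x <- matrix(grid, nrow = n_obs, ncol = n_pts, byrow = TRUE)

# Observation values: phase-shifted sine curves stored in an array whose
# dimensions are ordered as nObs x nDim x nPts.
shifts <- c(-0.3, -0.1, 0.1, 0.3)
y <- array(0, dim = c(n_obs, n_dim, n_pts))
for (i in seq_len(n_obs)) {
  y[i, 1, ] <- sin(grid + shifts[i])
}

# Sanity checks on the shapes before handing x and y to kma().
stopifnot(identical(dim(x), c(n_obs, n_pts)))
stopifnot(identical(dim(y), c(n_obs, n_dim, n_pts)))
```

Note that a univariate sample still needs the middle dimension of size 1 in y; a plain nObs x nPts matrix would not match the documented shape.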

Value

An object of class kma, which is a list with the following components:

original_curves: A numeric array of shape \(N \times L \times M\) storing the original sample of \(N\) \(L\)-dimensional curves observed on grids of size \(M\).
original_grids: A numeric matrix of shape \(N \times M\) storing the original grids of size \(M\) on which the \(N\) curves were evaluated.
x: As input.
y: As input.
seeds: Indices used in the algorithm.
iterations: Number of iterations before the KMA algorithm stops.
n_clust: As input.
overall_center_grid: Overall center grid if compute_overall_center is set.
overall_center_values: Overall center values if compute_overall_center is set.
distances_to_overall_center: Distances of each observation to the overall center if compute_overall_center is set.
x_final: Aligned observation grids.
n_clust_final: Final number of clusters. Note that n_clust_final may differ from the initial number of clusters n_clust if some clusters are empty.
x_centers_final: Final center grids.
y_centers_final: Final center values.
template_grids: List of template grids at each iteration.
template_values: List of template values at each iteration.
labels: Cluster memberships.
final_dissimilarity: Distances of each observation to the center of its assigned cluster.
parameters_list: List of estimated warping parameters at each iteration.
parameters: Final estimated warping parameters.
warping_method: As input.
dissimilarity_method: As input.
center_method: As input.
optimizer_method: As input.
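
The labels and final_dissimilarity components described above can be combined to summarize how tight each cluster is. The sketch below uses a hand-written list, mock_res, that only mimics those two components; its values are invented for illustration and are not the output of an actual kma() run.

```r
# Hypothetical stand-in for a kma result: only the two components used
# below are mimicked, with made-up values.
mock_res <- list(
  labels = c(1L, 1L, 2L, 2L, 2L),
  final_dissimilarity = c(0.10, 0.30, 0.20, 0.40, 0.60)
)

# Number of observations assigned to each cluster.
cluster_sizes <- table(mock_res$labels)

# Mean distance of the observations to the center of their own cluster.
mean_dissimilarity <- tapply(mock_res$final_dissimilarity, mock_res$labels, mean)

print(cluster_sizes)
print(mean_dissimilarity)
```

The same two lines would apply to a real result object, provided its labels and final_dissimilarity components are vectors with one entry per observation, which is assumed here.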

Examples

# Argument names follow the signature documented above
# (n_clusters, centroid_type, warping_class, distance).
res <- kma(
  simulated30$x,
  simulated30$y,
  seeds = c(1L, 21L),
  n_clusters = 2L,
  centroid_type = "medoid",
  warping_class = "affine",
  distance = "pearson"
)