class Spark::Mllib::KMeans
Public Class Methods
train(rdd, k, max_iterations: 100, runs: 1, initialization_mode: 'k-means||', seed: nil, initialization_steps: 5, epsilon: 0.0001)
Trains a k-means model using the given set of parameters.
Arguments:
- rdd
  Training data, as an RDD.
- k
  Number of clusters.
- max_iterations
  Maximum number of iterations, defaults to 100.
- runs
  Number of parallel runs, defaults to 1. The best model is returned.
- initialization_mode
  Initialization mode, either "random" or "k-means||" (default).
- seed
  Random seed value for cluster initialization.
- initialization_steps
  Number of steps for the k-means|| initialization mode, defaults to 5.
- epsilon
  The distance threshold within which centers are considered to have converged, defaults to 0.0001.
# File lib/spark/mllib/clustering/kmeans.rb, line 113
def self.train(rdd, k, max_iterations: 100, runs: 1, initialization_mode: 'k-means||', seed: nil, initialization_steps: 5, epsilon: 0.0001)
  # Call returns KMeansModel
  Spark.jb.call(RubyMLLibAPI.new, 'trainKMeansModel', rdd, k, max_iterations, runs, initialization_mode, Spark.jb.to_long(seed), initialization_steps, epsilon)
end
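To make the roles of k, max_iterations, seed and epsilon more concrete, here is a minimal pure-Ruby sketch of Lloyd's algorithm with "random" initialization. It is not the Spark MLlib implementation that train delegates to: the names simple_kmeans, nearest_center and squared_distance are hypothetical, the data is a plain Array rather than an RDD, and the runs and "k-means||" options are left out.

```ruby
# Illustrative sketch only: a single-process k-means (Lloyd's algorithm).
# All method names below are hypothetical and not part of ruby-spark.

# Squared Euclidean distance between two points (arrays of floats).
def squared_distance(a, b)
  a.zip(b).sum { |x, y| (x - y)**2 }
end

# Index of the center closest to the given point.
def nearest_center(point, centers)
  centers.each_index.min_by { |i| squared_distance(point, centers[i]) }
end

def simple_kmeans(points, k, max_iterations: 100, seed: nil, epsilon: 0.0001)
  rng = seed ? Random.new(seed) : Random.new
  # "random" initialization: use k distinct input points as starting centers.
  centers = points.sample(k, random: rng).map(&:dup)

  max_iterations.times do
    # Assignment step: group points by the index of their nearest center.
    clusters = points.group_by { |p| nearest_center(p, centers) }

    # Update step: move each center to the mean of its assigned points.
    new_centers = centers.each_with_index.map do |center, i|
      members = clusters[i]
      next center if members.nil? # keep a center that attracted no points
      members.transpose.map { |coords| coords.sum / members.size }
    end

    # Converged once no center has moved farther than epsilon.
    moved = centers.zip(new_centers)
                   .map { |old, new| Math.sqrt(squared_distance(old, new)) }
                   .max
    centers = new_centers
    break if moved <= epsilon
  end

  centers
end

points = [[0.0, 0.0], [1.0, 1.0], [9.0, 8.0], [8.0, 9.0]]
centers = simple_kmeans(points, 2, seed: 42)
```

In this sketch, seed only affects which points are drawn as the initial centers, and epsilon bounds the largest center movement tolerated before the loop stops early. max_iterations caps the work when that threshold is never reached.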