class Spark::Mllib::LogisticRegressionWithSGD

Constants

DEFAULT_OPTIONS

Public Class Methods

train(rdd, options={})

Train a logistic regression model on the given data.

Arguments (everything except rdd is a key of the options hash):

rdd

The training data, an RDD of LabeledPoint.

iterations

The number of iterations (default: 100).

step

The step parameter used in SGD (default: 1.0).

mini_batch_fraction

Fraction of data to be used for each SGD iteration.

initial_weights

The initial weights (default: nil).

reg_param

The regularization parameter (default: 0.01).

reg_type

The type of regularizer used for training our model (default: “l2”).

Allowed values:

  • “l1” for using L1 regularization

  • “l2” for using L2 regularization

  • nil for no regularization

intercept

Boolean parameter which indicates whether to train on the augmented representation of the data, i.e. whether a bias feature is added (default: false).

validate

Boolean parameter which indicates whether the algorithm should validate the data before training (default: true).
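
As a rough illustration of how the options above fit together, the sketch below merges caller-supplied options over the documented defaults and checks reg_type against the allowed values. DOCUMENTED_DEFAULTS, ALLOWED_REG_TYPES and build_options are hypothetical names made up for this example, not part of the library. mini_batch_fraction is left out of the defaults because this page does not state one.

```ruby
# Hypothetical helper, not part of Spark::Mllib: collects the defaults
# documented above. mini_batch_fraction is omitted because this page
# does not state its default.
DOCUMENTED_DEFAULTS = {
  iterations: 100,
  step: 1.0,
  initial_weights: nil,
  reg_param: 0.01,
  reg_type: 'l2',
  intercept: false,
  validate: true
}.freeze

# The three values listed as allowed for reg_type.
ALLOWED_REG_TYPES = ['l1', 'l2', nil].freeze

# Merge caller options over the documented defaults and reject an
# unknown reg_type before anything would be sent to Spark.
def build_options(options = {})
  merged = DOCUMENTED_DEFAULTS.merge(options)
  unless ALLOWED_REG_TYPES.include?(merged[:reg_type])
    raise ArgumentError, "reg_type must be 'l1', 'l2' or nil"
  end
  merged
end

# Override a few values; everything else keeps its documented default.
options = build_options(iterations: 50, mini_batch_fraction: 0.5, reg_type: 'l1')
```

A hash built this way could then be passed as the second argument, for example train(rdd, options), where rdd is an RDD of LabeledPoint.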

Calls superclass method
# File lib/spark/mllib/classification/logistic_regression.rb, line 145
def self.train(rdd, options={})
  super

  weights, intercept = Spark.jb.call(RubyMLLibAPI.new, 'trainLogisticRegressionModelWithSGD', rdd,
                                     options[:iterations].to_i,
                                     options[:step].to_f,
                                     options[:mini_batch_fraction].to_f,
                                     options[:initial_weights],
                                     options[:reg_param].to_f,
                                     options[:reg_type],
                                     options[:intercept],
                                     options[:validate])

  LogisticRegressionModel.new(weights, intercept)
end
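
The source above converts several options with to_i and to_f before handing them to the JVM side. The following sketch reproduces only that conversion and argument order in plain Ruby, to show what it does to string and missing values. positional_args is a hypothetical helper for this example and does not call Spark. Whether the superclass call fills in DEFAULT_OPTIONS before this point is not shown on this page, so the sketch makes no assumption about it.

```ruby
# Hypothetical helper mirroring the argument order and the to_i / to_f
# conversions in the source above; it does not call Spark.
def positional_args(options)
  [
    options[:iterations].to_i,
    options[:step].to_f,
    options[:mini_batch_fraction].to_f,
    options[:initial_weights],
    options[:reg_param].to_f,
    options[:reg_type],
    options[:intercept],
    options[:validate]
  ]
end

# String values are converted, so '200' becomes 200 and '0.5' becomes 0.5.
args = positional_args(iterations: '200', step: '0.5', reg_type: 'l2')

# A key that is absent is nil, and nil.to_i / nil.to_f yield 0 and 0.0,
# so an option that never received a value would reach Spark as zero.
missing = positional_args({})
```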