<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://recsyswiki.com/index.php?action=history&amp;feed=atom&amp;title=Gradient_descent</id>
	<title>Gradient descent - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://recsyswiki.com/index.php?action=history&amp;feed=atom&amp;title=Gradient_descent"/>
	<link rel="alternate" type="text/html" href="https://recsyswiki.com/index.php?title=Gradient_descent&amp;action=history"/>
	<updated>2026-05-14T05:28:06Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.34.2</generator>
	<entry>
		<id>https://recsyswiki.com/index.php?title=Gradient_descent&amp;diff=430&amp;oldid=prev</id>
		<title>Zeno Gantner at 17:18, 6 June 2011</title>
		<link rel="alternate" type="text/html" href="https://recsyswiki.com/index.php?title=Gradient_descent&amp;diff=430&amp;oldid=prev"/>
		<updated>2011-06-06T17:18:25Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 17:18, 6 June 2011&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot; &gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;'''Gradient descent''' ('''GD''') is a general [[optimization algorithm]] that can be used to find a (possibly local) minimum of a differentiable function. Stochastic gradient descent ('''SGD''') performs updates for single data points (or batches), whereas complete GD computes the complete gradient and then performs an update. In [[Recommender System|recommender systems]], methods based on gradient descent are popular for fitting the parameters of a prediction model, e.g. [[matrix factorization]] models.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;'''Gradient descent''' ('''GD''') is a general [[optimization algorithm]] that can be used to find a (possibly local) minimum of a differentiable function. &lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;'''&lt;/ins&gt;Stochastic gradient descent&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;''' &lt;/ins&gt;('''SGD''') performs updates for single data points (or batches), whereas complete GD computes the complete gradient and then performs an update. In [[Recommender System|recommender systems]], methods based on gradient descent are popular for fitting the parameters of a prediction model, e.g. [[matrix factorization]] models.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== External links ==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== External links ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;/table&gt;</summary>
		<author><name>Zeno Gantner</name></author>
		
	</entry>
	<entry>
		<id>https://recsyswiki.com/index.php?title=Gradient_descent&amp;diff=429&amp;oldid=prev</id>
		<title>Zeno Gantner at 17:18, 6 June 2011</title>
		<link rel="alternate" type="text/html" href="https://recsyswiki.com/index.php?title=Gradient_descent&amp;diff=429&amp;oldid=prev"/>
		<updated>2011-06-06T17:18:10Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 17:18, 6 June 2011&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l4&quot; &gt;Line 4:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 4:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* [[Wikipedia: Gradient descent]]&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* [[Wikipedia: Gradient descent]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:&lt;del class=&quot;diffchange diffchange-inline&quot;&gt;Methods&lt;/del&gt;]]&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Method&lt;/ins&gt;]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;/table&gt;</summary>
		<author><name>Zeno Gantner</name></author>
		
	</entry>
	<entry>
		<id>https://recsyswiki.com/index.php?title=Gradient_descent&amp;diff=256&amp;oldid=prev</id>
		<title>Alan at 12:59, 22 February 2011</title>
		<link rel="alternate" type="text/html" href="https://recsyswiki.com/index.php?title=Gradient_descent&amp;diff=256&amp;oldid=prev"/>
		<updated>2011-02-22T12:59:17Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 12:59, 22 February 2011&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot; &gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;'''Gradient descent''' ('''GD''') is a general [[optimization algorithm]] that can be used to find a (possibly local) minimum of a differentiable function. Stochastic gradient descent ('''SGD''') performs updates for single data points (or batches), whereas complete GD computes the complete gradient and then performs an update. In [[recommender systems]], methods based on gradient descent are popular for fitting the parameters of a prediction model, e.g. [[matrix factorization]] models.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;'''Gradient descent''' ('''GD''') is a general [[optimization algorithm]] that can be used to find a (possibly local) minimum of a differentiable function. Stochastic gradient descent ('''SGD''') performs updates for single data points (or batches), whereas complete GD computes the complete gradient and then performs an update. In [[&lt;ins class=&quot;diffchange diffchange-inline&quot;&gt;Recommender System|&lt;/ins&gt;recommender systems]], methods based on gradient descent are popular for fitting the parameters of a prediction model, e.g. [[matrix factorization]] models.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== External links ==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== External links ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;/table&gt;</summary>
		<author><name>Alan</name></author>
		
	</entry>
	<entry>
		<id>https://recsyswiki.com/index.php?title=Gradient_descent&amp;diff=216&amp;oldid=prev</id>
		<title>Zeno Gantner at 21:39, 21 February 2011</title>
		<link rel="alternate" type="text/html" href="https://recsyswiki.com/index.php?title=Gradient_descent&amp;diff=216&amp;oldid=prev"/>
		<updated>2011-02-21T21:39:50Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table class=&quot;diff diff-contentalign-left&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #222; text-align: center;&quot;&gt;Revision as of 21:39, 21 February 2011&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot; &gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt;−&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;'''Gradient descent''' ('''GD''')is a general [[optimization algorithm]] that can be used to find a (possibly local) minimum of a differentiable function. Stochastic gradient descent ('''SGD''') performs updates for single data points (or batches), whereas complete GD computes the complete gradient and then performs an update. In [[recommender systems]], methods based on gradient descent are popular for fitting the parameters of a prediction model, e.g. [[matrix factorization]] models.&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt;+&lt;/td&gt;&lt;td style=&quot;color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;'''Gradient descent''' ('''GD''') is a general [[optimization algorithm]] that can be used to find a (possibly local) minimum of a differentiable function. Stochastic gradient descent ('''SGD''') performs updates for single data points (or batches), whereas complete GD computes the complete gradient and then performs an update. In [[recommender systems]], methods based on gradient descent are popular for fitting the parameters of a prediction model, e.g. [[matrix factorization]] models.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== External links ==&lt;/div&gt;&lt;/td&gt;&lt;td class='diff-marker'&gt; &lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #222; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== External links ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;/table&gt;</summary>
		<author><name>Zeno Gantner</name></author>
		
	</entry>
	<entry>
		<id>https://recsyswiki.com/index.php?title=Gradient_descent&amp;diff=215&amp;oldid=prev</id>
		<title>Zeno Gantner: new page</title>
		<link rel="alternate" type="text/html" href="https://recsyswiki.com/index.php?title=Gradient_descent&amp;diff=215&amp;oldid=prev"/>
		<updated>2011-02-21T21:38:20Z</updated>

		<summary type="html">&lt;p&gt;new page&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;'''Gradient descent''' ('''GD''')is a general [[optimization algorithm]] that can be used to find a (possibly local) minimum of a differentiable function. Stochastic gradient descent ('''SGD''') performs updates for single data points (or batches), whereas complete GD computes the complete gradient and then performs an update. In [[recommender systems]], methods based on gradient descent are popular for fitting the parameters of a prediction model, e.g. [[matrix factorization]] models.&lt;br /&gt;
&lt;br /&gt;
== External links ==&lt;br /&gt;
* [[Wikipedia: Gradient descent]]&lt;br /&gt;
&lt;br /&gt;
[[Category:Methods]]&lt;/div&gt;</summary>
		<author><name>Zeno Gantner</name></author>
		
	</entry>
</feed>