Linear Regression

Linear Regression with Adam Optimizer

Published by Alexey Novakov

8 min, 1488 words

Adam is another optimization algorithm used in neural networks. It is based on adaptive estimates of lower-order moments. It has more hyper-parameters to tune externally than classic Gradient Descent.

Good default settings for the tested machine learning problems are:

  • α = 0.001 (learning rate; we have already seen this one in classic Gradient Descent)
  • β1 = 0.9
  • β2 = 0.999
  • ε = 10⁻⁸
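
To make the roles of these hyper-parameters concrete, below is a minimal sketch of a single Adam update for one scalar parameter, written in Scala. It is an illustration only, not the article's implementation; the names AdamState and step are made up for this example.

// Hypothetical sketch of one Adam step for a scalar parameter;
// not the article's code.
case class AdamState(m: Double = 0.0, v: Double = 0.0, t: Int = 0)

def step(
    param: Double, grad: Double, state: AdamState,
    alpha: Double = 0.001, // learning rate
    beta1: Double = 0.9,   // decay rate of the first-moment estimate
    beta2: Double = 0.999, // decay rate of the second-moment estimate
    eps: Double = 1e-8     // small constant to avoid division by zero
): (Double, AdamState) = {
  val t = state.t + 1
  // Exponential moving averages of the gradient and the squared gradient
  val m = beta1 * state.m + (1 - beta1) * grad
  val v = beta2 * state.v + (1 - beta2) * grad * grad
  // Bias-corrected moment estimates
  val mHat = m / (1 - math.pow(beta1, t))
  val vHat = v / (1 - math.pow(beta2, t))
  // Update, scaled per parameter by the second-moment estimate
  (param - alpha * mHat / (math.sqrt(vHat) + eps), AdamState(m, v, t))
}

The bias correction matters early in training: while t is small, m and v are biased toward zero, and dividing by (1 - β^t) compensates for that.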
Read More

Linear Regression with Gradient Descent

Published by Alexey Novakov

9 min, 1670 words

In this article we are going to use the Scala mini-library for Deep Learning that we developed earlier to study a basic linear regression task. We will learn the model weights using a perceptron model, which will be our single-unit network layer that emits the target value. This model predicts a target value yHat based on two trained parameters: weight and bias. Both are scalar numbers. Weight optimization is based on the Gradient Descent algorithm implemented earlier:

Model equation:

y = bias + weight * x
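
As a rough preview of what the article builds, here is a self-contained sketch of fitting weight and bias with plain gradient descent on mean squared error. It does not use the article's mini-library; the function name fit and its defaults are assumptions for this example.

// Hypothetical sketch: batch gradient descent for y = bias + weight * x;
// not the article's mini-library code.
def fit(xs: Array[Double], ys: Array[Double],
        lr: Double = 0.05, epochs: Int = 1000): (Double, Double) = {
  var weight = 0.0
  var bias = 0.0
  val n = xs.length
  for (_ <- 1 to epochs) {
    // Per-sample prediction errors: yHat - y
    val errors = xs.zip(ys).map { case (x, y) => (bias + weight * x) - y }
    // Gradients of mean squared error w.r.t. weight and bias
    val gradW = 2.0 / n * errors.zip(xs).map { case (e, x) => e * x }.sum
    val gradB = 2.0 / n * errors.sum
    weight -= lr * gradW
    bias -= lr * gradB
  }
  (weight, bias)
}

// Usage: recover y = 1 + 2 * x from noiseless samples
val xs = Array(0.0, 1.0, 2.0, 3.0, 4.0)
val ys = xs.map(x => 1.0 + 2.0 * x)
val (w, b) = fit(xs, ys) // w ≈ 2.0, b ≈ 1.0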
Read More