Advanced Gradient Descent

In simple Batch Gradient Descent (BGD), the parameters are updated using the sum of squared errors over the entire dataset. BGD converges accurately, but when the dataset is large, convergence takes a lot of time and memory. To overcome these problems, the following evolutions were made to the parameter-optimization procedure, starting with Stochastic Gradient Descent.
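The contrast described above can be sketched in code. This is a minimal illustration, not the post's own implementation: batch gradient descent computes one gradient over the whole dataset per update, while stochastic gradient descent updates after each single example. The synthetic data, learning rates, and epoch counts here are illustrative assumptions.

```python
import numpy as np

# Synthetic linear-regression data (illustrative assumption, not from the post).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([2.0, -3.0])
y = X @ true_w + 0.1 * rng.normal(size=200)

def batch_gd(X, y, lr=0.1, epochs=100):
    """Batch GD: each update uses the gradient over the full dataset."""
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / n  # averaged over all n examples
        w -= lr * grad
    return w

def stochastic_gd(X, y, lr=0.01, epochs=20):
    """Stochastic GD: each update uses the gradient of one example."""
    w = np.zeros(X.shape[1])
    n = len(y)
    for _ in range(epochs):
        for i in rng.permutation(n):  # shuffle, then one example per step
            grad = X[i] * (X[i] @ w - y[i])
            w -= lr * grad
    return w

w_bgd = batch_gd(X, y)
w_sgd = stochastic_gd(X, y)
print("BGD:", w_bgd)
print("SGD:", w_sgd)
```

Both variants recover roughly the same weights on this small problem; the practical difference shows up at scale, where BGD must touch every example before making a single update while SGD makes progress immediately.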