For a function z = f(x, y), the gradient is a vector in the xy-plane that points in the direction in which z has its greatest instantaneous rate of change at a given point, i.e. the gradient points in the direction of steepest ascent.
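For example (a standard illustration, not from the original text): if f(x, y) = x² + y², the gradient is ∇f = (2x, 2y); at the point (1, 2) it equals (2, 4), which points straight away from the origin, exactly the direction in which this bowl-shaped surface climbs fastest.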
The gradient is a way of packing together all of a function's partial derivative information. So let's start by computing the partial derivatives of an example function.
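Since the original doesn't pin down a specific function, here is a minimal numerical sketch in Python, assuming f(x, y) = x²y as the example; it estimates each partial derivative with a central difference:

```python
# Minimal sketch: central-difference partial derivatives of an
# assumed example function f(x, y) = x**2 * y.

def f(x, y):
    return x**2 * y

def partial_x(f, x, y, h=1e-6):
    # Nudge x while holding y fixed.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def partial_y(f, x, y, h=1e-6):
    # Nudge y while holding x fixed.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

print(partial_x(f, 3.0, 2.0))  # exact value: 2xy = 12
print(partial_y(f, 3.0, 2.0))  # exact value: x**2 = 9
```

Stacking those two numbers into a vector at each point is exactly what the gradient does.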
Take a moment to delight in the fact that one single operation, the gradient, packs enough information to compute the rate of change of a function in every possible direction!
When you expand it for a function of five variables, the gradient would have five components, and the direction vector itself would have five components; the directional derivative is the dot product of those two vectors. So, this is the directional derivative and how you calculate it.
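Here is a small sketch of that calculation in Python, using an assumed five-variable example function (the original doesn't specify one); the directional derivative comes out as the dot product of the five-component gradient with a five-component unit vector:

```python
import numpy as np

# Sketch: directional derivative of a five-variable function as the
# dot product of its gradient with a unit direction vector.

def f(v):
    return np.sum(v**2)  # assumed example: x1^2 + ... + x5^2

def numerical_gradient(f, v, h=1e-6):
    # Central difference along each of the five coordinates.
    grad = np.zeros_like(v)
    for i in range(len(v)):
        e = np.zeros_like(v)
        e[i] = h
        grad[i] = (f(v + e) - f(v - e)) / (2 * h)
    return grad

point = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
direction = np.ones(5) / np.sqrt(5)   # unit direction vector

grad = numerical_gradient(f, point)   # five components: 2 * point
print(grad @ direction)               # directional derivative, ~13.416
```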
Here we go over many different ways to extend the idea of a derivative to higher dimensions, including partial derivatives, directional derivatives, the gradient, vector derivatives, divergence, curl, and more!
A directional derivative describes the rate at which a function's output changes when the input is nudged slightly in the direction of a given vector, and the gradient is the vector field defined by the partial derivatives of f with respect to each variable in the input space.
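To make "vector field" concrete, here is a tiny sketch (with f(x, y) = x² + y² as an assumed example) that evaluates the gradient at a few sample points; each point in the plane gets its own vector of partial derivatives:

```python
# Sketch: the gradient as a vector field. At each sample point in the
# plane we get a vector of partial derivatives. For the assumed
# example f(x, y) = x**2 + y**2, the exact gradient is (2x, 2y).

def grad_f(x, y):
    return (2 * x, 2 * y)  # hand-computed partial derivatives

for x in (-1.0, 0.0, 1.0):
    for y in (-1.0, 0.0, 1.0):
        print((x, y), "->", grad_f(x, y))
```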
Gradient is another word for slope. Finding the gradient means taking a function as input and producing a vector as output, written with the nabla symbol ∇, and nabla itself can be considered a function.
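One way to see nabla as a function is to write it as a higher-order function in code: it takes a scalar function and returns the vector-valued gradient function. A minimal sketch, with the finite-difference step h as an assumption:

```python
# Sketch of nabla as a function of functions: grad takes a scalar
# function and returns a new function mapping a point to the
# (approximate) gradient vector at that point.

def grad(f, h=1e-6):
    def gradient_at(point):
        result = []
        for i in range(len(point)):
            bumped_up = list(point)
            bumped_down = list(point)
            bumped_up[i] += h
            bumped_down[i] -= h
            result.append((f(bumped_up) - f(bumped_down)) / (2 * h))
        return result
    return gradient_at

f = lambda p: p[0] ** 2 + p[1] ** 2
df = grad(f)            # "applying nabla" to f yields a vector field
print(df([1.0, 2.0]))   # approximately [2.0, 4.0]
```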
Instead of finding minima by manipulating symbols, gradient descent approximates the solution with numbers. Furthermore, all it needs in order to run is a function's numerical output, no formula required. This distinction is worth emphasizing because it's what makes gradient descent useful.
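A minimal sketch of that idea in Python, where the gradient is estimated purely from numerical outputs of f (the example function, step size, and iteration count are all illustrative assumptions):

```python
# Minimal gradient-descent sketch. Only numerical outputs of f are
# needed: the gradient is estimated by finite differences, no
# symbolic formula required.

def f(x, y):
    return (x - 3) ** 2 + (y + 1) ** 2   # minimum at (3, -1)

def numerical_grad(f, x, y, h=1e-6):
    dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dx, dy

x, y = 0.0, 0.0      # arbitrary starting point
step = 0.1           # learning rate (assumed)
for _ in range(100):
    dx, dy = numerical_grad(f, x, y)
    x -= step * dx   # step against the gradient: steepest descent
    y -= step * dy

print(x, y)          # approaches (3, -1)
```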
So the gradient of f gives you a vector field, and the divergence of that gives you a scalar field. And what I want to show you here is another formula that you might commonly see for this Laplacian. So first let's abstractly write out what the gradient of f will look like.
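Written out abstractly, the gradient of f is the vector of partial derivatives (∂f/∂x, ∂f/∂y), and taking the divergence of that vector field gives the familiar formula for the Laplacian: ∇·∇f = ∂²f/∂x² + ∂²f/∂y², with one second partial per input variable. A quick numerical sanity check, with f(x, y) = x²y as an assumed example:

```python
# Numerical check that div(grad f) equals the sum of second partials.
# For the assumed example f(x, y) = x**2 * y, the exact Laplacian is
# f_xx + f_yy = 2y + 0.

def f(x, y):
    return x**2 * y

def second_partial_x(f, x, y, h=1e-4):
    return (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2

def second_partial_y(f, x, y, h=1e-4):
    return (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2

x, y = 3.0, 2.0
laplacian = second_partial_x(f, x, y) + second_partial_y(f, x, y)
print(laplacian)   # approximately 2*y = 4
```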