derivative as linear approximation

CPL.Luke · July 28, 2006

hmm, so my new vector calc book essentially uses "linear approximation" and "derivative" interchangeably. this seems entirely contrary to everything I've ever known about derivatives.

is this the way I should be thinking about them or am I being led astray?

I'm a bit nervous about it all, as some of the "derivatives" seem like they would hold true in general, while others seem a bit hodgepodge

take this for example.

what is the derivative of the mapping of a matrix A to...

[MATH] (AA^t+A^t)^{-1} [/MATH]

now I struggled with this one for a while and apparently went way off track. the solution manual says that the problem should work like this (I think I can get better if I just get some clarification on whether or not this is an actual derivative or just an approximation)

break it up into two functions

f(B)=B^-1

and

[MATH] g(A)=AA^t+A^tA[/MATH]

that seems all well and good but then the author does a few things that seem a bit iffy.

in order to find the derivative of g(A) he writes g(A+H)-g(A). he essentially calls this the derivative, but in order to simplify the expression he distributes the transpose operator as shown below

[MATH] (A+H)(A+H)^t=(A+H)(A^t+H^t)[/MATH]

is that a valid operation?

well anywho, he goes on to plug the value he got for the derivative of g in as the increment in the derivative of f(B), so essentially the derivative boils down to

-f(B) H f(B)

or -B^-1 H B^-1

with B being equal to g(A) and H being equal to his derivative of g(A), so is this all right or....
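A quick numerical sanity check of the formula above, as a sketch rather than the book's method. It assumes g(A) = AA^t + A^tA exactly as written in this post, works with 2x2 matrices in plain Python, and uses helper names (`composite`, `dg`, `d_composite`, `error_ratio`) that are made up for this illustration. If -B^-1 H B^-1 with H replaced by the linear part of g(A+H)-g(A) really is the derivative, the leftover error divided by the size of the step should shrink as the step shrinks.

```python
# Finite-difference check on 2x2 matrices, standard library only.
# Assumption: g(A) = A A^t + A^t A, as written in the post.

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def transpose(X):
    return [[X[j][i] for j in range(2)] for i in range(2)]

def inv(X):
    # Explicit inverse of a 2x2 matrix.
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def norm(X):
    # Frobenius norm.
    return sum(x * x for row in X for x in row) ** 0.5

def g(A):
    At = transpose(A)
    return add(matmul(A, At), matmul(At, A))

def composite(A):
    # f(g(A)) with f(B) = B^-1
    return inv(g(A))

def dg(A, H):
    # Linear part of g(A+H) - g(A): A H^t + H A^t + A^t H + H^t A
    At, Ht = transpose(A), transpose(H)
    return add(add(matmul(A, Ht), matmul(H, At)),
               add(matmul(At, H), matmul(Ht, A)))

def d_composite(A, H):
    # Candidate derivative: -B^-1 K B^-1 with B = g(A), K = dg(A, H)
    Binv = inv(g(A))
    return scale(-1.0, matmul(matmul(Binv, dg(A, H)), Binv))

A = [[1.0, 2.0], [0.0, 1.0]]
H = [[0.3, -0.1], [0.2, 0.4]]

def error_ratio(t):
    # ||F(A+tH) - F(A) - DF(A)(tH)|| / ||tH||
    tH = scale(t, H)
    remainder = add(add(composite(add(A, tH)), scale(-1.0, composite(A))),
                    scale(-1.0, d_composite(A, tH)))
    return norm(remainder) / norm(tH)

for t in (1e-1, 1e-2, 1e-3):
    print(t, error_ratio(t))
```

The printed ratios should fall roughly in proportion to t, which is what "the error is of higher order than the step" means in practice.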

Dave · July 28, 2006

I think the problem is that without a certain degree of formality, these things can appear to be very fudged. Certainly intuition can play a part, but I think you need to be a little more precise. Are we talking about a function, [math]f:M(n,n) \to M(n,n)[/math] defined by [math]f(A) = (AA^t + A^t)^{-1}[/math] and then taking the derivative? (M(n,m) is the space of real n by m matrices).

I don't know an awful lot about matrix derivatives (other than the fact that since M(n,m) is isomorphic to R^nm we can use some results from that), but to answer the question it'd be useful to know what he regards as the derivative.

Dave · July 28, 2006

Just realized I haven't really answered your question about exchanging linear approximations and derivatives. Ignoring matrices for a moment, we say that a function [math]f:\mathbb{R}^n \to \mathbb{R}^m[/math] is differentiable at a point x₀, if there exists a linear map [math]d_{x_0} f: \mathbb{R}^n \to \mathbb{R}^m[/math] (called the differential) satisfying:

[math]\lim_{||h||\to 0} \frac{||f(x_0+h) - f(x_0) - d_{x_0}f(h)||}{||h||} = 0[/math]

With a little effort, one can prove that [imath]d_{x_0}f[/imath] is given by the Jacobian matrix of f, evaluated at x₀. It should be clear from the definition that the differential is a linear approximation to f around x₀, so I suppose you can interchange derivative with linear approximation as long as you're careful about the phraseology.
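To see the limit in the definition at work, here is a small sketch in plain Python. The function f(x, y) = (x²y, x + sin y), the base point (1, 2) and the direction (0.6, -0.8) are arbitrary choices made for this illustration, and the Jacobian is worked out by hand.

```python
import math

def f(x, y):
    # Example map R^2 -> R^2 (an arbitrary choice for illustration).
    return (x * x * y, x + math.sin(y))

def jacobian(x, y):
    # Hand-computed partial derivatives of f.
    return ((2 * x * y, x * x),
            (1.0, math.cos(y)))

def differential(x0, y0, h):
    # The linear map d_{x0} f applied to the increment h.
    J = jacobian(x0, y0)
    return (J[0][0] * h[0] + J[0][1] * h[1],
            J[1][0] * h[0] + J[1][1] * h[1])

def ratio(x0, y0, h):
    # ||f(x0 + h) - f(x0) - d_{x0} f(h)|| / ||h||
    fx = f(x0, y0)
    fxh = f(x0 + h[0], y0 + h[1])
    d = differential(x0, y0, h)
    num = math.hypot(fxh[0] - fx[0] - d[0], fxh[1] - fx[1] - d[1])
    return num / math.hypot(h[0], h[1])

for t in (1e-1, 1e-2, 1e-3, 1e-4):
    h = (0.6 * t, -0.8 * t)
    print(t, ratio(1.0, 2.0, h))
```

The ratio shrinks along with the step size, which is the numerical face of the limit being zero.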

matt grime · July 28, 2006

but in order to simplify the expression he distributes the transpose operator as shown below

[MATH] (A+H)(A+H)^t=(A+H)(A^t+H^t)[/MATH]

is that a valid operation?

Yes. Think about the ij'th entry if you need to.
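Spelled out entrywise, as a quick sketch:

[math]\left((A+H)^t\right)_{ij} = (A+H)_{ji} = A_{ji} + H_{ji} = (A^t)_{ij} + (H^t)_{ij}[/math]

so (A+H)^t = A^t + H^t, i.e. the transpose is a linear operation.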

There is no harm in thinking 'mod h^2': put x+hy, where x is the point you're differentiating at, y is a matrix and h is a real number, into the defining equations and set h^2 (and higher-order terms) to zero. *If* the function is differentiable, you will be able to read off the derivative from the result. You can then, if you must, justify that this is indeed the derivative.

For example, to differentiate the function det(X) at the identity matrix, we put in det(I+hy) and get det(I) + h tr(y) + terms in h^2 (this is just expanding the determinant). Thus the derivative of det at the identity is the trace.
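For a 2x2 matrix the expansion can be written out in full (a worked sketch; the entries y_{ij} of y are just notation introduced here):

[math]\det(I+hy) = (1+hy_{11})(1+hy_{22}) - h^2 y_{12}y_{21} = 1 + h(y_{11}+y_{22}) + h^2(y_{11}y_{22} - y_{12}y_{21})[/math]

The coefficient of h is tr(y), and the h^2 term is exactly what gets thrown away when working 'mod h^2'.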

CPL.Luke · July 28, 2006

the function is defined as mat(n,n) to mat (n,n)

I was a bit dubious of having the H's stick around in the derivative, but if you guys say it's ok then I'm sure it is. I guess the thing that annoys me about this book is that it handles everything in a very abstract way without using physical examples to illustrate the point (like the motion of a particle)

matt grime · July 28, 2006

I guess the thing that annoys me about this book is that it handles everything in a very abstract way without using physical examples to illustrate the point (like the motion of a particle)

Sorry to be blunt, but get over it. There is nothing to stop you making your own toy examples either (such as 1x1 matrices) and playing about with things by hand.
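As one such toy example (a sketch, assuming g(A) = AA^t + A^tA as written in the first post), take 1x1 matrices, so A is just a nonzero real number a and the transpose does nothing:

[math]g(a) = 2a^2, \qquad g(a+h) - g(a) = 4ah + 2h^2[/math]

The linear part of the increment is 4ah. With f(b) = b^{-1}, the formula -B^{-1} H B^{-1} becomes

[math]-\frac{1}{2a^2}\,(4ah)\,\frac{1}{2a^2} = -\frac{h}{a^3}[/math]

which agrees with ordinary calculus: the derivative of 1/(2a^2) is -1/a^3, and the linear approximation is that number times h.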
