
derivative as linear approximation


CPL.Luke


Hmm, so my new vector calc book essentially uses "linear approximation" and "derivative" interchangeably. This seems entirely contrary to everything I've ever known about derivatives.

 

is this the way I should be thinking about them or am I being led astray?

 

 

I'm a bit nervous about it all as some of the "derivatives" seem like they would hold true in general however others seem a bit hodgepodge

 

take this for example.

 

what is the derivative of the mapping of a matrix A to...

 

[math] (AA^t+A^t)^{-1} [/math]

 

Now, I struggled with this one for a while and apparently went way off track. The solution manual says the problem should work like this (I think I can improve if I just get some clarification on whether this is an actual derivative or just an approximation):

 

 

break it up into two functions

 

[math]f(B)=B^{-1}[/math]

and

[MATH] g(A)=AA^t+A^tA[/MATH]

 

that seems all well and good but then the author does a few things that seem a bit iffy.

 

To find the derivative of g(A), he writes g(A+H)-g(A) and essentially calls this the derivative, but in order to simplify the expression he distributes the transpose operator as shown below

 

[MATH] (A+H)(A+H)^t=(A+H)(A^t+H^t)[/MATH]

 

is that a valid operation?

 

Well, anyhow, he goes on to use the derivative he got for g as the increment in the derivative of f(B), so essentially the derivative boils down to

[math]-f(B)\,H\,f(B)[/math]

or [math]-B^{-1} H B^{-1}[/math]

with B equal to g(A) and H equal to his derivative of g(A). So is this all right, or...?

Link to comment
Share on other sites

I think the problem is that without a certain degree of formality, these things can appear to be very fudged. Certainly intuition can play a part, but I think you need to be a little more precise. Are we talking about a function, [math]f:M(n,n) \to M(n,n)[/math] defined by [math]f(A) = (AA^t + A^t)^{-1}[/math] and then taking the derivative? (M(n,m) is the space of real n by m matrices).

 

I don't know an awful lot about matrix derivatives (other than the fact that since M(n,m) is isomorphic to [math]\mathbb{R}^{nm}[/math] we can use some results from that), but to answer the question it'd be useful to know what he regards as the derivative.

Link to comment
Share on other sites

Just realized I haven't really answered your question about exchanging linear approximations and derivatives. Ignoring matrices for a moment, we say that a function [math]f:\mathbb{R}^n \to \mathbb{R}^m[/math] is differentiable at a point [math]x_0[/math] if there exists a linear map [math]d_{x_0} f: \mathbb{R}^n \to \mathbb{R}^m[/math] (called the differential) satisfying:

[math]\lim_{||h||\to 0} \frac{||f(x_0+h) - f(x_0) - d_{x_0}f(h)||}{||h||} = 0[/math]

With a little effort, one can prove that such a map is unique when it exists, and that [math]d_{x_0}f[/math] is represented by the Jacobian matrix of f evaluated at [math]x_0[/math]. It should be clear from the definition that the differential gives the linear approximation [math]f(x_0+h) \approx f(x_0) + d_{x_0}f(h)[/math] around [math]x_0[/math], so I suppose you can interchange derivative with linear approximation as long as you're careful about the phraseology.
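As a rough numerical illustration of that definition, here is a minimal sketch that checks it for the matrix inverse on 2x2 matrices, using the candidate differential H -> -B^{-1} H B^{-1} from the original post. The helper names (`inv2`, `remainder_ratio`, and so on), the Frobenius norm, and the test matrices are all arbitrary choices made for this example, not anything from the book.

```python
import math

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add2(x, y, scale=1.0):
    """Return x + scale * y for 2x2 matrices."""
    return [[x[i][j] + scale * y[i][j] for j in range(2)] for i in range(2)]

def frob(m):
    """Frobenius norm (any norm would do in finite dimensions)."""
    return math.sqrt(sum(v * v for row in m for v in row))

def candidate_differential(b, h):
    """Candidate differential of f(B) = B^{-1} at B, applied to H."""
    b_inv = inv2(b)
    return [[-v for v in row] for row in matmul2(matmul2(b_inv, h), b_inv)]

def remainder_ratio(b, h):
    """||f(B+H) - f(B) - d_B f(H)|| / ||H||, the quotient in the definition."""
    increment = add2(inv2(add2(b, h)), inv2(b), scale=-1.0)
    remainder = add2(increment, candidate_differential(b, h), scale=-1.0)
    return frob(remainder) / frob(h)

B = [[2.0, 1.0], [0.5, 3.0]]    # arbitrary invertible base point
H0 = [[1.0, -2.0], [0.5, 1.5]]  # arbitrary direction

for t in (1e-1, 1e-2, 1e-3):
    H = [[t * v for v in row] for row in H0]
    print(t, remainder_ratio(B, H))
```

The printed ratio should shrink roughly in proportion to t, which is what the limit in the definition asks for. A numerical check like this is only evidence, not a proof.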

Link to comment
Share on other sites

but in order to simplify the expression he distributes the transpose operator as shown below

 

[MATH] (A+H)(A+H)^t=(A+H)(A^t+H^t)[/MATH]

 

is that a valid operation?

 

Yes. Think about the ij'th entry if you need to.
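
Spelling the entrywise check out, as a sketch (the expansion further down takes g as written in the original post's split, [math]g(A)=AA^t+A^tA[/math], which is an assumption about the intended problem):

[math]((A+H)^t)_{ij} = (A+H)_{ji} = A_{ji} + H_{ji} = (A^t)_{ij} + (H^t)_{ij}[/math]

so [math](A+H)^t = A^t + H^t[/math]. Multiplying out then gives

[math]g(A+H) - g(A) = (AH^t + HA^t + A^tH + H^tA) + (HH^t + H^tH)[/math]

The first bracket is linear in H and is the candidate derivative of g at A applied to H. The second bracket is quadratic in H, so its norm goes to zero faster than ||H||.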

 

There is no harm in thinking 'mod h^2': put x+hy, where x is the point you're differentiating at, y is a matrix and h is a real number, into the defining equations and set h^2 (and higher-order terms) to zero. *If* the function is differentiable, you will be able to read off the derivative from the coefficient of h. You can then, if you must, justify that this is indeed the derivative.

 

For example, to differentiate the function det(X) at the identity matrix, we put in det(I+hy) and get det(I) + h tr(y) + terms in h^2 (this is just expanding the determinant). Thus the derivative of det at the identity is the trace.
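
If you want to see that numerically, here is a small sketch for 2x2 matrices. The function names and the test matrix `Y` are made up for this example. In the 2x2 case det(I + hY) = 1 + h tr(Y) + h^2 det(Y) exactly, so the difference quotient should approach tr(Y) as h shrinks.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def trace2(m):
    """Trace of a 2x2 matrix."""
    return m[0][0] + m[1][1]

def perturbed_identity(y, h):
    """Return I + h * y for a 2x2 matrix y."""
    return [[(1.0 if i == j else 0.0) + h * y[i][j] for j in range(2)]
            for i in range(2)]

def difference_quotient(y, h):
    """(det(I + h*y) - det(I)) / h, expected to approach tr(y) as h -> 0."""
    return (det2(perturbed_identity(y, h)) - 1.0) / h

Y = [[2.0, -1.0], [3.0, 0.5]]  # arbitrary direction

for h in (1e-1, 1e-3, 1e-5):
    print(h, difference_quotient(Y, h), trace2(Y))
```

The gap between the two printed values is the discarded h^2 term divided by h, so it shrinks linearly in h.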

Link to comment
Share on other sites

the function is defined as mat(n,n) to mat (n,n)

 

 

I was a bit dubious about having the H's stick around in the derivative, but if you guys say it's OK then I'm sure it is. I guess the thing that annoys me about this book is that it handles everything in a very abstract way without using physical examples to illustrate the point (like the motion of a particle)

Link to comment
Share on other sites

I guess the thing that annoys me about this book is that it handles everything in a very abstract way without using physical examples to illustrate the point (like the motion of a particle)

 

Sorry to be blunt, but get over it. There is nothing to stop you making your own toy examples either (such as 1x1 matrices) and playing about with things by hand.
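
A possible 1x1 toy version, as a sketch: write A = a and H = h for real numbers, and take g as in the original post's split, [math]g(A)=AA^t+A^tA[/math] (an assumption about the intended problem). Transposes do nothing in the 1x1 case, so

[math]g(a) = 2a^2, \qquad g(a+h) - g(a) = 4ah + 2h^2[/math]

and the linear part is 4ah. Feeding that into [math]-B^{-1} H B^{-1}[/math] with [math]B = g(a) = 2a^2[/math] gives

[math]-\frac{1}{2a^2}\,(4ah)\,\frac{1}{2a^2} = -\frac{h}{a^3}[/math]

which agrees with ordinary calculus: [math]\frac{d}{da}\,\frac{1}{2a^2} = -\frac{1}{a^3}[/math], multiplied by the increment h. So the H that "sticks around" is just the increment the linear map acts on.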

Link to comment
Share on other sites
