I was refreshing my memory of the derivation of the ridge regression solution and realised I haven't got some of the intuition quite right in my head.

Specifically: when fitting a very high-order polynomial with regularisation, intuitively the penalty term in the objective function should kill off some of the higher-order terms.

However, since the penalty is on the L2 norm of the weights, it penalises larger weights preferentially, and at first glance this would appear to penalise the lower-order terms first.
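
For reference, here is the objective being discussed and its minimiser, written in standard notation. The symbols are my own choice rather than anything fixed above: X is the data matrix whose columns are the powers of the input, y the targets, w the weights and λ the regularisation strength.

```latex
\begin{aligned}
J(\mathbf{w}) &= \lVert \mathbf{y} - X\mathbf{w} \rVert_2^2 + \lambda \lVert \mathbf{w} \rVert_2^2, \\
\nabla_{\mathbf{w}} J &= -2X^\top(\mathbf{y} - X\mathbf{w}) + 2\lambda\mathbf{w} = 0
\;\Longrightarrow\;
\hat{\mathbf{w}} = (X^\top X + \lambda I)^{-1} X^\top \mathbf{y}.
\end{aligned}
```

The penalty contributes 2λw to the gradient, so its pull on each weight is proportional to that weight's current size, which is the sense in which larger weights are penalised preferentially.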

Since the data should be normalised, this is not a problem in practice (i.e. the higher-order terms in the data matrix are reduced in magnitude, so their weights are larger). But in that case my intuition would be that the weight penalty reduces all weights at roughly similar rates, i.e. unrelated to the exponent.
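
One way to test this intuition numerically is a small sketch like the one below. Everything in it is an assumption made for illustration: the synthetic sine data, the degree of 8, the two λ values, and the names `ridge_fit`, `X_raw`, `X_std` and `weights`. It fits the closed-form ridge solution once on centred raw powers of x and once on standardised columns, so the weights can be compared exponent by exponent.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge weights (X^T X + lam*I)^{-1} X^T y, no intercept."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)


# Synthetic data (assumed for illustration only).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = np.sin(np.pi * x) + 0.1 * rng.standard_normal(200)

degree = 8
# Columns are x^1 ... x^degree; centring stands in for the intercept.
X_raw = np.column_stack([x**k for k in range(1, degree + 1)])
X_raw = X_raw - X_raw.mean(axis=0)
# Standardised version: every column rescaled to unit standard deviation.
X_std = X_raw / X_raw.std(axis=0)
y_c = y - y.mean()

weights = {}
for lam in (1.0, 100.0):
    w_raw = ridge_fit(X_raw, y_c, lam)
    w_std = ridge_fit(X_std, y_c, lam)
    weights[lam] = (w_raw, w_std)
    print(f"lambda = {lam}")
    print("  raw columns:         ", np.round(w_raw, 3))
    print("  standardised columns:", np.round(w_std, 3))
```

Comparing the two printed rows for each λ shows whether the shrinkage pattern across exponents changes once the columns are put on a common scale. The only thing guaranteed in general is that the overall norm of the weight vector decreases as λ grows; how that shrinkage is distributed over the individual exponents depends on the data matrix.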

Hope that roughly makes sense... any thoughts?

Thanks in advance.
