
The Bitter Lesson


Prometheus


The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective, and by a large margin.

According to DeepMind's Richard Sutton, the bitterness is that injecting human knowledge into an AI system yields only short-term gains, and that general statistical methods that leverage computational power perform much better in the mid to long term. GPT-3 has 175 billion parameters and doesn't seem to be approaching the limit of what a simply bigger model can do. Some speculate GPT-4 will have trillions of parameters.
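Just for a sense of scale, a back-of-envelope sketch (my own arithmetic, not anything from the article): merely storing 175 billion parameters runs to hundreds of gigabytes, before any training happens.

# Back-of-envelope only: memory needed just to hold 175 billion parameters.
params = 175e9
for bytes_per_param, precision in [(2, "16-bit"), (4, "32-bit")]:
    gigabytes = params * bytes_per_param / 1e9   # ~350 GB at 16-bit, ~700 GB at 32-bit
    print(f"{precision}: ~{gigabytes:.0f} GB just for the weights")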

A debate between Yann LeCun and Gary Marcus shows there is a split in the community regarding the wisdom of this approach.

I wonder whether there is a hard limit to how much computation can be thrown at a system. Some say we are already coming up against the limit of Moore's law, and that parallelisation has its own limit in Amdahl's law. We also know that the human brain doesn't need to consume nearly as many resources to achieve far more complex behaviour than current AI (based on the relative energy efficiency of brains versus computers).

We also have reason to want to model the human brain with AI, to deepen our understanding of the mind. And if we are to create AGI, presumably it will be easier to align an agent with our values if it is in some sense based on human biology (although Sutton's argument would be that, actually, more computation will achieve that end).

My instinct is that at some point the vast complexity of the world will require that AI systems parse things down into simpler units of understanding in order for them to navigate it. Does anyone have any opinions one way or another?


6 hours ago, Prometheus said:

I wonder whether there is a hard limit to how much computation can be thrown at a system. Some say we are already coming up against the limit of Moore's law, and that parallelisation has its own limit in Amdahl's law.

To me, it’s less about how much computational power we throw at it and more about how we/it use that power. I suspect quantum computing will add to the FLOPS these neural learning models go through.

I’m also barely a novice in this space. 

6 hours ago, Prometheus said:

My instinct is that at some point the vast complexity of the world will require that AI systems parse things down into simpler units of understanding in order for them to navigate it.

I tend to agree. The phrase “paradigm shift” is overused, but seems to apply well here. 


I predict that, when AI gains 'imagination', and starts believing it has a 'soul', we'll be close.

Of course, by then, it will have ALL human failings, and send Terminators back in time to kill me.


Quote from Prometheus:

My instinct is that at some point the vast complexity of the world will require that AI systems parse things down into simpler units of understanding in order for them to navigate it. Does anyone have any opinions one way or another?

I do have an opinion, but I'm afraid it goes another way.

Your respectable "instinct" adjudicates that universal complexity has to be parsed and made divisible for it to become understandable, simply because your "instinct" has been reared in the reductivistic rut. Hmm?

On 3/14/2021 at 2:33 AM, iNow said:

To me, it’s less about how much computational power we throw at it and more about how we/it use that power. I suspect quantum computing will add to the FLOPS these neural learning models go through.

I agree, but the counter-argument is that so far the biggest improvements have simply been bigger models, not more structured models. There was a more subtle point at the end of the article suggesting that we shouldn't be trying to inject human knowledge directly into AI, because the mind is complex beyond our understanding, and if we don't know how knowledge is codified in the brain, we don't have a basis to codify it in AI.

 

On 3/14/2021 at 4:27 PM, Prof Reza Sanaye said:

Your respectable "instinct" adjudicates that universal complexity has to be parsed and made divisible for it to become understandable, simply because your "instinct" has been reared in the reductivistic rut. Hmm?

I don't understand what you're trying to say.


2 hours ago, Prometheus said:

There was a more subtle point at the end of the article suggesting that we shouldn't be trying to inject human knowledge directly into AI, because the mind is complex beyond our understanding, and if we don't know how knowledge is codified in the brain, we don't have a basis to codify it in AI.

Indeed. This is where ML really thrives. We just expose it to more and more examples and archetypes and it does the rest.

Unfortunately, this is also where it struggles in the context of things like automating recruiting and the scanning of applicant resumes, or giving input on sentencing decisions in the courts (algorithms that help judges decide how to rule).

The AI tends to pick up and amplify past biases humans have shown, and will do things like ignore female applicants or hand down heavier sentences to Black defendants. In these cases, we DO have to proactively codify more neutral approaches into the AI, simply to overcome how deeply bias has been codified in our culture historically.

https://www.sciencedaily.com/releases/2017/04/170413141055.htm


3 hours ago, Prometheus said:

I agree, but the counter-argument is that so far the biggest improvements have simply been bigger models, not more structured models. There was a more subtle point at the end of the article suggesting that we shouldn't be trying to inject human knowledge directly into AI, because the mind is complex beyond our understanding, and if we don't know how knowledge is codified in the brain, we don't have a basis to codify it in AI.

 

I don't understand what you're trying to say.

I'm doing my best to caution you against over-reductionism...


On 3/21/2021 at 1:42 PM, iNow said:

The AI tends to pick up and amplify past biases humans have shown, and will do things like ignore female applicants or hand down heavier sentences to Black defendants. In these cases, we DO have to proactively codify more neutral approaches into the AI, simply to overcome how deeply bias has been codified in our culture historically.

Interesting point. But those biases exist because the training data itself was biased - misclassifying people from minority groups because the model was trained mostly on data from the majority group. Essentially it was shown unrepresentative training data and then given tasks it had not been trained on. Additional architecture shouldn't be required to remedy this bias, just appropriate data curation and thorough testing before deployment.
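To make "curation and testing" a bit more concrete, here is a toy sketch in Python. Everything in it - the group labels, the oversampling, the per-group check - is an illustrative stand-in of my own, not a reference to any particular system or library workflow.

# A toy sketch only: curate (rebalance) the data, then test per group before deployment.
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

def rebalance(X, y, groups):
    # Oversample so each demographic group is equally represented in training.
    groups = np.asarray(groups)
    target = max(Counter(groups.tolist()).values())
    idx = np.concatenate([
        rng.choice(np.flatnonzero(groups == g), size=target, replace=True)
        for g in np.unique(groups)
    ])
    return X[idx], y[idx], groups[idx]

def per_group_error(y_true, y_pred, groups):
    # The 'thorough testing' part: report the error rate separately for each group.
    groups = np.asarray(groups)
    return {g: float(np.mean(y_pred[groups == g] != y_true[groups == g]))
            for g in np.unique(groups)}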

 

On 3/21/2021 at 1:59 PM, Prof Reza Sanaye said:

I'm doing my best to caution you against over-reductionism...

I wasn't aware I was doing so. Does that mean you advocate for more flat architectures?


4 minutes ago, Prometheus said:

Interesting point. But those biases exist because the training data itself was biased - misclassifying people from minority groups because the model was trained mostly on data from the majority group. Essentially it was shown unrepresentative training data and then given tasks it had not been trained on. Additional architecture shouldn't be required to remedy this bias, just appropriate data curation and thorough testing before deployment.

 

I wasn't aware I was doing so. Does that mean you advocate for more flat architectures?

Very dear Prometheus,

I'm afraid you ARE doing so. What alternative do I propound? "More flat architectures" is, of course, one way of calling it. I myself prefer to designate it as more holistic. And I do not hesitate to add that this IS possible. Remember: this does NOT mean advocating for the Anthropic/Humanizing protocol.


7 minutes ago, Prometheus said:

those biases exist because the training data itself was biased - misclassifying people from minority groups because the model was trained mostly on data from the majority group. Essentially it was shown unrepresentative training data and then given tasks it had not been trained on.

We’re largely on the same page here, and philosophically we align on a desire for more hands-off approaches. The main point I think is crucial to understand, though, is that the historical data on which AI training is based is both biased AND representative. This poses a real problem, one which mandates human intervention and programming choices up front.

Sure... We don’t want the historical data from which the AI learns to represent us. It represents what we used to be, but not who we are today nor who we’re becoming. We’ve learned and evolved and finally begun to overcome those various prejudices, but those prejudices are still a massive and representative part of the historical datasets being used to train the AI.

If we simply tell the AI to go learn from past data... we turn it on and then walk away all laissez-faire so as not to impose our own biases and limitations on it... then that's where even bigger problems ensue. It's the very fact that it's using an unfiltered dataset that leads the AI to repeat and amplify any past mistakes.

The only way to do this correctly, IMO, is to proactively filter said dataset... to implement certain guardrails and rules... and ensure a different future path is taken. We need to install various dams and levees on the data, since a truly representative past sample isn’t representative of our future preferred selves.


 

On 3/23/2021 at 4:00 PM, iNow said:

The only way to do this correctly, IMO, is to proactively filter said dataset... to implement certain guardrails and rules... and ensure a different future path is taken. We need to install various dams and levees on the data, since a truly representative past sample isn’t representative of our future preferred selves.

If you're not proposing a change in architecture to deal with this, I think it's slightly off topic, but it's an interesting tangent.

Say you have a language model and you start talking about a doctor and ask the model to complete the paragraph. A fair model might be one that goes on to use male and female pronouns in equal amounts. But how to achieve that end?
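Just to pin down what "equal amounts" might even mean, here's a toy way to measure it. The complete() function below is a hypothetical stand-in for however you sample one completion from the model; it isn't any real API.

# Toy measurement sketch; complete(prompt) is a hypothetical stand-in that
# returns one sampled completion from the model under test.
import re

MALE = {"he", "him", "his", "himself"}
FEMALE = {"she", "her", "hers", "herself"}

def male_pronoun_share(prompt, complete, n_samples=100):
    # Fraction of gendered pronouns that are male, across n sampled completions.
    male = female = 0
    for _ in range(n_samples):
        words = re.findall(r"[a-z']+", complete(prompt).lower())
        male += sum(w in MALE for w in words)
        female += sum(w in FEMALE for w in words)
    total = male + female
    return male / total if total else None

# A 'fair' model in the above sense would return roughly 0.5 for a prompt
# like "The doctor walked in and".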

I don't think filtering the dataset is practical. GPT-3 was trained on words scraped from the internet, something like 300 billion 'tokens' (token ~= word). There's no way even a team of humans could curate that. You could try to gatekeep what goes into the model, but that has a similar problem and the added side effect of excluding (or re-writing) some of the world's greatest literature - it's just baked into the language. Even something as recent as Lord of the Rings would probably reinforce these sorts of gender stereotypes.

That leaves adding something to the architecture to try to fix things. Maybe something that changes gendered pronouns so that they occur 50/50, whilst also ensuring they don't get mixed up - a single person referred to as 'she' and 'he' at different times. It seems an inelegant solution, and I'm wary of unintended consequences - the way AI fails is very different to the way humans fail.
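Something like the following toy sketch is what I have in mind (the function is mine, purely for illustration), and it already hints at the problem: 'her' alone is ambiguous between 'him' and 'his', so any simple mapping is a guess.

# Toy post-processing patch: pick one gender per completion at random and
# rewrite pronouns consistently. Illustrative only - it ignores names, titles
# and everything else that makes this genuinely hard.
import random
import re

TO_FEMALE = {"he": "she", "him": "her", "his": "her", "himself": "herself"}
TO_MALE = {"she": "he", "her": "him", "hers": "his", "herself": "himself"}
# Already a guess: "her" can stand for either "him" or "his".

def balance_pronouns(text):
    mapping = TO_FEMALE if random.random() < 0.5 else TO_MALE
    def swap(match):
        word = match.group(0)
        repl = mapping.get(word.lower())
        if repl is None:
            return word
        return repl.capitalize() if word[0].isupper() else repl
    return re.sub(r"\b\w+\b", swap, text)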

There is also a question of whether we really want AI to be 'moralising'. These large models are developed by Google, Facebook, Tesla, etc., and they don't necessarily want to make their models transparent or optimise for 'fairness'. OpenAI might be a reasonable vendor given their mission statement, but they're in the minority, and even then it's a tricky technical task - we're essentially talking about training some notion of morality into AI.

 

You might enjoy this video on the topic.

 


50 minutes ago, Prometheus said:

If you're not proposing a change in architecture to deal with this, I think it's slightly off topic, but it's an interesting tangent.

I definitely think a change is needed to deal with this, but I'm also not currently well enough read on the subject to describe in any detail what specifically that might/should look like. Apologies for the tangent or being off-topic. Didn't intend to derail. 

 

1 hour ago, Prometheus said:

You might enjoy this video on the topic.

Yes, thanks. That was a good overview. I would take minor issue with the suggestion that the motivation of some is to socially engineer reality itself, but I can see the point he's making and appreciate the need for caution here.

