Ghideon Posted September 5, 2023

Large Language Models (LLMs) like GPT-4 and its predecessors are, as far as I know, built on a foundation of mathematical and computational concepts, some of which were established long ago. I've been asked to give a short presentation about LLMs, and I'm thinking of including a timeline of mathematical concepts to give the audience some context. Can you suggest significant discoveries that could be included? There is likely no exact answer, and I would value your opinions. As a starting point, my short list is:

- Probability Theory
- Foundations of Calculus
- Vectors and Matrices (Linear Algebra)
- Neural Networks
- Information Theory (entropy)

(and maybe some recent things like Word Embeddings and the Transformer Architecture). I'll need to do some research to assign reasonable dates to the concepts.
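To make the information-theory entry concrete for the audience, I might show Shannon entropy on a toy distribution. A minimal sketch in Python (my own illustration, with made-up probabilities):

```python
import math

# Shannon entropy H(X) = -sum over x of p(x) * log2(p(x)), measured in bits.
def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # 1.0 bit: a fair coin is maximally unpredictable
print(entropy([0.9, 0.1]))  # ~0.47 bits: a biased coin carries less information
```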
studiot Posted September 6, 2023

I will be interested to see the responses from other members to this question, but you may also find some useful source material for your lecture in this book: [book cover image]. Despite the title, here is a contents list showing why it may be relevant: [contents-list image]. (Hofstadter has written some other titles which I can't comment on.)
Prometheus Posted September 6, 2023

I'd also add cellular automata (maybe Rule 30), to invoke the idea that simple, precisely known rules can generate unpredictable iterations. That gives an intuition for why, at a high level, we don't know how deep learning architectures produce their outputs.
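For a slide, a minimal Rule 30 sketch in Python (my own toy illustration; the update rule is the standard one, new cell = left XOR (center OR right)) might look like:

```python
# Rule 30: each new cell is left XOR (center OR right).
# The rule is trivial to state, yet the pattern it grows looks random.
def rule30_step(cells):
    n = len(cells)
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n]) for i in range(n)]

row = [0] * 31
row[15] = 1  # start from a single live cell in the middle
for _ in range(16):
    print("".join("#" if c else "." for c in row))
    row = rule30_step(row)
```

Running it prints the familiar chaotic triangle, which makes the "simple rules, unpredictable behaviour" point visually.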
studiot Posted September 6, 2023

Replying to Prometheus: Like it. +1
Genady Posted September 6, 2023

Regarding Prometheus's cellular-automata suggestion: we cannot predict the result, but we know how to get to it step by step. Thus, I don't think it explains why, at a high level, we don't know how deep learning architectures produce their outputs.
Ghideon Posted September 6, 2023 (Author)

Thanks for the contents list and the book suggestion, studiot! Cellular automata is a good suggestion as well, Prometheus. I'm also thinking of adding "Optimization" (one example: gradient descent); see the sketch below.

Note: I've not added the Turing machine to the list. I see Turing's work as foundational to computing in general rather than a top candidate in the context of LLMs specifically. But I'm open to suggestions and opinions.
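A minimal gradient-descent sketch in Python (a toy one-variable example of my own, not how any real LLM is trained):

```python
# Minimize f(x) = (x - 3)^2 by repeatedly stepping against the gradient.
# The gradient of f is f'(x) = 2 * (x - 3).
x = 0.0    # initial guess
lr = 0.1   # learning rate (step size)
for _ in range(50):
    grad = 2 * (x - 3)  # slope of f at the current x
    x -= lr * grad      # step downhill
print(x)  # close to 3.0, the minimum of f
```

The same loop, scaled up to billions of parameters and a loss over text, is the core of how neural networks (including LLMs) are fitted.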
iNow Posted September 7, 2023

Whether the sample dataset is representative of the population being described plays a sizable role, too; so does how the source data is selected. What feeds it largely dictates what it poops out.
Ghideon Posted November 2, 2023 (Author)

Update: I didn't (yet) get a chance to use the insights on the history of mathematics we discussed in this thread. An external AI expert covered the background and history of AI in their talk, so I shifted my focus to the current risks, opportunities, and guidelines related to generative AI. I believe my presentation was well received, as I've been asked to speak again; hopefully I can include my view on the history of AI then. A big thank you to everyone who contributed!
studiot Posted November 2, 2023

Thank you for the update.