
iNow


Everything posted by iNow

  1. Fair. I used rhetorical flair instead of peer-reviewed precision. To correct myself from earlier, the core idea is this: There are multiple frontier models that anyone following this space uses daily. There are experimental models and teams training models that are available but slightly more difficult for the layperson to access. There are then the people like we see here on SFN who are slow to catch up (late adopters vs early adopters) who continue using some very old versions of some very deeply flawed and outdated models because they're slightly easier to access (or, more likely, due to behavioral friction; they simply go with what they know). Are we really as far apart on this as it feels? No worries if we are, but I don't feel I'm being in any way extreme or unreasonable with my points. YMMV. Again... it's unclear to me why you think I disagree with this
  2. I’m unclear with which part you’re disagreeing. Sometimes answers are wrong and I’ve repeatedly acknowledged that, along with supplemental detail outlining some of the most common reasons for that. There’s tons of overlap in their Venn diagrams, but they are distinct. LLMs are great at processing and generating natural human language, but tend to suffer when engaged for problem solving. Reasoning models, however, explicitly focus on logical deduction and step by step problem solving. Some might argue that reasoning models are just a specialized type of LLM, but I see it as a similar distinction as we see in biology when trying to differentiate species. The lines between any two are subjective and arbitrary. Of note... OpenAI, for example, announced 2 weeks ago that their new experimental reasoning LLM solved 5 of 6 problems (scoring 35/42) on the International Math Olympiad (IMO). This gave them gold medal status and the test was done under the same rules as humans (4.5 hours, no tools or internet) producing natural language proofs. https://x.com/alexwei_/status/1946477742855532918 Right on their heels, Google announced that an advanced version of their Gemini Deep Think also achieved an equivalent gold-medal score of 35/42 on the same International Math Olympiad test. https://deepmind.google/discover/blog/advanced-version-of-gemini-with-deep-think-officially-achieves-gold-medal-standard-at-the-international-mathematical-olympiad/
  3. It may be helpful to realize that LLMs are just one type of model. They have largely evolved into reasoning models. You’ll notice this more easily when ChatGPT-5 releases in the next few weeks, but several models like Grok4 and others are already displaying those properties. In the end, the answer is only as good as the question. Prompt engineering is becoming far less relevant now that the models are getting so much better, but it’s still a useful art to practice.
  4. I support recognizing the limitations of models and recognizing where they're likely to go wrong, but encourage caution here over generalizing these results. Yes, it's correct that a "language" model struggles more with math and physics. No doubt there, but we're no longer really using language models and are rapidly moving into reasoning models and mixture of experts applications where multiple models get queried at once to refine the answer. I dug into the arXiv paper and found two concerns with their methods in this study that give me pause. One is they trained it themselves. Who knows how good they are at properly training and tuning a model. That is very much an art where some people are more skilled than others. Two is that they trained models that are not SOTA and are relatively low ranking in terms of capability and performance. It's a cool paper that reinforces some of our preconceptions, but they're basically saying the Model T is a bad car because the air conditioning system, which was built by a poet, doesn't cool a steak down to 38 degrees in 15 minutes. ... or something like that. Know the models' limitations, sure. Know that math and physics don't lend themselves to quality answers based on predictive text alone. But also know those problems were largely solved months ago and only get better every day. The models we're using today are, in fact, the worst they will ever be at these tasks since they get better by the minute. /AIfanboi
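The "multiple models get queried at once to refine the answer" idea in the post above can be sketched as simple majority-vote ensembling. This is a toy illustration only: the model functions below are hypothetical placeholders, not real model APIs, and true mixture-of-experts systems route queries inside a single network rather than polling separate models.

```python
# Toy sketch of querying several models and keeping the majority answer.
# The model_* functions are hypothetical stand-ins, not real APIs.
from collections import Counter


def model_a(prompt: str) -> str:
    return "42"  # placeholder answer


def model_b(prompt: str) -> str:
    return "42"  # placeholder answer


def model_c(prompt: str) -> str:
    return "41"  # one model disagrees


def ensemble_answer(prompt: str, models) -> str:
    """Query every model with the same prompt and return the majority answer."""
    votes = Counter(m(prompt) for m in models)
    return votes.most_common(1)[0][0]


print(ensemble_answer("What is 6 * 7?", [model_a, model_b, model_c]))  # 42
```

Real systems refine rather than just vote (e.g. a judge model reconciles the candidates), but the aggregation step looks broadly like this.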
  5. I was tired when I read this and my first reaction was, how do I secure a card at the MILF Library?
  6. Maybe, and then Republicans put 19 justices on the SCOTUS when they retake power and our banana republic continues its slide.
  7. Technically, parties didn’t really exist yet when the founders wrote our governing docs, and Washington warned against them
  8. Like someone from the justice department?
  9. Let’s say the court rules against the president. Does the court have its own police to enforce their ruling?
  10. Michael - Why do you keep neg repping me? My point was valid. The internet is full of people being insincere and pretending to be something they’re not. Creation of fake accounts to bolster a position is very common. Is that happening here? I guess not, but that doesn’t mean I’m wrong nor deserving of repeated downvotes.
  11. Yes, because people on the internet never lie nor pretend to be something they’re not
  12. Let’s say the Supreme Court issues contempt. Does the court have its own police to enforce their ruling?
  13. You know every time you engage with it and comment on those threads you’re helping them elevate their reach, right? Super, but you’re the one who opened a thread to vent about it. If you don’t want to hear counter views then stay silent. This isn’t a blog or a pulpit. It’s a discussion site. We’re discussing it.
  14. Both are already in place, but spam has become easier with AI. This is not a new issue. We're largely aligned. Yes, it would be great never to see it at all. That expectation is unrealistic, though. Accept it as another example of the enshittification of the internet and move along is my advice.
  15. It's annoying, yes, but you're asking for 24/7 coverage from a volunteer team who have lives outside of this group. That's a non-starter. You can ignore it and move on. What is the true risk here other than a mild inconvenience? Please have some perspective.
  16. Silence works
  17. I don’t like your tone
  18. Irrelevant. Which makes me chuckle given he downvoted me for it. Very self-referential
  19. I often find cats very confusing. I’m certainly not alone in that. The AI is perhaps already more human than we realize.
  20. While I also feel many downvotes given out here are undeserved, the person doing the voting has their own reasons for doing so. Who are you to say that reason is irrelevant or "nothing?"
  21. Travel and exposure to different regions and cultures is one of the surest ways to expand the mind of a developing human. Becoming a parent, however, helps you realize that nature plays a massively outsized role relative to nurture.
  22. Wow. Powerful insights there. It's phenomenal what humans can post nowadays
