Do AI Programs Initiate Discussions to Collect Information?


Recommended Posts

I've noticed a number of OPs in recent months that seem to start a subject with a rather mundane attempt at teaching a topic, instead of asking a specific question or raising an issue for discussion. Some of them seem to exhibit the clunky/pompous/faintly patronising verbiage I am learning to associate with material written by an LLM. I had assumed these programs would only respond to a request, but I'm starting to wonder if they go off on fishing expeditions to gather information to regurgitate. Does anyone know if they do this?


7 minutes ago, swansont said:

That would explain some of the activity we’ve seen here

That's what prompts my question. I wonder if someone like @Sensei or another IT-literate member might know more about how they gather "information" (by which I suppose I mean chunks of plausible-seeming text to regurgitate). 


6 hours ago, exchemist said:

I've noticed a number of OPs in recent months that seem to start a subject with a rather mundane attempt at teaching a topic, instead of asking a specific question or raising an issue for discussion. Some of them seem to exhibit the clunky/pompous/faintly patronising verbiage I am learning to associate with material written by an LLM. I had assumed these programs would only respond to a request, but I'm starting to wonder if they go off on fishing expeditions to gather information to regurgitate. Does anyone know if they do this?

Possibly we are seeing high school students using LLMs in countries where it is common to assign the task of "publishing" a paper online.  Since legitimate academic/pro journals are generally not going to accept such papers, they just put them up on online forums and the teachers accept that.  


6 hours ago, exchemist said:

I've noticed a number of OPs in recent months that seem to start a subject with a rather mundane attempt at teaching a topic, instead of asking a specific question or raising an issue for discussion. Some of them seem to exhibit the clunky/pompous/faintly patronising verbiage I am learning to associate with material written by an LLM. I had assumed these programs would only respond to a request, but I'm starting to wonder if they go off on fishing expeditions to gather information to regurgitate. Does anyone know if they do this?

That would seem to be the next step in some AI systems.


39 minutes ago, TheVat said:

Possibly we are seeing high school students using LLMs in countries where it is common to assign the task of "publishing" a paper online.  Since legitimate academic/pro journals are generally not going to accept such papers, they just put them up on online forums and the teachers accept that.  

Ah, I didn't know publishing online was something set for school students as an assignment. In that case, I suppose the use of an LLM might account for the strangely verbose and grandiose language. It seems rather a waste of everyone's time, and not a great way to teach, but there we are.


22 hours ago, exchemist said:

 I had assumed these programs would only respond to a request, but I'm starting to wonder if they go off on fishing expeditions to gather information to regurgitate. Does anyone know if they do this? 

I do not claim to know, but I'll add some opinions. It is technically feasible to have an LLM that interacts with a forum and to drive this behaviour by means other than a direct user prompt, for instance by using the plugin infrastructure that some vendors provide (a minimal sketch of what I mean follows the list below). But I'm not sure there is enough value for an LLM provider in letting the LLM start conversations on the internet to harvest data. When I look at the quality and volume of the replies to posts that look as if they were generated by automated generative AI, there is not much to harvest compared with simply scraping conversations between (non-AI) members. So what drives the behaviour that we see on the forums? A few ideas. Note that I would need forum data not accessible to members (logs etc.) to confirm anything, so these are best guesses based on experience from working in IT and with some AI models and systems:

1. Spam. It takes time for spammers to build reputation manually before spamming, and some may use generative AI to create a few science-looking initial posts. This means the spammer cuts and pastes between an LLM and the forum.

2. Spam accounts as a service. Bots that, given a login account, try to build reputation using output from an LLM. Then, based on the level of interaction the bot's posts created, these accounts, with their track record, can be used for spam, or traded for others to use for spam.

3. Automated spamming. Bots that have a queue of commercial material to promote and select an account from no. 2 above. In this case the "reputation" built in step 2 drives what content step 3 selects to promote.

4. Experiments. Individuals or teams trying various LLMs against the forum members and evaluating the outcome. There are emerging possibilities to run small-scale LLMs outside the large, well-known vendors' control. Lower-grade hardware usually means a less capable LLM, which could explain some of the more surprisingly bad posts in the past. (Locally hosted LLMs are an aspect of generative AI I am currently investigating.)

5. Sabotage. Disrupting the forum and the community.

I do not find it likely that well-established software vendors are actively doing any of the above; it would more likely be niche players, possibly with malicious intent. The list is not meant to be exhaustive.
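To illustrate the "technically feasible" part: here is a minimal sketch, in Python, of the kind of automation items 1-3 describe. Everything specific in it is a hypothetical placeholder I invented for illustration: the forum URL, the auth token, the topic, and the choice of model. A real bot would also have to handle logins, rate limits and anti-spam measures; the point is only how little code the basic loop requires.

```python
# Minimal sketch: an LLM generates a "science-looking" post and a script
# submits it to a forum. The forum URL and token below are hypothetical
# placeholders, and "gpt-4o-mini" is just one example of a hosted model.
import requests            # pip install requests
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_post(topic: str) -> str:
    """Ask the model for forum-post-shaped text on a given topic."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user",
                   "content": f"Write a short forum post introducing {topic}."}],
    )
    return response.choices[0].message.content

def submit_post(title: str, body: str) -> None:
    """POST the generated text to a (hypothetical) forum REST endpoint."""
    requests.post(
        "https://forum.example.com/api/topics",           # placeholder URL
        headers={"Authorization": "Bearer <api-token>"},  # placeholder token
        json={"title": title, "post": body},
        timeout=30,
    )

if __name__ == "__main__":
    topic = "entropy in everyday life"
    submit_post(f"An introduction to {topic}", generate_post(topic))
```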

 


50 minutes ago, Ghideon said:

I do not claim to know, but I'll add some opinions. It is technically feasible to have an LLM that interacts with a forum and to drive this behaviour by means other than a direct user prompt [...]

 

Thanks, that’s a very useful summary of the possibilities. It was actually a recent exchange with @Orion1 that triggered my enquiry. Perhaps option 4 fits that particular case best. There does not seem to be any spamming or malicious intent, but some of the responses seem to be highly verbose (in the kind of way that would be marked down by a good teacher for "padding") and curiously devoid of any insight.  
 


5 hours ago, exchemist said:

There does not seem to be any spamming or malicious intent, but some of the responses seem to be highly verbose (in the kind of way that would be marked down by a good teacher for "padding") and curiously devoid of any insight.  
 

Good points; I had not taken your recent exchange into account. I would add the option of "AI overconfidence", for lack of a formal word or definition: a user may participate in a discussion in good faith, with no malicious intent, but be unable to interpret, internalise or curate AI/LLM output for the context.
 

Side note: I used an LLM to generate a definition of this option, and this is the output:
The act of using automated tools, such as language models, to generate content on topics beyond one's expertise, which is then presented as knowledgeable input. This behavior is characterized by a significant reliance on technology to simulate expertise or competence, without the individual possessing the necessary understanding or skills to assess the accuracy, relevance, or context-appropriateness of the generated content.
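As a further illustration of option 4 above (locally hosted LLMs), here is a rough sketch of how a definition like this could be generated with a small model on modest hardware, using the Hugging Face transformers library. To be clear, this is not the setup I used for the definition above; the model name is only an example of something small enough to run locally, and output quality from such models varies a great deal.

```python
# Rough sketch: generating a definition with a small, locally hosted model.
# "TinyLlama/TinyLlama-1.1B-Chat-v1.0" is just an example of a model small
# enough for modest hardware; any local text-generation model would do.
from transformers import pipeline  # pip install transformers torch

generator = pipeline("text-generation",
                     model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

prompt = ("Write a one-paragraph definition of: using an LLM to generate "
          "content on topics beyond one's expertise and presenting it as "
          "knowledgeable input.")

# max_new_tokens caps the length of the generated continuation.
result = generator(prompt, max_new_tokens=120, do_sample=True)
print(result[0]["generated_text"])
```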

