AI and its effect on your music, your job and the future

  • Thread starter M3CHK1LLA
  • Start date
  • This site may earn a commission from merchant affiliate links like Ebay, Amazon, and others.

crushingpetal

SS.org Regular
Joined
Nov 11, 2022
Messages
1,381
Reaction score
1,938
I missed this earlier, but I'm not sure all the text of the internet quite portrays a meaningful or complete view of the world. It certainly portrays a good chunk of how online folks like to talk about the world, but if we're scraping the reddits and x/twitters of the world for a large chunk of that, we're not exactly starting from a very reliable source of "understanding" the world to begin with.

If the goal is to plausibly sound like someone who would post on the internet, that might be enough, but I don't think that satisfies a model of understanding the world.
Again, +1. You're hitting it out of the park today. (Or pick your favorite sports analogy.)
 


narad

Progressive metal and politics
Joined
Feb 15, 2009
Messages
16,833
Reaction score
31,344
Location
Tokyo
Agree to disagree on these points.

> I think we have basically all the text-of-the-internet data we're ever going to need when it comes to gleaning an understanding of the world from raw text.

Try a thought experiment on a scenario where we grabbed the same amount of text from people in the 1970s.

I don't quite understand. People in the 1970s had human level intelligence, no?

> I missed this earlier, but I'm not sure all the text of the internet quite portrays a meaningful or complete view of the world. It certainly portrays a good chunk of how online folks like to talk about the world, but if we're scraping the reddits and x/twitters of the world for a large chunk of that, we're not exactly starting from a very reliable source of "understanding" the world to begin with.
>
> If the goal is to plausibly sound like someone who would post on the internet, that might be enough, but I don't think that satisfies a model of understanding the world.

I didn't say that we had enough text to have an understanding of the world. I said we had enough text to gain as much of an understanding of the world as we can "from raw text", i.e., as much as we're going to need from that modality.

The current and next-gen models are getting a lot of their comparatively new data from other modalities directly: audio data in the case of GPT-4o, transcriptions of audio in the case of a lot of LLMs, and video data in the case of Sora and some Gemini models.

So if you're talking about model collapse, that argument assumes the internet keeps growing in size, AI-generated content makes up a lot of it, and we still train naively and uniformly on the resulting web scrape. In reality, the scrape's usefulness is not going to keep growing in proportion to its size: future models will be trained on huge scrapes of data in other modalities, models will be used to score how likely each sample is to have come from a previous model, and data will be sampled selectively to maximize its utility to the new model.
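The selective-sampling idea in that last sentence can be sketched as a trivial filter. This is a toy illustration, not anyone's actual pipeline: the `ai_likelihood` field stands in for a hypothetical detector model's per-sample estimate that the text came from a previous-generation model, and the sample format and threshold are made up.

```python
# Toy sketch: filter a web scrape by a detector's estimate of AI origin.
# `ai_likelihood` is a hypothetical score in [0, 1] assigned upstream by a
# detector model; higher means more likely generated by an earlier model.

def filter_scrape(samples, max_ai_likelihood=0.5):
    """Keep only samples the detector considers likely human-written."""
    return [s for s in samples if s["ai_likelihood"] <= max_ai_likelihood]

scrape = [
    {"text": "forum post about guitars", "ai_likelihood": 0.1},
    {"text": "generic SEO spam page",    "ai_likelihood": 0.9},
    {"text": "news article",             "ai_likelihood": 0.4},
]

kept = filter_scrape(scrape)
print([s["text"] for s in kept])  # the two low-likelihood samples survive
```

A real curation pipeline would weight or resample rather than hard-threshold, and would combine this with quality and deduplication signals, but the shape of the idea is the same: score the data, then sample selectively instead of training uniformly on everything.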
 