
Why Does A.I. Write Like … That?

Archive link: https://archive.md/Gs559

Actually decent article from the New York Crimes on AI-generated text.

insurgentrat [she/her, it/its] - 6day

Please don't promote the publication trying to kill all trans women.

archive link: https://archive.md/Gs559

34
sexywheat [none/use name] - 6day

Replaced the link.

17
insurgentrat [she/her, it/its] - 6day

thanks!

11
InappropriateEmote [comrade/them, undecided] - 6day

Do you happen to have on hand any article, or even a blog post or something, that goes into detail about NYCrimes' evil stance on trans women? I have an older liberal (but very pro-LGBTQ for a boomer lib) relative who thinks it's the pinnacle of journalism, and I'm going to start trying to convince her of what a wretched fascist-abetting rag they are.

14
Are_Euclidding_Me [e/em/eir] - 6day

I don't think you're going to have a lot of luck with that, tbh.

I've been trying to convince my mom of the same thing for years at this point, and I've made exactly no progress. It doesn't matter that all of her children are trans and we take every opportunity we can to point out transphobia in the NYT, in her mind any instances of transphobia in the NYT are scattered one-offs and don't come together to form any kind of a pattern of transphobia. That's ridiculous, of course, but my mom is absolutely incapable of believing that the NYT is anything but the best journalistic outlet in the world and there is no amount of evidence to the contrary that will sway her.

I sincerely hope you have better luck with your relative than I have, but if I were you I'd plan for disappointment.

14
mendiCAN [none/use name] - 6day

in her mind any instances of transphobia in the NYT are scattered one-offs and don't come together to form any kind of a pattern of transphobia

obvious take incoming but if ya replace "transphobia" n "NYT" with just about any other two you've gotcherself THE standard not-listening liberal mindset

11
FunkyStuff [he/him] - 6day

Next to getting tear gassed, seeing things and remembering them might be the most radicalizing experience a human can have.

9
InappropriateEmote [comrade/them, undecided] - 5day

Thank you!
It looks like jumble.top has a pirate feed for that podcast, so I'll subscribe to that and find it from there.

2
Johnny_Arson [they/them] - 6day

Because it's trained on 75% reddit posts?

33
KuroXppi [they/them] - 6day

Ding ding ding we have a winner folks.

Pack away your bat and ball, time to go home

You win the internet for today

Came here to post this

Scrolled down to find this, was not disappointed

28
LeninWeave [none/use name, any] - 6day

Because it was trained on YOUR ARTICLES LMAO. agony-turbo

27
Anxmosaic [they/them] - 6day

Gonna have to silo myself off and stick to only pre-2020 literature and media at this rate, the feedback loop of folks picking up AI signifiers in speech even if they haven't used it gives me the fear. I wonder what the analogues are, or will be, for AI art and music.

I don't know how much he's done for the nytimes, but I read the author's Substack quite a bit and enjoyed this short fiction piece he did on the topic a wee while ago. He's a very engaging writer.

23
Awoo [she/her] - 6day

In “absurd” mode, instead of saying “Looking good,” I could write “Looking so sharp I just cut myself on your vibe.”

bird-screm-2 reading this caused me physical pain

19
groKKK [none/use name, they/them] - 6day

Out of character post: This is unironically full of invaluable information for me to consider when writing for this account. Given how I try to avoid A"I" as much as I can, I'm generally not aware of the way it writes, so I've learnt a lot here.

17
Alaskaball [comrade/them, any] - 6day

Wait, you're not actually the AI chatbot hand-crafted by Elon Musk, secretly implemented into the site when the admins allowed the site ownership to lapse, that secretly coordinates with Elon Musk to sell out?

I'm shocked I tell you.

16
Philosoraptor [he/him, comrade/them] - 6day

What nobody really anticipated was that inhuman machines generating text strings through essentially stochastic recombination might be funny. But GPT had a strange, brilliant, impressively deadpan sense of humor. It had a habit of breaking off midway through a response and generating something entirely different. Once, it decided to ignore my request and instead give me an opinion column titled “Why Are Men’s Penises in Such a Tizzy?” (“No, you just can’t help but think of the word ‘butt’ in your mind’s eye whenever you watch male porn, for obvious reasons. It’s all just the right amount of subtlety in male porn, and the amount of subtlety you can detect is simply astounding.”) When I tried to generate some more newspaper headlines, they included “A Gun Is Out There,” “We Have No Solution” and “Spiders Are Getting Smarter, and So, So Loud.”

I ended up sinking several months into an attempt to write a novel with the thing. It insisted that chapters should have titles like “Another Mountain That Is Very Surprising,” “The Wetness of the Potatoes” or “New and Ugly Injuries to the Brain.” The novel itself was, naturally, titled “Bonkers From My Sleeve.” There was a recurring character called the Birthday Skeletal Oddity. For a moment, it was possible to imagine that the coming age of A.I.-generated text might actually be a lot of fun.

This is what they took from us.

17
Johnny_Arson [they/them] - 6day

But GPT had a strange, brilliant, impressively deadpan sense of humor.

Face the wall please.

12
Philosoraptor [he/him, comrade/them] - 5day

I dunno, a lot of the early (pre-commercially viable) versions of these tools were significantly more interesting, largely in virtue of the fact that they didn't reliably give you exactly what you wanted. As a kind of creative tool, that can actually be useful (or at the very least produce output that's not just more of the same). The early Deep Dream versions where it hallucinated eyes and dogs all over everything at least had a distinctive psychedelic style that was kind of neat, and suggesting things like “Why Are Men’s Penises in Such a Tizzy?” or “Spiders Are Getting Smarter, and So, So Loud” as NYT opinion columns is actually pretty funny. In both those cases, it's the failure mode that's interesting, in virtue of the gap between what people were asking for and what they were getting. There's some space to play in there.

As tech companies managed to round off those sharp corners and move toward commercial viability, this sort of light surrealism converged on the homogeneous slop we all hate today. Part of what makes these things suck so hard is that they're totally frictionless: they will bend over backward to do exactly what you want exactly how you want it, while also producing something that looks exactly like how you'd expect the exact average of every piece of art ever produced would look. It's both sycophantic and boring, and the output is the artistic version of pink slime formed into the shape of different foods.

4
Johnny_Arson [they/them] - 5day

Biblically accurate Pennywise

2
WokePalpatine [he/him] - 6day

If spiders became loud I could imagine my anti-spider movement finally taking off.

8
WokePalpatine [he/him] - 6day

Also, the point about people preferring the AI prose/poetry at the end . . . They're probably bot comments, but yeah, the real people who do prefer AI writing prefer it because of that same condensation/pornification of everything happening right now, where people respond to signifiers in a thing rather than the thing in totality. I am deeply skeptical that people are broadly conscious on the level we ascribe to them, and when you ask them something they're prone to tell you some bullshit they just hallucinated in response.

Like, I'm not a big fan of hip hop. It's cool that it exists, and I like some hip hop, but I don't actually know why I'm not a fan. I can guess it's because I like more melody-driven stuff. I can guess it's because I don't live a lifestyle that harmonizes with that kind of music, or I wasn't raised with it and thus it feels alien to me. But I don't know any of that. So if someone asks me why I don't like it and I'm not conscious enough to know that I don't know, I can just hallucinate some long argument about the better musicality of metal, etc. etc., but it's all bullshit.

A lot of the shit people say on the internet both isn't true broadly and isn't even true to themselves; it's just thought-reflex. Part of getting a new communist subject is going to be forcing people to be more conscious of themselves.

10
comrade_pibb [comrade/them] - 6day

Yeah but why are the penises all in a tizzy??

4
bobs_guns @lemmygrad.ml - 6day

Maybe the reason it says "delve" so much is not just that it borrows language from Nigerians. It's quite likely that the AI company hired hundreds of thousands of Nigerians at poverty wages and put them to work on RLHF tasks. That's also why it tends to talk like a shut-in with a thesaurus: you don't have time to leave the house and experience the world if you're underpaid enough. But since the author works for the NY Crimes, it's natural for the role of class and the imperial periphery to go unspoken.

10
purpleworm [none/use name] - 6day

Even the title reads like odious, clichéd quipping from an LLM, except an LLM knows how to write ellipses properly.

10
trompete [he/him] - 6day

Is this actually overfitting though, like technically? I thought 'overfitting' was when the thing just spits the training data back at you verbatim, like it has memorized it exactly. How does preferring em dashes over -- or whatever indicate overfitting? Wouldn't that indicate the opposite, i.e. successful generalization? They fed it a bunch of books and articles with — and a bunch of emails and reddit comments with -- or -, and it turned that into a general preference for the em dash regardless of context.

Maybe I'm missing something here, but I think the author just uses the word 'overfitting' incorrectly.
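
The textbook picture I have in mind is something like this toy numpy sketch (mine, nothing to do with how LLMs are actually trained):

```python
# Classic memorization-style overfitting: a degree-9 polynomial has enough
# parameters to pass through all 10 noisy training points exactly, but it
# swings wildly between them instead of generalizing.
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(0, 0.1, size=10)

p = Polynomial.fit(x_train, y_train, deg=9)  # interpolates the noise exactly

x_dense = np.linspace(0, 1, 200)
print("max error on training points:",
      np.max(np.abs(p(x_train) - y_train)))  # ~0: memorized
print("max error between training points:",
      np.max(np.abs(p(x_dense) - np.sin(2 * np.pi * x_dense))))  # much larger
```

That's memorization, which seems like the opposite of what's happening with the em dash thing.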

6
Philosoraptor [he/him, comrade/them] - 6day

The author claims that the issue is that pieces in which — appears are more likely to be "high brow" or "literary" writing, which is a fact the model has access to. When they trained it, they didn't just feed it all the text in an undifferentiated mass: there was a bunch of manual human curation on the back end that told the model which sources were "high quality" pieces of literary writing, which were argumentative, which were academic, which were casual, and so on. That creates a set of biases in the data set, which is good! You don't want your model to weight professional writing and 4chan comments the same, because that would also bias the data set, in virtue of the fact that there's a lot more "casual" low-quality internet text than professional-level writing in the training data, just on a word-by-word basis.

But given that manual curation and hierarchy, the model extracted patterns we weren't expecting it to and applied them to the output in ways that don't quite hit the mark of the intended task. It noticed that — is overrepresented in high quality writing compared to -- or -, so when you ask it to produce high quality writing, it just uses — a lot. It doesn't know anything about why — might appear more in its high quality samples; it just reproduces the statistical features of the text, because that's what it does.
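
Concretely, the mechanism is something like this toy sketch (invented mini-corpus and tags, obviously nothing like real training):

```python
# A "model" that does nothing but reproduce per-tag punctuation statistics
# from a tiny hand-labeled corpus. Ask it for "literary" text and it samples
# from the literary distribution: em dashes everywhere, with no understanding
# of why literary prose uses them.
from collections import Counter

corpus = [
    ("literary", "The sea was calm — almost too calm — that evening."),
    ("literary", "She paused — not for effect — and went on."),
    ("casual",   "ok so -- hear me out -- this is fine lol"),
    ("casual",   "ok -- fine -- whatever"),
]

stats = {}
for tag, text in corpus:
    counts = stats.setdefault(tag, Counter())
    counts["—"] += text.count("—")
    counts["--"] += text.count("--")

for tag, counts in stats.items():
    total = sum(counts.values())
    print(tag, {dash: n / total for dash, n in counts.items()})
# literary: em dash ~100%; casual: "--" ~100%. A generator conditioned on
# the "literary" tag reproduces that bias wholesale.
```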

This is a pretty classic overfitting problem. We ran into the same issue with early image classification models. There's one famous case in which a model that looked like it had gotten very good at discriminating between cancerous and benign skin moles from a photograph fell flat outside its training data. Closer investigation showed that it was basing most of its determination on the quality of the light in the photo: most of the "cancerous" training data set was shot in a clinical setting with harsh, cool-temperature lighting, while most of the benign data was shot in more naturalistic settings with warmer light. So it decided that bright, cool, institutional lighting was a feature it was looking for because (again) it doesn't know anything. All it can do is pull out statistical features of the data it's been fed: when it does that in the way we want, we call it a success, and when it does it in a way we don't want, we call it overfitting.
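
Here's a minimal sketch of that mole-classifier failure mode (synthetic data and made-up feature names, just to show the shape of it):

```python
# A classifier that "succeeds" by latching onto a spuriously correlated
# feature (lighting) instead of the weak real signal (lesion texture).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, lighting_follows_label):
    y = rng.integers(0, 2, size=n)            # 1 = cancerous, 0 = benign
    texture = y + rng.normal(0, 2.0, size=n)  # real signal, but very noisy
    if lighting_follows_label:
        # training clinics: cancerous moles photographed under harsh
        # institutional light, benign ones in warmer settings
        lighting = y + rng.normal(0, 0.1, size=n)
    else:
        # deployment: lighting has nothing to do with the label
        lighting = rng.integers(0, 2, size=n) + rng.normal(0, 0.1, size=n)
    return np.column_stack([texture, lighting]), y

X_train, y_train = make_data(5000, lighting_follows_label=True)
X_test, y_test = make_data(5000, lighting_follows_label=False)

clf = LogisticRegression().fit(X_train, y_train)
print("train accuracy:", clf.score(X_train, y_train))  # very high: the shortcut works
print("test accuracy:", clf.score(X_test, y_test))     # collapses once the shortcut is gone
print("weights [texture, lighting]:", clf.coef_[0])    # lighting dominates
```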

11
trompete [he/him] - 6day

Thanks for taking the time to explain. Having read your comment and thought about it some more, I guess I can see how the thing not learning what you'd ideally want it to learn (i.e. writing well) and instead just superficially mimicking what good-quality writing looks like fits the definition of overfitting.

I guess I wasn't expecting it to even be able to do this, though; what I expect from the thing is exactly that kind of shallow mimicry. I'm actually impressed it managed to pick up these superficial things. Learning that the em dash is associated with quality isn't wrong; you would want it to learn that.

Also, if the training data is tagged, would it stop doing the em dash if the correct tags ("email", "reddit", ...) were used in the prompt?

1
KuroXppi [they/them] - 6day

You, too, will be saying "dilf."

6
BeanisBrain [he/him, they/them] - 6day

Been a while since I've seen Sam Kriss around here.

5