rule

thejoker954@lemmy.world · 2 months ago

Subtitles are a perfect use case for LLMs.

Jack@slrpnk.net · 2 months ago

No, what you are thinking of is speech-to-text software. It is much older than LLMs and works in a very different way.

thejoker954@lemmy.world · 2 months ago

While speech-to-text software does indeed predate LLMs, LLMs do it as well. I’ve only tried a few basic (aka free) options, so I have no idea how well they do en masse, but the generated results were at least on par with, if not better than, YouTube’s auto captions.

It might not technically be LLMs though. It could be a different type of “ai”. I just can’t stand the “ai” marketing when nothing they are making is actually ai, so until they pull their heads out of their asses, all “ai” models are LLMs to me.

Jack@slrpnk.net · 2 months ago

Understandable, AI marketing is a shitshow right now, but I think they are not even AI. People just forget that tech used to do magic before AI existed.

LwL@lemmy.world · 1 month ago

It’s kind of the other way around, we’ve always had AI, it used to just basically mean a computer making some decision based on data. Like a thermostat changing the heating in response to a temperature change.

Then we got LLMs and because they are good at pretending to have complex reasoning ability, AI as a term started to always mean “computer with near human level intelligence” which of course they are absolutely not.
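
To make the thermostat example concrete, here is a minimal sketch of that older sense of "AI": a fixed rule that turns a temperature reading into a decision. The function name, the 20 °C target and the 0.5 °C band are all made up for illustration, not taken from any real device.

```python
def thermostat(temp_c, heating_on, target_c=20.0, band_c=0.5):
    """Decide whether the heating should be on, given the current reading.

    All names and default values here are hypothetical. The small band
    around the target stops the heater from flickering on and off.
    """
    if temp_c < target_c - band_c:
        return True   # too cold: switch the heating on
    if temp_c > target_c + band_c:
        return False  # too warm: switch the heating off
    return heating_on  # inside the band: keep the current state


print(thermostat(18.0, heating_on=False))
print(thermostat(22.0, heating_on=True))
```

No learning and no language model involved, just a decision made from data.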

Jack@slrpnk.net · 1 month ago

There was a book, I can’t remember the title, whose whole thesis was exactly that: “AI is whatever automates the decision-making process”, not any particular group of algos.

ButteryMonkey@piefed.social · 2 months ago

This is a big part of it. Back when ai was first becoming big, my manager said they needed to run all my kb articles through an ai to generate link clouds or some such.

I was like umm… that’s a service this platform has always offered…? Like just because you don’t know what the kb tools do, or what our rock bottom subscription gets us, doesn’t mean I haven’t looked into it… but that also isn’t worth doing because now we only have a handful of articles in any given category because I’m good at my job…

unexposedhazard@discuss.tchncs.de · 2 months ago

Yeah speech to text models have nothing to do with LLMs and their use for captioning is perfectly fine imo

oplkill@lemmy.world · 2 months ago

Nope, they still not good. I using YouTube auto gen subs and they 100% need LLM to fix mistakes.

AnarchoEngineer@lemmy.dbzer0.com · 2 months ago

Large language models are designed to generate text based on previous text. Translation from audio to text can be done via a neural net but it isn’t a Large Language Model.

Now, you could combine the two to, say, reduce errors on mumbled words by having a generative model predict the words that would fit better in that unclear sentence. However, you could likely get away with a much smaller and faster net than an LLM. In fact, you might be able to get away with plain-Jane Markov chains, no machine learning necessary.

Point is that there is a difference between LLMs and other neural nets that produce text.

In the case of audio to text translation, using an LLM would be very inefficient and slow (possibly to the point it isn’t able to keep up with the audio at all), and using a very basic text generation net or even just a probabilistic algorithm would likely do the job just fine.
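
As a rough sketch of the Markov chain idea, assuming the speech-to-text step hands over a few candidate words for an unclear slot: count word pairs in some reference text, then keep the candidate that fits best between its neighbours. The tiny corpus, the function names and the add-one smoothing are all illustrative assumptions, not how any real captioning system works.

```python
from collections import Counter


def train_bigrams(corpus):
    """Count adjacent word pairs across a list of sentences."""
    counts = Counter()
    for sentence in corpus:
        words = sentence.lower().split()
        counts.update(zip(words, words[1:]))
    return counts


def pick_word(counts, prev_word, candidates, next_word):
    """Pick the candidate that fits best between its two neighbours.

    Score = (count(prev, cand) + 1) * (count(cand, next) + 1).
    The +1 is add-one smoothing so unseen pairs do not zero the score.
    """
    def score(cand):
        return (counts[(prev_word, cand)] + 1) * (counts[(cand, next_word)] + 1)

    return max(candidates, key=score)


# Hypothetical toy corpus; a real one would be far larger.
corpus = [
    "turn the heat up",
    "turn the heat down",
    "the heat is on",
    "i like to eat the cake",
]
counts = train_bigrams(corpus)

# Suppose the audio model could not tell "heat", "eat" and "heap" apart
# in "turn the ??? up".
best = pick_word(counts, "the", ["heat", "eat", "heap"], "up")
print(best)
```

This is just counting and a lookup, so it runs in effectively no time compared to an LLM pass over the same sentence.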

Ziglin (it/they)@lemmy.world · 2 months ago

How would an llm fix a mistake equivalent to something being misheard? I feel like you’re misunderstanding something and could probably also use some help with your English.

Norah (pup/it/she)@lemmy.blahaj.zone · 1 month ago

[…]could probably also use some help with your English.

what the actual fluff is up with lemmy.world accounts in this thread acting like jerks?

Lily [she/her, pup/pup's]@lemmy.blahaj.zone · 1 month ago

lemmy.world accounts acting like jerks

many such cases

RushLana@lemmy.blahaj.zone · 2 months ago

As someone who uses a screen reader daily, absolutely the fuck not.

LLMs will invent things out of thin air and ruin any comprehension. They waste my time rather than help me.

thejoker954@lemmy.world · 2 months ago

If you use any generic LLM then yes, but there are LLMs (like I said in another reply, it’s probably not an LLM, but as there is no ‘real’ ai, that’s what I’m calling all this ai bullshit) that are trained specifically for captioning/transcripts, just not necessarily in real time.

Doing it “live” is what increases the error rate.

leftytighty@slrpnk.net · 2 months ago

LLMs are large language models, they’re a specialized category of artificial neural network, which are a way of doing machine learning. All of those topics are under the academic computer science discipline of artificial intelligence.

AI, neural net, or ML model are all way more accurate to say than LLM in this case.

spujb@lemmy.cafe · edit-2 1 month ago

to clarify we are talking about a post caption, not closed captions.

that is, the text you put in the description of an image or video post.

forkDestroyer@infosec.pub · 2 months ago

Crunchyroll really messed up their subs with AI. Not sure if they mean LLMs and are just calling it AI but still:

https://www.animenewsnetwork.com/news/2024-02-27/crunchyroll-confirms-testing-a.i-for-subtitling/.208086

Kept wondering why subtitles were so obviously off when I was watching some stuff. It was horrid.

Norah (pup/it/she)@lemmy.blahaj.zone · 1 month ago

Automatic subtitles like on YouTube use Machine Learning, NOT a Large Language Model.

Natanox@discuss.tchncs.de · 1 month ago

Fuck no.