

My address is not a house or a street, my address is the Soviet Union.
an 'orrific necktie
dibs on this being the name of the post-reunification dengism, instead of boring socialism with korean characteristics
comrade gangnam style
slander
stolen valor
genuinely, saw the bold and the bad prose and just skipped over it like all the other ai slop
SDF to free 250 jihadists
they kicking their feet in arlington and riyadh
polish anticommunists 🤝 weaponizing identities
wild how easy it is to get licensed in the first place
many people would not be able to live because of car dependent shitholes being car dependent.
chapo dot chat
Most people say distilled model, distillate sounds right as well. The process is called distillation. I’ve just fried my brain on the local LLM subreddit because I was trying to get the transformers library working, probably why I phrased it like that.
yes
It’s just sensationalism from a journalist who can’t even be bothered to multiply two numbers.
I was just looking at this rule: https://en.wikipedia.org/wiki/Yǒu_biān_dú_biān
Usually you’d rely on educated guesswork like this - and in many cases the character isn’t pronounced exactly the same because of drift (https://en.wikipedia.org/wiki/Chinese_character_classification#Sound_change), but Chinese isn’t as precise as many people make it out to be: “When one encounters such a two-part character and does not know its exact pronunciation, one may take one of the parts as the phonetic indicator. For example, reading 詣 (pinyin: yì) as zhǐ because its “side” 旨 is pronounced as such. Some of this kind of “folk reading” have become acceptable over time – listed in dictionaries as alternative pronunciations, or simply become the common reading. For example, people read the character 町 ting in 西門町 (Ximending) as if it were 丁 ding”.
The advice is meant for the majority of phonetic-semantic characters, which make up roughly 80% of Chinese characters. It requires a good base, of course, so it’d be applied at a middle-school level and up.
Your example is equivalent to saying you don’t know how to pronounce “baa” because you know the letter “a” but not the “b”. If you know 冫 then you know 冰.
They’ve had distills before this; a more accurate title would be “Newest DeepSeek R1 distill runs on a single GPU like all the previous ones”.
Also, it’s not accurate to say that a Qwen3 distill is the same as the DeepSeek R1 running in the datacenter. That one is still roughly 85x larger than the Qwen3 distill.
“What stands out about DeepSeek-R1-0528-Qwen3-8B is that it only requires a GPU with 40GB to 80GB of RAM to run”
This is just inaccurate. The weights fit in about 16GB of VRAM at 16-bit precision… because, you know, 8B parameters x 2 bytes (needed to store each parameter) = 16x10^9 bytes = 16GB… (plus a bit of headroom for context).
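For anyone who wants to check the multiplication, here is a rough sketch of it in Python. The helper name `weight_memory_gb` is made up for illustration, and it only counts the weights themselves (no KV cache, activations, or framework overhead), so treat the result as a floor rather than an exact requirement. The 0.5 bytes per parameter figure for 4-bit quantization is an assumption that ignores per-block scale overhead.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the weights, in GB (10^9 bytes).

    Hypothetical helper for illustration only; real usage is higher because
    of the KV cache, activations, and runtime overhead.
    """
    return num_params * bytes_per_param / 1e9


# 8B parameters stored as 16-bit floats (2 bytes each), as in the comment above.
fp16_gb = weight_memory_gb(8e9, 2)

# Assumed 4-bit quantization (~0.5 bytes per parameter, ignoring scale overhead).
q4_gb = weight_memory_gb(8e9, 0.5)

print(f"16-bit weights: ~{fp16_gb:.0f} GB")
print(f"4-bit weights:  ~{q4_gb:.0f} GB")
```

Either way it lands far below the 40GB to 80GB quoted in the article.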
Learners are told to sound out new characters, because often enough they sound close enough to their components.