Fascinating! I often wondered if corporations used hyper-specific prompts in an effort to get an image as close as possible to the original so they could blame the image generator for plagiarism (then sue them and naturally get a crap ton of money from doing so), but the prompts used here seem very generic, yet bear an uncanny resemblance to these screencaps.
There is some debate about the ethics of it, but supposedly there should be no legal problem with using copyrighted images in a training dataset so long as the outputs are transformative (i.e. don’t resemble any one image too closely). I wonder if there’s anything the developers can do to prevent it, or if it’s just something an image model will inevitably do.
I started looking for alternatives when they added the weird character voices and I started noticing inaccurate pronunciation of kanji in my Japanese course. A lot of people on the message boards recommended Memrise, and it’s been great! The official courses contain actual video and audio of native speakers, so I knew for a fact the pronunciation would be correct—even better than the old Duo voices!
There’s user-generated content, too, some of which might not be accurate, but most of the user courses I’ve found are pretty good. You can even make your own set and publish it.
(I haven’t visited the site in a few months, so I can’t guarantee it’ll be exactly as I found it, but I doubt it has changed much)
And depending on what languages you’re studying, you might be able to find some good dedicated resources if you do some digging. For Spanish, I used SpanishDict, and for Japanese, I used Kanshudo (both are freemium, with more restrictions than Memrise).