Is anyone actually surprised by this?

  • JOMusic@lemmy.ml
    English
    arrow-up
    42
    arrow-down
    1
    ·
    6 days ago

    This article is what US propaganda looks like, folks. Mashable should be ashamed.

    Literally all AI companies do this to run their services. Except you can actually download Deepseek and run it completely securely on your own devices. You know who doesn’t allow that security? OpenAI and the other US companies currently being screwed.

    • zeca@lemmy.eco.br
      arrow-up
      1
      ·
      6 days ago

      every Google site has been doing this for years too. every comment we write on YouTube and discard before posting, it's being recorded. this isn't news at all.

  • Zip2@feddit.uk
    arrow-up
    65
    arrow-down
    1
    ·
    7 days ago

    Did the American technology giants think they had the monopoly on capturing human input too?

  • ArchRecord@lemm.ee
    English
    arrow-up
    42
    arrow-down
    2
    ·
    6 days ago

    the company states that it may share user information to "comply with applicable law, legal process, or government requests."

    Literally every company’s privacy policy here in the US basically just says that too.

    Not only does DeepSeek collect “text or audio input, prompt, uploaded files, feedback, chat history, or other content that [the user] provide[s] to our model and Services,” but it also collects information from your device, including “device model, operating system, keystroke patterns or rhythms, IP address, and system language.”

    Breaking news, company with chatbot you send messages to uses and stores the messages you send, and also does what practically every other app does for demographic statistics gathering and optimizations.

    Companies with AI models like Google, Meta, and OpenAI collect similar troves of information, but their privacy policies do not mention collecting keystrokes. There’s also the added issue that DeepSeek sends your user data straight to Chinese servers.

    They didn’t use the word keystrokes, therefore they don’t collect them? Of course they collect keystrokes, how else would you type anything into these apps?

    In DeepSeek’s privacy policy, there’s no mention of the security of its servers. There’s nothing about whether data is encrypted, either stored or in transmission, and zero information about safeguards to prevent unauthorized access.

    This is the only thing that seems disturbing to me, compared to what we’d like to expect based on the context of what DeepSeek is. Of course, this was proven recently in practice to be terrible policy, so I assume they might shore up their defenses a bit.

    All the articles that talk about this as if it’s some big revelation just boil down to “company does exactly what every other big tech company does in America, except in China”

    • tux@lemmy.world
      arrow-up
      8
      arrow-down
      8
      ·
      6 days ago

      Collecting keystrokes is very different from collecting text entered into fields. Keystroke rhythm is even more alarming, as it is often used to identify users despite their privacy settings, or to work out what's typed via audio collection.

      Your argument that this is no different than other apps is complete crap. Don’t trust any app that collects that information.

      • Ferk@lemmy.ml
        arrow-up
        5
        ·
        edit-2
        6 days ago

        The argument stands, though.

        Yes, not ALL other apps do that, but the comment was specifically talking about companies like Google and Meta… they definitely do collect incomplete strings from search forms (down to individual characters) when they display search suggestions, for example. They might not mention “keystrokes” in the legal text, but I don’t see why they wouldn’t be able to extrapolate your typing pattern since they do have the timing information which should be enough data to, at some level, profile it.
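
A rough sketch of the point about timing, in Python. It assumes a server that logs each suggestion request as a `(timestamp_ms, partial_text)` pair; the log format, the function names, and the sample data are all made up for illustration and are not taken from any real service.

```python
from statistics import mean, pstdev


def inter_key_intervals(events):
    """Return the gaps (ms) between requests that each added one character.

    events: list of (timestamp_ms, partial_text) in arrival order.
    This log shape is an assumption made for the example.
    """
    intervals = []
    for (t_prev, s_prev), (t_cur, s_cur) in zip(events, events[1:]):
        # Skip deletions, pastes, and edits in the middle of the string.
        if len(s_cur) == len(s_prev) + 1 and s_cur.startswith(s_prev):
            intervals.append(t_cur - t_prev)
    return intervals


def timing_profile(intervals):
    """Summarize the gaps as a crude typing profile."""
    return {"mean_ms": mean(intervals), "stdev_ms": pstdev(intervals)}


# Hypothetical suggestion requests for someone typing "deeps"
events = [(0, "d"), (120, "de"), (230, "dee"), (410, "deep"), (900, "deeps")]
intervals = inter_key_intervals(events)
profile = timing_profile(intervals)
print(intervals, profile)
```

Nothing here needs a dedicated keylogger: the arrival times of ordinary suggestion requests already carry the rhythm, which is the argument being made above.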

  • Treczoks@lemmy.world
    arrow-up
    67
    ·
    7 days ago

    “We store the information we collect in secure servers located in the People’s Republic of China”

    Now you Americans know how we Europeans feel when Google, Amazon and Facebook store our information on American servers. Hint: The protective wall between Chinese servers and their government is about as good as the one between American servers and their government - at least for non-US citizens. The last thin veil of privacy for Europeans has been ripped to shreds by Trump last week.

    • Ferk@lemmy.ml
      arrow-up
      1
      ·
      edit-2
      6 days ago

      The last thin veil of privacy for Europeans has been ripped to shreds by Trump last week.

      What did he do? I know Trump does not like the GDPR, but did he sign something affecting it last week?

  • grey_maniac@lemmy.ca
    arrow-up
    60
    arrow-down
    1
    ·
    7 days ago

    I’m confused. Isn’t “collecting keystroke data” just an alarmist way to describe text entry?

    • noisefree@lemmy.world
      arrow-up
      14
      arrow-down
      1
      ·
      7 days ago

      Maybe. They could also be doing things like tracking input cadence and typos or pre-send typo corrections as part of a fingerprint tied to the identifying information a user gives them when creating an account. They could then attempt to detect that user elsewhere on the web, whether or not they are using an identifying account.

    • uis@lemm.ee
      arrow-up
      7
      arrow-down
      3
      ·
      6 days ago

      Not exactly. Timing between key presses can be used to identify people.
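
To make that concrete, here is a toy Python sketch of matching people by timing. It assumes you already have key press timestamps as `(key, press_time_ms)` pairs; the function names and samples are hypothetical, and real keystroke-dynamics systems use more features and proper statistical models than this.

```python
def digraph_times(keystrokes):
    """Average delay (ms) for each pair of consecutive keys.

    keystrokes: list of (key, press_time_ms) in typing order.
    """
    delays = {}
    for (k1, t1), (k2, t2) in zip(keystrokes, keystrokes[1:]):
        delays.setdefault((k1, k2), []).append(t2 - t1)
    return {pair: sum(v) / len(v) for pair, v in delays.items()}


def profile_distance(a, b):
    """Mean absolute difference over key pairs both profiles contain.

    Returns None when the profiles share no key pairs.
    """
    shared = a.keys() & b.keys()
    if not shared:
        return None
    return sum(abs(a[p] - b[p]) for p in shared) / len(shared)


# Hypothetical samples: a and b are meant to be the same typist, c another one.
sample_a = digraph_times([("t", 0), ("h", 90), ("e", 170)])
sample_b = digraph_times([("t", 0), ("h", 95), ("e", 180)])
sample_c = digraph_times([("t", 0), ("h", 200), ("e", 450)])

print(profile_distance(sample_a, sample_b))  # small distance: similar rhythm
print(profile_distance(sample_a, sample_c))  # large distance: different rhythm
```

A smaller distance means a more similar rhythm; where to draw the line between "same person" and "someone else" is left open here.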

      • grey_maniac@lemmy.ca
        arrow-up
        1
        ·
        edit-2
        6 days ago

        I am literally so paranoid I regularly vary my keystroke rhythms and explore polyrhythmic techniques to create variations. Not even joking.

    • tux@lemmy.world
      arrow-up
      2
      arrow-down
      1
      ·
      6 days ago

      Not usually. Keystroke info is different than text input: if you typed without clicking into any field, it would only be captured if all keystrokes are being grabbed. It’s especially scary if you keep the app running in the bg and then type something and it still captures it. Not saying they’re doing that, but the privacy policy says they might.

      The rhythm part is annoying. It’s commonly used to ID people even through things like ad blockers and DNS blocks. Could also (in theory) be used to capture what people are typing just by hearing how they type.

  • geneva_convenience@lemmy.ml
    arrow-up
    21
    ·
    6 days ago

    They should store the data in US servers like OpenAI does. Apparently then Mashable won’t write an article about it.

    The criticism thrown at DeepSeek in the past days is just as applicable to American AI models. But when that was brought up in the past, it was “making things political”.

    At least I can run DeepSeek locally.

    • smb@lemmy.ml
      arrow-up
      5
      ·
      6 days ago

      I think it’s called a data lake, so they don’t “store” it, it’s rather floating around there 🤪

      • howrar@lemmy.ca
        arrow-up
        8
        ·
        6 days ago

        These lakes are formed when the cloud is saturated and gives us data precipitation.

        • smb@lemmy.ml
          arrow-up
          1
          ·
          edit-2
          6 days ago

          thanks for the great picture 👍

          so here is the current cloud climate forecast:

          The saturated clouds will rain into the data lakes that are already spilling over here and there into the ransomstreams, which are already taking all the soil in their way with them. During the day there will be security clouds that only prevent visible rain, while during the night those same security clouds rain all their collected data into their homelake, even though their homelake security is already corrupted and spills over regularly.

          As soon as the fort-cisc-pal-ocstricken-redm-ondams breach, there will be floods with multi-exabyte wave heights. The ripples of the release will be felt as far away as far east China, and the currents will circulate around the world multiple times, causing damage and devastation in their wake and eventually even reaching connected orbit.

          The floods will also have the potential to wash away and/or drown or choke all the big tech dinosaurs. Only small FOSS mammals and deep-sea amphibians will survive this historic event.

          … you kinda asked for it 😉 same as “they” kinda asked for it too. 🤔

  • ozoned@lemmy.world
    arrow-up
    53
    ·
    7 days ago

    Chinese company does what American companies have done for 25+ years now!

    Is it time for REAL data privacy laws or are we just gonna keep playing whack-a-mole with Chinese tech companies, which gets us nowhere?

    • Someonelol@lemmy.dbzer0.com
      English
      arrow-up
      4
      ·
      7 days ago

      Our data’s just too valuable for these parasites. Data privacy laws may eventually pass to compel software companies to store everything in US servers only.

      • ozoned@lemmy.world
        arrow-up
        4
        ·
        7 days ago

        Excellent point. If that’s the case though, then wouldn’t other countries follow suit, which still limits big tech’s reach and makes them less profitable and less powerful? Idk. Guess we’ll see how it plays out. Either way, I’m staying as far from those ecosystems as possible to at least try to mitigate some of what they do. I’ll never be totally successful, the genie is out of the bottle, but we can at least attempt.

    • quant@leminal.space
      English
      arrow-up
      15
      ·
      6 days ago

      By extension, anything that’s not self hosted means 3rd party actors snooping. American, Chinese, whoever happens to operate that machine.

    • UnderpantsWeevil@lemmy.world
      English
      arrow-up
      1
      ·
      6 days ago

      Building my entire data model around the Tiananmen Square copypasta. I can run this thing on a Raspberry Pi plugged into a particularly starchy potato and it reliably returns the only answer I’ve thought to ask it.

  • Jhex@lemmy.world
    arrow-up
    15
    ·
    6 days ago

    as opposed to OpenAI which also stores keystrokes and then sells them to anyone who’d pay?

  • conditional_soup@lemm.ee
    arrow-up
    36
    arrow-down
    3
    ·
    7 days ago

    Yeah, uh… If you think that American companies aren’t doing this same thing and handing your data over to the government without a warrant, among other bad uses, I have some bad news for you. This is pretty much par for the course, and I’m pretty sure that we’re witnessing a well-financed negative media blitz happening to try and keep OpenAI from getting all of its spaghetti spilled. Watch for the government to try and ban DeepSeek for “national security” reasons soon.

    • Duamerthrax@lemmy.world
      arrow-up
      3
      arrow-down
      4
      ·
      7 days ago

      Not gonna happen. Someone in China gave to Trump’s inauguration fund, so nothing’s getting banned.

  • Naia@lemmy.blahaj.zone
    English
    arrow-up
    35
    arrow-down
    1
    ·
    7 days ago

    I swear people do not understand how the internet works.

    Anything you use on a remote server is going to be seen to some degree. They may or may not keep track of you, but you can’t be surprised if they are. If you run the model locally, there is no indication it is sending anything anywhere. It runs using the same open source LLM tools that run all the other models you can run locally.

    This is very much like someone doing surprised pikachu when they find out that facebook saves all the photos they upload to facebook or that gmail can read your email.