• 6 Posts
  • 47 Comments
Joined 1 year ago
Cake day: July 19th, 2023


  • I wasn’t going to explain my downvote, but it’s been a few days and apparently everybody here is thinking about MRAs when there’s more at stake.

    I see Nixon in Trump: somebody who starts and prolongs wars for their own political gain. Of my three uncles who qualified to go to Vietnam, one was permanently disabled during basic training, one didn’t come back home, and one fell apart before I was born. I had to “voluntarily” register as a potential servicemember in order to access various standard government services as a young man in the 2000s, while the USA was invading Iraq and Afghanistan. Under a sufficiently fascist government, the USA has shown itself capable of sending its men to their deaths. This system is explicitly misandrist; only men are required to register, and only my uncles suffered this hate.

    Misandry isn’t equal and opposite to misogyny. Our society was never obligated to hate men and women in ways that are nicely symmetric and amenable to analysis; indeed, critical theory suggests that society deliberately structures itself to obfuscate its hate.




  • I haven’t done a headcount yet, and the election’s not fully tallied, but I think that the Senate still has around 70% support for NATO, and historically we can expect a “blue dog” phenomenon in the House as a reaction to Republicans gaining seats. Effectively, the Democrats and Republicans will each function as a big tent over two distinct parties, and there is usually tripartisan support (everybody but the far-right Republicans) for imperialism. We may well see votes where legislators override presidential vetoes to force weapons sales and otherwise fulfill NATO obligations.

    And yes, you read that correctly; Democrats move right as a reaction to Republicans doing well. Go back to bed, America…



  • It’s almost completely ineffective, sorry. It’s certainly not as effective as exfiltrating weights via neighborly means.

    On Glaze and Nightshade, my prior rant hasn’t yet been invalidated, and there’s no upcoming mathematics that tilts the scales in favor of anti-training techniques. In general, scrapers for training sets are now augmented with alignment models, which test inputs to see how well the tags line up; your example might be rejected as insufficiently normal-cat-like. (A sketch of that filtering step follows below.)

    I think that “force-feeding” is probably not the right metaphor. At scale, more effort goes into cleaning and tagging than into scraping; most of that “forced” input is destined to be discarded or retagged.
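
    For a concrete picture of that filtering step, here’s a minimal sketch, assuming a CLIP-style alignment model from Hugging Face transformers; the model choice and the 0.25 threshold are my illustrative assumptions, not values any particular scraper is known to use.

    ```python
    # Sketch: alignment-based filtering in a scraping pipeline. Score how well
    # a scraped tag matches its image; drop pairs that score too low.
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    def alignment_score(image: Image.Image, tag: str) -> float:
        """Cosine similarity between the image embedding and the tag embedding."""
        inputs = processor(text=[tag], images=image, return_tensors="pt", padding=True)
        with torch.no_grad():
            out = model(**inputs)
        img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
        txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
        return float((img @ txt.T).item())

    def keep_example(image: Image.Image, tag: str, threshold: float = 0.25) -> bool:
        # A perturbed "cat" whose embedding no longer lines up with its tag
        # scores low here and is discarded or queued for retagging.
        return alignment_score(image, tag) >= threshold
    ```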




  • Even better, we can say that it’s the actual hard prompt: this is real text written by real OpenAI employees. GPTs are well known to quote verbatim from their context, and OpenAI trains theirs to do so by teaching them to break word problems into pieces which are manipulated and regurgitated. This is clownshoes prompt engineering, driven by manager-first principles like “not knowing what we want” and “being able to quickly change the behavior of our products with millions of customers in unpredictable ways”.


  • That’s the standard response from last decade. However, we now have a theory of soft prompting: start with a textual prompt, embed it, and then optimize the embedding with a round of fine-tuning. If OpenAI were using this technique, it would be obvious, because leaking the prompt would recover only similar texts rather than verbatim texts (except perhaps at zero temperature). This is a good example of how OpenAI’s offerings are behind the state of the art.
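
    For anybody who hasn’t seen the technique, here’s a minimal sketch of soft prompting with GPT-2 via Hugging Face transformers; this is my own illustration of the idea, not a claim about OpenAI’s internals. The base model stays frozen and only the prepended embedding is trained.

    ```python
    # Sketch of soft prompting: embed a textual prompt, then optimize that
    # embedding directly while the model's weights stay frozen.
    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.requires_grad_(False)  # freeze the base model

    # Start with a textual prompt and embed it...
    prompt_ids = tokenizer("You are a helpful assistant.", return_tensors="pt").input_ids
    soft_prompt = torch.nn.Parameter(model.transformer.wte(prompt_ids).detach().clone())

    # ...then optimize the embedding with a round of fine-tuning.
    optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

    def training_step(batch_ids: torch.Tensor) -> torch.Tensor:
        batch_embeds = model.transformer.wte(batch_ids)
        inputs = torch.cat([soft_prompt, batch_embeds], dim=1)
        # Ignore the soft-prompt positions in the loss; predict only the batch.
        labels = torch.cat([torch.full(soft_prompt.shape[:2], -100), batch_ids], dim=1)
        loss = model(inputs_embeds=inputs, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return loss
    ```

    After training, the soft prompt is a free-floating embedding rather than a token sequence, which is why a leak could only surface nearby texts instead of the verbatim prompt.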


  • corbin@awful.systems to TechTakes@awful.systems · ChatGPT spills its prompt · 5 months ago

    Not with this framing. Because the prompt adopts first- and second-person pronouns immediately, the simulation collapses into a simple Turing-test scenario, and the computer’s only personality objective (in terms of what was optimized during RLHF) is to excel at that Turing test. The given personalities are all roles performed by a single underlying actor.

    As the saying goes, the best evidence for the shape-rotator/wordcel dichotomy is that techbros are terrible at words.


    The way to fix this is to embed the entire conversation into the simulation with third-person framing, as if it were a story, log, or transcript. This means that a personality would be simulated not by an actor in a Turing test, but directly by the token-predictor. In terms of narrative, it means strictly defining and enforcing a fourth wall. We can see elements of this in fine-tuning of many GPTs for RAG or conversation, but such fine-tuning only defines formatted acting rather than personality simulation.
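
    As a toy contrast (the wording and the “Ada” persona are mine, invented for illustration), compare the Turing-test framing with a third-person transcript framing that the token-predictor simply continues:

    ```python
    # Two framings of the same exchange; both strings are invented examples.

    # Turing-test framing: first and second person, one actor playing a role.
    turing_test_framing = (
        "You are Ada, a witty archivist. Answer the user's questions.\n"
        "User: What's in the restricted stacks?\n"
        "Ada:"
    )

    # Third-person framing: the conversation is embedded as a story with a
    # fourth wall, so the predictor simulates Ada rather than acting as her.
    transcript_framing = (
        "What follows is a transcript of a conversation between a visitor\n"
        "and Ada, a witty archivist, as recorded in the library's logs.\n\n"
        'Visitor: "What\'s in the restricted stacks?"\n'
        'Ada: "'
    )
    ```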