"just got doxxed to within 15 miles by a vision model, from only a single photo of some random trees. the implications for privacy are terrifying. i had no idea we would get here so soon. holy shit"

CoderSupreme@programming.dev · edit-2 11 months ago

ricecake@sh.itjust.works · 11 months ago

Geo guessing is related to open source intelligence techniques, and it’s pretty easy to get surprisingly good at it.
People who are good at it can take a picture of someone’s room and deduce enough about them (sometimes) to be able to get their name, address and phone number.

It being automatic is pretty cool, but you were already leaking the information to anyone interested.

https://www.sans.org/blog/geolocation-resources-for-osint-investigations/

https://youtu.be/p7_2ZA1HHMo?si=O19_7LA3SoyvZEm1

PipedLinkBot@feddit.rocks · 11 months ago

Here is an alternative Piped link(s):

https://piped.video/p7_2ZA1HHMo?si=O19_7LA3SoyvZEm1

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source; check me out at GitHub.

geoma@lemmy.ml · 11 months ago

Yep. If you play geoguessr.com or others you won’t find it that surprising.

bleistift2@feddit.de · edit-2 11 months ago

The tweet: tweet (Is the preview working for you? For me, it’s not).

The game is called geoguessing and those who do this regularly are crazy good at it, taking into account the kind of trees you see, where the sun and shadows are, even the color of the dirt and the pavement.

Tom Scott did something similar and was frightened too: https://www.youtube.com/watch?v=cGqEBvlmFAQ&pp=ygUSdG9tIHNjb3R0IGZvdW5kIHVz

telep@lemmy.ml · edit-2 11 months ago

important second frame for context!

& no it isnt. quite sure twitter broke link previews a long time ago alongside guest accounts.

bleistift2@feddit.de · 11 months ago

I didn’t find that in the Twitter UI and wondered why OP thought it was an AI. Thanks for sharing.

Optional@lemmy.world · 11 months ago

Andrew Gao why are you still on the fascist site

merde alors@sh.itjust.works · 11 months ago

do you think elonMusk is fascist or do you mean that twitter is fascist?

youmaynotknow@lemmy.ml · 11 months ago

Yes.

merde alors@sh.itjust.works · 11 months ago

what do you think fascism is?

This is fine🔥🐶☕🔥@lemmy.world · 11 months ago

Removed by mod

youmaynotknow@lemmy.ml · 11 months ago

A word used to tell someone you disagree with them when you have no idea how to express why.

merde alors@sh.itjust.works · 11 months ago

apparently 🤷

VirtualOdour@sh.itjust.works · 11 months ago

Anyone with different opinions, obviously.

marcie (she/her)@lemmy.ml · 11 months ago

which llm does he use

invisiblegorilla@sh.itjust.works · edit-2 11 months ago

Looks like Pigeon or Pigeotto https://huggingface.co/papers/2307.05845

marcie (she/her)@lemmy.ml · 11 months ago

tragic that it doesnt include a gguf or safetensors file for easy access. ill load it up eventually. this would be very useful for invasive animal research

invisiblegorilla@sh.itjust.works · 11 months ago

https://huggingface.co/geolocal/StreetCLIP/tree/main

Streetclip seems to be the public release. Or a version of it.

marcie (she/her)@lemmy.ml · edit-2 11 months ago

thanks, that lets me load it into my setup much quicker. i do environmental research so this will be useful

still a bit of a shame no safetensors or gguf

Murdoc@sh.itjust.works · 11 months ago

“A couple of trees…”

And a body of water, and a road, possibly some mountains… (smh)

Blisterexe@lemmy.zip · 11 months ago

The embed works for me

taladar@sh.itjust.works · edit-2 11 months ago

https://www.youtube.com/@GeoWizard has a couple videos in a series where he guesses historic photo locations quite accurately too.

PipedLinkBot@feddit.rocks · 11 months ago

Here is an alternative Piped link(s):

https://www.piped.video/@GeoWizard

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source; check me out at GitHub.

Snot Flickerman@lemmy.blahaj.zone · 11 months ago

This Just In: Most photos uploaded to the internet are not stripped of their metadata, and one of the common things kept in metadata is… (drumroll please)… your GPS coordinates.

This is a lot less interesting than it seems to be at first glance, imho.
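For context on what that metadata looks like: EXIF stores latitude and longitude as degrees, minutes and seconds (each a numerator/denominator pair) plus a hemisphere letter. A minimal sketch of turning those into decimal degrees follows; the function name and the sample tag values are made up for illustration and do not come from any photo in this thread.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) rationals to signed decimal degrees.

    dms is three (numerator, denominator) pairs; ref is "N", "S", "E" or "W".
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative in decimal notation.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Hypothetical tag values, chosen only to show the arithmetic.
lat = dms_to_decimal(((37, 1), (25, 1), (3000, 100)), "N")
lon = dms_to_decimal(((122, 1), (10, 1), (1200, 100)), "W")
print(lat, lon)
```

Whether a given upload still carries these tags depends on the camera settings and on what the hosting site does with the file.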

kromem@lemmy.world · edit-2 11 months ago

Literally just after talking about how people are spouting confident misinformation on another thread I see this one.

Twitter: Twitter retains minimal EXIF data, primarily focusing on technical details, such as the camera model. GPS data is generally stripped.

Source

Yes, this is a privacy thing, we strip the EXIF data. As long as you’re not also adding location to your Tweet (which is optional) then there’s no location data associated with the Tweet or the media.

Source (9 years ago)

People replying to a Twitter thread with photos are automatically having the location data stripped.

God, I can’t wait for LLMs to automate calling out well intentioned total BS in every single comment on social media eventually. It’s increasing at a worrying pace.

Gutless2615@ttrpg.network · 11 months ago

deleted by creator

smeenz@lemmy.nz · 11 months ago

I mean… that’s pre-musk information

ricecake@sh.itjust.works · 11 months ago

I mean, yes, but that’s not what they’re doing.

https://arxiv.org/abs/2307.05845 https://github.com/LukasHaas/PIGEON

It’s a Stanford project that does what it looks like is happening in the screenshot.

firefly@neon.nightbulb.net · 11 months ago

@SnotFlickerman@lemmy.blahaj.zone @CoderSupreme@programming.dev

Some digital cameras and phone cameras can also embed the GPS coordinates in the pixel data so that even if you delete the EXIF metadata the GPS location and device serial number are still present in the image. Many document printers also embed device serial number and other data on printed documents by using nearly invisible dot encodings.

Hazzia@discuss.tchncs.de · 11 months ago

Ah shit. Any easy way to determine if your camera’s doing that? Would that normally be in manufacturer specs?

firefly@neon.nightbulb.net · 11 months ago

No easy way at all. The specs would be in-house manufacturer docs. Recall that digital cameras used to embed date and time visibly in images in a corner. The logical progression was to embed other data such as device serial number, geotag data, etc.

Regarding the schemes for steganographic identification in devices such as cameras and printers, this information is usually kept a trade secret. The Secret Service would probably already have the spec docs for data hiding. Many manufacturers already have working agreements to provide back door assistance and documentation for the hardware surveillance economy. Ink chemistry profiles are registered with the Secret Service. The subterfuge is to ‘investigate counterfeiting’ but it is also used to identify whistleblowers and objective targets by their printer serial number or ink chemistry, or the data embedded in any images they are naive enough to publish.

If you are an undercover reporter secretly video recording, unbeknownst to you the video could have metadata encoded using a secret scheme. If you registered that product for a warranty, or bought it online and had it shipped, or paid with a credit card or check, or walked beneath the electronics store cameras without a hat and sunglasses to pay cash, it is easy for the state organs to then follow the breadcrumbs and identify the videographer.

Almost all ‘free’ wifi hotspots offered by chain restaurants and hotels are logged with the data being stored indefinitely, showing your MAC address. It takes only a little bit of investigation and process of elimination to find the user on a camera feed history, to see who was connected when a certain message or leak was sent. If you use a wifi hotspot in a McDonalds, Wendy’s, Starbucks, etc. smile for the surveillance camera which will also have your device’s unique MAC address in the wifi history. This MAC address data is automatically sent to a central station, for example at the Wandering Wifi company, and God only knows how long they store it.

None of this nonsense makes anyone safer. These people hate us.

Legend@lemmy.sdf.org · 11 months ago

Using something like open camera avoids the risk tho, right?

firefly@neon.nightbulb.net · 11 months ago

Try Polaroid.

planish@sh.itjust.works · 11 months ago

I think we can trust that most phone camera apps do in fact obey the toggle they provide for whether or not to embed the GPS location data in the image.

MonkderDritte@feddit.de · 11 months ago

Don’t use proprietary camera software then, got it.

acetanilide@lemmy.world · 11 months ago

That’s crazy. Just read this and I’m just mystified

Seasm0ke@lemmy.world · 11 months ago

Back in like 2006 or 7 steganography was used in obscure corners of the internet (like insurgen.cc, an early anonymous holdout that got broken up by the feds) to pass around hacking tools. You’d unzip the dangerous kitten photo with WinRAR and extract a set of hacking tools. One I remember passed around widely was the Low Orbit Ion Cannon that /b/ used to DDoS scientologists.
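The "unzip the photo" trick generally works because image viewers read from the start of the file while ZIP tools locate the archive index at the end, so an archive appended to an image satisfies both. A minimal in-memory sketch, assuming that is the technique being described; the "image" here is placeholder bytes and the file name is invented.

```python
import io
import zipfile

# Stand-in bytes for an image file (JPEG start and end markers around filler).
fake_image = b"\xff\xd8" + b"\x00" * 64 + b"\xff\xd9"

# Build a small ZIP archive in memory.
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("note.txt", "hidden payload")

# Concatenate: image bytes first, archive bytes second.
polyglot = fake_image + archive.getvalue()

# The zipfile module tolerates leading non-archive data, so the combined
# blob still opens as an archive.
with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    names = zf.namelist()
    payload = zf.read("note.txt").decode()

print(names, payload)
```

This is concealment by file layout only. Anyone who checks the file size or scans for the archive signature will spot the extra data.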

MonkderDritte@feddit.de · 11 months ago

Wasn’t there some online service to hide documents in your images?

Swedneck@discuss.tchncs.de · 11 months ago

i’m sure there’s an endless number, and there’s certainly client-side software that makes it easy as well.

https://en.wikipedia.org/wiki/Steganography

acetanilide@lemmy.world · 11 months ago

No idea, but I found this wikihow https://www.wikihow.com/Hide-a-File-in-an-Image-File

Pantherina@feddit.de · edit-2 11 months ago

Software that doesn’t store private metadata

grapheneOS cam
opencamera (not by default!)
KDE spectacle
~~android~~ GrapheneOS screenshots

ReversalHatchery@beehaw.org · 11 months ago

“android screenshots”

I think I have read that on some versions it can store the app’s package name in the metadata. Not sure if that counts as private but if and when it does so, it’s good to be aware of

Pantherina@feddit.de · 11 months ago

For sure, edited it. GrapheneOS screenshots have no metadata afaik

jonne@infosec.pub · 11 months ago

Pretty sure Twitter strips it out by default.

filister@lemmy.world · 11 months ago

What about X?

MonkderDritte@feddit.de · 11 months ago

Don’t have the manpower to change that.

SchmidtGenetics@lemmy.world · 11 months ago

I’m sure most people who would put this to the test would strip that data or screen grab the image to do the same thing… If you know about metadata, so do a large number of other people mate…

These people would be labeled as frauds very fast if this wasn’t actually a real thing dude.

sexy_peach@beehaw.org · 11 months ago

GPS coordinates in metadata isn’t common

JohnDClay@sh.itjust.works · 11 months ago

I think Lemmy strips it, right? That’s why pictures were uploading sideways for a while?

dislocate_expansion@reddthat.com · edit-2 11 months ago

Lemmy does not remove exif data (unless the code has changed), you need to remove it yourself (also a good practice in general)
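For anyone wondering what "remove it yourself" involves at the file level: in a JPEG, EXIF data lives in an APP1 segment near the start of the file, so it can be dropped without touching the image data. A minimal standard-library sketch, assuming a well-formed JPEG; `strip_exif` is a name made up here, and other metadata (XMP, comments, thumbnails stored elsewhere) is not handled. A dedicated tool is the safer choice for real use.

```python
import struct


def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG with its Exif APP1 segment(s) removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    out = bytearray(jpeg[:2])
    i = 2
    while i + 4 <= len(jpeg):
        if jpeg[i] != 0xFF:
            raise ValueError("unexpected byte where a marker should be")
        marker = jpeg[i + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, copy the rest as-is.
            out += jpeg[i:]
            return bytes(out)
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", jpeg[i + 2:i + 4])
        segment = jpeg[i:i + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        i += 2 + length
    out += jpeg[i:]
    return bytes(out)
```

Typical use would be reading the file as bytes, passing it through `strip_exif`, and writing the result to a new file before uploading.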

MonkderDritte@feddit.de · 11 months ago

Yeah, disable gps metadata in your camera settings. Wondering why it often is default on?

weker01@feddit.de · 11 months ago

Because people that don’t care about privacy find this to be a nice feature.

There are gallery apps that let you sort by location and it’s nice if you want to search for the cool thing you saw once again.

pingveno@lemmy.ml · edit-2 11 months ago

Yeah, I have it for personal photos that will never be shared. If I am traveling, I want a record of where a given photo was. But those aren’t photos I am sharing, and the ones I do share get their metadata stripped.

bionicjoey@lemmy.ca · 11 months ago

So it has nothing to do with the trees?

Legend@lemmy.sdf.org · 11 months ago

4channers have been doing that for a long time.

telep@lemmy.ml · edit-2 11 months ago

this is extremely scary if true. are these algorithms obtainable by every day people? do they work only in heavily photographed areas or do they infer based on things like climate, foliage, etc? I would love some documentation on these tools if anyone has any.

umami_wasabi@lemmy.ml · edit-2 11 months ago

If I were the dev, I would scrape Google Street View with coords as the data source.

jonne@infosec.pub · 11 months ago

deleted by creator

ricecake@sh.itjust.works · edit-2 11 months ago

https://github.com/LukasHaas/PIGEON

https://arxiv.org/abs/2307.05845

Basically a combination of what the game GeoGuessr does, and public geotagged images to be able to get a decent shot at approximate location for previously unseen areas.

It’s more ominous when automated, but with only a little practice it’s easy enough for a human to get significantly better.

EDIT: yup, looks like this is the guy from the Twitter: https://andrewgao.dev/ and he’s Stanford affiliated with the same department that made the above paper and system.
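A claim like "within 15 miles" is normally a great-circle distance between the guessed and the true coordinates. A minimal sketch of that measurement using the haversine formula; the function name is made up and the coordinates in the usage line are arbitrary, not from the tweet.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Arbitrary example points: a "true" location and a "guess".
error = haversine_miles(37.43, -122.17, 37.60, -122.05)
print(f"guess is off by {error:.1f} miles")
```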

photonic_sorcerer@lemmy.dbzer0.com · 11 months ago

Are you sure? The paper you linked mentioned the model beating a top GeoGuessr player six times in a row.

ricecake@sh.itjust.works · 11 months ago

I am not sure it’s the same software, but it’s a fairly good guess I think. Same software capabilities and same lab, with the same area of research.

GeoGuessr is a subset of the skills used for general image geolocation for open source intelligence.
In the specific cases of only using the data present in the image and relying on geographic information, it certainly does better.
Humans still do better, and can reach decent skill with minimal training, at placing images that require spatial reasoning or referencing multiple data sources.
AI tools will likely be able to learn those extra skills, but it doesn’t change that it’s the photo that’s the data leak, and not the tool. The tool just makes it vastly more accessible, and part of the task easier for a curious human.

underisk@lemmy.ml · edit-2 11 months ago

There are tons of machine learning algorithm libraries easily usable by any relatively amateur programmer. Aside from that all they would need is access to a sufficient quantity of geographically tagged photographs to train one with. You could probably scrape a decent corpus from google street view.

The obtainability of any given AI application is directly proportional to the availability of data sets that model the problem. The algorithms are all packed up into user friendly programs and apis that are mostly freely available.
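One way to see how geotagged photos become training data: image geolocation is often framed as classification, where the map is cut into cells and each photo's coordinates become a class label. The sketch below uses a plain uniform grid as an assumption for illustration only. The linked paper's actual method is more involved, and the function names here are invented.

```python
import math


def geocell_id(lat, lon, cell_deg=1.0):
    """Map a coordinate to the integer id of the grid cell that contains it."""
    cols = math.ceil(360 / cell_deg)
    rows = math.ceil(180 / cell_deg)
    # Clamp so that lat = 90 and lon = 180 fall into the last row and column.
    row = min(int((lat + 90) / cell_deg), rows - 1)
    col = min(int((lon + 180) / cell_deg), cols - 1)
    return row * cols + col


def geocell_center(cell, cell_deg=1.0):
    """Return the (lat, lon) centre of a cell, usable as the model's location guess."""
    cols = math.ceil(360 / cell_deg)
    row, col = divmod(cell, cols)
    return (-90 + (row + 0.5) * cell_deg, -180 + (col + 0.5) * cell_deg)


# Hypothetical geotagged photo: its label is the cell id, not the raw coordinate.
label = geocell_id(37.4, -122.2)
print(label, geocell_center(label))
```

A classifier trained on such labels predicts a cell, and the cell centre is reported as the guess, so the cell size puts a floor on how precise the answer can be.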

taladar@sh.itjust.works · 11 months ago

It might be easier to train the AI to the specific things Geoguessr players have collected as signs that give away a location instead of letting the AI figure all those out again.

ricecake@sh.itjust.works · 11 months ago

https://arxiv.org/html/2307.05845v4

I believe this is the paper

MicrowavedTea@infosec.pub · 11 months ago

Rainbolt has a couple of videos playing against AI. I don’t remember what they said it was trained on but it’s possible it was based on that.

Match!!@pawb.social · 11 months ago

ooh baby I love a good supervised learning

reddithalation@sopuli.xyz · 11 months ago

reminds me of GeoWizard’s episodes geolocating vacation photos for fun. this one was insane, similar in detail to the photo in the tweet

PipedLinkBot@feddit.rocks · 11 months ago

Here is an alternative Piped link(s):

this one

Piped is a privacy-respecting open-source alternative frontend to YouTube.

I’m open-source; check me out at GitHub.

taladar@sh.itjust.works · 11 months ago

It really isn’t that hard if anything like a silhouette of mountains is in the background and you have a couple of rough hints that give you an idea where to start or how to narrow down possible locations, no AI needed.

ISOmorph@feddit.de · edit-2 11 months ago

You’re misunderstanding the post. It’s not about whether or not someone could guess your location from a picture. It’s about the automation thereof. As soon as that is possible it becomes another viable vector to compromise your privacy.

taladar@sh.itjust.works · edit-2 11 months ago

And you misunderstand my point, it always has been a way to compromise your privacy. Privacy matters most in the individual case, with people who know you. If you e.g. share a picture taken at your home (outside or looking out of the window in the background) with a friend online you always had to assume that they could figure out where you lived from that if there were any of those kinds of features in there.

Sure, companies might be able to do it on a larger scale but honestly, AI is just too inefficient for that right now, as in the energy-cost required to apply it to every picture you share just in case your location might be useful isn’t worth it yet.

ISOmorph@feddit.de · 11 months ago

“Privacy matters most in the individual case, with people who know you.”

That statement is subjective at best. My friends and coworkers knowing where I live certainly isn’t my concern. In today’s day and age privacy enthusiasts are definitely more scared of corpos and governments.

“isn’t worth it yet.”

You’re thinking too small. Just in the context of the e2ee ban planned in europe, think what you could do. The new law is set to scan all your messages before/after sending for specific keywords. Imagine you get automatically flagged and now an AI is scanning all your pictures for locations and contacts and what not. Just the thought that might be technically possible is scary as hell.

taladar@sh.itjust.works · 11 months ago

Governments won’t scan all your pictures to figure out who you are, they are just going to ask (read: legally force) the website/hoster where you posted that picture for your IP address and/or payment info and then do the same with your ISP/payment provider to convert that into your RL info to figure out who you are.

And you might not be worried about your RL friends or coworkers but what about people you meet online? Everyone able to see your post on some social media site?

Nobody is going to scan all the pictures you post for some information that is going to be valid for a long time after it is discovered once. Governments and corporations have had the means to discover who you are once for a long time.

helenslunch@feddit.nl · 11 months ago

If I ever upload photos publicly, I will add a background blur first

onlinepersona@programming.dev · 11 months ago

There are techniques to deblur. It’s even how a prolific child sex offender was caught.

Anti Commercial-AI license

helenslunch@feddit.nl · 11 months ago

I mean I’m sure it depends on how it’s blurred.

onlinepersona@programming.dev · 11 months ago

True, but that just turns into a cat-and-mouse game. Also, once the photo is up, how the background was blurred doesn’t change with time --> wait long enough and a technique to unblur will be developed.

helenslunch@feddit.nl · 11 months ago

“wait long enough and a technique to unblur will be developed.”

You can’t just program data that doesn’t exist into existence.

umami_wasabi@lemmy.ml · 11 months ago

I do remember that 1-2 years ago there was a paper (or model?) that reversed blurred images. It’s similar to how ML-based object removal and inpainting work. Granted, it only works for specific blurring algos.
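Why the blurring algorithm matters can be shown with a toy case. If the blur is a known linear operation and the output is stored without rounding or noise, it can be undone exactly. The sketch below assumes a 1-D "image" and a two-tap averaging kernel, both chosen purely for illustration; it is not the method from any particular paper.

```python
def blur(signal):
    """Average each sample with its left neighbour (missing neighbour treated as 0)."""
    out, prev = [], 0.0
    for x in signal:
        out.append((x + prev) / 2)
        prev = x
    return out


def unblur(blurred):
    """Invert blur() by solving for each sample from the one already recovered."""
    out, prev = [], 0.0
    for y in blurred:
        x = 2 * y - prev
        out.append(x)
        prev = x
    return out


original = [10.0, 200.0, 30.0, 40.0, 255.0]
recovered = unblur(blur(original))
print(recovered == original)
```

Real photos are a different matter: the kernel is often unknown, pixel values are rounded, and compression adds noise, so exact inversion gives way to estimation. A blur strong enough to collapse many different scenes into the same output cannot be uniquely reversed.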

onlinepersona@programming.dev · 11 months ago

You do realize that a lot of image recognition was done on scaled down images? Some techniques would even blur the images on purpose to reduce the chance of confusion. Hell, anti-aliasing makes text seem more readable by adding targeted blur.

Deblurring is guessing and if you have enough computing power with some brain power (or AI), you can reduce the number of required guesses by erasing improbable guesses.

wagoner@infosec.pub · 11 months ago

That photo was more than just some trees

Match!!@pawb.social · 11 months ago

what should I do if I was already expecting this level of surveillance

firefly@neon.nightbulb.net · 11 months ago

@match@pawb.social @CoderSupreme@programming.dev

What should you do about surveillance technology? Ask an Amish hacker!

just_another_person@lemmy.world · 11 months ago

It’s just sourcing data from Street View or similar. Not that scary. If it picked you out of a crowd in a randomly sourced image from that area, then it’d be scary.