ChatGPT 4 collects your data. Here's what OpenAI does with it and how to make your information a little more secure
Intern, Don't Sell My Data campaign
Director, Don't Sell My Data Campaign, U.S. PIRG Education Fund
As a college student, I can report firsthand that ChatGPT has taken the school world by storm. Plenty of my classmates use it. During finals this year, a friend was dreading writing an arduous paper about Rousseau. He used ChatGPT to produce an initial draft that he then revised and turned in. Many of my friends have similar stories.
They’re not alone. ChatGPT use is on the rise. A recent survey found that nearly half of students and teachers report using it for work or school at least once a week.
You, like some of my professors and fellow students, may have mixed feelings about that. Some professors at my university allow ChatGPT usage with no conditions, some allow it as long as students are transparent, and some have banned it altogether. Some of my friends trust the information it gives while some don’t. I personally don’t use it for anything I need to submit.
Regardless of how you feel about ChatGPT, it’s likely not going away anytime soon. Given its popularity, I wondered: when you use ChatGPT, what data does it collect about you? What does OpenAI, the company behind ChatGPT, do with your data? And should you be concerned? I decided to take a look.
In the most basic terms, ChatGPT is a chatbot. You can give it a prompt to follow or question to answer and it quickly spits out a response. If I need to use an avocado and some cilantro in my fridge that’s about to go bad, I can ask ChatGPT for a recipe that will use them up. Or if I need it to summarize a calculus idea for me, I can ask it to explain it like I’m 5, or like it’s a Shakespearean tragedy, or like it’s a comedy sketch.
ChatGPT uses data, and a lot of it. It’s like a complicated version of autocomplete. It strings together words based on what it thinks is most likely to come next in a sentence based on all the text it’s ever read. The result is a chatbot that appears to respond to your inquiries in a relevant way. ChatGPT only exists because the internet came first, with tons of blogs, news articles and chat forums out there to serve as the raw data for it to learn how human speech works and what people have to say.
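To make the autocomplete comparison concrete, here is a toy sketch in Python. It is not how ChatGPT works under the hood (real models are vastly larger and more sophisticated), and the function names and the tiny training text are made up for illustration. It simply counts which word most often follows another and "predicts" that one.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following

def predict_next(following, word):
    """Return the word seen most often after `word`, or None if it was never seen."""
    candidates = following.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]

# A made-up, tiny "internet" to learn from (assumption for illustration only).
corpus = "the cat sat on the mat and the cat ate the fish"
model = train_bigrams(corpus)
print(predict_next(model, "the"))  # prints "cat", the word that followed "the" most often
```

The more text a predictor like this has seen, the better its guesses get, which is a rough intuition for why training data matters so much.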
And now that ChatGPT is publicly available for users to talk to, it’s also learning from the conversations people are having with it.
The primary thing ChatGPT collects is your chat logs – what you say when you’re typing into ChatGPT’s text box. If you tell it your address, your religious beliefs, or your mother’s name, it collects and keeps that information. If you upload a file, like a resume—which usually has a lot of information about you—it collects and stores that too.
OpenAI also collects your email address when you make an account. And whenever you use its services, it collects your IP address, information about your device including unique identifiers, information collected by cookies, and your location. It’s a lot of information, and some people may not be comfortable with it.
The biggest thing OpenAI uses your data for is further training GPT models. Your chat data is added to OpenAI’s training data and used in hopes of making ChatGPT more accurate and usable.
Your data is pretty valuable. In general, AI companies are running out of high-quality data that’s freely available on the internet for training their models. Some companies are considering using “synthetic data” – data that AI systems create themselves. There’s a worry that models could get weirder and less reliable if they’re learning from data they’ve created themselves. Being able to use your chats and having real data from actual people for training is very useful for making continued improvements.
OpenAI says it does not use your data for advertising. They may, however, disclose your information to affiliates, law enforcement, and the government.
It depends on what you’re comfortable with. We’d argue there are some reasons to be concerned.
Anytime a company collects and keeps data about you, it comes with security risks. The more data a company collects and the longer it stores it, the more likely it is that your data will eventually be exposed in a breach or a hack. This makes it more likely it will end up with identity thieves or scammers. OpenAI has already run into some trouble on this score. In March 2023, it was the victim of a large data leak, exposing some of its users’ names, addresses, emails, and partial credit card information.
It’s also possible the information you provide ChatGPT can resurface in other people’s conversations. The New York Times is currently suing OpenAI, alleging that ChatGPT produced near-verbatim excerpts of its copyrighted journalism. It’s hard to predict exactly how ChatGPT will string together its responses.
Plus ChatGPT has had vulnerabilities in the past. Google researchers found a way to get ChatGPT to hand over some of its training data – which included people’s names, phone numbers and addresses.
And then there’s the question of bad actors. A company with a high profile like OpenAI can be a tempting target for cybercriminals. In 2023, a quarter million OpenAI logins were found for sale on the dark web. The more sensitive information you’ve provided in your chat logs, the more dangerous it is if your account info ends up in the wrong hands.
It’s also worth noting OpenAI’s privacy policy states it can transfer your data to the government, which some may find concerning.
You can use ChatGPT either with or without an account. You can use it on its website or download the app on your phone.
There are things to consider when deciding if you want to create a login or not. If you don’t make an account, you don’t need to provide your email address to OpenAI, which is nice. But without an account you also can’t access many of OpenAI’s privacy options that give you more control over your data.
Regardless of whether or not you have an account, there are steps you can take to keep your data and information safer.
As a general practice, the biggest thing you can do to protect your data is to not offer up sensitive information when using ChatGPT. OpenAI stores your conversations with ChatGPT for some amount of time no matter what, making it possible the information you include in your chats could be accessed by bad actors in security breaches.
Consider using fake names for your friends and family. Don’t give it your health info. And avoid uploading documents that include your Social Security number or contact information. This data, if lost in a breach, gives scammers an absolute goldmine. For example, let’s say you get a text from a scammer impersonating Amazon with a link they want you to click. If they send a generic message, you probably won’t click on it. But if the message contains your name and your shipping address, you’re more likely to think it’s real. Any information you can imagine being useful to someone trying to scam you is best kept off OpenAI’s servers, and any other company’s.
I’ve seen many of my college peers upload PDFs of assignments and syllabuses into ChatGPT to respond to rather than writing prompts from scratch. I wouldn’t recommend this. Assignment PDFs often contain the names of specific professors, TAs, universities, course names, and more. When you write prompts yourself, you can omit unnecessary information and still use ChatGPT.
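If you do want to paste in text from a document, one option is to scrub obvious identifiers first. Below is a minimal Python sketch of that idea. The patterns and the `redact` function are my own illustration, not an OpenAI feature. They only catch common U.S.-style emails, phone numbers and Social Security numbers, and they will not catch names, addresses or anything unusual, so you should still read the text over before sending it.

```python
import re

# Hypothetical patterns for illustration; they only cover common U.S.-style formats.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"(?:\(\d{3}\)\s?|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b"),
}

def redact(text):
    """Replace anything matching a pattern with a bracketed label like [EMAIL]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

sample = "Email me at jane.doe@example.edu or call 555-123-4567. SSN 123-45-6789."
print(redact(sample))
```

Run this on your own computer before pasting anything into the chat box, so the original details never leave your device.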
OpenAI indefinitely stores users’ conversations with ChatGPT by default. In May 2024, OpenAI started offering something called temporary chats for users with accounts. When you turn on this setting, OpenAI will automatically delete your conversations within 30 days. Turning on temporary chats also automatically opts you out of your data being used for training.
Note that if you use temporary chats, ChatGPT will not remember any of your previous interactions. It becomes, as OpenAI puts it, a blank slate.
OpenAI offers a privacy portal where you can exercise other rights, including deleting your data. What you can do depends on whether you have an account or not.
If there is a specific piece of information or chat log that you would like to remove from OpenAI’s training data set, you can submit an OpenAI Personal Data Removal Request. This will remove your personal data from ChatGPT’s training data and keep it from potentially showing up in outputs, but OpenAI does say it may still hold on to it for other purposes.
The only way to delete all of your ChatGPT data is to delete your entire account. This will permanently ban your account’s email address and phone number from being able to use ChatGPT. So be careful with this one!
OpenAI gives you the option to opt out of your data being used in training its future models. One way to do this is to enable temporary chats, which we walk through above. If you prefer to use normal chats – where your chat log is stored indefinitely – you can also make a separate request that your conversations be omitted from OpenAI’s training data.
Wording it that way makes you sound like a villain, but you’re just protecting your data.
You also have the option to download a report containing all of the data OpenAI has on your account, if you have one. You can do this in two places: the privacy portal (select “Download my data”) or inside Settings.
Legally speaking, OpenAI only has to honor these requests for residents of certain countries or U.S. states with privacy laws. Some companies, however, honor requests from everyone. So it’s worth exercising your data rights regardless of where you live.
R.J. focuses on data privacy issues and the commercialization of personal data in the digital age. Her work ranges from consumer harms like scams and data breaches, to manipulative targeted advertising, to keeping kids safe online. In her work at Frontier Group, she has authored research reports on government transparency, predatory auto lending and consumer debt. Her work has appeared in WIRED magazine, CBS Mornings and USA Today, among other outlets. When she’s not protecting the public interest, she is an avid reader, fiction writer and birder.