LinkedIn, the professional networking giant owned by Microsoft, is facing criticism for allegedly using member data to train its AI models without explicitly disclosing the practice in its privacy policy. The practice, which primarily affected users in the U.S., was brought to light by TechCrunch, citing 404 Media as the original source. Notably, users in the EU, European Economic Area, and Switzerland were likely exempt, thanks to the stricter data privacy regulations in those regions.
Initially, LinkedIn did not update its privacy policy to reflect this data usage, even though an opt-out toggle had already appeared in users’ settings. The toggle reveals that LinkedIn collects personal data to train its “content creation AI models,” a practice that was not clearly communicated until recently. Following the controversy, LinkedIn updated its terms of service to reflect the data usage. Best practice, however, is to inform users of significant changes, such as using their data for new purposes, before implementing them, not after.
LinkedIn has not yet commented publicly on the situation, but the setting’s description indicates the company uses this data to train its own AI models, including those behind writing suggestions and post recommendations. It also notes that generative AI models on the platform could be trained by “another provider,” such as LinkedIn’s parent company, Microsoft.
Users can opt out of this data collection by following these steps (a scripted sketch for automating the change follows the list):
1. Go to the “Data Privacy” section in the settings menu.
2. Select “Data for Generative AI improvement.”
3. Toggle off the option for “Use my data for training content creation AI models.”
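For anyone who prefers to script the change, here is a minimal sketch using Playwright’s Python API. Note the assumptions: the settings URL, the toggle’s accessible name, and its role/attributes are guesses based on LinkedIn’s current web UI, not a documented interface, and will break if LinkedIn reworks the page.

```python
# Hedged sketch: flip LinkedIn's "Data for Generative AI Improvement" toggle
# via browser automation. The URL, toggle label, and aria-checked behavior
# below are assumptions about LinkedIn's web UI, which can change at any time.
from playwright.sync_api import sync_playwright

# Assumed URL of the "Data for Generative AI improvement" settings page.
SETTINGS_URL = "https://www.linkedin.com/mypreferences/d/settings/data-for-ai-improvement"

with sync_playwright() as p:
    # Reuse an existing browser profile so the session is already logged in.
    context = p.chromium.launch_persistent_context(
        user_data_dir="/path/to/your/browser/profile",  # replace with your own
        headless=False,
    )
    page = context.new_page()
    page.goto(SETTINGS_URL)

    # Locate the toggle by its visible label (assumed accessible name).
    toggle = page.get_by_role(
        "switch", name="Use my data for training content creation AI models"
    )
    toggle.wait_for()

    # ARIA switches expose their state via aria-checked; turn it off if on.
    if toggle.get_attribute("aria-checked") == "true":
        toggle.click()
        print("Opt-out applied: toggle switched off.")
    else:
        print("Nothing to do: data-for-AI training is already off.")

    context.close()
```

In practice, the manual toggle is the reliable path; a script like this is only a convenience and its selectors will need updating whenever the page changes.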
This incident echoes a broader trend within the AI industry, where companies race to acquire high-quality training data for their models. Earlier this year, OpenAI and Anthropic were accused of ignoring websites’ robots.txt rules meant to block web scraping, highlighting growing concern about the ethical and legal implications of data collection for AI development. In April, OpenAI drew backlash for reportedly transcribing over a million hours of YouTube videos to train its GPT-4 model.
OpenAI, the company behind the popular AI chatbot ChatGPT, also entered into a multi-year partnership with News Corp, gaining access to the media giant’s news content. The deal signaled the AI industry’s broader push to secure high-quality training data.
In July, Elon Musk’s social media platform X (formerly Twitter) was discovered to be sharing user posts with xAI’s Grok for training purposes, further emphasizing the industry’s reliance on user data.
The controversy surrounding LinkedIn’s use of user data underscores the growing importance of transparency and user consent in the age of AI. As AI technology advances, it’s crucial for companies to prioritize ethical data practices and clearly communicate how they utilize user data.