The company says its GenAI tools are not trained on YouTube videos
Apple Intelligence features will make their way to iPhones and Macs later this year, but was the AI model trained on stolen data?
Apple has confirmed that its open-source Efficient Language Models (OpenELM) AI model, which the company released in April, doesn’t power any of its AI or machine learning features, including Apple Intelligence, as per 9To5Mac. This comes days after an investigation found that Apple and other tech giants had used subtitles from thousands of YouTube videos to train their AI models.
As mentioned in the report, Apple said that it developed the OpenELM model to contribute to the research community and advance open-source large language model development. Apple researchers had earlier described OpenELM as a state-of-the-art open language model.
According to the company, OpenELM was built solely for research purposes and is not intended to power any Apple Intelligence features. The model was released as open source and is publicly available on Apple’s Machine Learning Research website.
Last month, a research paper suggested that Apple doesn’t use user data to train Apple Intelligence. The company stated that its AI models were trained on licensed data, including data selected to enhance specific features, as well as publicly available data collected by its web crawler, AppleBot.
A recent investigation published by Wired suggested that several big companies, including Apple, NVIDIA, Anthropic and Salesforce, used subtitles from more than 170,000 YouTube videos by popular content creators to train their AI models. The dataset is part of a larger collection called The Pile, compiled by the non-profit EleutherAI.
The tech giant also stated that it will not be releasing any new versions of the OpenELM model.
Anthropic spokesperson Jennifer Martinez, on the other hand, told Proof News, the publication that conducted the investigation: “The Pile includes a very small subset of YouTube subtitles. YouTube’s terms cover direct use of its platform, which is distinct from use of the Pile dataset. On the point about potential violations of YouTube’s terms of service, we’d have to refer you to the Pile authors.”
Apple Intelligence features, revealed at the company’s WWDC 2024 event, will be available in some form with the iPhone 16 series launch. These GenAI features will only be available on the iPhone 15 Pro, iPhone 15 Pro Max, and iPads and Macs with an M1 chip or later.