Microsoft employees accidentally left a storage bucket of open-source data exposed on GitHub. The leak came to light alongside another Microsoft data leak involving confidential Xbox documents.
In the former leak, Microsoft researchers reportedly exposed about 38TB of data. The leak was first discovered by Wiz, a cloud security service provider.
Researchers also found that the data exposed through Microsoft’s AI GitHub repository included nearly 30,000 internal Teams messages. The Microsoft GitHub data leak was caused by a misconfigured SAS token.
A Shared Access Signature (SAS) token is a signed link that lets its recipients access data in an Azure Storage account, within the permissions configured when the token is created.
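To make the idea of a SAS link concrete, the sketch below splits one into its query parameters. The URL, account name, and signature are made up for illustration; the parameter names (`sv`, `sr`, `sp`, `st`, `se`, `sig`) are the ones Azure Storage uses in SAS query strings, but the helper itself is hypothetical and is not taken from the Wiz report.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URL for illustration only; the account and signature are made up.
EXAMPLE_SAS_URL = (
    "https://exampleaccount.blob.core.windows.net/models/model.ckpt"
    "?sv=2021-08-06&sr=b&sp=r&se=2030-01-01T00:00:00Z&sig=REDACTED"
)

# Human-readable labels for common SAS query parameters.
FIELD_NAMES = {
    "sv": "storage service version",
    "sr": "signed resource (b = blob, c = container)",
    "sp": "signed permissions",
    "st": "start time",
    "se": "expiry time",
    "sig": "signature",
}


def describe_sas_url(url):
    """Return the host, path, and labelled SAS fields of a URL."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    return {
        "host": parts.netloc,
        "path": parts.path,
        # parse_qs returns a list per key; a SAS field appears once.
        "fields": {FIELD_NAMES.get(k, k): v[0] for k, v in params.items()},
    }


info = describe_sas_url(EXAMPLE_SAS_URL)
for label, value in info["fields"].items():
    print(f"{label}: {value}")
```

Anyone holding the full link, including the signature, gets whatever access these fields describe until the expiry time passes.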
Microsoft Gaming Data Leak
A team of Bloomberg reporters uncovered the confidential Xbox data leak. It initially prompted speculation that the Federal Trade Commission had made an error, but subsequent investigation showed this was not the case.
“Microsoft Corp. mistakenly uploaded confidential information about its video-game operations to a federal court website, according to a person familiar with the matter and a post from a Federal Trade Commission employee,” read the Bloomberg report about the Microsoft gaming data leak.
Addressing the rise in hacking and the Microsoft gaming data leak, the malware repository service VX-Underground tweeted images of the data that was likely exposed.
It noted that, besides the recent Microsoft gaming data leak and the public exposure of Azure data, a Chinese cyber espionage group had spied on the Outlook emails and communications of US government employees.
Second Microsoft Data Leak by its Employees
Microsoft’s artificial intelligence (AI) research team was publishing a bucket of open-source training data on GitHub when the storage was left open. The data was being shared through SAS tokens, an Azure feature.
The leaked data was found to contain disk backups of two employees’ workstations. The data was held in the Azure Storage accounts of the two employees.
Stressing the importance of data security in AI training, the Wiz blog noted, “This case is an example of the new risks organizations face when starting to leverage the power of AI more broadly, as more of their engineers now work with massive amounts of training data.”
The Wiz research team regularly monitors cloud-hosted data, which led to the discovery of the accidental Microsoft data leak on GitHub.
Details About the Exposed Microsoft Repository
The Microsoft repository was called robust-models-transfer, and it belonged to the technology leader’s AI research division, which used it to offer open-source code and AI models. The models are used for image recognition.
Microsoft staff shared the data and asked users to download the models from an Azure Storage URL.
“However, this URL allowed access to more than just open-source models. It was configured to grant permissions on the entire storage account, exposing additional private data by mistake,” added the Wiz report.
Upon investigation, researchers found that the storage account behind the URL exposed 38TB of data. Backups of employees’ personal computers were also said to have been exposed in the incident.
The data found in the Microsoft GitHub leak also included passwords to Microsoft services, secret keys, and Teams messages from 359 employees.
Configuration Left Unchanged, Microsoft’s Response to Data Leak
“In addition to the overly permissive access scope, the token was also misconfigured to allow ‘full control’ permissions instead of read-only,” read the Wiz blog, describing how the token was configured.
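The two problems Wiz describes, a scope wider than intended and permissions beyond read-only, can both be read off a SAS URL's query string. The following is a minimal sketch of such a check, assuming the standard Azure permission letters (`r` read, `l` list, `w` write, `d` delete, and so on) and the `srt` parameter that appears on account-level tokens. The function name and the example URLs are hypothetical, and this is not the check Wiz or Microsoft actually ran.

```python
from urllib.parse import urlsplit, parse_qs

# Permission letters treated as read-only in this sketch: read and list.
READ_ONLY = set("rl")


def sas_risk_findings(url):
    """Return a list of human-readable warnings for a SAS URL."""
    params = parse_qs(urlsplit(url).query)
    permissions = set(params.get("sp", [""])[0])
    findings = []

    # Any letter outside read/list means the holder can change data.
    extra = permissions - READ_ONLY
    if extra:
        findings.append("grants more than read/list: " + "".join(sorted(extra)))

    # The srt parameter only appears on account-level SAS tokens,
    # which are not limited to a single blob or container.
    if "srt" in params:
        findings.append("account-level token, not limited to one resource")

    return findings


# Hypothetical examples; accounts and signatures are made up.
broad = (
    "https://exampleaccount.blob.core.windows.net/"
    "?sv=2021-08-06&ss=b&srt=sco&sp=rwdlac&se=2030-01-01T00:00:00Z&sig=REDACTED"
)
narrow = (
    "https://exampleaccount.blob.core.windows.net/models/model.ckpt"
    "?sv=2021-08-06&sr=b&sp=r&se=2030-01-01T00:00:00Z&sig=REDACTED"
)

broad_findings = sas_risk_findings(broad)
narrow_findings = sas_risk_findings(narrow)
```

A link meant only for downloading models would be expected to look like the second URL: one resource, read permission only.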
Wiz researchers also noted that the model file was in the ckpt format produced by the TensorFlow library, which is serialized using Python’s pickle format, so it could be exploited to run malicious code.
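The risk comes from how pickle works in general: a pickled object can name a function to call while it is being loaded. The sketch below shows this with a deliberately harmless payload, then shows the restricted-unpickler pattern from the Python documentation that refuses such calls. It illustrates the general mechanism only, not the actual files in the repository; the class and function names are made up.

```python
import io
import pickle


class HarmlessPayload:
    """Stand-in for a malicious object; it only evaluates 6 * 7."""

    def __reduce__(self):
        # pickle stores this callable and its arguments, then calls
        # the callable when the data is loaded.
        return (eval, ("6 * 7",))


crafted_bytes = pickle.dumps(HarmlessPayload())

# Loading does not rebuild a HarmlessPayload; it runs eval("6 * 7").
result = pickle.loads(crafted_bytes)


class NoGlobalsUnpickler(pickle.Unpickler):
    """Unpickler that refuses to look up any function or class."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    """Load pickled data made of plain containers and scalars only."""
    return NoGlobalsUnpickler(io.BytesIO(data)).load()


try:
    restricted_loads(crafted_bytes)
    blocked = False
except pickle.UnpicklingError:
    blocked = True

plain = restricted_loads(pickle.dumps([1, 2, 3]))
```

With write access to the storage account, an attacker could in principle have swapped a model file for one carrying a real payload in place of the harmless one here.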
Microsoft’s Security Response Center responded to the data leak reports by clarifying that no customer data was exposed and that no internal services were affected by the Azure data exposure.
The technology giant said that, following the reports, it has expanded GitHub’s secret scanning service, which monitors public open-source code changes for plaintext exposure of secrets.
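As a rough illustration of what scanning for exposed secrets means, the sketch below searches text for Azure Storage URLs that carry a `sig=` signature parameter. The regular expression is an assumption made for this example and is far simpler than a production rule; it is not GitHub's actual detection logic.

```python
import re

# Illustrative pattern only: an Azure Storage URL whose query string
# contains a "sig=" parameter, which marks it as a SAS link.
SAS_URL_PATTERN = re.compile(
    r"https://[a-z0-9]+\.(?:blob|file|queue|table)\.core\.windows\.net/"
    r"\S*[?&]sig=[A-Za-z0-9%+/=]+"
)


def find_sas_urls(text):
    """Return (line_number, matched_url) pairs for SAS-like URLs."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in SAS_URL_PATTERN.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


# Hypothetical README content; the account and signature are made up.
sample = (
    "Download the model:\n"
    "https://exampleaccount.blob.core.windows.net/models/m.ckpt"
    "?sv=2021-08-06&sp=r&sig=abc%2Bdef%3D\n"
    "Unrelated link: https://example.com/?sig=1\n"
)

hits = find_sas_urls(sample)
```

A scanner built along these lines could run on every commit and flag matching lines for review before they reach a public repository.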
Media Disclaimer: This report is based on internal and external research obtained through various means. The information provided is for reference purposes only, and users bear full responsibility for their reliance on it. The Cyber Express assumes no liability for the accuracy or consequences of using this information.