Microsoft said Monday that it has taken steps to fix a glaring security flaw that led to the exposure of 38 terabytes of private data.
The leak, discovered in the company’s AI GitHub repository, is said to have been inadvertently exposed while publishing a bucket of open-source training data, Wiz said. It also included a disk backup of two former employees’ workstations containing secrets, keys, passwords, and more than 30,000 internal Teams messages.
The repository, named “robust-models-transfer,” is no longer accessible. Before it was taken down, it featured source code and machine learning models related to a 2020 research paper titled “Do Adversarially Robust ImageNet Models Transfer Better?”
“The exposure came as the result of an overly permissive SAS token – an Azure feature that allows users to share data in a manner that is both hard to track and hard to revoke,” Wiz said in a report. The issue was reported to Microsoft on June 22, 2023.

Specifically, the README.md file in the repository instructed developers to download the models from an Azure Storage URL that inadvertently also granted access to the entire storage account, thereby exposing additional private data.
“In addition to the overly permissive scope of access, the token was also misconfigured to allow ‘full control’ permissions instead of read-only permissions,” said Wiz researchers Hillai Ben-Sasson and Ronny Greenberg. “This means that an attacker could not only view all files in a storage account, but could also delete and overwrite existing files.”
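To make this class of misconfiguration concrete, below is a minimal Python sketch, using the azure-storage-blob SDK, of how an Account SAS token of the kind described is generated. The account name, key, permissions, and expiry are placeholder assumptions; Wiz did not publish the actual token parameters.

```python
# Illustrative only: generating an overly broad Account SAS token.
# The account name and key below are placeholders, not real values.
from datetime import datetime, timedelta, timezone

from azure.storage.blob import (
    AccountSasPermissions,
    ResourceTypes,
    generate_account_sas,
)

# An Account SAS of this shape covers every container and blob in the
# storage account and grants far more than read access.
overly_permissive_sas = generate_account_sas(
    account_name="examplestorageaccount",   # placeholder
    account_key="<account-key>",            # placeholder
    resource_types=ResourceTypes(service=True, container=True, object=True),
    permission=AccountSasPermissions(
        read=True, write=True, delete=True, list=True,  # "full control"
    ),
    expiry=datetime.now(timezone.utc) + timedelta(days=365 * 10),  # effectively unlimited
)

# Anyone holding the resulting URL can enumerate, overwrite, or delete blobs:
# https://examplestorageaccount.blob.core.windows.net/?<overly_permissive_sas>
```

Because the signature is computed from the account key, Azure keeps no server-side record of such a token; revoking it requires rotating the account key itself, which is why Wiz describes these tokens as hard to track and revoke.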

Microsoft, in response to the findings, said its investigation found no evidence of unauthorized exposure of customer data and that “no other internal services were compromised as a result of this issue.” It also emphasized that customers do not need to take any action on their part.
The Windows maker further noted that it has revoked the SAS token and blocked all external access to the storage account. The problem was resolved two days after responsible disclosure.

To mitigate such risks going forward, the company has expanded its secret scanning service to include any SAS token that may have overly permissive expirations or privileges. It said it also identified a bug in its scanning system that flagged the specific SAS URL in the repository as a false positive.
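Microsoft has not published how its scanning service works, so the sketch below is purely a hypothetical illustration of the kind of heuristic involved: parsing the `sp` (permissions) and `se` (expiry) query parameters of a SAS URL and flagging write-capable or long-lived tokens. The threshold and example URL are assumptions.

```python
# Hypothetical SAS URL checker; not Microsoft's actual scanner.
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlparse

def flag_risky_sas_url(url: str, max_days: int = 30) -> list[str]:
    """Return reasons a SAS URL looks overly permissive, if any."""
    params = parse_qs(urlparse(url).query)
    findings = []

    # 'sp' carries the granted permissions; anything beyond read/list
    # in a publicly shared URL is suspicious.
    permissions = params.get("sp", [""])[0]
    if any(p in permissions for p in ("w", "d", "c", "a")):
        findings.append(f"write-capable permissions: sp={permissions}")

    # 'se' carries the expiry timestamp; long lifetimes defeat revocation.
    expiry = params.get("se", [""])[0]
    if expiry:
        expires_at = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
        if expires_at - datetime.now(timezone.utc) > timedelta(days=max_days):
            findings.append(f"long-lived token: expires {expiry}")

    return findings

# A token granting read/write/delete until 2051 is flagged on both counts.
print(flag_risky_sas_url(
    "https://example.blob.core.windows.net/models?sp=rwd&se=2051-01-01T00:00:00Z&sig=..."
))
```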
“Due to the lack of monitoring and governance of Account SAS tokens, they should be considered as sensitive as the account key itself,” the researchers said. “Therefore, it is highly recommended to avoid using Account SAS for external sharing. Token-creation mistakes can easily go unnoticed and expose sensitive data.”
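By way of contrast with the Account SAS above, here is a minimal sketch of the narrower pattern the researchers favor: a service-level SAS scoped to a single blob, read-only, and short-lived. The account, container, and blob names are placeholder values.

```python
# Illustrative only: a narrowly scoped, read-only, short-lived blob SAS.
from datetime import datetime, timedelta, timezone

from azure.storage.blob import BlobSasPermissions, generate_blob_sas

scoped_sas = generate_blob_sas(
    account_name="examplestorageaccount",      # placeholder
    container_name="public-models",            # placeholder
    blob_name="robust_resnet50.pth",           # placeholder
    account_key="<account-key>",               # placeholder
    permission=BlobSasPermissions(read=True),  # read-only, nothing else
    expiry=datetime.now(timezone.utc) + timedelta(hours=24),  # short-lived
)

# The resulting URL exposes exactly one file, for one day:
# https://examplestorageaccount.blob.core.windows.net/public-models/robust_resnet50.pth?<scoped_sas>
```

Even if a token like this leaks, the blast radius is one file for a bounded window, rather than every blob in the account indefinitely.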
This is not the first time misconfigured Azure storage accounts have come to light. In July 2022, JUMPSEC Labs highlighted a scenario in which a threat actor could take advantage of such accounts to gain access to an enterprise’s on-premise environment.
The development is the latest security misstep at Microsoft and comes nearly two weeks after the company revealed that hackers based in China were able to infiltrate its systems and steal a highly sensitive signing key by compromising an engineer’s corporate account and likely accessing a crash dump of the consumer signing system.
“AI unlocks immense potential for tech companies. However, as data scientists and engineers race to bring new AI solutions to production, the massive amounts of data they handle require additional security checks and safeguards,” Ami Luttwak, CTO and co-founder of Wiz, said in a statement.
“This emerging technology requires huge sets of data to train on. With many development teams needing to manipulate massive amounts of data, share it with their peers, or collaborate on public open-source projects, cases like Microsoft’s are increasingly hard to monitor and avoid.”