A former employee at DeepMind, the Google-owned AI research lab, accuses the company’s human resources department of intentionally delaying its response to her complaints about sexual misconduct in the workplace, as first reported by the Financial Times.
In an open letter posted to Medium, the former employee (who goes by Julia to protect her identity) says she was sexually harassed by a senior researcher for months while working at the London-based company. During this time, she was allegedly subjected to numerous sexual propositions and inappropriate messages, including some that described past sexual violence against women and threats of self-harm.
Julia got in contact with the company’s HR and grievance team as early as August 2019 to outline her interactions with the senior researcher, and she raised a formal complaint in December 2019. The researcher in question reportedly wasn’t dismissed until October 2020. He faced no suspension and was even given a company award while HR was processing Julia’s complaint, leaving Julia fearing for her — and her other female colleagues’ — safety.
Although the Financial Times’ report says her case wasn’t fully resolved until seven months after she first reported the misconduct, Julia told The Verge that the whole process actually took 10 months. She claims DeepMind’s communications team used “semantics” to “push back” on the Financial Times’ story and shorten the amount of time it took to address her case.
“It was in fact 10 months, they [DeepMind] argued it was ‘only’ 7 because that’s when the appeal finished, though the disciplinary hearing took another 2 months, and involved more rounds of interviews for me,” Julia said. “My point stands: whether it was 10 months or 7, it was far, far too long.”
Besides believing her case was “intentionally dragged out,” Julia also claims two separate HR managers told her she would face “disciplinary action” if she spoke out about it. Her manager allegedly required her to attend meetings with the senior researcher as well, despite being “partially” aware of her report, the Financial Times says. While Julia herself didn’t sign a non-disclosure agreement, many other DeepMind employees have.
In a separate post on Medium, Julia and others offered several suggestions as to how Alphabet (Google and DeepMind’s parent company) can improve its response to complaints and reported issues, such as doing away with the NDA policy for victims and setting a strict two-month time limit for HR to resolve grievances.
The Alphabet Workers Union also expressed support for Julia in a tweet, noting: “The NDAs we sign should never be used to silence victims of harassment or workplace abuse. Alphabet should have a global policy against this.”
In a statement to The Verge, DeepMind interim head of communications Laura Anderson acknowledged the struggles Julia went through but avoided taking accountability for her experiences. “DeepMind takes all allegations of workplace misconduct extremely seriously and we place our employees’ safety at the core of any actions we take,” Anderson said. “The allegations were investigated thoroughly, and the individual who was investigated for misconduct was dismissed without any severance payments… We’re sorry that our former employee experienced what they did and we recognise that they found the process difficult.”
DeepMind has faced concerns over its treatment of employees in the past. In 2019, a Bloomberg report said DeepMind co-founder Mustafa Suleyman, also known as “Moose,” was placed on administrative leave for the controversy surrounding some of his projects. Suleyman left the company later that year to join Google. In 2021, a Wall Street Journal report revealed that Suleyman was deprived of management duties in 2019 for allegedly bullying staff members. Google also launched an investigation into his behavior at the time, but it never made its findings public.
“If anyone finds themselves in a similar situation: first, right now, before anything bad happens, join a union,” Julia said in response to the broader concerns. “Then if something bad happens: Document everything. Know your rights. Don’t let them drag it out. Stay vocal. These stories are real, they are happening to your colleagues.”
Correction April 5th 6:51PM ET: A previous version of the story stated Julia signed an NDA. She did not, but other DeepMind employees have. We regret the error.
A security researcher has discovered critical flaws in several popular ransomware and malware strains, a finding that could force their creators to entirely rethink how they infiltrate potential victims.
Among the most active ransomware groups at the moment are Conti, REvil, Black Basta, LockBit, and AvosLocker. As reported by Bleeping Computer, however, the malware developed by these cyber gangs has been found to contain crucial security vulnerabilities.
These defects could prove to be a damaging revelation for the aforementioned groups: such security holes can be targeted to prevent the very thing most ransomware is created for, the encryption of files contained within a system.
A security researcher, hyp3rlinx, who specializes in malware vulnerability research, examined the malware strains belonging to the leading ransomware groups. Interestingly, he found the samples were vulnerable to dynamic link library (DLL) hijacking, a method traditionally used by attackers themselves to target programs via malicious code.
“DLL hijacking works on Windows systems only and exploits the way applications search for and load in memory the Dynamic Link Library (DLL) files they need,” Bleeping Computer explains. “A program with insufficient checks can load a DLL from a path outside its directory, elevating privileges or executing unwanted code.”
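The in-order directory walk described above can be modeled in a few lines. The following Python sketch is a simplified illustration of the search-order weakness, not Windows' actual loader logic (which also involves SafeDllSearchMode and the KnownDLLs list); the directory layout and the DLL name are illustrative assumptions.

```python
def resolve_dll(name, search_dirs, files_on_disk):
    """Return the first path matching `name`, mimicking an in-order
    directory walk (a simplified model of DLL search order)."""
    for directory in search_dirs:
        candidate = directory + "\\" + name
        if candidate in files_on_disk:
            return candidate
    return None

# A program with insufficient checks searches its own directory first,
# so a same-named DLL planted there "wins" over the legitimate copy.
search_dirs = [r"C:\Apps\Malware", r"C:\Windows\System32"]
files_on_disk = {
    r"C:\Apps\Malware\netapi32.dll",     # planted DLL (hypothetical name)
    r"C:\Windows\System32\netapi32.dll", # legitimate system DLL
}
loaded = resolve_dll("netapi32.dll", search_dirs, files_on_disk)
# `loaded` is the planted copy, not the one in System32.
```

In the research described here, that loading behavior is turned against the malware itself: the "planted" DLL is the defender's exploit code.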
The exploits associated with the ransomware samples inspected by hyp3rlinx, all derived from Conti, REvil, LockBit, Black Basta, LockiLocker, and AvosLocker, allow code that can essentially “control and terminate the malware pre-encryption.”
Thanks to the discovery of these flaws, hyp3rlinx was able to design exploit code that is assembled into a DLL. That code is then given a specific name, effectively tricking the malicious process into loading it as its own. The final step sees the malware load said code, which terminates it before it can begin encrypting data.
Conveniently, the security researcher uploaded a video that shows how a DLL hijacking vulnerability is used (by ransomware group REvil) to put an end to the malware attack before it can even begin.
The significance of the discovery of these exploits
As highlighted by Bleeping Computer, ransomware typically targets network locations that house sensitive data. hyp3rlinx therefore asserts that by placing the exploit DLL in those folders, the ransomware process should theoretically be stopped before it can inflict damage.
Malware is often capable of evading security mitigations, but hyp3rlinx stresses that these strains have no defenses against the DLL exploits.
That said, whether the researcher’s investigation results in long-lasting changes in preventing or at least reducing the impact of ransomware and malware attacks is another question entirely.
“If the samples are new, it is likely that the exploit will work only for a short time because ransomware gangs are quick to fix bugs, especially when they hit the public space,” Bleeping Computer said. “Even if these findings prove to be viable for a while longer, companies targeted by ransomware gangs still run the risk of having important files stolen and leaked, as exfiltration to pressure the victim into paying a ransom is part of this threat actor’s modus operandi.”
Still, the cybersecurity website added that hyp3rlinx’s exploits “could prove useful at least to prevent operational disruption, which can cause significant damage.”
As such, although the flaws are likely to be patched by the ransomware groups in the immediate future, finding these exploits is an encouraging first step toward impeding the development and distribution of dangerous code. It may also lead to more advanced mitigation methods to prevent attacks.
There’s a new zero-day issue in Windows, and this time the bug has been disclosed to the public by an angry security researcher. The vulnerability allows users to gain elevated system privileges via the command prompt, which could then be used to spread dangerous content through a network.
According to a report from Bleeping Computer, Abdelhamid Naceri, the security researcher who disclosed this bug, is frustrated with Microsoft over payouts from the bug bounty program. Bounties have apparently been downgraded significantly over the past two years. Naceri isn’t alone, either. One Twitter user reported in 2020 that zero-day vulnerabilities no longer pay $10,000 and are now valued at $1,000. Earlier this month, another Twitter user reported that bounties can be reduced at any time.
Microsoft apparently fixed one zero-day issue with the latest round of “Patch Tuesday” updates but left another incorrectly patched. Naceri bypassed that patch and found a more powerful variant. The zero-day vulnerability impacts all supported versions of Windows, including Windows 8.1, Windows 10, and Windows 11.
“This variant was discovered during the analysis of CVE-2021-41379 patch. The bug was not fixed correctly, however, instead of dropping the bypass. I have chosen to actually drop this variant as it is more powerful than the original one,” explained Naceri in a GitHub post.
His proof of concept is on GitHub, and Bleeping Computer tested and successfully ran the exploit. According to the publication, it is also being exploited in the wild with malware.
In a statement, a Microsoft spokesperson said that it will do what is necessary to keep its customers safe and protected. The company also mentioned it is aware of the disclosure of the latest zero-day vulnerability. It noted that attackers must already have access and the ability to run code on a target victim’s machine for the exploit to work.
With the Thanksgiving holiday in the U.S., and the fact that an attacker would already need local access to a PC, it could be a while until a patch is released. Microsoft usually issues fixes on the second Tuesday of each month, known as “Patch Tuesday,” and tests bug fixes with Windows Insiders first. A fix could come as soon as December 14.
Nearly 2 million terrorist watchlist records, including “no-fly” list indicators, were purportedly exposed online. The list was indexed across multiple search engines on July 19th, but the Department of Homeland Security did not remove it until three weeks later, as first reported by Bleeping Computer Monday.
Security Discovery researcher Volodymyr “Bob” Diachenko discovered the watchlist, which appears to be the product of the Terrorist Screening Center, last month. The files were indexed by multiple search engines in an easily readable format. Records included information like full names, citizenship status, date of birth, passport numbers, and no-fly indicators. No password or separate authentication was necessary to access it, Diachenko wrote in a LinkedIn post Monday.
“I immediately reported it to Department of Homeland Security officials, who acknowledged the incident and thanked me for my work,” Diachenko wrote. “The DHS did not provide any further official comment, though.”
The server was indexed by search engines like Censys and ZoomEye on July 19th. Diachenko discovered the data that day and reported it to the Department of Homeland Security. It wasn’t until August 9th that the server was taken down. It’s unclear if any unauthorized users accessed the data.
The Terrorist Screening Center is a multi-agency center led by the FBI and responsible for managing the US’s terrorist watchlist. It produces a watchlist used by screening agencies like the DHS and the Transportation Security Administration (TSA) to identify known or suspected terrorists attempting to enter the country, board aircraft, or obtain visas. Databases like these contain extremely sensitive information related to US national security concerns.
It’s not unusual for innocent people to be put on the FBI’s no-fly list. In 2008, NBC reported that one US airline recorded 9,000 false positives in a single day. In 2010, the American Civil Liberties Union filed a legal challenge on behalf of 10 US citizens or permanent residents who were falsely added to the no-fly list. In 2014, a court ruled that the government must notify citizens and permanent residents when they are placed on the list.
A security researcher has found that certain Wi-Fi networks with the percent symbol (%) in their names can disable Wi-Fi on iPhones and other iOS devices. Carl Schou tweeted that if an iPhone comes within range of a network named %secretclub%power, the device won’t be able to use Wi-Fi or any related features, and even after resetting network settings, the bug may continue to render Wi-Fi on the device unusable.
You can permanently disable any iOS device’s WiFi by hosting a public WiFi named %secretclub%power. Resetting network settings is not guaranteed to restore functionality. #infosec #0day
A few weeks ago, Schou and his not-for-profit group, Secret Club, which reverse-engineers software for research purposes, found that if an iPhone connected to a network with the SSID %p%s%s%s%s%n, a bug in iOS’ networking stack would disable its Wi-Fi, and system networking features like AirDrop would become unusable.
The ‘%[character]’ syntax is commonly used in programming languages to format variables into an output string. In C, the ‘%n’ specifier tells the formatting function to write the number of characters output so far into a variable passed as an argument. The Wi-Fi subsystem probably passes the Wi-Fi network name (SSID) unsanitized to some internal library that performs string formatting, which in turn causes an arbitrary memory write and buffer overflow. This leads to memory corruption, and the iOS watchdog kills the process, effectively disabling Wi-Fi for the user.
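To make the failure mode concrete, here is a small Python sketch of the difference between using untrusted input as a format string and passing it as data. Python raises an exception instead of corrupting memory, but the core mistake, interpreting ‘%’ sequences inside attacker-controlled text, is the same one described above; the function names are illustrative, not Apple’s.

```python
def log_ssid_unsafe(ssid):
    # BUG: the SSID itself becomes part of the format string, so any
    # '%' sequences in the network name are interpreted as specifiers.
    return ("Joined network: " + ssid) % ()

def log_ssid_safe(ssid):
    # Correct: the SSID is passed as data; '%' in it is never interpreted.
    return "Joined network: %s" % ssid

# A benign name works either way, but a name like "%p%s%s%s%s%n"
# makes the unsafe version blow up while the safe version just logs it.
```

In C, where `printf(ssid)` has no such safety net, the equivalent mistake lets `%n` write to memory, which is why the fix is always `printf("%s", ssid)`-style handling.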
We’ve reached out to Apple to see if it’s working on a fix and will update if we hear back. But as 9to5Mac notes, the bug can likely be avoided by not connecting to Wi-Fi networks with percent symbols in their names.
IOActive security researcher Josep Rodriguez has warned that the NFC readers used in many modern ATMs and point-of-sale systems leave them vulnerable to attacks, Wired reports. The flaws expose them to a range of problems, including being crashed by a nearby NFC device, locked down as part of a ransomware attack, or even hacked to extract certain credit card data.
Rodriguez even warns that the vulnerabilities could be used as part of a so-called “jackpotting” attack to trick a machine into spitting out cash. However, such an attack is only possible when paired with exploits of additional bugs, and Wired says it was not able to view a video of such an attack because of IOActive’s confidentiality agreement with the affected ATM vendor.
By relying on vulnerabilities in the machines’ NFC readers, Rodriguez’s hacks are relatively easy to execute. While some previous attacks have relied on devices like medical endoscopes to probe machines, Rodriguez can simply wave an Android phone running his software in front of a machine’s NFC reader to exploit any vulnerabilities it might have.
In one video shared with Wired, Rodriguez caused an ATM in Madrid to display an error message simply by waving his smartphone over its NFC reader. The machine then became unresponsive to real credit cards held up to the reader.
The research highlights a couple of big problems with the systems. The first is that many of the NFC readers are vulnerable to relatively simple attacks, Wired reports. For example, in some cases the readers aren’t verifying how much data they’re receiving, which means Rodriguez was able to overwhelm the system with too much data and corrupt its memory as part of a “buffer overflow” attack.
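The missing check is simple to state. The Python sketch below is a toy model only: the single-byte length prefix and the size cap are assumptions for illustration, not the real reader firmware’s framing, but the principle, never trusting a declared length that exceeds the data actually received, is the one being violated.

```python
MAX_PAYLOAD = 255  # illustrative cap, not any real firmware's limit

def parse_frame(frame: bytes) -> bytes:
    """Parse a toy length-prefixed frame: byte 0 declares payload size."""
    if not frame:
        raise ValueError("empty frame")
    declared = frame[0]
    payload = frame[1:]
    # The guard the vulnerable readers lack: firmware that copies
    # `declared` bytes into a fixed buffer without this check can be
    # overflowed by a frame that lies about its own size.
    if declared > MAX_PAYLOAD or declared > len(payload):
        raise ValueError("declared length exceeds received data")
    return payload[:declared]
```

In memory-safe Python the worst case is an exception; in the C firmware of an NFC reader, the same lie about length becomes the buffer overflow described above.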
The second problem is that even once an issue is identified, companies can be slow to apply a patch to the hundreds of thousands of machines in use around the world. Often a machine needs to be physically visited to apply an update, and many don’t receive regular security patches. One company said the problem Rodriguez has highlighted was patched in 2018, for example, but the researcher says he was able to verify that the attack worked in a restaurant in 2020.
Rodriguez plans to present his findings as part of a webinar in the coming weeks to highlight what he says are the poor security measures of embedded devices.
Google has fired Margaret Mitchell, co-lead of the ethical AI team, after she used an automated script to look through her emails in order to find evidence of discrimination against her coworker Timnit Gebru. The news was first reported by Axios.
Mitchell’s firing comes one day after Google announced a reorganization to its AI teams working on ethics and fairness. Marian Croak, a vice president in the engineering organization, is now leading “a new center of expertise on responsible AI within Google Research,” according to a blog post.
Mitchell joined Google in 2016 as a senior research scientist, according to her LinkedIn. Two years later, she helped start the ethical AI team alongside Gebru, a renowned researcher known for her work on bias in facial recognition technology.
In December 2020, Mitchell and Gebru were working on a paper about the dangers of large language processing models when Megan Kacholia, vice president of Google Brain, asked that the article be retracted. Gebru pushed back, saying the company needed to be more open about why the research wasn’t acceptable. Shortly afterwards, she was fired, though Google characterized her departure as a resignation.
After Gebru’s termination, Mitchell became openly critical of Google executives, including Google AI division head Jeff Dean and Google CEO Sundar Pichai. In January, she lost her corporate email access after Google began investigating her activity.
“After conducting a review of this manager’s conduct, we confirmed that there were multiple violations of our code of conduct, as well as of our security policies, which included the exfiltration of confidential business-sensitive documents and private data of other employees,” Google said in a statement to Axios about Mitchell’s firing.
On Friday, Google announced it was making changes to its research and diversity policies, following an investigation into Gebru’s termination. In an internal email, Jeff Dean apologized to staff for how Gebru’s departure was handled. “I heard and acknowledge what Dr. Gebru’s exit signified to female technologists, to those in the Black community and other underrepresented groups who are pursuing careers in tech, and to many who care deeply about Google’s responsible use of AI. It led some to question their place here, which I regret,” he said.
The ethical AI team has been in crisis since Gebru’s firing in December. After the reorganization announcement yesterday, senior researcher Alex Hanna wrote that the team was not aware of Croak’s appointment until the news broke publicly Wednesday night. “We were told to trust the process, trust in decision-makers like Marian Croak to look out for our best interests,” she said on Twitter. “But these decisions were made behind our backs.”
The security researcher who discovered the KRACK Wi-Fi vulnerability has discovered a slew of other flaws in the wireless protocol most of us use to power our online lives (via Gizmodo). The vulnerabilities relate to how Wi-Fi handles large chunks of data; some concern the Wi-Fi standard itself, and some concern how it’s implemented by device manufacturers.
The researcher, Mathy Vanhoef, calls the collection of vulnerabilities “FragAttacks,” with the name being a mashup of “fragmentation” and “aggregation.” He also says the vulnerabilities could be exploited by hackers, allowing them to intercept sensitive data, or show users fake websites, even if they’re using Wi-Fi networks secured with WPA2 or even WPA3. They could also theoretically exploit other devices on your home network.
There are twelve different attack vectors that fall under the classification, which all work in different ways. One exploits routers accepting plaintext during handshakes, one exploits routers caching data in certain types of networks, etc. If you want to read all the technical details on how exactly they work, you can check out Vanhoef’s website.
According to The Record, Vanhoef informed the Wi-Fi Alliance about the vulnerabilities that were baked into the way Wi-Fi works so they could be corrected before he disclosed them to the public. Vanhoef says he’s not aware of the vulnerabilities being exploited in the wild. While he points out in a video that some of them aren’t particularly easy to exploit, he says others would be “trivial” to take advantage of.
Vanhoef points out that some of the flaws can be exploited on networks using the WEP security protocol, indicating that they’ve been around since Wi-Fi was first implemented in 1997 (though if you’re still using WEP, these attacks should be the least of your concerns).
Vanhoef says the flaws are widespread, affecting many devices, meaning there’s a lot of updating to do.
The thing about updating Wi-Fi infrastructure is that it’s always a pain. For example, before writing this article I went to check whether my router had any updates and realized I had forgotten my login information (and I suspect I won’t be alone in that experience). There are also devices that are just plain old, whose manufacturers are either gone or no longer releasing patches. If you can, though, you should keep an eye on your router manufacturer’s website for any updates that are rolling out, especially if the device is in the advisory list.
Some vendors have already released patches for some of their products.
As for anything else you need to do, Vanhoef recommends the usual steps: keep your computers updated, use strong, unique passwords, don’t visit shady sites, and make sure you’re using HTTPS as often as possible. Other than that, it’s mostly being thankful that you’re not in charge of widespread IT infrastructure (my deepest condolences if you, in fact, are).
(Reuters) — Apple said on Monday it has hired former distinguished Google scientist Samy Bengio, who left the search giant amid turmoil in its artificial intelligence research department.
Bengio is expected to lead a new AI research unit at Apple under John Giannandrea, senior vice president of machine learning and AI strategy, two people familiar with the matter said. Giannandrea joined Apple in 2018 after spending about eight years at Google.
Apple declined to comment on Bengio’s role.
Bengio, who left Google last week after about 14 years, said last month he was pursuing “other exciting opportunities”.
His decision followed Google’s firings of fellow scientists Margaret Mitchell for taking company data and Timnit Gebru after an offer to resign. Mitchell and Gebru had co-led a team researching ethics issues in AI, and had voiced concern about Google’s workplace diversity and approach to reviewing research. Bengio had expressed support for the pair.
As one of the early leaders of the Google Brain research team, Bengio advanced the “deep learning” algorithms that underpin today’s AI systems for analyzing images, speech and other data.
On February 14, a researcher who was frustrated with reproducing the results of a machine learning research paper opened a Reddit account under the username ContributionSecure14 and posted to the r/MachineLearning subreddit: “I just spent a week implementing a paper as a baseline and failed to reproduce the results. I realized today after googling for a bit that a few others were also unable to reproduce the results. Is there a list of such papers? It will save people a lot of time and effort.”
The post struck a nerve with other users on r/MachineLearning, which is the largest Reddit community for machine learning.
“Easier to compile a list of reproducible ones…,” one user responded.
“Probably 50%-75% of all papers are unreproducible. It’s sad, but it’s true,” another user wrote. “Think about it, most papers are ‘optimized’ to get into a conference. More often than not the authors know that a paper they’re trying to get into a conference isn’t very good! So they don’t have to worry about reproducibility because nobody will try to reproduce them.”
A few other users posted links to machine learning papers they had failed to implement and voiced their frustration with code implementation not being a requirement in ML conferences.
The next day, ContributionSecure14 created “Papers Without Code,” a website that aims to create a centralized list of machine learning papers whose results others have been unable to reproduce.
“I’m not sure if this is the best or worst idea ever but I figured it would be useful to collect a list of papers which people have tried to reproduce and failed,” ContributionSecure14 wrote on r/MachineLearning. “This will give the authors a chance to either release their code, provide pointers or rescind the paper. My hope is that this incentivizes a healthier ML research culture around not publishing unreproducible work.”
Reproducing the results of machine learning papers
Machine learning researchers regularly publish papers on online platforms such as arXiv and OpenReview. These papers describe concepts and techniques that highlight new challenges in machine learning systems or introduce new ways to solve known problems. Many of these papers find their way into mainstream artificial intelligence conferences such as NeurIPS, ICML, ICLR, and CVPR.
Having source code to go along with a research paper helps a lot in verifying the validity of a machine learning technique and building on top of it. But this is not a requirement for machine learning conferences. As a result, many students and researchers who read these papers struggle with reproducing their results.
“Unreproducible work wastes the time and effort of well-meaning researchers, and authors should strive to ensure at least one public implementation of their work exists,” ContributionSecure14, who preferred to remain anonymous, told TechTalks in written comments. “Publishing a paper with empirical results in the public domain is pointless if others cannot build off of the paper or use it as a baseline.”
But ContributionSecure14 also acknowledges that there are sometimes legitimate reasons for machine learning researchers not to release their code. For example, some authors may train their models on internal infrastructure or use large internal datasets for pretraining. In such cases, the researchers are not at liberty to publish the code or data along with their paper because of company policy.
“If the authors publish a paper without code due to such circumstances, I personally believe that they have the academic responsibility to work closely with other researchers trying to reproduce their paper,” ContributionSecure14 says. “There is no point in publishing the paper in the public domain if others cannot build off of it. There should be at least one publicly available reference implementation for others to build off of or use as a baseline.”
In some cases, even if the authors release both the source code and data to their paper, other machine learning researchers still struggle to reproduce the results. This can be due to various reasons. For instance, the authors might cherry-pick the best results from several experiments and present them as state-of-the-art achievements. In other cases, the researchers might have used tricks such as tuning the parameters of their machine learning model to the test data set to boost the results. In such cases, even if the results are reproducible, they are not relevant, because the machine learning model has been overfitted to specific conditions and won’t perform well on previously unseen data.
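As a toy illustration of why tuning on the test set inflates results, here is a short, self-contained Python sketch on synthetic data (not drawn from any real paper): picking the decision threshold that maximizes test accuracy can, by construction, only match or beat an honestly chosen one on that same test set, which is exactly why the reported number fails to transfer.

```python
import random

random.seed(0)

def make_data(n):
    """Toy binary task: label is 1 when the feature exceeds 0.5, plus noise."""
    xs = [random.random() for _ in range(n)]
    ys = [1 if x + random.gauss(0, 0.2) > 0.5 else 0 for x in xs]
    return xs, ys

def accuracy(threshold, xs, ys):
    preds = [1 if x > threshold else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

test_x, test_y = make_data(200)

# Honest choice: the threshold implied by how the data was generated.
honest = accuracy(0.5, test_x, test_y)

# Cherry-picking: sweep thresholds and report the best *test* accuracy.
grid = [i / 100 for i in range(101)]
tuned = max(accuracy(t, test_x, test_y) for t in grid)
# `tuned >= honest` always holds here, but `tuned` is an overfit
# estimate: it exploits noise in this particular test set and will
# not be matched on previously unseen data.
```

The same logic applies to any hyperparameter swept against the test set rather than a held-out validation set.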
“I think it is necessary to have reproducible code as a prerequisite in order to independently verify the validity of the results claimed in the paper, but [code alone is] not sufficient,” ContributionSecure14 said.
Efforts for machine learning reproducibility
The reproducibility problem is not limited to small machine learning research teams. Even big tech companies that spend millions of dollars on AI research every year often fail to validate the results of their papers. In October 2020, a group of 31 scientists wrote a joint article in Nature criticizing the lack of transparency and reproducibility in a paper on the use of AI in medical imaging, published by a group of AI researchers at Google. “[The] absence of sufficiently documented methods and computer code underlying the study effectively undermines its scientific value. This shortcoming limits the evidence required for others to prospectively validate and clinically implement such technologies,” the authors wrote. “Scientific progress depends on the ability of independent researchers to scrutinize the results of a research study, to reproduce the study’s main results using its materials, and to build on them in future studies.”
Recent years have seen a growing focus on AI’s reproducibility crisis. Notable work in this regard includes the efforts of Joelle Pineau, machine learning scientist at Montreal’s McGill University and Facebook AI, who has been pushing for transparency and reproducibility of machine learning research at conferences such as NeurIPS.
“Better reproducibility means it’s much easier to build on a paper. Often, the review process is short and limited, and the true impact of a paper is something we see much later. The paper lives on, and as a community we have a chance to build on the work, examine the code, have a critical eye to what are the contributions,” Pineau told Nature in an interview in 2019.
At NeurIPS, Pineau has helped develop standards and processes that can help researchers and reviewers evaluate the reproducibility of machine learning papers. Her efforts have resulted in an increase in code and data submission at NeurIPS.
Another interesting project is Papers With Code (from which Papers Without Code gets its name), a website that provides implementations for scientific research papers published and presented at different venues. Papers With Code currently hosts implementations of more than 40,000 machine learning research papers.
“PapersWithCode plays an important role in highlighting papers that are reproducible. However, it does not address the problem of unreproducible papers,” ContributionSecure14 said.
When a machine learning research paper doesn’t include the implementation code, other researchers who read it must try to implement it by themselves, a non-trivial process that can take several weeks and ultimately result in failure.
“If they fail to implement it successfully, they might reach out to the authors (who may not respond) or simply give up,” ContributionSecure14 said. “This can happen to multiple researchers who are not aware of prior or ongoing attempts to reproduce the paper, resulting in many weeks of productivity wasted collectively.”
Papers Without Code
Papers Without Code includes a submission page, where researchers can submit unreproducible machine learning papers along with the details of their efforts, such as how much time they spent trying to reproduce the results. If a submission is valid, Papers Without Code will contact the paper’s original authors and request clarification or publication of implementation details. If the authors do not reply in a timely fashion, the paper will be added to the list of unreproducible machine learning papers.
“PapersWithoutCode solves the problem of centralizing information about prior or ongoing attempts to reproduce a paper and allows researchers (including the original author) to come together and implement a public implementation,” ContributionSecure14 said. “Once the paper has been successfully reproduced, it can be published on PapersWithCode or GitHub where other researchers can use it. In that sense, I would say the goals of PapersWithoutCode are synergistic with those of PapersWithCode and the ML community at large.”
The hope is that Papers Without Code will help establish a culture that incentivizes reproducibility in machine learning research. So far, the website has received more than 10 requests and one author has already pledged to upload their code.
“I realize that this can be a controversial subject in academia and the top priority is to protect the authors’ reputation while serving the broader ML community,” ContributionSecure14 said.
Papers Without Code can become a hub for creating a dialogue between the original authors of machine learning papers and researchers who are trying to reproduce their work.
“Instead of being a static list of unreproducible work, the hope is to create an environment where researchers can collaborate to reproduce a paper,” ContributionSecure14 said.
Reproducible machine learning research
For instance, if you’re working on research that builds on the work done in another paper, you should try out the code or the machine learning model yourself.
“Don’t build off of claims or ‘insights’ that could potentially be unfounded just because the paper says so,” ContributionSecure14 says, adding that this includes papers from large labs or work that has been accepted in a reputable conference.
Another good resource is professor Pineau’s “Machine Learning Reproducibility Checklist.” The checklist provides clear guidelines on how to make the description, code, and data of a machine learning paper clear and reproducible for other researchers.
ContributionSecure14 believes that machine learning researchers can play a crucial role in promoting a culture of reproducibility.
“There is a lot of pressure to publish at the expense of academic depth and reproducibility and there are not many checks and balances to prevent this behavior,” ContributionSecure14 said. “The only way this will change is if the current and future generation of ML researchers prioritize quality over quantity in their own research.”
This article was originally published by Ben Dickson on TechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech and what we need to look out for. You can read the original article here.