Twitter introduces aliases for contributors to its Birdwatch moderation program

Twitter is introducing aliases for participants in its Birdwatch moderation tool so they don’t have to include their usernames in notes they leave on others’ tweets, the company announced in a blog post Monday. The social media platform launched the pilot of Birdwatch in January as a way to crowdsource fact-checking on tweets that might contain misleading or inaccurate information. But the company said contributors in the pilot Birdwatch program “overwhelmingly voiced a preference for contributing under aliases. This preference was strongest for women and Black contributors.”

Twitter said its research shows that aliases have the potential to reduce bias by putting the focus not on the author of a Birdwatch note but on the note’s content. It also found that aliases may help to “reduce polarization by helping people feel comfortable crossing partisan lines.”

Twitter introduced a pilot of the Birdwatch program in January, which allows participating users to fact-check tweets and add notes with additional context. Birdwatch participants can also rate each other’s notes. The notes aren’t otherwise visible on Twitter but are displayed on the public Birdwatch website. Applicants to the Birdwatch program are asked to promise to act in good faith and “be helpful, even to those who disagree,” as conditions for participating: “Genuinely and constructively contribute to help others stay informed. Do not attempt to game or manipulate the system.”

Twitter also said Monday it was rolling out Birdwatch profile pages “to ensure this change doesn’t come at the expense of accountability.” This will make users’ past Birdwatch contributions visible and allow contributors to be “accountable” to the ratings their notes receive.

For people participating in the Birdwatch pilot who contributed under their Twitter usernames prior to Monday, all previous contributions will now appear to come from whatever alias they choose, not their Twitter username. “That said, if someone who previously read one of your notes happened to recall the username that wrote it, they could possibly infer your alias,” the company noted, adding that users could opt to delete all of their prior Birdwatch contributions by contacting Twitter directly in a DM to @birdwatch.

Repost: Original Source and Author Link


Microsoft buys Two Hat to improve Xbox community moderation

On Friday, Microsoft announced that it has acquired Two Hat, a company best known for its AI content moderation tools. Financial details have not been disclosed, but Microsoft did share its vision for how they’ll work together moving forward. Over the years, the two companies have frequently collaborated to make Xbox Live and other gaming communities safer, and by the sounds of it, that will be the focus of Two Hat moving forward.

“We have partnered with Xbox and the Microsoft team for several years and share the passion and drive to make meaningful change in the advancement of online civility and citizenship,” said Two Hat founder Chris Priebe and CEO Steve Parkis in a statement. “We are committed to ensuring safety, inclusion and online health and wellness are always at the forefront of our work and through joining Microsoft, we can provide the greatest concentration of talent, resources and insight necessary to further this vision.”

Before today’s announcement, Microsoft was only one of Two Hat’s customers, and that won’t change following the acquisition. “This is a deep investment in assisting and serving Two Hat’s existing customers, prospective new customers and multiple product and service experiences here at Microsoft,” the company said. “With this acquisition, we will help global online communities to be safer and inclusive for everyone to participate, positively contribute and thrive.”

Since 2019, Microsoft has placed an emphasis on . “Gaming is for everyone,” Xbox chief Phil Spencer said at the time. This acquisition should tie in nicely with that goal.




Microsoft acquires AI-powered moderation platform Two Hat

Microsoft today announced that it acquired Two Hat, an AI-powered content moderation platform, for an undisclosed amount. According to Xbox product services CVP Dave McCarthy, the purchase will combine the technology, research capabilities, teams, and cloud infrastructure of both companies to serve Two Hat’s existing and new customers and “multiple product and service experiences” at Microsoft.

“Working with the diverse and experienced team at Two Hat over the years, it has become clear that we are fully aligned with the core values inspired by the vision of founder, Chris Priebe, to deliver a holistic approach for positive and thriving online communities,” McCarthy said in a blog post. “For the past few years, Microsoft and Two Hat have worked together to implement proactive moderation technology into gaming and non-gaming experiences to detect and remove harmful content before it ever reaches members of our communities.”


According to the Pew Research Center, 4 in 10 Americans have personally experienced some form of online harassment. Moreover, 37% of U.S.-based internet users say they’ve been the target of severe attacks — including sexual harassment and stalking — based on their sexual orientation, religion, race, ethnicity, gender identity, or disability. Children, in particular, are the subject of online abuse, with one survey finding a 70% increase in cyberbullying on social media and gaming platforms during the pandemic.

Priebe founded Two Hat in 2012 when he left his position as a senior app security specialist at Disney Interactive, Disney’s game development division. A former lead developer on the safety and security team for Club Penguin, Priebe was driven by a desire to tackle the issues of cyberbullying and harassment on the social web.

Today, Two Hat claims its content moderation platform — which combines AI, linguistics, and “industry-leading management best practices” — classifies, filters, and escalates more than a trillion human interactions including messages, usernames, images, and videos a month. The company also works with Canadian law enforcement to train AI to detect new child exploitative material, such as content likely to be pornographic.

“With an emphasis on surfacing online harms including cyberbullying, abuse, hate speech, violent threats, and child exploitation, we enable clients across a variety of social networks across the globe to foster safe and healthy user experiences for all ages,” Two Hat writes on its website.

Microsoft partnership

Several years ago, Two Hat partnered with Microsoft’s Xbox team to apply its moderation technology to communities in Xbox, Minecraft, and MSN. Two Hat’s platform allows users to decide the content they’re comfortable seeing — and what they aren’t — which Priebe believes is a key differentiator compared with AI-powered moderation solutions like Sentropy and Jigsaw Labs’ Perspective API.

“We created one of the most adaptive, responsive, comprehensive community management solutions available and found exciting ways to combine the best technology with unique insights,” Priebe said in a press release. “As a result, we’re now entrusted with aiding online interactions for many of the world’s largest communities.”

It’s worth noting that semi-automated moderation remains an unsolved challenge. Last year, researchers showed that Perspective, a tool developed by Google and its subsidiary Jigsaw, often classified online comments written in the African American vernacular as toxic. A separate study revealed that bad grammar and awkward spelling (like “Ihateyou love” instead of “I hate you”) make toxic content far more difficult for AI and machine detectors to spot.

As evidenced by competitions like the Fake News Challenge and Facebook’s Hateful Memes Challenge, machine learning algorithms also still struggle to gain a holistic understanding of words in context. Revealingly, Facebook admitted that it hasn’t been able to train a model to find new instances of a specific category of disinformation: misleading news about COVID-19. And Instagram’s automated moderation system once disabled Black members 50% more often than white users.

But McCarthy expressed confidence in the power of Two Hat’s product, which includes a user reputation system, supports 20 languages, and can automatically suspend, ban, and mute potentially abusive members of communities.

“We understand the complex challenges organizations face today when striving to effectively moderate online communities. In our ever-changing digital world, there is an urgent need for moderation solutions that can manage online content in an effective and scalable way,” he said. “We’ve witnessed the impact they’ve had within Xbox, and we are thrilled that this acquisition will further accelerate our first-party content moderation solutions across gaming, within a broad range of Microsoft consumer services, and to build greater opportunity for our third-party partners and Two Hat’s existing clients’ use of these solutions.”


VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative technology and transact.

Our site delivers essential information on data technologies and strategies to guide you as you lead your organizations. We invite you to become a member of our community, to access:

  • up-to-date information on the subjects of interest to you
  • our newsletters
  • gated thought-leader content and discounted access to our prized events, such as Transform 2021: Learn More
  • networking features, and more

Become a member



How biased algorithms and moderation are censoring activists on social media

Following Red Dress Day on May 5, a day meant to raise awareness of Missing and Murdered Indigenous Women and Girls (MMIWG), Indigenous activists and supporters of the campaign found that posts about MMIWG had disappeared from their Instagram accounts. In response, Instagram posted a tweet saying that this was “a widespread global technical issue not related to any particular topic,” followed by an apology explaining that the platform “experienced a technical bug, which impacted millions of people’s stories, highlights and archives around the world.”

Creators, however, said that not all stories were affected.

And this is not the first time social media platforms have been under scrutiny because of their erroneous censoring of grassroots activists and racial minorities.

Many Black Lives Matter (BLM) activists were similarly frustrated when Facebook flagged their accounts, but didn’t do enough to stop racism and hate speech against Black people on their platform.

So were these really about technical glitches? Or did they result from the platforms’ discriminatory and biased policies and practices? The answer lies somewhere in between.

Towards automated content moderation

Every time an activist’s post is wrongly removed, there are at least three possible scenarios.

First, sometimes the platform deliberately takes down activists’ posts and accounts, usually at the request of and/or in co-ordination with the government. This happened when Facebook and Instagram removed posts and accounts of Iranians who expressed support for the Iranian general Qassem Soleimani.

In some countries and disputed territories, such as Kashmir, Crimea, Western Sahara and the Palestinian territories, platforms censored activists and journalists, allegedly to maintain their market access or to protect themselves from legal liabilities.

Second, a post can be removed through a user-reporting mechanism. To handle unlawful or prohibited communication, social media platforms have primarily relied on user reporting.

Applying community standards developed by the platform, content moderators would then review reported content and determine whether a violation had occurred. If it had, the content would be removed, and, in the case of serious or repeat infringements, the user may be temporarily suspended or permanently banned.

This mechanism is problematic. Due to the sheer volume of reports received on a daily basis, there are simply not enough moderators to review each report adequately. Also, complexities and subtleties of language pose real challenges. Meanwhile, marginalized groups reclaiming abusive terms for public awareness, such as BLM and MMIWG, can be misinterpreted as being abusive.

Further, in flagging content, users tend to rely on partisanship and ideology. The user reporting approach is driven by popular opinion of a platform’s users while potentially repressing the right to unpopular speech.

Such an approach also emboldens freedom to hate, where users exercise their right to voice their opinions while actively silencing others. A notable example is the removal by Facebook of “Freedom for Palestine,” a multi-artist collaboration posted by Coldplay, after a number of users reported the song as “abusive.”

Third, platforms are increasingly using artificial intelligence (AI) to help identify and remove prohibited content. The idea is that complex algorithms that use natural language processing can flag racist or violent content faster and better than humans possibly can. During the COVID-19 pandemic, social media companies are relying more on AI to cover for tens of thousands of human moderators who were sent home. Now, more than ever, algorithms decide what users can and cannot post online.

Algorithmic biases

There’s an inherent belief that AI systems are less biased and can scale better than human beings. In practice, however, they’re easily disposed to error and can impose bias on a colossal systemic scale.

In two 2019 computational linguistic studies, researchers discovered that AI intended to identify hate speech may actually end up amplifying racial bias.

In one study, researchers found that tweets written in African American English, commonly spoken by Black Americans, are up to twice as likely to be flagged as offensive compared to others. Using a dataset of 155,800 tweets, another study found a similar widespread racial bias against Black speech.
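A disparity like “twice as likely to be flagged” can be expressed as a ratio of per-group flag rates. The sketch below shows that arithmetic only. The group labels, the toy records, and the function names are hypothetical and do not come from either study.

```python
def flag_rate(records, group):
    """Share of a group's tweets that the classifier flagged as offensive."""
    flags = [flagged for g, flagged in records if g == group]
    return sum(flags) / len(flags)


def disparity_ratio(records, group_a, group_b):
    """How many times more often group_a is flagged than group_b."""
    return flag_rate(records, group_a) / flag_rate(records, group_b)


# Toy data (made up for illustration): (dialect group, was the tweet flagged?)
records = (
    [("aae", True)] * 4 + [("aae", False)] * 6
    + [("other", True)] * 2 + [("other", False)] * 8
)

print(flag_rate(records, "aae"))                 # 0.4
print(flag_rate(records, "other"))               # 0.2
print(disparity_ratio(records, "aae", "other"))  # 2.0
```

A ratio of 2.0 here corresponds to one group being flagged twice as often as the other. An audit of a real system would also need to account for whether the flagged tweets were actually offensive.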

What’s considered offensive is bound to social context; terms that are slurs when used in some settings may not be in others. Algorithmic systems lack an ability to capture nuances and contextual particularities, which may not be understood by human moderators who test data used to train these algorithms either. This means natural language processing which is often perceived as an objective tool to identify offensive content can amplify the same biases that human beings have.

Algorithmic bias may jeopardize some people who are already at risk by wrongly categorizing them as offensive, criminals or even terrorists. In mid 2020, Facebook deleted at least 35 accounts of Syrian journalists and activists on the pretext of terrorism while in reality, they were campaigning against violence and terrorism.

MMIWG, BLM and the Syrian cases exemplify the dynamic of “algorithms of oppression” where algorithms reinforce older oppressive social relations and re-install new modes of racism and discrimination.

While AI is celebrated as autonomous technology that can develop away from human intervention, it is inherently biased. The inequalities that underpin bias already exist in society and influence who gets the opportunity to build algorithms and their databases, and for what purpose. As such, algorithms do not intrinsically provide ways for marginalized people to escape discrimination, but they also reproduce new forms of inequality along social, racial and political lines.

Despite the apparent problems, algorithms are here to stay. There is no silver bullet, but one can take steps to minimize bias. The first is to recognize that there’s a problem. The next is to make a strong commitment to root out algorithmic biases.

Bias can infiltrate the process anywhere in designing algorithms.

The inclusion of more people from diverse backgrounds within this process (Indigenous people, racial minorities, women and other historically marginalized groups) is one of the important steps to help mitigate the bias. In the meantime, it is important to push platforms to allow for as much transparency and public oversight as possible.

This article by Merlyna Lim, Canada Research Chair in Digital Media & Global Network Society and Founding Director of ALiGN Media Lab, Carleton University and Ghadah Alrasheed, Post-doctoral Fellow, Interim co-Director of ALiGN Media Lab, Carleton University, is republished from The Conversation under a Creative Commons license. Read the original article.



Facebook is now using AI to sort content for quicker moderation

Facebook has always made it clear it wants artificial intelligence to handle more moderation duties on its platforms. Today, it announced its latest step toward that goal: putting machine learning in charge of its moderation queue.

Here’s how moderation works on Facebook. Posts that are thought to violate the company’s rules (which includes everything from spam to hate speech and content that “glorifies violence”) are flagged, either by users or machine learning filters. Some very clear-cut cases are dealt with automatically (responses could involve removing a post or blocking an account, for example) while the rest go into a queue for review by human moderators.
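The split between automatic handling and human review described above might look roughly like this in code. It is a minimal sketch under stated assumptions: the confidence threshold, the score scale, and the `triage` function are hypothetical, as Facebook has not published how it decides which cases are clear-cut.

```python
# Hypothetical cutoff: only near-certain violations are actioned automatically.
AUTO_ACTION_THRESHOLD = 0.99


def triage(flagged_posts):
    """Split flagged posts into auto-actioned ones and a human review queue.

    flagged_posts: iterable of (post_id, violation_probability) pairs, where
    the probability is an assumed classifier score between 0 and 1.
    """
    auto_actioned, review_queue = [], []
    for post_id, violation_prob in flagged_posts:
        if violation_prob >= AUTO_ACTION_THRESHOLD:
            auto_actioned.append(post_id)   # clear-cut: handled automatically
        else:
            review_queue.append(post_id)    # everything else goes to moderators
    return auto_actioned, review_queue


auto, queue = triage([("a", 0.999), ("b", 0.62), ("c", 0.40)])
print(auto)   # ['a']
print(queue)  # ['b', 'c']
```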

Facebook employs about 15,000 of these moderators around the world, and has been criticized in the past for not giving these workers enough support, employing them in conditions that can lead to trauma. Their job is to sort through flagged posts and make decisions about whether or not they violate the company’s various policies.

In the past, moderators reviewed posts more or less chronologically, dealing with them in the order they were reported. Now, Facebook says it wants to make sure the most important posts are seen first, and is using machine learning to help. In the future, an amalgam of various machine learning algorithms will be used to sort this queue, prioritizing posts based on three criteria: their virality, their severity, and the likelihood they’re breaking the rules.

Facebook’s old system of moderation, combining proactive moderation by ML filters and reactive reports from Facebook users.
Image: Facebook

The new moderation workflow, which now uses machine learning to sort the queue of posts for review by human moderators.
Image: Facebook

Exactly how these criteria are weighted is not clear, but Facebook says the aim is to deal with the most damaging posts first. So, the more viral a post is (the more it’s being shared and seen) the quicker it’ll be dealt with. The same is true of a post’s severity. Facebook says it ranks posts which involve real-world harm as the most important. That could mean content involving terrorism, child exploitation, or self-harm. Posts like spam, meanwhile, which are annoying but not traumatic, are ranked as least important for review.
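One way to picture the new ordering is a priority queue keyed on a combined score. The sketch below is illustrative only: the weights, the 0-to-1 score scale, and the names `priority` and `review_order` are assumptions, since Facebook has not said how the three criteria are weighted.

```python
import heapq

# Hypothetical weights; the real weighting has not been disclosed.
WEIGHTS = {"virality": 0.3, "severity": 0.5, "violation_prob": 0.2}


def priority(virality, severity, violation_prob):
    """Combine three scores in [0, 1] into one number (higher = review sooner)."""
    return (
        WEIGHTS["virality"] * virality
        + WEIGHTS["severity"] * severity
        + WEIGHTS["violation_prob"] * violation_prob
    )


def review_order(posts):
    """Return post IDs in the order moderators would see them.

    posts: iterable of (post_id, virality, severity, violation_prob) tuples.
    """
    heap = []
    for post_id, virality, severity, violation_prob in posts:
        # heapq is a min-heap, so the priority is negated to pop the highest first.
        heapq.heappush(heap, (-priority(virality, severity, violation_prob), post_id))
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


posts = [
    ("spam", 0.9, 0.1, 0.9),        # widely seen, low harm
    ("self_harm", 0.2, 1.0, 0.8),   # less viral, severe real-world harm
    ("borderline", 0.1, 0.2, 0.3),
]
print(review_order(posts))  # ['self_harm', 'spam', 'borderline']
```

With these made-up weights, the severe post is reviewed before the viral spam, which matches the stated aim of dealing with the most damaging posts first.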

“All content violations will still receive some substantial human review, but we’ll be using this system to better prioritize [that process],” Ryan Barnes, a product manager with Facebook’s community integrity team, told reporters during a press briefing.

Facebook has shared some details on how its machine learning filters analyze posts in the past. These systems include a model named “WPIE,” which stands for “whole post integrity embeddings” and takes what Facebook calls a “holistic” approach to assessing content.

This means the algorithms judge various elements in any given post in concert, trying to work out what the image, caption, poster, etc., reveal together. If someone says they’re selling a “full batch” of “special treats” accompanied by a picture of what look to be baked goods, are they talking about Rice Krispies squares or edibles? The use of certain words in the caption (like “potent”) might tip the judgment one way or the other.

Facebook uses various machine learning algorithms to sort content, including the “holistic” assessment tool known as WPIE.
Image: Facebook

Facebook’s use of AI to moderate its platforms has come in for scrutiny in the past, with critics noting that artificial intelligence lacks a human’s capacity to judge the context of a lot of online communication. Especially with topics like misinformation, bullying, and harassment, it can be near impossible for a computer to know what it’s looking at.

Facebook’s Chris Palow, a software engineer in the company’s interaction integrity team, agreed that AI had its limits, but told reporters that the technology could still play a role in removing unwanted content. “The system is about marrying AI and human reviewers to make less total mistakes,” said Palow. “The AI is never going to be perfect.”

When asked what percentage of posts the company’s machine learning systems classify incorrectly, Palow didn’t give a direct answer, but noted that Facebook only lets automated systems work without human supervision when they are as accurate as human reviewers. “The bar for automated action is very high,” he said. Nevertheless, Facebook is steadily adding more AI to the moderation mix.

Correction: An earlier version of this article incorrectly gave Chris Palow’s name as Chris Parlow. We regret the error.



Pinterest details the AI that powers its content moderation


Pinterest this morning peeled back the curtains on the AI and machine learning technologies it’s using to combat harmful content on its platform. Leveraging algorithms to automatically detect adult content, hateful activities, medical misinformation, drugs, graphic violence, and more before it’s reported, the company says that policy-violating reports per impression have declined by 52% since fall 2019, when the technologies were first introduced. And reports for self-harm content have decreased by 80% since April 2019.

One of the challenges in building multi-category machine learning models for content safety is the scarcity of labeled data, forcing engineers to use simpler models that can’t be extended to multi-modal inputs. Pinterest solves this problem with a system trained on millions of human-reviewed Pins, consisting of both user reports and proactive model-based sampling from its Trust and Safety operations team, which assigns categories and takes action on violating content. The company also employs a Pin model trained using a mathematical, model-friendly representation of Pins based on their keywords and images, aggregated with another model to generate scores that indicate which Pinterest boards might be in violation.

“We’ve made improvements to the information derived by optical character recognition on images and have deployed an online, near-real-time, version of our system. Also new is the scoring of boards and not just Pins,” Vishwakarma Singh, head of Pinterest’s trust and safety machine learning team, told VentureBeat via email. “An impactful multi-category [model] using multi-modal inputs — embeddings and text — for content safety is a valuable insight for decision makers … We use a combination of offline and online models to get both performance and speed, providing a system design that’s a nice learning for others and generally applicable.”

Pinterest content moderation

In production, Pinterest employs a family of models to proactively detect policy-violating Pins. When enforcing policies across Pins, the platform groups together Pins with similar images and identifies them by a unique hash called “image-signature.” Models generate scores for each image-signature, and based on these scores, the same content moderation decision is applied to all Pins with the same image-signature.
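The idea of scoring once per image-signature and applying the result to every matching Pin can be sketched as follows. The threshold, the data shapes, and the function names are hypothetical; the article does not describe Pinterest’s actual code or cutoffs.

```python
from collections import defaultdict

# Hypothetical cutoff above which a signature is treated as policy-violating.
BLOCK_THRESHOLD = 0.8


def group_by_signature(pins):
    """Group Pin IDs by their image-signature. pins: (pin_id, signature) pairs."""
    groups = defaultdict(list)
    for pin_id, signature in pins:
        groups[signature].append(pin_id)
    return groups


def moderate(pins, signature_scores):
    """Apply one decision per image-signature to all Pins that share it."""
    decisions = {}
    for signature, pin_ids in group_by_signature(pins).items():
        # Signatures without a score are assumed safe in this sketch.
        score = signature_scores.get(signature, 0.0)
        decision = "block" if score >= BLOCK_THRESHOLD else "allow"
        for pin_id in pin_ids:
            decisions[pin_id] = decision
    return decisions


pins = [("p1", "sigA"), ("p2", "sigA"), ("p3", "sigB")]
scores = {"sigA": 0.93, "sigB": 0.10}
print(moderate(pins, scores))  # {'p1': 'block', 'p2': 'block', 'p3': 'allow'}
```

Scoring per signature rather than per Pin means visually identical content gets a consistent decision, and the model runs once for each distinct image rather than once for each copy.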

For example, one of Pinterest’s models identifies Pins that it believes violate the platform’s policy on health misinformation. Trained using labels from Pinterest, the model internally finds keywords or text associated with misinformation and blocks Pins with that language while at the same time identifying visual representations associated with medical misinformation. It accounts for factors like image and URL and blocks any images online across Pinterest search, the home feed, and related Pins, according to Singh.

Since users usually save thematically related Pins together as a collection on boards around topics like recipes, Pinterest deployed a machine learning model to produce scores for boards and enforce board-level moderation. A Pin model trained using only embeddings — i.e., representations — generates content safety scores for each Pinterest board. An embedding for the boards is constructed by aggregating the embeddings of the most recent Pins saved to them. When fed into the Pin model, these embeddings produce a content safety score for each board, allowing Pinterest to identify policy-violating boards without training a model for boards.
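The board-level approach can be illustrated with a small sketch: average the embeddings of a board’s most recent Pins, then pass the result to the same scorer used for individual Pins. Everything concrete here is an assumption for illustration, including the use of a mean as the aggregation, the logistic scorer standing in for the Pin model, the weights, and the `recent_n` cutoff.

```python
import math


def mean_embedding(embeddings):
    """Element-wise mean of equal-length embedding vectors."""
    dim = len(embeddings[0])
    return [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]


def pin_model_score(embedding, weights, bias=0.0):
    """Stand-in for the Pin model: a logistic score in (0, 1) from an embedding."""
    z = sum(w * x for w, x in zip(weights, embedding)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def board_score(pin_embeddings, weights, recent_n=3):
    """Score a board by feeding the mean of its most recent Pin embeddings
    into the Pin model, so no separate board model has to be trained."""
    recent = pin_embeddings[-recent_n:]  # list is assumed ordered oldest to newest
    return pin_model_score(mean_embedding(recent), weights)


# Toy 2-dimensional embeddings; real embeddings would be much larger.
board = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0], [1.0, 4.0]]
weights = [1.0, -0.5]
print(mean_embedding(board[-3:]))   # [1.0, 2.0]
print(board_score(board, weights))  # 0.5
```

Because the board embedding lives in the same vector space as Pin embeddings, the Pin model can be reused as-is, which is the property the paragraph above highlights.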

“These technologies, along with an algorithm that rewards positive content, and policy and product updates such as blocking anti-vaccination content, prohibiting culturally insensitive ads, prohibiting political ads, and launching compassionate search for mental wellness, are the foundation for making Pinterest an inspiring place online,” Singh said. “Our work has demonstrated the impact graph convolutional methods can have in a production recommender systems, as well as other graph representation learning problems at large scale, including knowledge graph reasoning and graph clustering.”


