Categories
Game

‘GoldenEye 007’ fans are creating a full game mod based on ‘The Spy Who Loved Me’

There’s a mod in the works for Nintendo 64 classic GoldenEye 007 that turns another James Bond film into a full game. Fans are building a playable version of The Spy Who Loved Me, Roger Moore’s third, and some would argue best, Bond movie.

YouTuber Graslu00 posted a playthrough video showing 11 levels of The Spy Who Loved Me 64. The mod depicts the key events and locations of the film, taking Bond from the Alps to the pyramids of Egypt and a supertanker in the Atlantic Ocean. It includes Moore’s likeness, as well as characters such as Anya Amasova (aka Agent XXX) and villain Karl Stromberg. It’s possible to run the mod on an emulator in 4K at 60 frames per second, though you can also play it on an N64 console.

It’s a work in progress, as Graslu00 notes. The build of The Spy Who Loved Me 64 that’s available is a demo of the first three levels with a peek at a planned four-player multiplayer mode. It looks like there’s quite a way for the fans working on the game to go, though. The stage select screen shows 20 levels including, curiously, Bond’s childhood home of Skyfall — that seems to be one of the multiplayer maps.

Meanwhile, there’s an official James Bond title in the works. It emerged in late 2020 that Hitman studio IO Interactive is developing a game that delves into the superspy’s origins. It’s expected to be the first official Bond game since 2012’s 007 Legends.


Categories
AI

Google releases TF-GNN for creating graph neural networks in TensorFlow


Google today released TensorFlow Graph Neural Networks (TF-GNN) in alpha, a library designed to make it easier to work with graph structured data using TensorFlow, its machine learning framework. Used in production at Google for spam and anomaly detection, traffic estimation, and YouTube content labeling, Google says that TF-GNN is designed to “encourage collaborations with researchers in industry.”

A graph is a set of objects, such as places, people, or things, together with the connections between them. It represents the relations (edges) between a collection of entities (nodes or vertices), all of which can store data. Directionality can be ascribed to the edges to describe information, traffic flow, and more.
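
As a plain-Python illustration of that definition (the node and edge names here are invented, and nothing below is specific to TF-GNN), a small directed graph can be stored as node records plus an edge list:

```python
# Tiny directed graph: each node stores data, each edge has a direction and a relation.
nodes = {
    "alice": {"type": "person", "city": "Zurich"},
    "bob":   {"type": "person", "city": "Boston"},
    "page1": {"type": "web_page", "topic": "graphs"},
}

# Each edge is (source, target, relation); direction matters.
edges = [
    ("alice", "bob",   "follows"),
    ("alice", "page1", "authored"),
    ("bob",   "page1", "liked"),
]

def neighbors(node):
    """Nodes reachable from `node` by following outgoing edges."""
    return [dst for src, dst, _ in edges if src == node]

print(neighbors("alice"))  # ['bob', 'page1']
```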

More often than not, the data in machine learning problems is structured or relational and thus can be described with a graph. Fundamental research on GNNs is decades old, but recent advances have led to great achievements in many domains, like modeling the transition of glass from a liquid to a solid and predicting pedestrian, cyclist, and driver behavior on the road.


Above: Graphs can model the relationships between many different types of data, including web pages (left), social connections (center), or molecules (right).

Image Credit: Google

Indeed, GNNs can be used to answer questions about multiple characteristics of graphs. By working at the graph level, they can try to predict aspects of the entire graph, for example identifying the presence of certain “shapes” like circles in a graph that might represent close social relationships. GNNs can also be used on node-level tasks to classify the nodes of a graph or at the edge level to discover connections between entities.

TF-GNN

TF-GNN provides building blocks for implementing GNN models in TensorFlow. Beyond the modeling APIs, the library also delivers tooling around the task of working with graph data, including a data-handling pipeline and example models.

Also included with TF-GNN is an API to create GNN models that can be composed with other types of AI models. In addition to this, TF-GNN ships with a schema to declare the topology of a graph (and tools to validate it), helping to describe the shape of training data.
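
As a rough sketch of what that looks like in practice, the snippet below builds a tiny heterogeneous graph with the alpha tensorflow_gnn package. The node-set and edge-set names and features are invented for illustration, and constructor details may shift between alpha releases, so treat this as a sketch rather than canonical usage:

```python
import tensorflow as tf
import tensorflow_gnn as tfgnn

# A small heterogeneous graph: two node types ("user", "video") and one edge type.
graph = tfgnn.GraphTensor.from_pieces(
    node_sets={
        "user": tfgnn.NodeSet.from_fields(
            sizes=tf.constant([3]),
            features={"account_age_days": tf.constant([120.0, 45.0, 300.0])}),
        "video": tfgnn.NodeSet.from_fields(
            sizes=tf.constant([2]),
            features={"length_sec": tf.constant([210.0, 95.0])}),
    },
    edge_sets={
        # Directed "watched" edges point from users to videos.
        "watched": tfgnn.EdgeSet.from_fields(
            sizes=tf.constant([3]),
            adjacency=tfgnn.Adjacency.from_indices(
                source=("user", tf.constant([0, 1, 2])),
                target=("video", tf.constant([0, 0, 1])))),
    })

# Node features are then available to GNN layers for message passing.
print(graph.node_sets["user"].features["account_age_days"])
```

From a graph structured like this, the library’s modeling APIs can pass messages over the “watched” edges and pool node states for the graph-, node-, or edge-level predictions described above.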

“Graphs are all around us, in the real world and in our engineered systems … In particular, given the myriad types of data at Google, our library was designed with heterogeneous graphs in mind,” Google’s Sibon Li, Jan Pfeifer, Bryan Perozzi, and Douglas Yarrington wrote in the blog post introducing TF-GNN.

TF-GNN adds to Google’s growing collection of TensorFlow libraries, which spans TensorFlow Privacy, TensorFlow Federated, and TensorFlow.Text. More recently, the company open-sourced TensorFlow Similarity, which trains models that search for related items — for example, finding similar-looking clothes and identifying currently playing songs.


Categories
AI

AI Weekly: WHO outlines steps for creating inclusive AI health care systems

This week, the World Health Organization (WHO) released its first global report on AI in health, along with six guiding principles for design, development, and deployment. The fruit of two years of consultations with WHO-appointed experts, the work cautions against overestimating the benefits of AI while highlighting how it could be used to improve screening for diseases, assist with clinical care, and more.

The health care industry produces an enormous amount of data. An IDC study estimates the volume of health data created annually, which hit over 2,000 exabytes in 2020, will continue to grow at a 48% rate year over year. The trend has enabled significant advances in AI and machine learning, which rely on large datasets to make predictions ranging from hospital bed capacity to the presence of malignant tumors in MRIs. But unlike other domains to which AI has been applied, the sensitivity and scale of health care data makes collecting and leveraging it a formidable challenge.
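
To put that growth rate in perspective, here is a back-of-the-envelope projection, assuming the 48% figure compounds annually from the roughly 2,000 exabytes estimated for 2020:

```python
# Rough projection of annual health-data volume under the IDC estimate.
baseline_eb = 2_000   # exabytes created in 2020 (approximate)
growth_rate = 0.48    # 48% year-over-year growth

for year in range(2020, 2026):
    volume = baseline_eb * (1 + growth_rate) ** (year - 2020)
    print(f"{year}: ~{volume:,.0f} EB")

# Under these assumptions, the annual volume roughly septuples by 2025 (~14,000 EB).
```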

The WHO report acknowledges this, pointing out that the opportunities brought about by AI are linked with risks. There are the harms that biases encoded in algorithms could cause patients, communities, and care providers. Systems trained primarily on data from people in high-income countries, for example, may not perform well for low- and middle-income patients. What’s more, unregulated use of AI might undermine the rights of patients in favor of commercial interests or of governments engaged in surveillance.

The datasets used to train AI systems that can predict the onset of conditions like Alzheimer’s, diabetes, diabetic retinopathy, breast cancer, and schizophrenia come from a range of sources. But in many cases, patients aren’t fully aware their information is included. In 2017, U.K. regulators concluded that The Royal Free London NHS Foundation Trust, a division of the U.K.’s National Health Service based in London, provided Google’s DeepMind with data on 1.6 million patients without their consent.

Regardless of the source, this data can contain bias, perpetuating inequalities in AI algorithms trained for diagnosing diseases. A team of U.K. scientists found that almost all eye disease datasets come from patients in North America, Europe, and China, meaning eye disease-diagnosing algorithms are less certain to work well for racial groups from underrepresented countries. In another study, researchers from the University of Toronto, the Vector Institute, and MIT showed that widely used chest X-ray datasets contain racial, gender, and socioeconomic biases.

Further illustrating the point, Stanford researchers found that some AI-powered medical devices approved by the U.S. Food and Drug Administration (FDA) are vulnerable to data shifts and bias against underrepresented patients. Even as AI becomes embedded in more medical devices — the FDA approved over 65 AI devices last year — the accuracy of these algorithms isn’t necessarily being rigorously studied, because they’re not being evaluated by prospective studies.

Experts argue that prospective studies, which collect test data prior to rather than concurrent with deployment, are necessary, particularly for AI medical devices because their actual use can differ from the intended use. For example, most computer-powered diagnostic systems are designed to be decision-support tools rather than primary diagnostic tools. A prospective study might reveal that clinicians are misusing a device for diagnosis, leading to outcomes that might deviate from what’s expected.

Beyond dataset challenges, models lacking peer review can encounter roadblocks when deployed in the real world. Scientists at Harvard found that algorithms trained to recognize and classify CT scans could become biased toward scan formats from certain CT machine manufacturers. Meanwhile, a Google-published whitepaper revealed challenges in implementing an eye disease-predicting system in Thailand hospitals, including issues with scan accuracy.

To limit the risks and maximize the benefits of AI for health, the WHO recommends taking steps to protect autonomy, ensure transparency and explainability, foster responsibility and accountability, and work toward inclusiveness and equity. The recommendations also include promoting well-being, safety, and the public interest, as well as AI that’s responsive and sustainable.

The WHO says redress should be available to people adversely affected by decisions based on algorithms, and also that designers should “continuously” assess AI apps to determine whether they’re aligning with expectations and requirements. In addition, the WHO recommends both governments and companies address disruptions in the workplace caused by automated systems, including training for health care workers to adapt to the use of AI.

“AI systems should … be carefully designed to reflect the diversity of socioeconomic and health care settings,” the WHO said in a press release. “They should be accompanied by training in digital skills, community engagement, and awareness-raising, especially for millions of healthcare workers who will require digital literacy or retraining if their roles and functions are automated, and who must contend with machines that could challenge the decision making and autonomy of providers and patients.”

As new examples of problematic AI in health care emerge, from widely deployed but untested algorithms to biased dermatological datasets, it’s becoming critical that stakeholders follow accountability steps like those outlined by the WHO. Not only will it foster trust in AI systems, but it could improve care for the millions of people who might be subjected to AI-powered diagnostic systems in the future.

“Machine learning really is a powerful tool, if designed correctly — if problems are correctly formalized and methods are identified to really provide new insights for understanding these diseases,” Dr. Mihaela van der Schaar, a Turing Fellow and professor of machine learning, AI, and health at the University of Cambridge and UCLA, said during a keynote at the ICLR conference in May 2020. “Of course, we are at the beginning of this revolution, and there is a long way to go. But it’s an exciting time. And it’s an important time to focus on such technologies.”

For AI coverage, send news tips to Kyle Wiggers — and be sure to subscribe to the AI Weekly newsletter and bookmark our AI channel, The Machine.

Thanks for reading,

Kyle Wiggers

AI Staff Writer

VentureBeat


Categories
Tech News

Sellful is turn-key business software for creating an entire business infrastructure in seconds

TLDR: Sellful is a white label service that puts all of your digital business tools into one place for efficient retailing, manufacturing, sales, marketing, customer relations, and more.

Trying to launch an online business is not as simple as building a website and setting it live. Sure, that may be where it all begins, but if you really want this endeavor to take off and fly, there are so many factors you have to consider. 

Do you have your marketing plan squared away? What’s your answer for handling customer relations? How does invoicing work? And do you have a structure in place to conceive, staff, and track progress on your new company’s biggest projects? 

If you’ve got answers for those questions, it’s likely the result of having access to a handful of different apps, each with its own purchase and membership fees. Or you’ve got Sellful, the service that is making some noise as a one-stop home for every feature a new digital business needs to grow and thrive.

Through its simple interface that keeps everything within arm’s reach, Sellful lets users start by creating their business website from the more than 2,300 white label templates available. With Sellful’s archive of millions of high-quality images, you can fashion a site that’s completely unique to your business. And unlike site builders like WordPress or Wix, there’s no company branding on a Sellful site…well, none except your company’s, of course.

Once the site is ready, Sellful is loaded with features to support that business. If you want to sell, your site can be an online shop as well.  You can keep track of customers with a built-in customer relations management (CRM) system. You can collect emails and phone numbers through your site, then send newsletters with important details on sales, discounts and more. 

You can create membership programs, book appointments, manage inventory, and even handle business admin duties like invoicing or collecting payment through online gateways like PayPal, Stripe and more right through your Sellful desktop.

Marketing outreach, social media tools, even project management templates all fall under the Sellful banner, offering members incredible functionality and cross-disciplinary help, all from one convenient app.

You can give the Sellful White Label Website Builder and Software universe of tools a try with its Basic plan, which includes coverage for one website and the ability to send up to 10,000 emails a month for just $79, a savings of 90 percent off the standard fee.

And for those with a broader empire in mind, you can get a Sellful Small Business Plan (2 websites with up to 50,000 emails) for $149, a Sellful ERP Plan (3 websites with 100,000 emails) for $199, and an Agency Plan (up to 10 websites with 1 million free emails a month) for only $499.

Prices are subject to change.


Categories
Security

DHS reportedly creating cybersecurity regulations for pipelines

The hack of the Colonial Pipeline — which kneecapped oil availability on the East Coast for almost two weeks — was as disastrous as it was likely preventable. A branch of the Department of Homeland Security, however, is hoping to correct course by changing the rules on cybersecurity and disclosure for Colonial and other companies in the pipeline industry.

As reported by The Washington Post, the Transportation Security Administration (yes, the same sub-branch of DHS everyone associates with taking their shoes off in airports) will be requiring pipeline companies to report breaches and other cybersecurity incidents, with additional rules on how to keep these critical infrastructure systems secure from digital threats arriving “in coming weeks.” Any sort of abnormality which could, say, cause a company to part with $4.4 million in ransom money, would need to be reported to both the TSA and the Cybersecurity and Infrastructure Security Agency (CISA).

Incidentally, guidelines already exist to keep these sorts of systems secure — following them was merely voluntary. Companies were also free to decline inspections of their systems by the TSA. (We’ve reached out to Colonial to see if it chose to duck any such inspection.)

According to an anonymous source within the agency who spoke to The Washington Post, failing to meet the forthcoming requirements is likely to result in financial penalties, though how much is unclear. They would have to be fairly substantial in order to change the essential calculus. As Wharton researchers point out, the average cost of a breach in 2017 was just north of $7 million — not a massive expenditure compared to, say, the price tag for implementing top-notch cybersecurity across a swath of legacy systems; they also found that “in the short run, the market jumps in fright after disclosure of a breach, but in a longer period of time (even just a month), there is hardly a difference between a breached and an un-breached company.” In short: a successful breach does very little to a company’s bottom line, either through immediate costs or longer-term stock valuation changes.

Essentially, TSA’s new rules will need to have substantial power to inflict financial hardship, or companies probably will not have much incentive to change their lax habits.

That these decisions are driven entirely by profits is nowhere better exemplified than by the Colonial hack itself, which did nothing at all to harm the actual systems responsible for delivering fuel: what was compromised, according to CNN, was Colonial’s billing system, and the protracted shutdown was due largely to the company being unable to determine how much customers would have owed.

Even assuming pipeline companies are broadly cooperative, the TSA is setting itself up for a Sisyphean task of overseeing over 2 million miles of pipeline with a staff — as of 2019 — of just five auditors.


Categories
Tech News

Instagram on the Web might finally allow creating posts

Although it is one of the biggest social networks in the world, Instagram is able to get away with one rather big sin. It’s still a mostly smartphone-only affair, refusing to embrace tablets and even desktops to some extent. While Instagram does have a presence on web browsers, it’s meant only for viewing and, only recently, messaging with other users. That may be changing, however, and the Web interface might finally let users make posts while still ignoring tablets.

To some extent, it does make sense that Instagram remains completely on phones. It revolves around content that comes directly from a phone’s gallery, namely photos and videos, and doesn’t make immediate sense for a desktop workflow. That said, devices and workflows have changed in the last few years and it may now make sense to support this use case.

According to mobile developer and leaker Alessandro Paluzzi, Instagram is already working on opening the doors to uploading photos and videos from a web browser. Users will have access to nearly the same options as on mobile, including filters, which suggests users will be able to post their artwork and edits from their desktops more easily, without having to move them to their phones first.

Paluzzi doesn’t disclose how he was able to enable this feature and makes no guesses on how long it will take Instagram to make it public. He also makes no mention of whether Instagram Stories are supported in this new workflow.

Unfortunately, there is still no sign of proper tablet support. With tablets like the iPad Pro cementing themselves as powerful creation tools, an Instagram app that takes advantage of that makes perfect sense to everyone except Instagram. This browser-based feature could be a temporary workaround, presuming it even works on mobile browsers.


Categories
Tech News

Automate creating on-brand marketing creatives for social media with this tool

TLDR: The RelayThat Design app automates all your graphic design work to unify project looks, create multiple versions, and generate agency-level creative ads and social posts in minutes.

If you’re a driven entrepreneur, you’re likely many things, from a grinding workhorse to a strong salesperson to a gifted marketer. With everything on your plate and all the skills you need to excel, design expert or artistic visionary isn’t always going to find a home on that list.

However, in our brand-centric world of digital marketing, your ads and social media posts have to look every bit as professional as anything from Apple, Coca-Cola, Nike, or any of the other global big boys. And maintaining a consistent look and tone to all that content across multiple online platforms is hugely important in crafting a concise, effective marketing effort.

RelayThat Design ($59.99, over 90 percent off, from TNW Deals) can run point on that effort for your project, sharpening and unifying any marketing message with eye-catching, professional marketing creative elements usable anywhere you need to be talking up your product or service.

RelayThat is a graphic design software app that automatically takes over a lot of the heavy lifting when it comes to creating an ad, a social media post, a web graphic, or any piece of design art.

You start with your brand, maybe a logo and color scheme, and possibly a few key selling points. From that, RelayThat incorporates your key elements into one of more than 2,000 agency-level design templates, each resulting in perfectly sized and optimized graphics ready for posting to each social media channel.

While the RelayThat templates are all professional-grade, there’s also room for customization. RelayThat is home to a copyright-free stock image and icons library of more than 3 million elements for crafting just the message you want.

If you’re unsure, RelayThat is full of tools to help take the guesswork out of design work. That includes libraries of the top-performing color and font combinations, one-click resizing to remix layouts to perfectly fit any advertising or social media channel, and unified brand assets that will keep the same design aesthetics for a specific brand across any and all uses.

Everything is managed through the easy RelayThat portal, making it simple and efficient to collaborate with other designers, employees, or virtual assistants.

Right now, you can try out the time-saving advantages of the RelayThat Design app as part of this lifetime subscription deal. That takes the regular $720 price down significantly to only $59.99.

Prices are subject to change.


Categories
AI

Deepfake bots on Telegram make the work of creating fake nudes dangerously easy

Researchers have discovered a “deepfake ecosystem” on the messaging app Telegram centered around bots that generate fake nudes on request. Users interacting with these bots say they’re mainly creating nudes of women they know using images taken from social media, which they then share and trade with one another in various Telegram channels.

The investigation comes from security firm Sensity, which focuses on what it calls “visual threat intelligence,” particularly the spread of deepfakes. Sensity’s researchers found more than 100,000 images have been generated and shared in public Telegram channels up to July 2020 (meaning the total number of generated images, including those never shared and those made since July, is much higher). Most of the users in these channels, roughly 70 percent, come from Russia and neighboring countries, says Sensity. The Verge was able to confirm that many of the channels investigated by the company are still active.

The bots are free to use, but they generate fake nudes with watermarks or only partial nudity. Users can then pay a fee equal to just a few cents to “uncover” the pictures completely. One “beginner rate” charges users 100 rubles (around $1.28) to generate 100 fake nudes without watermarks over a seven day period. Sensity says “a limited number” of the bot-generated images feature targets “who appeared to be underage.”

Both The Verge and Sensity have contacted Telegram to ask why it permits this content on its app but have yet to receive replies. Sensity says it’s also contacted the relevant law enforcement authorities.

In a poll in one of the main channels for sharing deepfake nudes (originally posted in both Russian and English), most users said they wanted to generate images of women they knew in “real life.”
Image: Sensity

The software being used to generate these images is known as DeepNude. It first appeared on the web last June, but its creator took down its website hours after it received mainstream press coverage, saying “the probability that people will misuse it is too high.” However, the software has continued to spread over backchannels, and Sensity says DeepNude “has since been reverse engineered and can be found in enhanced forms on open source repositories and torrenting websites.” It’s now being used to power Telegram bots, which handle payments automatically to generate revenue for their creators.

DeepNude uses an AI technique known as generative adversarial networks, or GANs, to generate fake nudes, with the resulting images varying in quality. Most are obviously fake, with smeared or pixellated flesh, but some can easily be mistaken for real pictures.
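
For readers unfamiliar with the technique, the sketch below shows the bare structure of a GAN in Keras: a generator that maps random noise to an image and a discriminator that scores how real an image looks. The layer sizes and 28x28 grayscale output are generic teaching choices and have nothing to do with DeepNude’s actual architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers

LATENT_DIM = 100  # size of the random noise vector fed to the generator

# Generator: noise -> 28x28 grayscale image.
generator = tf.keras.Sequential([
    layers.Dense(7 * 7 * 128, activation="relu", input_shape=(LATENT_DIM,)),
    layers.Reshape((7, 7, 128)),
    layers.Conv2DTranspose(64, 4, strides=2, padding="same", activation="relu"),
    layers.Conv2DTranspose(1, 4, strides=2, padding="same", activation="sigmoid"),
])

# Discriminator: image -> probability that the image is real.
discriminator = tf.keras.Sequential([
    layers.Conv2D(64, 4, strides=2, padding="same", activation="relu",
                  input_shape=(28, 28, 1)),
    layers.Conv2D(128, 4, strides=2, padding="same", activation="relu"),
    layers.Flatten(),
    layers.Dense(1, activation="sigmoid"),
])

# Training pits the two against each other: the discriminator learns to tell real
# images from generated ones, while the generator learns to fool it. The realism
# of the output depends heavily on the training data and compute available.
```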

Since before the arrival of Photoshop, people have created nonconsensual fake nudes of women. There are many forums and websites currently dedicated to this activity using non-AI tools, with users sharing nudes of both celebrities and people they know. But deepfakes have led to the faster generation of more realistic images. Now, automating this process via Telegram bots makes generating fake nudes as easy as sending and receiving pictures.

“The key difference is accessibility of this technology,” Sensity’s CEO and co-author of the report, Giorgio Patrini, told The Verge. “It’s important to notice that other versions of the AI core of this bot, the image processing and synthesis, are freely available on code repositories online. But you need to be a programmer and have some understanding of computer vision to get them to work, other than powerful hardware. Right now, all of this is irrelevant as it is taken care of by the bot embedded into a messaging app.”

Sensity’s report says it’s “reasonable to assume” that most of the people using these bots “are primarily interested in consuming deepfake pornography” (which remains a popular category on porn sites). But these images and videos can also be used for extortion, blackmail, harassment, and more. There have been a number of documented cases of women being targeted using AI-generated nudes, and it’s possible some of those creating nudes using the bots on Telegram are doing so with these motives in mind.

Patrini told The Verge that Sensity’s researchers had not seen direct evidence of the bot’s creations being used for these purposes, but said the company believed this was happening. He added that while the political threat of deepfakes had been “miscalculated” (“from the point of view of perpetrators, it is easier and cheaper to resort to photoshopping images and obtain a similar impact for spreading disinformation, with less effort”), it’s clear the technology poses “a serious threat for personal reputation and security.”


Categories
AI

5 steps to creating a responsible AI Center of Excellence

To practice trustworthy or responsible AI (AI that is truly fair, explainable, accountable, and robust), a number of organizations are creating in-house centers of excellence. These are groups of trustworthy AI stewards from across the business that can understand, anticipate, and mitigate any potential problems. The intent is not to necessarily create subject matter experts but rather a pool of ambassadors who act as point people.

Here, I’ll walk you through a set of best practices for establishing an effective center of excellence in your own organization. Any larger company should have such a function in place.

1. Deliberately connect groundswells

To form a Center of Excellence, notice groundswells of interest in AI and AI ethics in your organization and conjoin them into one space to share information. Consider creating a Slack channel or some other curated online community for the various cross-functional teams to share thoughts, ideas, and research on the subject. The groups of people could be from various geographies and/or various disciplines. For example, your organization may have a number of minority groups with a vested interest in AI and ethics that could share their viewpoints with data scientists who are configuring tools to help mine for bias. Or perhaps you have a group of designers trying to infuse ethics into design thinking who could work directly with those in the organization who are vetting governance.

2. Flatten hierarchy

This group has more power and influence as a coalition of changemakers. There should be a rotating leadership model within an AI Center of Excellence; everyone’s ideas count — everyone is welcome to share and to co-lead. A rule of engagement is that everyone has each other’s back.

3. Source your force

Begin to source your AI ambassadors from this Center of Excellence — put out a call to arms.  Your ambassadors will ultimately help to identify tactics for operationalizing your trustworthy AI principles including but not limited to:

A) Explaining to developers what an AI lifecycle is. The AI lifecycle includes a variety of roles, performed by people with different specialized skills and knowledge who collectively produce an AI service. Each role contributes in a unique way, using different tools. A key requirement for enabling AI governance is the ability to collect model facts throughout the AI lifecycle. This set of facts can be used to create a fact sheet for the model or service. (A fact sheet is a collection of relevant information about the creation and deployment of an AI model or service.) Facts could range from information about the purpose and criticality of the model to measured characteristics of the dataset, model, or service, to actions taken during the creation and deployment process of the model or service. Here is an example of a fact sheet that represents a text sentiment classifier (an AI model that determines which emotions are being exhibited in text). Think of a fact sheet as the basis for what could be considered a “nutrition label” for AI. Much like you would pick up a box of cereal in a grocery store to check for sugar content, you might do the same when choosing a loan provider, based on which AI it uses to determine the interest rate on your loan (a hypothetical fact sheet is sketched after this list).

B) Introducing ethics into design thinking for data scientists, coders, and AI engineers. If your organization currently does not use design thinking, then this is an important foundation to introduce.  These exercises are critical to adopt into design processes. Questions to be answered in this exercise include:

  • How do we look beyond the primary purpose of our product to forecast its effects?
  • Are there any tertiary effects that are beneficial or should be prevented?
  • How does the product affect single users?
  • How does it affect communities or organizations?
  • What are tangible mechanisms to prevent negative outcomes?
  • How will we prioritize the preventative implementations (mechanisms) in our sprints or roadmap?
  • Can any of our implementations prevent other negative outcomes identified?

C) Teaching the importance of feedback loops and how to construct them.

D) Advocating for dev teams to source separate “adversarial” teams to poke holes in assumptions made by coders, ultimately to determine unintended consequences of decisions (aka ‘Red Team vs Blue Team‘ as described by Kathy Baxter of Salesforce).

E) Enforcing truly diverse and inclusive teams.

F) Teaching cognitive and hidden bias and its very real effect on data.

G) Identifying, building, and collaborating with an AI ethics board.

H) Introducing tools and AI engineering practices to help the organization mine for bias in data and promote explainability, accountability, and robustness.
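
To make the fact-sheet idea from item A concrete, here is a hypothetical sketch of the kind of metadata such a “nutrition label” might capture. The fields and values are purely illustrative; they are not IBM’s FactSheets schema or any other standard:

```python
from dataclasses import dataclass, field

@dataclass
class ModelFactSheet:
    """Hypothetical 'nutrition label' collected across the AI lifecycle."""
    model_name: str
    purpose: str
    criticality: str
    training_data: str
    evaluation_metrics: dict = field(default_factory=dict)
    known_limitations: list = field(default_factory=list)
    last_bias_audit: str = "not yet audited"

# Illustrative entry for the text sentiment classifier mentioned in item A.
sentiment_facts = ModelFactSheet(
    model_name="text-sentiment-classifier",
    purpose="Detect emotions expressed in short customer-feedback text",
    criticality="Advisory only; a human reviews flagged messages",
    training_data="Licensed English-language reviews, 2015-2020",
    evaluation_metrics={"accuracy": 0.91, "false_positive_rate": 0.06},
    known_limitations=["Not evaluated on non-English text",
                       "Sarcasm is frequently misclassified"],
    last_bias_audit="2021-03",
)
```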

These AI ambassadors should be excellent, compelling storytellers who can help build the narrative as to why people should care about ethical AI practices.

4. Begin teaching trustworthy AI training at scale

This should be a priority. Curate trustworthy AI learning modules for every individual of the workforce, customized in breadth and depth based on various archetype types. One good example I’ve heard of on this front is Alka Patel, head of AI ethics policy at the Joint Artificial Intelligence Center (JAIC). She has been leading an expansive program promoting AI and data literacy and, per this DoD blog, has incorporated AI ethics training into both the JAIC’s DoD Workforce Education Strategy and a pilot education program for acquisition and product capability managers. Patel has also modified procurement processes to make sure they comply with responsible AI principles and has worked with acquisition partners on responsible AI strategy.

5. Work across uncommon stakeholders

Your AI ambassadors will work across silos to ensure that they bring new stakeholders to the table, including those whose work is dedicated to diversity and inclusivity, HR, data science, and legal counsel. These people may NOT be used to working together! How often are CDIOs invited to work alongside a team of data scientists? But that is exactly the goal here.

Granted, if you are a small shop, your force may be only a handful of people. There are certainly similar steps you can take to ensure you are a steward of trustworthy AI too. Ensuring that your team is as diverse and inclusive as possible is a great start. Have your design and dev team incorporate best practices into their day-to-day activities.  Publish governance that details what standards your company adheres to with respect to trustworthy AI.

By adopting these best practices, you can help your organization establish a collective mindset that recognizes that ethics is an enabler, not an inhibitor. Ethics is not an extra step or hurdle to overcome when adopting and scaling AI but a mission-critical requirement for organizations. You will also increase trustworthy-AI literacy across the organization.

As Francesca Rossi, IBM’s AI and ethics leader, stated: “Overall, only a multi-dimensional and multi-stakeholder approach can truly address AI bias by defining a values-driven approach, where values such as fairness, transparency, and trust are the center of creation and decision-making around AI.”

Phaedra Boinodiris, FRSA, is an executive consultant on the Trust in AI team at IBM and is currently pursuing her PhD in AI and ethics. She has focused on inclusion in technology since 1999. She is also a member of the Cognitive World Think Tank on enterprise AI.


Categories
AI

AI Weekly: The challenges of creating open source AI training datasets

In January, AI research lab OpenAI released Dall-E, a machine learning system capable of creating images to fit any text caption. Given a prompt, Dall-E generates photos for a range of concepts, including cats, logos, and glasses.

The results are impressive, but training Dall-E required building a large-scale dataset that OpenAI has so far opted not to make public. Work is ongoing on an open source implementation, but according to Connor Leahy, one of the data scientists behind the effort, development has stalled because of the challenges in compiling a corpus that respects both moral and legal norms.

“There’s plenty of not-legal-to-scrape data floating around that isn’t [fair use] on platforms like social media, Instagram first and foremost,” Leahy, who’s a member of the volunteer AI research effort EleutherAI, told VentureBeat. “You could scrape that easily at large scale, but that would be against the terms of service, violate people’s consent, and probably scoop up illegal data both due to copyright and other reasons.”

Indeed, creating AI training datasets in a privacy-preserving, ethical way remains a major blocker for researchers in the AI community, particularly those who specialize in computer vision. In January 2019, IBM released a corpus designed to mitigate bias in facial recognition algorithms that contained nearly a million photos of people from Flickr. But neither the photographers nor the subjects of the photos were notified by IBM that their work would be included. Separately, an earlier version of ImageNet, a dataset used to train AI systems around the world, was found to contain photos of naked children, porn actresses, college parties, and more — all scraped from the web without those individuals’ consent.

“There are real harms that have emerged from casual repurposing, open-sourcing, collecting, and scraping of biometric data,” said Liz O’Sullivan, cofounder and technology director at the Surveillance Technology Oversight Project, a nonprofit organization litigating and advocating for privacy. “[They] put people of color and those with disabilities at risk of mistaken identity and police violence.”

Techniques that rely on synthetic data to train models might lessen the need to create potentially problematic datasets in the first place. According to Leahy, while there’s usually a minimum dataset size needed to achieve good performance on a task, it’s possible, to a degree, to “trade compute for data” in machine learning. In other words, simulation and synthetic data, like AI-generated photos of people, could take the place of real-world photos from the web.

“You can’t trade infinite compute for infinite data, but compute is more fungible than data,” Leahy said. “I do expect for niche tasks where data collection is really hard, or where compute is super plentiful, simulation to play an important role.”

O’Sullivan is more skeptical that synthetic data will generalize well from lab conditions to the real world, pointing to existing research on the topic. In a study last January, researchers at Arizona State University showed that when an AI system trained on a dataset of images of engineering professors was tasked with creating faces, 93% were male and 99% white. The system appeared to have amplified the dataset’s existing biases — 80% of the professors were male and 76% were white.

On the other hand, startups like Hazy and Mostly AI say that they’ve developed methods for controlling the biases of data in ways that actually reduce harm. A recent study published by a group of Ph.D. candidates at Stanford claims the same — the coauthors say their technique allows them to weight certain features as more important in order to generate a diverse set of images for computer vision training.

Ultimately, even where synthetic data might come into play, O’Sullivan cautions that any open source dataset could put people in that set at greater risk. Piecing together and publishing a training dataset is a process that must be undertaken thoughtfully, she says — or not at all, where doing so might result in harm.

“There are significant worries about how this technology impacts democracy and our society at large,” O’Sullivan said.

For AI coverage, send news tips to Khari Johnson and Kyle Wiggers and AI editor Seth Colaner — and be sure to subscribe to the AI Weekly newsletter and bookmark our AI channel, The Machine.

Thanks for reading,

Kyle Wiggers

AI Staff Writer

VentureBeat
