AI and computer vision power growing shop-and-go platform


AI and computer vision were not necessarily top-of-mind for Sodexo, a food and facilities management company that runs over 400 university dining programs, which was looking for a future-forward, seamless experience to offer students instead of the usual buffet meal options.

All the company knew was that it wanted something like Amazon Go’s cashierless, shop-and-go stores: places where shoppers can walk in, pick items off the shelves, and leave without standing in line at the cashier or suffering through swiping codes at the self-checkout.

“Students today want things they can partially or fully prepare in their room or apartment, with organic, highly-local options,” said Kevin Rettle, global vice president product development and digital innovation at Sodexo. “We also wanted to remove friction, but many solutions still require the interaction of the guest with a cashier – this generation really doesn’t want to talk to a lot of people in their service interactions.” 

For the University of Denver, Sodexo chose the San Jose-based AiFi, which offers a frictionless and cashierless AI-powered retail solution. Its flexibility (the company says it can deploy two stores per week) and diverse locations (sports stadiums, music festivals, grocery store chains, college campuses and more) make it unique, explained Steve Gu, who cofounded AiFi in 2016 with his wife, Ying Zheng. Both Gu and Zheng have Ph.D.s in computer vision and spent time at Apple and Google.

AiFi, which is powered only by cameras and computer vision technology, announced today that it now boasts a total of 80 checkout-free stores worldwide, partnering with retailers including Carrefour, Aldi, Loop and Verizon. It has also opened 53 Zabka stores in Poland and 2 NFL stores. Gu maintains this is an industry benchmark for how this technology can scale in a way that Amazon Go, which has more than 42 stores, cannot.

Cameras and computer vision, not sensors

Amazon Go’s stores are retrofitted with specialized cameras, sensors, and weighted shelves, Gu explained. “That makes the solution very expensive and hard to scale,” he said. Instead, AiFi uses the “cheapest-possible off-the-shelf cameras,” combined with what he says is the real power: Computer vision. 

AiFi deploys sophisticated AI models through a large number of cameras placed across the ceiling, Gu said, in order to understand everything happening in the shop. Cameras track customers throughout their shopping journey, while computer vision recognizes products and detects different activities, including putting items onto or grabbing items off the shelves.

Beneath the platform’s hood are neural network models specifically developed for people-tracking as well as activity and product recognition. AiFi also developed advanced calibration algorithms that allow the company to re-create the shopping environment in 3D.

AiFi also leverages simulated datasets. “We spend quite a lot of effort building those simulated environments so we can train the AI algorithms and the models inside them,” Gu said. “That really helps us develop those models faster and make them more scalable.” 

In a simulated world, he explained, you can easily adjust human shapes and characteristics, as well as the shelf layout and the look of the product. You can create a cluttered, crowded store environment or one that is neat and orderly. “Things that cannot be done in the real world can be easily done in a simulated world,” he said. “The AI can learn about those scenarios and will then be able to perform or outperform in a real setting.” 

Computer vision that is constantly evolving

AiFi’s system is evolving and will improve over time, Gu continued, citing current challenges such as the platform’s difficulty recognizing small items like gum or lipstick.

“If they are not placed in the right place, it’s very hard for the computer vision to discern what it is,” he said. There are also issues related to items with similar looks and textures. “If they are placed together in adjacent spaces it sometimes causes confusion for the cameras and computer vision to recognize these products,” he said. “But the good thing is that it’s not purely based on the visual texture – you also have the 3D scene geometry, the location, the context as well.”  

There also are current limitations to the size of the store and the number of people it can track. “The question is can the solution also be scalable to super centers of 100,000 square feet?” he said. “Also, the system is able to track hundreds of people shopping simultaneously in a shop environment. But in order for that to further scale, to track thousands of people, with very complex shopping behavior, that’s something that is still a work in progress.”

To enter an AiFi-powered store, shoppers don’t need a biometric scan or an AiFi app — they can swipe a credit card or use the retailer’s app. At the University of Denver, for example, Sodexo wanted a partner that was agnostic to the front end. “We were able to use our wallet and payment processing, and tie the AiFi technology, the cameras, and the AI into our system,” said Rettle.

Consumer adoption is key

“From a product ownership perspective, you always kind of hold your breath. Is it going to work?” he said. But ultimately, at the University of Denver the students immediately took to the AiFi concept.

“We didn’t have to teach any of the students what to do,” he said. “They get it without having a bunch of prompts.”

Critics in the retail space also predicted the AiFi technology would be a “loss-prevention nightmare — that the students will figure out how to game the system,” Rettle said. Instead, the current accuracy rate for the AiFi solution is 98.3% and the shrink rate (what shoppers walk out without paying for) has actually declined, he said.

Some products don’t quite work yet with AiFi’s solution, Rettle admits, including college and fan “swag.” “The platform still has to understand consumer behavior around that, which will certainly evolve with the technology,” he said.

Rettle also said he doesn’t envision a campus or stadium that could shift to 100% autonomous retail. “For us it’s something that complements,” he said. “But I see a strong future in terms of being able to continue to deploy and drive ubiquity with the solution based on consumer acceptance.”

For Gu, AiFi’s potential is “huge,” with over a dozen new stores in the works and a growing partnership with Microsoft as an independent software vendor partner (AiFi runs its solution on Azure). “You’re going to see a lot of autonomous retail in a variety of verticals — not just stadiums, festivals and universities, but offices, movie theaters and other spaces,” he said.

VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact. Learn more about membership.

Repost: Original Source and Author Link


Razer made a Snapdragon G3x handheld to show us Qualcomm’s gaming vision

Since Nintendo put games in consumers’ pockets with the Game & Watch, gamers have been seeking the ideal mobile gaming machine. From the Nintendo Game Boy to the Sega Game Gear on forward to the NVIDIA SHIELD and the Nintendo Switch, we’ve been seeing the same basic idea played over and over again. The next entry in this device category comes from both Qualcomm and Razer. On one hand we have the Snapdragon G3x Gen 1 Gaming Platform, a “new category of gaming devices” – on the other, we have a device built with all the best bits of that platform in mind, made by Razer.

Qualcomm wants to show off the powers of their latest generation of processors and tech, so they’ve created a “platform” that manufacturers can use to jump off into a futuristic gaming device with minimal effort. Razer is the first to step up to the plate with efforts that’ve resulted in the Snapdragon G3x Handheld Gaming Developer Kit.

The “developer kit” in this case is a piece of hardware – made only for developers. It looks like a handheld gaming device, running Android, but it’s not made for the average consumer. This isn’t like that time NVIDIA made the first SHIELD. It’s more like every other time Qualcomm has made a “developer kit” device so that developers can get their hands on the technology that Qualcomm hopes they’ll demand from manufacturers in the near future.

The dev kit

The Snapdragon G3x Handheld Gaming Developer Kit is a handheld gaming device with a 6.65-inch Full HD+ OLED display with up to 120Hz refresh rate and 10-bit HDR. This device is powered by the Snapdragon G3x chipset and represents all the key features and capabilities of the Snapdragon G3x gaming platform.

This device has 5G mmWave connectivity (with the right SIM card, of course), USB-C for accessories, and USB-C compatibility with DisplayPort (to HDMI), with support for 4K HDR output on a larger display.

There’s a front-facing FHD 1080p webcam (5MP), and a set of hardware and touchscreen controls. You’ll find the XYAB control buttons on the right, along with a hardware start button, and both Select and Menu on the left with a cross directional pad. There are two joysticks, and two front-facing speakers supported by Snapdragon Sound Technology.

Inside are “advanced haptics” from Lofelt – with a dedicated Haptics Engine so you’ll feel all the rumbles. This device has support for all the most updated Qualcomm-developed gaming features like Qualcomm Game Quick Touch (touch refresh), Qualcomm Game Color Plus, Qualcomm Game Smoother, and Qualcomm FastConnect Subsystem with WiFi 6E.

How can I get one?

If you’re not a game developer, you might be asking: Why should I care about this device and this gaming platform? Qualcomm won’t be releasing the Developer Kit to the general public. Instead, developers will get the device so that next-generation high-powered mobile games can be developed, and next-generation game streaming experiences can be envisioned.

Qualcomm’s plan here is to get the developer kit into developer hands, to get developers to create awesome games that work best on this platform, and then, the most important step: manufacturing. Manufacturers – gaming brands, smartphone makers, and others – can create final consumer hardware.

So one day we might have a proper gaming device that uses all the best parts of the Snapdragon G3x Gaming Platform so the everyday consumer can play at home and on-the-go. Until then, there’ll be Snapdragon 8 Gen 1 devices in early 2022 for your not-just-gaming-focused smartphone pleasure!



Video-level computer vision advances business insights

This article was contributed by Can Kocagil, data scientist at OREDATA.

From spatial to spatiotemporal visual processing

Instance-based classification, segmentation, and object detection in images are fundamental problems in computer vision. Unlike image-level information retrieval, video-level problems aim at detecting, segmenting, and tracking object instances in the spatiotemporal domain, which has both space and time dimensions.

Video domain learning is a crucial task for spatiotemporal understanding in camera and drone-based systems, with applications in video editing, autonomous driving, pedestrian tracking, augmented reality, robot vision, and a lot more. Furthermore, it helps us decode raw spatiotemporal data into actionable insights, as video has richer content than purely spatial visual data. With the addition of the temporal dimension to our decoding process, we get further information about

  • Motion
  • Viewpoint variations
  • Illuminations
  • Occlusions
  • Deformations
  • Local ambiguities

from the video frames. Because of this, video-level information retrieval has gained popularity as a research area and draws the community toward research on video understanding.

Conceptually speaking, video-level information retrieval algorithms are mostly adapted from image-level processes by adding additional heads to capture temporal information. Aside from simpler video-level classification and regression tasks, video object detection, video object tracking, video captioning, and video instance segmentation are the most common tasks.

To start with, let’s recall the image-level instance segmentation problem.

Image-level instance segmentation

Instance segmentation not only groups pixels into different semantic classes, but also groups them into different object instances. A two-stage paradigm is usually adopted, which first generates object proposals using a Region Proposal Network (RPN), and then predicts object bounding boxes and masks using aggregated RoI features. Different from semantic segmentation, which segments different semantic classes only, instance segmentation also segments the different instances of each class.


Above: Left figure: Semantic segmentation. Right figure: Instance segmentation.
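To make the difference between the two outputs concrete, here is a minimal sketch in plain Python. It does not implement the two-stage RPN approach described above. Instead it assumes a tiny hand-written semantic mask and splits it into instances with 4-connected component labeling, which is only a stand-in: a real instance segmentation model predicts masks directly and can separate touching objects, while this sketch cannot. The function name and the toy mask are hypothetical.

```python
from collections import deque


def instances_from_semantic(mask, background=0):
    """Split a semantic mask into instances via 4-connected components.

    Illustration only: pixels of the same class that touch each other
    end up in the same instance, unlike a learned model.
    """
    rows, cols = len(mask), len(mask[0])
    instance_ids = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background pixels and pixels that already have an ID.
            if mask[r][c] == background or instance_ids[r][c]:
                continue
            next_id += 1
            instance_ids[r][c] = next_id
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and instance_ids[ny][nx] == 0
                            and mask[ny][nx] == mask[r][c]):
                        instance_ids[ny][nx] = next_id
                        queue.append((ny, nx))
    return instance_ids, next_id


# Semantic mask: every "person" pixel carries the same class label, 1.
semantic = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
]

# Instance mask: the two separate blobs receive different IDs, 1 and 2.
instance_ids, count = instances_from_semantic(semantic)
```

In the semantic mask both blobs are indistinguishable, while the instance mask tells them apart, which is the distinction the figure illustrates.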

Video classification

The video classification task is a direct adaptation of image classification to the video domain. Instead of giving images as inputs, video frames are given to the model to learn from. By nature, the sequences of images that are temporally correlated are given to learning algorithms that incorporate features of both spatial and temporal visual information to produce classification scores.

The core idea is that, given specific video frames, we want to identify the type of video from pre-defined classes.
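One simple baseline for this idea is to score each frame separately and average the scores over time. The sketch below assumes the per-frame class scores already come from some image classifier. The class names and numbers are made up, and averaging is only one of many ways to combine temporal information, not necessarily what a production model does.

```python
def classify_video(frame_scores, class_names):
    """Average per-frame class scores over time and pick the top class."""
    num_frames = len(frame_scores)
    mean_scores = [
        sum(frame[i] for frame in frame_scores) / num_frames
        for i in range(len(class_names))
    ]
    best = max(range(len(class_names)), key=lambda i: mean_scores[i])
    return class_names[best], mean_scores


# Hypothetical pre-defined classes and per-frame scores (one row per frame).
class_names = ["cooking", "sports", "news"]
frame_scores = [
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
    [0.1, 0.8, 0.1],
]

label, mean_scores = classify_video(frame_scores, class_names)
```

Models that learn temporal features directly, rather than pooling independent frame scores, can capture motion cues that this baseline ignores.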

Video captioning

Video captioning is the task of generating captions for a video by understanding its actions and events, which helps retrieve videos efficiently through text. The idea here is that, given specific video frames, we want to generate natural language that describes the concept and context of the video.


Above: Video captioning example

Image Credit: Can Kocagil

Video captioning is a multidisciplinary problem that requires algorithms from both computer vision (to extract features) and natural language processing (to map extracted features to natural language).

Video object detection (VOD)

Video object detection aims to detect objects in videos, a task first proposed as part of the ImageNet visual challenge. Even though associating detections and assigning identities improves detection quality, the challenge is limited to spatially preserved evaluation metrics for per-frame detection and does not require joint object detection and tracking. In other words, unlike the video-level semantic tasks described below, it involves no joint detection, segmentation, and tracking.


Above: Video object detection

Image Credit: Can Kocagil

The difference between image-level object detection and video object detection is that the machine learning model receives a time series of images, which carries temporal information that a single image lacks.

Video object tracking (VOT)

Video object tracking is the process of both localizing the objects and tracking them across the video. Given an initial set of detections in the first frame, the algorithm generates a unique ID for each object in each timestamp and tries to successfully match them across the video. For instance, if I say that the particular object has an ID of “P1” in the first frame, the model tries to predict the ID of “P1” of that particular object in the remaining frames.

Video object tracking tasks are generally categorized as detection-based and detection-free tracking approaches. In detection-based tracking algorithms, objects are jointly detected and tracked such that the tracking part improves the detection quality, whereas in detection-free approaches we’re given an initial bounding box and try to track that object across video frames.


Above: Video object tracking
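The ID-matching idea can be sketched with a greedy intersection-over-union (IoU) tracker. This is an assumption chosen for brevity, not the method of any particular tracker: boxes are hypothetical `(x1, y1, x2, y2)` tuples that some detector is assumed to have produced per frame, the `0.3` threshold is arbitrary, and tracks that find no match in a frame are simply dropped.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def track(frames, threshold=0.3):
    """Assign IDs such as "P1" by greedy IoU matching with the previous frame."""
    tracks = {}      # track ID -> box in the previous frame
    next_id = 1
    results = []
    for boxes in frames:
        assigned = {}
        unmatched = dict(tracks)
        for box in boxes:
            best_id, best_iou = None, threshold
            for track_id, prev_box in unmatched.items():
                score = iou(box, prev_box)
                if score >= best_iou:
                    best_id, best_iou = track_id, score
            if best_id is None:
                # No sufficiently overlapping track: start a new identity.
                best_id = f"P{next_id}"
                next_id += 1
            else:
                del unmatched[best_id]
            assigned[best_id] = box
        tracks = assigned
        results.append(assigned)
    return results


# Hypothetical detections for three consecutive frames.
frames = [
    [(0, 0, 10, 10), (50, 50, 60, 60)],
    [(52, 50, 62, 60), (1, 0, 11, 10)],
    [(2, 0, 12, 10), (100, 100, 110, 110)],
]
results = track(frames)
```

Here the object labeled "P1" in the first frame keeps that ID in the following frames even though its box moves and the detection order changes, while the far-away box in the last frame starts a new track.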

Video instance segmentation (VIS)

Video instance segmentation is a recently introduced computer vision research topic that aims at joint detection, segmentation, and tracking of instances in the video domain. Because the task is supervised, it requires high-quality human annotations for bounding boxes and binary segmentation masks with predefined categories. It requires both segmentation and tracking, which makes it more challenging than image-level instance segmentation. Hence, as opposed to the previous fundamental computer vision tasks, video instance segmentation requires multidisciplinary and aggregated approaches. VIS is like a contemporary all-in-one computer vision task that combines several general vision problems.


Above: Video instance segmentation prediction

Image Credit: Can Kocagil

Knowledge brings value: Video-level information retrieval in action

Acknowledging the technical boundaries of video-level information retrieval tasks will improve the understanding of business concerns and customer needs from a practical perspective. For example, when a client says, “we have videos and want to extract only the locations of pedestrians from the videos,” you’ll recognize that your task is video object detection. What if they want to both localize and track them in videos? Then your problem is translated to the video object tracking task. Let’s say that they also want to segment them across videos. Your task is now video instance segmentation. However, if a client says that they want to generate automatic captions for videos, from a technical point of view, your problem can be formulated as video captioning. Understanding the scope of the project and drawing technical business requirements depends on the kind of insights clients want to derive, and it is crucial for technical teams to formulate the issue as an optimization problem.
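The mapping from client requests to tasks described above can be written down as a small decision function. This is only a sketch of the reasoning in this section: the function and parameter names are hypothetical, and falling back to video classification when nothing else is requested is an assumption.

```python
def choose_task(localize=False, track=False, segment=False, caption=False):
    """Map what a client wants from their videos to a video-level task."""
    if caption:
        return "video captioning"
    if segment:
        # Segmenting instances across a video implies detection and tracking.
        return "video instance segmentation"
    if track:
        return "video object tracking"
    if localize:
        return "video object detection"
    # Assumption: with no localization need, label the whole video.
    return "video classification"


# "We want to extract only the locations of pedestrians."
task_locations = choose_task(localize=True)

# "We want to both localize and track them."
task_tracking = choose_task(localize=True, track=True)

# "We also want to segment them across videos."
task_segmentation = choose_task(localize=True, track=True, segment=True)
```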





MLOps platform Landing AI raises $57M to help manufacturers adopt computer vision


Palo Alto, California-based Landing AI, the AI startup led by Andrew Ng — the cofounder of Google Brain, one of Google’s AI research divisions — today announced that it raised $57 million in a series A funding round led by McRock Capital. In addition, Insight Partners, Taiwania Capital, Canadian Pension Plan Investment Board, Intel Capital, Samsung Catalyst Fund, Far Eastern Group’s DRIVE Catalyst, Walsin Lihwa, and AI Fund participated, bringing Landing AI’s total raised to around $100 million.

The increased use of AI in manufacturing is dovetailing with the broader corporate sector’s embrace of digitization. According to Google Cloud, 76% of manufacturing companies turned to data and analytics, cloud, and AI technologies due to the pandemic. As pandemic-induced challenges snarl the supply chain, including skilled labor shortages and transportation disruptions, the adoption of AI is likely to accelerate. Deloitte reports that 93% of companies believe that AI will be a pivotal component in driving growth and innovation in manufacturing.

Landing AI was founded in 2017 by Ng, an adjunct professor at Stanford, formerly an associate professor and director of the university’s Stanford AI Lab. Landing AI’s flagship product is LandingLens, a platform that allows companies to build, iterate, and deploy AI-powered visual inspection solutions for manufacturing.

“AI will transform industries, but that means it needs to work with all kinds of companies, not just those with millions of data points to feed into AI engines. Manufacturing problems often have dozens or hundreds of data points. LandingLens is designed to work even on these small data problems,” Ng told VentureBeat via email. “In consumer internet, a single, monolithic AI system can serve billions of users. But in manufacturing, each manufacturing plant might need its own AI model. By enabling domain experts, rather than only AI experts, to build these AI systems, LandingLens is democratizing access to cutting-edge AI.”

Deep background in AI

Ng, who previously served as chief scientist at Baidu, is an active entrepreneur in the AI industry. After leaving Baidu, he launched an online curriculum of classes centered around machine learning, and soon after incorporated the company Landing AI.

While at Stanford, Ng started Stanford Engineering Everywhere, a compendium of freely available online courses, which served as the foundation for Coursera. Ng is currently the chairman of AI cognitive behavioral therapy startup Woebot, sat on the board of an Apple-owned driverless car company, and has written several guides and online training courses that aim to demystify AI for business executives.

Three years ago, Ng unveiled the AI Fund, a $175 million incubator that backs small teams of experts looking to solve key problems using AI. In a Medium post announcing the fund, which was an early investor in Landing AI, Ng wrote that he wants to “develop systematic and repeatable processes to initiate and pursue new AI opportunities.”


Landing AI focuses on MLOps, the discipline involving collaboration between data scientists and IT professionals with the aim of productizing AI systems. The term is a compound of “machine learning” and “information technology operations,” and the market for such solutions could grow from a nascent $350 million to $4 billion by 2025, according to Cognilytica.

LandingLens provides low-code and no-code visual inspection tools that enable computer vision engineers to train, test, and deploy AI systems to edge devices like laptops. Users create a “defect book” and upload their media. After labeling the data, they can divide it into “training” and “validation” subsets to create and evaluate a model before deploying it into production.


Above: Landing AI’s development dashboard.

Labeled datasets, such as pictures annotated with captions, expose patterns to AI systems, in effect telling machines what to look for in future datasets. Training datasets are the samples used to create the model, while test datasets are used to measure the model’s performance and accuracy.
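As a rough illustration of dividing labeled data into subsets and measuring accuracy, consider the sketch below. It does not use LandingLens or any of its APIs. The file names, labels, the 80/20 split, and the trivial "always ok" baseline are all hypothetical.

```python
import random


def split_dataset(samples, validation_fraction=0.2, seed=0):
    """Shuffle labeled samples and split them into training and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def accuracy(predict, labeled_samples):
    """Fraction of samples for which the prediction matches the label."""
    correct = sum(1 for item, label in labeled_samples if predict(item) == label)
    return correct / len(labeled_samples)


# Hypothetical labeled images: every fourth one is marked as a defect.
samples = [
    (f"image_{i:03d}.png", "defect" if i % 4 == 0 else "ok")
    for i in range(20)
]
train, validation = split_dataset(samples)


def always_ok(_image):
    """A trivial baseline "model" that never reports a defect."""
    return "ok"


baseline_accuracy = accuracy(always_ok, samples)
```

Even this naive baseline is right on three quarters of the samples while missing every defect, which is one reason accuracy on a held-out subset should be read alongside the class balance of the data.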

“For instance … [Landing AI] can help manufacturers more readily identify defects by working with the small data sets the companies have … or spot patterns in a smattering of health care diagnoses,” a spokesperson from Landing AI explained to VentureBeat via email. “Overcoming the ‘big data’ bias to instead concentrate on ‘good data’ — the food for AI — will be critical to unlocking the power of AI in ever more industries.”

On its website, Landing AI touts LandingLens as a tailored solution for OEMs, system integrators, and distributors to evaluate model efficacy for a single app or as part of a hybrid solution, combined with traditional systems. In manufacturing, Landing AI supports use cases like assembly inspection, processing monitoring, and root cause analysis. But the platform can also be used to develop models in industries like automotive, electronics, agriculture, and retail, particularly for tasks involving glass and weld inspection, wafer and die inspection, automated picking and weeding, and identifying patterns and trends to generate customer insights.

“A data-centric AI approach [like Landing AI’s] involves building AI systems with quality data — with a focus on ensuring that the data clearly conveys what the AI must learn,” Landing AI writes on its website. “Quality managers, subject-matter experts, and developers can work together during the development process to reach a consensus on defects and labels build a model to analyze results to make further optimizations … Additional benefits of data-centric AI include the ability  for teams to develop consistent methods for collecting and labeling images and for training, optimizing, and updating the models … Landing AI’s AI deep learning workflow simplifies the development of automated machine solutions that identify, classify, and categorize defects while improving production yield.”

With upwards of 82% of firms saying that custom app development outside of IT is important, Gartner predicts that 65% of all apps — including AI-powered apps — will be created using low-code platforms by 2024. Another study reports that 85% of 500 engineering leads think that low-code will be commonplace within their organizations as soon as the end of this year, while one-third anticipates that the market for low- and no-code will climb to between $58.8 billion and $125.4 billion in 2027.

Landing AI competes with Comet, Domino Data Lab, and others in the burgeoning MLOps and machine learning lifecycle management segment. But investors like Insight Partners’ George Mathew believe that the startup’s platform offers enough to differentiate it from the rest of the pack. Landing AI’s customers include battery developer QuantumScape and life sciences company Ligand Pharmaceuticals, which says it’s using LandingLens to improve its cell screening technologies. Manufacturing giant Foxconn is another client; Ng says that Landing AI has been working with the company since June 2017 to “develop AI technologies, talent, and systems that build on the core competencies of the two companies.”

“Digital modernization of manufacturing is rapidly growing and is expected to reach $300 billion by 2023,” Mathew explained in a press release. “The opportunity and need for Landing AI is only exploding. It will unlock the untapped segment of targeted machine vision projects addressing quality, efficiency, and output. We’re looking forward to playing a role in the next phase of Landing AI’s exciting journey.”




Computer vision platform Cogniac nabs $20M to bolster its customer acquisition efforts

Cogniac, a San Jose, California-based startup developing computer vision tech for task automation, today announced that it raised $20 million in a series B1 financing round led by National Grid Partners with participation from National Grid, Autotech Ventures, Cisco Investments, Energy Innovation Capital, London Technology Club, Vanedge Capital, and Wing Venture Capital. CEO Chuck Myers says that the proceeds will be put toward the expansion of Cogniac’s workforce and the ramp-up of R&D efforts to support the company’s approach to computer vision, data storage, and “human-AI interactivity.”

Computer vision is a type of AI technology that allows machines to understand, categorize, and differentiate between images. Using photos from cameras and videos as well as deep learning components, computer vision can identify and classify objects and then react to what it “sees.”

Investments in computer vision startups are on the rise as businesses embrace automation during the pandemic, which continues to place a strain on the worldwide labor market. Despite not having passed the “awareness phase,” as per one survey, the computer vision market could grow from $10.9 billion in 2019 to $17.4 billion by 2024. External investments in computer vision startups have already far exceeded the $3.5 billion McKinsey estimated in 2016.


Above: Cogniac’s computer vision platform.

Image Credit: Cogniac

Cogniac’s AI platform has customers connect machine vision cameras, security cameras, drones, smartphones, and other sources and define objects and conditions of interest to them. They might specify surface damage and supply chain quality control inspections, for example, or accident prevention and real-time physical threat detection. Cogniac then monitors and improves classification, identification, counting, and measuring through a feedback system while integrating with third-party apps to deliver alerts and notifications.

Cogniac generates custom AI models for scenarios based on imagery and feedback. Once deployed, these models can learn new characteristics, adapting based either on archival imagery or data users enter. The platform monitors the confidence level of each new prediction, prioritizing predictions with the lowest level for review while a core learning engine searches for configuration variations, ostensibly lessening the need for manual intervention.
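Prioritizing the least confident predictions for review can be sketched in a few lines. This is not Cogniac’s implementation or API. The record fields, file names, and confidence values below are hypothetical.

```python
import heapq


def review_queue(predictions, limit=3):
    """Return the `limit` predictions with the lowest confidence, lowest first."""
    return heapq.nsmallest(limit, predictions, key=lambda p: p["confidence"])


# Hypothetical model outputs for four camera images.
predictions = [
    {"image": "cam1_0001.jpg", "label": "scratch", "confidence": 0.97},
    {"image": "cam1_0002.jpg", "label": "ok", "confidence": 0.52},
    {"image": "cam2_0001.jpg", "label": "dent", "confidence": 0.71},
    {"image": "cam2_0002.jpg", "label": "ok", "confidence": 0.99},
]

# The two predictions a human reviewer would see first.
to_review = review_queue(predictions, limit=2)
```

Sending only the most uncertain items to reviewers concentrates human effort where a correction is most likely to change the model.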

Cogniac claims that with deep convolutional neural networks — types of AI models often applied to analyzing visual imagery — its system can achieve accuracy over 90% prior to human corrections. Moreover, the startup says the technology enables its platform to support multiple deployment environments, including cloud, gateway, on-premises, and hybrid.

Promise and pitfalls

Tasks in manufacturing, which is one of Cogniac’s key markets, can be error-prone when humans are in the loop. A study from Vanson Bourne found that 23% of all unplanned downtime in manufacturing is the result of human error, compared with rates as low as 9% in other segments. The $327.6 million Mars Climate Orbiter spacecraft was destroyed because of a failure to properly convert between units of measurement. And one pharma company reported a misunderstanding that resulted in an alert ticket being overridden, which cost four days on the production line at £200,000 ($253,946) per day.

And broadly speaking, computer vision can be used for nefarious purposes, like monitoring the responses of ride-hailing customers to in-car advertisements. This summer, AnyVision, a controversial Israeli facial recognition startup, raised $235 million in venture capital from SoftBank and Eldridge Industries. Public records and a 2019 version of its user guide show how invasive AnyVision’s software can be — one school using it saw that a student’s face was captured more than 1,000 times during the week.

Cogniac — a member of Nvidia’s Inception accelerator program, with partners including SAP and Rockwell Automation — has controversially provided its software to the U.S. Army to analyze battlefield drone data. The company has also participated in trials with U.S. Customs and Border Protection and helped an Arizona sheriff’s department to identify when people cross the U.S.-Mexico border — and expressed an openness to larger deployments down the line.

Of course, Cogniac isn’t alone in this — machine learning, computer vision, and facial recognition vendors including TrueFace, Clearview AI, TwoSense, and AI.Reverie also have contracts with various U.S. military and law enforcement branches. But according to Cogniac cofounder Bill Kish, government contracts are a small portion of the company’s business, which is primarily focused on industrial applications.

One Cogniac client is Georgia Pacific, which is finalizing the deployment of a solution that simplifies processes around the company’s mill operations. Another is Bobcat, which says it’s implementing Cogniac’s platform within the kitting inspection workflows of manufacturing warehouses across its Otsego, Minnesota facilities. (Kitting refers to compiling products into a single “kit” that’s then shipped to a customer.) More recently, Cogniac announced a partnership with Trimac Transportation, a transportation service company based in North America, to deploy the startup’s technology throughout Trimac’s document identification and filing processes.

On the subject of bias that might arise in Cogniac’s models from imbalanced datasets, Kish says the company employs a process in which multiple people review uncertain data to establish a consensus. The company’s system acts as a source of record for managing assets, ensuring biases inherent in the visual data are spotlighted so they can be addressed through feedback.
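The consensus step Kish describes can be pictured with a small sketch. This is not Cogniac's implementation: the function names, the 0.8 confidence floor, the 0.6 agreement threshold, and the "escalate" fallback are all assumptions chosen for illustration.

```python
from collections import Counter

def consensus_label(votes, min_agreement=0.6):
    """Return the most common label if enough reviewers chose it, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= min_agreement else None

def resolve(item, confidence_floor=0.8):
    """Accept confident model output; otherwise defer to reviewer consensus."""
    if item["confidence"] >= confidence_floor:
        return item["model_label"]
    agreed = consensus_label(item["reviewer_votes"])
    # No clear majority: flag the item for further review (hypothetical policy).
    return agreed if agreed is not None else "escalate"

items = [
    {"model_label": "ok", "confidence": 0.95, "reviewer_votes": []},
    {"model_label": "ok", "confidence": 0.55, "reviewer_votes": ["defect", "defect", "ok"]},
    {"model_label": "defect", "confidence": 0.50, "reviewer_votes": ["defect", "ok"]},
]
resolved = [resolve(item) for item in items]
```

In this toy run the first item keeps the model's label, the second takes the reviewers' majority label, and the third is escalated because the two reviewers disagree.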

“We’re at a key inflection point for AI vision adoption in the industrial and manufacturing sectors,” Myers said in a statement. “Our product’s efficacy and ease of implementation offer our customers significant and material improvement to their workstreams and processes. This funding allows us to scale our operations to meet the needs of this currently nascent but massively important and growing space. AI vision will serve as the foundation of safety and efficiency for the future of logistics and manufacturing, and we’re leading the creation of that infrastructure and operation standard.”

To date, Cogniac has raised over $30 million in venture capital.


VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative technology and transact.

Our site delivers essential information on data technologies and strategies to guide you as you lead your organizations. We invite you to become a member of our community, to access:

  • up-to-date information on the subjects of interest to you
  • our newsletters
  • gated thought-leader content and discounted access to our prized events, such as Transform 2021: Learn More
  • networking features, and more

Become a member



Amazon’s on-premises device for vision apps, AWS Panorama Appliance, launches publicly

This article is part of a VB special issue. Read the full series: AI and Surveillance.

Amazon today announced the general availability of the AWS (Amazon Web Services) Panorama Appliance, a device that allows customers to use existing on-premises cameras and analyze video feeds with AI. Ostensibly designed for use cases like quality checks and supply chain monitoring, Amazon says that the Panorama Appliance is already being used by companies including Accenture, Deloitte, and Sony.

“Customers in industrial, hospitality, logistics, retail, and other industries want to use computer vision to make decisions faster and optimize their operations. These organizations typically have cameras installed onsite to support their businesses, but they often resort to manual processes like watching video feeds in real time to extract value from their network of cameras, which is tedious, expensive, and difficult to scale,” Amazon wrote in a press release. “Most customers are stuck using slow, expensive, error-prone, or manual processes for visual monitoring and inspection tasks that do not scale and can lead to missed defects or operational inefficiencies.”

By contrast, the Panorama Appliance connects to a local network to perform computer vision processing at the edge, Amazon says. Integrated with Amazon SageMaker — Amazon’s service for building machine learning models — the Panorama Appliance can be updated and deployed with new computer vision models. Companies that opt not to create their own models can choose from solutions offered by Deloitte, TaskWatch, Vistry, Sony, Accenture, and other Amazon partners.

To date, customers have developed models running on the Panorama Appliance for manufacturing, construction, hospitality, and retail, Amazon says. Some are analyzing retail foot traffic to inform store layouts and displays, while others are identifying peak times in stores to pinpoint where staff might be needed.

The Cincinnati/Northern Kentucky International Airport in Hebron, Kentucky, is using the Panorama Appliance to monitor congestion across airport traffic lanes. With the help of Deloitte, the Vancouver Fraser Port Authority has applied the Panorama Appliance to track containers throughout its facilities. And Tyson has built models on the device to count packaged products on lines for quality assurance.

“Organizations across all industries like construction, hospitality, industrial, logistics, retail, transportation, and more are always keen to improve their operations and reduce costs. Computer vision offers a valuable opportunity to achieve these goals, but companies are often inhibited by a range of factors including the complexity of the technology, limited internet connectivity, latency, and inadequacy of existing hardware,” VP of Amazon machine learning at AWS Swami Sivasubramanian said in a statement. “We built the Panorama Appliance to help remove these barriers so our customers can take advantage of existing on-premises cameras and accelerate inspection tasks, reduce operational complexity, and improve consumer experiences through computer vision.”

Privacy implications

Since its unveiling at Amazon’s re:Invent 2020 conference in December, experts have raised concerns about how the Panorama Appliance could be misused. While the purported goal is “optimization,” the device could be coopted for other, less humanitarian intents, like allowing managers to chastise employees in the name of productivity.

In the promotional material for the Panorama Appliance, Fender says it uses the product to “track how long it takes for an associate to complete each task in the assembly of a guitar.” Each state has its own surveillance laws, but most give wide discretion to employers so long as any equipment they use to track employees is plainly visible. There’s no federal legislation that explicitly prohibits companies from monitoring staff during the workday.

Bias could also arise from the computer vision models deployed to the Panorama Appliance if the models aren’t trained on sufficiently diverse data. A study conducted by researchers at the University of Virginia found that two prominent research-image collections displayed gender bias in their depiction of sports and other activities, showing images of shopping linked to women while associating things like coaching with men. Even differences in the sun path between the northern and southern hemispheres and variations in background scenery can affect model accuracy, as can the varying specifications of camera models like resolution and aspect ratio.

Recent history is filled with examples of the consequences of training computer vision models on biased datasets, like virtual backgrounds and automatic photo-cropping tools that disfavor darker-skinned people. Back in 2015, a software engineer pointed out that the image recognition algorithms in Google Photos were labeling his Black friends as “gorillas.” And the nonprofit AlgorithmWatch has shown that Google’s Cloud Vision API at one time automatically labeled thermometers held by a Black person as “guns” while labeling thermometers held by a light-skinned person as “electronic devices.”

Amazon has pitched — and employed — surveillance technologies before. The company’s Rekognition software sparked protests and pushback, which led to a moratorium on the use of the technology. And Amazon’s notorious “Time Off Task” system dings warehouse employees for spending too much time away from the work they’re assigned to perform, like scanning barcodes or sorting products into bins.

An Amazon spokesperson recently told the BBC that the Panorama Appliance was “designed to improve industrial operations and workplace safety” and that how it is used is up to customers. “For example, AWS Panorama does not include any pre-packaged facial recognition capabilities,” the spokesperson said. All its machine learning functions can happen on the device, they added, “and [relevant data] never has to leave the customer’s facility.”

The Panorama Appliance is now available for sale through Amazon’s AWS Elemental service in the U.S., Canada, U.K., and E.U.



Computer vision startup Zebra Medical Vision sells for $200M

Zebra Medical Vision, a computer vision startup focused on health care, today announced it has entered into an agreement to be acquired by publicly traded health firm Nanox. Terms of the deal weren’t disclosed, but a Zebra spokesperson said it was expected to be valued at around $200 million — $100 million upfront and another $100 million tied to specific milestones, all in stock.

Computer vision, which deals with algorithms that can gain a high-level understanding of images and videos, is being applied across a range of medical domains. While some research has raised concerns about bias, startups and incumbents are chasing after a market that’s anticipated to be worth $5.31 billion by 2026, according to Verified Market Research. They argue that a shortage of professionals — the U.S. Bureau of Labor Statistics projects only a 9% increase in the number of radiologic and MRI technicians by 2028 — will necessitate scalable computer vision technologies. Moreover, the companies claim, computer vision has the potential to reduce labor costs, as well as medical imaging workloads.

Zebra Medical was founded in 2014 by Elad Benjamin, Eyal Gura, and Eyal Toledano to help patients, physicians, and health care providers use computer vision tools to diagnose bone, liver, lung, and cardiovascular diseases. The startup delivers what it calls one of the largest open clinical research platforms globally, enabling researchers to access millions of anonymized, indexed clinical records for scientific discovery. Zebra also developed an analytics solution that provides algorithms and clinical insight decision support tools to health care institutions via a software-as-a-service-based model.

Above: A screenshot of one of Zebra Medical Vision’s diagnostic imaging tools.

Beyond this, Zebra hosts a data repository with over 2 million medical images and has U.S. Food and Drug Administration (FDA)-cleared and CE-marked solutions, including seven FDA-cleared and 10 CE-marked AI solutions for medical imaging — the most recent being a 3D modeling product for x-ray images used for preoperative orthopedic surgery planning. In partnership with several radiological industry associations, Zebra in July lobbied the American Medical Association to allow insurers to reimburse clinicians using the company’s AI in vertebral compression fracture (VCF) detection screenings.

Zebra, which had raised $57.4 million in venture capital, counts over 1,100 hospitals, academic institutions, and care providers among its customers, including InterMountain Healthcare, Johnson & Johnson, Nuance, Nvidia, and the University of Oxford.

Pivoting focus

Ahead of the acquisition, Zebra pivoted from focusing on diagnosis and triage to leveraging its data to help health care systems evaluate large volumes of patients for chronic conditions. Should the acquisition be completed, CEO Zohar Elhanani says Zebra will combine its capabilities with the acquiring company’s strategy to “accelerate the population health vision” and make medical imaging “more efficient.”

“Zebra Medical Vision has always operated with the goal of expanding the use of AI in medical imaging to improve health outcomes for patients worldwide,” Elhanani said. “At this time, we understand that that vision is best served by joining forces with a trusted partner with the means to boost our capabilities and propel population health, powered by AI, to the next level. Screening populations to detect and treat chronic disease early has proven to improve outcomes, and we’re thrilled to be taking the helm of the population health transformation in health care.”

Nanox also announced today that it has entered into a binding letter of intent to acquire USARAD and its related company, Medical Diagnostics Web, for $30 million in cash and stock. USARAD operates a network of 300 radiologists across health centers, urgent care facilities, and other providers, which Nanox says will provide it access to trained radiologists — lowering the barrier to U.S. market entry and other countries around the globe.

Nanox, which was founded in 2016 by Japanese venture capital tycoon Hitoshi Masuya, hopes to reinvent the x-ray with hardware inspired by Star Trek’s biobed. Its product, called the Arc, is designed to promote the early detection of conditions discoverable by computed tomography (CT), mammography, fluoroscopy, angiogram, and other imaging modalities. A cloud-based software dubbed Nanox.Cloud complements the Arc with value-added services, including a scan repository, radiologist matching, online and offline diagnostic review and annotation, connectivity to diagnostic assistive AI systems, billing, and reporting.

“Expanding access to medical imaging via widespread deployment of the Nanox Arc solves one of the obstacles to achieving true population health management,” Nanox CEO Ran Poliakine said in a press release. “Yet the global shortage of trained radiologists represents a significant bottleneck in the imaging process. The Nanox ARC, together with the acquisitions of Zebra Medical Vision and USARAD, if consummated, would move us toward our vision of deploying our systems and have the support of a large network of radiologists empowered with highly advanced AI algorithms that will allow for the rapid interpretation of medical images into actionable medical interventions, which would represent an end-to-end, globally connected medical imaging solution.”

The pandemic spurred investments in AI across nearly every industry. According to CB Insights’ Q2 2021 report, AI startups attracted record funding — more than $20 billion — despite a drop in deal volume. Health care AI continued to have the largest AI deal share, accounting for 17% of all AI deals ($2.36 billion).




How computer vision works — and why it’s plagued by bias

It’s no secret that AI is everywhere, yet it’s not always clear when we’re interacting with it, let alone which specific techniques are at play. But one subset is easy to recognize: If the experience is intelligent and involves photos or videos, or is visual in any way, computer vision is likely working behind the scenes.

Computer vision is a subfield of AI, specifically of machine learning. If AI allows machines to “think,” then computer vision is what allows them to “see.” More technically, it enables machines to recognize, make sense of, and respond to visual information like photos, videos, and other visual inputs.

Over the last few years, computer vision has become a major driver of AI. The technique is used widely in industries like manufacturing, ecommerce, agriculture, automotive, and medicine, to name a few. It powers everything from interactive Snapchat lenses to sports broadcasts, AR-powered shopping, medical analysis, and autonomous driving capabilities. And by 2022, the global market for the subfield is projected to reach $48.6 billion annually, up from just $6.6 billion in 2015.

The computer vision story follows that of AI overall. A slow rise full of technical hurdles. A big boom enabled by massive amounts of data. Rapid proliferation. And then growing concern over bias and how the technology is being used. To understand computer vision, it’s important to understand how it works, how it’s being used, and both the challenges it overcame and the ones it still faces today.

How computer vision works

Computer vision allows computers to accomplish a variety of tasks. There’s image segmentation (divides an image into parts and examines them individually) and pattern recognition (recognizes the repetition of visual stimuli between images). There’s also object classification (classifies objects found in an image), object tracking (finds and tracks moving objects in a video), and object detection (looks for and identifies specific objects in an image). Additionally, there’s facial recognition, an advanced form of object detection that can detect and identify human faces.

As mentioned, computer vision is a subset of machine learning, and it similarly uses neural networks to sort through massive amounts of data until it understands what it’s looking at. In fact, the example in our machine learning explainer about how deep learning could be used to separate photos of ice cream and pepperoni pizza is more specifically a computer vision use case. You provide the AI system with a lot of photos depicting both foods. The computer then puts the photos through several layers of processing — which make up the neural network — to distinguish the ice cream from the pepperoni pizza one step at a time. Earlier layers look at basic properties like lines or edges between light and dark parts of the images, while subsequent layers identify more complex features like shapes or even faces.

This works because computer vision systems function by interpreting an image (or video) as a series of pixels, which are each tagged with a color value. These tags serve as the inputs the system processes as it moves the image through the neural network.
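To make the pixel idea concrete, here is a minimal sketch in plain Python. The 4x5 grayscale grid and the `horizontal_gradient` helper are invented for illustration. The filter is written by hand, whereas a real neural network learns its filters from training data, so treat this only as an analogy for what an early, edge-sensitive layer responds to.

```python
# A tiny grayscale "image": each number is one pixel's brightness
# (0 is black, 255 is white). The left two columns are dark, the rest bright.
image = [
    [0, 0, 255, 255, 255],
    [0, 0, 255, 255, 255],
    [0, 0, 255, 255, 255],
    [0, 0, 255, 255, 255],
]

def horizontal_gradient(img):
    """Absolute difference between each pixel and its right-hand neighbor.

    Large values mark places where brightness changes sharply,
    which is where a vertical edge sits.
    """
    return [
        [abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
        for row in img
    ]

edges = horizontal_gradient(image)
for row in edges:
    print(row)
```

Every output row has a single large value at the boundary between the dark and bright columns and zeros elsewhere, which is the kind of low-level signal later layers combine into shapes.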

Rise of computer vision

Like machine learning overall, computer vision dates back to the 1950s. Without our current computing power and data access, the technique was originally very manual and prone to error. But it did still resemble computer vision as we know it today; the effectiveness of first processing according to basic properties like lines or edges, for example, was discovered in 1959. That same year also saw the invention of a technology that made it possible to transform images into grids of numbers, which incorporated the binary language machines could understand into images.

Throughout the next few decades, more technical breakthroughs helped pave the way for computer vision. First, there was the development of computer scanning technology, which for the first time enabled computers to digitize images. Then came the ability to turn two-dimensional images into three-dimensional forms. Object recognition technology that could recognize text arrived in 1974, and by 1982, computer vision really started to take shape. In that same year, one researcher further developed the processing hierarchy, just as another developed an early neural network.

By the early 2000s, object recognition specifically was garnering a lot of interest. But it was the release of ImageNet, a dataset containing millions of tagged images, in 2010 that helped propel computer vision’s rise. Suddenly, a vast amount of labeled, ready-to-go data was available for anyone who wanted it. ImageNet was used widely, and most of the computer vision systems that have been built today relied on it. But while computer vision systems were popular at this point, they were still turning up a lot of errors. That changed in 2012 when a model called AlexNet, which used ImageNet, significantly reduced the error rate for image recognition, ushering in today’s field of computer vision.

Computer vision’s bias and challenges

The availability of ImageNet was transformative for the growth and adoption of computer vision. It quite literally became the basis for the industry. But it also scarred the technology in ways that are having a real impact today.

The story of ImageNet reflects a popular saying in data science and AI: “garbage in, garbage out.” In jumping to take advantage of the dataset, researchers and data scientists didn’t pause to consider where the images came from, who chose them, who labeled them, why they were labeled as they were, what images or labels may have been omitted, and the effect all of this might have on how their technology would function, let alone the impact it would have on society and people’s lives. Years later, in 2019, a study on ImageNet revealed the prevalence of bias and problematic labels throughout the dataset.

“Many truly offensive and harmful categories hid in the depth of ImageNet’s Person categories. Some classifications were misogynist, racist, ageist, and ableist. … Insults, racist slurs, and moral judgments abound,” wrote AI researcher Kate Crawford in her book Atlas of AI. And even besides these explicitly obvious harms (some of which have been removed, as ImageNet is reportedly working to address various sources of bias), curious choices in terms of categories, hierarchy, and labeling have been found throughout the dataset. It’s now widely criticized for privacy violations as well, as people whose photos were used in the dataset didn’t consent to being included or labeled.

Data and algorithmic bias is one of the core issues of AI overall, but it’s especially easy to see the impact in some computer vision applications. Facial recognition technology, for example, is known to misidentify Black people, but its use is surging in retail stores. It’s also already common in policing, which has prompted protests and regulations in several U.S. cities and states.

Regulations overall are an emerging challenge for computer vision (and AI in general). It’s clear more regulation is coming (especially if more of the world follows in the European Union’s path), but it’s not yet known exactly what such regulations will look like, making it difficult for researchers and companies to navigate at this moment. “There’s no standardization and it’s uncertain. For these types of things, having clarification would be helpful,” said Haniyeh Mahmoudian, DataRobot’s global AI ethicist and a winner of VentureBeat’s Women in AI responsibility and ethics award.

Computer vision has some technical challenges as well. It’s limited by hardware, including cameras and sensors. Additionally, computer vision systems are very complex to scale. And like all types of AI, they require massive amounts of computing power (which is expensive) and data. And as the entire history of computer vision makes clear, good data that is representative, unbiased, and ethically collected is hard to come by — and incredibly tedious to tag.




Duke Energy used computer vision and robots to cut costs by $74M

Duke Energy’s AI journey began because the utility company had a business problem to solve, Duke Energy chief information officer Bonnie Titone told VentureBeat’s head of AI content strategy Hari Sivaraman at the Transform 2021 virtual conference on Thursday.

Duke Energy was facing some significant challenges, such as the growing issue of climate change and the need to transition to clean energy in order to reach net zero emissions by 2050. Duke Energy is considered an essential service, as it supplies 25 million people with electricity daily, and everything the utility company does revolves around a culture of safety and reliability. Together, these variables were a catalyst for exploring AI technologies, Titone said, because whatever the company chose to do, it had to support the clean energy transition, deliver value to customers, and find a way for employees to work and improve safety.

“We look to emerging data science tools and AI solutions, which in turn brought us to computer vision, and ultimately, drones in order to inspect our solar farms,” Titone said.

The shift to clean energy involves a significant number of solar farms (Florida alone has 3 million solar panels, Titone said), and inspecting them is a very labor-intensive, time-consuming, and risky endeavor. It can take about 40 hours to inspect one unit, and a regular solar site may have somewhere between 20 and 25 units to inspect. It’s a dangerous task, as technicians walk around 500-acre solar sites with heat guns so they can inspect the panels and may need to touch live wires. The company began experimenting with advanced drones with infrared cameras to try to streamline the work. The technicians were able to use the images taken by the drones to determine where faults and issues were occurring. Thousands of images were stitched together with computer vision, giving technicians the ability to look for issues using the images in a much safer way, Titone said.
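The figures Titone cites imply a sizable manual workload per site. The short calculation below simply multiplies them out; the 40-hour work week used to convert hours into person-weeks is an assumption added here, not something stated in the talk.

```python
HOURS_PER_UNIT = 40            # "about 40 hours to inspect one unit"
UNITS_PER_SITE = (20, 25)      # "between 20 and 25 units" per site
HOURS_PER_WORK_WEEK = 40       # assumption, for scale only

# Manual inspection hours for one site, low and high estimates.
low_hours, high_hours = (HOURS_PER_UNIT * units for units in UNITS_PER_SITE)

# The same workload expressed in person-weeks.
low_weeks = low_hours / HOURS_PER_WORK_WEEK
high_weeks = high_hours / HOURS_PER_WORK_WEEK

print(f"{low_hours}-{high_hours} hours per site")
print(f"{low_weeks:.0f}-{high_weeks:.0f} person-weeks per site")
```

Under these assumptions, one site takes roughly 800 to 1,000 hours of manual inspection, or about 20 to 25 person-weeks.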

After adopting computer vision, Duke Energy began to consider automating the process. The company developed a MOVES model (Mobile Observation Vehicle and Equipment Solutions) that collects and processes the data and images from the drones and identifies the faults within minutes. By applying AI and machine learning technologies, the program has significantly reduced labor and time costs for the company. Accuracy also continued to improve over time; the latest model used in the inspection reached 91% accuracy.

“We compiled that information for the technicians and gave them the ability to navigate pretty easily to where we can schedule maintenance for customers, and we did this all without a technician ever having to go out to the site,” Titone said. The program has led to more than $74 million in cost reductions and 385,000 man-hours saved.

Cloud and edge processing

Duke Energy had to consider the question of how to process the data the drones were collecting. A typical drone flight can produce thousands of photos, sometimes with no precise location data associated with the images. Trying to do the analysis in the cloud to figure out if the drone image showed a solar site would be impossible because of the sheer amount of data and information involved. Duke Energy had to process the images at the edge so that it could make real-time decisions. The images had to be stitched together to make a precise picture of the solar farm without requiring somebody to actually walk around the site.

Instead of trying to do everything at once, Duke Energy worked on small increments of the project. Once one thing worked, the team moved on to the next step. Since Duke Energy had its own software engineering team, it was able to build its own models with its own methodologies as part of a one-stop shop. This process eventually led to creating over 40 products.

Titone said, “Had we not had that footprint in the cloud journey, we wouldn’t have been able to develop these models and be able to process that data as quickly as we could.”

Working with data

Titone also discussed best practices with storing and cleaning data. As the team has moved toward a cloud-based data strategy, it uses a lot of data lakes. The data lakes are accessible by other systems and also by some data analysis and data science components that must quickly process the information.

“I would say we’re using a lot of the traditional methods around data lakes in order to process all of that,” Titone said, and the team models the data with “what we call our MADlab, which stands for machine learning, AI and deep learning.”

Reflecting upon the high accuracy that the product reached, Titone said that it was important to be OK with failing in the beginning. “I think at the beginning of the journey, we didn’t have an expectation that we would get [it] right out of the gate,” she said. As time moved on, the team learned and continued to modify the model according to the results. For example, through iteration and reflection, the team realized that it should not only extract images but also piece different processing techniques together. The team also adjusted the angle and height of the drone.

AI as a career opportunity

The fact that AI is more efficient and cost-effective does result in reduced labor hours, which raises the concern that AI is taking jobs away from people. Titone said the better perspective was to view this as an opportunity. She said that upskilling employees to be able to work with AI was an investment in the workforce. If the employees understand AI, she said, they become more valuable as workers because they qualify for more advanced roles.

“I never approach AI as taking somebody’s job or role; the way I’ve always approached AI is that it should complement our workforce, that it should give us a set of skills and career paths that our teammates can take,” Titone said.




eBay taps computer vision to transform online shopping

When eBay rolled out image-based search for shoppers nearly four years ago, the company was among the first to deploy computer vision technology in ecommerce. eBay has learned many lessons since then, and it’s now working on bringing that innovation to sellers on its platform.

In a conversation at VentureBeat’s Transform 2021 virtual conference, eBay’s new chief AI officer, Nitzan Mekel-Bobrov, shared some insights with Maribel Lopez, founder and principal analyst at Lopez Research. Mekel-Bobrov discussed the challenges of developing image-based search technology, the essential elements of any AI strategy, and what’s next in computer vision.

The challenges of image-based search

Offering buyers the ability to look for similar or exact items using images instead of text has become fairly standard in the industry, but launching the feature in 2017 was a big deal, Mekel-Bobrov said.

Data presented some of the hardest obstacles, especially with the range of items eBay is known for.

“The adage that data is king is true, probably nowhere more than in training CV models for item-matching,” Mekel-Bobrov said. “And for us, this was compounded at the time [by] the fact that we don’t control all the ways in which our sellers photograph their items.”

Hardware also posed issues in the first years, Mekel-Bobrov said, and the company had to find creative solutions.

“First, we did our training separately,” he explained. “We trained the models on the public cloud, where we could access [graphics processing units]. Then we did the inference and deployment locally. We also set up a series of what were essentially stateless systems, holding the entire index in memory.”

eBay was also one of the early users of product quantization. That reduced the dimensionality and size of the problem during search.
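For readers unfamiliar with product quantization, the sketch below shows the general idea in plain Python. It is not eBay's system. The vector is split into sub-vectors, and each sub-vector is replaced by the index of its nearest centroid in a small per-subspace codebook. The codebooks here are hand-picked for illustration; in practice they are usually learned from data, typically with k-means.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode(vector, codebooks):
    """Replace each sub-vector with the index of its nearest centroid."""
    sub_len = len(vector) // len(codebooks)
    codes = []
    for i, codebook in enumerate(codebooks):
        sub = vector[i * sub_len:(i + 1) * sub_len]
        nearest = min(
            range(len(codebook)),
            key=lambda k: squared_distance(sub, codebook[k]),
        )
        codes.append(nearest)
    return codes

def decode(codes, codebooks):
    """Rebuild an approximate vector by concatenating the chosen centroids."""
    approx = []
    for code, codebook in zip(codes, codebooks):
        approx.extend(codebook[code])
    return approx

# Two subspaces of two dimensions each, two centroids per subspace
# (hypothetical values).
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
vector = [0.9, 1.1, 0.1, 0.8]

codes = encode(vector, codebooks)
approx = decode(codes, codebooks)
```

The four-number vector is stored as two small integers, and search can then compare these compact codes instead of full vectors, trading some accuracy for a much smaller in-memory index.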

Organizational challenges remain eBay’s biggest undertaking. Launching a product that requires many different teams with “tremendous” upstream and downstream dependencies presents myriad issues, Mekel-Bobrov said. “A lot of the workflows needed reworking to allow for integration of computer vision capabilities.”

Additionally, ecommerce category taxonomy changes based on the market and inventory. “As the taxonomy innovation progresses, we have to keep up with it on the image search side and continuously adapt and evolve not just our models but our entire approach sometimes,” Mekel-Bobrov said.

The essential elements of AI strategy

Having only been with the company for two months, Mekel-Bobrov said he’s been spending most of his days thinking about eBay’s AI strategy. His plan focuses on three broad categories: taking a customer-first approach, democratizing AI among teams, and nailing down the data infrastructure.

The question should never be what new technology can be adopted, but what customer problems can be solved, Mekel-Bobrov said. “For us, as a two-sided marketplace … the mediation of that relationship — there’s a ton of opportunity and complexity there that AI can really transform.”

The rapidly changing nature of AI also calls for companywide upskilling and training. “We want to put these technologies into the hands of developers much more broadly than AI specialists, even product managers and analysts,” Mekel-Bobrov said. “The question is, how do we organize ourselves such that we can have a smaller cohort of AI specialists build differentiated core capabilities that can then enable teams across the company to leverage those and build unique products in their own domains?”

Finally, a key facet of any AI strategy is data. Mekel-Bobrov said a company must ask where its data lives, both offline and streaming, as well as how it’s collected, labeled, managed, and searched. “The more we’re doing at scale, the more we’re doing with real time, the more we’re doing with dynamic learning and reinforcement learning, the more we have to push the boundaries of these infrastructure considerations,” Mekel-Bobrov added.

What’s next?

The ecommerce giant is now “turbocharging” its platform to make computer vision an integral part of the experience for both buyers and sellers, Mekel-Bobrov said.

eBay recently launched an image-scanning tool on its app, starting with trading cards that sellers can scan with their phone to auto-populate and create listings.

“If you think about an individual seller or small business that needs to continuously create new listings at a high volume but wants to maintain the highest quality of information, selling tools based on computer vision can be a real game-changer,” Mekel-Bobrov said. The ultimate goal is to maintain trust between buyers and sellers.

During the COVID-19 pandemic, eBay has used computer vision detection and classification models on items like hand sanitizers and masks to identify listings with inflated prices and products that make false health claims.

Since November 2020, eBay has removed 50 million listings that violated its COVID-19 policies, Mekel-Bobrov said.

“We’re continuously making every effort to ensure that anyone who sells on our platform follows local laws, as well as eBay’s policies,” Mekel-Bobrov said. “Doing that at a scale requires a lot of creative thinking and advanced technologies like computer vision.”

