Emza Visual Sense and Alif Semiconductor have demonstrated an optimized face detection model running on Alif’s Ensemble microcontroller, which is based on Arm IP. The two companies found it well suited to low-power artificial intelligence (AI) at the edge.
The emergence of optimized silicon, models and AI and machine learning (ML) frameworks has made it possible to run advanced AI inference tasks such as eye tracking and face identification at the edge, at low-power and low cost. This opens up new use cases in areas such as industrial IoT and consumer applications.
Making edge devices magnitudes faster
By using Alif’s Ensemble microcontroller (MCU), which Alif claims is the first MCU to use the Arm Ethos-U55 microNPU, the AI model ran “an order of magnitude” faster than a CPU-only solution with the Cortex-M55 at 400MHz. It appears Alif meant two orders of magnitude, as the footnotes state that the high-performance U55 took 4ms compared to 394ms for the M55. The high-efficiency U55 executed the model in 11ms. The Ethos-U55 is part of Arm’s Corstone-310 subsystem, for which Arm launched new solutions in April.
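The footnoted figures can be sanity-checked directly. The latencies below are the ones quoted in the footnotes; the arithmetic is ours:

```python
import math

# Inference latencies quoted in Alif's footnotes (milliseconds).
cpu_only_ms = 394.0       # Cortex-M55 at 400MHz, no NPU
npu_high_perf_ms = 4.0    # high-performance Ethos-U55 configuration
npu_high_eff_ms = 11.0    # high-efficiency Ethos-U55 configuration

for label, t in [("high-performance U55", npu_high_perf_ms),
                 ("high-efficiency U55", npu_high_eff_ms)]:
    speedup = cpu_only_ms / t
    print(f"{label}: {speedup:.1f}x faster "
          f"({math.log10(speedup):.1f} orders of magnitude)")
```

The high-performance configuration works out to roughly a 98x speedup, which is indeed about two orders of magnitude rather than one.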
Emza said it trained a full “sophisticated” face detection model on the NPU that can be used for face detection, yaw face angle estimation and facial landmarks. The complete application code has been contributed to Arm’s open-source AI repository called “ML Embedded Eval Kit,” making it the first Arm AI ecosystem partner to do so. The repository can be used to gauge runtime, CPU demand and memory allocation before silicon is available.
“To unleash the potential of endpoint AI, we need to make it easier for IoT developers to access higher performance, less complex development flows and optimized ML models,” said Mohamed Awad, vice president of IoT and embedded at Arm. “Alif’s MCU is helping redefine what is possible at the smallest endpoints and Emza’s contribution of optimized models to the Arm AI open-source repository will accelerate edge AI development.”
Emza claims its visual sensing technology is already shipping in millions of products and with this demonstration, it is expanding its optimized algorithms to SoC vendors and OEMs.
“As we look at the dramatically expanding horizon for TinyML edge devices, Emza is focused on enabling new applications across a broad array of markets,” said Yoram Zylberberg, CEO of Emza. “There is virtually no limit to the types of visual sensing use cases that can be supported by new powerful, highly efficient hardware.”
VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative enterprise technology and transact. Learn more about membership.
During its Microsoft Ignite 2021 conference this week, Microsoft unveiled Azure Percept, a platform of hardware and services aimed at simplifying the ways customers can use AI technologies at the edge. According to the company, the goal of the new offering is to give customers an end-to-end system, from the hardware to the AI and machine learning capabilities.
Edge computing is forecast to be a $6.72 billion market by 2022. Its growth will coincide with that of the deep learning chipset market, which some analysts predict will reach $66.3 billion by 2025. There’s a reason for these rosy projections — edge computing is expected to make up roughly three-quarters of the total global AI chipset business in the next six years.
The Azure Percept platform includes a development kit with a camera called Azure Percept Vision, as well as a “getting started” experience called Azure Percept Studio that guides customers through the AI lifecycle. Azure Percept Studio includes developing and training resources, as well as guidance on deploying proof-of-concept ideas.
AI at the edge
Azure Percept Vision and Azure Percept Audio, which ships separately from the development kit, connect to Azure services and come with embedded hardware-accelerated modules that enable speech and vision AI at the edge or during times when the device isn’t connected to the internet. The hardware in the Azure Percept development kit uses the industry standard 80/20 T-slot framing architecture, which Microsoft says will make it easier for customers to pilot new product ideas.
As customers work on their ideas with the Azure Percept development kit, they’ll have access to Azure AI Cognitive Services and Azure Machine Learning models, plus AI models available from the open source community designed to run on the edge, Microsoft says. In addition, Azure Percept devices will automatically connect to Azure IoT Hub, which helps enable communication with security protections between internet of things devices and the cloud.
Azure Percept competes with Google’s Coral, a collection of hardware kits and accessories intended to bolster AI development at the edge. And Amazon recently announced the AWS Panorama Appliance, a plug-in device that connects to a network and applies computer vision models to video from existing cameras for manufacturing, retail, construction, and other industries.
But in addition to announcing first-party hardware, Microsoft says it’s working with third-party silicon and equipment manufacturers to build an ecosystem of devices to run on the Azure Percept platform. Moreover, the company says the Azure Percept team is currently working with select early customers to understand concerns around the responsible development and deployment of AI on devices, providing them with documentation and access to toolkits for their AI implementations.
“We’ve started with the two most common AI workloads, vision and voice [and] sight and sound, and we’ve given out that blueprint so that manufacturers can take the basics of what we’ve started,” Microsoft VP Roanne Sones said. “But they can envision it in any kind of responsible form factor to cover a pattern of the world.”
A continued investment
In 2018, Microsoft committed $5 billion to intelligent edge innovation by 2022 — an uptick from the $1.5 billion it spent prior to 2018 — and pledged to grow its IoT partner ecosystem to over 10,000. This investment has borne fruit in Azure IoT Central, a cloud service that enables customers to quickly provision and deploy IoT apps, and IoT Plug and Play, which provides devices that work with a range of off-the-shelf solutions. Microsoft’s investment has also bolstered Azure Sphere; Azure Security Center, its unified cloud and edge security suite; and Azure IoT Edge, which distributes cloud intelligence to run in isolation on IoT devices directly.
Microsoft has competition in Google’s Cloud IoT, a set of tools that connect, process, store, and analyze edge device data. Not to be outdone, Amazon Web Services’ IoT Device Management tracks, monitors, and manages fleets of devices running a range of operating systems and software. And Baidu’s OpenEdge offers a range of IoT edge computing boards and a cloud-based management suite to manage edge nodes, edge apps, and resources such as certification, password, and program code.
But the Redmond, Washington-based company has ramped up its buildout efforts, most recently with the acquisitions of CyberX and Express Logic, a San Diego, California-based developer of real-time operating systems (RTOS) for IoT and edge devices powered by microcontroller units. Microsoft has also partnered with companies like DJI, SAP, PTC, Qualcomm, and Carnegie Mellon University for IoT and edge app development.
AMD’s highly anticipated Ryzen 4000 mobile CPUs may be built on the same 7nm process as the company’s wildly successful Ryzen 3000 chips, but this time around the company is pinning the chip’s success on a carefully balanced design.
AMD officials said they’ve actually been working on the design for Ryzen 4000 mobile (code-named ‘Renoir’) since 2017, which, they note, predates the introduction of the company’s first Ryzen desktop chips.
The goal for the mobile chip couldn’t be more different from that of its desktop counterparts. “The challenge in doing a notebook processor is balance: How do we balance the attributes that make it a good notebook processor?” said Dan Bouvier, AMD’s client products chief architect.
A laptop chip can’t go all-out like a desktop chip can. It has to consider the notebook chassis, the Z-height (thickness), the power envelope, and the battery life. “These are all opposing things that work against bringing higher performance,” Bouvier explained, “but you still want to balance that and bring the best performance.”
Most of the architectural changes with Zen 2 are well known, but its 15-percent increase in instructions per clock (IPC) has made Zen 2 the hit it is.
Bouvier added that AMD took a risk by stretching Renoir’s design beyond that of its quad-core predecessor. “When we started Renoir, we said, ‘let’s do quad-core, we’ll just make it faster.’” But Bouvier said AMD realized even more was possible so it aimed for a 6-core CPU. And once those models came back, AMD aimed even higher. “We started looking at the models and said, this is looking pretty good—let’s go further. So we did go eight cores, and we really went out on a limb.”
And remember, Bouvier pointed out: In 2017, competitor Intel was still selling a dual-core CPU.
Two 4-core CCXs are used to build the basic Ryzen 4000 CPUs today.
The Ryzen 4000 CPU’s basic building block is essentially the same 7nm Zen 2 core AMD has used with its Ryzen 3000 series and third-generation Threadripper CPUs, but optimized for mobile. The basic building block of a mobile Ryzen is the quad-core core complex, or “CCX.” Each CCX features four cores with SMT, each with 512KB of L2 cache, plus a 4MB Level 3 cache that’s shared among all four cores. Two of the CCXs make up an 8-core chip.
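Assuming the standard Zen 2 cache sizes for Renoir (512KB of private L2 per core and 4MB of shared L3 per CCX — our figures, not quoted from AMD here), the two-CCX topology multiplies out as follows:

```python
# Renoir topology as described: two quad-core CCXs.
# Cache sizes are the standard Zen 2 figures (our assumption).
CCXS = 2
CORES_PER_CCX = 4
L2_PER_CORE_KB = 512   # private L2 per core
L3_PER_CCX_MB = 4      # L3 shared within one CCX

total_cores = CCXS * CORES_PER_CCX                          # 8 cores
total_l2_mb = CCXS * CORES_PER_CCX * L2_PER_CORE_KB / 1024  # 4.0 MB
total_l3_mb = CCXS * L3_PER_CCX_MB                          # 8 MB
print(total_cores, total_l2_mb, total_l3_mb)
```

That yields an 8-core, 16-thread part with 4MB of total L2 and 8MB of total L3.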
You might expect AMD to use a single eight-core cluster for power efficiency. Bouvier described it as a “tradeoff,” but noted that the multi-CCX design still enables very high bandwidth, very high frequency, and better power efficiency.
Ryzen 4000 chips include highly optimized 7nm Radeon Vega graphics
7nm Vega: Not your Ryzen 3000’s Vega
When Ryzen 4000 was unveiled at CES, some were disappointed that its graphics are based on AMD’s older Radeon Vega cores, rather than the company’s newest Navi cores.
Bouvier said in 2017, when the chip was first sketched out, AMD didn’t think a mobile-optimized version of its Navi cores could be done in time. Luckily, Bouvier said, AMD also realized that the Vega architecture “still had a lot of gas in the tank.”
Many of the design decisions appear counterintuitive, Bouvier admitted. For example, while the 12nm Ryzen 7 3700U features 10 Vega Compute Units, the 7nm Ryzen 7 4800U features 8 Vega Compute units. “The reason is we studied this against our performance,” Bouvier explained. “As we shrink that engine down, everything got closer together, the wires got closer together, and we’re now able to run at a much higher frequency. So we traded area, which is good in a mobile device, for frequency, and 7nm gave us that.”
AMD said it will pick up 59 percent more performance per graphics compute unit over the previous gen Ryzen in large part thanks to the smaller 7nm process, higher clock speeds and architecture changes.
AMD says the newer 7nm-based Vega cores will offer a blistering 59 percent more performance per CU than the 12nm Vega cores in the Ryzen 7 3700U chip.
AMD silicon doesn’t get all the credit for the performance uplift though. Bouvier said that the platform greatly benefits from faster memory support. “In designing APUs, this is our biggest nemesis,” Bouvier said. “DDR(4) memory is not going faster and faster.” By moving to LPDDR4X support at up to 4,266MHz, Ryzen 4000 laptops gain a 77-percent uptick in memory bandwidth, according to Bouvier.
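The 77-percent figure checks out against raw transfer rates if we assume a 128-bit memory interface in both cases and a DDR4-2400 baseline (the baseline speed is our assumption; AMD’s quote doesn’t specify it):

```python
BUS_BYTES = 16  # 128-bit interface = 16 bytes per transfer

def bandwidth_gbps(megatransfers_per_sec):
    """Peak bandwidth in GB/s for a 128-bit-wide memory interface."""
    return megatransfers_per_sec * BUS_BYTES / 1000

ddr4 = bandwidth_gbps(2400)     # DDR4-2400 (assumed Ryzen 3000 mobile baseline)
lpddr4x = bandwidth_gbps(4266)  # LPDDR4X-4266
print(f"DDR4-2400: {ddr4:.1f} GB/s, LPDDR4X-4266: {lpddr4x:.1f} GB/s")
print(f"uplift: {(lpddr4x / ddr4 - 1) * 100:.0f}%")  # ~78%, near AMD's 77%
```

The uplift lands at roughly 78 percent, in line with the figure Bouvier cites.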
AMD’s Ryzen 4000 chips will pick up 77 percent more memory bandwidth over previous versions by using LPDDR4X/4266.
Battery life matters
While AMD’s previous 12nm Ryzen mobile chips offered reasonable raw performance, they typically got blown out by Intel’s CPUs in battery life. That shouldn’t be the case with Ryzen 4000, which AMD says will be far faster and far more efficient.
“What we wanted to do is take a dragster engine, put it in an SUV, but still get Prius efficiency,” said AMD senior fellow Scott Swanstrom, referring in the latter case to Toyota’s pioneering fuel-efficient hybrid. “That was really the challenge for this product.” To do that, AMD made some significant changes to how the CPU manages its power states and its boost states.
AMD said the new Ryzen 4000 CPUs consume 59 percent less power than Ryzen 3000 CPUs while running an application, because the chip spends more time in a low-power state.
AMD said one issue it solved in the move from the 12nm Ryzen 3000 mobile chips was ramping in and out of low-power states. While it may sound good to enter a low-power state as often as possible, the previous generation of chips was “overly aggressive” about it, Swanstrom said. Ryzen 3000 would ping-pong back and forth between states, which actually hurt power efficiency. With Ryzen 4000, AMD largely avoids that see-saw and uses less power.
Swanstrom said Ryzen 4000 does this through looking at driver feedback, BIOS feedback, and OS feedback, as well as sensors placed in the laptop and in the CPU itself. For example, the graphics driver might look at whether it’s running a 3D-intensive load and then flag the system’s System Management Controller to let it know more power is needed.
While most of the time the System Management Controller will try to monitor the hardware and software to predict what performance to provide, Swanstrom said Ryzen 4000 will be closely tied with Windows 10’s Power Slider UI.
Today, the Power Slider UI doesn’t seem to do much on most laptops, but on many Ryzen 4000-based systems it should genuinely give you more performance or maximize battery life. Many OEMs provide their own power controls as well, and those will still exist on Ryzen 4000 laptops.
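The ping-pong problem Swanstrom describes is essentially one of hysteresis: don’t drop into a low-power state on every idle sample, only after sustained idleness. A toy controller illustrates the idea (all thresholds and names here are illustrative, not AMD’s actual firmware logic):

```python
class PowerStateController:
    """Toy model: enter the low-power state only after several consecutive
    idle samples, rather than on every idle sample (which causes the
    wasteful ping-ponging described above)."""

    def __init__(self, idle_samples_required=3):
        self.idle_samples_required = idle_samples_required
        self.idle_streak = 0
        self.state = "active"

    def sample(self, utilization):
        if utilization < 0.05:  # near-idle sample
            self.idle_streak += 1
            if self.idle_streak >= self.idle_samples_required:
                self.state = "low-power"
        else:
            self.idle_streak = 0
            self.state = "active"
        return self.state

ctrl = PowerStateController()
# A bursty trace: a single idle gap no longer flips the state.
trace = [0.9, 0.0, 0.8, 0.0, 0.0, 0.0, 0.0]
states = [ctrl.sample(u) for u in trace]
print(states)
```

With the streak requirement, the brief idle gap mid-trace stays in the active state, and only the sustained idle tail transitions to low power.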
AMD’s System Management Controller is based mostly in firmware; it knows whether the laptop has a discrete graphics card and can budget power to different blocks of the laptop.
Maximum Boost
Modern laptops have long exploited boost clock modes to extract the most performance from a chip. AMD’s Ryzen 4000 will feature two mechanisms to push clocks as high, and for as long, as possible.
Both methods rely on telemetry data from the CPU itself as well as remote temperature diodes placed in the laptop chassis.
The first is Skin Temperature Aware Power Management, or STAPM. This is how AMD mainly tunes for high clocks on bursty or short loads. For example, if you were to launch a webpage and start to browse, STAPM would push clock speeds and power usage as hard as possible, even briefly exceeding the CPU’s rated long-term thermal or power limits. In some ways, it’s analogous to Intel’s Power Limit settings, which typically dictate power usage and clock speeds for very short durations.
Once the CPU’s power management has reached its limit and realizes that no, this isn’t a burst load, AMD’s System Temperature Tracking V2 technology kicks in.
STT V2 looks at the diodes that measure the temperature on the bottom of the laptop, or the keyboard, or near the GPU, and decides just how hard the processor can keep pushing. Laptop makers decide where the diodes go and just how hot the laptop skin can get, but AMD said STT V2 can typically extend a burst clock duration by 4x versus what using STAPM or STAPM-like techniques alone.
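The key idea behind skin-temperature-aware boosting is that chassis temperature responds far more slowly than die temperature, so sustained power can be modeled with a slow-moving average: bursts are allowed until that average catches up. A simplified sketch of the concept (time constants and limits are illustrative, not AMD’s tuned values):

```python
def boost_power_grants(power_trace, sustained_limit=15.0,
                       burst_limit=25.0, alpha=0.1):
    """Return the power granted each tick. A slow exponential average
    stands in for skin temperature: requests may exceed the sustained
    limit until the average warms up to it."""
    avg = 0.0
    granted = []
    for requested in power_trace:
        if avg < sustained_limit:
            allowed = min(requested, burst_limit)      # "skin" still cool
        else:
            allowed = min(requested, sustained_limit)  # clamp to sustained
        avg = (1 - alpha) * avg + alpha * allowed      # slow thermal model
        granted.append(allowed)
    return granted

# A long 25W request: early ticks burst, later ticks are clamped to 15W.
grants = boost_power_grants([25.0] * 40)
print(grants[0], grants[-1])
```

The first samples are granted the full burst power, and once the slow average reaches the sustained limit the grant settles at that limit — the same shape as a burst boost giving way to a long-duration thermal cap.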
AMD’s STT V2 and STAPM are the guard rails on just how hard and how long the new Ryzen 4000 can boost to its maximum clock speeds. STAPM mainly guides short burst boosts while STT V2 controls long duration boost loads by considering laptop skin temp.
Conclusion: Now we wait
In the end, this sounds very impressive, but it’s just theory until we start to see Ryzen 4000 CPUs in shipping laptops. Still, we shouldn’t understate just how significant this is for AMD: For the first time in its history, it may very well take the performance lead in laptops.