Machine learning techniques are providing new tools that could help archaeologists understand the past — particularly when it comes to deciphering ancient texts. The latest example is an AI model created by Alphabet subsidiary DeepMind that not only helps restore text missing from ancient Greek inscriptions but also offers suggestions for when the text was written (within a 30-year period) and for its possible geographic origins.
“Inscriptions are really important because they are direct sources of evidence … written directly by ancient people themselves,” Thea Sommerschield, a historian and machine learning expert who helped create the model, told journalists in a press briefing.
Due to their age, these texts are often damaged, making restoration a rewarding challenge. And because they are often inscribed on inorganic materials like stone or metal, methods like radiocarbon dating can’t be used to find out when they were written. “To solve these tasks, epigraphers look for textual and contextual parallels in similar inscriptions,” said Sommerschield, who was co-lead on the work alongside DeepMind staff research scientist Yannis Assael. “However, it’s really difficult for a human to harness all existing, relevant data and to discover underlying patterns.”
That’s where machine learning can help.
Ancient Greek inscriptions are often fragmented. The software Ithaca can suggest what letters are missing. Image: DeepMind
The new software, named Ithaca, is trained on a dataset of 78,608 ancient Greek inscriptions, each labeled with metadata describing where and when it was written (to the best of historians’ knowledge). Like all machine learning systems, Ithaca looks for patterns in this information, encodes them in complex mathematical models, and uses those inferences to suggest missing text, a date, and a place of origin.
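The details of Ithaca’s architecture are in the paper, but as a rough illustration of what a multi-task setup like this involves, here is a minimal, hypothetical PyTorch sketch — not DeepMind’s actual code — of a model that reads a tokenized inscription and produces three outputs: per-character restoration, one of 84 regions, and a discretized date. The vocabulary size, layer sizes, and date buckets are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class InscriptionModel(nn.Module):
    """Illustrative multi-task model: restore characters, attribute region and date.
    All sizes are made up; Ithaca itself is described in the Nature paper."""
    def __init__(self, vocab_size=120, n_regions=84, n_date_buckets=80, d_model=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        encoder_layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=4)
        self.char_head = nn.Linear(d_model, vocab_size)      # per-position character restoration
        self.region_head = nn.Linear(d_model, n_regions)     # one of 84 ancient regions
        self.date_head = nn.Linear(d_model, n_date_buckets)  # discretized date range

    def forward(self, tokens):
        h = self.encoder(self.embed(tokens))                 # (batch, seq, d_model)
        pooled = h.mean(dim=1)                                # sequence summary for attribution
        return self.char_head(h), self.region_head(pooled), self.date_head(pooled)

tokens = torch.randint(0, 120, (2, 60))                       # two toy inscriptions, 60 characters each
char_logits, region_logits, date_logits = InscriptionModel()(tokens)
```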
In a paper published in Nature that describes Ithaca, the scientists who created the model say it is 62 percent accurate when restoring letters in damaged texts. It can attribute an inscription’s geographic origins to one of 84 regions of the ancient world with 71 percent accuracy and can date a text to within, on average, 30 years of its known year of writing.
These are promising statistics, but it’s important to remember that Ithaca is not capable of operating independently of human expertise. Its suggestions are ultimately based on data collected by traditional archaeological methods, and its creators are positioning it as simply another tool in a wider set of forensic methods, rather than a fully-automated AI historian. “Ithaca was designed as a complementary tool to aid historians,” said Sommerschield.
Ithaca is the first model to combine geographical and chronological attribution with textual restoration. Image: DeepMind
Eleanor Dickey, a professor of classics from the University of Reading who specializes in ancient Greek and Latin sociolinguistics, told The Verge that Ithaca was an “exciting development that may improve our knowledge of the ancient world.” But she added that 62 percent accuracy for restoring lost text was not reassuringly high — “when people rely on it they will need to keep in mind that it is wrong about one third of the time” — and that she was not sure how the software would fit into existing academic methodologies.
For example, DeepMind highlighted tests that showed the model helped improve the accuracy of historians restoring missing text in ancient inscriptions from 25 percent to 72 percent. But Dickey notes that those being tested were students, not professional epigraphers. She says that AI models may be broadly accessible, but that doesn’t mean they can or should replace the small cadre of specialized academics who decipher texts.
“It is not yet clear to what extent use of this tool by genuinely qualified editors would result in an improvement in the editions generally available — but it will be interesting to find out,” said Dickey. She added that she was looking forward to trying the Ithaca model out for herself. The software, along with its open-source code, is available online for anyone to test.
Ithaca and its predecessor (named Pythia and released in 2019) have already been used to inform recent archaeological debates — including helping to date inscriptions discovered in the Acropolis of Athens. However, the true potential of the software has yet to be seen.
Sommerschield stresses that the real value of Ithaca may be in its flexibility. Although it was trained on ancient Greek inscriptions, it could be easily configured to work with other ancient scripts. “Ithaca’s architecture makes it really applicable to any ancient language, not just Latin, but Mayan, cuneiform; really any written medium — papyri, manuscripts,” she said. “There’s a lot of opportunities.”
While discussions about AI often center around the technology’s commercial potential, increasingly, researchers are investigating ways that AI can be harnessed to drive societal change. Among others, Facebook chief AI scientist Yann LeCun and Google Brain cofounder Andrew Ng have argued that mitigating climate change and promoting energy efficiency are preeminent challenges for AI researchers.
Along this vein, researchers at the Montreal AI Ethics Institute have proposed a framework designed to quantify the social impact of AI through techniques like compute-efficient machine learning. An IBM project delivers farm cultivation recommendations from digital farm “twins” that simulate the future soil conditions of real-world crops. Other researchers are using AI-generated images to help visualize climate change, and nonprofits like WattTime are working to reduce households’ carbon footprint by automating when electric vehicles, thermostats, and appliances are active based on where renewable energy is available.
Seeking to spur further explorations in the field, a group at the Stanford Sustainability and Artificial Intelligence Lab this week released (to coincide with NeurIPS 2021) a benchmark dataset called SustainBench for monitoring sustainable development goals (SDGs) including agriculture, health, and education using machine learning. As the coauthors told VentureBeat in an interview, the goal is threefold: (1) lower the barriers to entry for researchers to contribute to achieving SDGs; (2) provide metrics for evaluating SDG-tracking algorithms; and (3) encourage the development of methods where improved AI model performance facilitates progress toward SDGs.
“SustainBench was a natural outcome of the many research projects that [we’ve] worked on over the past half-decade. The driving force behind these research projects was always the lack of large, high-quality labeled datasets for measuring progress toward the United Nations Sustainable Development Goals (UN SDGs), which forced us to come up with creative machine learning techniques to overcome the label sparsity,” the coauthors said. “[H]aving accumulated enough experience working with datasets from diverse sustainability domains, we realized earlier this year that we were well-positioned to share our expertise on the data side of the machine learning equation … Indeed, we are not aware of any prior sustainability-focused datasets with similar size and scale of SustainBench.”
Motivation
Progress toward SDGs has historically been measured through civil registrations, population-based surveys, and government-orchestrated censuses. However, data collection is expensive, leading many countries to go decades between taking measurements on SDG indicators. It’s estimated that only half of SDG indicators have regular data from more than half of the world’s countries, limiting the ability of the international community to track progress toward the SDGs.
“For example, early on during the COVID-19 pandemic, many developing countries implemented their own cash transfer programs, similar to the direct cash payments from the IRS in the United States. However … data records on household wealth and income in developing countries are often unreliable or unavailable,” the coauthors said.
However, innovations in AI have shown promise in helping to plug these data gaps. Data from satellite imagery, social media posts, and smartphones can be used to train models to predict things like poverty, annual land cover, deforestation, agricultural cropping patterns, crop yields, and even the location and impact of natural disasters. For example, the governments of Bangladesh, Mozambique, Nigeria, Togo, and Uganda used machine learning-based poverty and cropland maps to direct economic aid to their most vulnerable populations during the pandemic.
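As a rough sketch of what such a pipeline can look like in practice — not SustainBench’s actual API, and with random tensors standing in for real imagery and survey labels — a poverty-mapping model can be as simple as a CNN regressing a survey-derived wealth index from satellite image tiles:

```python
import torch
import torch.nn as nn
from torchvision import models

# Stand-ins for preprocessed satellite tiles and matching survey-derived wealth labels.
images = torch.randn(16, 3, 224, 224)
wealth = torch.randn(16)

backbone = models.resnet18()                              # small CNN backbone, random init
backbone.fc = nn.Linear(backbone.fc.in_features, 1)       # single regression output

optimizer = torch.optim.Adam(backbone.parameters(), lr=1e-4)
loss_fn = nn.MSELoss()

for epoch in range(3):                                    # full-batch loop, kept tiny for brevity
    pred = backbone(images).squeeze(1)
    loss = loss_fn(pred, wealth)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"epoch {epoch}: MSE {loss.item():.4f}")
```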
But progress has been hindered by challenges, including a lack of expertise and dearth of data for low-income countries. With SustainBench, the Stanford researchers — along with contributors at Caltech, UC Berkeley, and Carnegie Mellon — hope to provide a starting ground for training machine learning models that can help measure SDG indicators and have a wide range of applications for real-world tasks.
SustainBench contains a suite of 15 benchmark tasks across seven SDGs taken from the United Nations, including good health and well-being, quality education, and clean water and sanitation. Beyond this, SustainBench offers tasks for machine learning challenges that cover 119 countries, each designed to promote the development of SDG measurement methods on real-world data.
The coauthors caution that AI-based approaches should supplement, rather than replace, ground-based data collection. They point out that ground truth data are necessary for training models in the first place, and that even the best sensor data can only capture some — but not all — of the outcomes of interest. But AI, they still believe, can be helpful for measuring sustainability indicators in regions where ground truth measurements are scarce or unavailable.
“[SDG] indicators have tremendous implications for policymakers, yet ‘key data are scarce, and often scarcest in places where they are most needed,’ as several of our team members wrote in a recent Science review article. By using abundant, cheap, and frequently updated sensor data as inputs, AI can help plug these data gaps. Such input data sources include publicly available satellite images, crowdsourced street-level images, Wikipedia entries, and mobile phone records, among others,” the coauthors said.
Future work
In the short term, the coauthors say that they’re focused on raising awareness of SustainBench within the machine learning community. Future versions of SustainBench are in the planning stages, potentially with additional datasets and AI benchmarks.
“Two technical challenges stand out to us. The first challenge is to develop machine learning models that can reason about multi-modal data. Most AI models today tend to work with single data modalities (e.g., only satellite images, or only text), but sensor data often comes in many forms … The second challenge is to design models that can take advantage of the large amount of unlabeled sensor data, compared to sparse ground truth labels,” the coauthors said. “On the non-technical side, we also see a challenge in getting the broader machine learning community to focus more efforts on sustainability applications … As we alluded to earlier, we hope SustainBench makes it easier for machine learning researchers to recognize the role and challenges of machine learning for sustainability applications.”
Hundreds of millions of years of evolution have produced a variety of life-forms, each intelligent in its own fashion. Each species has evolved to develop innate skills, learning capacities, and a physical form that ensures survival in its environment.
But despite being inspired by nature and evolution, the field of artificial intelligence has largely focused on creating the elements of intelligence separately and fusing them together after the development process. While this approach has yielded great results, it has also limited the flexibility of AI agents in some of the basic skills found in even the simplest life-forms.
In a new paper published in the scientific journal Nature, AI researchers at Stanford University present a new technique that can help take steps toward overcoming some of these limits. Called “deep evolutionary reinforcement learning,” or DERL, the new technique uses a complex virtual environment and reinforcement learning to create virtual agents that can evolve both in their physical structure and learning capacities. The findings can have important implications for the future of AI and robotics research.
Evolution is hard to simulate
In nature, the body and brain evolve together. Across many generations, every animal species has gone through countless cycles of mutation to grow limbs, organs, and a nervous system to support the functions it needs in its environment. Mosquitoes are equipped with thermal vision to spot body heat. Bats have wings to fly and an echolocation apparatus to navigate dark spaces. Sea turtles have flippers to swim with and a magnetic field detector system to travel very long distances. Humans have an upright posture that frees their arms and lets them see the far horizon, hands and nimble fingers that can manipulate objects, and a brain that makes them the best social creatures and problem solvers on the planet.
Interestingly, all these species descended from the first life-form that appeared on Earth several billion years ago. Based on the selection pressures caused by the environment, the descendants of those first living beings evolved in many directions.
Studying the evolution of life and intelligence is interesting, but replicating it is extremely difficult. An AI system that tried to recreate intelligent life the way evolution did would have to search a very large space of possible morphologies, which is extremely expensive computationally and would require many parallel and sequential trial-and-error cycles.
AI researchers use several shortcuts and predesigned features to overcome some of these challenges. For example, they fix the architecture or physical design of an AI or robotic system and focus on optimizing the learnable parameters. Another shortcut is the use of Lamarckian rather than Darwinian evolution, in which AI agents pass on their learned parameters to their descendants. Yet another approach is to train different AI subsystems separately (vision, locomotion, language, etc.) and then bolt them together in a final AI or robotic system. While these approaches speed up the process and reduce the costs of training and evolving AI agents, they also limit the flexibility and variety of results that can be achieved.
Deep evolutionary reinforcement learning
In their new work, the researchers at Stanford aim to bring AI research a step closer to the real evolutionary process while keeping the costs as low as possible. “Our goal is to elucidate some principles governing relations between environmental complexity, evolved morphology, and the learnability of intelligent control,” they wrote in their paper.
Within the DERL framework, each agent uses deep reinforcement learning to acquire the skills required to maximize its goals during its lifetime. DERL uses Darwinian evolution to search the morphological space for optimal solutions, which means that when a new generation of AI agents is spawned, they inherit only the physical and architectural traits of their parents (along with slight mutations). None of the learned parameters are passed on across generations.
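In schematic terms, the interplay between the two loops looks roughly like the sketch below. This is not the paper’s exact algorithm (DERL runs its own tournament-style evolution over MuJoCo agents); the three callables — `random_morphology`, `train_with_rl`, and `mutate_morphology` — are placeholders the caller would supply.

```python
import random

def evolve(random_morphology, train_with_rl, mutate_morphology,
           population_size=32, generations=10):
    """Schematic Darwinian outer loop: bodies are inherited, learned weights never are."""
    population = [random_morphology() for _ in range(population_size)]
    best = None
    for _ in range(generations):
        # Inner loop: every agent learns its controller from scratch with deep RL;
        # its fitness is its lifetime reward, and the trained weights are then discarded.
        scored = sorted(((train_with_rl(m), m) for m in population),
                        key=lambda pair: pair[0], reverse=True)
        best = scored[0][1]
        parents = [m for _, m in scored[: population_size // 2]]
        # Children inherit only the parent's body plan, plus small random mutations.
        children = [mutate_morphology(random.choice(parents))
                    for _ in range(population_size - len(parents))]
        population = parents + children
    return best
```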
“DERL opens the door to performing large-scale in silico experiments to yield scientific insights into how learning and evolution cooperatively create sophisticated relationships between environmental complexity, morphological intelligence, and the learnability of control tasks,” the researchers wrote.
Simulating evolution
For their framework, the researchers used MuJoCo, a virtual environment that provides highly accurate rigid-body physics simulation. Their design space is called Universal Animal (Unimal), in which the goal is to create morphologies that learn locomotion and object-manipulation tasks in a variety of terrains.
Each agent in the environment is composed of a genotype that defines its limbs and joints. The direct descendant of each agent inherits the parent’s genotype and goes through mutations that can create new limbs, remove existing limbs, or make small modifications to characteristics, such as the degrees of freedom or the size of limbs.
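To make that concrete, a mutation operator over such a genotype could look like the toy sketch below; the dictionary representation of limbs is entirely made up for illustration (Unimal uses its own tree-structured body description).

```python
import copy
import random

def mutate_morphology(genotype):
    """Toy mutation over a made-up genotype: a dict mapping limb names to parameters."""
    child = copy.deepcopy(genotype)
    limbs = child["limbs"]
    op = random.choice(["grow", "delete", "tweak"])
    if op == "grow":
        parent = random.choice(list(limbs))
        limbs[f"limb_{len(limbs)}"] = {"attached_to": parent,
                                       "length": random.uniform(0.1, 0.5),
                                       "dof": random.choice([1, 2])}
    elif op == "delete" and len(limbs) > 1:
        del limbs[random.choice(list(limbs))]
    else:
        # Small tweak to an existing limb, e.g. scale its length by up to 20 percent.
        limbs[random.choice(list(limbs))]["length"] *= random.uniform(0.8, 1.2)
    return child

seed = {"limbs": {"torso": {"attached_to": None, "length": 0.4, "dof": 0}}}
print(mutate_morphology(seed))
```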
Each agent is trained with reinforcement learning to maximize rewards in various environments. The most basic task is locomotion, in which the agent is rewarded for the distance it travels during an episode. Agents whose physical structures are better suited for traversing terrain learn faster to use their limbs for moving around.
To test the system’s results, the researchers generated agents in three types of terrains: flat (FT), variable (VT), and variable terrains with modifiable objects (MVT). The flat terrain puts the least selection pressure on the agents’ morphology. The variable terrains, on the other hand, force the agents to develop a more versatile physical structure that can climb slopes and move around obstacles. The MVT variant has the added challenge of requiring the agents to manipulate objects to achieve their goals.
The benefits of DERL
Above: Deep evolutionary reinforcement learning generates a variety of successful morphologies across different environments.
Image Credit: TechTalks
One of the interesting findings of DERL is the diversity of the results. Other approaches to evolutionary AI tend to converge on one solution because new agents directly inherit the physique and learnings of their parents. But in DERL, only morphological data is passed on to descendants; the system ends up creating a diverse set of successful morphologies, including bipeds, tripeds, and quadrupeds with and without arms.
At the same time, the system shows traits of the Baldwin effect, which suggests that agents that learn faster are more likely to reproduce and pass on their genes to the next generation. DERL shows that evolution “selects for faster learners without any direct selection pressure for doing so,” according to the Stanford paper.
“Intriguingly, the existence of this morphological Baldwin effect could be exploited in future studies to create embodied agents with lower sample complexity and higher generalization capacity,” the researchers wrote.
Finally, the DERL framework also validates the hypothesis that more complex environments will give rise to more intelligent agents. The researchers tested the evolved agents across eight different tasks, including patrolling, escaping, manipulating objects, and exploration. Their findings show that in general, agents that have evolved in variable terrains learn faster and perform better than AI agents that have only experienced flat terrain.
Their findings seem to be in line with another hypothesis by DeepMind researchers that a complex environment, a suitable reward structure, and reinforcement learning can eventually lead to the emergence of all kinds of intelligent behaviors.
AI and robotics research
The DERL environment only has a fraction of the complexities of the real world. “Although DERL enables us to take a significant step forward in scaling the complexity of evolutionary environments, an important line of future work will involve designing more open-ended, physically realistic, and multiagent evolutionary environments,” the researchers wrote.
In the future, the researchers plan to expand the range of evaluation tasks to better assess how the agents can enhance their ability to learn human-relevant behaviors.
The work could have important implications for the future of AI and robotics and push researchers to use exploration methods that are much more similar to natural evolution.
“We hope our work encourages further large-scale explorations of learning and evolution in other contexts to yield new scientific insights into the emergence of rapidly learnable intelligent behaviors, as well as new engineering advances in our ability to instantiate them in machines,” the researchers wrote.
Ben Dickson is a software engineer and the founder of TechTalks. He writes about technology, business, and politics.
This story originally appeared on Bdtechtalks.com. Copyright 2021
Streamlit, a popular app framework for data science and machine learning, has reached its version 1.0 milestone. The open source project is curated by a company of the same name that offers a commercial service built on the platform. So far, the project has had more than 4.5 million GitHub downloads and is used by more than 10,000 organizations.
The framework fills a vital void between data scientists who want to develop a new analytics widget or app and the data engineering typically required to deploy these at scale. Data scientists can build web apps to access and explore machine-learning models, advanced algorithms, and complex data types without having to master back-end data engineering tasks.
Streamlit cofounder and CEO Adrien Treuille told VentureBeat that “the combination of the elegant simplicity of the Streamlit library and the fact that it is all in Python means developers can do things in hours that normally took weeks.”
Examples of this productivity boost include reducing data app development time from three and a half weeks to six hours, or reducing 5,000 lines of JavaScript to 254 lines of Python in Streamlit, Treuille said.
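For a sense of why the line counts shrink, here is a minimal sketch of a Streamlit app; the file name and the assumed `score` column are illustrative, but the `st.*` calls are the library’s standard widgets. Saving this as `app.py` and running `streamlit run app.py` serves it as a web app.

```python
# app.py — run with: streamlit run app.py
import pandas as pd
import streamlit as st

st.title("Model exploration demo")

uploaded = st.file_uploader("Upload a CSV of predictions", type="csv")
if uploaded is not None:
    df = pd.read_csv(uploaded)
    threshold = st.slider("Score threshold", 0.0, 1.0, 0.5)   # interactive widget
    filtered = df[df["score"] >= threshold]                    # assumes a 'score' column
    st.write(f"{len(filtered)} rows above threshold")
    st.dataframe(filtered)                                     # interactive table in the browser
```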
The crowded landscape of data science apps
The San Francisco-based company joins a crowded landscape filled with dozens of DataOps tools that hope to streamline various aspects of AI, analytics, and machine-learning development. Treuille attributes the company’s quick growth to being able to fill the gap between data scientists’ tools for rapid exploration (Jupyter notebooks, for one example) and the complex technologies companies use to build robust internal tools (React and GraphQL), front-end interface (React and JavaScript), and data engineering tools (dbt and Spark). “This gap has been a huge pain point for companies and often means that rich data insights and models are siloed in the data team,” Treuille said.
The tool is used by everyone from data science students to large companies. The company is seeing the fastest growth in tech-focused enterprises with a large base of Python users and a need to rapidly experiment with new apps and analytics.
“Every company has the same problems with lots of data, lots of questions, and too little time to answer all of them,” Treuille said.
Improvements in v1.0 include faster app speed and responsiveness, improved customization, and support for statefulness. In 2022, the company plans to enhance its widget library, improve the developer experience, and make it easier for data scientists to share code, components, apps, and answers.
This article is part of a Technology and Innovation Insights series paid for by Samsung.
Similar to the relationship between an engine and oil, data and artificial intelligence (AI) are symbiotic. Data fuels AI, and AI helps us to understand the data available to us. Data and AI are two of the biggest topics in technology in recent years, as both work together to shape our lives on a daily basis. The sheer amount of data available right now is staggering and it doubles every two years. However, we currently only use about 2 percent of the data available to us. Much like when oil was first discovered, it is taking time for humans to figure out what to do with the new data available to us and how to make it useful.
Whether pulled from the cloud, your phone, TV, or an IoT device, the vast range of connected streams provide data on just about everything that goes on in our daily lives. But what do we do with it?
Earlier this month, HARMAN’s Chairman Young Sohn sat down with international journalist Ali Aslan in Berlin, Germany at the “New Data Economy and its Consequences” video symposium held by Global Bridges. Young and Ali discussed the importance of data, why AI without data is useless, and what needs to be considered when we look at the ethical use of data and AI — including bias, privacy, and security.
Bias
Unlike humans, technology and data are not inherently biased. As the old adage goes — data never lies. Bias in data and AI comes into play when humans train an AI algorithm or interpret data. Much of what we are consuming is influenced by where the data is coming from and what data is going into the system. Understanding and eliminating our biases is essential to ensuring a neutral algorithm and system.
Controlling data access and permissions is a key first step to removing bias. Having a diverse and inclusive team when developing algorithms and systems is essential. Not everyone has the same experiences and backgrounds, and diversity in both can help curb biases by providing different ways of interpreting data inputs and outputs.
Privacy
Permission and access are paramount when we look at the privacy aspect of data. Privacy is extremely important in our increasingly digital society. As such, consumers should have a choice at the beginning of a relationship with an organization and be asked whether they want to opt in, rather than having to opt out. GDPR has been a good first step in helping to protect consumers with regard to the capture and use of their data. While GDPR has many well-designed and important initiatives, the legislation could be more efficient.
Security
Whereas data privacy is more of a concern to consumers and individuals, data security has become a global concern for consumers, organizations, and nation-states.
It seems like every day we are reading about another cyber-attack or threat that we should be aware of. Chief among these concerns are the influx of ransomware attacks. Companies and individuals are paying increasingly large amounts of money to bad actors in an attempt to mitigate risk, attention, and embarrassment. These attacks are being carried out by individuals, collectives, and even nation-states in an attempt to cripple the systems of enemies, gather classified information, or garner capital gains.
So how do we trust that our data and information are safe, and what can we do to be better protected? While there may be bad actors using technology and data for their own nefarious purposes, there are also many positive uses for technology. The education and investments being made in the cybersecurity space have helped many organizations train employees and adopt technologies designed to prevent cybercrime at the source — human error. And while we may not be able to stop all cybercrime, we are making progress.
Data and AI for good
While data — both from a collection and storage viewpoint — and AI have gotten negative press around biases, privacy, and security, both can also be used to do an immense amount of good. For example, both data and AI have been crucial in the biomedical and agtech industries. Whether it’s COVID-19 detection and vaccine creation or the creation of biomes and removal of toxins in soil, data and AI have incredible potential. However, one cannot move forward without the other. A solid and stable infrastructure and network are also needed to ensure that we can make use of the other 98 percent of the global data available.
VB Lab Insights content is created in collaboration with a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. Content produced by our editorial team is never influenced by advertisers or sponsors in any way. For more information, contact sales@venturebeat.com.
Deep neural networks have gained fame for their capability to process visual information. And in the past few years, they have become a key component of many computer vision applications.
Among the key problems neural networks can solve is detecting and localizing objects in images. Object detection is used in many different domains, including autonomous driving, video surveillance, and healthcare.
One of the key components of most deep learning–based computer vision applications is the convolutional neural network (CNN). Invented in the 1980s by deep learning pioneer Yann LeCun, CNNs are a type of neural network that is efficient at capturing patterns in multidimensional spaces. This makes CNNs especially good for images, though they are used to process other types of data too. (To focus on visual data, we’ll consider our convolutional neural networks to be two-dimensional in this article.)
Every convolutional neural network is composed of one or several convolutional layers, a software component that extracts meaningful values from the input image. And every convolution layer is composed of several filters, square matrices that slide across the image and register the weighted sum of pixel values at different locations. Each filter has different values and extracts different features from the input image. The output of a convolution layer is a set of “feature maps.”
When stacked on top of each other, convolutional layers can detect a hierarchy of visual patterns. For instance, the lower layers will produce feature maps for vertical and horizontal edges, corners, and other simple patterns. The next layers can detect more complex patterns such as grids and circles. As you move deeper into the network, the layers will detect complicated objects such as cars, houses, trees, and people.
Each layer of the neural network encodes specific features from the input image.
Most convolutional neural networks use pooling layers to gradually reduce the size of their feature maps and keep the most prominent parts. Max-pooling, which is currently the main type of pooling layer used in CNNs, keeps the maximum value in a patch of pixels. For example, if you use a pooling layer with a size 2, it will take 2×2-pixel patches from the feature maps produced by the preceding layer and keep the highest value. This operation halves the size of the maps and keeps the most relevant features. Pooling layers enable CNNs to generalize their capabilities and be less sensitive to the displacement of objects across images.
Finally, the output of the convolution layers is flattened into a single dimension matrix that is the numerical representation of the features contained in the image. That matrix is then fed into a series of “fully connected” layers of artificial neurons that map the features to the kind of output expected from the network.
Architecture of convolutional neural network (CNN)
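The structure described above — stacked convolutions, max-pooling, flattening, and fully connected layers — can be sketched in a few lines of PyTorch. The layer sizes below are illustrative, not taken from any particular published architecture.

```python
import torch
import torch.nn as nn

class SimpleCNN(nn.Module):
    """Toy image classifier: conv layers extract feature maps, pooling shrinks them,
    and fully connected layers map the flattened features to class scores."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # halves spatial size, keeps strongest activations
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),                         # feature maps -> single vector
            nn.Linear(32 * 56 * 56, num_classes), # assumes 224x224 input images
        )

    def forward(self, x):
        return self.classifier(self.features(x))

logits = SimpleCNN()(torch.randn(1, 3, 224, 224))  # shape: (1, num_classes)
```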
The most basic task for convolutional neural networks is image classification, in which the network takes an image as input and returns a list of values that represent the probability that the image belongs to one of several classes.
For example, say you want to train a neural network to detect all 1,000 classes of objects contained in the popular open-source dataset ImageNet. In that case, your output layer will have 1,000 numerical outputs, each of which contains the probability of the image belonging to one of those classes.
You can always create and test your own convolutional neural network from scratch. But most machine learning researchers and developers use one of several tried and tested convolutional neural networks such as AlexNet, VGG16, and ResNet-50.
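Loading one of those tried-and-tested architectures usually takes only a couple of lines. The sketch below uses torchvision; note that the argument for requesting pretrained weights has changed across versions (older releases use `pretrained=True`), so treat the exact call as version-dependent.

```python
import torch
from torchvision import models

# ResNet-50 pretrained on ImageNet's 1,000 classes (recent torchvision versions).
model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.eval()

with torch.no_grad():
    logits = model(torch.randn(1, 3, 224, 224))   # stand-in for a preprocessed image
print(logits.shape)                                # torch.Size([1, 1000]): one score per class
```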
Object detection datasets
Object-detection networks need to be trained on precisely annotated images.
While an image classification network can tell whether an image contains a certain object or not, it won’t say where in the image the object is located. Object detection networks provide both the class of each object contained in an image and a bounding box that specifies its coordinates.
Object detection networks bear much resemblance to image classification networks and use convolution layers to detect visual features. In fact, most object detection networks use an image classification CNN and repurpose it for object detection.
Object detection is a supervised machine learning problem, which means you must train your models on labeled examples. Each image in the training dataset must be accompanied with a file that includes the boundaries and classes of the objects it contains. There are several open-source tools that create object detection annotations.
Example of an annotation file for object detection training data.
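Formats vary by tool, but one widely used convention is the plain-text YOLO format: one line per object, with a class ID and a bounding box expressed as fractions of the image size. The values below are made up purely for illustration.

```python
# Each line: <class_id> <x_center> <y_center> <width> <height>, all normalized to [0, 1].
annotation = """0 0.512 0.430 0.210 0.380
2 0.105 0.770 0.090 0.120"""

for line in annotation.splitlines():
    class_id, x_c, y_c, w, h = line.split()
    print(f"class {class_id}: box centered at ({x_c}, {y_c}), size {w} x {h} of the image")
```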
The object detection network is trained on the annotated data until it can find regions in images that correspond to each kind of object.
Now let’s look at a few object-detection neural network architectures.
The R-CNN deep learning model
R-CNN architecture.
The Region-based Convolutional Neural Network (R-CNN) was proposed by AI researchers at the University of California, Berkeley, in 2014. The R-CNN is composed of three key components.
First, a region selector uses “selective search,” an algorithm that finds regions of pixels in the image that might represent objects, also called “regions of interest” (RoI). The region selector generates around 2,000 regions of interest for each image.
Next, the RoIs are warped into a predefined size and passed on to a convolutional neural network. The CNN processes every region separately and extracts its features through a series of convolution operations. The CNN then uses fully connected layers to encode the feature maps into a single-dimensional vector of numerical values.
Finally, a classifier machine learning model maps the encoded features obtained from the CNN to the output classes. The classifier has a separate output class for “background,” which corresponds to anything that isn’t an object.
Object detection with R-CNN.
The original R-CNN paper suggests the AlexNet convolutional neural network for feature extraction and a support vector machine (SVM) for classification. But in the years since the paper was published, researchers have used newer network architectures and classification models to improve the performance of R-CNN.
R-CNN suffers from a few problems. First, the model must generate and crop 2,000 separate regions for each image, which can take quite a while. Second, the model must compute the features for each of the 2,000 regions separately. This amounts to a lot of calculations and slows down the process, making R-CNN unsuitable for real-time object detection. And finally, the model is composed of three separate components, which makes it hard to integrate computations and improve speed.
Fast R-CNN
Fast R-CNN architecture.
In 2015, the lead author of the R-CNN paper proposed a new architecture called Fast R-CNN, which solved some of the problems of its predecessor. Fast R-CNN brings feature extraction and region selection into a single machine learning model.
Fast R-CNN receives an image and a set of RoIs and returns a list of bounding boxes and classes of the objects detected in the image.
One of the key innovations in Fast R-CNN was the “RoI pooling layer,” an operation that takes CNN feature maps and regions of interest for an image and provides the corresponding features for each region. This allowed Fast R-CNN to extract features for all the regions of interest in the image in a single pass as opposed to R-CNN, which processed each region separately. This resulted in a significant boost in speed.
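The operation is easy to see in isolation with torchvision’s `roi_pool`: one feature map goes in once, and every region of interest comes out as a fixed-size feature block. The tensor sizes and box coordinates below are arbitrary, and the boxes are assumed to already be in feature-map coordinates (`spatial_scale=1.0`).

```python
import torch
from torchvision.ops import roi_pool

feature_map = torch.randn(1, 256, 50, 50)           # CNN output for one image (a single pass)
rois = torch.tensor([[0, 10.0, 10.0, 30.0, 30.0],   # [batch_index, x1, y1, x2, y2]
                     [0,  5.0, 20.0, 25.0, 45.0]])
pooled = roi_pool(feature_map, rois, output_size=(7, 7), spatial_scale=1.0)
print(pooled.shape)                                  # torch.Size([2, 256, 7, 7]): fixed size per region
```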
However, one issue remained unsolved. Fast R-CNN still required the regions of the image to be extracted and provided as input to the model. Fast R-CNN was still not ready for real-time object detection.
Faster R-CNN
Faster R-CNN architecture.
Faster R-CNN, introduced in 2016, solves the final piece of the object-detection puzzle by integrating the region extraction mechanism into the object detection network.
Faster R-CNN takes an image as input and returns a list of object classes and their corresponding bounding boxes.
The architecture of Faster R-CNN is largely similar to that of Fast R-CNN. Its main innovation is the “region proposal network” (RPN), a component that takes the feature maps produced by a convolutional neural network and proposes a set of bounding boxes where objects might be located. The proposed regions are then passed to the RoI pooling layer. The rest of the process is similar to Fast R-CNN.
By integrating region detection into the main neural network architecture, Faster R-CNN achieves near-real-time object detection speed.
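torchvision ships a ready-made Faster R-CNN implementation, so a minimal usage sketch looks like the following. The random tensor stands in for a real RGB image scaled to [0, 1], and the pretrained-weights argument varies with your torchvision version.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

model = fasterrcnn_resnet50_fpn(weights="DEFAULT")   # COCO-pretrained on recent torchvision
model.eval()

image = torch.rand(3, 480, 640)                      # stand-in for a real image in [0, 1]
with torch.no_grad():
    prediction = model([image])[0]                   # one dict per input image

for box, label, score in zip(prediction["boxes"], prediction["labels"], prediction["scores"]):
    if score > 0.8:                                  # keep only confident detections
        print(f"class {label.item()} at {[round(v, 1) for v in box.tolist()]} ({score:.2f})")
```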
YOLO
YOLO architecture.
In 2016, researchers at the University of Washington, the Allen Institute for AI, and Facebook AI Research proposed “You Only Look Once” (YOLO), a family of neural networks that improved the speed and accuracy of object detection with deep learning.
The main improvement in YOLO is the integration of the entire object detection and classification process in a single network. Instead of extracting features and regions separately, YOLO performs everything in a single pass through a single network, hence the name “You Only Look Once.”
YOLO can perform object detection at video streaming frame rates and is suitable for applications that require real-time inference.
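The original YOLO models were released as part of the Darknet framework, but a common way to try a YOLO-family detector from Python today is the open-source YOLOv5 implementation from Ultralytics (a separate project from the original paper’s authors), which can be pulled via `torch.hub`. This downloads code and weights on first run, so it needs network access.

```python
import torch

model = torch.hub.load("ultralytics/yolov5", "yolov5s")       # small, fast variant
results = model("https://ultralytics.com/images/zidane.jpg")  # accepts a path, URL, or array

results.print()              # summary: detected classes and inference time
print(results.xyxy[0][:3])   # first detections: x1, y1, x2, y2, confidence, class
```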
In the past few years, deep learning object detection has come a long way, evolving from a patchwork of different components to a single neural network that works efficiently. Today, many applications use object-detection networks as one of their main components. It’s in your phone, computer, car, camera, and more. It will be interesting (and perhaps creepy) to see what can be achieved with increasingly advanced neural networks.
This article was originally published by Ben Dickson on TechTalks, a publication that examines trends in technology, how they affect the way we live and do business, and the problems they solve. But we also discuss the evil side of technology, the darker implications of new tech, and what we need to look out for. You can read the original article here.
Normal human beings think of running, going to the gym, and putting a cap on those beer bottles when it comes to losing weight. But scientists are thinking about this whole thing differently. They want you to shut your trap, so you can stop hogging on those fries — literally.
A bunch of researchers from the University of Otago and the UK have teamed up to come up with a device called DentalSlim for weight loss. While the name sounds fancy, it doesn’t describe the type of torture device it really is.
According to the description, DentalSlim will prevent you from opening your mouth wider than 2mm by locking your jaw through “magnetic devices and custom-manufactured locking bolts.”
It gets attached to your teeth and it’s meant to make you adhere to the low-calorie diet and stop your snacking habits. Researchers say that it allows you to consume liquids easily without restricting breathing or speech.
They also mentioned that after two or three weeks of this rigorous routine (read: torture), the magnetic hinges of the device loosen up, allowing the person to have a more relaxed diet. The idea is to avoid cost-heavy surgery to reduce weight, and instead force control for a certain time with this device.
There have been some unique devices, such as this Gastric Band, out there to control food intake. But in terms of being straight out horrific, DentalSlim takes the cake.
While the intentions behind the device seem noble on paper, the device itself looks like a miniature version of a jaw trap that could be featured in the movie series Saw.
This week in a paper published in the journal Nature, researchers at Google detailed how they used AI to design the next generation of tensor processing units (TPUs), the company’s application-specific integrated circuits optimized for AI workloads. While the work wasn’t novel — Google has been refining the technique for years — it gave the clearest illustration yet of AI’s potential in hardware design. Previous experiments didn’t yield commercially viable products, only prototypes. But the Nature paper suggests AI can at the very least augment human designers to accelerate the brainstorming process.
Beyond chips, companies like U.S.- and Belgium-based Oqton are applying AI to design domains including additive manufacturing. Oqton’s platform automates CNC, metal, and polymer 3D printing and hybrid additive and subtractive workflows, like creating castable jewelry wax. It suggests a range of optimizations and fixes informed by AI inspection algorithms, as well as by pre-analyses of part geometry and real-time calibration. For example, Oqton can automatically adjust geometries to get parts within required tolerances, simulating heat treatment effects like warpage, shrinkage, and stress relief on titanium, cobalt, chrome, zirconia, and other materials.
While it’s still in the research stages, MIT’s Computer Science and Artificial Intelligence Laboratory developed an AI-powered tool called LaserFactory that can print fully functional robots and drones. LaserFactory leverages a three-ingredient recipe that lets users create structural geometry, print traces, and assemble electronic components like sensors, circuits, and actuators. As the researchers behind LaserFactory note in a paper describing their work, it could in theory be used for jobs like delivery or search-and-rescue.
At Renault, engineers are leveraging AI-powered software created by Siemens Digital Industries Software to automate the design of automated manual transmission (AMT) systems in cars. AMT, which behaves like an automatic transmission but allows drivers to shift gears electronically using a push-button, can take up to a year of trial and error to ideate, develop, and thoroughly validate. But Siemens’ tool enables Renault engineers to drag, drop, and connect icons to graphically create a model of an AMT. The software predicts the behavior and performance of the AMT’s components and makes any necessary refinements early in the development cycle.
Even Nutella is tapping AI for physical products, using the technology to pull from a database of dozens of patterns and colors to create different versions of its packaging. In 2017, working with advertising agency Ogilvy & Mather Italia, the company splashed over 7 million unique designs on “Nutella Unica” jars throughout Italy, which sold out in a month.
Philosophical shift
People might perceive these applications as taking agency away from human designers, but the coauthors of a recent Harvard Business School working paper argue that AI actually enables designers to overcome past limitations — from scale and scope to learning.
“In the context of AI factories, solutions may even be more user-centered, more creative, and continuously updated through learning iterations that span the entire life cycle of a product. Yet, we found that AI profoundly changes the practice of design,” the coauthors write. “Problem solving tasks, traditionally carried on by designers, are now automated into learning loops that operate without limitations of volume and speed. These loops think in a radically different way than a designer: they address complex problems through very simple tasks, iterated exponentially.”
In a recent blog post, user experience designer Miklos Philips echoed the findings of the Harvard Business School paper’s coauthors, noting that designers working with AI can create prototypes more quickly and cheaply due to the increased efficiency it offers. AI’s power will lie in the speed with which it can analyze vast amounts of data and suggest design adjustments, he says, so that a designer can cherry-pick and approve adjustments based on data and create the most effective designs to test expediently.
In any case, the ROI of AI-assisted design tools is potentially substantial. According to a 2020 PricewaterhouseCoopers survey, companies in manufacturing expect efficiency gains over the next five years attributable to digital transformations, including the adoption of AI and machine learning. Perhaps unsurprisingly, 76% of respondents to a Google Cloud report published this week said they’ve turned to “disruptive technologies” like AI, data analytics, and the cloud, particularly to help navigate challenges brought on by the pandemic.
Given the business value, AI-powered design is likely here to stay — and to grow. That’s generally good news not only for designers, but for the enterprises and consumers that stand to reap the benefits of automation across physical product creation.
“It’s not an earthquake that kills people, but the collapse of a poorly built building.”
Build Change, a foundation dedicated to preventing housing loss caused by natural disasters such as earthquakes and windstorms, is announcing the “Intelligence Supervision Assistant for Construction” (ISAC-SIMO) app. It’s an open-source, AI-powered quality assurance tool for construction. And it could save countless lives.
The tool uses machine learning to help people ensure they’re using the best materials and construction methods to make buildings disaster-ready.
Typically, when we discuss earthquake-proofing, we’re talking about making skyscrapers and bridges safe using advanced engineering and materials. But the challenge of rebuilding communities in emergent areas isn’t necessarily about coming up with new engineering solutions.
Often, one of the biggest challenges in these scenarios is finding enough expertise to ensure that laborers are building back safely and with the correct materials.
According to Elizabeth Hausler, Founder & CEO of Build Change:
ISAC-SIMO has amazing potential to radically improve construction quality and ensure that homes are built or strengthened to a resilient standard, especially in areas affected by earthquakes, windstorms, and climate change.
We’ve created a foundation from which the open source community can develop and contribute different models to enable this tool to reach its full potential. The Linux Foundation, building on the support of IBM over these past three years, will help us build this community.
The app got its start as a runner-up in the Call for Code challenge, a yearly open-source development event hosted by David Clark Cause, IBM, The Linux Foundation, and other partners.
Quick take: Anyone who’s ever done construction in emergent areas where regulatory bodies are either stretched thin or non-existent can attest to the fact that you absolutely never know what you’re going to find when it comes to building safety.
Putting an app in people’s hands that will let them know things such as whether the brick and mortar in their walls is safe, down to whether the proper rebar and brackets are in use ahead of a build, will definitely save time, money, and lives.
TLDR: With Camtasia 2021, no-nonsense video producers can knock out a professional-grade video that looks fantastic and won’t eat up hours of time to make, all at $100 off its regular price.
Video production. The words instantly make any content creator anxious. They aren’t usually overly worried about creating the video itself. They’re worried about the time and money that go into such a project.
While Hollywood blockbusters and the high cost of video production suites like the Adobe Creative Cloud have many fearful they’ll end up footing the bill for a $200 million James Cameron-style spectacular, there are still plenty of smart, even cost-effective ways of delivering a brilliant digital video quickly and painlessly.
Camtasia 2021 is ready to step in on that project, offering a host of options for making a professional-grade video for any presentation, social media post, advertisement, and more that won’t cost an arm and a leg.
Camtasia takes a simple, streamlined approach. Loaded with pre-built video templates, the software lets users record their computer screens, import PowerPoint presentations, and add all kinds of visual elements to a project, then turn it all into a video.
Without resorting to the fancy editing tricks of those ultra-expensive editing suites, Camtasia is a beginner-friendly platform with an interface focused on easy solutions that make videos both quick to produce and attractive to viewers.
Using Camtasia, creators can record from their computer screen or from a webcam, then add high quality visual effects that fit with the project, all with simple drag-and-drop functionality. With a click, you can build in eye-catching titles and annotations, zoom in, pan across, animate objects and even transition between scenes with the skill of an Oscar-winning editor.
Camtasia also includes a complete library of royalty-free music and sound effects for projects, so you’ll never run afoul of copyright infringement lawyers.
And as the 2021 edition, this version includes a whole collection of brand-new features, including 75 new and modern transition effects, motion blur and corner rounding abilities, and customizable media clips to go with your newly-shot video, logos, color schemes, and more.
Right now, this bundle includes a copy of Camtasia 2021 as well as a year of program maintenance, which not only offers priority support and exclusive training, but will also hook you up with a brand new copy of Camtasia 2022 when it’s eventually released.
Regularly $299, this extremely limited-time deal cuts that price by a third, getting you Camtasia 2021 and a year of app support for just $199. This offer is only available for a few more days, so get in on the deal now while you can.