DeepMind’s new AI model helps decipher, date, and locate ancient inscriptions

Machine learning techniques are providing new tools that could help archaeologists understand the past — particularly when it comes to deciphering ancient texts. The latest example is an AI model created by Alphabet-subsidiary DeepMind that helps not only restore text that is missing from ancient Greek inscriptions but offers suggestions for when the text was written (within a 30-year period) and its possible geographic origins.

“Inscriptions are really important because they are direct sources of evidence … written directly by ancient people themselves,” Thea Sommerschield, a historian and machine learning expert who helped created the model, told journalists in a press briefing.

Due to their age, these texts are often damaged, making restoration a rewarding challenge. And because they are often inscribed on inorganic material like stone or metal, it means methods like radiocarbon dating can’t be used to find out when they were written. “To solve these tasks, epigraphers look for textual and contextual parallels in similar inscriptions,” said Sommerschield, who was co-lead on the work alongside DeepMind staff research scientist Yannis Assael. “However, it’s really difficult for a human to harness all existing, relevant data and to discover underlying patterns.”

That’s where machine learning can help.

Ancient Greek inscriptions are often fragmented. The software Ithaca can suggest what letters are missing.
Image: DeepMind

The new software, named Ithaca, is trained on a dataset of some 78,608 ancient Greek inscriptions, each of which is labeled with metadata describing where and when it was written (to the best of historians’ knowledge). Like all machine learning systems, Ithaca looks for patterns in this information, encoding this information in complex mathematical models, and uses these inferences to suggest text, date, and origins.

In a paper published in Nature that describes Ithaca, the scientists who created the model say it is 62 percent accurate when restoring letters in damaged texts. It can attribute an inscription’s geographic origins to one of 84 regions of the ancient world with 71 percent accuracy and can date a text to within, on average, 30 years of its known year of writing.

These are promising statistics, but it’s important to remember that Ithaca is not capable of operating independently of human expertise. Its suggestions are ultimately based on data collected by traditional archaeological methods, and its creators are positioning it as simply another tool in a wider set of forensic methods, rather than a fully-automated AI historian. “Ithaca was designed as a complementary tool to aid historians,” said Sommerschield.

Ithaca is the first model to geographical and chronological attribution with textual restoration.
Image: DeepMind

Eleanor Dickey, a professor of classics from the University of Reading who specializes in ancient Greek and Latin sociolinguists, told The Verge that Ithaca was an “exciting development that may improve our knowledge of the ancient world.” But, she added that a 62 percent accuracy for restoring lost text was not reassuringly high — “when people rely on it they will need to keep in mind that it is wrong about one third of the time” — and that she was not sure how the software would fit into existing academic methodologies.

For example, DeepMind highlighted tests that showed the model helped improve the accuracy of historians restoring missing text in ancient inscriptions from 25 percent to 72 percent. But Dickey notes that those being tested were students, not professional epigraphers. She says that AI models may be broadly accessible, but that doesn’t mean they can or should replace the small cadre of specialized academics who decipher texts.

“It is not yet clear to what extent use of this tool by genuinely qualified editors would result in an improvement in the editions generally available — but it will be interesting to find out,” said Dickey. She added that she was looking for to trying the Ithaca model out for herself. The software, along with its open-source code, is available online for anyone to test.

Ithaca and its predecessor (named Pythia and released in 2019) have already been used to help recent archaeological debates — including helping date inscriptions discovered in the Acropolis of Athens. However, the true potential of the software has yet to be seen.

Sommerschield stresses that the real value of Ithaca may be in its flexibility. Although it was trained on ancient Greek inscriptions, it could be easily configured to work with other ancient scripts. “Ithaca’s architecture makes it really applicable to any ancient language, not just Latin, but Mayan, cuneiform; really any written medium — papyri, manuscripts,” she said. “There’s a lot of opportunities.”

Repost: Original Source and Author Link


AI Weekly: AI researchers release toolkit to promote AI that helps to achieve sustainability goals

Hear from CIOs, CTOs, and other C-level and senior execs on data and AI strategies at the Future of Work Summit this January 12, 2022. Learn more

While discussions about AI often center around the technology’s commercial potential, increasingly, researchers are investigating ways that AI can be harnessed to drive societal change. Among others, Facebook chief AI scientist Yann LeCun and Google Brain cofounder Andrew Ng have argued that mitigating climate change and promoting energy efficiency are preeminent challenges for AI researchers.

Along this vein, researchers at the Montreal AI Ethics Institute have proposed a framework designed to quantify the social impact of AI through techniques like compute-efficient machine learning. An IBM project delivers farm cultivation recommendations from digital farm “twins” that simulate the future soil conditions of real-world crops. Other researchers are using AI-generated images to help visualize climate change, and nonprofits like WattTime are working to reduce households’ carbon footprint by automating when electric vehicles, thermostats, and appliances are active based on where renewable energy is available.

Seeking to spur further explorations in the field, a group at the Stanford Sustainability and Artificial Intelligence Lab this week released (to coincide with NeurIPS 2021) a benchmark dataset called SustainBench for monitoring sustainable development goals (SDGs) including agriculture, health, and education using machine learning. As the coauthors told VentureBeat in an interview, the goal is threefold: (1) lower the barriers to entry for researchers to contribute to achieving SDGs; (2) provide metrics for evaluating SDG-tracking algorithms, and (3) encourage the development of methods where improved AI model performance facilitates progress towards SDGs.

“SustainBench was a natural outcome of the many research projects that [we’ve] worked on over the past half-decade. The driving force behind these research projects was always the lack of large, high-quality labeled datasets for measuring progress toward the United Nations Sustainable Development Goals (UN SDGs), which forced us to come up with creative machine learning techniques to overcome the label sparsity,” the coauthors said. “[H]aving accumulated enough experience working with datasets from diverse sustainability domains, we realized earlier this year that we were well-positioned to share our expertise on the data side of the machine learning equation … Indeed, we are not aware of any prior sustainability-focused datasets with similar size and scale of SustainBench.”


Progress toward SDGs has historically been measured through civil registrations, population-based surveys, and government-orchestrated censuses. However, data collection is expensive, leading many countries to go decades between taking measurements on SDG indicators. It’s estimated that only half of SDG indicators have regular data from more than half of the world’s countries, limiting the ability of the international community to track progress toward the SDGs.

“For example, early on during the COVID-19 pandemic, many developing countries implemented their own cash transfer programs, similar to the direct cash payments from the IRS in the United States. However … data records on household wealth and income in developing countries are often unreliable or unavailable,” the coauthors said.

Innovations in AI have shown promise in helping to plug the data gaps, however. Data from satellite imagery, social media posts, and smartphones can be used to train models to predict things like poverty, annual land cover, deforestation, agricultural cropping patterns, crop yields, and even the location and impact of natural disasters. For example, the governments of Bangladesh, Mozambique, Nigeria, Togo, and Uganda used machine learning-based poverty and cropland maps to direct economic aid to their most vulnerable populations during the pandemic.

But progress has been hindered by challenges, including a lack of expertise and dearth of data for low-income countries. With SustainBench, the Stanford researchers — along with contributors at Caltech, UC Berkeley, and Carnegie Mellon — hope to provide a starting ground for training machine learning models that can help measure SDG indicators and have a wide range of applications for real-world tasks.

SustainBench contains a suite of 15 benchmark tasks across seven SDGs taken from the United Nations, including good health and well-being, quality education, and clean water and sanitation. Beyond this, SustainBench offers tasks for machine learning challenges that cover 119 countries, each designed to promote the development of SDG measurement methods on real-world data.

The coauthors caution that AI-based approaches should supplement, rather than replace, ground-based data collection. They point out that ground truth data are necessary for training models in the first place, and that even the best sensor data can only capture some — but not all — of the outcomes of interest. But AI, they still believe, can be helpful for measuring sustainability indicators in regions where ground truth measurements are scarce or unavailable.

“[SDG] indicators have tremendous implications for policymakers, yet ‘key data are scarce, and often scarcest in places where they are most needed,’ as several of our team members wrote in a recent Science review article. By using abundant, cheap, and frequently updated sensor data as inputs, AI can help plug these data gaps. Such input data sources include publicly available satellite images, crowdsourced street-level images, Wikipedia entries, and mobile phone records, among others,” the coauthors said.

Future work

In the short term, the coauthors say that they’re focused on raising awareness of SustainBench within the machine learning community. Future versions of SustainBench are in the planning stages, potentially with additional datasets and AI benchmarks.

“Two technical challenges stand out to us. The first challenge is to develop machine learning models that can reason about multi-modal data. Most AI models today tend to work with single data modalities (e.g., only satellite images, or only text), but sensor data often comes in many forms … The second challenge is to design models that can take advantage of the large amount of unlabeled sensor data, compared to sparse ground truth labels,” the coauthors said. “On the non-technical side, we also see a challenge in getting the broader machine learning community to focus more efforts on sustainability applications … As we alluded to earlier, we hope SustainBench makes it easier for machine learning researchers to recognize the role and challenges of machine learning for sustainability applications.”

For AI coverage, send news tips to Kyle Wiggers — and be sure to subscribe to the AI Weekly newsletter and bookmark our AI channel, The Machine.

Thanks for reading,

Kyle Wiggers

AI Staff Writer


VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative technology and transact.

Our site delivers essential information on data technologies and strategies to guide you as you lead your organizations. We invite you to become a member of our community, to access:

  • up-to-date information on the subjects of interest to you
  • our newsletters
  • gated thought-leader content and discounted access to our prized events, such as Transform 2021: Learn More
  • networking features, and more

Become a member

Repost: Original Source and Author Link


New deep reinforcement learning technique helps AI to evolve

Hundreds of millions of years of evolution have produced a variety of life-forms, each intelligent in its own fashion. Each species has evolved to develop innate skills, learning capacities, and a physical form that ensures survival in its environment.

But despite being inspired by nature and evolution, the field of artificial intelligence has largely focused on creating the elements of intelligence separately and fusing them together after the development process. While this approach has yielded great results, it has also limited the flexibility of AI agents in some of the basic skills found in even the simplest life-forms.

In a new paper published in the scientific journal Nature, AI researchers at Stanford University present a new technique that can help take steps toward overcoming some of these limits. Called “deep evolutionary reinforcement learning,” or DERL, the new technique uses a complex virtual environment and reinforcement learning to create virtual agents that can evolve both in their physical structure and learning capacities. The findings can have important implications for the future of AI and robotics research.

Evolution is hard to simulate

In nature, the body and brain evolve together. Across many generations, every animal species has gone through countless cycles of mutation to grow limbs, organs, and a nervous system to support the functions it needs in its environment. Mosquitos are equipped with thermal vision to spot body heat. Bats have wings to fly and an echolocation apparatus to navigate dark spaces. Sea turtles have flippers to swim with and a magnetic field detector system to travel very long distances. Humans have an upright posture that frees their arms and lets them see the far horizon, hands and nimble fingers that can manipulate objects, and a brain that makes them the best social creatures and problem solvers on the planet.

Interestingly, all these species descended from the first life-form that appeared on Earth several billion years ago. Based on the selection pressures caused by the environment, the descendants of those first living beings evolved in many directions.

Studying the evolution of life and intelligence is interesting, but replicating it is extremely difficult. An AI system that would want to recreate intelligent life in the same way that evolution did would have to search a very large space of possible morphologies, which is extremely expensive computationally. It would need a lot of parallel and sequential trial-and-error cycles.

AI researchers use several shortcuts and predesigned features to overcome some of these challenges. For example, they fix the architecture or physical design of an AI or robotic system and focus on optimizing the learnable parameters. Another shortcut is the use of Lamarckian rather than Darwinian evolution, in which AI agents pass on their learned parameters to their descendants. Yet another approach is to train different AI subsystems separately (vision, locomotion, language, etc.) and then tack them on together in a final AI or robotic system. While these approaches speed up the process and reduce the costs of training and evolving AI agents, they also limit the flexibility and variety of results that can be achieved.

Deep evolutionary reinforcement learning

In their new work, the researchers at Stanford aim to bring AI research a step closer to the real evolutionary process while keeping the costs as low as possible. “Our goal is to elucidate some principles governing relations between environmental complexity, evolved morphology, and the learnability of intelligent control,” they wrote in their paper.

Within the DERL framework, each agent uses deep reinforcement learning to acquire the skills required to maximize its goals during its lifetime. DERL uses Darwinian evolution to search the morphological space for optimal solutions, which means that when a new generation of AI agents are spawned, they only inherit the physical and architectural traits of their parents (along with slight mutations). None of the learned parameters are passed on across generations.

“DERL opens the door to performing large-scale in silico experiments to yield scientific insights into how learning and evolution cooperatively create sophisticated relationships between environmental complexity, morphological intelligence, and the learnability of control tasks,” the researchers wrote.

Simulating evolution

For their framework, the researchers used MuJoCo, a virtual environment that provides highly accurate rigid-body physics simulation. Their design space is called Universal Animal (Unimal), in which the goal is to create morphologies that learn locomotion and object-manipulation tasks in a variety of terrains.

Each agent in the environment is composed of a genotype that defines its limbs and joints. The direct descendant of each agent inherits the parent’s genotype and goes through mutations that can create new limbs, remove existing limbs, or make small modifications to characteristics, such as the degrees of freedom or the size of limbs.

Each agent is trained with reinforcement learning to maximize rewards in various environments. The most basic task is locomotion, in which the agent is rewarded for the distance it travels during an episode. Agents whose physical structures are better suited for traversing terrain learn faster to use their limbs for moving around.

To test the system’s results, the researchers generated agents in three types of terrains: flat (FT), variable (VT), and variable terrains with modifiable objects (MVT). The flat terrain puts the least selection pressure on the agents’ morphology. The variable terrains, on the other hand, force the agents to develop a more versatile physical structure that can climb slopes and move around obstacles. The MVT variant has the added challenge of requiring the agents to manipulate objects to achieve their goals.

The benefits of DERL

An image of AI-generated shapes in different configurations and a set of data tables regarding their morphological results.

Above: Deep evolutionary reinforcement learning generates a variety of successful morphologies across different environments.

Image Credit: TechTalks

One of the interesting findings of DERL is the diversity of the results. Other approaches to evolutionary AI tend to converge on one solution because new agents directly inherit the physique and learnings of their parents. But in DERL, only morphological data is passed on to descendants; the system ends up creating a diverse set of successful morphologies, including bipeds, tripeds, and quadrupeds with and without arms.

At the same time, the system shows traits of the Baldwin effect, which suggests that agents that learn faster are more likely to reproduce and pass on their genes to the next generation. DERL shows that evolution “selects for faster learners without any direct selection pressure for doing so,” according to the Stanford paper.

“Intriguingly, the existence of this morphological Baldwin effect could be exploited in future studies to create embodied agents with lower sample complexity and higher generalization capacity,” the researchers wrote.

Finally, the DERL framework also validates the hypothesis that more complex environments will give rise to more intelligent agents. The researchers tested the evolved agents across eight different tasks, including patrolling, escaping, manipulating objects, and exploration. Their findings show that in general, agents that have evolved in variable terrains learn faster and perform better than AI agents that have only experienced flat terrain.

Their findings seem to be in line with another hypothesis by DeepMind researchers that a complex environment, a suitable reward structure, and reinforcement learning can eventually lead to the emergence of all kinds of intelligent behaviors.

AI and robotics research

The DERL environment only has a fraction of the complexities of the real world. “Although DERL enables us to take a significant step forward in scaling the complexity of evolutionary environments, an important line of future work will involve designing more open-ended, physically realistic, and multiagent evolutionary environments,” the researchers wrote.

In the future, the researchers plan to expand the range of evaluation tasks to better assess how the agents can enhance their ability to learn human-relevant behaviors.

The work could have important implications for the future of AI and robotics and push researchers to use exploration methods that are much more similar to natural evolution.

“We hope our work encourages further large-scale explorations of learning and evolution in other contexts to yield new scientific insights into the emergence of rapidly learnable intelligent behaviors, as well as new engineering advances in our ability to instantiate them in machines,” the researchers wrote.

Ben Dickson is a software engineer and the founder of TechTalks. He writes about technology, business, and politics.

This story originally appeared on Copyright 2021


VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative technology and transact.

Our site delivers essential information on data technologies and strategies to guide you as you lead your organizations. We invite you to become a member of our community, to access:

  • up-to-date information on the subjects of interest to you
  • our newsletters
  • gated thought-leader content and discounted access to our prized events, such as Transform 2021: Learn More
  • networking features, and more

Become a member

Repost: Original Source and Author Link


Streamlit, which helps data scientists build apps, hits version 1.0

Streamlit, a popular app framework for data science and machine learning, has reached its version 1.0 milestone. The open source project is curated by a company of the same name that offers a commercial service built on the platform. So far, the project has had more than 4.5 million GitHub downloads and is used by more than 10,000 organizations.

The framework fills a vital void between data scientists who want to develop a new analytics widget or app and the data engineering typically required to deploy these at scale. Data scientists can build web apps to access and explore machine-learning models, advanced algorithms, and complex data types without having to master back-end data engineering tasks.

Streamlit cofounder and CEO Adrien Treuille told VentureBeat that “the combination of the elegant simplicity of the Streamlit library and the fact that it is all in Python means developers can do things in hours that normally took weeks.”

Examples of this increased productivity boost include reducing data app development time from three and a half weeks to six hours or reducing 5,000 lines of JavaScript to 254 lines of Python in Streamlit, Treuille said.

The crowded landscape of data science apps

The San Francisco-based company joins a crowded landscape filled with dozens of DataOps tools that hope to streamline various aspects of AI, analytics, and machine-learning development. Treuille attributes the company’s quick growth to being able to fill the gap between data scientists’ tools for rapid exploration (Jupyter notebooks, for one example) and the complex technologies companies use to build robust internal tools (React and GraphQL), front-end interface (React and JavaScript), and data engineering tools (dbt and Spark). “This gap has been a huge pain point for companies and often means that rich data insights and models are siloed in the data team,” Treuille said.

The tools are used by everyone from data science students to large companies. The company is seeing the fastest growth in tech-focused enterprises with a large base of Python users and a need to rapidly experiment with new apps and analytics.

“Every company has the same problems with lots of data, lots of questions, and too little time to answer all of them,” Treuille said.

Improvements in v1.0 include faster app speed and responsiveness, improved customization, and support for statefulness. The company plans to enhance its widget library, improve the developer experience, and make it easier for data scientists to share code, components, apps, and answers next year in 2022.


VentureBeat’s mission is to be a digital town square for technical decision-makers to gain knowledge about transformative technology and transact.

Our site delivers essential information on data technologies and strategies to guide you as you lead your organizations. We invite you to become a member of our community, to access:

  • up-to-date information on the subjects of interest to you
  • our newsletters
  • gated thought-leader content and discounted access to our prized events, such as Transform 2021: Learn More
  • networking features, and more

Become a member

Repost: Original Source and Author Link


The data economy: How AI helps us understand and utilize our data 

This article is part of a Technology and Innovation Insights series paid for by Samsung. 

Similar to the relationship between an engine and oil, data and artificial intelligence (AI) are symbiotic. Data fuels AI, and AI helps us to understand the data available to us. Data and AI are two of the biggest topics in technology in recent years, as both work together to shape our lives on a daily basis. The sheer amount of data available right now is staggering and it doubles every two years. However, we currently only use about 2 percent of the data available to us. Much like when oil was first discovered, it is taking time for humans to figure out what to do with the new data available to us and how to make it useful.

Whether pulled from the cloud, your phone, TV, or an IoT device, the vast range of connected streams provide data on just about everything that goes on in our daily lives. But what do we do with it?

Earlier this month, HARMAN’s Chairman Young Sohn sat down with international journalist Ali Aslan in Berlin, Germany at the “New Data Economy and its Consequences” video symposium held by Global Bridges. Young and Ali discussed the importance of data, why AI without data is useless, and what needs to be considered when we look at the ethical use of data and AI — including bias, privacy, and security.


Unlike humans, technology and data are not inherently bias. As the old adage goes — data never lies. Bias in data and AI comes into play when humans train an AI algorithm or interpret data. Much of what we are consuming is influenced based on where the data is coming from and what data is going into the system. Understanding and eliminating our bias are essential to ensuring a neutral algorithm and system.

Controlling data access and permissions are a key first step to remove bias. Having a diverse and inclusive team when developing algorithms and systems is essential. Not everyone has lived the same experiences and backgrounders. Diversity in both can help curb biases by providing different ways of interpreting data inputs and outputs.


Permission and access are paramount when we look at the privacy aspect of data. Privacy is extremely important in our increasingly digital society. As such, consumers should have a choice at the beginning of a relationship with an organization and be asked whether they want to opt-in, rather than having to opt-out. GDPR has been a good first step in helping to protect consumers in regards to the capture and use of their data. While GDPR has many well-designed and important initiatives, the legislation could be more efficient.


Whereas data privacy is more of a concern to consumers and individuals, data security has become a global concern for consumers, organizations, and nation-states.

It seems like every day we are reading about another cyber-attack or threat that we should be aware of. Chief among these concerns are the influx of ransomware attacks. Companies and individuals are paying increasingly large amounts of money to bad actors in an attempt to mitigate risk, attention, and embarrassment. These attacks are being carried out by individuals, collectives, and even nation-states in an attempt to cripple the systems of enemies, gather classified information, or garner capital gains.

So how do we trust our data and information is safe and what can we do to be better protected? While there may be bad actors using technology and data for their own nefarious devices, there are also many positive uses for technology. The amount of education and investments being made in the cybersecurity space have helped many organizations to train employees and invest in technologies that are designed to prevent cybercrime at the source — human error. And while we may not be able to stop all cybercrime, we are making progress.

Data and AI for good

While data — both from a collection and storage viewpoint — and AI have gotten negative press around biases, privacy, and security, both can also be used to do an immense amount of good. For example, both data and AI have been crucial in the biomedical and agtech industries. Whether it’s COVID-19 detection and vaccine creation or the creation of biomes and removal of toxins in soil, data and AI have incredible potential. However, one cannot move forward without the other. A solid and stable infrastructure and network are also needed to ensure that we can make use of the other 98 percent of the global data available.

VB Lab Insights content is created in collaboration with a company that is either paying for the post or has a business relationship with VentureBeat, and they’re always clearly marked. Content produced by our editorial team is never influenced by advertisers or sponsors in any way. For more information, contact

Repost: Original Source and Author Link

Tech News

Here’s how deep learning helps computers detect objects

Deep neural networks have gained fame for their capability to process visual information. And in the past few years, they have become a key component of many computer vision applications.

Among the key problems neural networks can solve is detecting and localizing objects in images. Object detection is used in many different domains, including autonomous driving, video surveillance, and healthcare.

In this post, I will briefly review the deep learning architectures that help computers detect objects.

Convolutional neural networks

One of the key components of most deep learning–based computer vision applications is the convolutional neural network (CNN). Invented in the 1980s by deep learning pioneer Yann LeCun, CNNs are a type of neural network that is efficient at capturing patterns in multidimensional spaces. This makes CNNs especially good for images, though they are used to process other types of data too. (To focus on visual data, we’ll consider our convolutional neural networks to be two-dimensional in this article.)

Every convolutional neural network is composed of one or several convolutional layers, a software component that extracts meaningful values from the input image. And every convolution layer is composed of several filters, square matrices that slide across the image and register the weighted sum of pixel values at different locations. Each filter has different values and extracts different features from the input image. The output of a convolution layer is a set of “feature maps.”

When stacked on top of each other, convolutional layers can detect a hierarchy of visual patterns. For instance, the lower layers will produce feature maps for vertical and horizontal edges, corners, and other simple patterns. The next layers can detect more complex patterns such as grids and circles. As you move deeper into the network, the layers will detect complicated objects such as cars, houses, trees, and people.