AI armed with multiple senses could gain more flexible intelligence

by Katie McLean
February 24, 2021
in Artificial Intelligence

This post originally appeared on MIT Technology Review

In late 2012, AI scientists first figured out how to get neural networks to “see.” They proved that software designed to loosely mimic the human brain could dramatically improve existing computer-vision systems. The field has since learned how to get neural networks to imitate the way we reason, hear, speak, and write.

But while AI has grown remarkably human-like—even superhuman—at achieving a specific task, it still doesn’t capture the flexibility of the human brain. We can learn skills in one context and apply them to another. By contrast, though DeepMind’s game-playing algorithm AlphaGo can beat the world’s best Go masters, it can’t extend that strategy beyond the board. Deep-learning algorithms, in other words, are masters at picking up patterns, but they cannot understand and adapt to a changing world.

Researchers have many hypotheses about how this problem might be overcome, but one in particular has gained traction. Children learn about the world by sensing and talking about it. The combination seems key. As kids begin to associate words with sights, sounds, and other sensory information, they are able to describe more and more complicated phenomena and dynamics, tease apart what is causal from what reflects only correlation, and construct a sophisticated model of the world. That model then helps them navigate unfamiliar environments and put new knowledge and experiences in context.

AI systems, on the other hand, are built to do only one of these things at a time. Computer-vision and audio-recognition algorithms can sense things but cannot use language to describe them. A natural-language model can manipulate words, but the words are detached from any sensory reality. If senses and language were combined to give an AI a more human-like way to gather and process new information, could it finally develop something like an understanding of the world?

The hope is that these “multimodal” systems, with access to both the sensory and linguistic “modes” of human intelligence, should give rise to a more robust kind of AI that can adapt more easily to new situations or problems. Such algorithms could then help us tackle more complex problems, or be ported into robots that can communicate and collaborate with us in our daily lives.

New advances in language-processing algorithms like OpenAI’s GPT-3 have helped. Researchers now understand how to replicate language manipulation well enough to make combining it with sensing capabilities potentially more fruitful. To start with, they are using the very first sensing capability the field achieved: computer vision. The results are simple bimodal models, or visual-language AI.
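
To make the bimodal idea concrete, here is a minimal sketch of the dual-encoder design behind many visual-language models: a vision network and a text network project their inputs into a shared embedding space, where matched image-caption pairs should score highest. The encoders, dimensions, and data below are toy stand-ins for illustration, not any real system.

```python
# Minimal sketch of a dual-encoder visual-language model.
# The toy encoders below are placeholders, not real pretrained networks.
import torch
import torch.nn.functional as F

EMBED_DIM = 64

class ToyImageEncoder(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Linear(3 * 32 * 32, EMBED_DIM)  # flatten a 32x32 RGB image

    def forward(self, images):
        return self.net(images.flatten(start_dim=1))

class ToyTextEncoder(torch.nn.Module):
    def __init__(self, vocab_size=1000):
        super().__init__()
        self.embed = torch.nn.EmbeddingBag(vocab_size, EMBED_DIM)  # mean of token embeddings

    def forward(self, token_ids):
        return self.embed(token_ids)

image_encoder, text_encoder = ToyImageEncoder(), ToyTextEncoder()

images = torch.randn(4, 3, 32, 32)         # a batch of 4 random "images"
captions = torch.randint(0, 1000, (4, 8))  # 4 captions of 8 token ids each

# Project both modalities into the shared space and L2-normalize.
img_emb = F.normalize(image_encoder(images), dim=-1)
txt_emb = F.normalize(text_encoder(captions), dim=-1)

# Cosine-similarity matrix: entry (i, j) scores image i against caption j.
# Training pushes the diagonal (matched pairs) up and everything else down.
similarity = img_emb @ txt_emb.T
print(similarity.shape)  # torch.Size([4, 4])
```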

In the past year, there have been several exciting results in this area. In September, researchers at the Allen Institute for Artificial Intelligence (AI2) created a model that can generate an image from a text caption, demonstrating the algorithm’s ability to associate words with visual information. In November, researchers at the University of North Carolina, Chapel Hill, developed a method that incorporates images into existing language models, which boosted the models’ reading comprehension.

OpenAI then used these ideas to extend GPT-3. At the start of 2021, the lab released two visual-language models. One links the objects in an image to the words that describe them in a caption. The other generates images based on a combination of the concepts it has learned. You can prompt it, for example, to produce “a painting of a capybara sitting in a field at sunrise.” Though it may have never seen this before, it can mix and match what it knows of paintings, capybaras, fields, and sunrises to dream up dozens of examples.
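The first of those two models, CLIP, learns the image-word link by pulling matched image-caption embeddings together and pushing mismatched ones apart. Below is a minimal sketch of that symmetric contrastive objective; the random tensors stand in for real encoder outputs.

```python
# Sketch of the symmetric contrastive loss used to align matched
# image-caption pairs (the training signal behind CLIP-style models).
import torch
import torch.nn.functional as F

def contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """img_emb, txt_emb: (batch, dim) embeddings; row i of each is a matched pair."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.T / temperature  # (batch, batch) similarity scores
    targets = torch.arange(len(logits))         # each matched pair sits on the diagonal
    # Cross-entropy in both directions: each image must pick out its own
    # caption, and each caption must pick out its own image.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

# Usage with random stand-in embeddings for a batch of 8 pairs.
loss = contrastive_loss(torch.randn(8, 64), torch.randn(8, 64))
print(loss.item())
```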

More sophisticated multimodal systems will also make possible more advanced robotic assistants (think robot butlers, not just Alexa). The current generation of AI-powered robots primarily use visual data to navigate and interact with their surroundings. That’s good for completing simple tasks in constrained environments, like fulfilling orders in a warehouse. But labs like AI2 are working to add language and incorporate more sensory inputs, like audio and tactile data, so the machines can understand commands and perform more complex operations, like opening a door when someone is knocking.
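
As a rough architectural illustration of what “incorporating more sensory inputs” can look like, the sketch below late-fuses embeddings from vision, audio, touch, and a language command into a single state vector a robot policy could act on. The modality names, dimensions, and fusion design here are assumptions for illustration, not AI2’s actual method.

```python
# Hedged sketch: late fusion of several sensory streams into one embedding.
# Encoders and dimensions are toy stand-ins, not a real robotics stack.
import torch

dims = {"vision": 512, "audio": 128, "touch": 32, "language": 256}
FUSED_DIM = 256

# One small projection per modality, then a shared trunk over the concatenation.
projections = torch.nn.ModuleDict(
    {name: torch.nn.Linear(d, FUSED_DIM) for name, d in dims.items()}
)
trunk = torch.nn.Sequential(
    torch.nn.Linear(FUSED_DIM * len(dims), FUSED_DIM), torch.nn.ReLU()
)

# Fake per-modality features, as if produced by upstream encoders.
features = {name: torch.randn(1, d) for name, d in dims.items()}

fused = trunk(torch.cat([projections[n](f) for n, f in features.items()], dim=-1))
print(fused.shape)  # torch.Size([1, 256]) -- one state the policy can act on
```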

In the long run, multimodal breakthroughs could help overcome some of AI’s biggest limitations. Experts argue, for example, that its inability to understand the world is also why it can easily fail or be tricked. (An image can be altered in a way that’s imperceptible to humans but makes an AI identify it as something completely different.) Achieving more flexible intelligence wouldn’t just unlock new AI applications: it would make them safer, too. Algorithms that screen résumés wouldn’t treat irrelevant characteristics like gender and race as signs of ability. Self-driving cars wouldn’t lose their bearings in unfamiliar surroundings and crash in the dark or in snowy weather. Multimodal systems might become the first AIs we can really trust with our lives.
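
The parenthetical above describes adversarial examples. One standard recipe for producing them, not named in the article, is the fast gradient sign method (FGSM): nudge each pixel slightly in the direction that increases the model’s loss. A minimal sketch with a toy classifier:

```python
# Minimal FGSM sketch: craft an imperceptible perturbation that raises the
# classifier's loss. The model and epsilon are placeholders for illustration.
import torch
import torch.nn.functional as F

# A stand-in classifier; any differentiable image model works the same way.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))

image = torch.rand(1, 3, 32, 32, requires_grad=True)  # pixel values in [0, 1]
true_label = torch.tensor([3])

# Gradient of the loss with respect to the input pixels.
loss = F.cross_entropy(model(image), true_label)
loss.backward()

# Step each pixel slightly in the direction that increases the loss.
epsilon = 0.01  # small enough to be near-invisible to a human
adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

# The perturbation is bounded by epsilon per pixel.
print((adversarial - image.detach()).abs().max().item())
```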
