altReboot
An AI helps you summarize the latest in AI

by Karen Hao
November 18, 2020
in Artificial Intelligence

This post originally appeared on MIT Technology Review

The news: A new AI model for summarizing scientific literature can now help researchers wade through the latest papers and identify the ones they want to read. On November 16, the Allen Institute for Artificial Intelligence (AI2) rolled out the model on its flagship product, Semantic Scholar, an AI-powered scientific paper search engine. It provides a one-sentence tl;dr (too long; didn’t read) summary under every computer science paper (for now) when users search or visit an author’s page. The work was also accepted to the Empirical Methods in Natural Language Processing (EMNLP) conference this week.

A screenshot of the tl;dr feature in Semantic Scholar.
AI2

The context: In an era of information overload, using AI to summarize text has been a popular natural-language processing (NLP) problem. There are two general approaches to this task. One is called “extractive,” which selects a sentence or set of sentences verbatim from the text that captures its essence. The other is called “abstractive,” which involves generating new sentences. While extractive techniques used to be more popular due to the limitations of NLP systems, advances in natural language generation in recent years have made the abstractive approach a whole lot better.
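To make the extractive approach concrete, here is a minimal sketch in Python. It is not AI2's model: it assumes a simple word-frequency heuristic, a naive sentence splitter, and a tiny hypothetical stopword list, and it returns one sentence copied verbatim from the input. An abstractive system would generate a new sentence instead.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for illustration).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "and", "is", "it", "that", "for", "on", "we", "this"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Lowercased words in the sentence, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def extractive_summary(text):
    """Return the sentence whose content words are, on average, most frequent in the text."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in content_words(s))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    return max(sentences, key=score)


doc = (
    "Transformers power modern summarization. "
    "Summarization models compress long papers. "
    "The weather was nice."
)
print(extractive_summary(doc))
```

Because the output is always one of the input sentences, this kind of method cannot rephrase or merge ideas, which is the limitation abstractive models address.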

How they did it: AI2’s abstractive model uses what’s known as a transformer, a type of neural network architecture introduced in 2017 that has since powered all of the major leaps in NLP, including OpenAI’s GPT-3. The researchers first trained the transformer on a generic corpus of text to establish its baseline familiarity with the English language. This process is known as “pre-training” and is part of what makes transformers so powerful. They then fine-tuned the model (in other words, trained it further) on the specific task of summarization.

The fine-tuning data: The researchers first created a dataset called SciTldr, which contains roughly 5,400 pairs of scientific papers and corresponding single-sentence summaries. To find these high-quality summaries, they started with OpenReview, a public conference paper submission platform where researchers often post their own one-sentence synopsis of their paper. This provided a couple of thousand pairs. The researchers then hired annotators to summarize more papers by reading and further condensing the synopses that had already been written by peer reviewers.

To supplement these 5,400 pairs even further, the researchers compiled a second dataset of 20,000 pairs of scientific papers and their titles. The researchers intuited that because titles themselves are a form of summary, they would further help the model improve its results. This was confirmed through experimentation.
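One plausible way to combine the two datasets for fine-tuning is to tag each example with the task it belongs to and shuffle them together. The sketch below assumes that setup; the `<TLDR>` and `<TITLE>` tags, the function names, and the dictionary layout are hypothetical and are not taken from the article.

```python
import random


def tag_examples(pairs, task_tag):
    """Append a task tag to each paper text so a model can tell the two tasks apart."""
    return [{"source": f"{paper} {task_tag}", "target": target} for paper, target in pairs]


def build_training_mix(summary_pairs, title_pairs, seed=0):
    """Merge paper-summary and paper-title pairs into one shuffled training list."""
    mixed = tag_examples(summary_pairs, "<TLDR>") + tag_examples(title_pairs, "<TITLE>")
    random.Random(seed).shuffle(mixed)  # fixed seed keeps the order reproducible
    return mixed


# Placeholder data, invented purely to show the shapes involved.
summary_pairs = [("paper text A", "one-sentence summary of A")]
title_pairs = [("paper text B", "Title of B"), ("paper text C", "Title of C")]

for example in build_training_mix(summary_pairs, title_pairs):
    print(example["source"], "->", example["target"])
```

Keeping the tag in the input lets a single model learn from both kinds of target while still being asked for a summary, not a title, at inference time.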

The tl;dr feature is particularly useful for skimming papers on mobile.
AI2

Extreme summarization: While many other research efforts have tackled the task of summarization, this one stands out for the level of compression it can achieve. The scientific papers included in the SciTldr dataset average 5,000 words. Their one-sentence summaries average 21. This means each paper is compressed on average by a factor of 238 (5,000 ÷ 21 ≈ 238), to less than half a percent of its original length. The next best abstractive method is trained to compress scientific papers by an average factor of only 36.5. During testing, human reviewers also judged the model’s summaries to be more informative and accurate than those of previous methods.
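The compression figure follows directly from the two averages quoted above. A short check in Python (the function name is just for illustration):

```python
def compression_ratio(source_words, summary_words):
    """How many times shorter the summary is than the source."""
    return source_words / summary_words


ratio = compression_ratio(5000, 21)
print(round(ratio))        # 238
print(f"{21 / 5000:.2%}")  # 0.42% of the original length
```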

Next steps: AI2 is already working on several short-term improvements to its model, says Daniel Weld, a professor at the University of Washington and manager of the Semantic Scholar research group. For one, they plan to train the model to handle more than just computer science papers. For another, perhaps in part due to the training process, they’ve found that the tl;dr summaries sometimes overlap too much with the paper title, diminishing their overall utility. They plan to update the model’s training process to penalize such overlap so it learns to avoid repetition over time.
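The article does not say how the overlap penalty would be implemented. As a rough sketch of the idea, one could measure the fraction of summary words that also appear in the title and subtract a weighted version of it from some base quality score. Everything here (the token-overlap measure, the `weight` parameter, the function names) is an assumption for illustration, not AI2's method.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def title_overlap(summary, title):
    """Fraction of summary tokens that also appear in the title (0.0 to 1.0)."""
    summary_tokens = tokens(summary)
    if not summary_tokens:
        return 0.0
    title_vocab = set(tokens(title))
    return sum(t in title_vocab for t in summary_tokens) / len(summary_tokens)


def penalized_score(quality, summary, title, weight=0.5):
    """Subtract a weighted overlap term from a base quality score."""
    return quality - weight * title_overlap(summary, title)


title = "Extreme Summarization of Scientific Documents"
print(title_overlap("Extreme summarization of scientific documents", title))  # 1.0
print(title_overlap("We compress papers", title))                             # 0.0
```

A summary that merely restates the title scores an overlap of 1.0 and is penalized most, while one that adds new information is left largely untouched.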

In the long term, the team will also work on summarizing multiple documents at a time, which could be useful for researchers entering a new field or perhaps even for policymakers wanting to get up to speed quickly. “What we’re really excited to do is create personalized research briefings,” Weld says, “where we can summarize not just one paper, but a set of six recent advances in a particular sub-area.”
