MIT researchers make language models scalable self-learners

2023/06/21/09:37
in AI

The scientists used a natural language-based logical inference dataset to create smaller language models that outperformed much larger counterparts

Socrates once said: “It is not the size of a thing, but the quality that truly matters. For it is in the nature of substance, not its volume, that true value is found.”

Does size always matter for large language models (LLMs)? In a technological landscape bedazzled by LLMs taking center stage, a team of MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) researchers think smaller models shouldn’t be overlooked, especially for natural language understanding products widely deployed in the industry.

To that end, the researchers cooked up an approach to long-standing problems of inefficiency and privacy associated with big, text-based AI models — a logic-aware model that outperforms 500-times-bigger counterparts on some language understanding tasks without human-generated annotations, while preserving privacy and robustness with high performance.

LLMs, which have shown some promising skills in generating language, art, and code, are computationally expensive, and their data requirements can risk privacy leaks when using application programming interfaces for data upload. Smaller models have been historically less capable, particularly in multitasking and weakly supervised tasks, compared to their larger counterparts.

So what’s helping these smaller models act so mighty, then? Something called “textual entailment,” a way to help these models understand a variety of language tasks, where if one sentence (the premise) is true, then the other sentence (the hypothesis) is likely to be true as well. For example, if the premise is, “all cats have tails” then the hypothesis “a tabby cat has a tail” would be entailed by the premise. This concept is used to train an “entailment model” that proved to be less biased than other language models, from the team’s previous research. They then created “prompts” that the models can use to figure out if certain information is entailed by a given sentence or phrase according to different tasks. This method improved the model’s ability to adapt to different tasks without any additional training, known as zero-shot adaptation.

In the realm of “natural language understanding,” there are various applications that hinge on determining the relationship between two pieces of text. For example, in sentiment classification, a statement like “I think the movie is good” can be inferred or entailed from a movie review that says, “I like the story and the acting is great,” indicating a positive sentiment. Another is news classification, where the topic of a news article can be inferred from its content. For example, a statement like “the news article is about sports” can be entailed if the main content of the article reports on an NBA game. The key insight was that many existing natural language understanding tasks could be recast as an entailment (i.e., logical inference in natural language) task. 
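The recasting described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' code: the hypothesis templates, the cue-word lists, and the `toy_entailment_score` function are all hypothetical stand-ins, and in practice the scorer would be a trained entailment model rather than a word-overlap count.

```python
from typing import Callable, Dict

# Hypothetical hypothesis templates: one candidate statement per label.
SENTIMENT_HYPOTHESES: Dict[str, str] = {
    "positive": "I think the movie is good",
    "negative": "I think the movie is bad",
}

# Toy cue words standing in for a trained entailment model (an assumption).
CUES: Dict[str, set] = {
    "good": {"like", "great", "love", "enjoyed"},
    "bad": {"dislike", "boring", "terrible", "hate"},
}


def toy_entailment_score(premise: str, hypothesis: str) -> float:
    """Crude score for how strongly the premise supports the hypothesis."""
    words = set(premise.lower().replace(",", " ").replace(".", " ").split())
    target = hypothesis.lower().split()[-1]  # last word: "good" or "bad"
    return float(len(words & CUES.get(target, set())))


def classify(
    premise: str,
    hypotheses: Dict[str, str],
    scorer: Callable[[str, str], float],
) -> str:
    """Pick the label whose hypothesis is best supported by the premise."""
    return max(hypotheses, key=lambda label: scorer(premise, hypotheses[label]))


review = "I like the story and the acting is great"
label = classify(review, SENTIMENT_HYPOTHESES, toy_entailment_score)
print(label)
```

Because the classifier only needs a premise, a set of hypotheses, and a scorer, the same function could serve news classification by swapping in hypotheses such as "the news article is about sports".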

“Our research is about improving the ability of computer programs to understand and process natural language — the way humans speak and write. Our self-trained, 350-million-parameter entailment models, without human-generated labels, outperform supervised language models with 137 to 175 billion parameters,” says MIT CSAIL postdoc Hongyin Luo, lead author on a new paper about the study. “This has potential to reshape the landscape of AI and machine learning, providing a more scalable, trustworthy, and cost-effective solution to language modeling,” says Luo. “By proving that smaller models can perform at the same level as larger ones for language understanding, this work paves the way for more sustainable and privacy-preserving AI technologies.” 

The team discovered that they could improve the model’s performance even more by using a technique called “self-training,” where the model uses its own predictions to teach itself, effectively learning without human supervision and additional annotated training data. The self-training method significantly improved performance on a range of downstream tasks, including sentiment analysis, question-answering, and news classification. In zero-shot capabilities, it outperformed Google’s LaMDA and FLAN, GPT models, and other supervised algorithms.
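The self-training loop can be illustrated with a deliberately small example. The sketch below is an assumption-laden toy, not the method from the paper: it replaces the entailment model with a one-dimensional nearest-centroid classifier, and the names `predict`, `self_train`, and the confidence formula are invented for the illustration. What it does show is the core idea of a model labeling unlabeled data with its own predictions and then refitting on the labels it trusts.

```python
from statistics import mean
from typing import Dict, List, Tuple


def predict(centroids: Dict[str, float], x: float) -> Tuple[str, float]:
    """Return (label, confidence) for a 1-D point using the nearest centroid."""
    dists = {label: abs(x - c) for label, c in centroids.items()}
    label = min(dists, key=dists.get)
    total = sum(dists.values())
    # With two labels: 1.0 when x sits on a centroid, 0.5 when equidistant.
    conf = 1.0 if total == 0 else 1.0 - dists[label] / total
    return label, conf


def self_train(
    centroids: Dict[str, float],
    unlabeled: List[float],
    rounds: int = 3,
    threshold: float = 0.7,
) -> Dict[str, float]:
    """Refit centroids on the model's own confident predictions."""
    centroids = dict(centroids)  # do not mutate the caller's model
    for _ in range(rounds):
        buckets: Dict[str, List[float]] = {label: [] for label in centroids}
        for x in unlabeled:
            label, conf = predict(centroids, x)
            if conf >= threshold:  # keep only confident pseudo-labels
                buckets[label].append(x)
        for label, xs in buckets.items():
            if xs:
                centroids[label] = mean(xs)
    return centroids


initial = {"neg": -1.0, "pos": 1.0}          # rough starting model
data = [-2.1, -1.9, -2.0, 2.0, 2.2, 1.8, 0.1]  # no human labels
trained = self_train(initial, data)
print(trained)
```

The ambiguous point `0.1` never clears the confidence threshold, so it is left out of every refit while the two clear clusters pull the centroids toward them.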

However, one challenge with self-training is that the model can sometimes generate incorrect or noisy labels that harm performance. To overcome this, they developed a new algorithm called ‘SimPLE’ (Simple Pseudo-Label Editing), a process to review and modify the pseudo-labels made in initial rounds of learning. By correcting any mislabeled instances, it improved the overall quality of the self-generated labels. This not only made the models more effective at understanding language, but more robust when faced with adversarial data. 
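One generic way to review pseudo-labels is to collect several predictions per example and keep only the labels the predictions agree on. The sketch below shows that idea and nothing more: it is not the SimPLE algorithm itself, whose details are not given here, and the function name `edit_pseudo_labels`, the agreement threshold, and the sample votes are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def edit_pseudo_labels(
    votes: Dict[str, List[str]],
    min_agreement: float = 0.75,
) -> Dict[str, str]:
    """Keep the majority label per example, dropping low-agreement examples."""
    edited: Dict[str, str] = {}
    for example_id, labels in votes.items():
        if not labels:
            continue  # nothing to vote on
        label, count = Counter(labels).most_common(1)[0]
        if count / len(labels) >= min_agreement:
            edited[example_id] = label  # noisy minority votes are overruled
    return edited


# Hypothetical predictions from four passes over three unlabeled examples.
votes = {
    "review-1": ["entailed", "entailed", "entailed", "entailed"],
    "review-2": ["entailed", "not_entailed", "entailed", "entailed"],
    "review-3": ["entailed", "not_entailed", "not_entailed", "entailed"],
}
clean = edit_pseudo_labels(votes)
print(clean)
```

Here the single dissenting vote on `review-2` is overruled, while `review-3` is split evenly and is dropped rather than passed on as a noisy training label.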

As with most research, there are some limitations. The self-training on multi-class classification tasks didn’t perform as well as on binary natural language understanding tasks, indicating the challenge of applying entailment models to multi-choice tasks.

“This research presents an efficient and effective way to train large language models (LLMs) by formulating natural language understanding tasks as contextual entailment problems and employing a pseudo-labeling self-training mechanism to incorporate large quantities of unlabelled text data in the training process,” adds CSAIL Senior Research Scientist James Glass, who is also an author on the paper. “While the field of LLMs is undergoing rapid and dramatic changes, this research shows that it is possible to produce relatively compact language models that perform very well on benchmark understanding tasks compared to their peers of roughly the same size, or even much larger language models.”

“Entailment task is a popular proxy to evaluate ‘understanding’ of a given context by an AI model,” says Leonid Karlinsky, research staff member at the MIT-IBM Watson AI Lab. “It is used in many areas analyzing models with unimodal, like LLMs, and multi-modal, like VLMs [visual language models], inputs, simplifying the task of question-answering about a given input context to a binary classification problem: does this context entail a certain (e.g., text) conclusion or not? This paper makes two contributions in this space. First, it proposes a way to improve the zero-shot (without additional tuning) NLU performance and robustness to adversarial attacks via tuning with synthesized (specialized) entailment tasks generated for the primal NLU task. Second, it offers a self-supervised SimPLE method including pseudo-labeling and confidence-based filtering to further improve large LLMs’ NLU performance.”

Luo and Glass wrote the paper with Yoon Kim, a CSAIL member and assistant professor in MIT’s Department of Electrical Engineering and Computer Science, and Jiaxin Ge of Peking University. Their work will be presented at the meeting of the Association for Computational Linguistics in Toronto, Ontario this July. This research was supported by a grant from the Hong Kong Innovation AI program.

Source: MIT
