Entertainment Industry Reporter

    Small Language Models Are the New Rage, Researchers Say

    By Admin | April 13, 2025


    The original version of this story appeared in Quanta Magazine.

    Large language models work well because they’re so large. The latest models from OpenAI, Meta, and DeepSeek use hundreds of billions of “parameters”—the adjustable knobs that determine connections among data and get tweaked during the training process. With more parameters, the models are better able to identify patterns and connections, which in turn makes them more powerful and accurate.

    But this power comes at a cost. Training a model with hundreds of billions of parameters takes huge computational resources. To train its Gemini 1.0 Ultra model, for example, Google reportedly spent $191 million. Large language models (LLMs) also require considerable computational power each time they answer a request, which makes them notorious energy hogs. A single query to ChatGPT consumes about 10 times as much energy as a single Google search, according to the Electric Power Research Institute.

    In response, some researchers are now thinking small. IBM, Google, Microsoft, and OpenAI have all recently released small language models (SLMs) that use a few billion parameters—a fraction of their LLM counterparts.

    Small models are not used as general-purpose tools like their larger cousins. But they can excel on specific, more narrowly defined tasks, such as summarizing conversations, answering patient questions as a health care chatbot, and gathering data in smart devices. “For a lot of tasks, an 8 billion–parameter model is actually pretty good,” said Zico Kolter, a computer scientist at Carnegie Mellon University. They can also run on a laptop or cell phone, instead of a huge data center. (There’s no consensus on the exact definition of “small,” but the new models all max out around 10 billion parameters.)
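To see why parameter count decides whether a model fits on a laptop, it helps to do the arithmetic. The sketch below estimates the memory needed just to hold a model's weights. The byte sizes (2 bytes per parameter for 16-bit floats, half a byte for 4-bit quantized weights) are standard storage formats, but the 400-billion-parameter "large" model is a hypothetical round number chosen for illustration, and the estimate ignores the extra memory a model uses while it runs.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts: 8 billion comes from the article's example;
# 400 billion is a hypothetical stand-in for "hundreds of billions".
MODELS = {"small (8B)": 8e9, "large (400B, hypothetical)": 400e9}

# Storage formats: 16-bit floats use 2 bytes, 4-bit quantization uses 0.5 bytes.
FORMATS = {"16-bit": 2.0, "4-bit": 0.5}

for model_name, num_params in MODELS.items():
    for format_name, bytes_per_param in FORMATS.items():
        gb = weight_memory_gb(num_params, bytes_per_param)
        print(f"{model_name:28s} {format_name:7s} {gb:8.1f} GB")
```

Under these assumptions, the 8-billion-parameter model's weights come to 16 GB at 16-bit precision and 4 GB at 4-bit, which is within reach of consumer hardware, while the hypothetical large model needs hundreds of gigabytes either way.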

    To optimize the training process for these small models, researchers use a few tricks. Large models often scrape raw training data from the internet, and this data can be disorganized, messy, and hard to process. But these large models can then generate a high-quality data set that can be used to train a small model. The approach, called knowledge distillation, gets the larger model to effectively pass on its training, like a teacher giving lessons to a student. “The reason [SLMs] get so good with such small models and such little data is that they use high-quality data instead of the messy stuff,” Kolter said.
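The teacher-and-student idea can be sketched in a few lines. This is a toy illustration, not how any of the companies named above train their models: the "teacher" here is a fixed formula standing in for a large model, the "student" is a two-parameter classifier, and all names and numbers are made up for the example. The teacher labels a set of inputs with clean probabilities, and the student is trained only on those labels.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def teacher(x: float) -> float:
    """Hypothetical stand-in for a large model: returns P(label = 1 | x)."""
    return sigmoid(4.0 * x - 2.0)


def distill(inputs: list[float], steps: int = 2000, lr: float = 0.5) -> tuple[float, float]:
    """Fit a two-parameter student to the teacher's outputs by gradient descent."""
    # The teacher generates the clean training set the student learns from.
    targets = [teacher(x) for x in inputs]
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, t in zip(inputs, targets):
            # Gradient of the cross-entropy loss with respect to the student's logit.
            err = sigmoid(w * x + b) - t
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


random.seed(0)
train_x = [random.uniform(0.0, 1.0) for _ in range(200)]
w, b = distill(train_x)


def student(x: float) -> float:
    return sigmoid(w * x + b)


for x in (0.1, 0.5, 0.9):
    print(f"x={x:.1f}  teacher={teacher(x):.3f}  student={student(x):.3f}")
```

The student never sees raw, messy data. It sees only what the teacher says about each input, which is the sense in which the teacher passes on its training.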

    Researchers have also explored ways to create small models by starting with large ones and trimming them down. One method, known as pruning, entails removing unnecessary or inefficient parts of a neural network—the sprawling web of connected data points that underlies a large model.

    Pruning was inspired by a real-life neural network, the human brain, which gains efficiency by snipping synaptic connections between neurons as a person ages. Today’s pruning approaches trace back to a 1989 paper in which the computer scientist Yann LeCun, now at Meta, argued that up to 90 percent of the parameters in a trained neural network could be removed without sacrificing efficiency. He called the method “optimal brain damage.” Pruning can help researchers fine-tune a small language model for a particular task or environment.
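A minimal sketch of pruning follows. Note that it uses magnitude pruning, a simpler rule of thumb than the one in the 1989 paper: it removes the weights closest to zero, whereas "optimal brain damage" ranks each parameter by an estimate of how much deleting it would hurt the network. The weight values and the function name are invented for illustration.

```python
def magnitude_prune(weights: list[float], fraction: float) -> list[float]:
    """Zero out the given fraction of weights with the smallest absolute value."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    n_remove = int(len(weights) * fraction)
    # Indices ordered from the smallest-magnitude weight to the largest.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:n_remove])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


# Made-up weights for a tiny layer: a few large values and many near zero.
weights = [0.8, -0.01, 0.03, -1.2, 0.002, 0.5, -0.04, 0.07, 0.009, -0.3]
pruned = magnitude_prune(weights, 0.5)

kept = sum(1 for w in pruned if w != 0.0)
print(f"kept {kept} of {len(weights)} weights: {pruned}")
```

In practice a pruned network is usually retrained briefly so the surviving weights can compensate for the ones that were removed.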

    For researchers interested in how language models do the things they do, smaller models offer an inexpensive way to test novel ideas. And because they have fewer parameters than large models, their reasoning might be more transparent. “If you want to make a new model, you need to try things,” said Leshem Choshen, a research scientist at the MIT-IBM Watson AI Lab. “Small models allow researchers to experiment with lower stakes.”

    The big, expensive models, with their ever-increasing parameters, will remain useful for applications like generalized chatbots, image generators, and drug discovery. But for many users, a small, targeted model will work just as well, while being easier for researchers to train and build. “These efficient models can save money, time, and compute,” Choshen said.


    Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.


