Meta proposes new scalable memory layers that improve knowledge and reduce hallucinations

As enterprises continue to adopt large language models (LLMs) across a wide range of applications, one of the key challenges they face is improving the models' factual knowledge and reducing hallucinations. In a new paper, researchers at Meta AI propose "scalable memory layers," which may be one of several possible solutions to this problem.

Scalable memory layers add more parameters to LLMs to increase their learning capacity without requiring additional compute. The architecture is useful in applications where spare memory can be dedicated to factual knowledge, but where inference speed and model agility are also required.


Dense layers versus memory layers

Traditional language models use "dense layers" to encode vast amounts of information in their parameters. In dense layers, all parameters are used at full capacity and most are activated at the same time during inference. Dense layers can learn more complex functions as they grow, but increasing their size requires additional compute and energy.

In contrast, for simple factual knowledge, much simpler layers with an associative memory architecture, such as lookup tables, would be more efficient and easier to interpret. This is what memory layers do. They use simple, sparse activations and key-value lookup mechanisms to encode and retrieve knowledge. Sparse layers take up more memory than dense layers, but use only a small fraction of their parameters at a time, which makes them far more compute-efficient.
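The sparse key-value lookup described above can be sketched in a few lines. This is a hypothetical simplification (names, sizes, and the top-k value are illustrative; the paper's actual layers use trainable product keys inside a transformer block), but it shows the core idea: every key is scored against the query, yet only a handful of value slots contribute to the output.

```python
import numpy as np

class MemoryLayer:
    """Toy associative-memory layer: sparse top-k key-value lookup."""

    def __init__(self, num_slots, key_dim, value_dim, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.keys = rng.standard_normal((num_slots, key_dim))
        self.values = rng.standard_normal((num_slots, value_dim))
        self.top_k = top_k

    def __call__(self, query):
        # Score every key against the query...
        scores = self.keys @ query
        # ...but activate only the top-k slots (sparse activation).
        top = np.argsort(scores)[-self.top_k:]
        # Softmax over the selected scores only.
        w = np.exp(scores[top] - scores[top].max())
        w /= w.sum()
        # Output is a weighted sum of just top_k value rows; the other
        # num_slots - top_k parameters are untouched this step.
        return w @ self.values[top]

layer = MemoryLayer(num_slots=1024, key_dim=16, value_dim=32)
out = layer(np.ones(16))  # shape (32,), computed from only 2 of 1024 slots
```

The memory cost grows with `num_slots`, but the per-query compute grows only with `top_k`, which is the trade-off the article describes.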

Memory layers have existed for several years, but they are rarely used in modern deep learning architectures, in part because they are not optimized for current hardware accelerators.

Current frontier LLMs typically use some form of "mixture-of-experts" (MoE) architecture, which relies on a mechanism somewhat similar to memory layers. MoE models consist of many smaller expert components that specialize in specific tasks. At inference time, a routing mechanism determines which expert is activated based on the input sequence. PEER, an architecture recently developed by Google DeepMind, extends MoE to millions of experts, providing more granular control over which parameters are activated during inference.
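The MoE routing mechanism mentioned above can be illustrated with a toy top-1 router. This is a deliberate simplification (real routers are trained jointly with the experts and usually select several of them), but it shows how only one expert's parameters are touched per input:

```python
import numpy as np

def route(x, gate_weights):
    """Pick the index of the expert with the highest gating score."""
    logits = gate_weights @ x
    return int(np.argmax(logits))

rng = np.random.default_rng(0)
num_experts, dim = 4, 8

# A linear gate plus a pool of expert weight matrices.
gate = rng.standard_normal((num_experts, dim))
experts = [rng.standard_normal((dim, dim)) for _ in range(num_experts)]

x = rng.standard_normal(dim)
chosen = route(x, gate)
y = experts[chosen] @ x  # only one expert's parameters are used
```

Memory layers push this idea further: instead of a few large experts, there are millions of tiny value slots, selected by key lookup rather than a learned gate over whole subnetworks.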

Upgrading memory layers

Memory layers consume little compute but are memory-intensive, which creates particular challenges for current hardware and software stacks. In their paper, the Meta researchers propose several modifications that address these challenges and make memory layers usable at scale.

First, the researchers configured the memory layers for parallelization, spreading them across several GPUs to store millions of key-value pairs without changing the other layers in the model. They also implemented a special CUDA kernel to handle operations that require high memory bandwidth, and developed a parameter-sharing mechanism that supports a single set of memory parameters across multiple memory layers within the model. This means the keys and values used for lookups are shared across layers.
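The parameter-sharing idea can be sketched as follows. This is an illustrative simplification (the class name, sizes, and top-k value are invented for the example, not taken from the paper): several memory layers at different depths look up into one shared key/value store, so the memory cost of the parameters is paid only once.

```python
import numpy as np

class SharedKVStore:
    """One key/value store reused by every memory layer in the model."""

    def __init__(self, num_slots, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.keys = rng.standard_normal((num_slots, dim))
        self.values = rng.standard_normal((num_slots, dim))

    def lookup(self, query, top_k=4):
        scores = self.keys @ query
        top = np.argsort(scores)[-top_k:]
        w = np.exp(scores[top] - scores[top].max())
        w /= w.sum()
        return w @ self.values[top]

store = SharedKVStore(num_slots=4096, dim=32)

# Two memory layers at different depths of a hypothetical model both
# query `store`: two layers, but only one set of memory parameters.
hidden_at_layer_4 = np.ones(32)
hidden_at_layer_12 = np.eye(32)[0]
out_4 = store.lookup(hidden_at_layer_4)
out_12 = store.lookup(hidden_at_layer_12)
```

Each layer still produces its own output because it queries with its own hidden state; only the stored keys and values are common.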

These modifications make it possible to use memory layers within LLMs without slowing the model down.

“Memory layers with their sparse activations nicely complement dense networks, providing increased capacity for knowledge acquisition while providing low computational overhead,” the researchers write. “They can scale efficiently and provide practitioners with a compelling new direction in the memory-computing trade-off.”

To test memory layers, the researchers modified Llama models, replacing one or more dense layers with a shared memory layer. They compared the memory-enhanced models with dense LLMs, as well as with MoE and PEER models, on several tasks, including fact-based question answering, scientific and common-sense knowledge, and coding.

Memory models versus dense layers

Their findings show that memory models improve significantly over dense baselines and compete with models that use two to four times more compute. They also match the performance of MoE models with the same compute budget and parameter count. The models' performance is especially notable on tasks that require factual knowledge. For example, on fact-based question answering, a 1.3-billion-parameter memory model approaches the performance of Llama-2-7B, which was trained on twice as many tokens and with ten times as much compute.

Moreover, the researchers found that the benefits of memory models remained consistent with model size as they scaled their experiments from 134 million to 8 billion parameters.

“Given these findings, we strongly advocate the integration of memory layers into all next-generation AI architectures,” the researchers write, while adding that there is still much room for improvement. “In particular, we hope to develop new learning methods that will further improve the effectiveness of these layers, enabling less forgetting, less hallucinations, and continuous learning.”
