Published in SyncedReview · 11 hours ago
Meta AI's Novel Setup Reveals the Structure and Evolution of Transformers
In recent years, large language models (LLMs) have demonstrated a strong capability to learn vast amounts of 'global' knowledge from their training data and have shown the ability to quickly adapt to new information based on given contexts or prompts. Despite their impressive 'in-context' learning capabilities, their internal mechanisms remain…
Large Language Models · 3 min read
Published in SyncedReview · 2 days ago
Microsoft's LLaVA-Med Trains a Large Language-and-Vision Assistant for Biomedicine Within 15 Hours
Conversational generative large multimodal models (LMMs) have achieved impressive performance on a wide variety of vision-language tasks. Despite their success in the general domain, these LMMs typically perform worse in the biomedical field, which requires domain-specific biomedical image-text pairs. In an effort to bridge this gap, a new paper…
Generative Model · 3 min read
Published in SyncedReview · 2 days ago
DeepMind, Mila & Montreal U's Bigger, Better, Faster RL Agent Achieves Super-human Performance on Atari 100K
Deep reinforcement learning (RL) is a trending machine learning approach that aims to solve complex decision-making tasks at human or super-human levels of performance. …
Reinforcement Learning · 3 min read
Published in SyncedReview · 6 days ago
Google & Waterloo U Scales Generative Retrieval to Handle 8.8M Passages
In recent years, there has been a surge of interest in generative retrieval approaches, a fresh paradigm that aims to transform traditional information retrieval methods. These approaches leverage the power of a single sequence-to-sequence Transformer model to encode and process an entire document corpus. …
Generative Model · 3 min read
Published in SyncedReview · Jun 1
Google & Stanford U's DoReMi Significantly Speeds Up Language Model Pretraining
Large language models (LLMs) pretrained on massive data are being used in countless real-world applications. …
Large Language Models · 3 min read
Published in SyncedReview · May 31
Tool Up! DeepMind, Princeton & Stanford's LATM Enables LLMs to Make Their Own Tools
The 19th-century British philosopher Thomas Carlyle ascribed human progress to a key historical development: "Man is a tool-using animal. Without tools he is nothing, with tools he is all." While today's large language models (LLMs) have demonstrated impressive generative and problem-solving capabilities, recent research suggests they could take a similar…
Language Model · 3 min read
Published in SyncedReview · May 30
Meta AI's READ Method for Fine-Tuning Large Transformers Cuts GPU Energy Costs by 84%
Fine-tuning large-scale pretrained transformers enables them to adapt to and perform better on downstream tasks. While such fine-tuning is crucial for countless real-world applications, fully fine-tuning all model parameters becomes increasingly challenging as models grow ever larger. This has led to the development of parameter-efficient transfer learning (PETL) techniques…
Artificial Intelligence · 4 min read
Published in SyncedReview · May 26
Meta AI's Massively Multilingual Speech Project Scales Speech Technology to 1000+ Languages
Speech technologies such as automatic speech recognition (ASR) and speech synthesis, or text-to-speech (TTS), are playing an increasingly important role in many real-world applications. Contemporary speech technology systems, however, support only about one hundred languages at best — a tiny fraction of the more than 7,000 languages spoken worldwide. A Meta…
Multilingual Model · 3 min read
Published in SyncedReview · May 25
Alibaba & HUST's ONE-PEACE: Toward a General Representation Model for Unlimited Modalities
The recent rapid rise of large language models (LLMs) has piqued research interest in the power and potential of representation models, which are designed to decode and understand data. While contemporary representation models have achieved outstanding performance on unimodal tasks, they typically remain underequipped for handling multimodal tasks. In the…
Large Language Models · 4 min read
Published in SyncedReview · May 24
Google's PaLM 2 Technical Report Details the New Model Family's Research Advances
In April 2022, Google unveiled its 540-billion-parameter Pathways Language Model (PaLM), which it developed using a novel Pathways (Barham et al., 2022) approach that enables efficient model training across multiple TPU v4 Pods (in PaLM's case, 6,144 TPU v4 chips). …
Large Language Models · 3 min read