DeepSeek-V3 New Paper is coming! A newly released 14-page technical paper from the team behind DeepSeek-V3, with DeepSeek CEO Wenfeng Liang as a co-author, sheds light on… (2d ago)
DeepSeek Unveils DeepSeek-Prover-V2: Advancing Neural Theorem Proving with Recursive Proof Search… DeepSeek AI has announced the release of DeepSeek-Prover-V2, a groundbreaking open-source large language model specifically designed for… (Apr 30)
Can GRPO be 10x Efficient? Kwai AI’s SRPO Suggests Yes with SRPO. The remarkable success of OpenAI’s o1 series and DeepSeek-R1 has unequivocally demonstrated the power of large-scale reinforcement learning… (Apr 24)
No Moat, Huh? Forget the model size and parameter counts. The real battleground in the burgeoning AI landscape might just be… content communities. As… (Apr 15)
Published in SyncedReview. DeepSeek Signals Next-Gen R2 Model, Unveils Novel Approach to Scaling Inference with SPCT. DeepSeek AI, a prominent player in the large language model arena, has recently published a research paper detailing a new technique aimed… (Apr 13)
Published in SyncedReview. Automating Artificial Life Discovery: The Power of Foundation Models. The recent Nobel Prize for groundbreaking advancements in protein discovery underscores the transformative potential of foundation models… (Dec 31, 2024)
Published in SyncedReview. Llama 3 Meets MoE: Pioneering Low-Cost High-Performance AI (Dec 28, 2024)
Published in SyncedReview. DeepMind’s JetFormer: Unified Multimodal Models Without Modelling Constraints. Recent advancements in training large multimodal models have been driven by efforts to eliminate modeling constraints and unify… (Dec 26, 2024)
Published in SyncedReview. NVIDIA’s nGPT: Revolutionizing Transformers with Hypersphere Representation. The Transformer architecture, introduced by Vaswani et al. in 2017, serves as the backbone of contemporary language models. Over the years… (Dec 23, 2024)
Published in SyncedReview. From Token to Conceptual: Meta Introduces Large Concept Models in Multilingual AI. Large Language Models (LLMs) have become indispensable tools for diverse natural language processing (NLP) tasks. Traditional LLMs operate… (Dec 18, 2024)