Google & IDSIA’s Block-Recurrent Transformer Dramatically Outperforms Transformers Over Very Long Sequences

The increasing popularity of transformer architectures in natural language processing (NLP) and other AI research areas is largely attributable to their superior expressive capability when handling long input sequences. A major drawback limiting transformer deployment is that the computational complexity of…
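To make the complexity issue concrete: in vanilla self-attention every token attends to every other token, so the score matrix grows quadratically with sequence length, whereas a block-recurrent layer attends only within fixed-size blocks plus a small carried-over state, keeping the per-token cost roughly constant. The sketch below illustrates that contrast in plain numpy; the block size, state size, and the simple state update are illustrative assumptions for exposition, not the authors' exact recurrent cell.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def full_self_attention(x):
    """Vanilla self-attention over the whole sequence: the score matrix is
    (n, n), so time and memory grow quadratically with sequence length n."""
    scores = x @ x.T / np.sqrt(x.shape[-1])      # (n, n)
    return softmax(scores) @ x

def block_recurrent_pass(x, block_size=64, state_size=16):
    """Illustrative block-wise pass: each block attends to itself plus a small
    recurrent state, so per-block cost is O(block_size * (block_size + state_size))
    and total cost is linear in n. A simplification, not the paper's exact cell."""
    n, d = x.shape
    state = np.zeros((state_size, d))            # state carried across blocks
    outputs = []
    for start in range(0, n, block_size):
        block = x[start:start + block_size]      # (<= block_size, d)
        context = np.concatenate([state, block], axis=0)
        scores = block @ context.T / np.sqrt(d)  # (block, block + state)
        outputs.append(softmax(scores) @ context)
        # crude state update: summarize the current block into state_size vectors
        state = softmax(state @ block.T / np.sqrt(d)) @ block
    return np.concatenate(outputs, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(512, 32))
    print(full_self_attention(x).shape)          # (512, 32)
    print(block_recurrent_pass(x).shape)         # (512, 32)
```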


AI Technology & Industry Review — syncedreview.com | Newsletter: http://bit.ly/2IYL6Y2 | Share My Research http://bit.ly/2TrUPMI | Twitter: @Synced_Global
