MoE (Mixture of Experts)

🔥 YouTube Video: Research Paper Deep Dive - The Sparsely-Gated Mixture-of-Experts (MoE)

The Problem:

Neural networks can achieve impressive results on a wide range of tasks, such as image classification, machine translation, and protein folding prediction, by combining inductive biases (such as convolutions or sequence attention), increasingly large datasets, and more specialized hardware. Behind these results are very large models with a massive number of parameters.

Large model sizes are generally associated with strong generalization and robustness, so training large models while limiting resource requirements is becoming increasingly important.

There is a hidden problem underneath these eye-popping results: the heavy use of computational resources. Such models require massive hardware, which brings logistics, cost, and power requirements, and on top of that limits the feasibility of moving them outside the lab.

The solution:

One promising approach is to use conditional computation:

Rather than activating the whole network for every single input, different parts of the model are activated for different inputs.
The most important thing to note here is that an MoE is used as a general-purpose neural network component.
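
To make the idea of conditional computation concrete, here is a minimal sketch of top-k routing in plain Python. It is a simplified illustration, not the implementation from the paper: it leaves out the noise term and the load-balancing loss, and the names `top_k_gate`, `moe_forward`, `make_expert`, and the toy scaling experts are all hypothetical.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of floats
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_gate(logits, k):
    """Return {expert_index: weight} for the k highest-scoring experts."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax is taken over the kept logits only, so the weights sum to 1
    weights = softmax([logits[i] for i in top])
    return dict(zip(top, weights))

def moe_forward(x, experts, gate_weights, k=2):
    """Route x through only k experts and mix their outputs."""
    # One gating logit per expert: dot product of x with that expert's gate vector
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in gate_weights]
    gates = top_k_gate(logits, k)
    out = [0.0] * len(x)
    for idx, g in gates.items():
        y = experts[idx](x)  # only the selected experts are evaluated
        out = [o + g * yi for o, yi in zip(out, y)]
    return out, sorted(gates)

def make_expert(scale):
    # Hypothetical toy expert: scales its input by a fixed factor
    return lambda x: [scale * xi for xi in x]

random.seed(0)
experts = [make_expert(s) for s in (1.0, 2.0, 3.0, 4.0)]
gate_weights = [[random.uniform(-1, 1) for _ in range(3)] for _ in experts]

output, used = moe_forward([0.5, -1.0, 2.0], experts, gate_weights, k=2)
print("experts used:", used)
print("output:", output)
```

With four experts and `k=2`, only two expert functions are called per input; the other two are skipped entirely, which is where the compute saving comes from.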

What everybody wants:

Scale up the model.
Add model capacity (scaling up) without adding computational resources.
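
A rough back-of-the-envelope calculation shows how capacity and per-input computation can be decoupled. The sketch below assumes each expert is a two-matrix feed-forward block and that each input is routed to `k` experts; the layer sizes and the helper names `ffn_params` and `moe_layer_stats` are illustrative assumptions, not figures from any specific model.

```python
def ffn_params(d_model, d_hidden):
    # Two weight matrices (d_model x d_hidden and d_hidden x d_model); biases ignored
    return 2 * d_model * d_hidden

def moe_layer_stats(d_model, d_hidden, num_experts, k):
    """Return (total parameters, parameters used per input) for one MoE layer."""
    per_expert = ffn_params(d_model, d_hidden)
    gate = d_model * num_experts           # linear gating layer
    total = num_experts * per_expert + gate
    active = k * per_expert + gate         # only k experts run for a given input
    return total, active

# Hypothetical sizes chosen only for illustration
total, active = moe_layer_stats(d_model=512, d_hidden=2048, num_experts=8, k=2)
print(f"total parameters:  {total:,}")
print(f"active per input:  {active:,}")
print(f"capacity / compute ratio: {total / active:.2f}")
```

Under these assumptions, adding more experts increases `total` while `active` grows only by the size of the extra gating weights, as long as `k` stays fixed.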
