diff --git a/index.md b/index.md
index 9e099df537c625b9fed6c928d5d083b0813fa796..c0ccfda294e07906ab3ad532899e152810f94744 100644
--- a/index.md
+++ b/index.md
@@ -131,10 +131,16 @@ date: December 4th, 2024
 - Mistral is a close second - also no training data
 - Google is catching up FAST on closed models (and leaving open behind)
 - AllenAI is catching up and it's open
+
+---
+
+# The LLM ecosystem (cont)
+
 - Microsoft has tiny, bad ones (but I wouldn't bet against them)
 - Apple is going their own way + using ChatGPT
 - Twitter/X has Grok-2 for paying customers, Grok-1 is enormous and "old" (from march)
 - Anthropic is receiving billions from Amazon but Claude is completely closed
+- Amazon released its "Nova" models yesterday. 100% closed, interesting tiering (similar to blablador)
 
 ---
 
@@ -164,7 +170,7 @@ date: December 4th, 2024
 
 ---
 
-# Open Source?
+## Open Source?
 
 - Very few models are really open source
 - Most are "open source" in the sense that you can download the weights
@@ -172,7 +178,7 @@ date: December 4th, 2024
 
 ---
 
-# Open Source
+## Open Source
 
 - German academia: [OpenGPT-X](https://opengpt-x.de/en/)
   - Trained in Jülich and Dresden
@@ -198,6 +204,16 @@ date: December 4th, 2024
 
 ---
 
+# Nous Research
+
+- Training live, on the internet
+- Previous models were Llama finetuned on synthetic data
+
+![](images/nous-distro.png){width=500px}
+
+
+---
+
 # Non-transformer architectures
 
 - First one was Mamba in February
diff --git a/public/images/nous-distro.png b/public/images/nous-distro.png
new file mode 100644
index 0000000000000000000000000000000000000000..296fa89ad6df2c8e3b7168ac6f8c256aee235e93
Binary files /dev/null and b/public/images/nous-distro.png differ
diff --git a/public/index.html b/public/index.html
index e7d0b8c9a471a0381daf151e88fc59dccee18990..e50b76e63641e150946a7803608c911b65488ae1 100644
--- a/public/index.html
+++ b/public/index.html
@@ -392,6 +392,11 @@ data</li>
 <li class="fragment">Google is catching up FAST on closed models (and
 leaving open behind)</li>
 <li class="fragment">AllenAI is catching up and it’s open</li>
+</ul>
+</section>
+<section id="the-llm-ecosystem-cont" class="slide level1">
+<h1>The LLM ecosystem (cont)</h1>
+<ul>
 <li class="fragment">Microsoft has tiny, bad ones (but I wouldn’t bet
 against them)</li>
 <li class="fragment">Apple is going their own way + using ChatGPT</li>
@@ -399,6 +404,8 @@ against them)</li>
 is enormous and “old” (from march)</li>
 <li class="fragment">Anthropic is receiving billions from Amazon but
 Claude is completely closed</li>
+<li class="fragment">Amazon released its “Nova” models yesterday. 100%
+closed, interesting tiering (similar to blablador)</li>
 </ul>
 </section>
 <section id="the-llm-ecosystem-1" class="slide level1">
@@ -433,8 +440,9 @@ href="https://lmarena.ai">https://lmarena.ai</a></li>
 <p><img data-src="images/llm-leaderboard-2024-11.png" /></p>
 </section>
-<section id="open-source" class="slide level1">
-<h1>Open Source?</h1>
+<section class="slide level1">
+
+<h2 id="open-source">Open Source?</h2>
 <ul>
 <li class="fragment">Very few models are really open source</li>
 <li class="fragment">Most are “open source” in the sense that you can
@@ -442,8 +450,9 @@ download the weights</li>
 <li class="fragment">Either the code isn’t available, or the data</li>
 </ul>
 </section>
-<section id="open-source-1" class="slide level1">
-<h1>Open Source</h1>
+<section class="slide level1">
+
+<h2 id="open-source-1">Open Source</h2>
 <ul>
 <li class="fragment">German academia: <a
 href="https://opengpt-x.de/en/">OpenGPT-X</a>
@@ -486,6 +495,15 @@ across continents and 96% when training only in the USA</li>
 </ul></li>
 </ul>
 </section>
+<section id="nous-research" class="slide level1">
+<h1>Nous Research</h1>
+<ul>
+<li class="fragment">Training live, on the internet</li>
+<li class="fragment">Previous models were Llama finetuned on synthetic
+data</li>
+</ul>
+<p><img data-src="images/nous-distro.png" width="500" /></p>
+</section>
 <section id="non-transformer-architectures" class="slide level1">
 <h1>Non-transformer architectures</h1>
 <ul>