diff --git a/index.md b/index.md
index ab1fca4863407ca3318958b7eebfbeb181b3c249..42d81d33ed66940f5356174e6fcacd7c5978bf57 100644
--- a/index.md
+++ b/index.md
@@ -1,13 +1,23 @@
 ---
 author: Alexandre Strube
 title: BLABLADOR ![](images/blablador.png){ width=550px }
-subtitle: JSC's experimental Large Language Model server
+subtitle: JSC's Inference Infrastructure
 date: December 4th, 2024
 ---
 
-# Take the slides with you
+## Website
 
-![https://go.fzj.de/2024-12-jsc-colloquium](images/talk-jsc-colloquium.png){ width=500px }
+![https://helmholtz-blablador.fz-juelich.de](images/blablador-qrcode.png){width=450px}
+
+- Play around! 🐶
+
+---
+
+## OUTLINE
+
+- Past
+- Present
+- Future
 
 ---
 
@@ -22,13 +32,17 @@ date: December 4th, 2024
 
 ---
 
+![](images/IMG_6426.mov)
+
+---
+
 # Blablador
 
 ![](images/blablador-screenshot.png)
 
 ---
 
-# Why?
+# Past: Why?
 
 - AI is becoming basic infrastructure
 - Which historically is Open Source
@@ -45,6 +59,10 @@ date: December 4th, 2024
 
 ---
 
+![](images/IMG_6549.jpg)
+
+---
+
 # Why, part 2
 
 - Projects like OpenGPT-X, TrustLLM need a place to run
@@ -54,7 +72,7 @@ date: December 4th, 2024
 
 ---
 
-## Some facts
+## Privacy is our selling point
 
 - No data collection at all. I don't keep ***ANY*** data whatsoever!
 - You can use it AND keep your data private
@@ -75,12 +93,7 @@ date: December 4th, 2024
 
 ---
 
-## usage
-
-![](images/usage_blablador_web.svg)
-- Web UI only
-- API usage wasn't recorded until we moved to a new host, devs still migrating
-- Some healthy usage by tools e.g. B2DROP assistant: Around 400 requests/day (count as a single ip from B2DROP's server)
+![](images/IMG_6579.jpg)
 
 ---
 
@@ -92,14 +105,35 @@ date: December 4th, 2024
 
 ---
 
-# The LLM open ecosystem
+![](images/IMG_6562.jpg)
 
-- If it isn't on huggingface, it doesn't exist
+---
+
+# PRESENT
+
+---
+
+## Usage
+
+![](images/usage_blablador_web.svg)
+- Web UI only
+- API usage wasn't recorded until we moved to a new host, devs still migrating
+- Some healthy usage by tools, e.g. the B2DROP assistant: around 400 requests/day (counts as a single IP from B2DROP's server)
+
+---
+
+
+# The LLM ecosystem
+
+- For open models: if it isn't on Hugging Face, it doesn't exist
 - The "open" ecosystem is dominated by a few big players: Meta, Mistral.AI, Google
+- Meta has the best open frontier-level models, but no training data
+- Mistral is a close second - also no training data
+- Google is catching up FAST on closed models (and leaving open behind)
+- AllenAI is catching up and it's open
 - Microsoft has tiny, bad ones (but I wouldn't bet against them)
 - Apple is going their own way + using ChatGPT
 - Twitter/X has Grok-2 for paying customers, Grok-1 is enormous and "old" (from March)
-- Google is catching up FAST on closed models
 - Anthropic is receiving billions from Amazon but Claude is completely closed
 
 ---
 
@@ -176,10 +210,16 @@ date: December 4th, 2024
 
 # EU AI Act
 
+- "GPAI models present systemic risks when the cumulative amount of compute used for its training is greater than 10^25 floating point operations (FLOPs)"
+- Moore's law has something to say about this
 - TrustLLM and OpenGPT-X will have to comply
   - To be used commercially
 - Research-only models are exempt
-- Bureaucratic shot in the foot: the EU AI Act will make it harder for EU models to compete with US ones
+- Bureaucratic shot in the foot: the EU AI Act will make it harder for EU models to compete internationally
+
+---
+
+![](images/IMG_6553.jpg)
 
 ---
 
@@ -193,6 +233,7 @@ date: December 4th, 2024
 
 - Small cluster inherited from other projects
 - Started small, with three nodes
+- Runs Slurm, has a parallel FS/storage, uses EasyBuild
 
 ---
 
@@ -215,15 +256,15 @@ date: December 4th, 2024
 
 ---
 
-![Jureca-DC](images/jureca-dc.png)
+## Big models need big hardware
 
----
+- Llama 3.1 405b runs on WestAI nodes
+  - Launched with UNICORE
+- QwQ runs there experimentally
 
-## Website
-
-![https://helmholtz-blablador.fz-juelich.de](images/blablador-qrcode.png){width=450px}
+---
 
-- Play around! 🐶
 
 ---
 
+![Jureca-DC](images/jureca-dc.png)
 
 ---
 
@@ -318,30 +359,73 @@ date: December 4th, 2024
 
 ---
 
+# FUTURE
+
+---
+
+![Pickle is hungry](images/IMG_6573.jpg)
+
+---
+
 ## Vision for the (near) future
 
-- Blablador as an umbrella for inference
-- Use cases:
-  - LLMs for science
-  - Nasa's Prithvi 3 (currently being trained here)
-  - ESA's upcomping model
-  - Health: Radiology with Aachen Uniklinik
-  - ...
-  - With privacy!
+---
+
+# Blablador, the brand, as an umbrella for *ALL* inference at JSC
+
+---
+
+## JSC Inference umbrella
+
+- Blablador as LLM is the first step
+- Grow to include other types of models for Science and Industry
+
+---
+
+## Use cases so far
+
+- LLMs for science (e.g. OpenGPT-X, TrustLLM, CosmoSage)
+- Ancillary models stemming from [Prithvi-EO-2.0](https://www.earthdata.nasa.gov/dashboard/labs/earthinsights/explore) (300M and 600M), an EO foundation model jointly developed by IBM, NASA, and JSC: release tomorrow, Dec. 5th.
+  - Weather (big) and geospatial downstream tasks
+- NASA's [Helio](https://arxiv.org/abs/2410.10841) (downstream tasks in Heliophysics)
+- JSC/ESA/DLR's upcoming model [FAST-EO](https://eo4society.esa.int/projects/fast-eo/)
+- Health: Radiology with Aachen Uniklinik
+- JSC/CERN/ECMWF's [AtmoRep](https://www.atmorep.org)
+- Open models:
+  - Pangu-Weather
+  - GraphCast
+- With privacy!
+
+---
+
+![What do we have to do?](images/IMG_6551.jpg)
 
 ---
 
 ## Todo
 
-- Multi-modal models (text+image, text+audio, etc)
+- Multi-modality: video, audio, text, images
 - Auto-RAG with privacy:
   - Easy to do badly. Hard to do securely.
+- Everything from the previous slide
+
+---
+
+## Potato
+
+![](images/IMG_6561.jpg){ width=350px }
+
+---
+
+# Take the slides with you
+
+![https://go.fzj.de/2024-12-jsc-colloquium](images/talk-jsc-colloquium.png){ width=500px }
 
 ---
 
 ## Questions?
 
-![](images/blablador-questions.jpg)
+![No dogs have been harmed for this presentation](images/blablador-questions.jpg){ width=500px }
 
 ---
 
@@ -349,6 +433,14 @@ date: December 4th, 2024
 
 ---
 
+
+## LLMOps resource
+
+A No-BS Database of How Companies Actually Deploy LLMs in Production: 300+ Technical Case Studies, Including Self-Hosted LLMs: [https://www.zenml.io/llmops-database](https://www.zenml.io/llmops-database)
+
+---
+
+
 > "I think the complexity of Python package management holds down AI application development more than is widely appreciated. AI faces multiple bottlenecks – we need more GPUs, better algorithms, cleaner data in large quantities. But when I look at the day-to-day work of application builders, there's one additional bottleneck that I think is underappreciated: The time spent wrestling with version management is an inefficiency I hope we can reduce.
" Andrew Ng, 28.02.2024 diff --git a/public/images/IMG_6426.mov b/public/images/IMG_6426.mov new file mode 100644 index 0000000000000000000000000000000000000000..a8904b58174d227534181ad361445e3052187f78 Binary files /dev/null and b/public/images/IMG_6426.mov differ diff --git a/public/images/IMG_6549.jpg b/public/images/IMG_6549.jpg new file mode 100644 index 0000000000000000000000000000000000000000..24d3d6a4ce682f4608f07768aa3dbc6c41fa3135 Binary files /dev/null and b/public/images/IMG_6549.jpg differ diff --git a/public/images/IMG_6551.jpg b/public/images/IMG_6551.jpg new file mode 100644 index 0000000000000000000000000000000000000000..9cc4c182270c82c293e5f8c7ab99584112ea0d1e Binary files /dev/null and b/public/images/IMG_6551.jpg differ diff --git a/public/images/IMG_6553.jpg b/public/images/IMG_6553.jpg new file mode 100644 index 0000000000000000000000000000000000000000..7c3d4b91c9aa1a31df8f29551b220c9d720edef1 Binary files /dev/null and b/public/images/IMG_6553.jpg differ diff --git a/public/images/IMG_6561.jpg b/public/images/IMG_6561.jpg new file mode 100644 index 0000000000000000000000000000000000000000..5e55a3cde4776cb783a97dd0c1ef81f280ed5fdf Binary files /dev/null and b/public/images/IMG_6561.jpg differ diff --git a/public/images/IMG_6562.jpg b/public/images/IMG_6562.jpg new file mode 100644 index 0000000000000000000000000000000000000000..660f0734caac0ec8cf7c38b29c91d21444022835 Binary files /dev/null and b/public/images/IMG_6562.jpg differ diff --git a/public/images/IMG_6573.jpg b/public/images/IMG_6573.jpg new file mode 100644 index 0000000000000000000000000000000000000000..b406485ba1ec428f4c9e7d5f0a17fa8636909061 Binary files /dev/null and b/public/images/IMG_6573.jpg differ diff --git a/public/images/IMG_6579.jpg b/public/images/IMG_6579.jpg new file mode 100644 index 0000000000000000000000000000000000000000..41dd9eac7a0943778828d1763a0d0a8c6d0636eb Binary files /dev/null and b/public/images/IMG_6579.jpg differ diff --git a/public/index.html b/public/index.html index f8fb30aa8d870a475c58ec6e3a101e7e8d4a43e8..a046cd1bc965424a7bb09ab89537e0c2bac700c4 100644 --- a/public/index.html +++ b/public/index.html @@ -226,19 +226,32 @@ <section id="title-slide"> <h1 class="title">BLABLADOR <img data-src="images/blablador.png" width="550" /></h1> - <p class="subtitle">JSCās experimental Large Language Model server</p> + <p class="subtitle">JSCās Inference Infrastructure</p> <p class="author">Alexandre Strube</p> <p class="date">December 4th, 2024</p> </section> -<section id="take-the-slides-with-you" class="slide level1"> -<h1>Take the slides with you</h1> +<section class="slide level1"> + +<h2 id="website">Website</h2> <figure> -<img data-src="images/talk-jsc-colloquium.png" width="500" -alt="https://go.fzj.de/2024-12-jsc-colloquium" /> +<img data-src="images/blablador-qrcode.png" width="450" +alt="https://helmholtz-blablador.fz-juelich.de" /> <figcaption -aria-hidden="true">https://go.fzj.de/2024-12-jsc-colloquium</figcaption> +aria-hidden="true">https://helmholtz-blablador.fz-juelich.de</figcaption> </figure> +<ul> +<li class="fragment">Play around! 
š¶</li> +</ul> +</section> +<section class="slide level1"> + +<h2 id="outline">OUTLINE</h2> +<ul> +<li class="fragment">Past</li> +<li class="fragment">Present</li> +<li class="fragment">Future</li> +</ul> </section> <section id="blablador" class="slide level1"> <h1>Blablador</h1> @@ -255,12 +268,17 @@ rank, some good, some awful)</li> and training code.</li> </ul> </section> +<section class="slide level1"> + +<p><video data-src="images/IMG_6426.mov" controls=""><a +href="images/IMG_6426.mov">Video</a></video></p> +</section> <section id="blablador-1" class="slide level1"> <h1>Blablador</h1> <p><img data-src="images/blablador-screenshot.png" /></p> </section> -<section id="why" class="slide level1"> -<h1>Why?</h1> +<section id="past-why" class="slide level1"> +<h1>Past: Why?</h1> <ul> <li class="fragment">AI is becoming basic infrastructure</li> <li class="fragment">Which historically is Open Source</li> @@ -280,6 +298,10 @@ target šÆšØ</li> </ul></li> </ul> </section> +<section class="slide level1"> + +<p><img data-src="images/IMG_6549.jpg" /></p> +</section> <section id="why-part-2" class="slide level1"> <h1>Why, part 2</h1> <ul> @@ -294,7 +316,7 @@ run</li> </section> <section class="slide level1"> -<h2 id="some-facts">Some facts</h2> +<h2 id="privacy-is-our-selling-point">Privacy is our selling point</h2> <ul> <li class="fragment">No data collection at all. I donāt keep <strong><em>ANY</em></strong> data whatsoever! @@ -324,15 +346,7 @@ it, contact me!</em></strong></li> </section> <section class="slide level1"> -<h2 id="usage">usage</h2> -<p><img data-src="images/usage_blablador_web.svg" /></p> -<ul> -<li class="fragment">Web UI only</li> -<li class="fragment">API usage wasnāt recorded until we moved to a new -host, devs still migrating</li> -<li class="fragment">Some healthy usage by tools e.g. B2DROP assistant: -Around 400 requests/day (count as a single ip from B2DROPās server)</li> -</ul> +<p><img data-src="images/IMG_6579.jpg" /></p> </section> <section class="slide level1"> @@ -345,23 +359,49 @@ Blabladorās API (VSCodeās Continue.dev, LangChain, etc)</li> than unique visits/day)</li> </ul> </section> -<section id="the-llm-open-ecosystem" class="slide level1"> -<h1>The LLM open ecosystem</h1> +<section class="slide level1"> + +<p><img data-src="images/IMG_6562.jpg" /></p> +</section> +<section id="present" class="slide level1"> +<h1>PRESENT</h1> +</section> +<section class="slide level1"> + +<h2 id="usage">Usage</h2> +<p><img data-src="images/usage_blablador_web.svg" /></p> +<ul> +<li class="fragment">Web UI only</li> +<li class="fragment">API usage wasnāt recorded until we moved to a new +host, devs still migrating</li> +<li class="fragment">Some healthy usage by tools e.g. 
B2DROP assistant: +Around 400 requests/day (count as a single ip from B2DROPās server)</li> +</ul> +</section> +<section id="the-llm-ecosystem" class="slide level1"> +<h1>The LLM ecosystem</h1> <ul> -<li class="fragment">If it isnāt on huggingface, it doesnāt exist</li> +<li class="fragment">For open models: If it isnāt on huggingface, it +doesnāt exist</li> <li class="fragment">The āopenā ecosystem is dominated by a few big players: Meta, Mistral.AI, Google</li> +<li class="fragment">Meta has the best open frontier-level models; but +no training data</li> +<li class="fragment">Mistral is a close second - also no training +data</li> +<li class="fragment">Google is catching up FAST on closed models (and +leaving open behind)</li> +<li class="fragment">AllenAI is catching up and itās open</li> <li class="fragment">Microsoft has tiny, bad ones (but I wouldnāt bet against them)</li> <li class="fragment">Apple is going their own way + using ChatGPT</li> <li class="fragment">Twitter/X has Grok-2 for paying customers, Grok-1 is enormous and āoldā (from march)</li> -<li class="fragment">Google is catching up FAST on closed models</li> <li class="fragment">Anthropic is receiving billions from Amazon but Claude is completely closed</li> </ul> </section> -<section id="the-llm-ecosystem" class="slide level1"> +<section id="the-llm-ecosystem-1" class="slide level1"> <h1>The LLM ecosystem</h1> <ul> <li class="fragment">Evaluation is HARD @@ -461,15 +501,23 @@ MIT</li> <section id="eu-ai-act" class="slide level1"> <h1>EU AI Act</h1> <ul> +<li class="fragment">āGPAI models present systemic risks when the +cumulative amount of compute used for its training is greater than 10^25 +floating point operations (FLOPs)ā</li> +<li class="fragment">Mooreās law has something to say about this</li> <li class="fragment">TrustLLM and OpenGPT-X will have to comply <ul> <li class="fragment">To be used commercially</li> </ul></li> <li class="fragment">Research-only models are exempt</li> <li class="fragment">Bureaucratic shot in the foot: the EU AI Act will -make it harder for EU models to compete with US ones</li> +make it harder for EU models to compete internationally</li> </ul> </section> +<section class="slide level1"> + +<p><img data-src="images/IMG_6553.jpg" /></p> +</section> <section id="juelich-supercomputing-centre" class="slide level1"> <h1>Juelich Supercomputing Centre</h1> <figure> @@ -482,6 +530,8 @@ make it harder for EU models to compete with US ones</li> <ul> <li class="fragment">Small cluster inherited from other projects</li> <li class="fragment">Started small, with three nodes</li> +<li class="fragment">Runs Slurm, has a parallel FS/Storage, uses +EasyBuild</li> </ul> </section> <section class="slide level1"> @@ -516,23 +566,21 @@ website</li> </section> <section class="slide level1"> -<figure> -<img data-src="images/jureca-dc.png" alt="Jureca-DC" /> -<figcaption aria-hidden="true">Jureca-DC</figcaption> -</figure> +<h2 id="big-models-need-big-hardware">Big models need big hardware</h2> +<ul> +<li class="fragment">Llama 3.1 405b runs on WestAI nodes +<ul> +<li class="fragment">Launched with UNICORE</li> +</ul></li> +<li class="fragment">QwQ runs there experimentally</li> +</ul> </section> <section class="slide level1"> -<h2 id="website">Website</h2> <figure> -<img data-src="images/blablador-qrcode.png" width="450" -alt="https://helmholtz-blablador.fz-juelich.de" /> -<figcaption -aria-hidden="true">https://helmholtz-blablador.fz-juelich.de</figcaption> +<img data-src="images/jureca-dc.png" 
alt="Jureca-DC" /> +<figcaption aria-hidden="true">Jureca-DC</figcaption> </figure> -<ul> -<li class="fragment">Play around! š¶</li> -</ul> </section> <section class="slide level1"> @@ -671,39 +719,106 @@ on their IEK7Cloud</li> <p><a href="https://github.com/haesleinhuepf/bia-bob/blob/main/README.md">https://github.com/haesleinhuepf/bia-bob</a></p> </section> +<section id="future" class="slide level1"> +<h1>FUTURE</h1> +</section> +<section class="slide level1"> + +<figure> +<img data-src="images/IMG_6573.jpg" alt="Pickle is hungry" /> +<figcaption aria-hidden="true">Pickle is hungry</figcaption> +</figure> +</section> <section class="slide level1"> <h2 id="vision-for-the-near-future">Vision for the (near) future</h2> +</section> +<section +id="blablador-the-brand-as-an-umbrella-for-all-inference-at-jsc" +class="slide level1"> +<h1>Blablador, the brand, as an umbrella for <em>ALL</em> inference at +JSC</h1> +</section> +<section class="slide level1"> + +<h2 id="jsc-inference-umbrella">JSC Inference umbrella</h2> <ul> -<li class="fragment">Blablador as an umbrella for inference</li> -<li class="fragment">Use cases: +<li class="fragment">Blablador as LLM is the first step</li> +<li class="fragment">Grow to include other types of models for Science +and Industry</li> +</ul> +</section> +<section class="slide level1"> + +<h2 id="use-cases-so-far">Use cases so far</h2> +<ul> +<li class="fragment">LLMs for science (e.g. OpenGPT-X, TrustLLM, +CosmoSage)</li> +<li class="fragment">Ancillary models stemming from <a +href="https://www.earthdata.nasa.gov/dashboard/labs/earthinsights/explore">Prithvi-EO-2.0</a> +(300M and 600M), EO foundation model jointly developed by IBM, NASA, and +JSC: Release tomorrow, Dec. 5th. <ul> -<li class="fragment">LLMs for science</li> -<li class="fragment">Nasaās Prithvi 3 (currently being trained -here)</li> -<li class="fragment">ESAās upcomping model</li> +<li class="fragment">Weather (big) and geospatial downstreams tasks</li> +</ul></li> +<li class="fragment">NASAās <a +href="https://arxiv.org/abs/2410.10841">Helio</a> (downstream tasks in +Heliophysics)</li> +<li class="fragment">JSC/ESA/DLRās upcomping model <a +href="https://eo4society.esa.int/projects/fast-eo/">FAST-EO</a></li> <li class="fragment">Health: Radiology with Aachen Uniklinik</li> -<li class="fragment">ā¦</li> -<li class="fragment">With privacy!</li> +<li class="fragment">JSC/Cern/ECMWFās <a +href="https://www.atmorep.org">Atmorep</a></li> +<li class="fragment">Open models: +<ul> +<li class="fragment">Pango weather</li> +<li class="fragment">Graphcast</li> </ul></li> +<li class="fragment">With privacy!</li> </ul> </section> <section class="slide level1"> +<figure> +<img data-src="images/IMG_6551.jpg" alt="What do we have to do?" /> +<figcaption aria-hidden="true">What do we have to do?</figcaption> +</figure> +</section> +<section class="slide level1"> + <h2 id="todo">Todo</h2> <ul> -<li class="fragment">Multi-modal models (text+image, text+audio, -etc)</li> +<li class="fragment">Multi-modality: video, audio, text, images</li> <li class="fragment">Auto-RAG with privacy: <ul> <li class="fragment">Easy to do badly. 
Hard to do securely.</li> </ul></li> +<li class="fragment">Everything from the previous slide</li> </ul> </section> <section class="slide level1"> +<h2 id="potato">Potato</h2> +<p><img data-src="images/IMG_6561.jpg" width="350" /></p> +</section> +<section id="take-the-slides-with-you" class="slide level1"> +<h1>Take the slides with you</h1> +<figure> +<img data-src="images/talk-jsc-colloquium.png" width="500" +alt="https://go.fzj.de/2024-12-jsc-colloquium" /> +<figcaption +aria-hidden="true">https://go.fzj.de/2024-12-jsc-colloquium</figcaption> +</figure> +</section> +<section class="slide level1"> + <h2 id="questions">Questions?</h2> -<p><img data-src="images/blablador-questions.jpg" /></p> +<figure> +<img data-src="images/blablador-questions.jpg" width="500" +alt="No dogs have been harmed for this presentation" /> +<figcaption aria-hidden="true">No dogs have been harmed for this +presentation</figcaption> +</figure> </section> <section class="slide level1"> @@ -711,6 +826,13 @@ etc)</li> </section> <section class="slide level1"> +<h2 id="llmops-resource">LLMOps resource</h2> +<p>A No-BS Database of How Companies Actually Deploy LLMs in Production: +300+ Technical Case Studies, Including Self-Hosted LLMs in <a +href="https://www.zenml.io/llmops-database">https://www.zenml.io/llmops-database</a></p> +</section> +<section class="slide level1"> + <blockquote> <p>āI think the complexity of Python package management holds down AI application development more than is widely appreciated. AI faces