Seminar/Proseminar: Large Language Models W24/25

Description

This seminar explores large language models (LLMs), covering both foundational concepts and recent advances in the field. Participants will gain a comprehensive understanding of the architecture, training, and applications of LLMs, based on seminal research papers. The course will be organised as a journal club: each student presents an individual paper, which the group then discusses to make sure everyone understands the ideas presented.

### Potential Topics

   - Neural networks and deep learning basics

   - Sequence modeling and RNNs (Recurrent Neural Networks)

   - Vaswani et al.'s "Attention is All You Need" paper

   - Self-attention mechanism

   - Multi-head attention and positional encoding

   - GPT-1: Radford et al.'s pioneering work

   - GPT-2: Scaling and implications

   - GPT-3: Architectural advancements and few-shot learning

   - BERT (Bidirectional Encoder Representations from Transformers)

   - T5 (Text-To-Text Transfer Transformer)

   - DistilBERT and efficiency improvements

   - Mamba and other SSMs (state space models): Design principles and performance

   - FlashAttention and related methods: Improving efficiency and scalability

   - Training regimes and resource requirements

   - Fine-tuning and transfer learning

   - Emergence of new capabilities

Basic Course Info

| Course No | Course Type | Hours |
|---|---|---|
| 19334617 | Seminar/Proseminar | 2 |

Time Span: 14.10.2024 - 10.02.2025

Instructors: Tim Landgraf

Study Regulation

0086c_k150 2014, BSc Informatik (Mono), 150 LPs
0086d_k135 2014, BSc Informatik (Mono), 135 LPs
0087d_k90 2015, BSc Informatik (Kombi), 90 LPs
0088d_m60 2015, MSc Informatik (Kombi), 60 LPs
0089c_MA120 2014, MSc Informatik (Mono), 120 LPs
0207b_m37 2015, MSc Informatik (Lehramt), 37 LPs
0208b_m42 2015, MSc Informatik (Lehramt), 42 LPs
0458a_m37 2015, MSc Informatik (Lehramt), 37 LPs
0471a_m42 2015, MSc Informatik (Lehramt), 42 LPs
0556a_m37 2018, M-Ed Fach 1 Informatik (Lehramt an Integrierten Sekundarschulen und Gymnasien), 37 LPs
0556b_m37 2023, M-Ed Informatik Fach 1 (Lehramt an Integrierten Sekundarschulen und Gymnasien), 37 LPs
0557a_m42 2018, M-Ed Fach 2 Informatik (Lehramt an Integrierten Sekundarschulen und Gymnasien), 42 LPs
0557b_m42 2023, M-Ed Informatik Fach 2 (Lehramt an Integrierten Sekundarschulen und Gymnasien), 42 LPs
0590b_MA120 2021, MSc Data Science, 120 LPs

Main Events

| Day | Time | Location | Details |
|---|---|---|---|
| Monday | 10-12 | A6/SR 007/008 Seminarraum | 2024-10-14 - 2025-02-10 |
