This seminar is about random objects in very high-dimensional Euclidean spaces, as they appear for instance in data science. One intriguing property of the associated theory is that our low-dimensional intuition often fails spectacularly, as the following examples show.

- An n-dimensional standard normal distribution is highly concentrated around a sphere (i.e., the surface of a ball) of radius √n, rendering our "bell-curve" intuition useless.
- Consider the unit sphere in high dimensions, and let A be a subset of it covering at least half of its area. Then even a small neighborhood of A covers all but an exponentially small fraction of the sphere, where the exponent grows with both the dimension and the size of the neighborhood.
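
The first phenomenon can be checked numerically. The following is a minimal sketch, not part of the seminar material: the function name `norm_statistics`, the sample size, and the seed are illustrative assumptions. It draws standard normal vectors in several dimensions and compares the average Euclidean norm with √n.

```python
import math
import random
import statistics


def norm_statistics(n, samples=200, seed=0):
    """Return (mean, standard deviation) of the Euclidean norm of
    `samples` independent standard normal vectors in dimension n.

    The name, sample size, and seed are illustrative choices.
    """
    rng = random.Random(seed)
    norms = []
    for _ in range(samples):
        # Sum of n squared independent N(0, 1) coordinates.
        squared = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
        norms.append(math.sqrt(squared))
    return statistics.mean(norms), statistics.stdev(norms)


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        mean, spread = norm_statistics(n)
        print(f"n={n:5d}  sqrt(n)={math.sqrt(n):7.3f}  "
              f"mean norm={mean:7.3f}  std={spread:.3f}")
```

As n grows, the mean norm should track √n while the spread stays of order one, so relative to the radius the mass sits in an increasingly thin shell.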

We will explore the mathematical background of these phenomena and review, in particular, inequalities concerning concentration of probability. Depending on the number of participants and their interests, we might also learn about consequences of these inequalities for random graphs and matrix completion.

#### Organization

Time & location: Wed 16:15 - 17:45, Room: A3/HS001 (Hörsaal), **Exception: A6/SR 025/026 on June 15**

#### Contact

Péter Koltai (peter.koltai@fu-berlin.de)

#### Prerequisites

A rigorous course in probability theory, as well as undergraduate linear algebra and calculus.

#### Literature

[Ver] R. Vershynin: High-Dimensional Probability

https://www.math.uci.edu/~rvershyn/papers/HDP-book/HDP-book.pdf

[BHK] A. Blum, J. Hopcroft, and R. Kannan: Foundations of Data Science

https://home.ttic.edu/~avrim/book.pdf

#### Procedure

The seminar will consist of __weekly student talks__ (~60 min), each followed by a __discussion__. Each talk will be __moderated by another student participant__ of the seminar. Every speaker should present to P. Koltai a **detailed outline of their talk at least two weeks prior to the talk**. Please make an appointment via peter.koltai@fu-berlin.de.

The final grade will be based on each participant's own talk(s).

Topics will be assigned during the **first class on Wednesday, Apr 20**. These include (page numbers refer to the online version of the book [Ver]):

0. Appetizer: Probabilistic proof and approximate version of Carathéodory's theorem

1. Basics on random variables (pp. 6-12)

2. Hoeffding (pp. 13-19)

3. Chernoff + degrees of random graphs (pp. 19-23)

4. Sub-gaussian distributions: Definition and examples (pp. 24-29)

5. General Hoeffding’s and Khintchine’s inequalities, Centering, sub-exponential distributions (pp. 29-35)

6. Bernstein’s inequality & outlook (pp. 37-40)

7. Random vectors in high dimensions: norm & PCA (pp. 42-47)

8. TBA

8+. Johnson-Lindenstrauss lemma & others

#### Schedule

For the schedule, see this link.