Randomized linear algebra
Randomized linear algebra, aka sketching, uses randomized embeddings to reduce the dimensionality, and thereby improve the computational efficiency, of large-scale problems in numerical linear algebra.
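To make the idea concrete, here is a minimal numpy sketch of the sketch-and-solve approach to overdetermined least squares; the sizes and the plain Gaussian sketching matrix are illustrative choices, not taken from any particular reference below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tall least-squares problem: minimize ||A x - b||_2 with n >> d.
n, d = 10_000, 50
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Gaussian sketching matrix S (m x n) with m << n; the 1/sqrt(m) scaling
# gives E[S^T S] = I, so sketched norms are unbiased.
m = 500
S = rng.standard_normal((m, n)) / np.sqrt(m)

# Solve the much smaller sketched problem min ||S A x - S b||_2.
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# Residual of the sketched solution relative to the optimum (close to 1).
print(np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b))
```

Gaussian sketches are the simplest to analyze; structured transforms such as the FJLT discussed below give similar guarantees but can be applied faster.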
Community activity
Simons Institute:
- 2013 Big Data Boot Camp: Drineas & Mahoney: Past, present and future of randomized numerical linear algebra (video)
- 2018 Foundations of Data Science Boot Camp (abstracts, video)
- Woodruff & Clarkson: Sketching for linear algebra
- Mahoney: Sampling for linear algebra, statistics, and optimization
- 2018 Workshop: Randomized numerical linear algebra and applications (abstracts, video)
Literature
Surveys
- Halko, Martinsson, Tropp, 2011: Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions (doi, arxiv)
- Mahoney, 2011: Randomized algorithms for matrices and data (doi, arxiv)
- Woodruff, 2014: Sketching as a tool for numerical linear algebra (doi, arxiv)
- Kannan & Vempala, 2017: Randomized algorithms in numerical linear algebra (doi, pdf)
Random projections and the Johnson–Lindenstrauss lemma
There are now several textbook treatments:
- Vempala, 2004: The Random Projection Method
- Vershynin, 2018: High-dimensional Probability
- Sec 5.3: Application: Johnson–Lindenstrauss lemma
- Sec 9.3: The Johnson–Lindenstrauss lemma for infinite sets
- Wainwright, 2019: High-dimensional Statistics, Example 2.12: Johnson–Lindenstrauss embedding
See also the literature on metric embeddings.
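As a quick numerical illustration of the lemma (a hedged sketch, with constants chosen for convenience rather than taken from any of the texts above): project \(k\) points from \(\mathbb{R}^n\) down to \(m = O(\varepsilon^{-2} \log k)\) dimensions with a Gaussian map and check that all pairwise distances are preserved within a factor \(1 \pm \varepsilon\).

```python
import numpy as np

rng = np.random.default_rng(1)

n, k, eps = 1_000, 100, 0.25
m = int(np.ceil(8 * np.log(k) / eps**2))      # ~ eps^{-2} log k; constant is illustrative

X = rng.standard_normal((k, n))               # k points in R^n
G = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian JL map: E||Gu||^2 = ||u||^2
Y = X @ G.T                                   # projected points in R^m

def pdists(Z):
    # Pairwise Euclidean distances via the Gram-matrix identity.
    sq = (Z ** 2).sum(axis=1)
    return np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2 * Z @ Z.T, 0.0))

iu = np.triu_indices(k, 1)
ratios = pdists(Y)[iu] / pdists(X)[iu]
print(ratios.min(), ratios.max())  # typically within [1 - eps, 1 + eps]
```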
The fast Johnson–Lindenstrauss transform (FJLT) is still confined to the research literature (a code sketch of one common variant follows the list below):
- Ailon & Chazelle, 2009: The fast Johnson–Lindenstrauss transform and approximate nearest neighbors (doi)
- Matoušek, 2008: On variants of the Johnson–Lindenstrauss lemma (doi, pdf)
- Alternate proof of, and slight improvement on, the FJLT
- Readable and self-contained exposition, but it can be made much simpler, nearly trivial, with the modern machinery of measure concentration
- All lemmas in Sec 2 are now standard (Vershynin, 2018, High-dimensional Probability, Chapter 2)
- The key Proposition 3.2 is Bernstein’s inequality for sub-exponential random variables (Vershynin, 2018, Theorem 2.8.2, esp. p. 35)
- With this background, Theorem 3.1 (the J-L lemma with independent sub-Gaussian projection coefficients) is proved in one paragraph
- Lemma 4.2 is equivalent to saying \(Y\) has sub-exponential norm at most \(C \alpha / \sqrt{2q}\) (Vershynin, 2018, Proposition 2.7.1 (v))
- Theorem 4.1 (a sparse J-L lemma for vectors with small \(\ell_\infty\) norm) then follows quickly from Bernstein’s inequality (I don’t think Matoušek’s truncation argument is necessary)
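For reference, a minimal sketch of the subsampled randomized Hadamard transform (SRHT), one common FJLT-style construction; note that Ailon & Chazelle’s original FJLT uses a sparse Gaussian projection in place of the plain coordinate sampling used here, and the sizes are illustrative.

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(2)

n = 1024  # must be a power of two for the Hadamard matrix
m = 64    # target dimension

x = rng.standard_normal(n)

# SRHT: flip signs with the random diagonal D, mix with the normalized
# Hadamard matrix H, then sample m coordinates uniformly with replacement.
D = rng.choice([-1.0, 1.0], size=n)
H = hadamard(n) / np.sqrt(n)   # orthogonal: H @ H.T = I
idx = rng.integers(0, n, size=m)
y = np.sqrt(n / m) * (H @ (D * x))[idx]

print(np.linalg.norm(x), np.linalg.norm(y))  # norms agree up to small error
```

The dense Hadamard matrix is used only for clarity; the point of the FJLT is that \(H D x\) can be applied in \(O(n \log n)\) time with a fast Walsh–Hadamard transform.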