Randomized linear algebra
Randomized linear algebra, aka sketching, uses randomized embeddings to reduce the dimensionality, and thereby improve the computational efficiency, of large-scale problems in numerical linear algebra.
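To make the idea concrete, here is a minimal numpy sketch of the sketch-and-solve approach to overdetermined least squares; the sizes and the plain Gaussian sketching matrix are illustrative choices, not taken from any particular reference below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tall least-squares problem: minimize ||A x - b||_2 with n >> d.
n, d = 10_000, 50
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Gaussian sketching matrix S (m x n) with m << n; the 1/sqrt(m) scaling
# gives E[S^T S] = I, so sketched norms are unbiased.
m = 500
S = rng.standard_normal((m, n)) / np.sqrt(m)

# Solve the much smaller sketched problem min ||S A x - S b||_2.
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# Residual of the sketched solution relative to the optimum (close to 1).
print(np.linalg.norm(A @ x_sketch - b) / np.linalg.norm(A @ x_exact - b))
```

Gaussian sketches are the simplest to analyze; structured transforms such as the FJLT discussed below give similar guarantees but can be applied faster.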
Community activity
Simons Institute:
- 2013 Big Data Boot Camp: Drineas & Mahoney: Past, present and future of randomized numerical linear algebra (video)
- 2018 Foundations of Data Science Boot Camp (abstracts, video)
- Woodruff & Clarkson: Sketching for linear algebra
- Mahoney: Sampling for linear algebra, statistics, and optimization
- 2018 Workshop: Randomized numerical linear algebra and applications (abstracts, video)
Literature
Surveys
- Halko, Martinsson, Tropp, 2011: Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions (doi, arxiv)
- Mahoney, 2011: Randomized algorithms for matrices and data (doi, arxiv)
- Woodruff, 2014: Sketching as a tool for numerical linear algebra (doi, arxiv)
- Kannan & Vempala, 2017: Randomized algorithms in numerical linear algebra (doi, pdf)
Random projections and the Johnson–Lindenstrauss lemma
There are now several textbook treatments:
- Vempala, 2004: The Random Projection Method
- Vershynin, 2018: High-dimensional Probability
- Sec 5.3: Application: Johnson–Lindenstrauss lemma
- Sec 9.3: The Johnson–Lindenstrauss lemma for infinite sets
- Wainwright, 2019: High-dimensional Statistics, Example 2.12: Johnson–Lindenstrauss embedding
See also the literature on metric embeddings.
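As a quick numerical illustration of the lemma (a hedged sketch, with constants chosen for convenience rather than taken from any of the texts above): project \(k\) points from \(\mathbb{R}^n\) down to \(m = O(\varepsilon^{-2} \log k)\) dimensions with a Gaussian map and check that all pairwise distances are preserved within a factor \(1 \pm \varepsilon\).

```python
import numpy as np

rng = np.random.default_rng(1)

n, k, eps = 1_000, 100, 0.25
m = int(np.ceil(8 * np.log(k) / eps**2))      # ~ eps^{-2} log k; constant is illustrative

X = rng.standard_normal((k, n))               # k points in R^n
G = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian JL map: E||Gu||^2 = ||u||^2
Y = X @ G.T                                   # projected points in R^m

def pdists(Z):
    # Pairwise Euclidean distances via the Gram-matrix identity.
    sq = (Z ** 2).sum(axis=1)
    return np.sqrt(np.maximum(sq[:, None] + sq[None, :] - 2 * Z @ Z.T, 0.0))

iu = np.triu_indices(k, 1)
ratios = pdists(Y)[iu] / pdists(X)[iu]
print(ratios.min(), ratios.max())  # typically within [1 - eps, 1 + eps]
```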
The fast Johnson–Lindenstrauss transform (FJLT) is still confined to the research literature (a code sketch of one common variant follows the list below):
- Ailon & Chazelle, 2009: The fast Johnson–Lindenstrauss transform and approximate nearest neighbors (doi)
- Matoušek, 2008: On variants of the Johnson–Lindenstrauss lemma (doi, pdf)
- Alternate proof of, and slight improvement on, the FJLT
- Readable and self-contained exposition, but it can be made much simpler, nearly trivial, with the modern machinery of measure concentration
- All lemmas in Sec 2 are now standard (Vershynin, 2018, High-dimensional Probability, Chapter 2)
- The key Proposition 3.2 is Bernstein’s inequality for sub-exponential random variables (Vershynin, 2018, Theorem 2.8.2, esp. p. 35)
- With this background, Theorem 3.1 (the J-L lemma with independent sub-Gaussian projection coefficients) is proved in one paragraph
- Lemma 4.2 is equivalent to saying \(Y\) has sub-exponential norm at most \(C \alpha / \sqrt{2q}\) (Vershynin, 2018, Proposition 2.7.1 (v))
- Theorem 4.1 (a sparse J-L lemma for vectors with small \(\ell_\infty\) norm) then follows quickly from Bernstein’s inequality (I don’t think Matoušek’s truncation argument is necessary)
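For reference, a minimal sketch of the subsampled randomized Hadamard transform (SRHT), one common FJLT-style construction; note that Ailon & Chazelle’s original FJLT uses a sparse Gaussian projection in place of the plain coordinate sampling used here, and the sizes are illustrative.

```python
import numpy as np
from scipy.linalg import hadamard

rng = np.random.default_rng(2)

n = 1024  # must be a power of two for the Hadamard matrix
m = 64    # target dimension

x = rng.standard_normal(n)

# SRHT: flip signs with the random diagonal D, mix with the normalized
# Hadamard matrix H, then sample m coordinates uniformly with replacement.
D = rng.choice([-1.0, 1.0], size=n)
H = hadamard(n) / np.sqrt(n)   # orthogonal: H @ H.T = I
idx = rng.integers(0, n, size=m)
y = np.sqrt(n / m) * (H @ (D * x))[idx]

print(np.linalg.norm(x), np.linalg.norm(y))  # norms agree up to small error
```

The dense Hadamard matrix is used only for clarity; the point of the FJLT is that \(H D x\) can be applied in \(O(n \log n)\) time with a fast Walsh–Hadamard transform.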