
Commit 30c698b: notes updates
1 parent 0195ddf commit 30c698b

File tree: 3 files changed, +73 -36 lines

_includes/01_research.html

Lines changed: 47 additions & 32 deletions
@@ -21,19 +21,24 @@ <h2 style="text-align: center; margin-top: -150px;"> Research</h2>
 are interested in interning at MSR, feel free to reach out over email :)</div>
 
 <div class="research_box"><strong>🔎
-Interpretability.</strong> I'm interested in <a href="https://arxiv.org/abs/2402.01761">rethinking
-interpretability</a> in the context of LLMs
+Interpretability methods,</strong> especially <a href="https://arxiv.org/abs/2402.01761">LLM
+interpretability</a>.
 <br>
 <br>
 <a href="https://www.nature.com/articles/s41467-023-43713-1">augmented imodels</a> - use LLMs to build a
 transparent model<br>
+<!-- <a href="https://arxiv.org/abs/2310.14034">tree prompting</a> - improve black-box few-shot text classification -->
+<!-- with decision trees<br> -->
+<a href="https://arxiv.org/abs/2311.02262">attention steering</a> - mechanistically guide LLMs by
+emphasizing specific input
+spans<br>
 <a href="http://proceedings.mlr.press/v119/rieger20a.html">explanation penalization</a> - regularize
 explanations to align models with prior knowledge<br>
 <a href="https://proceedings.neurips.cc/paper/2021/file/acaa23f71f963e96c8847585e71352d6-Paper.pdf">adaptive
-wavelet distillation</a> - replace neural nets with simple, performant wavelet models
+wavelet distillation</a> - replace neural nets with transparent wavelet models
 </div>
 
-<div class="research_box">
+<!-- <div >
 
 <strong>🚗 LLM steering. </strong>Interpretability tools can provide ways to better guide and use LLMs (without
 needing gradients!)
@@ -46,43 +51,49 @@ <h2 style="text-align: center; margin-top: -150px;"> Research</h2>
 spans<br>
 <a href="https://arxiv.org/abs/2210.01848">interpretable autoprompting</a> - automatically find fluent
 natural-language prompts<br>
-</div>
+</div> -->
 
 
 <div class="research_box">
 
-<strong>🧠 Neuroscience. </strong> Since joining MSR, I've been focused on leveraging LLM interpretability
-to understand how the human brain represents language (using fMRI in collaboration with the <a
-href="https://www.cs.utexas.edu/~huth/index.html">Huth lab</a> at UT Austin).
+<strong>🧠 Semantic brain mapping, </strong> mostly using fMRI responses to language.
+<!-- Since joining MSR, I've been focused on leveraging LLM interpretability -->
+<!-- to understand how the human brain represents language (using fMRI in collaboration with the <a -->
+<!-- href="https://www.cs.utexas.edu/~huth/index.html">Huth lab</a> at UT Austin). -->
 <br>
 <br>
-<a href="https://arxiv.org/abs/2410.00812">explanation-mediated validation</a> - build and test fMRI
+<a href="https://arxiv.org/abs/2410.00812">explanation-mediated validation</a> - test fMRI
 explanations using LLM-generated stimuli<br>
-<a href="https://arxiv.org/abs/2405.16714">qa embeddings</a> - build interpretable fMRI encoding models by
+<a href="https://arxiv.org/abs/2405.16714">qa embeddings</a> - predict fMRI language responses by
 asking yes/no questions to LLMs<br>
 <a href="https://arxiv.org/abs/2305.09863">summarize &amp; score explanations</a> - generate natural-language
-explanations of fMRI encoding models
+explanations of fMRI encoding models<br>
 </div>
 
 
 <div class="research_box"><strong>💊
-Healthcare. </strong>I'm also actively working on how we can improve clinical decision instruments by using
-the information contained across various sources in the medical literature (in collaboration with <a
-href="https://profiles.ucsf.edu/aaron.kornblith">Aaron Kornblith</a> at UCSF and the MSR <a
-href="https://www.microsoft.com/en-us/research/group/real-world-evidence/">Health Futures team</a>).
+Clinical decision rules, </strong>can we improve them with data?
+<!-- I'm also actively working on how we can improve clinical decision -->
+<!-- instruments by using -->
+<!-- the information contained across various sources in the medical literature (in collaboration with <a -->
+<!-- href="https://profiles.ucsf.edu/aaron.kornblith">Aaron Kornblith</a> at UCSF and the MSR <a -->
+<!-- href="https://www.microsoft.com/en-us/research/group/real-world-evidence/">Health Futures team</a>). -->
 <br>
 <br>
+<a href="https://arxiv.org/pdf/2201.11931">greedy tree sums</a> - build accurate, compact tree-based clinical
+models<br>
 <a href="https://arxiv.org/abs/2306.00024">clinical self-verification</a> - self-verification improves
 performance and interpretability of clinical information extraction<br>
 <a href="https://journals.plos.org/digitalhealth/article?id=10.1371/journal.pdig.0000076">clinical rule
-vetting</a> - stress testing a clinical decision instrument performance for intra-abdominal injury
-
+vetting</a> - stress testing a clinical decision instrument's performance for intra-abdominal injury<br>
 </div>
 
 <div style="width: 100%;padding: 8px;margin-bottom: 20px; text-align:center; font-size: large;">
-Across these areas, I'm interested in decision trees and how we can build flexible but accurate transparent
-models. I put a lot of my code into the <a href="https://github.com/csinva/imodels">imodels</a> and <a
-href="https://github.com/csinva/imodelsx">imodelsX</a> packages.</div>
+<!-- Across these areas, I'm interested in decision trees and how we can build flexible but accurate transparent -->
+<!-- models. -->
+Note: I put a lot of my code into the <a href="https://github.com/csinva/imodels">imodels</a> and <a
+href="https://github.com/csinva/imodelsx">imodelsX</a> packages.
+</div>
 </div>
 
 <hr>
@@ -153,6 +164,18 @@ <h2 style="text-align: center; margin-top: -150px;"> Research</h2>
 </tr>
 </thead>
 <tbody>
+<tr>
+<td class="center">'25</td>
+<td>Vector-ICL: In-context Learning with Continuous Vector Representations
+</td>
+<td>zhuang et al.</td>
+<td class="med">🔎🌀</td>
+<td class="center"><a href="https://arxiv.org/abs/2410.05629">iclr</a></td>
+<td class="big"><a href="https://github.com/EvanZhuang/vector-icl"><i class="fa fa-github fa-fw"></i></a>
+</td>
+<td class="med">
+</td>
+</tr>
 <tr>
 <td class="center">'24</td>
 <td>Interpretable Language Modeling via Induction-head Ngram Models
@@ -175,6 +198,8 @@ <h2 style="text-align: center; margin-top: -150px;"> Research</h2>
 <td class="big"><a href="https://github.com/microsoft/automated-explanations"><i
 class="fa fa-github fa-fw"></i></a></td>
 <td class="med">
+<a href="https://docs.google.com/presentation/d/1bFZZ8-OwwNxN3DjPdyPFKuP16yyYTP9DlaTzKOaYVWI/"><i
+class="fa fa-desktop fa-fw"></i></a>
 </td>
 </tr>
 <tr>
@@ -187,6 +212,8 @@ <h2 style="text-align: center; margin-top: -150px;"> Research</h2>
 <td class="big"><a href="https://github.com/csinva/interpretable-embeddings"><i
 class="fa fa-github fa-fw"></i></a></td>
 <td class="med">
+<a href="https://docs.google.com/presentation/d/1bFZZ8-OwwNxN3DjPdyPFKuP16yyYTP9DlaTzKOaYVWI/"><i
+class="fa fa-desktop fa-fw"></i></a>
 </td>
 </tr>
 <tr>
@@ -200,18 +227,6 @@ <h2 style="text-align: center; margin-top: -150px;"> Research</h2>
 <td class="med">
 </td>
 </tr>
-<tr>
-<td class="center">'24</td>
-<td>Vector-ICL: In-context Learning with Continuous Vector Representations
-</td>
-<td>zhuang et al.</td>
-<td class="med">🔎🌀</td>
-<td class="center"><a href="https://arxiv.org/abs/2410.05629">arxiv</a></td>
-<td class="big"><a href="https://github.com/EvanZhuang/vector-icl"><i class="fa fa-github fa-fw"></i></a>
-</td>
-<td class="med">
-</td>
-</tr>
 <tr>
 <td class="center">'24</td>
 <td>Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning

_notes/research_ovws/ovw_interp.md

Lines changed: 7 additions & 1 deletion
@@ -309,6 +309,12 @@ For an implementation of many of these models, see the python [imodels package](
 - longitudinal data, survival curves
 
 - misc
+
+  - On the Power of Decision Trees in Auto-Regressive Language Modeling ([gan, galanti, poggio, malach, 2024](https://arxiv.org/pdf/2409.19150))
+    - get token word embeddings
+    - compute exp. weighted avg of embeddings (upweights most recent tokens)
+    - predict the next embedding with XGBoost (regression loss), then find the closest token (see the sketch after this hunk)
+
 - counterfactuals
   - [Counterfactual Explanations for Oblique Decision Trees: Exact, Efficient Algorithms](https://arxiv.org/abs/2103.01096) (2021)
   - [Optimal Counterfactual Explanations in Tree Ensembles](https://arxiv.org/abs/2106.06631)
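
Below is a rough Python sketch of the recipe in the added bullets. It is not the paper's code: the random embedding table, the 0.9 decay, the toy corpus, and the dot-product nearest-token lookup are all stand-in assumptions.

```python
# Hypothetical sketch of the decision-tree LM recipe described above.
import numpy as np
from xgboost import XGBRegressor  # multi-output regression needs xgboost >= 1.6

rng = np.random.default_rng(0)
V, d, decay = 1000, 32, 0.9       # vocab size, embedding dim, recency decay (assumed)
E = rng.normal(size=(V, d))       # frozen token-embedding table (stand-in)

def context_vector(token_ids):
    """Exp. weighted average of token embeddings; most recent token weighted highest."""
    w = decay ** np.arange(len(token_ids))[::-1]
    return (w / w.sum()) @ E[token_ids]

# toy corpus: train on (context vector -> next-token embedding) pairs
seqs = rng.integers(0, V, size=(200, 20))
X = np.stack([context_vector(s[:-1]) for s in seqs])
y = E[seqs[:, -1]]                # regression target is the next token's embedding

model = XGBRegressor(n_estimators=50, max_depth=4)
model.fit(X, y)

def predict_next_token(token_ids):
    e_hat = model.predict(context_vector(token_ids)[None])[0]
    return int(np.argmax(E @ e_hat))  # closest token by dot product (metric assumed)
```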
@@ -322,7 +328,7 @@ For an implementation of many of these models, see the python [imodels package](
   1. feature-level: monotonicity, attribute costs, hierarchy/interaction, fairness, privacy
   2. structure-level - e.g. minimize #nodes
   3. instance-level - must (cannot) link, robust predictions
-
+
 - Analysis of Boolean functions ([wiki](https://en.wikipedia.org/wiki/Analysis_of_Boolean_functions))
 
   - Every real-valued function $f:\{-1,1\}^n \rightarrow \mathbb{R}$ has a unique expansion as a multilinear polynomial:
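
The hunk cuts off at the colon; for reference, the expansion that line refers to is the standard Fourier-Walsh expansion (well-known background, not part of this commit):

$$
f(x) = \sum_{S \subseteq [n]} \hat{f}(S) \prod_{i \in S} x_i,
\qquad
\hat{f}(S) = \mathbb{E}_{x \sim \{-1,1\}^n}\Big[f(x) \prod_{i \in S} x_i\Big].
$$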

_notes/research_ovws/ovw_llms.md

Lines changed: 19 additions & 3 deletions
@@ -1283,7 +1283,7 @@ mixture of experts models have become popular because of the need for (1) fast s
 - Ravel: Evaluating Interpretability Methods on Disentangling Language Model Representations ([huang, wu, potts, geva, & geiger, 2024](https://arxiv.org/pdf/2402.17700v1.pdf))
 
 
-## directly learning algorithms / in-context
+## directly learning algorithms
 
 - Empirical results
   - FunSearch: Mathematical discoveries from program search with LLMs ([deepmind, 2023](https://www.nature.com/articles/s41586-023-06924-6))
@@ -1294,6 +1294,10 @@
   - Alphafold
     - Accurate proteome-wide missense variant effect prediction with AlphaMissense ([deepmind, 2023](https://www.science.org/doi/full/10.1126/science.adg7492)) - predict effects of varying single-amino acid changes
   - Bridging the Human-AI Knowledge Gap: Concept Discovery and Transfer in AlphaZero ([schut...hassabis, paquet, & been kim, 2023](https://arxiv.org/abs/2310.16410))
+  - Learning a Decision Tree Algorithm with Transformers ([zhuang...gao, 2024](https://arxiv.org/abs/2402.03774))
+
+## in-context learning
+
 - What Can Transformers Learn In-Context? A Case Study of Simple Function Classes ([garg, tsipras, liang, & valiant, 2022](https://arxiv.org/abs/2208.01066)) - models can successfully metalearn functions like OLS
   - e.g. during training, learn inputs-outputs from different linear functions
   - during testing, have to predict outputs for inputs from a different linear function
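
A minimal sketch of that train/test setup (my own illustration, not the authors' code; the dimensions are arbitrary, and the OLS solve stands in for the behavior a trained transformer approximately metalearns):

```python
# Sketch of the garg et al. in-context regression setup: each episode samples a
# fresh linear function, shows k (x, f(x)) pairs as context, then queries f(x_query).
import numpy as np

rng = np.random.default_rng(0)
d, k = 8, 16                      # input dimension, context examples per episode

def sample_episode():
    w = rng.normal(size=d)        # a new linear function f(x) = w @ x each episode
    X = rng.normal(size=(k + 1, d))
    y = X @ w
    return X[:-1], y[:-1], X[-1], y[-1]   # context pairs + held-out query

# OLS baseline: with k >= d noiseless pairs, least squares recovers w exactly
Xc, yc, xq, yq = sample_episode()
w_hat, *_ = np.linalg.lstsq(Xc, yc, rcond=None)
print(abs(w_hat @ xq - yq))       # ~0: OLS matches the target function
```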
@@ -1328,6 +1332,12 @@
 - Transformers are Universal In-context Learners ([furuya...peyre, 2024](https://arxiv.org/abs/2408.01367)) - mathematically show that transformers are universal and can approximate continuous in-context mappings to arbitrary precision
 - Limitations
   - Faith and Fate: Limits of Transformers on Compositionality ([dziri...choi, 2023](https://arxiv.org/abs/2305.18654)) - LLMs can't (easily) be trained well for multiplication (and similar tasks)
+- ICLR: In-Context Learning of Representations ([park...wattenberg, tanaka, 2024](https://arxiv.org/abs/2501.00070)) - showing pairs of words sampled from a graph can make the embeddings of those words match the structure of that graph
+- Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning ([wang...sun, 2023](https://aclanthology.org/2023.emnlp-main.609.pdf))
+- Correlation and Navigation in the Vocabulary Key Representation Space of Language Models ([peng...shang, 2024](https://arxiv.org/abs/2410.02284)) - some tokens are correlated in embedding space, and wrong next-token completions can be highly ranked if their embeddings are correlated with correct ones
+  - as we sample tokens in context, we get more diverse completions, skipping nearby wrong next tokens
+
 
 ## cool tasks
 
14831493
- TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data ([yin, neubig, ..., riedel, 2020](https://www.semanticscholar.org/paper/TaBERT%3A-Pretraining-for-Joint-Understanding-of-and-Yin-Neubig/a5b1d1cab073cb746a990b37d42dc7b67763f881))
14841494

14851495
- classification / predictions
1486-
- TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second ([hollman, ..., hutter, 2022](https://arxiv.org/abs/2207.01848))
1496+
- TabPFN v2: Accurate predictions on small data with a tabular foundation model ([hollman....hutter, 2025](https://www.nature.com/articles/s41586-024-08328-6))
1497+
- TabPFN v1: A Transformer That Solves Small Tabular Classification Problems in a Second ([hollman, ..., hutter, 2022](https://arxiv.org/abs/2207.01848))
14871498
- transformer takes in train + test dataset then outputs predictions
14881499
- each row (data example) is treated as a token and test points attend only to training t
14891500
- takes fixed-size 100 columns, with zero-padded columns at the end (during training, randomly subsample columns)
@@ -1494,7 +1505,7 @@ mixture of experts models have become popular because of the need for (1) fast s
14941505
- Language models are weak learners ([manikandan, jian, & kolter, 2023](https://arxiv.org/abs/2306.14101)) - use prompted LLMs as weak learners in boosting algorithm for tabular data
14951506
- TabRet: Pre-training Transformer-based Tabular Models for Unseen Columns ([onishi...hayashi, 2023](https://arxiv.org/abs/2303.15747))
14961507
- AnyPredict: A Universal Tabular Prediction System Based on LLMs https://openreview.net/forum?id=icuV4s8f2c - converting tabular data into machine-understandable prompts and fine-tuning LLMs to perform accurate predictions
1497-
1508+
14981509
- interpretability
14991510
- InterpreTabNet: Enhancing Interpretability of Tabular Data Using Deep Generative Models and LLM ([si...krishnan, 2023](https://openreview.net/pdf?id=kzR5Cj5blw)) - make attention sparse and describe it with GPT4
15001511

@@ -1519,6 +1530,11 @@
 - Embeddings for Tabular Data: A Survey ([singh & bedathur, 2023](https://arxiv.org/abs/2302.11777))
 - Deep neural networks and tabular data: A survey ([borisov et al. 2022]()) - mostly compares performance on standard tasks (e.g. classification)
 
+## education
+
+- Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach ([jurenka...ibrahim, 2024](https://storage.googleapis.com/deepmind-media/LearnLM/LearnLM_paper.pdf))
+  - seven diverse educational benchmarks
+- The FACTS Grounding Leaderboard: Benchmarking LLMs' Ability to Ground Responses to Long-Form Input ([jacovi...das, 2025](https://arxiv.org/abs/2501.03200)) - benchmark evaluates whether responses are consistent with a provided document as context
 
 ## llm limitations / perspectives
 