
Commit 962b277

notes updates

1 parent 03c5c2f commit 962b277

File tree

6 files changed: +62 −16 lines changed

_blog/misc/20_ml_coding_tips.md

Lines changed: 1 addition & 1 deletion

@@ -37,7 +37,7 @@ displays
 - [napari](https://github.com/napari/napari) - image viewer
 - [python-fire](https://github.com/google/python-fire) - passing cmd line args
 - [auto-sklearn](https://github.com/automl/auto-sklearn) - automatically select hyperparams / classifiers using bayesian optimization
-- [venv](https://docs.python.org/3/library/venv.html) - manage your python packages (or something similar like pipenv)
+- use `uv` rather than `pip` / `venv`
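The `uv` tip added above can be sketched as a minimal workflow; the package and script names below are illustrative, not from the notes:

```shell
# create a virtual environment in .venv (replaces `python -m venv .venv`)
uv venv

# install packages into it (replaces `pip install`)
uv pip install numpy pandas

# run a script inside the environment without activating it manually
uv run python train.py
```

Because `uv` resolves and installs dependencies in one fast pass, these three commands cover most of what `pip` + `venv` were doing in the deleted bullet.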
_blog/misc/23_paper_writing_tips.md

Lines changed: 1 addition & 1 deletion

@@ -35,7 +35,7 @@ In general, I highly recommend the book [storytelling with data](https://github.

 ### Writing

-- Here's are some [nice, brief notes](https://www.cs.ubc.ca/~schmidtm/Courses/Notes/writing.pdf) on academic writing
+- Here are some [nice, brief notes](https://www.cs.ubc.ca/~schmidtm/Courses/Notes/writing.pdf) on academic writing
 - Write in present tense
 - First-person is fine when it simplifies writing
 - Make your abstract really good: most people will only ever read the abstract

_notes/ml/nlp.md

Lines changed: 4 additions & 0 deletions

@@ -238,3 +238,7 @@ Nice repo keeping track of progress [here](https://github.com/sebastianruder/NLP
 - **phonology** - study of the organization of sounds independent of their physical realization in speech / sign language
 - **morphology** - study of the internal structure of words in a language
 - **pragmatics** - study of the way in which context during communication contributes to meaning
+- What we mean when we say semantic: Toward a multidisciplinary semantic glossary ([reilly...venson, 2024](https://link.springer.com/article/10.3758/s13423-024-02556-7)) - when a linguist refers to semantics, they are talking about word meaning, whereas a semantic memory typically references concepts
+  - abstract - the quality of a concept (or word) whose meaning is understood primarily on the basis of language, but also draws from interoceptive experiences, including emotion, introspection, and metacognition
+  - concepts - coherent, relatively stable (but not static) units of knowledge in long-term memory that provide the elements from which more complex thoughts can be constructed. A concept captures commonalities and distinctions between a set of objects, events, relations, properties, and states. Concepts allow for the transfer and generalization of information without requiring explicit learning of every new instance

_notes/neuro/comp_neuro.md

Lines changed: 8 additions & 1 deletion

@@ -947,11 +947,13 @@ subtitle: Diverse notes on various topics in computational neuro, data-driven ne
 - DEAP: A Database for Emotion Analysis Using Physiological Signals ([koelstra...ebrahimi, 2012](https://ieeexplore.ieee.org/abstract/document/5871728)) - 32-channel system
 - SEED: Investigating Critical Frequency Bands and Channels for EEG-Based Emotion Recognition with Deep Neural Networks ([zheng & lu, 2015](https://ieeexplore.ieee.org/abstract/document/7104132)) - 64-channel system
 - HBN-EEG dataset ([shirazi...makeig, 2024](https://www.biorxiv.org/content/10.1101/2024.10.03.615261v2)) - EEG recordings from over 3,000 participants across six distinct cognitive tasks [used in eeg2025 NeurIPS competition]
+- YOTO (You Only Think Once): A Human EEG Dataset for Multisensory Perception and Mental Imagery ([chang...wei, 2025](https://www.biorxiv.org/content/10.1101/2025.04.17.645384v1))
 - ECoG
   - The "Podcast" ECoG dataset for modeling neural activity during natural language comprehension ([zada...hasson, 2025](https://www.biorxiv.org/content/10.1101/2025.02.14.638352v1.full.pdf)) - 9 subjects listening to the same story
     - 30-min story (1330 total electrodes, ~5000 spoken words (non-unique)) has female interviewer/voiceover and a male speaker, occasionally background music
     - contextual word embeddings from GPT-2 XL (middle layer) accounted for most of the variance across nearly all the electrodes tested
+  - single-subject intracortical words: https://www.kaggle.com/competitions/brain-to-text-25 (from [card et al. 2024](https://www.nejm.org/doi/full/10.1056/NEJMoa2314132))
 - Brain Treebank: Large-scale intracranial recordings from naturalistic language stimuli ([wang...barbu, 2024](https://arxiv.org/pdf/2411.08343))
   - Some works on this dataset
     - BrainBERT: Self-supervised representation learning for intracranial recordings ([wang...barbu, 2023](https://arxiv.org/abs/2302.14367))

@@ -1113,10 +1115,15 @@ subtitle: Diverse notes on various topics in computational neuro, data-driven ne
 - for stimulus-based experiments, encoding model is the causal direction
 - language
   - Semantic reconstruction of continuous language from non-invasive brain recordings ([tang, lebel, jain, & huth, 2023](https://www.nature.com/articles/s41593-023-01304-9)) - reconstruct continuous natural language from fMRI, including for imagined speech
+  - Generative language reconstruction from brain recordings ([ye...ruotsalo, 2025](https://www.nature.com/articles/s42003-025-07731-7)) - map the brain embedding into token space and finetune an LM to decode text conditioned on the tokens (solves the token-timing issue)
   - Brain-to-Text Decoding: A Non-invasive Approach via Typing ([levy...king, 2025](https://scontent.fphl1-1.fna.fbcdn.net/v/t39.2365-6/475464888_600710912891423_9108680259802499048_n.pdf?_nc_cat=102&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=EryvneL7DMcQ7kNvgFI6M7D&_nc_oc=Adi15_Ln_aPZ_nUY7RyiXzmEzdKu0opFDIwv3J7P55siQ-yn-FUdKQ6_H6PZBKiwBiY&_nc_zt=14&_nc_ht=scontent.fphl1-1.fna&_nc_gid=A441zcs56M0HTpo4ZEEWBSk&oh=00_AYAZ7fX4RhYWqMu2aMria3GoOB6uMNIiIciUQzU0vXy3Tw&oe=67AC0C96)) - decode typed characters from MEG/EEG
   - From Thought to Action: How a Hierarchy of Neural Dynamics Supports Language Production ([zhang, levy, ...king, 2025](https://ai.meta.com/research/publications/from-thought-to-action-how-a-hierarchy-of-neural-dynamics-supports-language-production/)) - when decoding during typing, first decode the phrase, then the word, then the syllable, then the letter
   - Decoding speech from non-invasive brain recordings ([defossez, caucheteux, ..., remi-king, 2022](https://arxiv.org/abs/2208.12266))
+- fNIRS
+  - MindSpeech: Continuous Imagined Speech Decoding using High-Density fNIRS and Prompt Tuning for Advanced Human-AI Interaction (MindPortal; [zhang...dehghani, 2024](https://arxiv.org/abs/2408.05362))
+    - prompts participants to imagine sentences on different topics by providing topic words & keywords; afterwards, participants type out the sentence
+  - MindGPT: Advancing Human-AI Interaction with Non-Invasive fNIRS-Based Imagined Speech Decoding (MindPortal; [zhang...dehghani, 2024](https://arxiv.org/abs/2408.05361)) - classify semantically different sentences from fNIRS during imagined speech
 - vision
   - Decoding the Semantic Content of Natural Movies from Human Brain Activity ([huth...gallant, 2016](https://www.frontiersin.org/journals/systems-neuroscience/articles/10.3389/fnsys.2016.00081/full)) - direct decoding of concepts from movies using hierarchical logistic regression
   - interpreting weights from a decoding model can be tricky: even if a concept is reflected in a voxel, it may not be uniquely reflected in that voxel and may therefore be assigned low weight
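The encoding-model setup noted above (e.g. predicting each electrode's response from contextual word embeddings, as in the Podcast dataset) is commonly fit with per-channel ridge regression. A minimal sketch on simulated data; the dimensions, noise level, and data here are stand-ins, not the actual dataset:

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# stand-ins for real data: 2000 words with 256-dim contextual embeddings,
# predicting responses at 50 electrodes
n_words, d_embed, n_elec = 2000, 256, 50
X = rng.standard_normal((n_words, d_embed))              # word embeddings
W_true = 0.1 * rng.standard_normal((d_embed, n_elec))    # unknown linear map
Y = X @ W_true + rng.standard_normal((n_words, n_elec))  # simulated neural response

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)

# ridge regression with cross-validated regularization, one output per electrode
model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, Y_tr)
preds = model.predict(X_te)

# standard encoding metric: correlation of predicted vs. held-out response per electrode
r = np.array([np.corrcoef(preds[:, i], Y_te[:, i])[0, 1] for i in range(n_elec)])
print(f"mean held-out correlation across electrodes: {r.mean():.2f}")
```

Per-electrode held-out correlation is the quantity behind claims like "embeddings accounted for most of the variance"; squaring `r` gives the variance-explained view.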

_notes/research_ovws/ovw_interp.md

Lines changed: 8 additions & 0 deletions

@@ -464,6 +464,7 @@ Symbolic regression learns a symbolic expression for a function (e.g. a mathemat
 - BC-LLM: Bayesian Concept Bottleneck Models with LLM Priors ([feng...tan, 2024](https://arxiv.org/abs/2410.15555)) - use LLM to generate questions from extracted keywords, then iterate on fitting predictive models and searching for new concepts with a Bayesian approach
 - LaBO: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification ([yang...yatskar, 2022](https://arxiv.org/pdf/2211.11158.pdf)) - generate prompt-based features using GPT-3 (e.g. "brown head with white stripes") and use CLIP to check for the presence of those features, all before learning a simple linear model
 - Knowledge-enhanced Bottlenecks (KnoBo) - A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis ([yang...yatskar, 2024](https://yueyang1996.github.io/papers/knobo.pdf)) - CBMs that incorporate knowledge priors constraining them to reason with clinically relevant factors found in medical textbooks or PubMed
+- Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models ([gandikota...torralba, bau, 2024](https://link.springer.com/chapter/10.1007/978-3-031-73661-2_10))
 - CB-LLM: Crafting Large Language Models for Enhanced Interpretability ([sun...lily weng, 2024](https://lilywenglab.github.io/WengLab_2024_CBLLM.pdf))
   - compute embedding similarity of concepts and input, and train a layer to predict each of these similarity scores as a concept bottleneck
   - before training the bottleneck, use ChatGPT to help correct any concept scores that seem incorrect

@@ -1098,6 +1099,13 @@ How interactions are defined and summarized is a very difficult thing to specify
 - VANISH: Variable Selection Using Adaptive Nonlinear Interaction Structures in High Dimensions ([radchenko & james, 2012](https://www.tandfonline.com/doi/abs/10.1198/jasa.2010.tm10130?casa_token=HhKY3HXj0fYAAAAA:vTRgqAqWy3DZ9r9vXEinQOZbWuctLPA3J9bACTbrnKIkUPV19yqaDV5zr9dD6IiTrYXsj6HT_kDYNN8)) - learns pairwise interactions via basis function expansions
   - uses a hierarchy constraint
 - Coefficient tree regression: fast, accurate and interpretable predictive modeling ([surer, apley, & malthouse, 2021](https://link-springer-com.libproxy.berkeley.edu/article/10.1007/s10994-021-06091-7)) - iteratively group linear terms with similar coefficients into a bigger term
+- The Most Important Features in GAMs Might Be Groups of Features ([bosschieter...caruana, pohl, 2025](https://arxiv.org/pdf/2506.19937)) - define the importance of a group of features as the average over samples of the absolute value of the sum of the group's component functions
+  - single-feature importance for feature $x_j$: $I_{x_j}=\frac{1}{|\mathscr{T}|} \sum_{t \in \mathscr{T}}\left|f_j\left(t_j\right)\right|$
+  - group feature importance for group $G=\left\{x_{i_1}, \ldots, x_{i_k}\right\}$: $I_G:=\frac{1}{|\mathscr{T}|} \sum_{t \in \mathscr{T}}\left|f_{i_1}\left(t_{i_1}\right)+\cdots+f_{i_k}\left(t_{i_k}\right)\right|$

## finding influential examples
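The single-feature and group importance definitions above can be illustrated on a toy additive model; the shape functions and correlated inputs below are made up to show how two features that each look important alone can form a group with near-zero importance when their contributions cancel:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy fitted GAM: f(x) = f1(x1) + f2(x2) + f3(x3)
shape_fns = [np.sin, lambda x: -np.sin(x), np.square]

# evaluation samples T; x2 is nearly identical to x1, so f1 + f2 ~ 0
x1 = rng.standard_normal(1000)
x2 = x1 + 0.01 * rng.standard_normal(1000)
x3 = rng.standard_normal(1000)
T = np.column_stack([x1, x2, x3])

def single_importance(j):
    """I_{x_j}: mean over samples of |f_j(t_j)|"""
    return np.mean(np.abs(shape_fns[j](T[:, j])))

def group_importance(group):
    """I_G: mean over samples of |sum_{i in G} f_i(t_i)|"""
    return np.mean(np.abs(sum(shape_fns[i](T[:, i]) for i in group)))

print(single_importance(0), single_importance(1))  # both clearly nonzero
print(group_importance([0, 1]))                    # near zero: contributions cancel
print(group_importance([0, 2]))
```

The absolute value sits outside the sum in $I_G$, which is exactly what lets cancellation (or reinforcement) between a group's shape functions change the picture relative to summing single-feature importances.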
