
Commit 962b277

notes updates

1 parent 03c5c2f commit 962b277

File tree

6 files changed: +62 −16 lines changed

_blog/misc/20_ml_coding_tips.md

Lines changed: 1 addition & 1 deletion

@@ -37,7 +37,7 @@ displays
 - [napari](https://github.com/napari/napari) - image viewer
 - [python-fire](https://github.com/google/python-fire) - passing cmd line args
 - [auto-sklearn](https://github.com/automl/auto-sklearn) - automatically select hyperparams / classifiers using bayesian optimization
-- [venv](https://docs.python.org/3/library/venv.html) - manage your python packages (or something similar like pipenv)
+- use `uv` rather than `pip` / `venv`
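The `uv` tip added above can be sketched as a minimal workflow; the package and script names below are illustrative, not from the notes:

```shell
# create a virtual environment in .venv (replaces `python -m venv .venv`)
uv venv

# install packages into it (replaces `pip install`)
uv pip install numpy pandas

# run a script inside the environment without activating it manually
uv run python train.py
```

Because `uv` resolves and installs dependencies in one fast pass, these three commands cover most of what `pip` + `venv` were doing in the deleted bullet.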
_blog/misc/23_paper_writing_tips.md

Lines changed: 1 addition & 1 deletion

@@ -35,7 +35,7 @@ In general, I highly recommend the book [storytelling with data](https://github.

 ### Writing

-- Here's are some [nice, brief notes](https://www.cs.ubc.ca/~schmidtm/Courses/Notes/writing.pdf) on academic writing
+- Here are some [nice, brief notes](https://www.cs.ubc.ca/~schmidtm/Courses/Notes/writing.pdf) on academic writing
 - Write in present tense
 - First-person is fine when it simplifies writing
 - Make your abstract really good: most people will only ever read the abstract

_notes/ml/nlp.md

Lines changed: 4 additions & 0 deletions

@@ -238,3 +238,7 @@ Nice repo keeping track of progress [here](https://github.com/sebastianruder/NLP
 - **phonology** - study of the organization of sounds independent of their physical realization in speech / sign language
 - **morphology** - study of the internal structure of words in a language
 - **pragmatics** - study of the way in which context during communication contributes to meaning
+- What we mean when we say semantic: Toward a multidisciplinary semantic glossary ([reilly...venson, 2024](https://link.springer.com/article/10.3758/s13423-024-02556-7)) - when a linguist refers to semantics, they are talking about word meaning, whereas a semantic memory typically references concepts
+  - abstract - the quality of a concept (or word) whose meaning is understood primarily on the basis of language, but also draws from interoceptive experiences, including emotion, introspection, and metacognition
+  - concepts - coherent, relatively stable (but not static) units of knowledge in long-term memory that provide the elements from which more complex thoughts can be constructed. A concept captures commonalities and distinctions between a set of objects, events, relations, properties, and states. Concepts allow for the transfer and generalization of information without requiring explicit learning of every new instance

_notes/neuro/comp_neuro.md

Lines changed: 8 additions & 1 deletion

@@ -947,11 +947,13 @@ subtitle: Diverse notes on various topics in computational neuro, data-driven ne
 - DEAP: A Database for Emotion Analysis Using Physiological Signals ([koelstra...ebrahimi, 2012](https://ieeexplore.ieee.org/abstract/document/5871728)) - 32-channel system
 - SEED: Investigating Critical Frequency Bands and Channels for EEG-Based Emotion Recognition with Deep Neural Networks ([zheng & lu, 2015](https://ieeexplore.ieee.org/abstract/document/7104132)) - 64-channel system
 - HBN-EEG dataset ([shirazi...makeig, 2024](https://www.biorxiv.org/content/10.1101/2024.10.03.615261v2)) - EEG recordings from over 3,000 participants across six distinct cognitive tasks [used in eeg2025 NeurIPS competition]
+- YOTO (You Only Think Once): A Human EEG Dataset for Multisensory Perception and Mental Imagery ([chang...wei, 2025](https://www.biorxiv.org/content/10.1101/2025.04.17.645384v1))
 - ECoG
   - The "Podcast" ECoG dataset for modeling neural activity during natural language comprehension ([zada...hasson, 2025](https://www.biorxiv.org/content/10.1101/2025.02.14.638352v1.full.pdf)) - 9 subjects listening to the same story
     - 30-min story (1330 total electrodes, ~5000 spoken words (non-unique)) has female interviewer/voiceover and a male speaker, occasionally background music
     - contextual word embeddings from GPT-2 XL (middle layer) accounted for most of the variance across nearly all the electrodes tested
+  - single-subject intracortical words: https://www.kaggle.com/competitions/brain-to-text-25 (from [card et al. 2024](https://www.nejm.org/doi/full/10.1056/NEJMoa2314132))
 - Brain Treebank: Large-scale intracranial recordings from naturalistic language stimuli ([wang...barbu, 2024](https://arxiv.org/pdf/2411.08343))
   - Some works on this dataset
     - BrainBERT: Self-supervised representation learning for intracranial recordings ([wang...barbu, 2023](https://arxiv.org/abs/2302.14367))

@@ -1113,10 +1115,15 @@ subtitle: Diverse notes on various topics in computational neuro, data-driven ne
 - for stimulus-based experiments, encoding model is the causal direction
 - language
   - Semantic reconstruction of continuous language from non-invasive brain recordings ([tang, lebel, jain, & huth, 2023](https://www.nature.com/articles/s41593-023-01304-9)) - reconstruct continuous natural language from fMRI, including for imagined speech
+  - Generative language reconstruction from brain recordings ([ye...ruotsalo, 2025](https://www.nature.com/articles/s42003-025-07731-7)) - map the brain embedding into token space and finetune an LM to decode text conditioned on the tokens (solves the token-timing issue)
   - Brain-to-Text Decoding: A Non-invasive Approach via Typing ([levy...king, 2025](https://scontent.fphl1-1.fna.fbcdn.net/v/t39.2365-6/475464888_600710912891423_9108680259802499048_n.pdf?_nc_cat=102&ccb=1-7&_nc_sid=3c67a6&_nc_ohc=EryvneL7DMcQ7kNvgFI6M7D&_nc_oc=Adi15_Ln_aPZ_nUY7RyiXzmEzdKu0opFDIwv3J7P55siQ-yn-FUdKQ6_H6PZBKiwBiY&_nc_zt=14&_nc_ht=scontent.fphl1-1.fna&_nc_gid=A441zcs56M0HTpo4ZEEWBSk&oh=00_AYAZ7fX4RhYWqMu2aMria3GoOB6uMNIiIciUQzU0vXy3Tw&oe=67AC0C96)) - decode typed characters from MEG/EEG
   - From Thought to Action: How a Hierarchy of Neural Dynamics Supports Language Production ([zhang, levy, ...king, 2025](https://ai.meta.com/research/publications/from-thought-to-action-how-a-hierarchy-of-neural-dynamics-supports-language-production/)) - when decoding during typing, first decode the phrase, then the word, then the syllable, then the letter
   - Decoding speech from non-invasive brain recordings ([defossez, caucheteux, ..., remi-king, 2022](https://arxiv.org/abs/2208.12266))
+- fNIRS
+  - MindSpeech: Continuous Imagined Speech Decoding using High-Density fNIRS and Prompt Tuning for Advanced Human-AI Interaction (MindPortal; [zhang...dehghani, 2024](https://arxiv.org/abs/2408.05362))
+    - prompts participants to imagine sentences on different topics by providing topic words & keywords; afterwards, participants type out the sentence
+  - MindGPT: Advancing Human-AI Interaction with Non-Invasive fNIRS-Based Imagined Speech Decoding (MindPortal; [zhang...dehghani, 2024](https://arxiv.org/abs/2408.05361)) - classify semantically different sentences from fNIRS during imagined speech
 - vision
   - Decoding the Semantic Content of Natural Movies from Human Brain Activity ([huth...gallant, 2016](https://www.frontiersin.org/journals/systems-neuroscience/articles/10.3389/fnsys.2016.00081/full)) - direct decoding of concepts from movies using hierarchical logistic regression
   - interpreting weights from a decoding model can be tricky: even if a concept is reflected in a voxel, it may not be uniquely reflected in that voxel and may therefore be assigned low weight
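The encoding-model setup noted above (e.g. predicting each electrode's response from contextual word embeddings, as in the Podcast dataset) is commonly fit with per-channel ridge regression. A minimal sketch on simulated data; the dimensions, noise level, and data here are stand-ins, not the actual dataset:

```python
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# stand-ins for real data: 2000 words with 256-dim contextual embeddings,
# predicting responses at 50 electrodes
n_words, d_embed, n_elec = 2000, 256, 50
X = rng.standard_normal((n_words, d_embed))              # word embeddings
W_true = 0.1 * rng.standard_normal((d_embed, n_elec))    # unknown linear map
Y = X @ W_true + rng.standard_normal((n_words, n_elec))  # simulated neural response

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)

# ridge regression with cross-validated regularization, one output per electrode
model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, Y_tr)
preds = model.predict(X_te)

# standard encoding metric: correlation of predicted vs. held-out response per electrode
r = np.array([np.corrcoef(preds[:, i], Y_te[:, i])[0, 1] for i in range(n_elec)])
print(f"mean held-out correlation across electrodes: {r.mean():.2f}")
```

Per-electrode held-out correlation is the quantity behind claims like "embeddings accounted for most of the variance"; squaring `r` gives the variance-explained view.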

_notes/research_ovws/ovw_interp.md

Lines changed: 8 additions & 0 deletions

@@ -464,6 +464,7 @@ Symbolic regression learns a symbolic expression for a function (e.g. a mathemat
 - BC-LLM: Bayesian Concept Bottleneck Models with LLM Priors ([feng...tan, 2024](https://arxiv.org/abs/2410.15555)) - use LLM to generate questions from extracted keywords, then iterate on fitting predictive models and searching for new concepts with a Bayesian approach
 - LaBO: Language in a Bottle: Language Model Guided Concept Bottlenecks for Interpretable Image Classification ([yang...yatskar, 2022](https://arxiv.org/pdf/2211.11158.pdf)) - generate prompt-based features using GPT-3 (e.g. "brown head with white stripes") and use CLIP to check for the presence of those features, all before learning a simple linear model
 - Knowledge-enhanced Bottlenecks (KnoBo) - A Textbook Remedy for Domain Shifts: Knowledge Priors for Medical Image Analysis ([yang...yatskar, 2024](https://yueyang1996.github.io/papers/knobo.pdf)) - CBMs that incorporate knowledge priors constraining them to reason with clinically relevant factors found in medical textbooks or PubMed
+- Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models ([gandikota...torralba, bau, 2024](https://link.springer.com/chapter/10.1007/978-3-031-73661-2_10))
 - CB-LLM: Crafting Large Language Models for Enhanced Interpretability ([sun...lily weng, 2024](https://lilywenglab.github.io/WengLab_2024_CBLLM.pdf))
   - compute embedding similarity of concepts and input, and train a layer to predict each of these similarity scores as a concept bottleneck
   - before training the bottleneck, use ChatGPT to help correct any concept scores that seem incorrect

@@ -1098,6 +1099,13 @@ How interactions are defined and summarized is a very difficult thing to specify
 - VANISH: Variable Selection Using Adaptive Nonlinear Interaction Structures in High Dimensions ([radchenko & james, 2012](https://www.tandfonline.com/doi/abs/10.1198/jasa.2010.tm10130?casa_token=HhKY3HXj0fYAAAAA:vTRgqAqWy3DZ9r9vXEinQOZbWuctLPA3J9bACTbrnKIkUPV19yqaDV5zr9dD6IiTrYXsj6HT_kDYNN8)) - learns pairwise interactions via basis function expansions
   - uses a hierarchy constraint
 - Coefficient tree regression: fast, accurate and interpretable predictive modeling ([surer, apley, & malthouse, 2021](https://link-springer-com.libproxy.berkeley.edu/article/10.1007/s10994-021-06091-7)) - iteratively group linear terms with similar coefficients into a bigger term
+- The Most Important Features in GAMs Might Be Groups of Features ([bosschieter...caruana, pohl, 2025](https://arxiv.org/pdf/2506.19937)) - define the importance of a group of features as the average over samples of the absolute value of the sum of the group's component functions
+  - single-feature importance for feature $x_j$: $I_{x_j}=\frac{1}{|\mathscr{T}|} \sum_{t \in \mathscr{T}}\left|f_j\left(t_j\right)\right|$
+  - group feature importance for group $G=\left\{x_{i_1}, \ldots, x_{i_k}\right\}$: $I_G:=\frac{1}{|\mathscr{T}|} \sum_{t \in \mathscr{T}}\left|f_{i_1}\left(t_{i_1}\right)+\cdots+f_{i_k}\left(t_{i_k}\right)\right|$

## finding influential examples
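The single-feature and group importance definitions above can be illustrated on a toy additive model; the shape functions and correlated inputs below are made up to show how two features that each look important alone can form a group with near-zero importance when their contributions cancel:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy fitted GAM: f(x) = f1(x1) + f2(x2) + f3(x3)
shape_fns = [np.sin, lambda x: -np.sin(x), np.square]

# evaluation samples T; x2 is nearly identical to x1, so f1 + f2 ~ 0
x1 = rng.standard_normal(1000)
x2 = x1 + 0.01 * rng.standard_normal(1000)
x3 = rng.standard_normal(1000)
T = np.column_stack([x1, x2, x3])

def single_importance(j):
    """I_{x_j}: mean over samples of |f_j(t_j)|"""
    return np.mean(np.abs(shape_fns[j](T[:, j])))

def group_importance(group):
    """I_G: mean over samples of |sum_{i in G} f_i(t_i)|"""
    return np.mean(np.abs(sum(shape_fns[i](T[:, i]) for i in group)))

print(single_importance(0), single_importance(1))  # both clearly nonzero
print(group_importance([0, 1]))                    # near zero: contributions cancel
print(group_importance([0, 2]))
```

The absolute value sits outside the sum in $I_G$, which is exactly what lets cancellation (or reinforcement) between a group's shape functions change the picture relative to summing single-feature importances.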
