BioLLM
SAE (Sparse Autoencoder)
Lecture: CS294A Lecture notes by Andrew Ng
SAE survey
Paper: A Survey on Sparse Autoencoders: Interpreting the Internal Mechanisms of Large Language Models
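For orientation, a minimal PyTorch sketch of the SAE objective. Dimensions and the penalty weight are illustrative; Ng's notes use a KL sparsity penalty toward a target activation rate, while recent LLM-interpretability SAEs often use L1 or TopK instead.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Minimal SAE: an overcomplete dictionary trained to reconstruct
    activations through a sparse latent code."""
    def __init__(self, d_model: int, d_latent: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, d_model)

    def forward(self, x):
        z = torch.relu(self.encoder(x))  # sparse latent code
        return self.decoder(z), z        # reconstruction, code

sae = SparseAutoencoder(d_model=768, d_latent=768 * 8)
x = torch.randn(32, 768)  # e.g. residual-stream activations
x_hat, z = sae(x)
# Reconstruction error plus a sparsity penalty on the code
loss = ((x_hat - x) ** 2).mean() + 1e-3 * z.abs().mean()
```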

Transcoder
Paper: Transcoders Find Interpretable LLM Feature Circuits
Code: Transcoder-circuits: reverse-engineering LLM circuits with transcoders
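Unlike an SAE, which reconstructs its own input, a transcoder is trained to imitate an MLP sublayer's input-to-output map through a sparse bottleneck. A minimal sketch (dimensions and penalty are illustrative, not the paper's training setup):

```python
import torch
import torch.nn as nn

class Transcoder(nn.Module):
    """Sparse approximation of an MLP sublayer: maps the MLP's input
    to the MLP's output rather than reconstructing its own input."""
    def __init__(self, d_model: int, d_latent: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_latent)
        self.decoder = nn.Linear(d_latent, d_model)

    def forward(self, mlp_in):
        z = torch.relu(self.encoder(mlp_in))
        return self.decoder(z), z

tc = Transcoder(d_model=768, d_latent=768 * 8)
mlp_in = torch.randn(32, 768)   # activations entering the MLP sublayer
mlp_out = torch.randn(32, 768)  # stand-in for the MLP's actual output
pred, z = tc(mlp_in)
# Match the teacher MLP's output while keeping the code sparse
loss = ((pred - mlp_out) ** 2).mean() + 1e-3 * z.abs().mean()
```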
Sparse Crosscoders
Paper: Sparse Crosscoders for Cross-Layer Features and Model Diffing
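A crosscoder encodes activations from several layers into one shared sparse code and decodes each layer separately; the paper weights the sparsity penalty by decoder norms, simplified here to a plain L1. An illustrative sketch:

```python
import torch
import torch.nn as nn

class Crosscoder(nn.Module):
    """One shared sparse code encoded from, and decoded back to,
    activations at several layers."""
    def __init__(self, n_layers: int, d_model: int, d_latent: int):
        super().__init__()
        self.encoders = nn.ModuleList([nn.Linear(d_model, d_latent) for _ in range(n_layers)])
        self.decoders = nn.ModuleList([nn.Linear(d_latent, d_model) for _ in range(n_layers)])

    def forward(self, acts):  # acts: list of (N, d_model) tensors, one per layer
        z = torch.relu(sum(enc(a) for enc, a in zip(self.encoders, acts)))
        return [dec(z) for dec in self.decoders], z

xc = Crosscoder(n_layers=3, d_model=768, d_latent=768 * 8)
acts = [torch.randn(32, 768) for _ in range(3)]
recons, z = xc(acts)
# Sum of per-layer reconstruction errors plus a (simplified) L1 penalty
loss = sum(((r - a) ** 2).mean() for r, a in zip(recons, acts)) + 1e-3 * z.abs().mean()
```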
Evo2
Paper: Genome modeling and design across all domains of life with Evo 2
Code: https://github.com/ArcInstitute/evo2
Evo 2: Genome modeling and design across all domains of life
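Basic usage follows the pattern in the ArcInstitute/evo2 README; treat the exact class and method names below as assumptions and check the repo for the current API.

```python
# Sketch based on the usage shown in the ArcInstitute/evo2 README;
# class and method names are assumptions to verify against the repo.
import torch
from evo2 import Evo2

model = Evo2('evo2_7b')  # downloads the checkpoint on first use

sequence = 'ACGT'
input_ids = torch.tensor(
    model.tokenizer.tokenize(sequence), dtype=torch.int,
).unsqueeze(0).to('cuda:0')

outputs, _ = model(input_ids)  # forward pass
logits = outputs[0]            # per-position logits over the nucleotide vocabulary
```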
CellVerse
Paper: CellVerse: Do Large Language Models Really Understand Cell Biology?

C2S (cell2sentence)
Paper: Scaling Large Language Models for Next-Generation Single-Cell Analysis
Model: C2S-Scale-Gemma-2-2B, [C2S-Scale-Gemma-2-27B](https://huggingface.co/vandijklab/C2S-Scale-Gemma-2-27B)
Code: https://github.com/vandijklab/cell2sentence
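The core C2S idea is to serialize a cell's expression profile as a "cell sentence": gene names rank-ordered by expression. A toy illustration of that transformation (the cell2sentence library provides its own utilities; `cell_to_sentence` here is a hypothetical helper):

```python
import numpy as np

def cell_to_sentence(expr: np.ndarray, gene_names: list[str], k: int = 100) -> str:
    """Toy version of the C2S transformation: the top-k expressed genes,
    highest expression first, joined into a space-separated 'sentence'."""
    order = np.argsort(expr)[::-1][:k]
    return " ".join(gene_names[i] for i in order)

expr = np.array([0.0, 5.2, 1.3, 9.9])
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
print(cell_to_sentence(expr, genes, k=3))  # -> "GENE_D GENE_B GENE_C"
```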

Training Transcoder on C2S
Paper: Transcoder-based Circuit Analysis for Interpretable Single-Cell Foundation Models
Model: vandijklab/C2S-Pythia-410m-cell-type-prediction
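The checkpoint is a standard Pythia causal LM on the Hugging Face Hub, so it loads with vanilla transformers; the prompt below is a hypothetical cell sentence, and the exact template the checkpoint expects is documented on its model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "vandijklab/C2S-Pythia-410m-cell-type-prediction"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

# Hypothetical cell sentence; see the model card for the expected template.
prompt = "MALAT1 B2M TMSB4X RPL13 RPS27"
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```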
Training
SmolLM2
Paper: SmolLM2: When Smol Goes Big — Data-Centric Training of a Small Language Model
Model: SmolLM2
State-of-the-art compact LLMs for on-device applications, available in 1.7B, 360M, and 135M parameter sizes
Dataset: EleutherAI/SmolLM2-135M-10B
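All three sizes load the same way through transformers (base 135M variant shown; instruct variants are also available):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "HuggingFaceTB/SmolLM2-135M"  # also: SmolLM2-360M, SmolLM2-1.7B
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)

inputs = tokenizer("Gravity is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```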
sparsify
Dataset:
Code: https://github.com/EleutherAI/sparsify
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
from sparsify import Sae

# Load one pretrained SAE per hookpoint as a {hookpoint: Sae} dict
# (per the sparsify README; newer versions may name the class differently).
saes = Sae.load_many("EleutherAI/sae-llama-3-8b-32x")

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")
model = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")

inputs = tokenizer("Hello, world!", return_tensors="pt")
with torch.inference_mode():
    outputs = model(**inputs, output_hidden_states=True)

latent_acts = []
for sae, hidden_state in zip(saes.values(), outputs.hidden_states):
    # Each SAE expects a 2-D (N, D) input, so flatten (batch, seq) together
    hidden_state = hidden_state.flatten(0, 1)
    latent_acts.append(sae.encode(hidden_state))
# Do stuff with the latent activations
```