Experts' AI timelines are longer than you have been told? — LessWrong
Published on January 16, 2025 6:03 PM GMT. This is a linkpost for How should we analyse...
Numberwang: LLMs Doing Autonomous Research, and a Call for Input — LessWrong
Published on January 16, 2025 5:20 PM GMT. Summary: Can LLMs science? The answer to this question can...
Topological Debate Framework — LessWrong
Published on January 16, 2025 5:19 PM GMT. I would like to thank Professor Vincent Conitzer, Caspar...
AI #99: Farewell to Biden — LessWrong
Published on January 16, 2025 2:20 PM GMT. The fun, as it were, is presumably about to...
Deceptive Alignment and Homuncularity — LessWrong
Published on January 16, 2025 1:55 PM GMT. NB: this dialogue occurred at the very end of...
Introducing the WeirdML Benchmark — LessWrong
Published on January 16, 2025 11:38 AM GMT. WeirdML website. Related posts: How good are LLMs at doing ML...
Replicators, Gods and Buddhist Cosmology — LessWrong
Published on January 16, 2025 10:51 AM GMT. From the earliest days of evolutionary thinking, we’ve used...
Quantum without complication — LessWrong
Published on January 16, 2025 8:53 AM GMT. Learning quantum mechanics involves two things: learning the fundamentals, the...
Permanents: much more than you wanted to know — LessWrong
Published on January 16, 2025 8:04 AM GMT. Today's "nanowrimo" post is a fun longform introduction to permanents...
Gaming TruthfulQA: Simple Heuristics Exposed Dataset Weaknesses — LessWrong
Published on January 16, 2025 2:14 AM GMT. (Explanation. Also I have no reason to think they...
Applications Open for the Cooperative AI Summer School 2025! — LessWrong
Published on January 15, 2025 6:16 PM GMT. Applications are now open for the Cooperative AI Summer School,...
List of AI safety papers from companies, 2023–2024 — LessWrong
Published on January 15, 2025 6:00 PM GMT. I'm collecting (x-risk-relevant) safety research from frontier AI companies...
AI Alignment Meme Viruses — LessWrong
Published on January 15, 2025 3:55 PM GMT. Some fraction of the time, LLMs naturally go on...
Looking for humanness in the world wide social — LessWrong
Published on January 15, 2025 2:50 PM GMT. Social networks have shaped me since a young age....
On the OpenAI Economic Blueprint — LessWrong
Published on January 15, 2025 2:30 PM GMT. Table of Contents: Man With a Plan. Oh the...
A problem shared by many different alignment targets — LessWrong
Published on January 15, 2025 2:22 PM GMT. The first section describes problems with a few different...
LLMs for language learning — LessWrong
Published on January 15, 2025 2:08 PM GMT. My current outlook on LLMs is that they are...
Feature request: comment bookmarks — LessWrong
Published on January 15, 2025 6:45 AM GMT. Sometimes I see a comment I'd like to bookmark,...
How do fictional stories illustrate AI misalignment? — LessWrong
Published on January 15, 2025 6:11 AM GMT. This is an article in the featured articles series...
We probably won't just play status games with each other after AGI — LessWrong
Published on January 15, 2025 4:56 AM GMT. There is a view I’ve encountered somewhat often,[1] which can...
Progress links and short notes, 2025-01-13 — LessWrong
Published on January 13, 2025 6:35 PM GMT. Much of this content originated on social media. To follow...
Better antibodies by engineering targets, not engineering antibodies (Nabla Bio) — LessWrong
Published on January 13, 2025 3:05 PM GMT. Note: Thank you to Surge Biswas (founder of Nabla...
Emergent effects of scaling on the functional hierarchies within large language models — LessWrong
Published on January 13, 2025 2:31 PM GMT. Note: I am a postdoc in fMRI neuroscience. I...
Zvi’s 2024 In Movies — LessWrong
Published on January 13, 2025 1:40 PM GMT. Now that I am tracking all the movies I...
Paper club: He et al. on modular arithmetic (part I) — LessWrong
Published on January 13, 2025 11:18 AM GMT. In this post we’ll be looking at the recent...
Moderately More Than You Wanted To Know: Depressive Realism — LessWrong
Published on January 13, 2025 2:57 AM GMT. Depressive realism is the idea that depressed people have...
Applying traditional economic thinking to AGI: a trilemma — LessWrong
Published on January 13, 2025 1:23 AM GMT. Traditional economics thinking has two strong principles, each based...
Do Antidepressants work? (First Take) — LessWrong
Published on January 12, 2025 5:11 PM GMT. I've been researching the controversy over whether antidepressants truly...
AI Developed: A Novel Idea for Harnessing Magnetic Reconnection as an Energy Source — LessWrong
Published on January 12, 2025 5:11 PM GMT. Introduction: Magnetic reconnection, the sudden rearrangement of magnetic field lines, drives dramatic...
Building AI Research Fleets — LessWrong
Published on January 12, 2025 6:23 PM GMT. From AI scientist to AI research fleet: Research automation is...
Near term discussions need something smaller and more concrete than AGI — LessWrong
Published on January 11, 2025 6:24 PM GMT. Motivation: I want a more concrete concept than AGI[1] to talk...
A proposal for iterated interpretability with known-interpretable narrow AIs — LessWrong
Published on January 11, 2025 2:43 PM GMT. I decided, as a challenge to myself, to spend...
We need a universal definition of 'agency' and related words — LessWrong
Published on January 11, 2025 3:22 AM GMT. And by "we" I mean "I". I'm the one...
AI for medical care for hard-to-treat diseases? — LessWrong
Published on January 10, 2025 11:55 PM GMT. With LLM-based AI passing benchmarks that would challenge people...
Beliefs and state of mind into 2025 — LessWrong
Published on January 10, 2025 10:07 PM GMT. This post is to record the state of my...
Is AI Alignment Enough? — LessWrong
Published on January 10, 2025 6:57 PM GMT. Virtually everyone I see in the AI safety community...
Recommendations for Technical AI Safety Research Directions — LessWrong
Published on January 10, 2025 7:34 PM GMT. Anthropic’s Alignment Science team conducts technical research aimed at...
What are some scenarios where an aligned AGI actually helps humanity, but many/most people don't like it? — LessWrong
Published on January 10, 2025 6:13 PM GMT. One can call it "deceptive misalignment": the aligned AGI...
Human takeover might be worse than AI takeover — LessWrong
Published on January 10, 2025 4:53 PM GMT. Epistemic status: sharing rough notes on an important...
The Alignment Mapping Program: Forging Independent Thinkers in AI Safety - A Pilot Retrospective — LessWrong
Published on January 10, 2025 4:22 PM GMT. The Alignment Mapping Program: Forging Independent Thinkers in AI...
Discursive Warfare and Faction Formation — LessWrong
Published on January 9, 2025 4:47 PM GMT. Response to Discursive Games, Discursive Warfare. The discursive distortions you...
Can we rescue Effective Altruism? — LessWrong
Published on January 9, 2025 4:40 PM GMT. Last year Timothy Telleen-Lawton and I recorded a podcast...
AI #98: World Ends With Six Word Story — LessWrong
Published on January 9, 2025 4:30 PM GMT. The world is kind of on fire. The world...
Many Worlds and the Problems of Evil — LessWrong
Published on January 9, 2025 4:10 PM GMT. Summary: The Many-Worlds interpretation of quantum mechanics helps us...
PIBBSS Fellowship 2025: Bounties and Cooperative AI Track Announcement — LessWrong
Published on January 9, 2025 2:23 PM GMT. We're excited to announce that the PIBBSS Fellowship 2025 now...
Thoughts on the In-Context Scheming AI Experiment — LessWrong
Published on January 9, 2025 2:19 AM GMT. These are thoughts in response to the paper "Frontier...
A Systematic Approach to AI Risk Analysis Through Cognitive Capabilities — LessWrong
Published on January 9, 2025 12:18 AM GMT. A Systematic Approach to AI Risk Analysis Through Cognitive...
Aristocracy and Hostage Capital — LessWrong
Published on January 8, 2025 7:38 PM GMT. There’s a conventional narrative by which the pre-20th century...
What is the most impressive game LLMs can play well? — LessWrong
Published on January 8, 2025 7:38 PM GMT. Epistemic status: This is an off-the-cuff question. ~5 years ago...
Ann Altman has filed a lawsuit in US federal court alleging that she was sexually abused by Sam Altman — LessWrong
Published on January 8, 2025 2:59 PM GMT. On January 6, 2025, Ann Altman filed a lawsuit...