www.lesswrong.com | Bookmarks (669)
-
How I Learned To Stop Trusting Prediction Markets and Love the Arbitrage — LessWrong
Published on August 6, 2024 2:32 AM GMT. This is a story about a flawed Manifold market,...
-
John Schulman leaves OpenAI for Anthropic — LessWrong
Published on August 6, 2024 1:23 AM GMT. Schulman writes: I shared the following note with my OpenAI...
-
Self-explaining SAE features — LessWrong
Published on August 5, 2024 10:20 PM GMT. TL;DR: We apply the method of SelfIE/Patchscopes to explain SAE...
-
Value fragility and AI takeover — LessWrong
Published on August 5, 2024 9:28 PM GMT. 1. Introduction: “Value fragility,” as I’ll construe it, is the...
-
Madrid - ACX Meetups Everywhere Fall 2024 — LessWrong
Published on August 5, 2024 6:36 PM GMT. This year's Fall ACX Meetup in Madrid. Location: El Retiro...
-
Circular Reasoning — LessWrong
Published on August 5, 2024 6:10 PM GMT. The idea that circular reasoning is bad is widespread....
-
Fear of centralized power vs. fear of misaligned AGI: Vitalik Buterin on 80,000 Hours — LessWrong
Published on August 5, 2024 3:38 PM GMT. Vitalik Buterin wrote an impactful blog post, My techno-optimism....
-
Four Phases of AGI — LessWrong
Published on August 5, 2024 1:15 PM GMT. AGI is not discrete, and different phases lead to...
-
AI Safety at the Frontier: Paper Highlights, July '24 — LessWrong
Published on August 5, 2024 1:00 PM GMT. I'm starting a new blog where I post my...
-
Game Theory and Society — LessWrong
Published on August 5, 2024 4:27 AM GMT. Game theory is a branch of mathematics that deals...
-
Near-mode thinking on AI — LessWrong
Published on August 4, 2024 8:47 PM GMT. There is a stark difference between rehearsing classical AI...
-
We’re not as 3-Dimensional as We Think — LessWrong
Published on August 4, 2024 2:39 PM GMT. While thinking about high-dimensional spaces and their less intuitive properties,...
-
You don't know how bad most things are nor precisely how they're bad. — LessWrong
Published on August 4, 2024 2:12 PM GMT. TL;DR: Your discernment in a subject often improves as...
-
Can We Predict Persuasiveness Better Than Anthropic? — LessWrong
Published on August 4, 2024 2:05 PM GMT. There is an interesting paragraph in Anthropic's most recent...
-
Why do Minimal Bayes Nets often correspond to Causal Models of Reality? — LessWrong
Published on August 3, 2024 12:39 PM GMT. Chapter 2 of Pearl's Causality book claims you can...
-
PIZZA: An Open Source Library for Closed LLM Attribution (or “why did ChatGPT say that?”) — LessWrong
Published on August 3, 2024 12:07 PM GMT. From the research & engineering team at Leap Laboratories...
-
Cooperation and Alignment in Delegation Games: You Need Both! — LessWrong
Published on August 3, 2024 10:16 AM GMT. This work was facilitated by the Oxford AI Safety...
-
SRE's review of Democracy — LessWrong
Published on August 3, 2024 7:20 AM GMT. Day One: We've been handed this old legacy system called...
-
I didn't think I'd take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is! — LessWrong
Published on August 2, 2024 10:35 PM GMT. Basically, the user is shown a splatter of colored...
-
Evaluating Sparse Autoencoders with Board Game Models — LessWrong
Published on August 2, 2024 7:50 PM GMT. This blog post discusses a collaborative research paper on...
-
Ethical Deception: Should AI Ever Lie? — LessWrong
Published on August 2, 2024 5:53 PM GMT. Ethical Deception: Should AI Ever Lie? Personal Artificial Intelligence Assistants...
-
The Bitter Lesson for AI Safety Research — LessWrong
Published on August 2, 2024 6:39 PM GMT. Read the associated paper "Safetywashing: Do AI Safety Benchmarks Actually...
-
Request for AI risk quotes, especially around speed, large impacts and black boxes — LessWrong
Published on August 2, 2024 5:49 PM GMT. @KatjaGrace, Josh Hart, and I are finding quotes around different...
-
A Simple Toy Coherence Theorem — LessWrong
Published on August 2, 2024 5:47 PM GMT. This post presents a simple toy coherence theorem, and...