~www_lesswrong_com | Bookmarks (706)
-
Spencer Greenberg hiring a personal/professional/research remote assistant for 5-10 hours per week — LessWrong
Published on March 2, 2025 6:01 PM GMT
-
Will LLM agents become the first takeover-capable AGIs? — LessWrong
Published on March 2, 2025 5:15 PM GMT One of my takeaways from EA Global this year...
-
Not-yet-falsifiable beliefs? — LessWrong
Published on March 2, 2025 2:11 PM GMT I recently encountered an unusual argument in favor of...
-
Saving Zest — LessWrong
Published on March 2, 2025 12:00 PM GMT I realized I've been eating oranges wrong for...
-
Open Thread Spring 2025 — LessWrong
Published on March 2, 2025 2:33 AM GMT If it’s worth saying, but not worth its own...
-
help, my self image as rational is affecting my ability to empathize with others — LessWrong
Published on March 2, 2025 2:06 AM GMT There is some part of me, which cannot help...
-
Maintaining Alignment during RSI as a Feedback Control Problem — LessWrong
Published on March 2, 2025 12:21 AM GMT Crossposted from my personal blog. Recent advances have begun to...
-
Share AI Safety Ideas: Both Crazy and Not — LessWrong
Published on March 1, 2025 7:08 PM GMT AI safety is one of the most critical issues...
-
Historiographical Compressions: Renaissance as An Example — LessWrong
Published on March 1, 2025 6:21 PM GMT I’ve been reading Ada Palmer’s great “Inventing The Renaissance”,...
-
Real-Time Gigstats — LessWrong
Published on March 1, 2025 2:10 PM GMT For a while (2014, 2015, 2016, 2017,...
-
An Open Letter To EA and AI Safety On Decelerating AI Development — LessWrong
Published on February 28, 2025 5:21 PM GMT Tl;dr: when it comes to AI, we need to...
-
Dance Weekend Pay II — LessWrong
Published on February 28, 2025 3:10 PM GMT The world would be better with a lot...
-
Existentialists and Trolleys — LessWrong
Published on February 28, 2025 2:01 PM GMT How might an existentialist approach this notorious thought experiment...
-
On Emergent Misalignment — LessWrong
Published on February 28, 2025 1:10 PM GMT One hell of a paper dropped this week. It...
-
Do safety-relevant LLM steering vectors optimized on a single example generalize? — LessWrong
Published on February 28, 2025 12:01 PM GMT This is a linkpost for our recent paper on...
-
Cycles (a short story by Claude 3.7 and me) — LessWrong
Published on February 28, 2025 7:04 AM GMT Content warning: this story is AI generated slop. The kitchen...
-
January-February 2025 Progress in Guaranteed Safe AI — LessWrong
Published on February 28, 2025 3:10 AM GMT Ok this one got too big, I’m done grouping...
-
Weirdness Points — LessWrong
Published on February 28, 2025 2:23 AM GMT Vegans are often disliked. That's what I read online...
-
[New Jersey] HPMOR 10 Year Anniversary Party 🎉 — LessWrong
Published on February 27, 2025 10:30 PM GMT It's been 10 years since the final chapter of...
-
OpenAI releases GPT-4.5 — LessWrong
Published on February 27, 2025 9:40 PM GMT This is not o3; it is what they'd internally...
-
The non-tribal tribes — LessWrong
Published on February 26, 2025 5:22 PM GMT Author note: This is basically an Intro to the...
-
Fuzzing LLMs sometimes makes them reveal their secrets — LessWrong
Published on February 26, 2025 4:48 PM GMT Scheming AIs may have secrets that are salient to...
-
You can just wear a suit — LessWrong
Published on February 26, 2025 2:57 PM GMT I like stories where characters wear suits. Since I like...
-
Minor interpretability exploration #1: Grokking of modular addition, subtraction, multiplication, for different activation functions — LessWrong
Published on February 26, 2025 11:35 AM GMT Epistemic status: small exploration without previous predictions, results low-stakes...