~www_lesswrong_com | Bookmarks (667)
-
Michael Dickens' Caffeine Tolerance Research — LessWrong
Published on September 4, 2024 3:41 PM GMT. Michael Dickens has read the research and performed two...
-
LW editor bug? — LessWrong
Published on September 4, 2024 2:58 PM GMT. Currently within My Drafts in my LW user space,...
-
Are UV-C Air purifiers so useful? — LessWrong
Published on September 4, 2024 2:16 PM GMT. Does anyone know good practical research about the effect...
-
AI and the Technological Richter Scale — LessWrong
Published on September 4, 2024 2:00 PM GMT. The Technological Richter scale is introduced about 80% of...
-
Is there any rigorous work on using anthropic uncertainty to prevent situational awareness / deception? — LessWrong
Published on September 4, 2024 12:40 PM GMT. AI systems up to some high level of intelligence...
-
Announcing the Ultimate Jailbreaking Championship — LessWrong
Published on September 4, 2024 12:35 AM GMT. Gray Swan AI is hosting an LLM jailbreaking championship,...
-
AI Safety at the Frontier: Paper Highlights, August '24 — LessWrong
Published on September 3, 2024 7:17 PM GMT. This is a selection of AI safety paper highlights...
-
The Checklist: What Succeeding at AI Safety Will Involve — LessWrong
Published on September 3, 2024 6:18 PM GMT. Crossposted by habryka with Sam's permission. Expect lower probability...
-
Survey: How Do Elite Chinese Students Feel About the Risks of AI? — LessWrong
Published on September 2, 2024 6:11 PM GMT. Intro: In April 2024, my colleague and I (both affiliated...
-
Data-driven donations to help Democrats win federal elections: an update — LessWrong
Published on September 2, 2024 4:32 PM GMT. Linking to an update to an earlier post about...
-
My decomposition of the alignment problem — LessWrong
Published on September 2, 2024 12:21 AM GMT. Epistemic status: Exploratory. Summary: In this post I will decompose...
-
What are the effective utilitarian pros and cons of having children (in rich countries)? — LessWrong
Published on September 2, 2024 10:01 AM GMT. I have one child and do not want more,...
-
A primer on the next generation of antibodies — LessWrong
Published on September 1, 2024 10:37 PM GMT. Introduction: If you want a primer on antibodies, I recommend...
-
Who looked into extreme nuclear meltdowns? — LessWrong
Published on September 1, 2024 9:38 PM GMT
-
Redundant Attention Heads in Large Language Models For In Context Learning — LessWrong
Published on September 1, 2024 8:08 PM GMT. In this post, I claim a few things and...
-
Book Review: What Even Is Gender? — LessWrong
Published on September 1, 2024 4:09 PM GMT. I submitted this review to the 2024 ACX book...
-
Can a Bayesian Oracle Prevent Harm from an Agent? (Bengio et al. 2024) — LessWrong
Published on September 1, 2024 7:46 AM GMT. Yoshua Bengio wrote a blogpost about a new AI...
-
San Francisco ACX Meetup “First Saturday” — LessWrong
Published on September 1, 2024 4:48 AM GMT. Date: Saturday, September 7th, 2024. Time: 1 pm – 3...
-
Epistemic states as a potential benign prior — LessWrong
Published on August 31, 2024 6:26 PM GMT. Malignancy in the prior seems like a strong crux...
-
My Model of Epistemology — LessWrong
Published on August 31, 2024 5:01 PM GMT. I regularly get asked by friends and colleagues for...
-
Verification methods for international AI agreements — LessWrong
Published on August 31, 2024 2:58 PM GMT. TLDR: A new paper summarizes some verification methods for...
-
Fake Blog Posts as a Problem Solving Device — LessWrong
Published on August 31, 2024 9:22 AM GMT. This is a very brief post about a simple...
-
Anthropic is being sued for copying books to train Claude — LessWrong
Published on August 31, 2024 2:57 AM GMT. OpenAI faces 10 copyright lawsuits and Anthropic is starting...
-
Can Large Language Models effectively identify cybersecurity risks? — LessWrong
Published on August 30, 2024 8:20 PM GMT. TL;DR: I was interested in the ability of LLMs...