www.lesswrong.com | Bookmarks (669)
-
Optimizing Repeated Correlations — LessWrong
Published on August 1, 2024 5:33 PM GMT. At my work, we run experiments – we specify some...
-
Are unpaid UN internships a good idea? — LessWrong
Published on August 1, 2024 3:06 PM GMT. Disclaimer: I am outside of the world of international...
-
The need for multi-agent experiments — LessWrong
Published on August 1, 2024 5:14 PM GMT. TL;DR: Let’s start iterating on experiments that approximate real,...
-
Dragon Agnosticism — LessWrong
Published on August 1, 2024 5:00 PM GMT. I'm agnostic on the existence of dragons. I...
-
Morristown ACX Meetup — LessWrong
Published on August 1, 2024 4:29 PM GMT. A couple of months ago I created a meetup...
-
Some comments on intelligence — LessWrong
Published on August 1, 2024 3:17 PM GMT. After reading another article on IQ, there are a...
-
AI #75: Math is Easier — LessWrong
Published on August 1, 2024 1:40 PM GMT. Google DeepMind got a silver medal at the IMO,...
-
Temporary Cognitive Hyperparameter Alteration — LessWrong
Published on August 1, 2024 10:27 AM GMT. Social anxiety is one hell of a thing. I...
-
Technology and Progress — LessWrong
Published on August 1, 2024 4:49 AM GMT. The audio version can be listened to here: In this...
-
2/3 Aussie & NZ AI Safety folk often or sometimes feel lonely or disconnected (and 16 other barriers to impact) — LessWrong
Published on August 1, 2024 1:15 AM GMT. I did what I think is the largest piece...
-
Self-Other Overlap: A Neglected Approach to AI Alignment — LessWrong
Published on July 30, 2024 4:22 PM GMT. Figure 1. Image generated by DALL-E 3 to represent the...
-
Investigating the Ability of LLMs to Recognize Their Own Writing — LessWrong
Published on July 30, 2024 3:41 PM GMT. This post is an interim progress report on work...
-
Can Generalized Adversarial Testing Enable More Rigorous LLM Safety Evals? — LessWrong
Published on July 30, 2024 2:57 PM GMT. Thanks to Zora Che, Michael Chen, Andi Peng, Lev...
-
RTFB: California’s AB 3211 — LessWrong
Published on July 30, 2024 1:10 PM GMT. Some in the tech industry decided now was the...
-
Evaluating the ROI of Information — LessWrong
Published on July 30, 2024 5:36 AM GMT. After consuming new information, it can be considered: Trivia if...
-
If You Can Climb Up, You Can Climb Down — LessWrong
Published on July 30, 2024 12:00 AM GMT. A few weeks ago Julia wrote about how...
-
AI Safety Newsletter #39: Implications of a Trump Administration for AI Policy. Plus, Safety Engineering — LessWrong
Published on July 29, 2024 5:50 PM GMT. Welcome to the AI Safety Newsletter by the Center...
-
An Interpretability Illusion from Population Statistics in Causal Analysis — LessWrong
Published on July 29, 2024 2:50 PM GMT. This is an informal note on an interpretability illusion...
-
How tokenization influences prompting? — LessWrong
Published on July 29, 2024 10:28 AM GMT. I was thinking about how a prompt differs from training...
-
Understanding Positional Features in Layer 0 SAEs — LessWrong
Published on July 29, 2024 9:36 AM GMT. This is an informal research note. It is the...
-
Making Beliefs Pay Rent — LessWrong
Published on July 28, 2024 5:59 PM GMT. "Making Beliefs Pay Rent (in Anticipated Experiences)" is one...
-
This is already your second chance — LessWrong
Published on July 28, 2024 5:13 PM GMT. Cross-posted from Substack. I. And the sky opened, and from the...
-
Has Eliezer publicly and satisfactorily responded to attempted rebuttals of the analogy to evolution? — LessWrong
Published on July 28, 2024 12:23 PM GMT. I refer to these posts: https://optimists.ai/2023/11/28/ai-is-easy-to-control/ https://www.lesswrong.com/posts/hvz9qjWyv8cLX9JJR/evolution-provides-no-evidence-for-the-sharp-left-turn https://www.lesswrong.com/posts/CoZhXrhpQxpy9xw9y/where-i-agree-and-disagree-with-eliezer My (poor, maybe mis-) understanding...
-
Family and Society — LessWrong
Published on July 28, 2024 7:05 AM GMT. The PDF version can be read here. The audio version...