~www_lesswrong_com | Bookmarks (696)

Will AI R&D Automation Cause a Software Intelligence Explosion? — LessWrong

lesswrong.com

Published on March 26, 2025 6:12 PM GMT. Empirical evidence suggests that, if AI automates AI research, feedback loops could overcome diminishing returns, significantly accelerating AI progress. Summary: AI companies are increasingly using AI systems to accelerate AI research and development. These systems assist with tasks like writing code, analyzing research papers, and generating training data. While current systems struggle with longer and less well-defined tasks, future...
1
Why Does Unemployment Happen? — LessWrong

lesswrong.com

Published on March 26, 2025 6:02 PM GMT. And specifically, what does this imply for AI? There are two theories of equilibrium unemployment (search frictions and efficiency wages), and they actually give diametrically opposite predictions for when search frictions in finding a new job fall. I conclude that frictions are the more likely explanation, but that LLMs may actually increase unemployment if our...
1
Apply to become a Futurekind AI Facilitator or Mentor (deadline: April 10) — LessWrong

lesswrong.com

Published on March 26, 2025 3:47 PM GMT. We are accepting applications for up to 12 paid facilitators & a number of mentors for an upcoming course on AI & animals. Facilitators lead 12 participants in structured discussions (following a provided template) on assigned readings. Facilitating is different from teaching: rather than convey your own knowledge, the goal is to help participants articulate their thoughts via...
1
Finding Emergent Misalignment — LessWrong

lesswrong.com

Published on March 26, 2025 5:33 PM GMT. We've recently published a paper on Emergent Misalignment, where we show that models finetuned to write insecure code become broadly misaligned. Most people agree this is a very surprising observation. Some asked us, "But how did you find it?" There's a short version of the story on X. Here I describe it in more detail. TL;DR: I think...
1
Center on Long-Term Risk: Summer Research Fellowship 2025 - Apply Now — LessWrong

lesswrong.com

Published on March 26, 2025 5:29 PM GMT. Summary: CLR is hiring for our Summer Research Fellowship. Join us for eight weeks to work on s-risk motivated empirical AI safety research. Apply here by Tuesday 15th April 23:59 PT. We, the Center on Long-Term Risk, are looking for Summer Research Fellows to explore strategies for reducing suffering in the long-term future (s-risks) and work on technical AI safety...
1
Eukaryote Skips Town - Why I'm leaving DC — LessWrong

lesswrong.com

Published on March 26, 2025 5:16 PM GMT. I’ve spent the past 7 years living in the DC area. I moved out there from the Pacific Northwest to go to grad school – I got my masters in Biodefense from George Mason University, and then I stuck around, trying to move into the political/governance sphere. That sort of happened. But I will now be sort...
1
Language and My Frustration Continue in Our RSI — LessWrong

lesswrong.com

Published on March 26, 2025 2:13 PM GMT. What's this post about? I make some rants and recommendations about terminology. This is written for AI-not-kill-everyone-ists. If you are worried about AI killing everyone and want us to prevent AI from killing everyone, this post is for you. If you don't have that agenda and instead have other agendas, that's fine. It's great. But this post may fail...
1
Would it be effective to learn a language to improve cognition? — LessWrong

lesswrong.com

Published on March 26, 2025 10:17 AM GMT. o1 has shown a strange behavior where it thinks in Mandarin, while processing English prompts, and translates the results back to English for the output. I realized that the same could be possible for humans to utilize, speeding up conscious thought. [1] What makes Mandarin useful for this is that it: has compact tokens; has compact grammar; has abundant training material online; can...
1
New AI safety treaty paper out! — LessWrong

lesswrong.com

Published on March 26, 2025 9:29 AM GMT. Last year, we (the Existential Risk Observatory) published a Time Ideas piece proposing the Conditional AI Safety Treaty, a proposal to pause AI when AI safety institutes determine that its risks, including loss of control, have become unacceptable. Today, we publish our paper on the topic: “International Agreements on AI Safety: Review and Recommendations for a Conditional AI Safety...
1
Map of all 40 copyright suits v. AI in U.S. — LessWrong

lesswrong.com

Published on March 26, 2025 7:57 AM GMT. Download the latest PDF with links to court dockets here.
1
AI "Deep Research" Tools Reviewed — LessWrong

lesswrong.com

Published on March 24, 2025 6:40 PM GMT. Midjourney: “an artificially intelligent researcher, library, posthuman archivist, mapping the noosphere”. As regular readers are aware, I do a lot of informal lit review. So I was especially interested in checking out the various AI-based “deep research” tools and seeing how they compare. I did a side-by-side comparison, using the same prompt, of Perplexity Deep Research, Gemini...
2
Notes on countermeasures for exploration hacking (aka sandbagging) — LessWrong

lesswrong.com

Published on March 24, 2025 6:39 PM GMT. If we naively apply RL to a scheming AI, the AI may be able to systematically get low reward/performance while simultaneously not having this behavior trained out because it intentionally never explores into better behavior. As in, it intentionally puts very low probability on (some) actions which would perform very well to prevent these actions from being...
1
Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols? — LessWrong

lesswrong.com

Published on March 24, 2025 5:55 PM GMT. We recently released Subversion Strategy Eval: Can language models statelessly strategize to subvert control protocols?, a major update to our previous paper/blogpost, evaluating a broader range of models (e.g. helpful-only Claude 3.5 Sonnet) in more diverse and realistic settings (e.g. untrusted monitoring). Abstract: An AI control protocol is a plan for usefully deploying AI systems that prevents an AI from intentionally...
1
From Loops to Klein Bottles: Uncovering Hidden Topology in High Dimensional Data — LessWrong

lesswrong.com

Published on March 24, 2025 5:09 PM GMT. Motivation: Dimensionality reduction is vital to the analysis of high dimensional data, i.e. data with many features. It allows for better understanding of the data, so that one can formulate useful analyses. Dimensionality reduction that produces a set of points in a vector space of dimension n...
1
Will Jesus Christ return in an election year? — LessWrong

lesswrong.com

Published on March 24, 2025 4:50 PM GMT. Thanks to Jesse Richardson for discussion. Polymarket asks: will Jesus Christ return in 2025? In the three days since the market opened, traders have wagered over $100,000 on this question. The market traded as high as 5%, and is now stably trading at 3%. Right now, if you wanted to, you could place a bet that Jesus Christ will...
1
Sentinel's Global Risks Weekly Roundup #12/2025: Famine in Gaza, H7N9 outbreak, US geopolitical leadership weakening. — LessWrong

lesswrong.com

Published on March 24, 2025 4:46 PM GMT. Executive summary: Forecasters believe there’s an 18% chance (range: 4%-50%) that there will be a famine in any part of Gaza by the end of 2025, according to the UN and its Integrated Food Security Phase Classification (IPC). A Category 5 rating would result in a positive resolution, with the last IPC update suggesting that all of Gaza...
1
Delicious Boy Slop - Boring Diet, Effortless Weightloss — LessWrong

lesswrong.com

Published on March 24, 2025 3:01 PM GMT. Your beloved 34-year-old author is never hungry. I often joke I’m the only traditional rationalist left. The original pitch was that you could radically improve your life by being more strategic. Huge piles of expected value were available. Everyone else seems to have given up, but I’m still a believer. For example, in 2017 Scott Alexander...
1
More on Various AI Action Plans — LessWrong

lesswrong.com

Published on March 24, 2025 1:10 PM GMT. Last week I covered Anthropic’s relatively strong submission, and OpenAI’s toxic submission. This week I cover several other submissions, and do some follow-up on OpenAI’s entry. Google Also Has Suggestions: The most prominent remaining lab is Google. Google focuses on AI’s upside. The vibes aren’t great, but they’re not toxic. The key asks for their ‘pro-innovation’ approach...
1
Emergent scaling effects on the functional hierarchies within LLMs — LessWrong

lesswrong.com

Published on March 24, 2025 1:03 PM GMT. I have been poking around with LLMs, and I found some results that seem broadly interesting. Summary. Introduction: Large language models (LLMs) are usually structured as repeated transformer layers of the same size. However, this architecture is often described as functionally hierarchical with earlier layers focusing on small patches of text while later layers parse document-wide information. I revisited...
1
Recommender Alignment for Lock-In Risk — LessWrong

lesswrong.com

Published on March 24, 2025 12:56 PM GMT. Epistemic status: my own research and reasoning about lock-in risk threat models, and how recommender systems connect to the threat model outlined. I'm fairly confident in the claims about the contribution of recommender systems to filter bubbles, less so on extreme and persuasive content selection effects. TL;DR: We believe lock-in risks are a pressing problem, and that algorithmic technologies...
1
What's the word for the amount of expertise that I, an experienced therapy patient and generally educated person, have on psychology topics? — LessWrong

lesswrong.com

Published on March 23, 2025 5:38 PM GMT. Epistemic status: raising a question that I've found difficult. This topic has frustrated me some, and I think there are a variety of forces pointing in different directions. Maximally conservative approach: "If you're not focused, I mean I can share what works for me but really there's a variety of mental illnesses that can cause lack of focus. I don't...
1
Probability Theory Fundamentals 102: Source of the Sample Space — LessWrong

lesswrong.com

Published on March 23, 2025 5:23 PM GMT. The usual explanation of probability theory goes like this: There is this thing called Probability Space, which consists of three other things: Sample Space - some non-empty set; Event Space - a set of subsets of the Sample Space; Probability Function - a measure function over the elements of the Event Space. And then several examples of how we can merge this...
1
How to mitigate sandbagging — LessWrong

lesswrong.com

Published on March 23, 2025 5:19 PM GMT. Epistemic status: I have worked on sandbagging for ~1 year. I expect to be wrong in multiple ways, but I do think this post provides both a useful high-level model and a good place to discuss how to mitigate sandbagging. Better conceptual approaches probably exist, e.g., selecting different main factors.[1] TL;DR: Fine-tuning access, data quality, and scorability are...
1
Solving willpower seems easier than solving aging — LessWrong

lesswrong.com

Published on March 23, 2025 3:25 PM GMT. I'm awake about 17 hours a day. Of those I'm being productive maybe 10 hours a day. My working definition of productive is in the direction of: "things that I expect I will be glad I did once I've done them"[1]. Things that I personally find productive include: chores, work, eating, cooking, reading a good book, watching TV with my wife/kids, playing with the kids, socialising with...
1

