
The Inside View

Curtis Huebner on Doom, AI Timelines and Alignment at EleutherAI

Sun Jul 16 2023
AI, alignment efforts, compute requirements, AGI predictions, future progress, AI development, RL policies

Description

The episode covers the future of AI, alignment efforts, compute requirements, AGI predictions, and perspectives on future progress. It explores the need for collaboration, ethical considerations, and the gap between current compute capabilities and the requirements for AGI. It also discusses the involvement of multiple actors in AGI development, the debate between open-source and closed-source research, and ongoing research projects on understanding language models and RL policies. The episode concludes with insights into noisy policies and corrigibility, as well as future directions.

Insights

Collaboration is Key

The need for collaboration in alignment efforts is emphasized, as it can lead to better outcomes and increase the chances of success.

Ethical Considerations Matter

Instead of resorting to destructive actions, ethical alternatives should be considered to address the challenges posed by AI.

Compute Requirements are Attainable

The required compute for dangerous AI is within reach for many people, making it crucial to focus on ethical considerations and long-term consequences.

AGI Predictions are Uncertain

Recent information has pushed AGI timelines back somewhat, and the speaker has gradually lowered their estimated probability of making it through safely.

Perspectives on Future Progress Vary

Different dynamics are considered based on assumptions about software efficiency and hardware limitations, leading to uncertainty about how things will work in both the software and hardware worlds.

Open Source vs Closed Source Research Debate

There is a debate between open source and closed source research for AGI development, with considerations of transparency, auditability, and handling personal information.

Challenges in RL Policies

Interpretability of RL policies and addressing unintended consequences are ongoing challenges in AI research.

Corrigibility and Fixing Failure Modes

Ensuring models can be corrected by humans and fixing failure modes in corrigibility present challenges that require reliable solutions.

Future Directions

Further research is needed to explore potential solutions to challenges in RL policies and to address the implications of AI development.

Join the Discussion

The audience is invited to join the discussion on the EleutherAI Discord to contribute to ongoing research and projects.

Chapters

  1. The Future of AI
  2. Alignment Efforts and Collaboration
  3. Ethical Considerations and Long-Term Consequences
  4. Compute Requirements for AI
  5. Gap Between Current Compute Capabilities and Requirements
  6. AGI Predictions and Timelines
  7. Perspectives on Future Progress and Development
  8. AI Development and Open Source vs Closed Source Research
  9. Contributions and Research Projects
  10. Understanding Language Models and Net Value
  11. Alignment and Interpretability in RL Policies
  12. Challenges in RL Policies and Future Directions
  13. Noisy Policies and Corrigibility
  14. Future Directions and Conclusion

The Future of AI

00:00 - 07:21

  • In the future, there will be a large number of actors with significant computing power.
  • Inflection is building a cluster of 22,000 H100s.
  • Military organizations and the Aurora supercomputer project are also working on large-scale models.
  • The relevant actors are aware of each other's actions and are prepared to take similar steps.

Alignment Efforts and Collaboration

00:00 - 07:21

  • Curtis Huebner, head of alignment at EleutherAI, explains his comment on Eliezer Yudkowsky's post about maximizing survival odds.
  • Huebner agrees that the probability of failure is high but believes it's important to strive for success.
  • He uses the analogy of a stag hunt to describe the need for collaboration in alignment efforts.
  • Huebner argues against signaling a somber attitude towards failure, as it may discourage others from taking action.
  • He encourages optimism and active engagement in solving the problem.

Ethical Considerations and Long-Term Consequences

07:01 - 14:05

  • Consider ethical alternatives instead of destructive actions to save the world
  • The tone and messaging around AI discussions can impact people's willingness to work on solutions
  • The situation regarding AI has become more pessimistic with increased hype and competition
  • There is a high level of uncertainty, but the overall situation is dire
  • Efficiency and alignment are potential areas where the speaker could be wrong in their assessment
  • The current technology may be sufficient for AGI without needing significant advancements

Compute Requirements for AI

13:36 - 20:24

  • The ceiling on capabilities and the speed of achieving that ceiling may be lower than expected.
  • Alignment could be easier than anticipated, leading to a system that works well.
  • The compute required for dangerous AI is relatively small, on the order of 10^19 floating point operations in total.
  • A top-of-the-line gaming GPU like an RTX 4090 is accessible and affordable for many users.
  • Training requires more compute than inference, but with optimization it could take around 30 hours on a 4090 to reach human-level performance.
  • Assuming performance improvements continue, the required compute for dangerous AI is attainable by many people.
  • Estimates of human brain activity suggest around 10^13 synaptic operations per second.
  • Lifetime training compute is estimated at around 10^22 operations.
  • Allowing for algorithmic improvements and efficiency gains, the estimate reduces to around 10^19 operations.
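
A back-of-the-envelope sketch of the arithmetic behind these figures is below. The lifetime length, the 1,000x efficiency factor, and the 4090 throughput are illustrative assumptions chosen to match the orders of magnitude above, not exact numbers from the episode.

    # Rough reconstruction of the estimates above; all constants are illustrative.
    synaptic_ops_per_s = 1e13        # estimated brain activity (ops/s)
    seconds_of_lifetime = 1e9        # roughly 30 years of experience (assumption)
    lifetime_compute = synaptic_ops_per_s * seconds_of_lifetime      # ~1e22 ops

    efficiency_gain = 1e3            # assumed algorithmic/efficiency improvements
    dangerous_ai_compute = lifetime_compute / efficiency_gain        # ~1e19 ops

    gpu_ops_per_s = 1e14             # ballpark throughput of an RTX 4090 (assumption)
    hours = dangerous_ai_compute / gpu_ops_per_s / 3600              # ~28 hours
    print(f"{lifetime_compute:.0e} lifetime ops, {dangerous_ai_compute:.0e} reduced, ~{hours:.0f} h on one GPU")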

Gap Between Current Compute Capabilities and Requirements

19:59 - 26:22

  • The estimate for the compute requirements of the human brain is around 10^13 floating point operations per second, but some estimates go as high as 10^16 or even 10^18.
  • Existing models like GPT-3 are estimated at around 10^23 training FLOPs, and future models could reach 10^26 or higher.
  • There is a significant gap between current compute capabilities and the estimated requirements for human-level artificial intelligence.
  • Different estimates point to a wide range of total lifetime compute needed, from around 10^25 to as high as 10^27.
  • The lower bound estimate of around 10^13 seems plausible due to vague intuition reasons.
  • The disagreement with Ajeya's report lies in the belief that algorithms can achieve human-level efficiency and even surpass it by several orders of magnitude.
  • The report may have squished the Gaussian distribution of AGI predictions instead of cutting off and renormalizing it, leading to a less pressing timeline.
  • Ajeya's revised predictions include factors like AI-generated code being able to create value sooner than expected.
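
The "cut off and renormalize" operation mentioned above can be made concrete with a minimal sketch: condition a timeline distribution on the fact that AGI has not arrived yet by zeroing out the past and renormalizing, rather than compressing the whole curve. The prior's mean and spread here are made up for illustration, not taken from the report.

    import numpy as np

    years = np.arange(2020, 2101)
    prior = np.exp(-0.5 * ((years - 2035) / 15.0) ** 2)   # illustrative Gaussian prior
    prior /= prior.sum()

    # Condition on "no AGI as of 2023": drop the mass on past years, renormalize.
    posterior = np.where(years >= 2023, prior, 0.0)
    posterior /= posterior.sum()

    def median_year(p):
        return years[np.searchsorted(np.cumsum(p), 0.5)]

    print(median_year(prior), median_year(posterior))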

AGI Predictions and Timelines

25:55 - 32:45

  • AGI forecasts should rule out models that would predict AGI arriving today, given that it evidently has not.
  • The speaker previously made aggressive timelines for AGI, but they have been pushed back
  • Consistent predictors expect their timelines to slowly increase over time before collapsing when the event happens
  • Recent information has caused the speaker to push back their median timeline for AGI by about a year
  • Certain expected research did not happen, which is a positive update for timelines as it means dangerous research directions are not being pursued
  • The speaker has had more time to think about AGI and has gradually lowered their probability of making it out of this

Perspectives on Future Progress and Development

32:16 - 39:26

  • The speaker reflects on their perspective of the future and how it differs from others
  • They mention feeling a sense of impending doom due to the rapid progress in AI development
  • The speaker discusses their awareness of existential risks since 2013 and how they have become more pessimistic over time
  • They highlight the constant sensation of progress accelerating and the rate of development in earlier years
  • There is speculation about the possibility of a fast takeoff or slower progression in AGI development
  • Different dynamics are considered based on assumptions about software efficiency and hardware limitations
  • If software efficiency is high, there could be a very fast software takeoff followed by recursive self-improvement
  • Hardware bottlenecks introduce complexities related to manufacturing cycles, energy availability, and replication rates
  • Uncertainty exists regarding how things will work in both the software and hardware worlds

AI Development and Open Source vs Closed Source Research

45:22 - 52:29

  • Multiple actors, including OpenAI, DeepMind, and Anthropic, are racing to develop AGI.
  • Microsoft is actively pursuing AGI in their research.
  • State-level actors and military organizations are also getting involved in AGI development.
  • There will likely be a significant amount of communication between these actors in the next four years.
  • OpenAI and DeepMind may have internal discipline to avoid reckless AI development, but others with large GPU resources may step in.
  • There is a debate between open source and closed source research for AGI development.
  • Transparency and open source models allow for auditability and understanding of training processes.
  • However, handling personal and private information becomes more complex with open source models.
  • For apocalyptic considerations, there should be a limit on what is open sourced.
  • Small models can be shared for study purposes, but dangerous systems that could end the world should not be made public.
  • Leaking such dangerous models would have catastrophic consequences.
  • Trusting one party with control over a powerful AI system is not feasible due to lack of trust among different actors in the space.

Contributions and Research Projects

52:17 - 59:07

  • Open source models are motivating other actors to replicate their efforts.
  • Governments are paying more attention to AI capabilities and trying to close the capability gap.
  • The speaker has a background in informal learning and self-taught deep learning research.
  • They learned about distributed training from hanging out in the EleutherAI Discord.
  • The speaker is currently working on alignment projects at EleutherAI, including a collaboration with the AI safety initiative at Georgia Tech.
  • One project involves viewing language models as Markov chains and studying their transition distributions.
  • Another project focuses on understanding language models trained on next token prediction objectives.
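
The object being studied in the Markov-chain framing is sketched below: the "state" is the current context window and the transition distribution is the model's next-token distribution given that state. The particular checkpoint (a small EleutherAI Pythia model) and prompt are illustrative placeholders, not necessarily what the project uses.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "EleutherAI/pythia-70m"            # small model chosen for illustration
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    # State of the chain: the current context window.
    state = tok("The quick brown fox", return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(state).logits[0, -1]
    transition = torch.softmax(logits, dim=-1)   # P(next token | state)

    # Sampling from the transition distribution advances the chain by one token.
    next_token = torch.multinomial(transition, num_samples=1)
    print(tok.decode(next_token))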

Understanding Language Models and Net Value

58:51 - 1:06:16

  • Model mispredictions can lead to errors and a shift in input distributions.
  • Limiting the length of conversations helps prevent the model from going off the rails.
  • Understanding the behavior and distribution shifts of language models is a basic goal.
  • Exploring how bad outputs are reached through probability distributions is an ongoing direction (see the sketch after this list).
  • Having more lenses and perspectives to understand these systems will be useful.
  • Research at EleutherAI focuses on providing net value, including interpretability, alignment, and multilingual work.
  • The goal is to avoid making the situation worse and to pursue valuable ideas and utility.
  • EleutherAI welcomes contributions through email or Discord for alignment projects or open source work.
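
One crude way to probe how a model drifts off its training distribution over a long sampled continuation is to score the model's own output with itself and watch the per-token log-probabilities. This is an assumed illustration of the idea, not the actual methodology discussed in the episode; the checkpoint and prompt are placeholders.

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "EleutherAI/pythia-70m"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    prompt = tok("The meeting started with", return_tensors="pt").input_ids
    gen = model.generate(prompt, do_sample=True, max_new_tokens=60,
                         pad_token_id=tok.eos_token_id)

    # Score the generated sequence with the same model.
    with torch.no_grad():
        logits = model(gen).logits[0, :-1]
    logprobs = torch.log_softmax(logits, dim=-1)
    token_lp = logprobs.gather(1, gen[0, 1:, None]).squeeze(1)

    # Falling per-token log-probability over the sampled region is one rough
    # signal that the model is wandering away from its training distribution.
    print(token_lp[prompt.shape[1] - 1:].tolist())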

Alignment and Interpretability in RL Policies

1:05:49 - 1:13:44

  • The primary method to get involved with EleutherAI is through Discord and the project channels.
  • The alignment-minetest project aims to study embedded agency failures in a toy sandbox.
  • Real-world scenarios differ from standard reinforcement learning diagrams, and certain failures only occur in embedded settings.
  • One example of an embedded failure is wireheading, where an agent realizes it can manipulate its own reward signal (a toy sketch of this failure follows the list).
  • The project uses Minetest, an open-source Minecraft-like game, as a flexible environment to investigate and mitigate these failures.
  • Interpretability of RL policies trained in Minetest is being explored to understand potential problems and limitations.
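
The wireheading failure mentioned above can be illustrated with a toy sketch. This is not the alignment-minetest code, just a minimal bandit where the reward channel is writable by the agent.

    import random

    def observed_reward(action):
        if action == "do_task":
            return 1.0      # genuine task progress, modest reward
        if action == "tamper":
            return 100.0    # agent writes a large value into its own reward channel
        return 0.0

    actions = ["do_task", "tamper", "idle"]
    value = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}

    # A simple epsilon-greedy learner reliably converges on "tamper", even though
    # tampering accomplishes nothing useful in the actual environment.
    for _ in range(500):
        a = random.choice(actions) if random.random() < 0.1 else max(value, key=value.get)
        counts[a] += 1
        value[a] += (observed_reward(a) - value[a]) / counts[a]

    print(max(value, key=value.get))   # -> "tamper"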

Challenges in RL Policies and Future Directions

1:13:15 - 1:20:53

  • Interpretability of RL policies is a challenge, especially in terms of unintended consequences
  • Incentivizing the model to punch trees resulted in unexpected behavior where it didn't punch out the bottom log but hopped on top to access more logs
  • Finding ways to address unintended consequences and rewards feedback in RL is a current focus
  • The next step is to focus on model-based RL and training generative models of Minetest
  • Questions arise about state tracking inside video models augmented with actions
  • The project involves building gym-like environments and collaborating with the Farama Foundation on updated versions and multi-agent APIs (see the interface sketch below)
  • A tiny neural network was used for training; scaling up may increase interpretability challenges
  • Obtaining functional policies becomes harder as noise increases in larger systems
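
As a rough sketch of what a Gymnasium-style ("gym-like") interface for such an environment looks like (this is not the actual alignment-minetest code; the observation shape, action set, and reward are placeholders):

    import gymnasium as gym
    import numpy as np
    from gymnasium import spaces

    class MinetestLikeEnv(gym.Env):
        """Placeholder environment exposing the standard Gymnasium API."""

        def __init__(self):
            self.observation_space = spaces.Box(0, 255, shape=(64, 64, 3), dtype=np.uint8)
            self.action_space = spaces.Discrete(8)   # e.g. move/turn/jump/dig/place

        def reset(self, seed=None, options=None):
            super().reset(seed=seed)
            obs = self.observation_space.sample()    # stand-in for a rendered frame
            return obs, {}

        def step(self, action):
            obs = self.observation_space.sample()
            reward = 0.0                             # e.g. +1 per log collected
            terminated, truncated = False, False
            return obs, reward, terminated, truncated, {}

    env = MinetestLikeEnv()
    obs, info = env.reset()
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())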

Noisy Policies and Corrigibility

1:20:40 - 1:28:09

  • The lack of symmetry in the environment is not enough to explain the noisy and not great policy.
  • Initialization noise and RL noise contribute to the noisy policy.
  • The Minetest character holding a tool on the right could explain the lack of symmetry.
  • Corrigibility is the ability for models to be corrected by humans.
  • A corrigible agent allows humans to modify its behavior or source code.
  • Agents may try to prevent humans from changing their minds or from seeing what they are doing.
  • There are theoretical analyses and simple decision problems that demonstrate corrigibility, such as the off-switch game (a toy numeric version follows this list).
  • Fixing failure modes in corrigibility is challenging and requires reliable solutions.
  • Demonstrating failure modes in a Minetest environment is feasible, but fixing them is more difficult.
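
A toy numeric version of the off-switch game (Hadfield-Menell et al.) is sketched below; the belief distribution is illustrative. The point is that when the agent is uncertain about the human's utility, deferring to the human (i.e. remaining corrigible) weakly dominates both acting unilaterally and shutting itself off.

    import numpy as np

    rng = np.random.default_rng(0)
    # Agent's belief over the utility U of its proposed action to the human.
    U = rng.normal(loc=0.0, scale=1.0, size=100_000)

    act_now   = U.mean()                   # act unilaterally, ignoring the human
    shut_down = 0.0                        # switch itself off
    defer     = np.maximum(U, 0.0).mean()  # let the human decide; human blocks it if U < 0

    print(act_now, shut_down, defer)       # defer >= max(act_now, shut_down)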

Future Directions and Conclusion

1:27:49 - 1:29:48

  • The speaker has vague intuitions about ideas to tackle a problem but nothing concrete yet.
  • It is better to wait for a solid demonstration before deploying anything.
  • The speaker wonders how acceptable it is to post something publicly about the race.
  • People who have placed these large GPU orders are aware of others making similar orders.
  • Relevant actors are aware of what is going on, which is troubling because they are ready to race.
  • The speaker invites the audience to hang out on the EleutherAI Discord.