Gradient Dissent: Conversations on AI

Exploring PyTorch and Open-Source Communities with Soumith Chintala, VP/Fellow of Meta, Co-Creator of PyTorch

Thu Jul 13 2023

PyTorch, Deep Learning, Neural Networks, Community Building, Performance Optimization, Open Source, AI Development, Robotics

Description

This episode explores the success of PyTorch as an accelerated scientific computing library for neural networks. It discusses the transition from symbolic execution models to eager execution models, the introduction of compiled mode for performance optimization, and the importance of community building and engagement. The episode also delves into open-source culture, motivation, and recognition in the PyTorch community. It explores diverse applications and open-source models, as well as the future of AI and robotics. The challenges and requests in AI development are also addressed.

Insights

PyTorch's success is driven by community engagement

PyTorch's success can be attributed to its community building and engagement, rather than just technical differences.

Compiled mode improves PyTorch performance

PyTorch 2.0 introduced a compiled mode for large-scale experiments, providing significant performance gains.

Intrinsic motivation is key in open-source projects

Intrinsic motivation is more effective than extrinsic rewards in open-source projects.

Open source models empower individuals

Open source models provide empowerment and transparency, enabling access for individuals who can't afford expensive software.

Future of AI includes home robotics

The future of AI involves progress in hardware robotics, building the brain, and human-robot interaction. Automation of home robotics is expected in about seven years.

Constructing the loss function is crucial in AI development

Constructing the loss function carefully is important for achieving desired outcomes in AI development.

PyTorch community values recognition and hard work

Recognizing good quality work and hard work consistently over time is valued in the PyTorch community.

Open sourcing depends on beliefs and values

The decision to open source models depends on what people value more and their beliefs about the safety and benefits of open sourcing.

Challenges in AI development include complexity and expectations

People often underestimate the complexity of AI and expect it to solve everything easily, posing challenges in AI development.

Seamless adjustment of training aspects is a future request

A future request is for a system that allows adjustment of all aspects of training in a seamless way, which might be fulfilled by the recently launched product called Weave.

Chapters

Introduction to PyTorch
Execution Models in Deep Learning
Compiled Mode and Performance Optimization
Community Building and Engagement
Community Development and Open Source Culture
Motivation and Recognition in Open Source
Exploring Diverse Applications and Open Source Models
The Future of AI and Robotics
Challenges and Requests in AI Development

Summary

Transcript

Introduction to PyTorch

00:03 - 09:17

PyTorch is an accelerated scientific computing library used for writing neural networks.
PyTorch originated from the torch open source community and was rebuilt in Python to align with the shift towards Python as the primary language for scientific computing.
TensorFlow's marketing power and high-quality engineering impressed the deep learning framework world, but it lacked a strong incentive structure for open source engagement.
PyTorch's success can be attributed to its community building and engagement, rather than just technical differences.
The transition from symbolic execution model to eager execution model was a key advantage for PyTorch over TensorFlow.
While PyTorch is becoming more popular overall, TensorFlow still holds weight in certain circles and segments of the deep learning community.

Execution Models in Deep Learning

08:55 - 17:27

Theano and TensorFlow use a symbolic execution model, while PyTorch and Torch use an eager execution model.
Symbolic execution models believe in the power of compilation for better performance, while eager execution models prioritize simplicity.
Deep learning practitioners have preferred eager execution because ML compilers were not ready and eager mode already saturated NVIDIA GPUs.
Compiled models can extract more performance from sophisticated compilers and faster accelerators like GPUs.
PyTorch 2.0 introduced a compiled mode for large-scale experiments, providing significant performance gains.
PyTorch is easier to use than TensorFlow's symbolic programming system, as it allows writing programs in Python without additional cognitive overhead.
JAX aims to simplify symbolic programming by letting users write NumPy-style programs and recovering the symbolic representation automatically.
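
The contrast between eager and symbolic execution described above can be sketched in a few lines of plain Python. This is a toy illustration only: the `Node`, `placeholder`, and `evaluate` names are hypothetical and do not correspond to the API of Theano, TensorFlow, PyTorch, or JAX. It shows that the eager version produces a value as soon as each line runs, while the symbolic version first builds a graph and only produces a value when that graph is evaluated with concrete inputs.

```python
# Toy illustration only: none of these names mirror a real framework's API.


# Eager style: each operation runs immediately and returns a concrete number.
def eager_model(x, w, b):
    h = x * w          # computed right now
    return h + b       # computed right now


# Symbolic style: operations build a graph; nothing runs until evaluate().
class Node:
    def __init__(self, op, inputs=(), name=None):
        self.op = op            # "input", "mul", or "add"
        self.inputs = inputs    # upstream Node objects
        self.name = name        # placeholder name, used by "input" nodes

    def __mul__(self, other):
        return Node("mul", (self, other))

    def __add__(self, other):
        return Node("add", (self, other))


def placeholder(name):
    """Create a named input node whose value is supplied later."""
    return Node("input", name=name)


def evaluate(node, feed):
    """Walk the graph recursively, substituting values from `feed`."""
    if node.op == "input":
        return feed[node.name]
    left, right = (evaluate(n, feed) for n in node.inputs)
    return left * right if node.op == "mul" else left + right


# Eager: call the function, get a value.
eager_result = eager_model(2.0, 3.0, 1.0)

# Symbolic: describe the computation first, then run it with concrete inputs.
x, w, b = placeholder("x"), placeholder("w"), placeholder("b")
graph = x * w + b
symbolic_result = evaluate(graph, {"x": 2.0, "w": 3.0, "b": 1.0})

print(eager_result, symbolic_result)
```

Because the symbolic version holds the whole graph before anything runs, a compiler has a chance to optimize it as a unit; the cost is the extra layer the user has to think through, which is the cognitive overhead the summary refers to.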

Compiled Mode and Performance Optimization

17:00 - 25:33

With torch.compile, users wrap their PyTorch program in a single torch.compile call, so they do not have to decide what to compile and what to leave uncompiled.
The biggest challenge in building a compiled mode was minimizing cognitive overhead for users and ensuring compatibility and error recovery.
The TorchDynamo system, released in December 2021, was a breakthrough in acquiring programs correctly and efficiently.
Calculating the gradient for arbitrary code is possible but may not always result in a smooth function due to discontinuous programming constructs.
Collaboration with hardware providers like NVIDIA and AMD is crucial for optimizing PyTorch performance on different platforms.
Metrics such as GitHub stars or speed benchmarks are not prioritized, but user feedback and engagement play a significant role in product iteration.
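
The point above about gradients of arbitrary code can be made concrete with a small sketch. It uses a central finite difference in plain Python rather than any framework's autograd, and the functions `relu_like`, `step_program`, and `numerical_grad` are made up for illustration. It shows that a branch in ordinary code can leave the gradient well defined on either side while making it meaningless at the point where the branch switches.

```python
def relu_like(x):
    # Continuous output, but the slope jumps from 0 to 1 at x = 0.
    return x if x > 0 else 0.0


def step_program(x):
    # An ordinary branch that makes the output itself jump at x = 1.
    if x < 1.0:
        return x * x
    return x * x + 5.0


def numerical_grad(f, x, eps=1e-6):
    """Central finite difference: an approximation, not autograd."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


# Away from the branch point the gradient behaves as expected.
slope_left = numerical_grad(relu_like, -0.5)      # about 0
slope_right = numerical_grad(relu_like, 0.5)      # about 1
smooth_side = numerical_grad(step_program, 0.5)   # about 2 * 0.5 = 1

# At the branch point the difference straddles the jump and blows up.
across_jump = numerical_grad(step_program, 1.0)

print(slope_left, slope_right, smooth_side, across_jump)
```

The estimate at the jump grows without bound as `eps` shrinks, which is one way to see why a gradient can be computed for such a program without the underlying function being smooth.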

Community Building and Engagement

25:07 - 33:35

Metrics are used as a sanity check and not to inform development or incentivize success.
Customer feedback is aggregated from various sources and subjectively weighted by importance.
Engineers on the PyTorch team have input in selecting features to work on.
Deeper strategy decisions are made based on changing industry trends, not user feedback.
PyTorch aims to collaborate with libraries like fastai and Lightning to avoid overlapping functionality.
The PyTorch team aims to empower the community and do as little of the work itself as possible.
Meta invests in PyTorch because it allows them to iterate faster on their own AI work.
Using the same tooling as others in the industry provides a timeline edge for large companies like Meta.
Becoming the standard was not a strategic goal for PyTorch, but rather building the best scientific computing framework.

Community Development and Open Source Culture

33:06 - 41:01

Building the best thing increases the chances of becoming the standard.
PyTorch was built by a community of people online who became friends over time.
Trusting and empowering individuals is important in community building.
Tough calls and disagreements are inevitable in open-source communities.
Shaping culture and incentive structures is crucial for community development.
Communities can become toxic if they don't proactively address problematic behavior.
Establishing what is not okay sets the tone for acceptable behavior in a community.
Intrinsic motivation is more effective than extrinsic rewards in open-source projects.

Motivation and Recognition in Open Source

40:34 - 48:47

Open source works well when the motivation is intrinsic
Don't rely on extrinsic motivation to build a community
Recognize good quality work and hard work consistently over time
Provide various kinds of recognition to help contributors
PyTorch has had breaking API changes, but the team cares about not breaking userland
Helped acquire Papers with Code and it worked out well
If rewriting PyTorch today, would use more Python and less C++, and take advantage of hardware and compiler advancements
Seeing a shift from traditional ML researchers/engineers to software developers using PyTorch without even realizing it
Thinking about how to react strategically to this changing dynamic
Worrying about the leverage dynamics and aligning with users' needs

Exploring Diverse Applications and Open Source Models

48:20 - 56:53

Hugging Face is a diversified platform that doesn't have a dominant vertical application.
Competition, especially friendly competition, is always helpful in pushing the field forward.
JAX is exploring a philosophical direction of functional deep learning.
GGML takes a full vertical integration approach to explore ideas like quantization and performance baselines.
tinygrad explores the idea of using a reduced instruction set for deep learning.
Open source models provide empowerment and transparency, enabling access for individuals who can't afford expensive software.
The decision to open source models depends on what people value more and their beliefs about the safety and benefits of open sourcing.

The Future of AI and Robotics

56:27 - 1:04:26

Open source benefits outweigh not open sourcing
Society has the ability to adapt to disruptive changes
Working on a household robotics project to automate chores
Progress needed in hardware robotics, building the brain, and human-robot interaction
Not enough incentive in academia and industry for home robotics
Teaching robots through gestures, actions, and words
Expecting automation of home robotics in about seven years
Underrated aspect of machine learning is merging symbolic expert systems with neural networks
Biggest challenge of making machine learning work is constructing the loss function
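
The remark above about constructing the loss function can be illustrated with a minimal sketch. The data values are made up, and the brute-force grid search stands in for a real optimizer; nothing here comes from the episode itself. It shows that two reasonable-looking losses, applied to the same data, pull the best constant prediction toward very different answers.

```python
import statistics

# Made-up observations with a single outlier.
data = [1.0, 2.0, 3.0, 4.0, 100.0]


def mse(pred):
    """Mean squared error of predicting the same value for every point."""
    return sum((y - pred) ** 2 for y in data) / len(data)


def mae(pred):
    """Mean absolute error of predicting the same value for every point."""
    return sum(abs(y - pred) for y in data) / len(data)


def argmin_on_grid(loss, lo=0.0, hi=100.0, steps=10001):
    """Brute-force search over evenly spaced candidate predictions."""
    candidates = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return min(candidates, key=loss)


best_mse = argmin_on_grid(mse)   # pulled toward the mean by the outlier
best_mae = argmin_on_grid(mae)   # stays near the median

print(best_mse, statistics.mean(data))
print(best_mae, statistics.median(data))
```

Squared error is minimized near the mean and absolute error near the median, so the choice of loss alone decides whether the outlier dominates the result. Which behavior is "desired" depends on the problem, and that is the judgment the loss function encodes.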

Challenges and Requests in AI Development

1:04:05 - 1:08:34

AI development involves constructing the loss function carefully for desired outcomes.
People often underestimate the complexity of AI and expect it to solve everything easily.
The guest expresses appreciation for Weights & Biases, a product that is seamless and easy to use.
The guest suggests a future request for a system that allows adjustment of all aspects of training in a seamless way.
The host mentions a recently launched product called Weave that might fulfill the guest's request.