#144 Intel CTO Steve Orrin on How Governments Can Navigate the Data & AI Revolution
The episode discusses the importance of data analytics and AI in government, challenges and opportunities in data management, real-time data management and privacy, applications and challenges in AI models, and data sharing and cooperation. It emphasizes the need for aligning data ambitions with mission, launching pilot projects, and investing in data architecture and services. The episode also highlights the challenges of data privacy, data governance, and sharing data across agencies. Practical applications of AI models in government are explored, along with the importance of choosing impactful missions and educating policy regulators. The episode concludes by discussing data sovereignty laws, GDPR, cooperation agreements, and the potential of large language models in government settings.
A data-driven government leverages data for mission effectiveness and citizen services.
Challenges in data management
Government agencies face challenges in data ingestion, creation, privacy, and regulation.
Real-time data management
Real-time data management requires a different approach and good quality data about the environment beforehand.
Privacy by design
Privacy by design involves building tools and architectures that incorporate privacy considerations from the start of the data life cycle.
Applications of AI models
Practical applications of large AI models in government include improving tax submissions, healthcare forms, infrastructure prediction analysis, and contract predictions.
Data sharing challenges
Data sovereignty laws create hurdles for information sharing and transfers.
- Data Analytics and AI in Government
- Challenges and Opportunities in Data Management
- Real-time Data Management and Privacy
- Applications and Challenges in AI Models
- Data Sharing and Cooperation
Data Analytics and AI in Government
00:00 - 13:29
- Playing with data analytics and data science tools doesn't require advanced PhDs.
- Data and AI are crucial for governments to provide value to stakeholders.
- Steve Orrin, Intel's Federal CTO, discusses challenges and opportunities in government data and AI transformation.
- Aligning data ambitions with mission is important for government agencies.
- Data privacy laws differ between the US, Europe, and China.
- Launching pilot projects is a recommended approach for government agencies.
- Intel Federal engages with the US government on technology adoption and requirements.
- A data-driven government leverages data for mission effectiveness and citizen services.
- Many organizations are drowning in data or don't know how to take full advantage of it.
- Some organizations are driving towards leveraging data to affect mission and improve citizen service.
Challenges and Opportunities in Data Management
00:00 - 19:40
- Data is recognized as a central component for the US government's mission.
- Public sector data volumes are larger than those of most organizations, which makes the challenge bigger.
- The relationship between an agency's mission and its data-driven ambitions varies across agencies.
- Data-driven agencies focus on missions where data is transformative.
- More agencies are learning from successful use cases and applying them in other areas.
- Advanced agencies have funding or a mission imperative that enables better use of data.
- Data can be an enabler for defense, intelligence, health, food safety, and more missions.
- Efficiencies gained from data applications on backend processes can impact the overall mission positively.
- Government agencies face challenges in data ingestion, creation, privacy, and regulation.
- Data wrangling, curation, management, and ingestion are crucial steps before AI implementation.
- The heavy lift in AI and machine learning is the data management and curation.
- Many organizations fail to transition cool AI projects from the lab to practice due to lack of data management infrastructure.
- Investment in data architecture and services is necessary for handling data ingestion at scale.
- Data sharing becomes challenging in government and regulated industries due to anonymization, classification, and sensitivities.
- Novel approaches are being explored for analytics across disconnected datasets.
- Data governance is becoming a key focus in public sector data architecture and management.
- Timeliness is crucial for agencies like FEMA during natural disasters. They need both pre- and post-disaster data for effective action.
- Advanced technology like drones can help gather real-time situational awareness data during disaster recovery missions.
- Managing structured pre-disaster data and categorizing less structured real-time data are essential for timely decision-making.
Real-time Data Management and Privacy
00:00 - 32:28
- Real-time data management requires a different approach and good quality data about the environment beforehand.
- Having access to multiple sensors with varying quality can enrich the data by comparing and detecting changes.
- Streaming in real-time data with different government controls is crucial for mission effectiveness.
- Combining human expertise with sensor data and labeling can aid in recovery efforts.
- Training models to recognize patterns at scale is important for future missions.
- Government agencies face challenges in data privacy due to different regulations and authorities within each agency.
- Data governance policies determine how data is collected, used, shared, and anonymized within agencies.
- Sharing data across agencies can be difficult due to privacy controls, but efforts are being made to enable information sharing while maintaining privacy.
- Privacy by design involves building tools and architectures that incorporate privacy considerations from the start of the data life cycle.
- Applying governance at the beginning of the process is a key challenge in implementing privacy by design.
- Privacy by design is a process that involves tagging data with policy controls and separating PII from other data.
- Tools should be built with privacy and security by design, including strong authentication and access control policies.
- Confidential computing and homomorphic encryption enable analytics on sensitive data without compromising privacy.
- Challenges arise when aggregating data from different sources with varying regulatory requirements.
- Data governance brokers can facilitate controlled data sharing between agencies without re-engineering systems.
- Key success factors for machine learning applications in the public sector include choosing an impactful mission, but not the single most important one the agency has.
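The privacy-by-design notes above (tagging data with policy controls and separating PII from other data) can be illustrated with a minimal Python sketch. It is not taken from the episode: the field names in `PII_FIELDS`, the tag names, and the helpers `split_pii` and `can_access` are all hypothetical, chosen only to show the idea of policy tags travelling with the data from the start of its life cycle.

```python
from dataclasses import dataclass, field

# Hypothetical field names treated as PII for this illustration only.
PII_FIELDS = {"name", "ssn", "email"}


@dataclass
class TaggedRecord:
    """A data payload plus the policy tags that travel with it."""
    data: dict
    policy_tags: set = field(default_factory=set)


def split_pii(record: dict, record_id: str):
    """Separate PII from other fields and tag each part with policy controls."""
    pii = {k: v for k, v in record.items() if k in PII_FIELDS}
    other = {k: v for k, v in record.items() if k not in PII_FIELDS}
    # Both parts keep the same record_id so an authorized user can rejoin them.
    pii_part = TaggedRecord({"record_id": record_id, **pii}, {"pii", "restricted"})
    other_part = TaggedRecord({"record_id": record_id, **other}, {"shareable"})
    return pii_part, other_part


def can_access(record: TaggedRecord, clearances: set) -> bool:
    """Allow access only if the caller holds every tag attached to the record."""
    return record.policy_tags <= clearances


# Made-up sample record, split at ingestion time.
raw = {"name": "Jane Doe", "ssn": "000-00-0000", "zip": "20500", "claim_amount": 1200}
pii_part, other_part = split_pii(raw, record_id="r-001")
```

In this sketch an analyst holding only the `shareable` clearance can read `other_part` but not `pii_part`, which is the kind of separation the notes describe. A real system would enforce these tags in the storage and access layers rather than in application code.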
Applications and Challenges in AI Models
00:00 - 48:55
- Choose an impactful mission that matters to people beyond AI scientists.
- Start with a medium level project, not the most important one, to avoid disrupting the agency or mission if things go wrong.
- Avoid analysis paralysis and aim for good enough accuracy rather than perfection.
- One example: in forestry, drones equipped with a recognition algorithm detect blight on trees, which improves coverage and shortens the time before detection.
- Government agencies are exploring ways to make sense of data from various sensors for better situational awareness.
- Smart cities can use data from cameras, traffic flow analysis, and telematics to optimize traffic management.
- Different countries have different standards and regulations around data privacy and collection, which affects the type of use cases that can be deployed globally.
- Domain-specific language models are more beneficial than general-purpose large language models for practical applications.
- Training a language model for a specific subdomain can reduce its overall size and achieve good results without high costs.
- Practical applications of large AI models in government include improving tax submissions, healthcare forms, infrastructure prediction analysis, and contract predictions.
- Policy regulators are working to understand the ethics and biases of large language models but face challenges due to the technology's rapid advancements.
- Data scientists and experts should educate policy regulators and lawmakers about the capabilities and limitations of technology to develop effective policies.
- Organizations are encouraged to choose important projects, curate their data, and experiment with AI and machine learning approaches in real-world environments.
- Iterative learning is crucial for data scientists to determine what works and what doesn't across different models and approaches.
- Playing with tools like PyTorch and Jupyter Notebooks does not require advanced degrees in data analytics or coding experience.
- Everyone can benefit from gaining more experience with these technologies.
Data Sharing and Cooperation
00:00 - 44:52
- Data sovereignty laws create hurdles for information sharing and transfers
- GDPR provides consumer rights around data protection
- Government services have strict controls on data residency and processing
- Cloud providers offer regions with governance controls to comply with regulations
- Cooperation agreements between governments enable data sharing in specific domains
- Healthcare and science collaboration are key areas for cooperation agreements
- Universities create safe environments for data sharing with strong access controls
- Sharing national defense data globally is a challenge but strong needs drive cooperation
- Traffic pattern analysis can be shared between organizations to improve efficiency
- Large language models like ChatGPT have various applications in government settings
- Generative models will change how citizen services are provided and improve prediction models