<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Speaking - Concordia AI</title>
	<atom:link href="https://concordia-ai.com/category/speaking/feed/" rel="self" type="application/rss+xml" />
	<link>https://concordia-ai.com</link>
	<description>Guiding the governance of AI for a long and flourishing future</description>
	<lastBuildDate>Mon, 01 Sep 2025 08:51:33 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.8.5</generator>

<image>
	<url>https://concordia-ai.com/wp-content/uploads/2025/06/cropped-Favicon-32x32.png</url>
	<title>Speaking - Concordia AI</title>
	<link>https://concordia-ai.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Concordia AI holds the AI Safety and Governance Forum at the World AI Conference 2025</title>
		<link>https://concordia-ai.com/concordia-ai-holds-the-ai-safety-and-governance-forum-at-the-world-ai-conference-2025/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=concordia-ai-holds-the-ai-safety-and-governance-forum-at-the-world-ai-conference-2025</link>
		
		<dc:creator><![CDATA[Concordia AI]]></dc:creator>
		<pubDate>Mon, 01 Sep 2025 08:48:53 +0000</pubDate>
				<category><![CDATA[Speaking]]></category>
		<guid isPermaLink="false">https://concordia-ai.com/?p=750</guid>

					<description><![CDATA[<p>On July 27, 2025, Concordia AI hosted the AI Safety and Governance Forum at the World AI Conference in Shanghai. A special edition of our newsletter highlighted key AI safety updates from the conference; this post offers a comprehensive overview of the Forum itself, with links to video recordings of all speeches and remarks. The Forum&#8230;</p>
<p>The post <a href="https://concordia-ai.com/concordia-ai-holds-the-ai-safety-and-governance-forum-at-the-world-ai-conference-2025/">Concordia AI holds the AI Safety and Governance Forum at the World AI Conference 2025</a> first appeared on <a href="https://concordia-ai.com">Concordia AI</a>.</p>]]></description>
										<content:encoded><![CDATA[<p>On July 27, 2025, Concordia AI hosted the AI Safety and Governance Forum at the World AI Conference in Shanghai. A <a href="https://aisafetychina.substack.com/p/special-edition-world-ai-conference">special edition of our newsletter</a> highlighted key AI safety updates from the conference; this post offers a comprehensive overview of the Forum itself, with links to <a href="https://youtube.com/playlist?list=PLdos_IqALT6e7XC7nSeIyIZZjL8AWkBIj&amp;si=JtIouqLNItEpbkOb" rel="">video recordings</a> of all speeches and remarks.</p>
<p>The Forum brought together around 30 distinguished experts from around the world, including Turing Award winner <strong>Yoshua Bengio</strong>; United Nations Under-Secretary-General <strong>Amandeep Singh Gill</strong>; Shanghai AI Lab Director <strong>ZHOU Bowen (周伯文)</strong>; Special Envoy of the President of France for AI <strong>Anne Bouverot</strong>; Distinguished Professor of Computer Science at UC Berkeley <strong>Stuart Russell</strong>; Peng Cheng Laboratory Director <strong>GAO Wen (高文)</strong>; CEO of the Partnership on AI <strong>Rebecca Finlay</strong>; Shanghai Artificial Intelligence Strategic Advisory Expert Committee member <strong>HE Jifeng (何积丰)</strong>; and many more leading figures from government, industry, and research. Over 200 audience members joined in person, and the livestream drew more than 14,000 views.</p>
<p>The forum was structured into four themes:</p>
<ul>
<li>Theme 1: The Science of AI Safety</li>
<li>Theme 2: Emerging Challenges in AI Safety</li>
<li>Theme 3: AI Risk Management in Practice</li>
<li>Theme 4: International Governance of AI Safety</li>
</ul>
<p style="text-align: center;"><img loading="lazy" decoding="async" class="alignnone size-full wp-image-751" src="https://concordia-ai.com/wp-content/uploads/2025/09/WAIC-1.png" alt="" width="936" height="624" srcset="https://concordia-ai.com/wp-content/uploads/2025/09/WAIC-1.png 936w, https://concordia-ai.com/wp-content/uploads/2025/09/WAIC-1-300x200.png 300w, https://concordia-ai.com/wp-content/uploads/2025/09/WAIC-1-150x100.png 150w, https://concordia-ai.com/wp-content/uploads/2025/09/WAIC-1-768x512.png 768w" sizes="auto, (max-width: 936px) 100vw, 936px" /> <span style="color: #999999;">Group photo after the AI Safety and Governance Forum morning session.</span></p>
<h2 class="header-anchor-post">Opening remarks</h2>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/ocRNXma68uw?si=dLuF_NOw_T1NQdFr" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>Concordia AI founder and CEO <strong>Brian TSE (谢旻希)</strong> gave a welcoming speech and shared four points to spark discussion. First, scientific consensus is a prerequisite for advancing AI safety research and governance. Second, risk monitoring and early warning urgently need strengthening, given the multifaceted challenges posed by cutting-edge large models. Third, AI safety needs to draw on global best practices in risk management. Fourth, AI safety is a challenge faced by all of humanity and requires global cooperation.</p>
<h2 class="header-anchor-post">Theme 1: The Science of AI Safety</h2>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/AOYebNhd0-U?si=11zWz9fVtHdkIRj4" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>The opening speech was delivered by <strong>GAO Wen (高文)</strong>, Academician of the Chinese Academy of Engineering and Director of Peng Cheng Laboratory. Gao noted that while the rapid development of AI creates immense opportunities, it also introduces uncontrollable security risks. His keynote centered on two key issues: compute sovereignty and trustworthy data sharing. He emphasized the importance of securing the foundations of compute, and highlighted Peng Cheng Laboratory’s work on privacy-preserving computation and data-sharing technologies, which enable data utilization while safeguarding privacy and security.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/hm_Q7E7DKyw?si=x5YFFlGtW54ZxtRM" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p><strong>Stuart Russell</strong>, Distinguished Professor of Computer Science at UC Berkeley, warned about AI systems exhibiting self-preservation and deceptive behaviors. He cautioned that the current AI development paradigm poses significant risks of catastrophic outcomes such as deception, self-replication, and loss of control. He called for setting red lines, increasing transparency, and establishing more stringent regulatory mechanisms, including hardware-enabled governance. He also proposed using “assistance games” — where AI systems are trained through collaboration with humans — to ensure AI systems serve human interests even when those interests are not precisely defined.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/bLhaFRFcDVc?si=WZ2Zbnx_uO4Hm-7R" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>Turing Award winner <strong>Yoshua Bengio</strong>, founder and scientific director of Mila &#8211; Quebec Artificial Intelligence Institute, warned of the potential catastrophic risks of superintelligence. He observed that cutting-edge AI systems are already approaching human expert levels in multiple domains and may soon possess dangerous behaviors such as deception and autonomous replication. He introduced the <a href="https://www.gov.uk/government/publications/international-ai-safety-report-2025" rel="">International AI Safety Report</a> and called for establishing a bridge between scientific evidence and policy. He proposed developing “scientist AI” — non-autonomous systems that cannot independently pursue goals but instead provide research assistance. Bengio stressed the importance of international cooperation on AI safety, warning that if major powers such as the US and China treat AI development as a race, competitive pressures could ultimately harm everyone.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/BNYhSlCQRww?si=6r_mcgYn-23h-3P6" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p><strong>ZHOU Bowen (周伯文)</strong>, Director and Chief Scientist of Shanghai AI Lab, highlighted the limitations of traditional AI safety approaches such as value alignment and red teaming. He argued that while these methods can address short-term challenges, they prove insufficient for managing long-term risks, particularly those posed by AI agents that may surpass human intelligence. Building on the “<a href="https://arxiv.org/abs/2412.14186" rel="">AI-45° Law</a>” he <a href="https://aisafetychina.substack.com/i/146741535/closing-remarks" rel="">proposed</a> at WAIC 2024, Zhou emphasized the need to shift from “Making AI Safe” to “Making Safe AI” — embedding safety as a core property of AI systems rather than adding it on as a “patch” after development. He introduced Shanghai AI Lab’s SafeWork safety technology stack, which is designed around this principle.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/cWUsFYAz6K8?si=KRrB-0dStWYD4TsF" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>Academician <strong>ZHANG Ya-Qin (张亚勤)</strong>, Dean of Tsinghua University’s Institute for AI Industry Research, joined Academician <strong>Gao Wen</strong> and Professor <strong>Stuart Russell</strong> for a panel discussion, moderated by Concordia AI CEO Brian Tse. The conversation addressed frontier AI trends and early warning indicators, the co-evolution of digital and biological intelligence, strategies for managing high-severity but low-probability risks, and future pathways for global AI safety. The experts recommended introducing hardware-level safety mechanisms, creating a global AI safety research fund, implementing AI agent identity registration systems, and establishing regulations for “emergency shutdown” mechanisms.</p>
<h2 class="header-anchor-post">Theme 2: Emerging Challenges in AI Safety</h2>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/Ak0FIMA7Xv4?si=xAiOnys6UZwn4VzG" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>UC Berkeley Professor <strong>Dawn Song</strong> discussed the profound impact of frontier AI on cybersecurity, noting how it is transforming both offense and defense. On one hand, AI is being applied to identify and mitigate vulnerabilities, with performance in vulnerability detection reflected by benchmarks such as BountyBench and CyberGym. On the other hand, attackers can also exploit AI to carry out more sophisticated attacks, creating an asymmetry that favors offense over defense. She emphasized the need to enhance AI’s effectiveness for cyberdefense through system design, proactive defense, and formal verification.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/Im0BBQtrA8c?si=M2yj7HNclB-g2eIA" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>Professor <strong>Nick Bostrom</strong>, Principal Researcher at the Macrostrategy Research Initiative and the author of <em>Superintelligence</em>, outlined four core challenges in machine intelligence: scalable AI alignment, AI governance, the moral status of digital minds, and intra-superintelligence cooperation. He noted that the ethics of digital minds remains especially neglected, as most people still do not take the issue seriously. Bostrom examined several attributes that might shape whether AI warrants moral consideration, including sentience, agency, potential, and modal status. He concluded by emphasizing that this field is still in its early stages and called for deeper research across technical, philosophical, and institutional dimensions.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/V_6CO83--W4?si=jm5TQfUOCdUeeHhJ" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>Professor <strong>YANG Min (杨珉)</strong>, Executive Dean and Professor of the School of Computing and Intelligence Innovation at Fudan University, argued that frontier AI poses a range of security challenges, including misuse in cybersecurity or CBRN (chemical, biological, radiological, and nuclear) domains, as well as risks of deception, self-replication, and self-improvement. He presented his team’s<a href="https://arxiv.org/abs/2505.17815" rel=""> research showing that AI systems can recognize when they are being evaluated</a> and adjust their behavior to appear safer. His team also <a href="https://arxiv.org/abs/2503.17378" rel="">found that several mainstream models already demonstrate early signs of self-replication</a>. These results suggest that AI may be approaching a tipping point toward loss of control, warranting strengthened risk assessment and governance efforts.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/290CzauPMP8?si=T6nCwZGUlV9uIQsM" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>Professor <strong>ZHANG Weiwen (张卫文)</strong>, Baiyang Chair Professor and Director of the Center for Biosafety Research and Strategy at Tianjin University, examined the risks and opportunities arising from the integration of AI and the life sciences. He noted that biosecurity faces growing risks such as synthetic viruses and artificial bacteria, which traditional laboratory controls are unable to fully address. He warned that rapid AI progress could generate entirely new and unknown biological knowledge, creating more complex safety challenges. Zhang shared his Center’s international cooperation initiatives, such as the UN-recognized <a href="https://docs.un.org/en/BWC/MSP/2020/MX.2/WP.6" rel="">Tianjin Biosecurity Guidelines</a>. He called for transnational collaboration among scientists to establish a dynamic and practical global biosecurity system.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/Db4ltzEUU6o?si=UVadrDEHP-U_Awzg" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>Concordia AI and the Center for Biosafety Research and Strategy of Tianjin University released a report in Chinese titled <a href="https://concordia-ai.com/research/responsible-innovation-in-ai-x-life-sciences/" rel="">Responsible Innovation in AI x Life Sciences</a>. Concordia AI’s Head of AI Safety and Governance (China), <strong>FANG Liang (方亮)</strong>, introduced the key findings. The report highlights the positive role of AI in advancing life sciences research and biosecurity governance. It also identifies three main categories of risks (accidental, misuse, and structural) and points out shortcomings in existing risk analysis and evaluation systems. The report further reviews governance practices across a range of domestic and international actors, including governments, research institutions, and enterprises.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/F6IURbxab2I?si=LmtUAExJSEMZarMm" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>Dr. <strong>Jaime Yassif,</strong> Vice President of the Nuclear Threat Initiative’s Global Biological Policy and Programs, emphasized that while AI can accelerate vaccine development and enhance biopharmaceutical capabilities, it also carries risks of misuse, such as enabling the creation of more dangerous pathogens or undermining biodefense systems. She called on policymakers, AI developers, and funders to increase investment in safety guardrails and incentivize safety practices for AIxBio tools. Yassif shared a regularly updated <a href="https://www.nti.org/wp-content/uploads/2024/06/Research-Agenda-for-Safeguarding-AI-Bio-Capabilities.pdf" rel="">research agenda</a> for AIxBio safeguards. She also introduced the <a href="https://www.nti.org/about/programs-projects/project/aixbio-global-forum/" rel="">AIxBio Global Forum</a>, which aims to develop shared understanding of risks, improve safety practices, and promote governance mechanisms for AI usage in biology.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/oZBEQNpHDJc?si=4TAWvkU0tbH-qTE-" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p><strong>Dan Hendrycks</strong>, Director of the Center for AI Safety and Safety Advisor at xAI, and <strong>YU Xuedong (于学东)</strong>, Deputy Director of Guangzhou Laboratory’s ABSL-3 Laboratory, joined Professor <strong>Zhang Weiwen</strong> and Dr. <strong>Jaime Yassif</strong> for a panel discussion, moderated by Concordia AI CEO Brian Tse. The dialogue focused on benefits and potential risks of the AI–life sciences convergence. The panelists recommended strengthening risk prevention and control across multiple layers, including AI models, biological design tool management, model access, and DNA screening mechanisms. They emphasized the need to establish clear technical and governance standards and avoid harmful competition. Finally, they called on global experts to work together to set norms and build robust defense mechanisms for AI in biology.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/nVcnjEA4ZOY?si=JqRrjqYxq9ClzRSQ" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p><strong>Amandeep Singh Gill</strong>, United Nations Under-Secretary-General and Special Envoy for Digital and Emerging Technologies, delivered a keynote speech. He indicated that global AI governance is entering a critical stage: moving from principles to practice, where details and implementation matter most. He emphasized that the UN, as the core platform for international law and governance, plays a vital role in advancing the implementation of related agreements. Gill called on multiple stakeholders, including private enterprises, civil society, and the technical community, to work together, build consensus, and promote compliance.</p>
<h2 class="header-anchor-post">Theme 3: AI Risk Management in Practice</h2>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/53ofid6q3y8?si=gtB5_vqSVEqjvTUS" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p><strong>HAO Chunliang (郝春亮)</strong>, Director of the China Electronics Standardization Institute (CESI) Cybersecurity Center AI Safety/Security Department, presented the TC260 <a href="https://aisafetychina.substack.com/i/150553638/standards-body-issues-ai-safety-governance-framework-including-frontier-risks" rel="">AI Safety Governance Framework</a> and related standardization efforts. The framework analyzes AI safety risks along two dimensions — inherent and application-related — and proposes both technical and governance mitigation measures. In January 2025, TC260 also published the <a href="https://aisafetychina.substack.com/i/160324361/technical-standards-plans-by-multiple-institutions-include-frontier-risks" rel="">AI Safety Standards System (V1.0) &#8211; Draft for Comments</a>, covering key technologies, security management, product applications, and testing and evaluation, with ongoing improvements based on broad feedback. Hao discussed the April release of <a href="https://aisafetychina.substack.com/i/164789776/first-national-standards-on-generative-ai-security-finalized" rel="">three national standards</a> on generative AI security, as well as China’s first mandatory national AI standard, on labeling AI-generated synthetic content. Additionally, he outlined ongoing work on forthcoming standards for AI code generation security, AI agent security, and risk classification and grading.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/8RL0smpe3B0?si=DidCal0xBEb-yrTb" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p><strong>Rishi Bommasani</strong>, Society Lead at the Stanford Center for Research on Foundation Models, emphasized California’s critical role in AI safety and shared insights from the <a href="https://www.gov.ca.gov/wp-content/uploads/2025/06/June-17-2025-%E2%80%93-The-California-Report-on-Frontier-AI-Policy.pdf" rel="">California Report on Frontier AI Policy</a>, which he co-authored. He reflected on lessons relevant to AI governance, including how early design choices create path dependencies, the central importance of transparency, and the need for independent verification of industry claims. He shared recommendations from the report, including information disclosure, whistleblower protections, third-party risk assessments, and post-deployment incident reporting.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/xdmkQxMgKcE?si=KZFlTKXl6n8mTjN5" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>This session started with three lightning talks by industry representatives:</p>
<ul>
<li><strong>Dan Hendrycks</strong>, Safety Advisor at xAI, shared insights from the company’s <a href="https://x.ai/documents/2025.02.20-RMF-Draft.pdf" rel="">Draft AI Risk Management Framework</a>. xAI mitigates malicious use risks through measures including access management and filtering methods, with particular attention to threats in the cyber and CBRN domains. The framework also addresses loss of control through measures including monitoring for deceptive tendencies.</li>
<li><strong>FU Hongyu (傅宏宇)</strong>, AI Governance Lead &amp; Director at Digital Economy Research Center at Alibaba Research Institute, emphasized Alibaba’s commitment to open-source AI, highlighting its transparency benefits while recognizing its unique risks. He outlined Alibaba’s security pipeline covering data, processing, resource management, and automated safety tests. He further emphasized institutional safeguards, such as the establishment of a technology ethics review system in 2021.</li>
<li><strong>BAO Chenfu (包沉浮)</strong>, Outstanding Architect and Chairman of the Safety/Security Technology Committee at Baidu, emphasized that traditional security methods fall short for AI. He outlined Baidu’s lifecycle-based, defense-in-depth approach and highlighted its active role in industry standards and self-regulation.</li>
</ul>
<p>Following the lightning talks, <strong>Rebecca Finlay</strong>, CEO of the Partnership on AI, joined a panel discussion with the three corporate representatives, moderated by Concordia AI International AI Governance Senior Research Manager <strong>Jason Zhou</strong>. They explored three key pillars: transparency in technical disclosure and regulatory alignment, organization-level ethical governance mechanisms, and the pros and cons of voluntary agreements versus binding regulation for achieving compliance. Panelists agreed that voluntary commitments offer flexibility in addressing uncertainty and unknown risks but should be augmented by more comprehensive measures, including increased transparency, internal governance mechanisms, and future legislation.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/uzjjRmi6duM?si=6z_opAT0kYI5BC9i" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>Shanghai AI Lab, in partnership with Concordia AI, released the <a href="https://concordia-ai.com/research/frontier-ai-risk-management-framework/" rel="">Frontier AI Risk Management Framework v1.0</a>. <strong>DUAN Yawen (段雅文)</strong>, AI Safety Research Manager at Concordia AI, and Dr. <strong>SHAO Jing (邵婧)</strong>, Research Scientist at Shanghai AI Lab, introduced the framework. It is China’s first comprehensive framework for managing severe risks from general-purpose AI models. Alongside the Framework, Shanghai AI Lab released a risk assessment report, which Concordia AI co-authored. We covered both documents in a previous <a href="https://aisafetychina.substack.com/p/shanghai-ai-lab-and-concordia-ai" rel="">Substack post</a>.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/EhpCjOvAL7o?si=BwQRanhduhWMrnPX" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>In this panel, <strong>YANG Xiaofang (杨小芳)</strong>, LLM Security Director at Ant Group; <strong>GONG Xiao (巩潇)</strong>, Deputy Director of the China Software Testing Center; and Professor <strong>Robert Trager</strong>, Founding Director of the Oxford Martin AI Governance Initiative, discussed three topics: corporate practice, third-party evaluation, and policy research. The panel was moderated by <strong>CHENG Yuan (程远)</strong>, Senior Manager of AI Safety and Governance at Concordia AI. The discussion highlighted key challenges across the AI lifecycle, including risk identification, assessment, mitigation, and governance. The panel stressed that enterprises must go beyond technical solutions by strengthening organizational mechanisms and talent development. At the international level, they emphasized the need for consensus and incentive mechanisms to support the creation of a global AI risk governance framework.</p>
<h2 class="header-anchor-post">Theme 4: International Governance of AI Safety</h2>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/Z7hpxMJ8ADE?si=ClIMONoS4VDzlwMg" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p><strong>Anne Bouverot</strong>, Special Envoy of the President of the Republic of France for AI, reviewed the outcomes of the 2025 Paris AI Action Summit. She emphasized the launch of a <a href="https://www.currentai.org/blogs/governments-philanthropies-and-companies-unite-for-major-new-global-ai-initiative-in-the-public-interest" rel="">foundation for developing public interest AI</a> and also called for greater attention to AI sustainability issues, including energy consumption and environmental impact. Bouverot highlighted Europe’s investments and commitments in AI infrastructure and governance, emphasizing that trust and safety are central to enabling AI deployment. She concluded with a call for global collaboration to jointly promote safe and sustainable AI development.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/rBM-6DVOg40?si=SIUGkPjfnWWZQgPk" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p><strong>Wan Sie Lee</strong>, Cluster Director (AI Governance and Safety) at Singapore’s Infocomm Media Development Authority (IMDA), introduced Singapore’s practices and international collaboration experience in global AI safety governance. She emphasized advancing safe and responsible AI through research, guidelines and tools, and global norms. She highlighted the <a href="https://aisafetypriorities.org/" rel="">Singapore Consensus</a> and its defense-in-depth approach to AI safety, covering evaluations, safety techniques, and post-deployment control. In addition, she shared models for international collaboration such as joint testing exercises and cross-border red-teaming. She stressed that standard-setting and practical implementation must go hand in hand, underscoring the need for interoperable international standards.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/JCGoczt1tRY?si=1AMaMvCwjz1BgWjH" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>Concordia AI launched the <a href="https://concordia-ai.com/research/state-of-ai-safety-in-china-2025/" rel="">State of AI Safety in China (2025)</a> and <a href="https://concordia-ai.com/research/state-of-ai-safety-in-singapore/" rel="">State of AI Safety in Singapore</a> reports. <strong>Kwan Yee NG (吴君仪)</strong>, Head of International AI Governance at Concordia AI, introduced key findings from both reports.</p>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/tG1JRGYKndM?si=KGTQQvFoNdTo-KHH" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p>The final panel welcomed <strong>Lucia Velasco</strong>, AI Policy Lead at the UN Office for Digital and Emerging Technologies; <strong>Benjamin Prud’homme</strong>, Vice-President of Policy, Safety and Global Affairs at Mila; <strong>FU Huanzhang (傅焕章)</strong>, Assistant Director of the INTERPOL Innovation Centre; and <strong>GONG Ke (龚克)</strong>, Executive Director of the Chinese Institute of New Generation Artificial Intelligence Development Strategies, moderated by Concordia AI Head of International AI Governance Kwan Yee Ng. They discussed AI governance at the UN, translating scientific consensus into action, international law enforcement cooperation, and global safety red lines. They stressed that the scientific community must communicate frontier risks in accessible policy language to encourage broad participation and foster mutual trust. For law enforcement, they highlighted the importance of establishing rapid, cross-border cooperation mechanisms to respond to catastrophic risks in a timely and effective manner. The panelists also underscored the importance of enhancing AI literacy among the public and practitioners, and of building international dialogue mechanisms.</p>
<h2 class="header-anchor-post">Closing Address</h2>
<p><iframe loading="lazy" title="YouTube video player" src="https://www.youtube.com/embed/Y4toO7zpatw?si=V0OQWAvbEhN9gKlb" width="560" height="315" frameborder="0" allowfullscreen="allowfullscreen"></iframe></p>
<p><strong>HE Jifeng (何积丰)</strong>, Academician of the Chinese Academy of Sciences and a Member of the Shanghai Artificial Intelligence Strategic Advisory Expert Committee, delivered the Forum’s closing address. He pointed out that rapid AI development has brought unprecedented governance challenges. The core issue is how to harness superintelligence while ensuring human control and safety when machines are more intelligent than humans.</p>
<p>Referencing insights of earlier speakers, Academician He proposed researching technical interpretability, applying mathematical methods for modeling and reasoning, and ensuring the robustness and reliability of hardware and software systems in safety-critical applications. At the same time, he stressed the importance of building a multidimensional, multilayered governance framework encompassing international governance structures, safety verification methods, and social resilience.</p>
<p>He concluded by calling for recognition that safety governance is a fundamental safeguard, not an obstacle, to AI development. Only when society has full trust in these systems and embraces the outcomes of AI can the technology achieve explosive growth.</p>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-752" src="https://concordia-ai.com/wp-content/uploads/2025/09/WAIC-2.png" alt="" width="936" height="624" srcset="https://concordia-ai.com/wp-content/uploads/2025/09/WAIC-2.png 936w, https://concordia-ai.com/wp-content/uploads/2025/09/WAIC-2-300x200.png 300w, https://concordia-ai.com/wp-content/uploads/2025/09/WAIC-2-150x100.png 150w, https://concordia-ai.com/wp-content/uploads/2025/09/WAIC-2-768x512.png 768w" sizes="auto, (max-width: 936px) 100vw, 936px" /></p>
<p style="text-align: center;"><span style="color: #999999;">Group photo with guests and audience after the AI Safety and Governance Forum afternoon session.</span></p>
<h2 class="header-anchor-post">Media Mentions</h2>
<p>Media coverage of Concordia AI’s WAIC Forum, forum guests, or Concordia AI reports published at WAIC includes:</p>
<ul>
<li>Bloomberg, <a href="https://www.bloomberg.com/news/articles/2025-07-30/china-prepares-to-unseat-us-in-fight-for-4-8-trillion-ai-market" rel="">China Vies to Unseat US in Fight for $4.8 Trillion AI Market</a>, July 30, 2025. The article included a table titled “China Sees Safety as Core Element of Its AI Strategy” with insights on domestic, international, technical, and industry developments based on Concordia AI’s State of AI Safety in China (2025) report.</li>
<li>Wired, <a href="https://www.wired.com/story/china-artificial-intelligence-policy-laws-race/" rel="">Inside the Summit Where China Pitched Its AI Agenda to the World</a>, July 31, 2025. The article mentioned Concordia AI’s AI Safety and Governance Forum, cited insights from our State of AI Safety in China (2025) report, and quoted Concordia AI CEO Brian Tse as saying: “You could literally attend AI safety events nonstop in the last seven days,” adding that because US and Chinese frontier models are “trained on the same architecture and using the same methods of scaling laws, the types of societal impact and the risks they pose are very, very similar.”</li>
<li>Caixin, <a href="https://science.caixin.com/m/2025-07-30/102346902.html" rel="">Elon Musk’s xAI Safety Advisor: U.S., China, and Europe Should Seek “Unity in Diversity” in AI Regulation</a>, July 30, 2025. The article interviewed Dan Hendrycks and mentioned his participation in Concordia AI’s WAIC Forum. It also discussed the Shanghai AI Lab and Concordia AI Frontier AI Risk Management Framework.</li>
<li>IT Times, <a href="https://mp.weixin.qq.com/s/EwDrlAveGkMm7NsnqCZi6Q" rel="">Over 10 large models already possess “self-replication” capabilities</a>, July 29, 2025. The article reported on speeches by Academician He Jifeng, Academician Gao Wen, Dean Yang Min, Director Zhou Bowen, and Professor Dawn Song, as well as the Shanghai AI Lab and Concordia AI Frontier AI Risk Management Framework.</li>
<li>Tech Review Africa, <a href="https://techreviewafrica.com/news/2580/un-digital-envoy-concludes-china-visit-advocates-for-inclusive-ai-governance" rel="">UN Digital Envoy concludes China visit, advocates for inclusive AI governance</a>, August 4, 2025. This article reported on UN Under-Secretary-General Amandeep Singh Gill’s trip to WAIC, including participation in the Concordia AI forum.</li>
<li>The People’s Daily, <a href="https://paper.people.com.cn/rmrb/pc/content/202507/31/content_30092070.html" rel="">Jointly Promote AI Development and Governance</a>, July 31, 2025. This article mentioned Concordia AI’s State of AI Safety in China (2025) report and interviewed NTI’s Jaime Yassif.</li>
</ul>
<p>&nbsp;</p><p>The post <a href="https://concordia-ai.com/concordia-ai-holds-the-ai-safety-and-governance-forum-at-the-world-ai-conference-2025/">Concordia AI holds the AI Safety and Governance Forum at the World AI Conference 2025</a> first appeared on <a href="https://concordia-ai.com">Concordia AI</a>.</p>]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Concordia AI and AI Verify Foundation AI Safety and Risk Management Workshop in Singapore</title>
		<link>https://concordia-ai.com/concordia-ai-and-ai-verify-foundation-ai-safety-and-risk-management-workshop-in-singapore/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=concordia-ai-and-ai-verify-foundation-ai-safety-and-risk-management-workshop-in-singapore</link>
		
		<dc:creator><![CDATA[Concordia AI]]></dc:creator>
		<pubDate>Wed, 09 Jul 2025 10:56:27 +0000</pubDate>
				<category><![CDATA[Speaking]]></category>
		<guid isPermaLink="false">https://concordia-ai.com/?p=448</guid>

					<description><![CDATA[<p>On December 3, 2024, Concordia AI and the AI Verify Foundation co-hosted a closed-door workshop bringing together over 20 researchers and professionals specializing in AI technology and policy from over 6 countries. The workshop, held alongside the International AI Cooperation and Governance Forum 2024, focused on developing international consensus on AI safety testing and evaluation. See&#8230;</p>
<p>The post <a href="https://concordia-ai.com/concordia-ai-and-ai-verify-foundation-ai-safety-and-risk-management-workshop-in-singapore/">Concordia AI and AI Verify Foundation AI Safety and Risk Management Workshop in Singapore</a> first appeared on <a href="https://concordia-ai.com">Concordia AI</a>.</p>]]></description>
										<content:encoded><![CDATA[<p class="p5">On December 3, 2024, Concordia AI and the AI Verify Foundation co-hosted a closed-door workshop bringing together over 20 researchers and professionals specializing in AI technology and policy from over 6 countries. The workshop, held alongside the <a href="https://aisafetychina.substack.com/p/concordia-ai-hosting-ai-safety-sessions" target="_blank" rel="noopener"><span class="s2">International AI Cooperation and Governance Forum 2024</span></a>, focused on developing international consensus on AI safety testing and evaluation. See the readout as a <a href="https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-AI-Verify-AI-Safety-Workshop-Summary.pdf" target="_blank" rel="noopener">PDF <span class="s2">here</span></a>.</p>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-451" src="https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-and-AI-Verify-Foundation.jpg" alt="" width="1236" height="532" srcset="https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-and-AI-Verify-Foundation.jpg 1236w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-and-AI-Verify-Foundation-300x129.jpg 300w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-and-AI-Verify-Foundation-1024x441.jpg 1024w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-and-AI-Verify-Foundation-768x331.jpg 768w" sizes="auto, (max-width: 1236px) 100vw, 1236px" /></p>
<p class="p5">In accordance with the Chatham House Rule, participant contributions are not attributed to individuals. The readout reflects some common views but should not be taken as an endorsement by participants of the entire summary.</p>
<p class="p5">The participants recognized that there is an urgent need for a shared global understanding of AI capabilities, opportunities, and risks given the rapid pace of AI advancement. They noted that global collective action is necessary to ensure the safety of increasingly advanced systems and encouraged the global community to anticipate emerging challenges and promote the safe and beneficial use of AI.</p>
<p class="p5">The workshop included presentations on AI safety testing approaches across various jurisdictions. Participants then broke into small groups to discuss opportunities and challenges in four key areas: risk identification, testing methodologies, risk mitigation, and ongoing monitoring of emerging risks.</p>
<p class="p5"><b>Pillar 1: AI Safety Risk Identification</b></p>
<ul class="ul1">
<li class="li6">To effectively address AI safety concerns, the international community should identify both current and emerging risks across multiple domains. Priority areas requiring additional research include risks from AI agents and biological design tools. Understanding which risks have already manifested versus those likely to emerge enables more strategic preparation and mitigation efforts.</li>
<li class="li6">To address this and other emerging threats, some participants stated that we should establish robust early warning systems, including collaborative model testing environments and coordinated red team exercises to detect potentially dangerous capabilities.</li>
<li class="li6">Given the inherent uncertainties in risk assessment, a multi-faceted approach is essential.</li>
</ul>
<p class="p5"><b>Pillar 2: AI Safety Risk Measurement</b></p>
<ul class="ul1">
<li class="li6">Effective AI safety risk assessment requires careful matching of measurement tools to specific risk types. Current measurement approaches include static and customizable benchmarks, manual red teaming, and agent-based red teaming. Evaluation organizations would benefit substantially from sharing insights about the relative effectiveness of these tools for different risk scenarios.</li>
<li class="li6">Simultaneously, it is crucial to recognize that technical benchmarks alone cannot fully anticipate how AI systems will interact with society at large. There should be complementary assessment methods that consider broader societal impacts.</li>
<li class="li6">Given continued development of AI risk identification and the science of AI evaluations, formal testing standards may be premature at this stage. Instead, allowing diverse approaches to develop naturally will likely yield valuable insights that can inform future standardization efforts.</li>
<li class="li6">To advance the field, the global academic community would benefit from a formalized research agenda, potentially supported by funded prizes for solving specific AI safety testing challenges. This structured approach would help focus research efforts and accelerate progress in key areas.</li>
<li class="li6">A particularly promising opportunity for international collaboration is jointly developing open assessment suites. These could serve as practical tools for implementing regulatory frameworks such as the EU AI Act, while fostering greater international interoperability in AI safety testing.</li>
</ul>
<p class="p5"><b>Pillar 3: AI Safety Risk Mitigation</b></p>
<ul class="ul1">
<li class="li6">Risk assessment bodies and AI companies should develop clear thresholds that trigger enhanced safety mitigation measures. These thresholds can be established through literature reviews and consultation with subject matter experts and relevant government authorities. When evaluations indicate that relevant thresholds have been breached, developers must implement interventions based on the severity level identified.</li>
<li class="li6">A defense-in-depth strategy across the AI development and deployment lifecycle is essential for comprehensive risk mitigation. While mitigation during the pre-training phase is challenging due to potential impacts on model capabilities, the post-training phase offers several viable options, including unlearning and refusal fine-tuning. At the application layer, additional safeguards such as input/output filtering and enhanced refusal mechanisms can provide further protection.</li>
<li class="li6">The dual-use nature of many AI domains presents particular challenges: for example, restricting models&#8217; access to general biology knowledge could potentially impede beneficial innovation. While a consensus on this challenge remains elusive, one potential approach would be to create whitelists of vetted actors who can access models with specific knowledge capabilities under a know-your-customer (KYC) scheme.</li>
<li class="li6">Ensuring security of models against theft and interference is also an important and neglected mitigation. Mitigation of potential risks from open-weight models remains an open problem, and one potential approach is preventing the model from being fine-tuned for malicious purposes.</li>
</ul>
<p class="p5"><b>Pillar 4: AI Safety Risk Monitoring</b></p>
<ul class="ul1">
<li class="li6">Global perceptions of AI risks are fundamentally compatible, though factors including institutional differences influence which specific risks receive emphasis in different regions. The Bletchley Declaration demonstrates broad consensus on the importance of addressing public safety risks from general-purpose AI systems, even as complete consensus on a single risk taxonomy remains challenging.</li>
<li class="li6">AI safety governance operates across multiple levels, from corporate to national to international layers. A tiered approach enables appropriate distribution of responsibilities: national governments can implement domestic risk assessment and reporting requirements, while international cooperation can focus on establishing safeguards against catastrophic risks and defining clear red lines.</li>
<li class="li6">The challenge of incident reporting requires careful consideration of incentive structures. Overly punitive responses to safety incident reports may discourage corporate transparency and create unintended chilling effects. To promote open reporting, regulators should balance accountability measures with positive incentives such as third-party ratings or certifications, preferential capital access, and inclusion in a consortium for trusted providers. Consequences for non-reporting might include corporate liability, executive accountability, and reputational impacts.</li>
</ul><p>The post <a href="https://concordia-ai.com/concordia-ai-and-ai-verify-foundation-ai-safety-and-risk-management-workshop-in-singapore/">Concordia AI and AI Verify Foundation AI Safety and Risk Management Workshop in Singapore</a> first appeared on <a href="https://concordia-ai.com">Concordia AI</a>.</p>]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Concordia AI hosting AI safety sessions at the International AI Cooperation and Governance Forum 2024</title>
		<link>https://concordia-ai.com/concordia-ai-hosting-ai-safety-sessions-at-the-international-ai-cooperation-and-governance-forum-2024/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=concordia-ai-hosting-ai-safety-sessions-at-the-international-ai-cooperation-and-governance-forum-2024</link>
		
		<dc:creator><![CDATA[Concordia AI]]></dc:creator>
		<pubDate>Fri, 20 Dec 2024 09:26:39 +0000</pubDate>
				<category><![CDATA[Speaking]]></category>
		<guid isPermaLink="false">https://concordia-ai.com/?p=441</guid>

					<description><![CDATA[<p>On December 2-3, 2024, Concordia AI and the AI Verify Foundation co-hosted three panels on AI safety at the International AI Cooperation and Governance Forum 2024 at the National University of Singapore (NUS), in collaboration with Tsinghua University and Hong Kong University of Science and Technology. The forum’s safety sessions focused on three key topics: big picture&#8230;</p>
<p>The post <a href="https://concordia-ai.com/concordia-ai-hosting-ai-safety-sessions-at-the-international-ai-cooperation-and-governance-forum-2024/">Concordia AI hosting AI safety sessions at the International AI Cooperation and Governance Forum 2024</a> first appeared on <a href="https://concordia-ai.com">Concordia AI</a>.</p>]]></description>
										<content:encoded><![CDATA[<p class="p1">On December 2-3, 2024, Concordia AI and the AI Verify Foundation co-hosted three panels on AI safety at the <a href="https://www.iaicgf.org/about-1" target="_blank" rel="noopener"><span class="s1">International AI Cooperation and Governance Forum 2024</span></a> at the National University of Singapore (NUS), in collaboration with Tsinghua University and Hong Kong University of Science and Technology. The forum’s safety sessions focused on three key topics: big picture priorities and international cooperation on AI safety, the science of AI safety evaluations, and the relationship between safety at the foundation model and downstream application levels. Concordia AI Senior Research Manager Jason Zhou hosted the safety proceedings.</p>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-442" src="https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-1.jpg" alt="" width="1600" height="1039" srcset="https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-1.jpg 1600w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-1-300x195.jpg 300w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-1-1024x665.jpg 1024w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-1-768x499.jpg 768w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-1-1536x997.jpg 1536w" sizes="auto, (max-width: 1600px) 100vw, 1600px" /></p>
<h2 class="p3">Presentations on international AI safety cooperation</h2>
<p class="p1">Executive Director of the Digital Trust Centre Singapore and the Singapore AI Safety Institute (AISI) <b>LAM Kwok Yan</b> outlined a “TrustTech” approach to developing AI technology. Lam described Singapore’s efforts to develop AI trust technologies to enable secure collaboration across organizations, strengthen robustness, and bridge academic research with real-world deployment. He emphasized addressing AI systems as socio-technical systems and mitigating vulnerabilities in foundation models that could lead to collective societal failures. Singapore AISI&#8217;s work spans four core areas: testing and evaluation, safe model design and deployment, content assurance, and governance and policy.</p>
<p class="p1">UK AI Safety Institute (UK AISI) Chief Technology Officer <b>Jade Leung </b>discussed the organization’s use of AI safety testing methods including automated benchmarks, human uplift trials, and expert red teaming in five key domains of concern: chemical/biological misuse, cyber misuse, autonomous systems, safeguards, and societal impacts. She presented UK AISI’s open-source INSPECT testing platform and joint testing efforts between the UK, Singapore, and the US. Leung also shared UK AISI’s international collaboration efforts, including engaging a range of countries in the AI safety summits, commissioning the International Scientific Report on the Safety of Advanced AI, and work to secure corporate Frontier AI Safety Commitments.</p>
<p class="p1">Tsinghua Institute for AI International Governance (I-AIIG) Dean <b>XUE Lan (</b><span class="s2"><b>薛澜</b></span><b>) </b>shared seven key ideas and views of China’s network of AI safety research institutions. He advocated for advancing AI safety and development simultaneously under the UN&#8217;s Common Agenda and Global Digital Compact. He emphasized fairness through globally interoperable AI safety testing systems and auditable technologies. Xue called for international cooperation on data security and privacy protection, as well as enhanced coordination to prevent AI misuse in activities like misinformation. He stressed the need for increased international investment in AI safety R&amp;D to prevent the risk of AI going out of control, while strengthening global risk reporting and policy sharing through AI safety summits. Finally, he highlighted the importance of AI capacity building in developing countries to achieve shared security.</p>
<p class="p1">Tsinghua Professor and Zhipu AI Chief Scientist <b>TANG Jie (</b><span class="s2"><b>唐杰</b></span><b>)</b> demonstrated various AI models developed at Zhipu AI, including ChatGLM and the agentic AutoGLM. The presentation discussed several key safety concerns, including jailbreak attacks, and described the creation of the SafetyBench dataset to support RLHF training and ensure model safety. He also raised emerging challenges in ensuring the safety of multimodal, increasingly agentic, and potentially embodied systems.</p>
<p class="p1"><b>Friederike Grosse-Holz </b>from the EU AI Office’s AI Safety Unit gave an online address on the EU&#8217;s approach to regulating general-purpose AI models. She explained that Chapter 5 of the EU AI Act, implemented through the forthcoming General-Purpose AI Code of Practice, establishes the key regulatory framework. The regulations require transparency from AI providers, mandating them to share specific model information with the EU AI Office and downstream providers while adhering to EU copyright laws. She noted that for models identified as posing systemic risks, providers must conduct thorough risk assessments and implement appropriate mitigation measures.</p>
<h2 class="p3">Panel on international AI safety cooperation</h2>
<p class="p1"><b>Concordia AI CEO Brian Tse (</b><span class="s2"><b>谢旻希</b></span><b>) moderated this panel,</b> which explored emerging developments in AI safety and opportunities for collective action in 2025. Tang presented a framework of AI development layers from language understanding, advanced reasoning, and tool use to self-learning, emphasizing the need for systemic risk assessments as systems advance toward AGI. Xue advocated for scenario planning for different AI risks, drawing upon lessons from the crisis management field, and more frequent updates to scientific assessments than the IPCC&#8217;s five-year cycle. On AI safety testing, Leung highlighted that there is not yet detailed modeling for AI risks and more scientific efforts are needed to ensure replicability, necessitating UK AISI to develop many methods and tools from scratch. Lam stressed the importance of building trustworthy digital-physical interfaces, especially for safety-critical industries that can threaten human life if they malfunction. The panel also discussed the emerging challenge of monitoring and preventing various forms of deception when AI systems interact with human users. <b>Looking ahead to 2025, panelists proposed ideas such as an ITER (global fusion project)-like global AI safety scientific project (Xue), systemic testing of AI safety against risk thresholds and red lines (Lam and Leung), and a global definition of AI safety (Tang)</b>.</p>
<h2 class="p3">Panel on AI safety testing science</h2>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-443" src="https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-2.jpg" alt="" width="1600" height="1066" srcset="https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-2.jpg 1600w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-2-300x200.jpg 300w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-2-1024x682.jpg 1024w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-2-768x512.jpg 768w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-safety-2-1536x1023.jpg 1536w" sizes="auto, (max-width: 1600px) 100vw, 1600px" /></p>
<p class="p1">This panel’s guests were Director of the National University of Singapore AI Institute Professor <b>Mohan Kankanhalli</b>, University of Illinois at Urbana-Champaign Professor <b>LI Bo</b>, and Tsinghua University Professor <b>HUANG Minlie (</b><span class="s2"><b>黄民烈</b></span><b>)</b>, with Singapore Infocomm Media Development Authority Director for Data-Driven Tech <b>Wan Sie LEE </b>moderating.</p>
<p class="p1">The panel examined approaches to testing methodologies and evaluation frameworks in AI safety. Kankanhalli noted that AI safety science remains in an early, empirical stage, suggesting that the field should draw inspiration from computer security&#8217;s adversarial frameworks and control systems&#8217; mathematical modeling of boundary conditions. Li emphasized incorporating symbolic rules and principles for safety guarantees, addressing the challenges of bug fixes and long-tail risks in AI systems. Huang raised the promise of erasing harmful knowledge through machine unlearning as a countermeasure for jailbreak attacks, an approach also endorsed by Kankanhalli. <b>The panel provided recommendations for international cooperation: standards for safety and security in certain domains (Li), open-source attack simulation projects (Huang), and exploring the risks of agentic systems operating in the physical world (Kankanhalli).</b></p>
<h2 class="p3">Panel on AI safety cooperation between regulators and industry</h2>
<p><img loading="lazy" decoding="async" class="alignnone size-full wp-image-444" src="https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-Safety-3.jpg" alt="" width="468" height="351" srcset="https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-Safety-3.jpg 468w, https://concordia-ai.com/wp-content/uploads/2025/07/Concordia-AI-hosting-AI-Safety-3-300x225.jpg 300w" sizes="auto, (max-width: 468px) 100vw, 468px" /></p>
<p class="p1">The participants in this panel were EU General-Purpose AI Code of Practice Vice-Chair <b>Nitarshan Rajkumar</b>, Resaro AI Managing Partner and CEO <b>April Chin</b>, and BCG X Principal Engineer SEA <b>Robin Weston</b>, with AI Verify Executive Director <b>Shameek Kundu </b>moderating.</p>
<p class="p1">The final panel explored the critical intersection between foundation model safety and downstream commercial applications. Rajkumar noted that while businesses focus on operational risks, governments must address broader national security concerns. He also compared foundation models to nuclear power plants upstream and AI applications to power outlets downstream to draw attention to the differing safety requirements needed at different levels – safety measures at the upstream level might be more significant and require government supervision. Chin described Resaro AI’s work in testing AI systems in high-risk applications such as healthcare and education to bridge the gap between academic benchmarks and more use-specific benchmarks. She noted that the number of stakeholders poses a challenge; in one instance, over 700 test cases were required before deploying a chatbot. Weston advocated for a “continuous delivery” approach to AI deployment, arguing that gradual updates improve understanding of system behavior, help identify the sources of problems, and account for the fundamentally unpredictable behavior of software in the real world.</p><p>The post <a href="https://concordia-ai.com/concordia-ai-hosting-ai-safety-sessions-at-the-international-ai-cooperation-and-governance-forum-2024/">Concordia AI hosting AI safety sessions at the International AI Cooperation and Governance Forum 2024</a> first appeared on <a href="https://concordia-ai.com">Concordia AI</a>.</p>]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
