Background
Concordia AI is dedicated to identifying and addressing the most pressing challenges arising from rapid advances in AI, ensuring that AI development is safe, reliable, and responsible. Exploring AI safety technologies is a critical component of this mission. Only through technical approaches can we establish scientific AI risk assessment and monitoring systems that enable real-time, accurate understanding of AI risks and the implementation of appropriate countermeasures. At the same time, technical solutions are key to mitigating those risks: through technical means for AI system alignment, reinforcement, and monitoring, we can make systems significantly safer. Our ongoing work includes:
- Frontier Model Risk Evaluation
Concordia AI collaborates with the Shanghai Artificial Intelligence Laboratory on the “Frontier AI Risk Management Framework”, evaluating large language model risk performance in biological/chemical knowledge, strategic deception, self-replication, persuasive manipulation, cybersecurity, and collusion scenarios.
- AI Risk Monitoring Platform (coming soon)
Concordia AI is developing a third-party platform for evaluating and monitoring frontier AI model risks, tracking real-time AI model risk status and trends to provide a reference for policymakers, LLM developers, and AI safety researchers.
- AI Safety Technical Research
Concordia AI co-publishes “AI Alignment: A Comprehensive Survey” with Peking University and multiple domestic and international universities, and co-authors “Bare Minimum Mitigations for Autonomous AI Development”.
We are seeking a passionate professional with a solid technical background in AI safety to explore the frontiers of foundation model safety with us and contribute to building next-generation AI safety systems. As a core member of Concordia AI, you will be deeply involved in foundation model safety research and practice: tracking the latest domestic and international safety risk research and evaluation benchmarks, analyzing the potential dual-use capabilities and safety boundaries of next-generation AI, and exploring innovative risk assessment methodologies. You will also participate in product development for foundation model safety evaluation and reinforcement platforms, driving technical implementation from design and development through testing and operations, and providing critical support for the safe development of AI.
Responsibilities
- Frontier AI safety research
Track the state of the art in LLM safety, including but not limited to safety evaluation, red-teaming/adversarial testing, value alignment, interpretability, and chain-of-thought monitoring. Work closely with Chinese and international AI safety research organizations to conduct joint studies and publish academic papers or technical reports.
- LLM safety evaluation and adversarial testing
Participate in safety evaluation and red-teaming for LLMs and AI agents; design and implement innovative evaluation schemes that meet, and ideally exceed, the current industry standard; explore risk mitigation approaches and system reinforcement strategies.
- Safety platform & products
Translate frontier research conclusions and adversarial-testing insights into production-grade engineering; contribute to building an integrated, end-to-end LLM safety platform. Manage the full product lifecycle, from requirements and design through technology selection, programming, deployment, and operations, to deliver reliable, efficient safety products.
Qualifications
Required
- Master’s degree or above, preferably in CS/ML/Software Engineering; exceptional undergraduates (such as those with first-author publications at top-tier ML/NLP/CV conferences) will also be considered.
- Strong learning ability, capable of applying the latest industry developments and rapidly iterating on product prototypes.
- Proficiency in Python and a solid grasp of algorithms and data structures.
- Familiarity with Git for version control and code collaboration.
- Alignment with Concordia AI’s mission; a high sense of ownership and strong problem-solving ability, working well both independently and collaboratively; excellent communication skills, team spirit, and a proactive approach to work.
- Strong Chinese reading, writing, and communication skills.
Preferred
- 2+ years of relevant work experience.
- Experience with PyTorch and other deep learning frameworks, and with development tools such as Jupyter Notebook, Docker, and Kubernetes.
- Experience with large language model API integration, deployment, and evaluation, or experience in large language model safety research and practice.
Benefits
- Competitive salary in the range of US$76,000–108,000 per year, plus a comprehensive benefits package.
- Flexible working hours; up to 30% remote working quota each year.
- 22 days of paid annual leave and 10 days of paid sick leave per year.
- Comprehensive social insurance package (“五险一金”, the five social insurances and one housing fund), plus supplemental commercial medical insurance.
- A modern and comfortable office space, with an annual personal development fund and an allowance for office equipment and supplies.
For detailed information, please see: https://mp.weixin.qq.com/s/DezWswOxFAZYcDQpB1Y-mA