Head of Data & AI Platforms, Beijing
Salary: Negotiable
About the Beijing AI Center
The Beijing AI Center is a new strategic investment to accelerate drug discovery through AI. The center brings together AI researchers, computational scientists, and platform engineers to apply foundation models, agentic AI, and large-scale scientific computing to real R&D problems. Situated in one of the world's most dynamic AI talent markets, the center operates at the intersection of AI and biologics discovery, computational chemistry, and data-driven drug development.
The center is structured around three pillars: Discovery verticals (biologics engineering, computational chemistry), which own the science; Data & AI Platforms (this role), which owns the capabilities; and R&D IT, which owns the infrastructure. A dedicated on-premises GPU cluster provides the compute backbone, operated by IT and shaped by platform standards.
About the Role
This role owns the AI capability layer for the Beijing AI Center: the methods, standards, data products, evaluation frameworks, and AI application platforms that sit between raw compute infrastructure and Discovery science teams. You define how AI gets done in Beijing. Discovery teams decide what science to pursue. IT operates the cluster. You build the reusable platform capabilities that connect the two.
You are not an infrastructure leader. IT handles GPU provisioning, cluster operations, networking, and hardware vendor relationships. You are not a science leader. Discovery teams own model architecture decisions, training objectives, and scientific interpretation. You own the methodology, tooling, data readiness, and evaluation rigor that make both sides more productive.
What You Will Do
AI Methodology & Standards (20%)
Methods and best practices (not model architecture, which Discovery owns)
Define and maintain AI engineering standards for the center: training configurations, experiment reproducibility requirements, naming conventions, checkpoint management
Curate and document reusable training recipes: distributed training templates, parameter-efficient fine-tuning configurations, and inference optimization patterns
Publish best-practice guidance on method selection (when to use parameter-efficient methods vs. full fine-tuning, when to apply alignment techniques) so Discovery teams can make informed choices
Establish a reproducibility framework: experiment configs, seed management, result validation criteria
Boundary with Discovery: Discovery scientists select model architectures, define training objectives, curate domain-specific training data, and interpret results. You provide the methods toolkit they draw from and the engineering standards they follow. You build the car; they drive it.
Benchmarking & Evaluation (15%)
Design and build the center's model evaluation platform: standardized test harnesses, metric dashboards, and comparison tooling
Curate domain-relevant benchmark suites in partnership with Discovery teams (biologics, computational chemistry)
Define evaluation methodology: what metrics matter, how to construct held-out test sets, how to avoid data leakage, when a model is "good enough" to deploy
Build leaderboard infrastructure so Discovery teams can rigorously compare model variants, fine-tuning approaches, and architectural choices
Establish evaluation-as-a-service: Discovery teams submit a model and get back a standardized scorecard
Boundary with Discovery: Discovery teams define what success looks like for their scientific problems and interpret evaluation results. You build the evaluation infrastructure and methodology that make rigorous comparison possible.
Data Products & Governance (20%)
Own the "AI-ready" data layer: transform IT's foundational data pipelines into governed, versioned, documented datasets purpose-built for model training
Build a data product catalogue with clear provenance, quality scores, known limitations, and recommended use cases
Implement data quality frameworks: automated validation rules, drift detection, completeness checks, cross-dataset consistency
Design and enforce data versioning standards so experiments are reproducible against specific dataset snapshots
Implement a PIPL/DSL compliance framework for cross-border data transfer and local data operations
Define data governance policies aligned to Enterprise Data Enablement FAIR standards
Boundary with IT: IT builds and operates the data transfer pipelines, storage, and base ETL. You define what "AI-ready" means and build the quality, versioning, and governance layer on top.
Agentic AI & Applications (15%)
Deploy agentic AI capabilities for the Beijing center: multi-agent orchestration, local agent development frameworks, scientific workflow automation
Evaluate the China-specific LLM landscape (Qwen, DeepSeek, GLM, Moonshot) and integrate suitable models for center use cases
Build China cloud integrations (Alibaba, Tencent, Baidu) for LLM inference and agent hosting
Enable AI productivity tooling for Discovery teams: coding assistants, literature search agents, experiment design copilots
Define agent evaluation methodology: how to measure agent reliability, safety, and scientific accuracy
Boundary with IT: IT provides the hosting infrastructure for agent services (containers, endpoints, networking). You own what gets built, which LLMs are selected, and how agents are evaluated.
Team Leadership (20%)
Recruit, manage, and develop a team of 8 platform FTEs in 2026, scaling to 10+
Set technical direction and quality bar for all platform deliverables
Run the hiring pipeline in Beijing: source candidates from top local universities and the Beijing AI talent market
Create a team culture that attracts top Beijing AI talent in a competitive market
Onboard and develop junior hires; establish technical mentorship with global platform leads
Stakeholder Alignment (10%)
Partner with IT infrastructure lead on compute policies, scheduling priorities, and capacity planning (you provide requirements and priorities; IT implements)
Work directly with Discovery vertical leads to understand workload requirements and prioritize platform delivery
Represent platform capabilities to ecosystem partners (academic institutions, AI companies) where technical integration is needed
Provide regular delivery updates to global AI leadership
Participate in weekly coordination meetings across all center functions
Requirements
Experience
10+ years in AI/ML platform engineering, data engineering, or applied AI at scale
5+ years leading technical teams (at least 8 direct reports)
Track record building platform capabilities from scratch (not just maintaining)
Experience building internal AI products or platforms that research or science teams actually adopted
Experience serving scientific or research teams with platform capabilities (biopharma, genomics, or similar domain preferred)
Technical
Strong background in AI training methods: distributed training, parameter-efficient fine-tuning, and modern post-training and alignment techniques
Experience designing evaluation frameworks and benchmarking infrastructure for ML models
Data engineering and governance: data product design, quality frameworks, versioning, compliance
Modern AI application development: LLM tooling, agent frameworks, prompt engineering, multi-agent orchestration
MLOps fundamentals: experiment tracking, model registry, CI/CD for ML
China-Specific
Ability to work in Beijing full-time (on-site)
Mandarin fluency required (team, stakeholders, and partners operate in Mandarin)
China cloud provider experience (Alibaba Cloud, Tencent Cloud, Huawei Cloud)
Familiarity with Chinese AI ecosystem: local LLMs (Qwen, DeepSeek, GLM), academic institutions, AI companies
PIPL/DSL awareness (a foundational understanding is required; deeper expertise can be built on the job)
Leadership
Proven ability to operate in a matrix organization: you report to global AI leadership but deliver locally for the center
Comfortable with ambiguity: the center is still being built, and priorities will evolve
Strong stakeholder management across Discovery scientists, IT infrastructure, and global platform teams
Product mindset: you measure success by adoption and impact, not just delivery
Talent magnet: ability to attract top AI talent in a competitive Beijing market
Nice-to-Have
Biopharma domain knowledge (drug discovery, protein engineering, computational chemistry)
Experience with model evaluation at scale (leaderboards, automated evaluation pipelines)
Academic partnership management (joint projects, co-supervision, IP frameworks)
Experience navigating multi-org delivery models where platform, infrastructure, and science are separate teams