Overview
Roche is one of the world's largest biopharmaceutical companies, with a significant presence in immunology and oncology. Its immunology portfolio includes Ocrevus (ocrelizumab), a leading treatment for relapsing and primary progressive forms of multiple sclerosis. Its oncology franchise features Tecentriq (atezolizumab), a PD-L1 checkpoint inhibitor approved across multiple tumor types, including non-small cell lung cancer, hepatocellular carcinoma, and urothelial carcinoma.
As post-marketing safety surveillance obligations expanded across both therapeutic areas, Roche's Global Patient Safety organization recognized a critical need to modernize how real-world evidence was integrated into ongoing benefit-risk assessments. The existing infrastructure relied on fragmented, manually curated databases that could not keep pace with the volume and velocity of post-marketing data being generated globally.
ArcaScience partnered with Roche to deploy an enterprise-grade data integration layer that unified disparate safety data sources and automated the incorporation of real-world evidence into quantitative benefit-risk analysis workflows.
The Challenge
Roche's pharmacovigilance infrastructure had evolved organically over two decades, resulting in a patchwork of 12 distinct safety databases, each with its own data model, coding conventions, and access protocols. The challenge was multi-dimensional:
Siloed safety databases. Individual Case Safety Reports (ICSRs) from spontaneous reporting, clinical trials, post-authorization safety studies (PASS), and patient support programs were stored in separate systems. The global safety database (Argus), clinical trial safety data in Oracle Clinical, and regional pharmacovigilance systems in China, Japan, and Brazil each operated independently. Reconciling a single drug's safety profile required manual extraction from up to eight separate sources.
Fragmented real-world evidence. RWE from electronic health records (EHR), claims databases (Optum, Truven MarketScan), disease registries (MSBase for multiple sclerosis, Flatiron Health for oncology), and social media monitoring existed in disparate formats. The pharmacoepidemiology team spent an estimated 60% of its time on data wrangling rather than analysis. There was no standardized pipeline to incorporate RWE into the Periodic Benefit-Risk Evaluation Report (PBRER) workflow.
Manual data reconciliation. Each PBRER cycle for Ocrevus required approximately 14 weeks of data preparation, including manual MedDRA coding harmonization across sources, deduplication of cases reported through multiple channels, and reconciliation of divergent adverse event terminology. For Tecentriq, the complexity was amplified by the drug's use across six distinct indications, each with different expected safety profiles and comparator landscapes.
Regulatory pressure. Under EMA's GVP Module IX on signal management, sponsors were increasingly expected to incorporate RWE alongside spontaneous reporting data. Expectations around FDA's Sentinel System and PMDA's evolving RWE guidance meant that Roche needed a scalable, validated approach to multi-source evidence synthesis, not just for two products but as a platform capability for the entire portfolio.
The ArcaScience Solution
ArcaScience deployed its Data Intelligence Engine as the foundational integration layer, connecting Roche's 12 existing safety data sources into a unified, continuously updated evidence base. The implementation was executed in three phases over 16 weeks, with full GxP validation.
Phase 1: Data Harmonization & Integration
The ArcaScience platform established automated ETL pipelines to ingest data from Roche's Argus safety database, Oracle Clinical trial repositories, regional PV systems, and external RWE sources. A semantic harmonization layer mapped divergent coding systems (MedDRA versions 23.0 through 26.1, WHO-Drug dictionaries, and institution-specific terminologies) into a unified ontology. Automated deduplication algorithms identified and reconciled cases reported through multiple channels, reducing duplicate case counts by 18% for Ocrevus and 22% for Tecentriq.
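The cross-channel deduplication step can be illustrated with a minimal sketch. It assumes that terminology has already been harmonized and that two reports describe the same case when a handful of normalized fields agree. The record layout, field names, and matching rule are hypothetical and are not taken from the ArcaScience platform, which may use probabilistic or fuzzy matching instead.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CaseReport:
    """One safety report as received from a single source (hypothetical layout)."""
    source: str      # illustrative label, e.g. "Argus" or "regional-JP"
    case_id: str     # identifier local to the source system
    product: str
    event_pt: str    # MedDRA Preferred Term, assumed already harmonized
    sex: str
    birth_year: int
    onset: date


def match_key(report: CaseReport) -> tuple:
    """Build a normalized key; reports sharing a key are treated as one case."""
    return (
        report.product.strip().lower(),
        report.event_pt.strip().lower(),
        report.sex.strip().upper(),
        report.birth_year,
        report.onset,
    )


def deduplicate(reports: list[CaseReport]) -> list[list[CaseReport]]:
    """Group reports by match key, keeping groups in first-seen order."""
    groups: dict[tuple, list[CaseReport]] = defaultdict(list)
    for report in reports:
        groups[match_key(report)].append(report)
    return list(groups.values())


def duplicate_reduction(reports: list[CaseReport],
                        groups: list[list[CaseReport]]) -> float:
    """Fraction of incoming reports removed by merging duplicates."""
    return 1 - len(groups) / len(reports)
```

Each returned group keeps every contributing report, so the originating source and local case identifier remain traceable after the merge. An exact-match key like this one is deliberately strict: it misses duplicates with small discrepancies (for example, onset dates a day apart), which is why production systems usually add tolerance rules.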
Phase 2: Automated RWE Incorporation
ArcaScience configured continuous data feeds from MSBase (the international MS registry with 80,000+ patient records), Flatiron Health's oncology EHR database, Optum claims data, and CPRD (UK primary care records). Natural language processing models extracted structured adverse event data from unstructured clinical notes, while propensity score-weighted analyses adjusted for confounding in observational comparisons. The platform generated automated incidence rate calculations with confidence intervals, updated weekly, for all MedDRA Preferred Terms across both products.
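As an illustration of the incidence rate calculation, the sketch below computes a rate per 1,000 person-years with a confidence interval. It assumes a log-scale normal approximation, in which the standard error of the log rate is 1 divided by the square root of the event count. The source does not state which interval method the platform uses, so the function name, parameters, and method are assumptions.

```python
import math
from statistics import NormalDist


def incidence_rate_ci(events: int, person_years: float,
                      level: float = 0.95,
                      per: float = 1000.0) -> tuple[float, float, float]:
    """Return (rate, lower, upper) per `per` person-years.

    Uses a normal approximation on the log scale, which is reasonable for
    moderate event counts; an exact Poisson interval is preferable when
    counts are small.
    """
    if events <= 0 or person_years <= 0:
        raise ValueError("events and person_years must be positive")
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    rate = events / person_years
    factor = math.exp(z / math.sqrt(events))    # multiplicative error factor
    return rate * per, rate / factor * per, rate * factor * per
```

For example, `incidence_rate_ci(25, 5000.0)` gives a point estimate of 5 events per 1,000 person-years. The interval is symmetric on the log scale, so the lower and upper bounds are the rate divided and multiplied by the same factor.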
Phase 3: Quantitative BRA Integration
The Decision Intelligence module consumed the unified evidence base to generate continuously updated benefit-risk assessments. For Ocrevus, the platform implemented a multi-criteria decision analysis (MCDA) framework incorporating relapse rate reduction, disability progression, infection risk (including PML risk stratification), and immunoglobulin depletion monitoring. For Tecentriq, indication-specific BRA models incorporated tumor response rates, immune-related adverse event profiles, and comparator data from checkpoint inhibitor competitors (pembrolizumab, nivolumab, durvalumab). All outputs were formatted for direct insertion into PBRER Section 18 (Integrated Benefit-Risk Analysis for Approved Indications).
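A common form of MCDA is the linear additive model, sketched below. Each criterion is rescaled to a 0 to 1 value between a "worst" and a "best" performance level, and the overall score is the weighted sum of those values. The source does not describe the actual model structure, weights, or scales, so everything in this sketch (class name, fields, and any numbers passed in) is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One benefit or risk criterion in a linear additive MCDA model."""
    name: str
    weight: float   # relative importance; normalized inside mcda_score
    worst: float    # performance level mapped to value 0
    best: float     # performance level mapped to value 1


def partial_value(criterion: Criterion, performance: float) -> float:
    """Linear value function, clipped to the range [0, 1].

    Works for risks as well as benefits: for a risk, `best` is simply
    the lower number (for example, fewer infections).
    """
    span = criterion.best - criterion.worst
    value = (performance - criterion.worst) / span
    return min(1.0, max(0.0, value))


def mcda_score(criteria: list[Criterion],
               performance: dict[str, float]) -> float:
    """Weighted sum of partial values, with weights normalized to sum to 1."""
    total_weight = sum(c.weight for c in criteria)
    return sum(
        c.weight / total_weight * partial_value(c, performance[c.name])
        for c in criteria
    )
```

Scores from this kind of model are only comparable between options evaluated with the same criteria, weights, and scales. In practice the weights are elicited from clinical experts and tested with sensitivity analysis.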
Real-Time Signal Monitoring
ArcaScience deployed a continuous signal detection dashboard that ran disproportionality analyses (PRR, ROR, BCPNN, MGPS) across all unified data sources simultaneously. The system generated automated alerts when statistical thresholds were exceeded, with contextualized clinical assessments powered by ArcaScience's causal inference models. Signal evaluation reports were generated in regulatory-ready format, aligned with EMA's GVP Module IX requirements for signal management.
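Two of the disproportionality measures named above, PRR and ROR, can be computed from a 2x2 table of report counts. The sketch below uses the standard formulas with log-scale confidence intervals. The alert rule at the end (lower bound above 1 with at least three cases) is one commonly used screening convention and is an assumption here; the thresholds used in the Roche deployment are not given in the source. BCPNN and MGPS are Bayesian methods and are not shown.

```python
import math
from statistics import NormalDist

# 2x2 table of report counts:
#   a: product of interest, event of interest
#   b: product of interest, all other events
#   c: all other products, event of interest
#   d: all other products, all other events


def _check(a: int, b: int, c: int, d: int) -> None:
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")


def _interval(estimate: float, se_log: float, level: float) -> tuple[float, float]:
    """Confidence interval computed on the log scale."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate * math.exp(-z * se_log), estimate * math.exp(z * se_log)


def prr(a: int, b: int, c: int, d: int, level: float = 0.95):
    """Proportional reporting ratio with confidence interval."""
    _check(a, b, c, d)
    estimate = (a / (a + b)) / (c / (c + d))
    se_log = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return (estimate, *_interval(estimate, se_log, level))


def ror(a: int, b: int, c: int, d: int, level: float = 0.95):
    """Reporting odds ratio with confidence interval."""
    _check(a, b, c, d)
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (estimate, *_interval(estimate, se_log, level))


def exceeds_threshold(a: int, b: int, c: int, d: int) -> bool:
    """Illustrative alert rule: at least 3 cases and PRR lower bound above 1."""
    _, lower, _ = prr(a, b, c, d)
    return a >= 3 and lower > 1.0
```

A disproportionality score flags a statistical association in reporting patterns, not a causal relationship, which is why flagged product-event pairs still go through clinical evaluation.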
Platform Modules Used
Implementation Timeline
16 weeks
Products Covered
Ocrevus (ocrelizumab)
Tecentriq (atezolizumab)
Regulatory Jurisdictions
FDA, EMA, Swissmedic, PMDA, NMPA, ANVISA
Results & Impact
Data Preparation Time Reduction
PBRER data preparation for Ocrevus decreased from 14 weeks to 3.5 weeks, a 75% reduction. The automated ETL pipelines eliminated manual data extraction, coding harmonization, and cross-source reconciliation. For Tecentriq, multi-indication data preparation dropped from 18 weeks to 4 weeks (about 78%), freeing pharmacovigilance scientists to focus on clinical interpretation rather than data wrangling.
Unified Data Sources
Twelve previously siloed databases now feed into a single, continuously updated evidence base: Argus global safety database, Oracle Clinical, three regional PV systems (Japan, China, Brazil), MSBase registry, Flatiron Health, Optum claims, CPRD, EudraVigilance, FAERS, and VigiBase. All sources are harmonized to a common data model with full provenance tracking.
New Signals Detected
The unified multi-source signal detection approach identified 23 potential safety signals in the first 12 months of operation, including 7 that had not been detected through spontaneous reporting alone. Three signals were subsequently confirmed through targeted epidemiological studies and incorporated into risk management plans. Early detection of a hepatotoxicity signal for Tecentriq in the hepatocellular carcinoma indication led to proactive label updates.
Faster PBRER Cycles
End-to-end PBRER preparation time was reduced by 40% across both products. The automated benefit-risk analysis module generates Section 18 content with quantified benefit-risk trade-offs, effects tables, and value trees. Regulatory affairs teams reported that the structured, data-driven format improved consistency across jurisdictions and reduced health authority queries by 35% compared to previous submission cycles.
"For years, our pharmacovigilance teams spent the majority of every PBRER cycle on data preparation rather than scientific evaluation. ArcaScience fundamentally changed that equation. By unifying our 12 safety databases and automating RWE integration, we can now focus on what matters most: understanding the evolving benefit-risk profile of our medicines and making evidence-driven decisions that protect patients. The real-time signal monitoring capability alone has transformed our ability to detect and respond to emerging safety concerns."
Dr. Katharina Bergmann
Vice President, Global Patient Safety & Pharmacovigilance
Roche Pharma — Immunology & Oncology