Case Study · Healthcare · Clinical Trials · Conceptual Product

Virtual DM Assistant

Reducing re-queries in clinical trials by surfacing real-time, contextual guidance for site coordinators directly within the EDC.

Hypothesis

If the EDC offered a virtual DM assistant chatbot for sites to use when answering queries, one that explains why each query exists and how to resolve it, the number of re-queries would fall by 40%. Secondary effects: time to close queries drops by 30%, and time to database lock shortens by 2 weeks.

21% of queries are re-queried
52 days avg. time to close a query
$30–70 cost per query
3,000–10,000 queries per moderate Phase 3 trial
1 Discover

Context

From personal experience as a clinical data manager, I've observed that data cleaning for clinical trials revolves around site queries in the clinical trial database. A query is raised either manually by the DM or automatically by the system. The site then has to attend to it: sometimes a written response is needed, and sometimes the data itself must be updated. In 21% of cases there is back and forth because the data change or the query response is not clear.

As per industry research, a moderate-sized trial (200 patients in Phase 3) could have between 3,000 and 10,000 queries, and each query costs between $30 and $70. A query also takes an average of 52 days to close, which can delay database lock and analysis timelines.
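To make these figures concrete, the sketch below runs the cited ranges through the hypothesis. This is a minimal, illustrative calculation: the 40% re-query reduction is assumed from the hypothesis, not measured.

```python
# Illustrative arithmetic only: query counts and costs are the published ranges
# cited above; the 40% re-query reduction is the hypothesis, not a result.
QUERIES_LOW, QUERIES_HIGH = 3_000, 10_000      # queries per moderate Phase 3 trial
COST_LOW, COST_HIGH = 30, 70                   # USD per query
RE_QUERY_RATE = 0.21                           # share of queries that get re-queried
HYPOTHESIZED_REDUCTION = 0.40                  # hypothesis: 40% fewer re-queries

for queries, cost in [(QUERIES_LOW, COST_LOW), (QUERIES_HIGH, COST_HIGH)]:
    re_queries = queries * RE_QUERY_RATE
    avoided = re_queries * HYPOTHESIZED_REDUCTION
    print(f"{queries:>6} queries -> {re_queries:,.0f} re-queries, "
          f"~{avoided:,.0f} avoided (~${avoided * cost:,.0f} saved at ${cost}/query)")
```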

The Market

Direct · EDC Query Workflows

Medidata Rave · Veeva Vault · Oracle InForm

Static query text + email clarification. Sites dig through lengthy eCRF guidelines.

Missing: real-time explanation of query intent

Adjacent · AI in Clinical Data (Cleaning)

Medidata Detect · IQVIA AI Coding

Anomaly detection and AI-assisted coding for data quality.

Gap: No AI for human-in-the-loop query resolution

Indirect · Regulated AI Copilots

GitHub Copilot · MS Copilot for Healthcare

Workflow augmentation tools in regulated environments (dev workflows, clinical notes).

Proof point: Human augmentation is viable even in regulated spaces

The Audience

Persona 1 · Primary

The Site Coordinator "So-Much-To-Do"

Motivated by
Resolving data management queries and other data quality issues so their site doesn't have bad metrics or protocol deviations.
Pain Points
Documentation is scattered everywhere: labs, data management, devices, pharmacy. They must review a 50+ page eCRF completion guideline just to understand a data entry issue, and even then it may or may not explain how to resolve it.
What they hope for
Immediate assistance when answering queries. Today they raise issues to the CRA, who reaches out to the DM by email if they cannot solve them themselves; the thread goes back and forth, sometimes looping in Clinical Science for complex issues.

Persona 2 · Secondary

The Data Manager "Deep-in-the-Data"

Motivated by
Delivering the highest-quality data, and finding efficient ways to detect and clean data issues.
Pain Points
Some queries must be raised manually, and communication with the site happens only through the query itself. Because trials are large and span time zones, the DM often cannot support coordinators at the moment they need help. Emails arrive and are answered with a lag, and writing detailed explanations of how to resolve each issue is very time-consuming. The same question often arrives from multiple CRAs.
2 Define

Current Query Resolution Journey

  1. DM Detects Discrepancy

     Manual review by the Data Manager identifies a data discrepancy in the EDC.

  2. DM Writes Query in EDC

     A query is raised in the EDC and a notification is sent to the site.

  3. Site Coordinator Sees Query

     The coordinator sees the query among 20–30+ others in the database.

     Pain Point: No context about query intent.

  4. Site Interprets & Responds

     The coordinator attempts to understand and respond to the query.

     Pain Point: Must cross-reference patient documents + the 50-page eCRF manual.

  5. DM Reviews Response

     The Data Manager reviews the site's response, with a 1–3 day lag.

     Pain Point: 21% require re-query clarification.

  6. Re-Query Issued (21% of Cases)

     A new query is issued for the same case. If the site is unsure, they escalate to the CRA, who escalates to the DM via email.

     Pain Point: The site re-works the same case; the DM writes repetitive re-queries or lengthy clarification emails.

  7. Final Resolution

     The query is eventually closed, on average 52 days after it was raised.

     Pain Point: Critical queries block analysis timelines.

Big Takeaways

Competitive Whitespace

No EDC (Medidata Rave, Veeva Vault) offers real-time query context at response time — only static text + email.

Structural Shift

RBQM (risk-based quality management) reduces total query volume, making efficient resolution of the remaining complex queries mission-critical.

User Behaviour

Sites spend time digging through docs; DMs write repetitive clarification emails — both are solvable with contextual AI guidance.

Proven Pattern

Regulated copilots (e.g., MS Copilot for Healthcare) show that human workflow augmentation is viable, even in highly regulated industries.

Problem

01

Re-query and Burnout Loop

Sites receive queries with little context, dig through 50-page eCRF guides, and send best-guess responses; DMs then re-query 21% of them, rewriting similar clarifications by email and increasing burnout on both sides.

02

No Real-Time Query Guidance

Existing EDC systems (Rave, Vault, InForm) provide only static query text and email chains; there is no real-time assistant to explain why a query exists and suggest compliant resolution options at the moment the site is responding.

The Goal

Site Coordinator

Less digging through 50-page eCRF completion guidelines; less coordination with CRAs when stuck; a faster path to a clean query count in the database.

Data Manager

Less re-querying; less time to review and write queries; fewer repeated emails to CRAs and sites.

Sponsor / CRO / Study

Faster time to close queries; shorter time to database lock; higher quality data, faster.

3 Develop

Feature Prioritization & MVP Definition

| Feature | Reach | Impact | Confidence | Effort | Result |
|---|---|---|---|---|---|
| "Why?" button | 100% | High | High | Med | MVP |
| Chat avatar | 60% | High | Med | High | MVP |
| Analytics | 40% | Med | Med | Med | V2 |
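For transparency on how this table could roll up into a single score, here is a minimal RICE-style sketch; the numeric values assigned to High/Med/Low are my assumption for illustration and were not part of the original prioritization.

```python
# Hypothetical numeric mapping for the qualitative ratings in the table above;
# RICE score = (reach * impact * confidence) / effort.
LEVEL = {"High": 3, "Med": 2, "Low": 1}

features = [
    # (name, reach, impact, confidence, effort)
    ('"Why?" button', 1.00, "High", "High", "Med"),
    ("Chat avatar",    0.60, "High", "Med",  "High"),
    ("Analytics",      0.40, "Med",  "Med",  "Med"),
]

for name, reach, impact, confidence, effort in features:
    score = reach * LEVEL[impact] * LEVEL[confidence] / LEVEL[effort]
    print(f"{name:<15} RICE = {score:.2f}")
```

With this assumed mapping the ordering matches the table: the "Why?" button scores highest, the chat avatar second, and analytics falls to V2.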

Final Solution — Core Flow

  1. Site opens query

     A "Why?" button appears next to the query for site coordinators to select if they choose.

  2. Popup: Context & Guidance

     Query intent + eCRF reference + resolution examples are surfaced in a popup.

  3. [Optional] Chat

     Site: "I'm still confused about how to solve this query."
     DM chat: "For this query, it is requesting you to verify the start date of the drug admin, as it is the same as the start date from Week X (it is duplicated)."

Prototype of the Virtual DM Assistant embedded directly in the EDC query view.
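To show how the popup content could stay grounded in study-specific documents (the constraint described in the Risks section below), here is a minimal sketch. The WhyPopup schema, field names, and keyword-matching retrieval are assumptions standing in for a real retrieval-plus-LLM pipeline, not an existing EDC API.

```python
from dataclasses import dataclass

@dataclass
class WhyPopup:
    """Payload the 'Why?' button could surface next to a query (hypothetical schema)."""
    query_intent: str               # plain-language reason the query was raised
    ecrf_reference: str             # matching section of the eCRF completion guideline
    resolution_examples: list[str]  # non-prescriptive examples, never instructions
    sources: list[str]              # study documents the guidance was grounded in

def build_why_popup(query_text: str, study_documents: dict[str, str]) -> WhyPopup:
    # Grounding constraint: only study-specific documents are searched.
    # A real implementation would use retrieval + an LLM; this sketch simply
    # selects guideline sections whose text overlaps with the query wording.
    matches = [name for name, text in study_documents.items()
               if any(word in text.lower() for word in query_text.lower().split())]
    return WhyPopup(
        query_intent=f"This query was raised because: {query_text}",
        ecrf_reference=matches[0] if matches else "No matching guideline section found",
        resolution_examples=[
            "Common resolutions include confirming the value against the source document.",
            "Common resolutions include correcting a duplicated date.",
        ],
        sources=matches,
    )
```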

4 Deliver

Launch & GTM Strategy

Pilot approach (3 months):

Study Selection

1 Phase 3 oncology maintenance study + 1 new Phase 3 oncology study

Sites

20 High-Performing Sites

Low training burden — chosen to reduce confounding variables in pilot data.

Rollout

Opt-in Button for 50% of Queries (a minimal assignment sketch follows the channels list below)

Channels

  • Site training webinars
  • EDC nudge when site coordinator logs in after the feature is implemented
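One way the 50% opt-in could be implemented is a deterministic per-query assignment, so a given query always falls in the same arm of the pilot. The hashing approach below is a sketch under that assumption, not a described implementation.

```python
import hashlib

def show_why_button(query_id: str, rollout_fraction: float = 0.5) -> bool:
    """Deterministically assign a query to the 'Why?' button arm of the pilot."""
    digest = hashlib.sha256(query_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100           # stable bucket 0-99 per query ID
    return bucket < rollout_fraction * 100   # e.g. buckets 0-49 -> button shown

# Example: the same (hypothetical) query ID always lands in the same arm.
print(show_why_button("STUDY123-SITE05-Q0042"))
```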

Measuring Success

North Star Metric

Re-query rate: 21% → 12%

Leading Indicators

  • Button click rate > 60%
  • Chat usage > 20%

Lagging Indicators

  • Query closure: 52 → 36 days

Counter Metric

  • Site response time increase: +10% max
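A minimal sketch of how these pilot metrics could be computed from an exported query log; the record fields (re_queried, why_clicked, chat_used, opened/closed days) are assumptions about what the EDC export would contain.

```python
# Hypothetical query-log records exported from the EDC; field names are assumed.
queries = [
    {"opened": 0, "closed": 40, "re_queried": False, "why_clicked": True,  "chat_used": False},
    {"opened": 0, "closed": 55, "re_queried": True,  "why_clicked": True,  "chat_used": True},
    {"opened": 0, "closed": 30, "re_queried": False, "why_clicked": False, "chat_used": False},
]

n = len(queries)
re_query_rate = sum(q["re_queried"] for q in queries) / n            # north star
click_rate    = sum(q["why_clicked"] for q in queries) / n           # leading
chat_rate     = sum(q["chat_used"] for q in queries) / n             # leading
avg_closure   = sum(q["closed"] - q["opened"] for q in queries) / n  # lagging (days)

print(f"re-query rate {re_query_rate:.0%}, click rate {click_rate:.0%}, "
      f"chat usage {chat_rate:.0%}, avg closure {avg_closure:.1f} days")
```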

Risks & Tradeoffs

Regulatory: Leading the Site Coordinators

Chat guidance interpreted as data direction.

Mitigation

100% non-prescriptive language ("Common resolutions include..."), full audit trail, regulatory pre-validation.

Tradeoff: Sacrifices some guidance richness for GCP compliance.

AI Bias / Hallucination

Chat gives wrong eCRF/protocol interpretation → data quality worse.

Mitigation

Study-specific document grounding only.

Tradeoff: Conservative guidance vs. comprehensive coverage.

RBQM Query Reduction

Too few queries remain for ROI as RBQM reduces total volume.

Mitigation

Scope to complex queries (protocol interpretation, lab adjudication).

Tradeoff: Smaller market vs. higher impact per query.

Compliance Constraints

Guidance must remain auditable, non-prescriptive, and grounded only in study-specific documentation.

Mitigation
  • Use non-prescriptive phrasing (e.g., “Common resolutions include A, B”).
  • Restrict grounding to protocol and study documents.
  • Avoid leading queries or chats that could bias data entry.

Tradeoff: Slightly less directive guidance in exchange for regulatory safety.
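As one way to enforce the non-prescriptive phrasing rule automatically, here is a minimal guardrail sketch; the flagged phrases, required hedges, and function name are illustrative assumptions, not a validated compliance check.

```python
# Hypothetical guardrail: reject assistant guidance that reads as a data-entry
# instruction rather than a non-prescriptive explanation.
PRESCRIPTIVE_PHRASES = ("you must enter", "change the value to", "set the date to", "record it as")
REQUIRED_HEDGES = ("common resolutions include", "sites have typically", "the guideline describes")

def guidance_is_compliant(text: str) -> bool:
    lowered = text.lower()
    if any(phrase in lowered for phrase in PRESCRIPTIVE_PHRASES):
        return False                                     # directive wording: block before display
    return any(hedge in lowered for hedge in REQUIRED_HEDGES)  # require a hedged framing

print(guidance_is_compliant("Common resolutions include verifying the start date in the source."))  # True
print(guidance_is_compliant("Change the value to 12-May-2024."))  # False
```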

Future Iterations

  • Full chat avatar (post-regulatory validation), plus a Clinical Science avatar for scientific queries.
  • Cross-study pattern learning for query resolutions (scaling beyond a single study).

Closing

This case started with a 21% re-query rate wasting site and DM time. The proposed solution delivers contextual guidance at decision time, with the goal of cutting re-queries by 40% and shortening database lock by 2 weeks, unlocking faster, cleaner trials.

References

  1. Query Management in Clinical Trials: A Guide to Process & Costs — IntuitionLabs