How we built an AI assistant that cut clinical trial setup from days to minutes

Before a single patient enrolls, someone has to turn protocol documents into a structured workflow in the app. It takes days and keeps specialists away from higher-value work. We built an AI assistant that reads the sponsor-provided documents and turns them into a complete, editable draft. So the week of setup becomes a quick review.

TL;DR

What 

An AI assistant that reads clinical trial documents (150+ pages) and automatically turns them into a fully structured, editable protocol draft.

Why it matters

Slow setup delays everything: trial start dates slip, costs grow, and fewer studies get off the ground. Faster setup means more trials, sooner.

What changed

Specialists no longer protocol details from documents into the system by hand. They open a ready draft and refine it, so they have more time for decisions that require real clinical expertise.

Result

The prototype won stakeholder buy-in at the hackathon.

The problem no one had time to fix 

Think back to the race to develop COVID-19 vaccines. Millions of people waited anxiously as clinical trials raced against time. And before a single patient could be enrolled, someone had to build a protocol: every visit scheduled, every procedure mapped, every data field configured. All of it by hand, from a pile of dense documents.

Usually, clinicians receive the study package for a new drug trial: a clinical trial protocol document, supplementary materials, and eCRF guidelines. Buried somewhere across those documents is the full roadmap of the trial. 

Their job is to read all of it, translate it into a configured protocol inside the clinical trial management system, and get it exactly right. Get it wrong, and you get a protocol deviation, which can compromise trial data or trigger regulatory issues.

In numbers: the status quo  

4–5

business days for an average protocol, v1 only

150+

pages of source documents to read and interpret

0

reusable work; every new trial starts from scratch


In this article, we'll take you behind the build: how we solved the problem, the architectural choices we made, and what it means for the future of clinical trial setup.

What is a clinical trial protocol? 

A clinical trial protocol is the plan for how a study will run. It has three layers. 

Visits

The sequence of patient appointments (e.g., "Pre-Screening", "Screening", "Week 4", "Follow-up"). Some visits depend on others and must happen in a specific order.

Procedures

The tasks performed at each visit (e.g., "Vital Signs", "Informed Consent", "Blood Draw", "ECG"). Each task follows the CDISC standard for clinical data collection.

Questions and form fields

Each procedure is a form with fields for text, dropdowns, checkboxes, and dates.


A typical real-world protocol has 50–80 procedures spread across 20–40 visits, with each form containing 5–30 questions. So, the complexity adds up fast.

Why is manual setup so hard 

The challenge isn't any single step; it's the volume of detail where mistakes aren't an option. 

Volume

150+ pages of source material with scattered information.

Time

Setting up the first version takes 4–5 business days on average.

Tedious and routine

Specialists spend time on manual transcription instead of higher-value clinical decisions.

Risk of human error

Inconsistent naming, incorrect schedule, or missed procedures lead to costly mid-trial fixes.

Inconsistent formatting

"Blood Pressure" in one sponsor's document is "BP Measurement" in another. Specialists standardize these terms manually every time.

Poor standards adherence

Procedures and fields should follow CDISC standards, but manual setup often leads to inconsistencies.

Limited scalability

Every additional study requires more people. You can't scale the work without scaling the team.

Slow onboarding

The nuances take significant time to learn. New staff require significant time to learn the nuances.

The idea: Let AI do the heavy lifting 

The idea of using AI for protocol setup existed long before the hackathon. The problem was clear, and so was the potential value. But the implementation was complex, so it kept getting pushed aside.

The hackathon gave the team a safe space to test it fast, without production constraints. The challenge came directly from the client's own problem statement.

The core bet: if AI produces a good enough draft, specialists only need to review and refine it rather than build everything from scratch. That shifts the bottleneck from creation to review. 

What we built: Clinical trial protocol drafting assistant 

We built an end-to-end pipeline that takes one or more PDFs and produces a fully structured protocol draft. The result appears as an interactive graph: visits as nodes, dependencies as lines, and procedures inside each visit. A chat interface lets specialists rename procedures, add visits, or move questions.

Technical stack 

LLMs

Claude Sonnet 4 and Claude Haiku 3.5, accessed via AWS Bedrock through LangChain.

Orchestration

LangGraph, with both Python and TypeScript implementations.

PDF parsing

PyPDF2 for text extraction, Camelot for tables.

Visualization

Vite + React Flow for the interactive protocol graph.

Why standard RAG doesn't work here 

The most obvious approach for large documents is RAG (Retrieval-Augmented Generation): split documents into chunks, index them, and retrieve relevant pieces when needed.

But there’s a problem: RAG only works if you already know what to look for.

Ask, “What procedures are in this trial?” and almost every chunk looks relevant. Ask, “What happens at Week 4?” and the system first needs to know that Week 4 is even a visit. It's a chicken-and-egg problem: you can't retrieve the right information until you've already built the protocol you're trying to build.

The Map-Reduce approach 

Instead of retrieval, the pipeline reads every page of every document and processes it in two stages.

Map 

The LLM processes each page independently and extracts only protocol-related information: visits, procedures, schedules, and form fields. It ignores administrative content, legal text, and contact details. Hundreds of pages are compressed into a compact set of structured summaries.

Reduce 

The LangGraph pipeline uses those summaries to: 

  • extract visits, 

  • identify procedures, 

  • build visits in parallel, 

  • classify procedures by CDISC domain, 

  • generate forms, 

  • assemble the final protocol.


The full pipeline: 

Each extracted procedure links back to the exact source pages where it appeared. When the system builds the procedure form, it sends the AI only those relevant pages instead of the full document. This reduces hallucination and keeps each step focused.

What changed for the business 

We tested the assistant against protocols already configured in production, known ground truth. The AI matched closely on visit structure, dependencies, and procedure assignments. 

Where gaps appeared, they were system-specific details not in the source documents, easy for a reviewer to catch and fill before enrollment begins.

The benefits add up quickly.

Throughput

More studies run in parallel without growing the setup team proportionally.

Consistency

CDISC classification applied uniformly; procedure naming consistent across visits.

Quality

Reviewing a draft catches errors more reliably than building from a blank slate.

Onboarding

New team members can review an AI draft faster than learning to build from scratch.

Scalability

You're no longer limited by how many people you have, just how fast they can review.


Sponsor satisfaction

Faster, more reliable delivery means earlier start dates for studies.


The same work that required 4–5 days of specialist time is in minutes. The same team handles more studies simultaneously. And because the AI applies the same naming conventions and CDISC classifications every time, consistency no longer depends on who built it.

From hackathon to production 

During the hackathon, the team built and demonstrated the prototype. Both the client and internal teams quickly aligned around the idea. 

After the presentation, the team moved the project into active development. They refactored and expanded the codebase, recreating the original Python prototype in TypeScript using the same LangGraph architecture.

Key takeaways 

  • Protocol setup is a genuine bottleneck

It sits on the critical path before any patient can be enrolled. Any tool that shortens it creates immediate, measurable value.

  • This is a strong AI use case

The task is document-to-structure extraction with a well-defined output schema. Complex enough to challenge junior staff, regular enough to be learnable by an LLM.

  • Map-Reduce beats RAG for this kind of task

When you don't know what to search for until after you've built what you're trying to build, retrieval doesn't work. Reading everything and distilling it is the right architecture.

  • MVP discipline is everything

PDF in, structured protocol out, editable via chat. That scope was achievable in a hackathon window and enough to secure production investment.

What’s next? 

We are working on deeper integration with clinical trial management systems, support for scanned PDFs, and improved handling of complex multi-column layouts.

FAQ 

What is clinical trial protocol automation? 

It's the use of AI to read clinical trial documents and automatically produce the structured protocol configuration that specialists would otherwise build manually.

How long does protocol setup take without AI? 

At least 4–5 business days for an average study. Complex studies take longer. And that's just for the first version before revision rounds.

How long does it take with AI? 

The pipeline processes 150+ pages across multiple files in minutes. Human review of the resulting draft takes a fraction of the time required for manual setup.

Can AI replace clinical experts? 

No, and it's not designed to. The AI produces a draft; the expert reviews, adjusts, and approves it. The change is that their time goes toward high-value reviews rather than low-value transcription.

Is AI reliable for healthcare workflows? 

VITech is HITRUST-certified, so reliability comes from design: structured output schemas, self-healing parsers that retry malformed responses, targeted context delivery, and mandatory human review before any protocol reaches production. The AI handles the volume, while specialists validate the results.


Let’s get in touch!

Fill out the form below, and we’ll get in touch soon