The Second Eye

“The Second Eye” opens a five-part series. It traces healthcare AI to its real origins, in radiology, cardiology, anesthesiology, and other places, decades before today’s hype. I’m tired of reading that AI in healthcare is failing. It’s not! It started long before Silicon Valley noticed.

The Second Eye

From digitizing light to comprehensive clinical reasoning


Every few months, a new article declares that AI in radiology has failed to deliver on its promises. The false-positive rates are too high. The algorithms miss what any first-year resident would catch. The technology, we're told, is not ready.

This is not a new conversation. We've been having it, almost word for word, since 1998.

That year, Computer-Aided Detection, known as CAD, made its debut in mammography suites across the United States. It was supposed to be the second pair of eyes every radiologist needed. Instead, it became the colleague who cried wolf, flagging every dense cluster of pixels as a potential malignancy and flooding reading lists with false alarms. Radiologists learned to glance at the CAD overlay, shrug, and move on.

They were right to be frustrated. But the critics who declared the whole experiment a failure were wrong. Not because CAD worked well; it didn't. They were wrong because they mistook an early, clumsy iteration for the final verdict. CAD was built on rules encoded by engineers who had never sat in the dark at 2 a.m., scrolling through chest CTs, developing an instinct they couldn't articulate even to themselves. The rules couldn't capture what they couldn't name.

But CAD proved the concept. It demonstrated that an algorithm could sit inside a clinical workflow, flag a finding, and route it to a human for judgment. That loop between machine detection and human decision never went away. Everything that followed was an argument about how smart the machine in that loop could get.


I graduated from medical school in the late 1990s. After my internship, I faced the question every young physician dreads and loves in equal measure: what specialty do you want to spend your life in?

I knew a few things about myself. I was a fan of computers and technology. I wanted quick feedback, which suited my restless personality. And I wanted a life outside the hospital. The best candidate was radiology.

In the late 1990s, radiology was at an inflection point most other specialties wouldn't reach for another decade. It was already going digital. Three-dimensional MRI and CT reconstructions were finding their place, and you could feel the energy, the sense that imaging was about to become something fundamentally different. What sealed it for me was simpler. I still remember placing a transducer on a patient's abdomen and watching the screen light up. It felt like holding a flashlight in the dark, peeking inside a living body in real time. The image was grainy, the anatomy ambiguous, and I was hooked.

That feeling is worth remembering. Because every generation of technology in radiology has been chasing the same thing: a better flashlight.


Before there was AI in radiology, there had to be data. And before there was data, there had to be a standard.

In the 1980s, the American College of Radiology and the National Electrical Manufacturers Association created DICOM: Digital Imaging and Communications in Medicine. It's the protocol that defines how medical images are formatted, stored, and transmitted between devices. DICOM meant that an X-ray taken on a GE machine in Boston could be read on a Siemens workstation in Munich. It separated the image from the device. Every algorithm, every neural network, every startup that shows a model "reading" a scan is standing on the shoulders of that file format.
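That separation of image from device rests on a concrete byte layout. As a minimal, stdlib-only sketch (an illustration of the file format, not a real parser): a DICOM file starts with a 128-byte preamble followed by the ASCII marker "DICM", after which data elements are addressed by little-endian (group, element) tag pairs, with (0002, 0000) holding the File Meta Information group length:

```python
import struct

DICOM_PREAMBLE_LEN = 128  # standard 128-byte preamble; content is unspecified
DICOM_MAGIC = b"DICM"     # 4-byte marker that follows the preamble

def looks_like_dicom(data: bytes) -> bool:
    """Return True if the byte stream carries the DICOM magic marker."""
    return (len(data) >= DICOM_PREAMBLE_LEN + 4
            and data[DICOM_PREAMBLE_LEN:DICOM_PREAMBLE_LEN + 4] == DICOM_MAGIC)

def read_first_tag(data: bytes):
    """Parse the (group, element) tag of the first data element after the
    marker. Tags are pairs of little-endian 16-bit integers, e.g.
    (0002, 0000) for the File Meta Information group length."""
    offset = DICOM_PREAMBLE_LEN + 4
    group, element = struct.unpack_from("<HH", data, offset)
    return group, element

# Build a minimal synthetic stream: empty preamble, magic, one tag.
synthetic = bytes(128) + b"DICM" + struct.pack("<HH", 0x0002, 0x0000)
print(looks_like_dicom(synthetic))  # → True
print(read_first_tag(synthetic))    # → (2, 0)
```

Real files go on to encode value representations, transfer syntaxes, and pixel data; libraries like pydicom handle all of that. The point of the sketch is only that a scan on disk is a well-specified byte stream any vendor can read.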

The next piece of infrastructure was PACS, the Picture Archiving and Communication System, the software platform that stores, retrieves, and displays medical images across an entire health network. PACS moved radiology from film lightboxes to digital workstations. I lived through that transition. One year you were holding films up to fluorescent panels, squinting at shadows. The next, you were scrolling through slices on a monitor, adjusting window levels with a mouse. A film on a lightbox is inert. A DICOM file on a server is data. That distinction is the foundation everything else was built on.


And the data was about to explode.

In the early 2000s, spiral CT machines transformed what had been a slow, slice-by-slice process into continuous volumetric acquisition. The scanner swept through the body in a single breath-hold, generating hundreds of slices in seconds. Multi-detector CT followed quickly: 4-slice machines giving way to 16, then 64, then 256 detector rows. A routine chest CT that once produced 30 images now produced 300 or more.

MRI went through a parallel transformation. Magnets got stronger, bore sizes shrank, pulse sequences grew more sophisticated. Diffusion-weighted imaging revealed early stroke damage within minutes. Cardiac MRI could freeze a beating heart mid-cycle. Ultrasound, the tool that first drew me to radiology, shrank from room-sized consoles to handheld probes that connected to a smartphone. Elastography let you "feel" tissue stiffness through the screen.

All of this meant one thing: vastly more data per patient, per study, per shift. A cardiac CT angiogram could generate over a thousand images. A whole-body MRI for cancer staging might produce thousands. The human eye and the human attention span had become the bottleneck.

This is the context most writing about AI in radiology misses. The algorithms didn't arrive because Silicon Valley decided healthcare was interesting. They arrived because the scanners outpaced the humans.


In 2012, a type of deep learning architecture called a convolutional neural network, or CNN, won the ImageNet image recognition competition by a startling margin. The model, AlexNet, worked in a way fundamentally different from CAD. Instead of following rules written by engineers, a CNN processes an image through successive layers of mathematical filters. Each layer learns to detect increasingly abstract features: edges in the early layers, textures and shapes in the middle, complex structures in the deeper layers. No human decides what those features should be. The network discovers them by training on millions of labeled examples.
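The layered-filter idea is easy to see in miniature. Below is an illustrative pure-Python sketch, not code from any production model: a hand-written vertical-edge kernel, the kind of rule a CAD engineer would have coded by hand, and the kind of feature a CNN's first layer typically rediscovers on its own, applied to a tiny image with a dark-to-bright boundary:

```python
def convolve2d(image, kernel):
    """Valid-mode 2D convolution (strictly, cross-correlation, as in most
    deep-learning frameworks): slide the kernel over the image and sum
    the elementwise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(image[i + di][j + dj] * kernel[di][dj]
                            for di in range(kh) for dj in range(kw))
    return out

# A tiny "image": dark left half, bright right half.
image = [[0, 0, 0, 1, 1, 1] for _ in range(3)]

# Hand-written vertical-edge detector.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

print(convolve2d(image, vertical_edge))  # → [[0, 3, 3, 0]]
```

The response is zero over the flat regions and peaks where the boundary is: the filter "fires" on the edge. A CNN stacks thousands of such filters, with the weights learned from data rather than written by hand, and composes them layer on layer.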

Within three years, the medical imaging world noticed. If a CNN could distinguish a golden retriever from a labrador by learning visual features autonomously, could it distinguish a malignant nodule from a benign one?

It could. And the clinical results were striking.

Chest X-ray interpretation was an early proving ground. CNNs trained on hundreds of thousands of labeled X-rays learned to detect pneumonia, pleural effusions, and cardiomegaly, an enlarged heart visible on imaging, with accuracy rivaling experienced radiologists. For pneumothorax, a collapsed lung requiring urgent intervention, AI could flag the finding within seconds of the image reaching the server.

In neuroimaging, the stakes were even higher. Large vessel occlusion, the stroke type caused by a clot blocking a major brain artery, is one of the most time-sensitive diagnoses in medicine. Every minute of delay costs roughly 1.9 million neurons. Companies like Viz.ai built CNN-based systems that analyzed CT angiograms within minutes of acquisition, automatically alerting stroke teams on their phones. Studies associated these systems with significant reductions in time to treatment notification, often on the order of 20 to 30 minutes depending on the center. In stroke care, that's not an efficiency gain. It's the difference between walking out of the hospital and permanent disability.

Fracture detection, particularly subtle breaks in the wrist, spine, or hip, had always been a source of error in emergency departments. AI trained on tens of thousands of X-rays could highlight a hairline scaphoid fracture that might be missed at 4 a.m. during a busy trauma shift. Lung nodule characterization, the question of whether an incidental finding warrants follow-up, became another natural fit: AI could measure volume, track growth across serial scans, and assign risk scores based on shape, density, and location.
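The growth-tracking piece can be made concrete. A standard quantity here is the volume doubling time under an exponential-growth model, VDT = Δt · ln 2 / ln(V₂/V₁), where shorter doubling times are generally treated as more suspicious. The numbers below are hypothetical, chosen only to illustrate the formula:

```python
import math

def doubling_time_days(v1_mm3: float, v2_mm3: float, interval_days: float) -> float:
    """Volume doubling time assuming exponential growth:
    VDT = interval · ln(2) / ln(V2 / V1)."""
    if v2_mm3 <= v1_mm3:
        raise ValueError("no growth measured: doubling time undefined")
    return interval_days * math.log(2) / math.log(v2_mm3 / v1_mm3)

# Hypothetical nodule growing from 500 mm³ to 800 mm³ over a 90-day follow-up.
vdt = doubling_time_days(500, 800, 90)
print(round(vdt))  # ≈ 133 days
```

Automating this measurement across serial scans is exactly the kind of tedious, error-prone arithmetic where an algorithm outperforms a tired human, and it feeds directly into the risk scores mentioned above.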

The algorithms were impressive. The clinical reality was fragmented. A hospital wanting AI-assisted radiology in 2020 might need one vendor for chest X-ray triage, another for stroke detection, a third for pulmonary embolism, a fourth for mammography. Each with its own integration, validation, and contract. Radiology departments were assembling a patchwork of point solutions, each excellent alone, collectively a nightmare to maintain. The FDA's 510(k) pathway, designed for single-purpose devices with one intended use per clearance, reinforced this fragmentation.


To understand why what came next matters, you have to see the architectural progression clearly.

CAD used hand-crafted features. A human engineer decided what mattered, pixel intensity, cluster size, edge sharpness, and wrote rules. The system could never be smarter than those rules.

CNNs used learned features. The network discovered what mattered from labeled data. A massive leap, but with a hard constraint: every new clinical task required a new labeled dataset and a new model.

Foundation models (e.g., Google’s Med-PaLM M, Microsoft’s LUMEN, Siemens’ CADA), the architecture defining the current moment, use learned representations. A foundation model is a large-scale AI system trained on enormous volumes of data, often without human-applied labels, using self-supervised learning. Instead of being told "this image contains a fracture," the model learns by predicting masked portions of images, reconstructing corrupted data, or matching related views of the same anatomy. Through this process it builds an internal model of what normal anatomy looks like, how structures relate across the body, and what constitutes meaningful deviation.

There is an important distinction here that both clinicians and builders should understand. A foundation model is not the same thing as an ensemble, a bundle of separate task-specific algorithms packaged under a single interface. In an ensemble, each component was trained independently on its own narrow task. In a true foundation model, there is one shared representation of anatomy and pathology, learned once, that supports multiple downstream tasks. The difference matters because a shared representation can recognize relationships between findings that siloed models never could: the interaction between a chest finding and an abdominal one, the pattern that becomes significant only in the context of the whole scan.

This is not an incremental improvement. It is a different engineering paradigm.


In early 2026, the FDA cleared one of the first multi-condition triage platforms built on a foundation-model architecture: Aidoc's CARE. Previous AI tools looked for one thing at a time. CARE triages 14 acute conditions simultaneously: pulmonary embolism, a blood clot in the lung; intracranial hemorrhage, bleeding inside the skull; cervical spine fractures; aortic dissection, a tear in the wall of the body's largest artery; and more, all across the whole body, using a single reasoning engine.

The distinction matters. CARE is not 14 separate algorithms stitched together under one interface. It is one model that processes a scan the way a radiologist processes a scan: looking at everything, prioritizing what's urgent, recognizing that a finding in the chest might change the significance of a finding in the abdomen.

Remember the context collapse of the CAD era, the system that could flag a bright pixel but couldn't understand anatomy? This is context restoration. The machine is no longer pattern-matching in isolation. It reasons across anatomy, across conditions, across urgency levels.

For hospitals, the practical difference is enormous. The department that used to juggle ten vendor integrations now has one platform. But to understand why that matters, you need to understand what a radiologist's workday actually looks like.


The popular image is someone squinting at an X-ray on a lightbox. The reality in 2026 bears almost no resemblance.

The modern radiologist may not be in the hospital at all. PACS, combined with secure cloud infrastructure, untethered radiology from the reading room years ago. A radiologist in Tel Aviv reads emergency scans for a hospital in New York. A pediatric neuroradiology subspecialist in her home office in Denver reviews a case from a rural Montana clinic with no neuroimaging expertise. A patient in a small town at 2 a.m. gets their scan read by a world-class specialist, not because the specialist flew in, but because the images did.

The workday starts with the worklist: a queue of studies on the PACS workstation showing patient, study type, ordering physician, clinical indication, and priority. In a busy hospital, a general radiologist may interpret 50 to over 100 studies per day depending on case mix and practice model. Not all studies are equal; a simple chest X-ray takes minutes, while a complex trauma CT with multiple body regions can take considerably longer.

Without AI triage, this worklist is sorted by arrival time or manual urgency flags. A routine knee MRI and a CT angiogram for suspected pulmonary embolism sit in the same queue. The radiologist works through them in order or makes judgment calls based on clinical notes and instinct.


With a triage foundation model integrated into PACS, every CT that arrived overnight has already been analyzed. The pulmonary embolism case is at the top before the radiologist sits down, flagged: "PE detected, right lower lobe, high confidence." The radiologist opens the study, reviews the AI's annotations as an optional overlay, confirms or overrides the finding, and dictates the report.

The AI doesn't generate the report. It doesn't make the diagnosis. It sorts the pile and points to the page. The radiologist still reads, still thinks, still decides. But the sickest patients get seen first.
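The re-ranking itself is conceptually simple. Here is a toy sketch — the condition names, urgency scores, and `Study` fields are invented for illustration, not any vendor's schema — of a worklist sorted by AI urgency first, then arrival time:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical urgency ranks; real triage platforms use their own
# condition lists and confidence thresholds.
URGENCY = {"PE detected": 3, "ICH detected": 3, "incidental nodule": 1}

@dataclass
class Study:
    accession: str
    description: str
    arrival_minute: int             # minutes since midnight
    ai_flag: Optional[str] = None   # finding attached by the triage model

def sort_worklist(studies):
    """Urgent AI-flagged studies first; ties broken by arrival order,
    so unflagged studies behave exactly like a first-come queue."""
    return sorted(studies, key=lambda s: (-URGENCY.get(s.ai_flag, 0),
                                          s.arrival_minute))

worklist = [
    Study("A1", "knee MRI", 180),
    Study("A2", "CT angiogram chest", 195, ai_flag="PE detected"),
    Study("A3", "chest X-ray", 200),
]

print([s.accession for s in sort_worklist(worklist)])  # → ['A2', 'A1', 'A3']
```

The PE study arrived after the knee MRI but jumps to the top; everything else keeps its arrival order. Nothing about the diagnosis changed hands, only the order of the pile.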

This distinction, between decision support and decision replacement, is why triage AI gets adopted while diagnostic AI faces resistance. "Look at this one first" is a fundamentally different proposition from "this is cancer." The first augments clinical judgment. The second threatens to replace it.

For anyone building these systems, the lesson is clear: the AI has to live inside the PACS, not beside it. It has to speak DICOM. It has to surface findings where the radiologist is already looking. Many products that failed had excellent algorithms and terrible workflow integration. They asked the radiologist to leave their workspace and review findings in a separate application. That is a product design failure, not an AI failure. And it's one the critics routinely conflate.


So what's the next version of "AI in radiology is failing"?

It's already taking shape. The argument will be about explainability. Foundation models are harder to interpret than single-task CNNs. When a narrow model flags a lung nodule, you can generate a saliency map, a visual overlay showing which pixels influenced the decision. When a foundation model triages a scan based on a complex interplay of findings across organ systems, the explanation is harder to package neatly.

"Black box medicine," the critics will call it. They won't be entirely wrong to raise the concern.

But the pattern is worth recognizing. The same objection was raised about deep learning in 2015. The field responded with attention maps, gradient-weighted activation methods, and a growing ecosystem of interpretability tools. The solutions were imperfect but iterative. The regulatory frameworks adapted. The objection didn't disappear, but it stopped being a reason to reject the technology.

The explainability challenge for foundation models is real. It will be addressed the same way: iteratively, imperfectly. And then the critics will move on to the next concern, just as they moved on from CAD's false positives, just as they moved on from deep learning's opacity.

Meanwhile, somewhere in a hospital right now, a radiologist is starting their shift. Maybe she's at home, coffee in hand, logging into PACS from her study. Maybe he's in the reading room, monitors glowing in the familiar dark. Either way, the worklist has already been sorted. The critical cases are at the top. The pulmonary embolism that arrived at 3 a.m. didn't wait until morning to be found.

Thirty years ago, I put a transducer on a patient's abdomen and felt like I was holding a flashlight in the dark. The flashlight got brighter. It got smarter. And now it points itself at what matters most.

Medicine followed, as it always does.

Subscribe to Data, Decisions and Clinics
