This annotation protocol was designed to create a high-quality, radiologist-validated dataset for breast imaging analysis using mammography and digital breast tomosynthesis (DBT). The annotation process was report-guided and performed by trained radiologists to ensure accurate lesion localization, segmentation, and structured characterization.
The primary objective of the protocol was to extract detailed lesion-level and study-level information from radiology reports and accurately map these findings onto corresponding imaging data through structured classification and pixel-level segmentation.
Annotations were performed using the following imaging inputs:
All standard mammographic and tomosynthesis views were reviewed. Lesions were annotated across all views and slices where clearly visible.
The annotation process was performed in a structured, multi-step manner:
Radiologists carefully reviewed the associated radiology report before initiating image annotation. The report served as the primary source of reference and provided critical study-level and lesion-level information, including:
Study-level parameters:

Lesion-level parameters:
Radiologists used this report-based information to guide subsequent annotation and classification.
All lesions described in the radiology report were identified on mammography and tomosynthesis images.
For each lesion:
This ensured accurate lesion localization and spatial consistency across imaging datasets.
In addition to segmentation, detailed structured classification was performed at both the study and lesion levels.
Study-Level Classification
Radiologists recorded the following report-derived study-level parameters:
These values were directly extracted from the report to preserve the original diagnostic interpretation.
An “Annotator disagreement comment field” was incorporated for each classification parameter to document instances where the annotating radiologist’s interpretation differed from the report. This allowed capture of additional expert-derived insights while preserving the original report-based diagnostic classification.
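As an illustration of how a report-derived value and an annotator's disagreement note can sit side by side without one overwriting the other, here is a minimal Python sketch. The class name `ClassifiedField`, its attribute names, and the helper `record_disagreement` are hypothetical and are not taken from the protocol or from any annotation tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClassifiedField:
    """One classification parameter (hypothetical structure, for illustration only)."""
    name: str                                   # parameter name; placeholder values used below
    report_value: str                           # value extracted from the radiology report
    disagreement_comment: Optional[str] = None  # annotator's note, if their reading differs

    @property
    def has_disagreement(self) -> bool:
        # A blank or whitespace-only comment is treated as "no disagreement recorded".
        return bool(self.disagreement_comment and self.disagreement_comment.strip())


def record_disagreement(field: ClassifiedField, comment: str) -> ClassifiedField:
    """Return a copy with the comment attached; the report value is left untouched."""
    return ClassifiedField(field.name, field.report_value, comment)


# Usage with placeholder names and values
original = ClassifiedField("example_parameter", "value_from_report")
flagged = record_disagreement(original, "Annotator interpretation differs from report")
print(flagged.report_value, flagged.has_disagreement)
```

Keeping the comment in a separate attribute, rather than editing `report_value`, mirrors the stated goal of preserving the original report-based classification.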
Lesion-Level Classification
Each annotated lesion was characterized using structured metadata fields, including:
Lesion IDs were maintained consistently across mammography and tomosynthesis images to ensure cross-modality consistency.
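A consistency check on lesion IDs could look like the following Python sketch. It assumes each annotation is a dictionary with `lesion_id` and `modality` keys and that the modality labels are `"mammography"` and `"tomosynthesis"`. These names are illustrative assumptions, not the dataset's actual schema.

```python
from collections import defaultdict


def lesion_ids_by_modality(annotations):
    """Group lesion IDs by modality (assumed keys: 'lesion_id', 'modality')."""
    ids = defaultdict(set)
    for ann in annotations:
        ids[ann["modality"]].add(ann["lesion_id"])
    return dict(ids)


def unmatched_lesion_ids(annotations, modalities=("mammography", "tomosynthesis")):
    """Return lesion IDs that appear in some, but not all, of the listed modalities."""
    grouped = lesion_ids_by_modality(annotations)
    sets = [grouped.get(m, set()) for m in modalities]
    all_ids = set().union(*sets)
    common = set.intersection(*sets) if sets else set()
    return sorted(all_ids - common)


# Usage with made-up IDs
annotations = [
    {"lesion_id": "L1", "modality": "mammography"},
    {"lesion_id": "L1", "modality": "tomosynthesis"},
    {"lesion_id": "L2", "modality": "mammography"},
]
print(unmatched_lesion_ids(annotations))
```

An ID reported by this check is a prompt for review rather than an error in itself, since a lesion may be clearly visible in only one modality.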
The protocol included provisions for capturing additional findings identified during image review that were not explicitly described in the report.
If a radiologist identified an additional lesion:
This allowed capture of additional radiologist-interpreted findings while preserving the distinction between report-derived and radiologist-detected lesions.
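One way to keep this distinction explicit in the data is a provenance tag on each lesion record. The sketch below is a hypothetical illustration: the enum `LesionSource`, its values, and the `Lesion` record are assumed names, not fields defined by the protocol.

```python
from dataclasses import dataclass
from enum import Enum


class LesionSource(Enum):
    """Where a lesion entry came from (hypothetical labels)."""
    REPORT = "report_derived"
    RADIOLOGIST = "radiologist_detected"


@dataclass(frozen=True)
class Lesion:
    lesion_id: str
    source: LesionSource


def split_by_source(lesions):
    """Separate report-derived lesions from those added during image review."""
    from_report = [l for l in lesions if l.source is LesionSource.REPORT]
    added = [l for l in lesions if l.source is LesionSource.RADIOLOGIST]
    return from_report, added


# Usage with made-up IDs
lesions = [
    Lesion("L1", LesionSource.REPORT),
    Lesion("L2", LesionSource.RADIOLOGIST),
]
from_report, added = split_by_source(lesions)
print(len(from_report), len(added))
```

With a tag like this, downstream analyses can include or exclude radiologist-detected findings without altering the report-derived set.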
To preserve interpretive transparency and enable secondary analysis, the protocol incorporated structured disagreement documentation.
For each classification field, annotators were provided with a disagreement comment option to record situations where imaging interpretation differed from the report.
Examples include:
In such cases:
This approach allowed retention of report fidelity while capturing expert radiologist insight.
To ensure annotation reliability:
This ensured spatial and anatomical consistency of lesion annotations across the dataset.
The annotation protocol focused on clinically relevant lesions and excluded benign findings unlikely to contribute meaningful diagnostic value.
The following were excluded:
However, suspicious calcifications described in the report were included and annotated.