P&ID Symbol Detection with YOLOv8 and PyTorch — Complete Tutorial

Codersarts AI
May 17

Every P&ID is a dense map of symbols — valves, pumps, instruments, heat exchangers, control loops — where the position, shape, and connections between symbols carry meaning that no OCR engine can read.

This is the part of document intelligence that most tutorials skip entirely. OCR extracts text. But on a P&ID, a gate valve isn't labelled "gate valve" in plain text — it's a specific geometric symbol shape at a specific location connected to specific pipelines. Understanding that requires computer vision, not character recognition.

In this guide we train a custom YOLOv8 object detection model on P&ID symbols, starting from pretrained weights, and cover the full workflow: dataset preparation, annotation strategy, training configuration for high-resolution engineering drawings, tiled inference, post-processing to associate symbols with instrument tags, evaluation with precision/recall metrics, and export for deployment.

This is the exact model architecture we use in production at docprocessing360.com — deployed for oil & gas, EPC, and manufacturing clients.

Why YOLOv8 for P&ID Symbols

Several object detection architectures exist. Here's why YOLOv8 wins for P&ID symbol detection specifically:

Criterion	YOLOv8	Faster R-CNN	LayoutLM	Template Matching
High-res image support	✅ Native	✅ Yes	❌ No	✅ Yes
Small object detection	✅ Strong	✅ Strong	❌ No	⚠️ Fragile
Custom class training	✅ Simple	⚠️ Complex	⚠️ Moderate	❌ Per-symbol
Training speed	✅ Fast	⚠️ Slow	⚠️ Slow	N/A
Production deployment	✅ ONNX/TorchScript	⚠️ Heavier	⚠️ Heavier	⚠️ Brittle
Handles symbol rotation	✅ With aug	⚠️ Limited	❌ No	❌ No
Overlapping symbols	✅ NMS handles	✅ Yes	❌ No	❌ Fails

YOLOv8 is accurate enough to automate symbol identification in Piping and Instrumentation Diagrams, which is why we run it in production. It also trains fast, exports to ONNX, TorchScript, and TensorRT, and its Python API via Ultralytics keeps the entire pipeline clean to maintain.

The Core Challenge: Why P&IDs Break Standard Models

Before writing any code, understand the specific challenges that make P&ID symbol detection harder than standard object detection:

1. Extreme Symbol Density

P&IDs pack dozens to hundreds of symbols onto a single sheet. Symbols overlap, share boundary regions, and are separated by pipeline lines rather than whitespace. Standard COCO-trained models assume objects are surrounded by background — P&IDs have almost no background.

2. No Large Public Dataset

Unlike natural image datasets where millions of labeled photos exist, there is no large public dataset of labeled engineering drawings. You must build or augment your own annotated dataset. This is the single biggest bottleneck.

3. Symbol Variation Across Standards

P&ID symbols vary by standard (ISA 5.1, ISO 14617), by company-specific symbol libraries, and by decade (1970s drawings look different from 2020s CAD exports). A model trained on one company's symbols may fail on another's without retraining.

4. High-Resolution Images

A single P&ID sheet may be 7000 × 4500 pixels or larger. Standard YOLOv8 training uses 640px images. Processing P&IDs at native resolution requires a tiled inference strategy.

5. Small Objects

Instrument tags like FIC-101A next to a 40×40 pixel valve symbol must both be detected reliably. Small object detection requires specific model configuration.

Environment Setup


pip install ultralytics opencv-python numpy pillow \
            matplotlib labelImg pyyaml torch torchvision onnxruntime

Verify GPU:


import torch
print(torch.cuda.is_available())      # True
print(torch.cuda.get_device_name(0))  # NVIDIA RTX 3090 / A100 etc.

YOLOv8 requires CUDA for practical training speeds. On CPU, a single epoch on 500 images takes ~45 minutes. On GPU it takes ~2 minutes.

Step 1 — Dataset Preparation

Option A: Use the Digitize-PID Synthetic Dataset (Fastest Start)

A synthetic dataset of 500 annotated P&ID sheets with 32 symbol classes is publicly available from the Digitize-PID research paper. This dataset includes sample images in JPEG format with label annotations and bounding boxes for each piece of text and symbol in the image.

This is the fastest way to get a working model. Download, convert to YOLO format, and train. Accuracy on real P&IDs from this baseline will be 65–75% — good enough to validate the approach, not good enough for production.

Option B: Build Your Own Dataset (Production Quality)

For production accuracy (90%+), you need annotated samples from your actual P&ID documents.

Recommended annotation tool: LabelImg (free, outputs YOLO format directly)

Minimum samples per class:

50 images per symbol class for acceptable accuracy
100–200 images per class for production accuracy
More samples help, but only if they are labelled consistently: annotation quality matters more than quantity

Annotation workflow:


Raw P&ID sheet (high-res PDF/TIFF)
        ↓
Convert to PNG at 300 DPI
        ↓
Tile into 1280×1280 patches (with 20% overlap)
        ↓
Annotate each patch in LabelImg (YOLO format)
        ↓
Collect .txt annotation files
        ↓
Train/val split (80/20)

Why tile? P&IDs at 300 DPI produce images too large for GPU memory at once. Tiling into 1280×1280 patches lets you process the full document while keeping each training sample GPU-friendly.



import cv2
import numpy as np
from pathlib import Path

def tile_image(img_path: str, tile_size: int = 1280,
               overlap: float = 0.2) -> list[tuple]:
    """
    Tile a large P&ID image into overlapping patches for annotation.
    Returns list of (patch_img, x_offset, y_offset) tuples.
    """
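    img = cv2.imread(img_path)
    if img is None:  # cv2.imread returns None instead of raising on failure
        raise FileNotFoundError(f"Could not read image: {img_path}")
    h, w = img.shape[:2]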
    step = int(tile_size * (1 - overlap))
    tiles = []

    for y in range(0, h, step):
        for x in range(0, w, step):
            x2 = min(x + tile_size, w)
            y2 = min(y + tile_size, h)
            patch = img[y:y2, x:x2]

            # Pad to tile_size if edge patch
            if patch.shape[0] < tile_size or patch.shape[1] < tile_size:
                padded = np.zeros((tile_size, tile_size, 3), dtype=np.uint8)
                padded[:patch.shape[0], :patch.shape[1]] = patch
                patch = padded

            tiles.append((patch, x, y))

    return tiles
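
To get a feel for how many patches one sheet produces, the short sketch below repeats the stepping logic of tile_image() without loading an image. The helper names tile_origins and count_tiles are illustrative and not part of any library; the 7000 × 4500 sheet size is the example figure used earlier in this guide.

```python
def tile_origins(length: int, tile_size: int = 1280,
                 overlap: float = 0.2) -> list[int]:
    """Start offsets along one axis, using the same stepping as tile_image()."""
    step = int(tile_size * (1 - overlap))
    return list(range(0, length, step))


def count_tiles(width: int, height: int, tile_size: int = 1280,
                overlap: float = 0.2) -> tuple[int, int, int]:
    """Return (columns, rows, total tiles) for a sheet of the given size."""
    cols = len(tile_origins(width, tile_size, overlap))
    rows = len(tile_origins(height, tile_size, overlap))
    return cols, rows, cols * rows


if __name__ == "__main__":
    cols, rows, total = count_tiles(7000, 4500)
    print(f"{cols} columns x {rows} rows = {total} tiles")
```

With 1280px tiles and 20% overlap the step is 1024px, so a 7000 × 4500 sheet yields 7 columns and 5 rows, or 35 patches to annotate. Raising the overlap reduces the chance of cutting a symbol at a tile edge but increases the patch count.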

Step 2 — Symbol Classes (ISA Standard)

Define your symbol taxonomy before annotating. For ISA 5.1 compliant P&IDs, common classes include:


# pid_symbols.yaml — dataset configuration

path: ./datasets/pid
train: images/train
val: images/val
test: images/test

nc: 32  # Number of symbol classes

names:
  0: gate_valve
  1: ball_valve
  2: butterfly_valve
  3: check_valve
  4: control_valve
  5: globe_valve
  6: needle_valve
  7: plug_valve
  8: safety_relief_valve
  9: pump_centrifugal
  10: pump_reciprocating
  11: compressor
  12: heat_exchanger_shell_tube
  13: heat_exchanger_plate
  14: vessel_vertical
  15: vessel_horizontal
  16: tank_atmospheric
  17: filter_strainer
  18: indicator_generic
  19: transmitter_generic
  20: controller_generic
  21: recorder_generic
  22: flow_element
  23: level_gauge
  24: pressure_gauge
  25: temperature_element
  26: actuator_pneumatic
  27: actuator_electric
  28: signal_line_pneumatic
  29: signal_line_electric
  30: reducer_concentric
  31: blind_flange

Pro tip: Start with 10–15 most common symbols in your specific P&ID library rather than all 32 at once. A model with 92% accuracy on 12 classes beats 70% accuracy on 32 classes every time.
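
One way to decide which classes to start with is to count how often each class id appears in the label files you already have. The sketch below is a minimal illustration: count_instances and pick_classes are hypothetical helper names, and it counts annotated instances as a rough proxy for the per-class minimum mentioned in Step 1.

```python
from collections import Counter


def count_instances(label_files: dict[str, str]) -> Counter:
    """Count annotated instances per class id.

    label_files maps a file name to the text content of a YOLO label file.
    """
    counts = Counter()
    for text in label_files.values():
        for line in text.splitlines():
            parts = line.split()
            if parts:  # skip blank lines
                counts[int(parts[0])] += 1
    return counts


def pick_classes(counts: Counter, min_instances: int = 50,
                 max_classes: int = 15) -> list[int]:
    """Most frequent class ids that also meet the minimum instance count."""
    return [cls for cls, n in counts.most_common(max_classes)
            if n >= min_instances]


if __name__ == "__main__":
    # Tiny made-up example: two label files, three annotations in total.
    labels = {
        "pid_001_tile_0_0.txt": "0 0.5 0.5 0.03 0.03\n4 0.2 0.2 0.04 0.04\n",
        "pid_001_tile_0_1.txt": "0 0.7 0.1 0.03 0.03\n",
    }
    counts = count_instances(labels)
    print(counts)
    print(pick_classes(counts, min_instances=2))
```

Classes that fall below the minimum can be left out of the first training run and added back once more examples have been annotated.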

Step 3 — Dataset Directory Structure

YOLO expects a specific directory layout:

datasets/pid/
├── images/
│   ├── train/
│   │   ├── pid_001_tile_0_0.png
│   │   ├── pid_001_tile_0_1.png
│   │   └── ...
│   ├── val/
│   │   └── ...
│   └── test/
│       └── ...
└── labels/
    ├── train/
    │   ├── pid_001_tile_0_0.txt
    │   ├── pid_001_tile_0_1.txt
    │   └── ...
    ├── val/
    │   └── ...
    └── test/
        └── ...

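Each .txt label file contains one row per symbol in that image tile, in the format class_id center_x center_y width height, with all coordinates normalised to 0–1. The three rows below describe a control_valve, a gate_valve, and a controller_generic. Keep the file to numbers only: YOLO label files do not support inline comments.


4 0.523 0.341 0.042 0.038
0 0.712 0.198 0.031 0.029
20 0.381 0.556 0.055 0.051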
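
If your annotations start out as pixel corner boxes (for example when converting another dataset), the conversion to this row format is a few divisions. The sketch below assumes boxes given as x1, y1, x2, y2 in tile pixels; to_yolo_row and from_yolo_row are illustrative names, not library functions.

```python
def to_yolo_row(class_id: int, x1: float, y1: float, x2: float, y2: float,
                img_w: int, img_h: int) -> str:
    """Convert a pixel-space corner box to one YOLO label row."""
    cx = (x1 + x2) / 2 / img_w   # box centre, normalised to 0-1
    cy = (y1 + y2) / 2 / img_h
    bw = (x2 - x1) / img_w       # box size, normalised to 0-1
    bh = (y2 - y1) / img_h
    return f"{class_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}"


def from_yolo_row(row: str, img_w: int, img_h: int
                  ) -> tuple[int, float, float, float, float]:
    """Inverse conversion: YOLO row back to (class_id, x1, y1, x2, y2)."""
    cls, cx, cy, bw, bh = row.split()
    cx, bw = float(cx) * img_w, float(bw) * img_w
    cy, bh = float(cy) * img_h, float(bh) * img_h
    return int(cls), cx - bw / 2, cy - bh / 2, cx + bw / 2, cy + bh / 2


if __name__ == "__main__":
    # A 40x40 px gate valve (class 0) in a 1280x1280 tile.
    row = to_yolo_row(0, 620, 300, 660, 340, 1280, 1280)
    print(row)
    print(from_yolo_row(row, 1280, 1280))
```

The round trip is useful as a sanity check: drawing the recovered pixel boxes back onto the tile quickly reveals swapped axes or off-by-one-tile offsets.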

Script to verify your dataset structure:


from pathlib import Path
import yaml

def verify_dataset(yaml_path: str):
    with open(yaml_path) as f:
        config = yaml.safe_load(f)

    base = Path(config['path'])
    issues = []

    for split in ['train', 'val']:
        img_dir = base / 'images' / split
        lbl_dir = base / 'labels' / split

        imgs = list(img_dir.glob('*.png')) + list(img_dir.glob('*.jpg'))
        lbls = list(lbl_dir.glob('*.txt'))

        print(f"{split}: {len(imgs)} images, {len(lbls)} labels")

        for img in imgs:
            lbl = lbl_dir / (img.stem + '.txt')
            if not lbl.exists():
                issues.append(f"Missing label: {img.name}")

    if issues:
        print(f"\n{len(issues)} issues found:")
        for i in issues[:10]:
            print(f"  {i}")
    else:
        print("\nDataset structure valid.")

verify_dataset('pid_symbols.yaml')
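
Because tiles from the same sheet overlap by 20%, a random tile-level split can place nearly identical pixels in both train and val and inflate validation scores. A minimal sketch of a sheet-level 80/20 split follows. It assumes the file naming shown above, where everything before "_tile_" identifies the source sheet; split_by_sheet is a hypothetical helper name.

```python
import random
from collections import defaultdict


def split_by_sheet(tile_names: list[str], val_fraction: float = 0.2,
                   seed: int = 0) -> tuple[list[str], list[str]]:
    """Split tile filenames into train/val so that no sheet spans both sets."""
    by_sheet = defaultdict(list)
    for name in tile_names:
        sheet_id = name.split("_tile_")[0]   # e.g. 'pid_001'
        by_sheet[sheet_id].append(name)

    sheets = sorted(by_sheet)
    random.Random(seed).shuffle(sheets)      # reproducible shuffle

    n_val = max(1, round(len(sheets) * val_fraction))
    val_sheets = set(sheets[:n_val])

    train = [n for s in sheets if s not in val_sheets for n in by_sheet[s]]
    val = [n for s in sheets if s in val_sheets for n in by_sheet[s]]
    return train, val


if __name__ == "__main__":
    names = [f"pid_{s:03d}_tile_0_{t}.png" for s in range(10) for t in range(4)]
    train, val = split_by_sheet(names)
    print(len(train), "train tiles,", len(val), "val tiles")
```

The returned lists can then be used to move each image and its matching .txt file into the train or val folders.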

Step 4 — Training Configuration

YOLOv8 has multiple model sizes. For P&ID symbol detection:

Model	Parameters	Speed	Accuracy	Best for
yolov8n	3.2M	Fastest	Lowest	Prototyping only
yolov8s	11.2M	Fast	Good	Quick validation
yolov8m	25.9M	Moderate	Better	Recommended
yolov8l	43.7M	Slow	High	High accuracy needs
yolov8x	68.2M	Slowest	Highest	Maximum accuracy

Use yolov8m as your starting point. It balances training time and accuracy well for P&ID-sized datasets.


from ultralytics import YOLO

# Load pretrained model (downloads ~50MB weights)
model = YOLO('yolov8m.pt')

# Train on P&ID symbol dataset
results = model.train(
    data='pid_symbols.yaml',

    # Image size — critical for P&ID tiles
    imgsz=1280,          # Must match your tile size

    # Training duration
    epochs=150,
    patience=30,         # Early stopping if no improvement

    # Batch size — reduce if GPU OOM
    batch=8,             # RTX 3090: 8-16 | A100: 16-32

    # Optimisation
    optimizer='AdamW',
    lr0=0.001,           # Initial learning rate
    lrf=0.01,            # Final LR = lr0 * lrf
    warmup_epochs=5,

    # Augmentation — critical for P&ID robustness
    augment=True,
    degrees=15,          # Rotation (P&ID symbols can be rotated)
    scale=0.5,           # Scale variation
    fliplr=0.5,          # Horizontal flip
    flipud=0.0,          # No vertical flip (text would invert)
    mosaic=0.8,          # Mosaic augmentation
    copy_paste=0.3,      # Copy-paste augmentation (Ultralytics documents this as segmentation-only)

    # Device
    device='cuda',       # 'cpu' if no GPU

    # Output
    project='pid_detection',
    name='yolov8m_run1',
    save=True,
    plots=True,

    # Multi-scale training (improves small object detection)
    multi_scale=True,
)

print(f"Best mAP50: {results.results_dict['metrics/mAP50(B)']:.3f}")

Key Training Parameters for P&IDs

imgsz=1280: Do not use 640. P&ID symbols are small relative to the tile. At 640px input, each 1280×1280 tile is downscaled by half, so a symbol that is 40×40 pixels in the tile becomes 20×20, which is below the reliable detection threshold for most models.

degrees=15: P&ID symbols are sometimes drawn at slight angles, especially in scanned legacy documents. Rotation augmentation makes the model robust to this.

flipud=0.0: Never flip vertically. Instrument tags and symbol labels would become mirrored, upside-down text, confusing the model. Horizontal flips (fliplr=0.5) mirror text as well, so consider lowering fliplr if your classes contain lettering or direction-sensitive symbols.

multi_scale=True: Trains on randomly resized images within ±50% of imgsz. Significantly improves small object detection.
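
The size arithmetic behind these settings is easy to check by hand. The sketch below uses illustrative helper names and assumes the ±50% range stated above for multi-scale training, rather than reading it from the library.

```python
def symbol_px_after_resize(symbol_px: float, tile_px: int, imgsz: int) -> float:
    """Side length of a symbol once a tile_px tile is resized to imgsz."""
    return symbol_px * imgsz / tile_px


def multi_scale_range(imgsz: int, spread: float = 0.5) -> tuple[int, int]:
    """Smallest and largest training sizes for a +/- spread around imgsz."""
    return int(imgsz * (1 - spread)), int(imgsz * (1 + spread))


if __name__ == "__main__":
    for size in (640, 1280):
        px = symbol_px_after_resize(40, tile_px=1280, imgsz=size)
        print(f"imgsz={size}: a 40px symbol becomes {px:.0f}px")
    print("multi-scale range:", multi_scale_range(1280))
```

At imgsz=1280 the multi-scale range runs from 640 to 1920 pixels, so the largest batches need noticeably more GPU memory than a fixed 1280 run. The same kind of check applies to the learning rate: with lr0=0.001 and lrf=0.01, the schedule ends at 0.001 × 0.01 = 0.00001.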

Step 5 — Monitor Training

Training outputs are saved to pid_detection/yolov8m_run1/. Key files to watch:


pid_detection/yolov8m_run1/
├── weights/
│   ├── best.pt       ← Use this for inference
│   └── last.pt       ← Last epoch checkpoint
├── results.csv       ← Metrics per epoch
├── results.png       ← Loss + mAP curves
├── confusion_matrix.png
└── PR_curve.png

Healthy training looks like:

Box loss and classification loss decrease steadily for ~50 epochs
mAP50 climbs above 0.80 by epoch 100
No divergence or plateau before epoch 50

If mAP plateaus below 0.70 at epoch 50:

Add more training samples (most common fix)
Increase epochs to 200
Check annotation quality — mislabelled samples are more damaging than fewer samples
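
These checks can be automated by reading results.csv. The following is a minimal sketch: it assumes the file has an epoch column and the metrics/mAP50(B) column used earlier, and it strips column names in case the file pads them with spaces. summarise_results is a hypothetical helper name.

```python
import csv
import io


def summarise_results(csv_text: str,
                      metric: str = "metrics/mAP50(B)") -> dict:
    """Report the best and latest value of a metric from results.csv text."""
    rows = []
    for raw in csv.DictReader(io.StringIO(csv_text)):
        row = {k.strip(): v.strip() for k, v in raw.items()}
        rows.append((int(float(row["epoch"])), float(row[metric])))

    best_epoch, best_value = max(rows, key=lambda r: r[1])
    last_epoch, last_value = rows[-1]
    return {
        "best_epoch": best_epoch,
        "best_value": best_value,
        "last_epoch": last_epoch,
        "last_value": last_value,
        "epochs_since_best": last_epoch - best_epoch,
    }


if __name__ == "__main__":
    # Made-up numbers, only to show the call.
    sample = "epoch, metrics/mAP50(B)\n1, 0.41\n2, 0.63\n3, 0.61\n"
    print(summarise_results(sample))
```

In practice you would pass the file contents, for example Path('pid_detection/yolov8m_run1/results.csv').read_text(). A large epochs_since_best value together with a best_value below 0.70 matches the plateau case described above.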

Step 6 — Tiled Inference on Full P&ID Sheets

The biggest production challenge: running inference on a full P&ID sheet that is 7000+ pixels wide.


import cv2
import numpy as np
from ultralytics import YOLO
from pathlib import Path

model = YOLO('pid_detection/yolov8m_run1/weights/best.pt')

def detect_pid_symbols(
    image_path: str,
    tile_size: int = 1280,
    overlap: float = 0.2,
    conf_threshold: float = 0.35,
    iou_threshold: float = 0.45
) -> list[dict]:
    """
    Run tiled inference on a full P&ID sheet.
    Handles overlapping tiles via global NMS.
    """
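    img = cv2.imread(image_path)
    if img is None:  # cv2.imread returns None instead of raising on failure
        raise FileNotFoundError(f"Could not read image: {image_path}")
    h, w = img.shape[:2]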
    step = int(tile_size * (1 - overlap))

    all_detections = []

    for y in range(0, h, step):
        for x in range(0, w, step):
            x2 = min(x + tile_size, w)
            y2 = min(y + tile_size, h)
            tile = img[y:y2, x:x2]

            # Pad edge tiles
            if tile.shape[0] < tile_size or tile.shape[1] < tile_size:
                padded = np.zeros((tile_size, tile_size, 3), dtype=np.uint8)
                padded[:tile.shape[0], :tile.shape[1]] = tile
                tile = padded

            # Run inference on this tile
            results = model.predict(
                tile,
                imgsz=tile_size,   # run at full tile resolution
                conf=conf_threshold,
                iou=iou_threshold,
                verbose=False
            )

            # Convert tile-local coordinates to global image coordinates
            for result in results:
                for box in result.boxes:
                    bx1, by1, bx2, by2 = box.xyxy[0].tolist()

                    # Offset back to global coordinates
                    gx1 = x + bx1
                    gy1 = y + by1
                    gx2 = x + bx2
                    gy2 = y + by2

                    # Skip detections in padding area
                    if gx1 >= w or gy1 >= h:
                        continue

                    all_detections.append({
                        'class_id': int(box.cls[0]),
                        'class_name': model.names[int(box.cls[0])],
                        'confidence': float(box.conf[0]),
                        'bbox_global': [gx1, gy1, gx2, gy2],
                        'center': [(gx1 + gx2) / 2, (gy1 + gy2) / 2]
                    })

    # Apply global NMS to remove duplicate detections from overlapping tiles
    all_detections = apply_global_nms(all_detections, iou_threshold=0.4)

    return all_detections


def apply_global_nms(detections: list[dict],
                     iou_threshold: float = 0.4) -> list[dict]:
    """
    Remove duplicate detections from overlapping tiles using NMS.
    """
    if not detections:
        return []

    boxes = np.array([d['bbox_global'] for d in detections])
    scores = np.array([d['confidence'] for d in detections])
    class_ids = np.array([d['class_id'] for d in detections])

    keep = []
    for cls_id in np.unique(class_ids):
        cls_mask = class_ids == cls_id
        cls_boxes = boxes[cls_mask]
        cls_scores = scores[cls_mask]
        cls_indices = np.where(cls_mask)[0]

        # NMS per class
        nms_keep = nms(cls_boxes, cls_scores, iou_threshold)
        keep.extend([cls_indices[i] for i in nms_keep])

    return [detections[i] for i in sorted(keep)]


def nms(boxes: np.ndarray, scores: np.ndarray,
        threshold: float) -> list[int]:
    """Standard Non-Maximum Suppression."""
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)

        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0, xx2 - xx1)
        h = np.maximum(0, yy2 - yy1)
        inter = w * h

        iou = inter / (areas[i] + areas[order[1:]] - inter)
        order = order[1:][iou <= threshold]

    return keep
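
To see why global NMS removes tile duplicates, it helps to compute the IoU for one pair by hand. The sketch below uses made-up coordinates for the same valve detected in two adjacent tiles, already converted to global coordinates; the iou helper is a plain-Python counterpart of the vectorised calculation inside nms().

```python
def iou(a: list[float], b: list[float]) -> float:
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    from_tile_a = [1000, 500, 1040, 540]   # hypothetical detection
    from_tile_b = [1002, 501, 1041, 541]   # same valve, neighbouring tile
    print(f"IoU = {iou(from_tile_a, from_tile_b):.3f}")
```

The two boxes overlap with an IoU of about 0.88, well above the 0.4 threshold, so only the higher-confidence detection survives. A symbol that is cut in half by a tile edge produces a much smaller partial box with a lower IoU, which is one reason duplicates can persist and the threshold sometimes needs tuning.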

Step 7 — Associate Symbols with Instrument Tags

Detecting a valve is only half the job. The valve needs to be linked to its instrument tag — the text label nearby that identifies it as FCV-201 or XV-103.

This is done by spatial proximity: for each detected symbol, find the nearest OCR text block and associate them.



def associate_tags_to_symbols(
    symbols: list[dict],
    ocr_words: list[dict],
    max_distance_px: int = 80
) -> list[dict]:
    """
    Associate each detected symbol with its nearest instrument tag
    from the OCR output.

    symbols: list of detections from detect_pid_symbols()
    ocr_words: list of {text, center, confidence, bbox} from OCR pipeline
    max_distance_px: max pixel distance to search for a tag
    """
    import re

    # Instrument tag pattern (ISA 5.1)
    tag_pattern = re.compile(
        r'\b[A-Z]{1,4}-\d{3,5}[A-Z]?\b'  # e.g. FIC-201, XV-1032A
    )

    enriched = []
    for symbol in symbols:
        sx, sy = symbol['center']
        nearest_tag = None
        nearest_tag_conf = 0.0
        min_dist = float('inf')

        for word in ocr_words:
            # Only consider instrument tag-formatted text
            if not tag_pattern.match(word['text']):
                continue

            wx, wy = word['center']
            dist = ((wx - sx) ** 2 + (wy - sy) ** 2) ** 0.5

            if dist < min_dist and dist <= max_distance_px:
                min_dist = dist
                nearest_tag = word['text']
                nearest_tag_conf = word['confidence']

        enriched.append({
            **symbol,
            'instrument_tag': nearest_tag,
            'tag_confidence': nearest_tag_conf,
            'tag_distance_px': round(min_dist, 1) if nearest_tag else None
        })

    return enriched

Output example:



{
  "class_name": "control_valve",
  "confidence": 0.94,
  "bbox_global": [1240, 880, 1310, 950],
  "center": [1275, 915],
  "instrument_tag": "FCV-201",
  "tag_confidence": 0.91,
  "tag_distance_px": 38.2
}
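
Nearest-neighbour matching lets two symbols claim the same tag when they sit close together. One possible refinement, shown here as a sketch and not as the method used above, is a greedy one-to-one assignment that fixes the closest symbol/tag pairs first. The function name assign_tags_one_to_one is hypothetical, and the inputs only need a center field like the dicts used earlier.

```python
import math


def assign_tags_one_to_one(symbols: list[dict], tags: list[dict],
                           max_distance_px: float = 80) -> dict[int, int]:
    """Greedy matching: each tag is given to at most one symbol.

    Returns a mapping of symbol index -> tag index.
    """
    pairs = []
    for si, sym in enumerate(symbols):
        for ti, tag in enumerate(tags):
            dist = math.dist(sym["center"], tag["center"])
            if dist <= max_distance_px:
                pairs.append((dist, si, ti))

    assignment: dict[int, int] = {}
    used_tags: set[int] = set()
    for dist, si, ti in sorted(pairs):        # closest pairs first
        if si in assignment or ti in used_tags:
            continue
        assignment[si] = ti
        used_tags.add(ti)
    return assignment


if __name__ == "__main__":
    symbols = [{"center": [100, 100]}, {"center": [150, 100]}]
    tags = [{"text": "FCV-201", "center": [120, 100]},
            {"text": "XV-103", "center": [190, 100]}]
    for si, ti in assign_tags_one_to_one(symbols, tags).items():
        print(f"symbol {si} -> {tags[ti]['text']}")
```

In this made-up layout both symbols are nearest to FCV-201, but the one-to-one pass gives FCV-201 to the closer symbol and XV-103 to the other. Greedy matching is not guaranteed to be globally optimal, so treat it as a starting point.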

Step 8 — Evaluation: Precision, Recall & mAP

Evaluate your trained model systematically. Never deploy based on visual inspection alone.


from ultralytics import YOLO

model = YOLO('pid_detection/yolov8m_run1/weights/best.pt')

# Evaluate on test set
metrics = model.val(
    data='pid_symbols.yaml',
    split='test',
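    conf=0.35,           # Match the inference threshold (Ultralytics' val default is 0.001)
    iou=0.50,            # NMS IoU threshold, not the IoU used for mAP50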
    imgsz=1280,
    verbose=True
)

print(f"mAP50:    {metrics.box.map50:.3f}")
print(f"mAP50-95: {metrics.box.map:.3f}")
print(f"Precision: {metrics.box.mp:.3f}")
print(f"Recall:    {metrics.box.mr:.3f}")

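# Per-class breakdown. ap50 is ordered by ap_class_index, which lists only
# the classes present in the evaluated split, so index by position.
for pos, cls_id in enumerate(metrics.box.ap_class_index):
    cls_name = model.names[int(cls_id)]
    print(f"  {cls_name:30s} AP50: {metrics.box.ap50[pos]:.3f}")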

Production Benchmarks to Target

Metric	Acceptable	Good	Production-ready
mAP50	>0.70	>0.82	>0.90
Precision	>0.75	>0.85	>0.92
Recall	>0.70	>0.82	>0.88

If recall is low but precision is high, lower the confidence threshold. If precision is low, raise it. The right threshold depends on your use case — high recall matters more when missing a symbol is worse than a false positive, which is usually the case in engineering documents.
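
The trade-off can be made concrete with a small threshold sweep. The sketch below assumes you already have each detection's confidence and whether it matched a ground-truth symbol; the numbers are invented for illustration and precision_recall is a hypothetical helper.

```python
def precision_recall(detections: list[tuple[float, bool]],
                     n_ground_truth: int,
                     threshold: float) -> tuple[float, float]:
    """Precision and recall when keeping detections with conf >= threshold.

    detections: (confidence, is_correct) pairs, already matched to ground truth.
    """
    kept = [ok for conf, ok in detections if conf >= threshold]
    tp = sum(kept)                                   # True counts as 1
    precision = tp / len(kept) if kept else 0.0
    recall = tp / n_ground_truth if n_ground_truth else 0.0
    return precision, recall


if __name__ == "__main__":
    # Invented example: 7 detections against 6 ground-truth symbols.
    dets = [(0.95, True), (0.90, True), (0.80, True), (0.60, False),
            (0.50, True), (0.40, False), (0.30, True)]
    for thr in (0.70, 0.35, 0.25):
        p, r = precision_recall(dets, n_ground_truth=6, threshold=thr)
        print(f"conf >= {thr:.2f}: precision={p:.2f} recall={r:.2f}")
```

In this example, lowering the threshold from 0.70 to 0.25 raises recall from 0.50 to about 0.83 while precision drops from 1.00 to about 0.71, which is the direction you usually want when a missed symbol costs more than a false alarm.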

Step 9 — Export for Production Deployment

Export the trained model to ONNX for cloud-agnostic deployment:



from ultralytics import YOLO

model = YOLO('pid_detection/yolov8m_run1/weights/best.pt')

# Export to ONNX (fastest cross-platform inference)
model.export(
    format='onnx',
    imgsz=1280,
    opset=17,
    simplify=True,
    dynamic=False
)

# Or TorchScript for PyTorch serving
model.export(format='torchscript', imgsz=1280)

# Or TensorRT for NVIDIA GPU deployment (fastest on GPU)
model.export(format='engine', imgsz=1280, half=True)  # FP16

Load ONNX model for inference without Ultralytics dependency:


import onnxruntime as ort
import numpy as np
import cv2

session = ort.InferenceSession(
    'best.onnx',
    providers=['CUDAExecutionProvider', 'CPUExecutionProvider']
)

def preprocess_for_onnx(img: np.ndarray, size: int = 1280) -> np.ndarray:
    img = cv2.resize(img, (size, size))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = img.astype(np.float32) / 255.0
    img = np.transpose(img, (2, 0, 1))
    return np.expand_dims(img, axis=0)
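
Without Ultralytics you also have to decode the raw output yourself. The sketch below assumes the standard YOLOv8 detection export, whose single output has shape (1, 4 + nc, N): four rows of cx, cy, w, h in input-image pixels followed by one row of scores per class. decode_yolov8_output is an illustrative name, and the result still needs NMS, for example with the nms() function from Step 6.

```python
import numpy as np


def decode_yolov8_output(output: np.ndarray,
                         conf_threshold: float = 0.35) -> list[dict]:
    """Decode a raw YOLOv8 detection output of shape (1, 4 + nc, N).

    Returns corner boxes in input-image pixels. NMS is not applied here.
    """
    preds = output[0].T                  # (N, 4 + nc): one row per candidate
    boxes_cxcywh = preds[:, :4]
    class_scores = preds[:, 4:]

    class_ids = class_scores.argmax(axis=1)
    confidences = class_scores.max(axis=1)
    keep = confidences >= conf_threshold

    detections = []
    for (cx, cy, w, h), cls, conf in zip(boxes_cxcywh[keep],
                                         class_ids[keep],
                                         confidences[keep]):
        detections.append({
            "class_id": int(cls),
            "confidence": float(conf),
            "bbox": [float(cx - w / 2), float(cy - h / 2),
                     float(cx + w / 2), float(cy + h / 2)],
        })
    return detections


# Typical call, assuming the session and preprocess_for_onnx defined above:
#   input_name = session.get_inputs()[0].name
#   raw = session.run(None, {input_name: preprocess_for_onnx(tile)})[0]
#   detections = decode_yolov8_output(raw)
```

Because preprocess_for_onnx resizes to a square, the decoded boxes are in the resized image's coordinates; for 1280×1280 tiles that is the same as tile coordinates.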

Complete Pipeline: P&ID to Structured Output

Putting it all together — from raw P&ID image to structured JSON:



def process_pid_complete(
    image_path: str,
    ocr_words: list[dict]
) -> dict:
    """
    Full pipeline: P&ID image → detected symbols → associated tags → JSON
    """

    # 1. Detect symbols
    symbols = detect_pid_symbols(image_path)

    # 2. Associate with instrument tags from OCR
    enriched = associate_tags_to_symbols(symbols, ocr_words)

    # 3. Group by symbol class
    by_class = {}
    for sym in enriched:
        cls = sym['class_name']
        by_class.setdefault(cls, []).append({
            'tag': sym['instrument_tag'],
            'confidence': round(sym['confidence'], 3),
            'bbox': sym['bbox_global']
        })

    # 4. Summary statistics
    total = len(enriched)
    with_tags = sum(1 for s in enriched if s['instrument_tag'])
    avg_conf = sum(s['confidence'] for s in enriched) / total if total else 0

    return {
        'symbol_count': total,
        'tagged_count': with_tags,
        'tagging_rate': round(with_tags / total, 3) if total else 0,
        'avg_confidence': round(avg_conf, 3),
        'symbols_by_class': by_class,
        'all_detections': enriched
    }

Sample output:



{
  "symbol_count": 147,
  "tagged_count": 138,
  "tagging_rate": 0.939,
  "avg_confidence": 0.887,
  "symbols_by_class": {
    "control_valve": [
      { "tag": "FCV-201", "confidence": 0.94, "bbox": [1240, 880, 1310, 950] },
      { "tag": "PCV-301", "confidence": 0.91, "bbox": [2100, 1240, 2170, 1310] }
    ],
    "pump_centrifugal": [
      { "tag": "P-101A", "confidence": 0.96, "bbox": [540, 1820, 650, 1930] }
    ]
  }
}
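
A structured result like this is straightforward to flatten into an instrument index. The following sketch is one possible downstream step, not part of the pipeline above: to_instrument_index is a hypothetical helper that emits CSV text with one row per tagged symbol, sorted by tag.

```python
import csv
import io


def to_instrument_index(result: dict) -> str:
    """Flatten 'symbols_by_class' into CSV text, one row per tagged symbol."""
    rows = []
    for class_name, items in result["symbols_by_class"].items():
        for item in items:
            if item["tag"]:                      # skip untagged symbols
                rows.append((item["tag"], class_name, item["confidence"]))

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["tag", "class_name", "confidence"])
    writer.writerows(sorted(rows))               # sorted by tag text
    return buffer.getvalue()


if __name__ == "__main__":
    result = {
        "symbols_by_class": {
            "control_valve": [
                {"tag": "FCV-201", "confidence": 0.94, "bbox": [1240, 880, 1310, 950]},
                {"tag": None, "confidence": 0.52, "bbox": [300, 300, 340, 340]},
            ],
            "pump_centrifugal": [
                {"tag": "P-101A", "confidence": 0.96, "bbox": [540, 1820, 650, 1930]},
            ],
        }
    }
    print(to_instrument_index(result))
```

Symbols without a tag are skipped here; in a review workflow you might instead write them to a separate list for manual checking.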

Common Issues & Fixes

Low recall on small symbols (valves <40px) → Train at imgsz=1280 if you are not already, or go up to 1600 if GPU memory allows. Add more annotated examples of small instances. Enable multi_scale=True.

False positives on pipeline lines → Add a pipeline_line class and annotate it as a negative class. This teaches the model what pipeline lines look like so it stops confusing them with symbols.

Model fails on a different company's P&IDs → Domain shift is expected. Annotate 30–50 samples from the new P&ID set and fine-tune the existing model (transfer learning) rather than retraining from scratch:


model = YOLO('pid_detection/yolov8m_run1/weights/best.pt')  # Load existing
model.train(data='new_company_pid.yaml', epochs=50, lr0=0.0001)  # Fine-tune

Duplicate detections from overlapping tiles → The apply_global_nms() function in Step 6 handles this. Tune iou_threshold downward (0.3) if duplicates persist.

GPU out of memory → Reduce batch from 8 to 4 or 2. Or reduce imgsz from 1280 to 960 as a compromise.

What This Pipeline Doesn't Cover

Symbol detection gives you a list of detected symbols with bounding boxes and instrument tags. For a complete P&ID digitisation system you also need:

Line detection — identifying pipeline connections between symbols (graph extraction)
Line type classification — distinguishing process lines, signal lines, utility lines
Connection graph construction — building the P&ID as a graph where nodes are instruments/equipment and edges are pipelines

These are covered in the complete document intelligence pipeline guide and at docprocessing360.com, where the full stack runs live.

Live Demo

The symbol detection model described in this guide runs as part of the complete document intelligence stack at:

👉 docprocessing360.com

Upload a scanned P&ID and see detected symbols highlighted with bounding boxes, class labels, confidence scores, and associated instrument tags — in real time.

Build It With Codersarts

We train, deploy, and maintain custom YOLOv8 symbol detection models for engineering clients — including fine-tuning for company-specific P&ID symbol libraries, integration with OCR pipelines, and active learning systems that improve accuracy over time.

🌐 ai.codersarts.com
🔗 Live Demo: docprocessing360.com
💼 C2C / Contract engagements available
