Load COCO Layout Annotations
==============================================================

Preparation
-----------

In this notebook, I will illustrate how to use LayoutParser to load and
visualize layout annotations in the COCO format.

Before starting, please remember to download the PubLayNet annotations
and images from their
`website <https://dax-cdn.cdn.appdomain.cloud/dax-publaynet/1.0.0/PubLayNet.html>`__
(let’s just use the validation set for now, as the training set is very
large). Put the extracted files in the ``data/publaynet/annotations``
and ``data/publaynet/val`` folders.

We also need to install an additional library for conveniently handling
the COCO data format:

.. code:: bash

    pip install pycocotools

OK, let’s get to the code:

Loading and visualizing layouts using Layout-Parser
---------------------------------------------------

.. code:: python

    from pycocotools.coco import COCO
    import layoutparser as lp
    import random
    import cv2

.. code:: python

    def load_coco_annotations(annotations, coco=None):
        """
        Args:
            annotations (List):
                a list of COCO annotations for the current image
            coco (`optional`, defaults to `None`):
                COCO annotation object instance. If set, this function will
                convert the loaded annotation category ids to the category
                names stored in coco.cats
        """
        layout = lp.Layout()

        for ele in annotations:

            # COCO boxes are [x, y, width, height]; lp.Rectangle expects
            # the corner coordinates (x_1, y_1, x_2, y_2)
            x, y, w, h = ele['bbox']

            layout.append(
                lp.TextBlock(
                    block = lp.Rectangle(x, y, w+x, h+y),
                    type  = ele['category_id'] if coco is None else coco.cats[ele['category_id']]['name'],
                    id = ele['id']
                )
            )

        return layout

The ``load_coco_annotations`` function converts COCO annotations into
layoutparser objects.

.. code:: python

    COCO_ANNO_PATH = 'data/publaynet/annotations/val.json'
    COCO_IMG_PATH  = 'data/publaynet/val'

    coco = COCO(COCO_ANNO_PATH)

.. parsed-literal::

    loading annotations into memory...
    Done (t=1.17s)
    creating index...
    index created!

.. code:: python

    color_map = {
        'text':   'red',
        'title':  'blue',
        'list':   'green',
        'table':  'purple',
        'figure': 'pink',
    }

    # random.sample needs a sequence, so convert the dict keys to a list
    for image_id in random.sample(list(coco.imgs.keys()), 1):
        image_info = coco.imgs[image_id]
        annotations = coco.loadAnns(coco.getAnnIds([image_id]))

        image = cv2.imread(f'{COCO_IMG_PATH}/{image_info["file_name"]}')
        layout = load_coco_annotations(annotations, coco)

        viz = lp.draw_box(image, layout, color_map=color_map)
        display(viz) # show the results

.. image:: output_8_0.png

You can add more information to the visualization, such as the id and
type of each block:

.. code:: python

    lp.draw_box(image,
                [b.set(id=f'{b.id}/{b.type}') for b in layout],
                color_map=color_map,
                show_element_id=True, id_font_size=10,
                id_text_background_color='grey',
                id_text_color='white')

.. image:: output_10_0.png

Model Predictions on loaded data
--------------------------------

We can also check how a trained layout model performs on the input
image. Following these
`instructions <https://github.com/Layout-Parser/layout-parser/blob/main/examples/Deep%20Layout%20Parsing.ipynb>`__,
we can conveniently load a layout prediction model and run predictions
on the existing image.

.. code:: python

    model = lp.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config',
                                     extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                     label_map={0: "text", 1: "title", 2: "list", 3:"table", 4:"figure"})

.. code:: python

    layout_predicted = model.detect(image)

.. code:: python

    lp.draw_box(image,
                [b.set(id=f'{b.type}/{b.score:.2f}') for b in layout_predicted],
                color_map=color_map,
                show_element_id=True, id_font_size=10,
                id_text_background_color='grey',
                id_text_color='white')

.. image:: output_15_0.png
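
The visual comparison above can be complemented with a rough numeric
check. The following is a minimal sketch and not part of LayoutParser:
the helper names ``coco_bbox_to_corners`` and ``iou`` are hypothetical,
and the box values are made up for illustration. It converts a COCO
``[x, y, w, h]`` box into corner coordinates (the same conversion that
``load_coco_annotations`` performs) and computes the intersection over
union (IoU) with a predicted box given as ``(x_1, y_1, x_2, y_2)``.

.. code:: python

    def coco_bbox_to_corners(bbox):
        """Convert a COCO [x, y, w, h] box into (x_1, y_1, x_2, y_2)."""
        x, y, w, h = bbox
        return (x, y, x + w, y + h)


    def iou(box_a, box_b):
        """Intersection over union of two (x_1, y_1, x_2, y_2) boxes."""
        # Corners of the overlapping region (may be empty)
        ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
        ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
        inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

        area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
        area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
        union = area_a + area_b - inter
        return inter / union if union > 0 else 0.0


    # Made-up example values, not taken from PubLayNet
    ground_truth = coco_bbox_to_corners([10, 20, 100, 50])  # (10, 20, 110, 70)
    predicted = (10, 20, 110, 45)  # covers the top half of the ground truth
    print(iou(ground_truth, predicted))  # 0.5

With real data, the ground-truth box would come from ``ele['bbox']`` in
the COCO annotations, and the predicted box from a block in
``layout_predicted``, assuming the block exposes its corner coordinates
as a ``coordinates`` tuple.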