bad result detected #114

DamonsJ · 2022-01-18T03:46:08Z

I got bad results using layout-parser.
Here is the image I used:

Here is the code, run in Python:

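import cv2
import layoutparser as lp

# NOTE: drawRectangleInImage (used below) is a user-defined helper not shown in this post.
image = cv2.imread("1.png")
# Keep a BGR copy for drawing and display, since cv2.imshow expects BGR
origin_image = image.copy()
# Convert the image from BGR (cv2 default loading style) to RGB for the model
image = image[..., ::-1]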

# Load the deep layout model from the layoutparser API.
# For all the supported models, please check the Model
# Zoo Page: https://layout-parser.readthedocs.io/en/latest/notes/modelzoo.html
model = lp.Detectron2LayoutModel('lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config',
                             extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                             label_map={0: "Text", 1: "Title", 2: "List", 3:"Table", 4:"Figure"})

# Detect the layout of the input image
layout = model.detect(image)
# print("layout : ", layout)
text_blocks = lp.Layout([b for b in layout if b.type=='Text'])
drawRectangleInImage(origin_image, text_blocks, (36,255,12))

titles_blocks = lp.Layout([b for b in layout if b.type=='Title'])
drawRectangleInImage(origin_image, titles_blocks, (76, 155, 175))

figure_blocks = lp.Layout([b for b in layout if b.type=='Figure'])
drawRectangleInImage(origin_image, figure_blocks, (122, 96, 216))

lists_blocks = lp.Layout([b for b in layout if b.type=='List'])
drawRectangleInImage(origin_image, lists_blocks, (176, 155, 175))

tables_blocks = lp.Layout([b for b in layout if b.type=='Table'])
drawRectangleInImage(origin_image, tables_blocks, (76, 255, 75))

cv2.imshow('image', origin_image)
cv2.waitKey()
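
The five filter-and-draw pairs above could be collapsed into a single loop. The following is a minimal sketch, not the reporter's actual code: `group_blocks_by_type`, `draw_blocks`, `draw_layout`, and `TYPE_COLORS` are hypothetical names, and `draw_blocks` is only a guess at what the undefined `drawRectangleInImage` helper might do. It assumes each block exposes `.type` and `.coordinates` (x1, y1, x2, y2), as layoutparser's `TextBlock` does.

```python
from collections import defaultdict

# Colors copied from the snippet above; OpenCV interprets them as BGR.
TYPE_COLORS = {
    "Text": (36, 255, 12),
    "Title": (76, 155, 175),
    "Figure": (122, 96, 216),
    "List": (176, 155, 175),
    "Table": (76, 255, 75),
}


def group_blocks_by_type(layout):
    """Group detected blocks by their `type` attribute in a single pass."""
    groups = defaultdict(list)
    for block in layout:
        groups[block.type].append(block)
    return dict(groups)


def draw_blocks(image, blocks, color, thickness=2):
    """Hypothetical stand-in for drawRectangleInImage: outline each block in place."""
    import cv2  # imported lazily so the grouping helper works without OpenCV

    for block in blocks:
        x1, y1, x2, y2 = (int(v) for v in block.coordinates)
        cv2.rectangle(image, (x1, y1), (x2, y2), color, thickness)
    return image


def draw_layout(image, layout, colors=TYPE_COLORS):
    """Draw every block whose type has a color assigned; skip unknown types."""
    for block_type, blocks in group_blocks_by_type(layout).items():
        if block_type in colors:
            draw_blocks(image, blocks, colors[block_type])
    return image
```

Under these assumptions, a single call such as `draw_layout(origin_image, layout)` would take the place of the repeated blocks before `cv2.imshow`.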

Here is the result:

By the way, some warnings were generated:

/usr/local/lib/python3.9/site-packages/detectron2/structures/image_list.py:99: UserWarning: floordiv is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
max_size = (max_size + (stride - 1)) // stride * stride
/usr/local/lib/python3.9/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
return _VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
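
Both messages above are `UserWarning` deprecation notices raised inside detectron2 and PyTorch, not errors, so they are unlikely to explain the detection quality. If they clutter the output, they can be filtered with the standard `warnings` module. This is a minimal sketch; the function name `silence_known_torch_warnings` is hypothetical, and the patterns are based only on the text quoted above.

```python
import warnings


def silence_known_torch_warnings():
    """Ignore the two UserWarnings quoted above; other warnings still show."""
    # `message` is a regex matched against the start of the warning text.
    # The leading `.*` tolerates a dunder spelling such as `__floordiv__`.
    warnings.filterwarnings("ignore", message=r".*floordiv", category=UserWarning)
    warnings.filterwarnings("ignore", message=r"torch\.meshgrid", category=UserWarning)


# Call this once, before running the model.
silence_known_torch_warnings()
```

Filtering by message rather than ignoring all `UserWarning`s keeps unrelated warnings visible.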

lolipopshock · 2022-01-18T21:12:14Z

Thank you for reporting this -- it can be easily resolved by reconfiguring the model's hyperparameters. One example is: https://github.com/allenai/VILA/blob/96cafe591ae6ee8a70f941a52dd37bbe0a60b243/datasets/s2-vl-utils/vision_model_loader.py#L140 .
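
As a rough illustration of what reconfiguring the hyperparameters can look like, the sketch below passes two Detectron2 overrides through `extra_config`. The threshold values are illustrative assumptions, not the values from the linked file, and `build_extra_config` and `load_publaynet_model` are hypothetical helper names. `MODEL.ROI_HEADS.SCORE_THRESH_TEST` sets the minimum confidence for a detection to be kept, and `MODEL.ROI_HEADS.NMS_THRESH_TEST` sets the overlap threshold used when suppressing duplicate boxes.

```python
def build_extra_config(score_thresh, nms_thresh):
    """Return a flat [key, value, key, value, ...] list of Detectron2 overrides."""
    return [
        "MODEL.ROI_HEADS.SCORE_THRESH_TEST", score_thresh,
        "MODEL.ROI_HEADS.NMS_THRESH_TEST", nms_thresh,
    ]


def load_publaynet_model(score_thresh=0.5, nms_thresh=0.5):
    """Load the PubLayNet model from the original post with custom thresholds.

    The default values here are placeholders to tune, not recommended settings.
    """
    import layoutparser as lp  # lazy import; needs layoutparser and detectron2

    return lp.Detectron2LayoutModel(
        "lp://PubLayNet/mask_rcnn_R_50_FPN_3x/config",
        extra_config=build_extra_config(score_thresh, nms_thresh),
        label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"},
    )
```

Lowering the score threshold from the 0.8 used in the original post lets lower-confidence regions through, which may recover missed blocks at the cost of more false positives.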

DamonsJ · 2022-01-19T02:24:08Z

Hi, thanks very much for replying.
I just want to recognize text, figures, and tables in published documents.
How should I adjust the parameters?
When I use the extra config from https://github.com/allenai/VILA/blob/96cafe591ae6ee8a70f941a52dd37bbe0a60b243/datasets/s2-vl-utils/vision_model_loader.py#L140 ,
I can recognize text and figures, but math equations are not recognized.

Thanks!

lolipopshock · 2022-01-21T16:02:42Z

There's a separate model, Layout-Parser/platform#20, which can be used for detecting equation regions. Also see the code here: https://github.com/allenai/VILA/blob/96cafe591ae6ee8a70f941a52dd37bbe0a60b243/datasets/s2-vl-utils/vision_model_loader.py#L150
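
One way to use a separate equation model alongside the general layout model is to run both on the same image and merge the detections. The sketch below is an assumption-laden illustration, not the code from the linked file: the config path `lp://MFD/faster_rcnn_R_50_FPN_3x/config` and the label map `{1: "Equation"}` are assumed to match the layoutparser Model Zoo entry and should be checked there, and `combine_detections` and `detect_with_equations` are hypothetical names.

```python
def combine_detections(*layouts):
    """Concatenate blocks from several model outputs into one flat list."""
    return [block for layout in layouts for block in layout]


def detect_with_equations(image, general_model):
    """Run a general layout model and an equation model on the same image."""
    import layoutparser as lp  # lazy import; needs layoutparser and detectron2

    equation_model = lp.Detectron2LayoutModel(
        "lp://MFD/faster_rcnn_R_50_FPN_3x/config",  # assumed Model Zoo path
        label_map={1: "Equation"},  # assumed label map
    )
    blocks = combine_detections(
        general_model.detect(image),
        equation_model.detect(image),
    )
    return lp.Layout(blocks)
```

The merged layout can then be filtered by `b.type == 'Equation'` in the same way as the other block types in the original post.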

DamonsJ added the bug label Jan 18, 2022

lolipopshock closed this as completed Jan 25, 2022
