Improper results on scanned PDFs

I have been using Layout Parser to analyze different types of documents. I get the expected results on true PDFs but not on scanned PDFs: the contents of the scanned page image are detected as a Figure, or the results are otherwise not as expected.

The issue occurs only with scanned PDFs.

**Checklist**

1. I have searched related issues but cannot get the expected help.
2. The bug has not been fixed in the latest version, see the [Layout Parser Releases](https://github.com/Layout-Parser/layout-parser/releases/)

**To Reproduce**

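```python
import layoutparser as lp
import cv2

# Load the page image and convert OpenCV's BGR channel order to RGB
image = cv2.imread("test.png")
image = image[..., ::-1]

# PubLayNet Faster R-CNN model; keep only detections scoring at least 0.8
model = lp.models.Detectron2LayoutModel('lp://PubLayNet/faster_rcnn_R_50_FPN_3x/config',
                                 extra_config=["MODEL.ROI_HEADS.SCORE_THRESH_TEST", 0.8],
                                 label_map={0: "Text", 1: "Title", 2: "List", 3: "Table", 4: "Figure"})

# One outline color per block type
color_map = {
    'Text': 'red',
    'Title': 'blue',
    'List': 'green',
    'Table': 'purple',
    'Figure': 'pink',
}

# Detect layout blocks on the page
layout = model.detect(image)

# Draw the detected boxes on the image
lp.draw_box(image, layout, box_width=3, color_map=color_map)
```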
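To put a number on "detected as figure", the detections could be summarized as the share of the page area covered by each label. This is a minimal sketch and not part of Layout Parser: `area_share_by_label` is a hypothetical helper, and the boxes in the usage lines are made-up values for illustration. It assumes each detected block can be reduced to a `(label, x1, y1, x2, y2)` tuple, for example from a block's `type` and `coordinates`.

```python
from collections import defaultdict


def area_share_by_label(blocks, page_width, page_height):
    """Return {label: fraction of the page area covered by that label's boxes}.

    `blocks` is an iterable of (label, x1, y1, x2, y2) tuples in pixels.
    Overlapping boxes are not de-duplicated, so shares can sum to more than 1.
    """
    page_area = float(page_width) * float(page_height)
    if page_area <= 0:
        raise ValueError("page dimensions must be positive")

    shares = defaultdict(float)
    for label, x1, y1, x2, y2 in blocks:
        # Clamp negative extents to zero so malformed boxes add no area
        width = max(0.0, x2 - x1)
        height = max(0.0, y2 - y1)
        shares[label] += (width * height) / page_area
    return dict(shares)


# Illustrative values only: one box spanning the whole 1000 x 1400 page
example_blocks = [("Figure", 0, 0, 1000, 1400)]
print(area_share_by_label(example_blocks, page_width=1000, page_height=1400))
```

A "Figure" share close to 1 on a scanned page, compared with a much smaller share on the true PDF, would confirm that the model treats the whole scan as a single figure.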

**Environment**
1. I am using Windows
2. Latest Layout Parser version


Two images are attached:

1. Scanned PDF image result

![error](https://github.com/Layout-Parser/layout-parser/assets/88659756/955be63a-d290-485e-8eb5-c7edc56ef1af)

2. Proper PDF image result

![positive](https://github.com/Layout-Parser/layout-parser/assets/88659756/bc8655da-d478-4b67-be44-4864cd4f79ba)



Improper results on scanned pdfs #193
