mask

Mask creation and manipulation for page regions and text/line detection.

This module provides:

  • A helper function (box) to generate structuring elements for morphological ops.
  • A Mask class, which thresholds the page image, applies morphological operations, and intersects the result with a page mask (element-wise minimum).
  • An interface to retrieve the final contours from this mask.

Mask

Mask(name: str, small: ndarray, pagemask: ndarray, text: bool | str = True)

A thresholded mask builder for text (or line) detection.

Combines adaptive thresholding, morphological dilations/erosions, and a given pagemask to produce the final mask used for contour extraction.

Parameters:

  • name (str, required): A string identifier for debugging/logging.
  • small (ndarray, required): A reduced-size (downsampled) version of the original image.
  • pagemask (ndarray, required): A binary mask indicating the valid page region.
  • text (bool | str, default True): If True, process as text; if False, process as lines.

Source code in src/page_dewarp/mask.py
def __init__(
    self,
    name: str,
    small: np.ndarray,
    pagemask: np.ndarray,
    text: bool | str = True,
) -> None:
    """Initialize the Mask with the given image data and type.

    Args:
        name: A string identifier for debugging/logging.
        small: A reduced-size (downsampled) version of the original image.
        pagemask: A binary mask indicating the valid page region.
        text: If True, process as text; if False, process as lines.

    """
    self.name = name
    self.small = small
    self.pagemask = pagemask
    self.text = text
    self.calculate()

calculate

calculate() -> None

Apply adaptive thresholding and morphological ops to create self.value.

Steps:

  1. Convert self.small to grayscale.
  2. Use an adaptive threshold (binary inverse).
  3. Depending on self.text, dilate then erode (text) or erode then dilate (lines), logging each intermediate step.
  4. Combine with self.pagemask to finalize the mask (store in self.value).
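
Step 2 can be illustrated without OpenCV. The following is a minimal pure-Python sketch of a mean adaptive threshold with inverse binary output, assuming the rule OpenCV documents for ADAPTIVE_THRESH_MEAN_C with THRESH_BINARY_INV: a pixel becomes maxValue when it is at or below the local mean minus C. The function name adaptive_threshold_inv and the 5x5 test image are hypothetical, borders are handled by clamping indices (edge replication), and the mean is kept as a float, so results may differ slightly from cv2.adaptiveThreshold.

```python
def adaptive_threshold_inv(gray, block_size, c, max_value=255):
    """Inverse-binary mean adaptive threshold on a 2D list of ints (sketch)."""
    h, w = len(gray), len(gray[0])
    r = block_size // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    # Clamp indices so the window replicates edge pixels.
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += gray[yy][xx]
            local_mean = total / (block_size * block_size)
            # Inverse binary: dark-enough pixels become foreground.
            if gray[y][x] <= local_mean - c:
                out[y][x] = max_value
    return out


# Hypothetical 5x5 "page": bright background with one faint mark in the centre.
page = [[200] * 5 for _ in range(5)]
page[2][2] = 180

text_mask = adaptive_threshold_inv(page, block_size=3, c=25)  # C used for text
lines_mask = adaptive_threshold_inv(page, block_size=3, c=7)  # C used for lines

print(text_mask[2][2], lines_mask[2][2])
```

In this toy case the smaller offset used for line masks (C=7) picks up the faint mark, while the text offset (C=25) ignores it.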
Source code in src/page_dewarp/mask.py
def calculate(self) -> None:
    """Apply adaptive thresholding and morphological ops to create `self.value`.

    Steps:

    1. Convert `self.small` to grayscale.
    2. Use an adaptive threshold (binary inverse).
    3. Depending on `self.text`, dilate then erode (text) or erode then dilate (lines), logging each intermediate step.
    4. Combine with `self.pagemask` to finalize the mask (store in `self.value`).
    """
    sgray = cvtColor(self.small, COLOR_RGB2GRAY)
    mask = adaptiveThreshold(
        src=sgray,
        maxValue=255,
        adaptiveMethod=ADAPTIVE_THRESH_MEAN_C,
        thresholdType=THRESH_BINARY_INV,
        blockSize=cfg.ADAPTIVE_WINSZ,
        C=25 if self.text else 7,
    )
    self.log(0.1, "thresholded", mask)

    # If text, dilate horizontally; if lines, erode to remove noise
    mask = (
        dilate(mask, box(9, 1))
        if self.text
        else erode(mask, box(3, 1), iterations=3)
    )
    self.log(0.2, "dilated" if self.text else "eroded", mask)

    # If text, erode vertically; if lines, dilate further
    mask = erode(mask, box(1, 3)) if self.text else dilate(mask, box(8, 2))
    self.log(0.3, "eroded" if self.text else "dilated", mask)

    self.value = np.minimum(mask, self.pagemask)
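
The morphology and masking steps in the source above can be sketched on a toy array. This example assumes the text branch (dilate with box(9, 1), erode with box(1, 3), then np.minimum with the page mask) and swaps OpenCV's dilate and erode for a hypothetical NumPy helper, morph, that only handles odd-sized box kernels anchored at the centre. The canvas, glyph positions, and page mask are invented for illustration.

```python
import numpy as np


def box(width: int, height: int) -> np.ndarray:
    """Structuring element of ones with shape (height, width)."""
    return np.ones((height, width), dtype=np.uint8)


def morph(mask: np.ndarray, kernel: np.ndarray, grow: bool) -> np.ndarray:
    """NumPy stand-in for dilate (grow=True) or erode (grow=False).

    Only the kernel's shape is used, so this is valid for all-ones,
    odd-sized box kernels anchored at their centre.
    """
    kh, kw = kernel.shape
    # Pad so the border never wins: 0 for dilation, 255 for erosion.
    pad_value = 0 if grow else 255
    padded = np.pad(
        mask,
        ((kh // 2, kh // 2), (kw // 2, kw // 2)),
        constant_values=pad_value,
    )
    combine = np.maximum if grow else np.minimum
    out = np.full_like(mask, pad_value)
    h, w = mask.shape
    # Take the max (or min) over every offset covered by the kernel.
    for dy in range(kh):
        for dx in range(kw):
            out = combine(out, padded[dy : dy + h, dx : dx + w])
    return out


# Three separate "glyphs", each 1 px wide and 3 px tall, on a 5x20 canvas.
thresholded = np.zeros((5, 20), dtype=np.uint8)
thresholded[1:4, [3, 7, 11]] = 255

dilated = morph(thresholded, box(9, 1), grow=True)  # merge glyphs horizontally
eroded = morph(dilated, box(1, 3), grow=False)  # thin the merged run vertically

# Hypothetical page mask that excludes the two leftmost columns.
pagemask = np.full((5, 20), 255, dtype=np.uint8)
pagemask[:, :2] = 0

# Element-wise minimum keeps only pixels set in both masks.
value = np.minimum(eroded, pagemask)
print(np.count_nonzero(dilated), np.count_nonzero(eroded), np.count_nonzero(value))
```

The horizontal dilation joins the three glyphs into one run, the vertical erosion thins that run to a single row, and the minimum with the page mask clips whatever falls outside the page region.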

log

log(step: float, text: str, display: ndarray) -> None

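Show the intermediate mask via debug_show when cfg.DEBUG_LEVEL is at least 3. For non-text masks the step is shifted by 0.3, so text steps run from 0.1 to 0.3 and non-text steps from 0.4 to 0.6.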

Parameters:

  • step (float, required): A numeric code or fraction indicating the process step.
  • text (str, required): A label describing what operation was just done (e.g. 'dilated').
  • display (ndarray, required): The mask or image array to show for debugging.

Source code in src/page_dewarp/mask.py
def log(self, step: float, text: str, display: np.ndarray) -> None:
    """Optionally display or log the intermediate mask state at a given step.

    Args:
        step: A numeric code or fraction indicating the process step.
        text: A label describing what operation was just done (e.g. 'dilated').
        display: The mask or image array to show for debugging.

    """
    if cfg.DEBUG_LEVEL >= 3:
        if not self.text:
            # text images from 0.1 to 0.3, table images from 0.4 to 0.6
            step += 0.3
        debug_show(self.name, step, text, display)
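
The step renumbering in log can be checked in isolation. This is a small sketch in which debug_step is a hypothetical helper that mirrors only the offset logic; it does not call debug_show or read cfg.

```python
def debug_step(step: float, text: bool) -> float:
    """Return the step number that log would pass on (hypothetical helper)."""
    # Non-text masks are shifted by 0.3 so their debug output
    # does not collide with the text steps.
    return step if text else step + 0.3


for base in (0.1, 0.2, 0.3):
    text_step = round(debug_step(base, text=True), 1)
    line_step = round(debug_step(base, text=False), 1)
    print(text_step, line_step)
```

The values are rounded only for display, since adding floats such as 0.1 and 0.3 is not guaranteed to give an exact decimal result.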

contours

contours() -> list[ContourInfo]

Extract the final contours from self.value.

Calls get_contours to find external contours in the thresholded, morphologically processed mask stored in self.value.

Returns:

  • list[ContourInfo]: A list of ContourInfo objects describing each discovered contour.

Source code in src/page_dewarp/mask.py
def contours(self) -> list[ContourInfo]:
    """Extract the final contours from `self.value`.

    Calls `get_contours` to find external contours in the thresholded,
    morphologically processed mask stored in `self.value`.

    Returns:
        A list of ContourInfo objects describing each discovered contour.

    """
    return get_contours(self.name, self.small, self.value)

box

box(width: int, height: int) -> np.ndarray

Return a structuring element of ones with shape (height, width).

Used in morphological operations (e.g., dilate, erode).

Source code in src/page_dewarp/mask.py
def box(width: int, height: int) -> np.ndarray:
    """Return a structuring element of ones with shape (height, width).

    Used in morphological operations (e.g., dilate, erode).
    """
    return np.ones((height, width), dtype=np.uint8)
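
Note that box takes its arguments as (width, height) while the returned array has NumPy shape (rows, columns), that is (height, width). A short check, repeating the function from the source above so the snippet runs on its own:

```python
import numpy as np


def box(width: int, height: int) -> np.ndarray:
    """Return a structuring element of ones with shape (height, width)."""
    return np.ones((height, width), dtype=np.uint8)


horizontal = box(9, 1)  # wide, flat kernel used to dilate text masks
vertical = box(1, 3)  # narrow, tall kernel used to erode text masks

print(horizontal.shape)  # (1, 9)
print(vertical.shape)  # (3, 1)
```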