Information about each word or line of text in the input document.

For additional information, see Block in the Amazon Textract API reference.

interface Block {
    BlockType?: BlockType;
    Geometry?: Geometry;
    Id?: string;
    Page?: number;
    Relationships?: RelationshipsListItem[];
    Text?: string;
}

Properties

BlockType?: BlockType

The type of text that the block represents: a single word or a line of text.

  • WORD - A word that's detected on a document page. A word is one or more ISO basic Latin script characters that aren't separated by spaces.

  • LINE - A string of tab-delimited, contiguous words that are detected on a document page.
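The two block types can be told apart by comparing BlockType against the string values listed above. The following is a minimal sketch only: BlockLike, textByType, and sampleBlocks are hypothetical names introduced for illustration, and the sample data is made up rather than taken from a real response.

```typescript
// Hypothetical local stand-in for Block, limited to the fields used here.
interface BlockLike {
  BlockType?: "LINE" | "WORD";
  Text?: string;
}

// Collect the text of every block of the requested type, skipping blocks without text.
function textByType(blocks: BlockLike[], type: "LINE" | "WORD"): string[] {
  return blocks
    .filter((b) => b.BlockType === type && b.Text !== undefined)
    .map((b) => b.Text as string);
}

// Made-up sample: one LINE and the two WORD blocks it would contain.
const sampleBlocks: BlockLike[] = [
  { BlockType: "LINE", Text: "Hello world" },
  { BlockType: "WORD", Text: "Hello" },
  { BlockType: "WORD", Text: "world" },
];

const lineTexts = textByType(sampleBlocks, "LINE");
const wordTexts = textByType(sampleBlocks, "WORD");
```

Filtering on LINE is usually enough to reconstruct readable text; WORD blocks are useful when per-word positions are needed.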

Geometry?: Geometry

Coordinates of the rectangle or polygon that contains the text.
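The Geometry type is documented separately and its fields are not shown on this page. The sketch below assumes the rectangle is exposed as a bounding box with Left, Top, Width, and Height given as ratios of the page dimensions, as in the Amazon Textract data model. BoundingBoxLike, PixelRect, and toPixelRect are hypothetical names, and the field names should be checked against the Geometry reference before use.

```typescript
// Assumed shape: values are ratios (0 to 1) of the page width or height.
interface BoundingBoxLike {
  Left?: number;
  Top?: number;
  Width?: number;
  Height?: number;
}

interface PixelRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Scale a ratio-based bounding box to pixel units for a page of the given size.
// Missing fields are treated as 0.
function toPixelRect(
  box: BoundingBoxLike,
  pageWidth: number,
  pageHeight: number
): PixelRect {
  return {
    x: (box.Left ?? 0) * pageWidth,
    y: (box.Top ?? 0) * pageHeight,
    width: (box.Width ?? 0) * pageWidth,
    height: (box.Height ?? 0) * pageHeight,
  };
}

const rect = toPixelRect(
  { Left: 0.5, Top: 0.25, Width: 0.25, Height: 0.5 },
  1000,
  800
);
```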

Id?: string

Unique identifier for the block.

Page?: number

Page number where the block appears.
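Because every block carries its own page number, a flat list of blocks can be regrouped per page. A minimal sketch, assuming the blocks arrive as a single array; PagedBlock and groupByPage are hypothetical names, and blocks without a Page value are skipped.

```typescript
// Hypothetical local stand-in for Block, limited to the fields used here.
interface PagedBlock {
  Page?: number;
  Text?: string;
}

// Group blocks by their Page value, preserving the input order within each page.
function groupByPage(blocks: PagedBlock[]): Map<number, PagedBlock[]> {
  const pages = new Map<number, PagedBlock[]>();
  for (const block of blocks) {
    if (block.Page === undefined) continue; // Page is optional
    const bucket = pages.get(block.Page) ?? [];
    bucket.push(block);
    pages.set(block.Page, bucket);
  }
  return pages;
}

// Made-up sample data for illustration.
const grouped = groupByPage([
  { Page: 1, Text: "first" },
  { Page: 2, Text: "second" },
  { Page: 1, Text: "third" },
  { Text: "no page" },
]);
```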

Relationships?: RelationshipsListItem[]

A list of child blocks of the current block. For example, a LINE object has child blocks for each WORD block that's part of the line of text.
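The parent-to-child link described above can be followed by indexing blocks by Id. This is a sketch under assumptions: the fields of RelationshipsListItem are not shown on this page, so the code assumes each item carries an Ids array and a Type of "CHILD", as in the Textract relationship model. RelationshipLike, BlockNode, and childBlocks are hypothetical names.

```typescript
// Assumed shape of a relationship item: a list of child block IDs and a type.
interface RelationshipLike {
  Ids?: string[];
  Type?: string;
}

// Hypothetical local stand-in for Block, limited to the fields used here.
interface BlockNode {
  BlockType?: string;
  Id?: string;
  Relationships?: RelationshipLike[];
  Text?: string;
}

// Return the child blocks of `parent`, in the order their IDs are listed.
// IDs with no matching block are ignored.
function childBlocks(parent: BlockNode, all: BlockNode[]): BlockNode[] {
  const byId = new Map<string, BlockNode>();
  for (const b of all) {
    if (b.Id !== undefined) byId.set(b.Id, b);
  }
  const children: BlockNode[] = [];
  for (const rel of parent.Relationships ?? []) {
    if (rel.Type !== "CHILD") continue; // assumption: children are marked "CHILD"
    for (const id of rel.Ids ?? []) {
      const child = byId.get(id);
      if (child !== undefined) children.push(child);
    }
  }
  return children;
}

// Made-up sample: a LINE whose children are two WORD blocks.
const lineBlock: BlockNode = {
  BlockType: "LINE",
  Id: "line-1",
  Text: "Hello world",
  Relationships: [{ Type: "CHILD", Ids: ["word-1", "word-2", "missing"] }],
};
const allBlocks: BlockNode[] = [
  lineBlock,
  { BlockType: "WORD", Id: "word-1", Text: "Hello" },
  { BlockType: "WORD", Id: "word-2", Text: "world" },
];
const lineWords = childBlocks(lineBlock, allBlocks).map((b) => b.Text);
```

Building the Id index once and reusing it for every LINE avoids rescanning the full block list per lookup.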

Text?: string

The word or line of text extracted from the block.