Describes the documents submitted with a dataset for an entity recognizer model.

interface DatasetEntityRecognizerDocuments {
    InputFormat?: InputFormat;
    S3Uri: undefined | string;
}

Properties

InputFormat?: InputFormat

Specifies how the text in an input file should be processed. This is optional, and the default is ONE_DOC_PER_LINE.

ONE_DOC_PER_FILE - Each file is considered a separate document. Use this option when you are processing large documents, such as newspaper articles or scientific papers.

ONE_DOC_PER_LINE - Each line in a file is considered a separate document. Use this option when you are processing many short documents, such as text messages.
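
The difference between the two values can be illustrated with a small local sketch. This is not how the service itself reads files. The `InputFormat` type below is a local stand-in for the SDK type, `splitIntoDocuments` is a hypothetical helper, and the handling of blank lines is an assumption made for the illustration.

```typescript
// Local stand-in for the SDK's InputFormat type (assumption: only the two documented values).
type InputFormat = "ONE_DOC_PER_FILE" | "ONE_DOC_PER_LINE";

// Hypothetical helper: shows how one file's text maps to documents under each format.
function splitIntoDocuments(
  fileText: string,
  format: InputFormat = "ONE_DOC_PER_LINE" // documented default
): string[] {
  if (format === "ONE_DOC_PER_FILE") {
    // The whole file is a single document.
    return [fileText];
  }
  // Each line is a document. Assumption: blank lines are skipped here;
  // the service's actual treatment of blank lines may differ.
  return fileText.split(/\r?\n/).filter((line) => line.trim().length > 0);
}

const sample = "first message\nsecond message\n";
console.log(splitIntoDocuments(sample).length);                     // 2
console.log(splitIntoDocuments(sample, "ONE_DOC_PER_FILE").length); // 1
```

A file of short messages therefore yields one document per line by default, while the same file submitted with ONE_DOC_PER_FILE is treated as one document.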

S3Uri: undefined | string

Specifies the Amazon S3 location where the documents for the dataset are located.
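
A minimal sketch of building a value of this shape follows. The interface is redeclared locally so the snippet is self-contained. `makeRecognizerDocuments` is a hypothetical helper, the bucket and prefix are placeholders, and the `s3://` prefix check is only a local sanity check, not the service's own validation.

```typescript
// Local mirrors of the documented types, so the snippet runs without the SDK.
type InputFormat = "ONE_DOC_PER_FILE" | "ONE_DOC_PER_LINE";

interface DatasetEntityRecognizerDocuments {
  InputFormat?: InputFormat;
  S3Uri: undefined | string;
}

// Hypothetical helper: builds the object and rejects strings that are obviously not S3 URIs.
function makeRecognizerDocuments(
  s3Uri: string,
  inputFormat?: InputFormat
): DatasetEntityRecognizerDocuments {
  // Assumption: a usable URI looks like s3://bucket/optional/prefix.
  if (!/^s3:\/\/[^/]+/.test(s3Uri)) {
    throw new Error(`Not an S3 URI: ${s3Uri}`);
  }
  // Leave InputFormat unset when not given, so the documented default applies.
  return inputFormat === undefined
    ? { S3Uri: s3Uri }
    : { S3Uri: s3Uri, InputFormat: inputFormat };
}

// Placeholder bucket and prefix, for illustration only.
const documents = makeRecognizerDocuments(
  "s3://example-bucket/entity-recognizer/docs/",
  "ONE_DOC_PER_FILE"
);
console.log(documents);
```

Omitting the second argument leaves `InputFormat` unset, in which case the documented default of ONE_DOC_PER_LINE applies.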