Corpus: A large collection of text or speech data, typically used to train machine learning models for linguistic tasks.