Finding audio and video resources on the Internet has become a highly demanded application.

However, search engines are usually limited to adjacent text (hand-supplied transcripts or closed captions) when indexing and classifying multimedia documents.

Hearch is a multilingual (Basque, Spanish, English) spoken document retrieval system that uses automatic speech recognition and natural language processing technologies to obtain more accurate indexes and more focused search results.

