Optical character recognition (OCR) was added in Tiki20 and will be improved substantially in Tiki21 via this wrapper, which lets PHP work with Tesseract OCR.
See also: http://we-love-php.blogspot.ca/2013/01/make-pdfs-searchable.html -> making PDFs searchable is cool, but not planned. The data gathered by OCR will be stored in the Tiki database.
https://en.wikipedia.org/wiki/DjVu -> DjVu could be an interesting format.