How to enable full-text, PDF OCR, and partial-word search on OpenMediaVault

In our Laptop to NAS series, we successfully converted an old laptop into a small NAS without spending much.

Now, in this added-functionality series, let's look at some really good free add-ons for an OpenMediaVault NAS. Searching inside files, including PDFs, is a must-have feature for most of us.

What is full-text search with PDF and OCR search?

Your NAS keeps filling up, and one day you want to retrieve some text from a backed-up file. The problem is that you don't remember the file name or the file type.

All you remember is a few keywords, so how do you retrieve the data? This is where full-text search with PDF search comes to the rescue. It searches the contents of text and PDF files for your keywords and lists the matching files. Wonderful, isn't it?
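As a rough command-line analogy only (the setup in this post relies on Elasticsearch and Apache Tika, not on grep), finding files from a remembered fragment of a word can be sketched in shell. The function name search_contents and the example path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list files under a directory whose contents contain
# the given fragment, ignoring case. It only reads plain text, so it cannot
# look inside PDFs or scanned pages the way an indexed search can.
search_contents() {
    fragment=$1
    dir=$2
    grep -rliF -- "$fragment" "$dir"
}

# Example: "invoi" is only part of a word, yet it still matches "Invoice".
# search_contents invoi /srv/backup
```

A plain grep like this rescans every file on each search, which gets slow on a full NAS; an index avoids that.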

How to enable full-text search, PDF search, and partial-word search for free

Step 1: Assumptions: OMV 5.x with Docker and Portainer installed.

We will achieve this with FileRun, combined with the Docker versions of Apache Tika and Elasticsearch. FileRun is self-hosted file sharing and multi-device sync software.

Step 2: Docker Compose file to deploy FileRun on OpenMediaVault

version: '2'
services:

  db:
    image: mariadb:10.1
    environment:
      MYSQL_ROOT_PASSWORD: yourpassword
      MYSQL_USER: yourusername
      MYSQL_PASSWORD: yourpassword
      MYSQL_DATABASE: yourdbname
    volumes:
      - /yourpath/:/var/lib/mysql
  web:
    image: filerun/filerun
    environment:
      FR_DB_HOST: db
      FR_DB_PORT: 3306
      # must match MYSQL_DATABASE above
      FR_DB_NAME: yourdbname
      # must match MYSQL_USER above
      FR_DB_USER: yourusername
      # must match MYSQL_PASSWORD above
      FR_DB_PASS: yourpassword
      APACHE_RUN_USER: www-data
      APACHE_RUN_USER_ID: 33
      APACHE_RUN_GROUP: www-data
      APACHE_RUN_GROUP_ID: 33
    depends_on:
      - db
    links:
      - db
      - tika
      - elasticsearch
    ports:
      - "8083:80"
    volumes:
      - /yourpath/l:/var/www/html
      - /yourpath/:/user-files
  tika:
    image: logicalspark/docker-tikaserver
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:6.2.4
    container_name: elasticsearch
    environment:
      - cluster.name=docker-cluster
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 1g
    volumes:
      - /syourpathh:/usr/share/elasticsearch/data

Before you start the containers, grant user and group 1000:1000 access to the host path that the compose file maps to /var/www/html (yourpath/l:/var/www/html); otherwise the container will crash. This may be fixed in the next release.
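One way to do this is to make 1000:1000 the owner of that directory. The following is a minimal sketch, assuming GNU stat (as on a Debian-based OMV host) and root privileges; owner_of and ensure_owner are hypothetical helper names, and /yourpath/l is the placeholder from the compose file, not a real path.

```shell
#!/bin/sh
# Hypothetical helper: print the numeric "uid:gid" owner of a path.
owner_of() {
    stat -c '%u:%g' "$1"
}

# Hypothetical helper: create the directory if it is missing and change
# its owner only when it differs from the wanted "uid:gid".
ensure_owner() {
    dir=$1
    want=$2
    mkdir -p "$dir" || return 1
    if [ "$(owner_of "$dir")" != "$want" ]; then
        chown -R "$want" "$dir"
    fi
}

# Example (run as root on the OMV host, before starting the containers):
# ensure_owner /yourpath/l 1000:1000
```

Running owner_of on the directory afterwards should print 1000:1000.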
Now log in to FileRun, open the control panel, and enable full-text and OCR search. You are ready to go. Don't forget to enable the indexing cron job in the web container:

*/2 * * * * /var/www/html/cron/process_search_index_queue.sh
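If you prefer not to edit the crontab by hand, the entry can be merged in from a shell inside the web container. This is only a sketch: it assumes the crontab command is available in that container, which this post does not confirm, and with_index_job is a hypothetical helper name.

```shell
#!/bin/sh
# The cron entry from this post: process the search index queue every 2 minutes.
CRON_LINE='*/2 * * * * /var/www/html/cron/process_search_index_queue.sh'

# Hypothetical helper: read an existing crontab on stdin and print it with
# CRON_LINE appended, unless that exact line is already present.
with_index_job() {
    existing=$(cat)
    if printf '%s\n' "$existing" | grep -qxF -- "$CRON_LINE"; then
        printf '%s\n' "$existing"
    elif [ -n "$existing" ]; then
        printf '%s\n%s\n' "$existing" "$CRON_LINE"
    else
        printf '%s\n' "$CRON_LINE"
    fi
}

# Example (inside the web container, as root):
# crontab -l 2>/dev/null | with_index_job | crontab -
```

Because the helper skips the line when it is already there, running it twice does not create a duplicate job.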

