How to implement Elasticsearch in Nextcloud

@mrmarkuz

Hi Markus

I installed Elasticsearch according to your HowTo above. After adapting to the newer PHP SCL paths, it worked.

Since the NethServer upgrade to 7.9.2009, the elasticsearch service won’t start.

Any ideas?

Strange, after a couple of restarts, it’s working again!

Found the issue:
Older version of ingest-attachment.

Solution:

sudo /usr/share/elasticsearch/bin/elasticsearch-plugin remove ingest-attachment
sudo /usr/share/elasticsearch/bin/elasticsearch-plugin install ingest-attachment
systemctl restart elasticsearch
systemctl status elasticsearch
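A plugin built for an older Elasticsearch release is a common reason for the service refusing to start after a package upgrade. As a rough sketch, the version a plugin was built for can be read from its descriptor file and compared with the installed package. The helper name `plugin_built_for` is made up for illustration, and the plugin directory in the usage comment assumes the default RPM layout under /usr/share/elasticsearch.

```shell
#!/bin/sh
# Hypothetical helper: print the Elasticsearch version a plugin was built for.
# $1: path to the plugin's plugin-descriptor.properties file
plugin_built_for() {
    # The descriptor holds a line of the form "elasticsearch.version=<version>"
    sed -n 's/^elasticsearch\.version=//p' "$1"
}

# Usage (paths assume the default RPM install layout):
#   plugin_built_for /usr/share/elasticsearch/plugins/ingest-attachment/plugin-descriptor.properties
#   rpm -q --qf '%{VERSION}\n' elasticsearch
```

If the two versions differ, removing and reinstalling the plugin as shown above should bring them back in line.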

The Tesseract module, which is currently marked “untested” again, needs to be updated and reactivated (sometimes you have to press Activate twice…)

Also: in Nextcloud, as admin, you need to reactivate Elasticsearch under Fulltext-search…

Thanks
Andy

I made the changes as suggested. Now, when I run the command “systemctl restart rh-php73-php-fpm”, I get the following: “Job for rh-php73-php-fpm.service failed because the control process exited with error code. See “systemctl status rh-php73-php-fpm.service” and “journalctl -xe””

As Stephane said there’s no need for a custom template. If you created the template file, it can be removed.
You can edit the file /etc/opt/rh/rh-php73/php-fpm.d/000-nextcloud.conf directly, tweaking the value of php_admin_value[memory_limit].
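For anyone who prefers not to edit the file by hand, here is a minimal sketch of reading and changing that value with sed. Both helper names are invented for this example, `sed -i` assumes GNU sed (as shipped with NethServer 7), and 512M is only an example value.

```shell
#!/bin/sh
# Hypothetical helper: print the current memory_limit of a php-fpm pool file.
# $1: pool config file
get_fpm_memory_limit() {
    sed -n 's/^php_admin_value\[memory_limit][[:space:]]*=[[:space:]]*//p' "$1"
}

# Hypothetical helper: rewrite the memory_limit line in place.
# $1: pool config file, $2: new value such as 512M
set_fpm_memory_limit() {
    sed -i "s|^php_admin_value\[memory_limit].*|php_admin_value[memory_limit] = $2|" "$1"
}

# Usage (as root), followed by a restart of the service:
#   set_fpm_memory_limit /etc/opt/rh/rh-php73/php-fpm.d/000-nextcloud.conf 512M
#   get_fpm_memory_limit /etc/opt/rh/rh-php73/php-fpm.d/000-nextcloud.conf
#   systemctl restart rh-php73-php-fpm
```

Only the matching line is touched; other settings in the file are left as they are.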

@mrmarkuz

Hi Markus

Since the update, I’ve noticed that Elasticsearch in Nextcloud doesn’t index the contents of e.g. PDFs. Most PDFs are from Adobe Pro in this case, with OCR already “prepared”, not just containing a scanned image of text. There are those too, but a PDF containing text should be indexed…

A search for documents only shows results if the search term is part of the filename. No results are shown for contents… :frowning:

It doesn’t even index simple text files (.txt)…

NethServer and Nextcloud, including all installed apps, are fully up to date.

Any ideas?

Thanks
Andy

It’s working here, but I needed to enable the untested app “Full text search - Elasticsearch Platform”.

Does the indexing work without error?

sudo -u apache /opt/rh/rh-php73/root/usr/bin/php -d memory_limit=512M /usr/share/nextcloud/occ fulltextsearch:index

Is elasticsearch running?

systemctl status elasticsearch

@mrmarkuz

Hi Markus

I only have this as untested:

And, yes, elasticsearch runs and starts without issues.

Running a manual Index works, but at the end I see this:

And searching for any files only gives results when the searched value is part of a filename or directory name. No contents; not even simple text files are indexed… :frowning:

I can confirm that even PDF indexing was working before 7.9.2009…
Not exactly sure when it broke… :frowning:

You may try to reset the fulltextsearch and index again afterwards:

sudo -u apache /opt/rh/rh-php73/root/usr/bin/php -d memory_limit=512M /usr/share/nextcloud/occ fulltextsearch:reset

Here are my Nextcloud fulltext search settings:

PDF and txt content search is working here with 7.9.2009.

EDIT:

You may also check if elasticsearch answers correctly (I have version 7.9.1)

curl -X GET localhost:9200

and the nextcloud logs in /var/lib/nethserver/nextcloud/nextcloud.log
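To compare versions quickly, the version number can be pulled out of that curl response. This is only a sketch: `es_version_from_response` is a made-up helper name, and it assumes the response contains the usual `"number" : "<version>"` field inside the `version` object.

```shell
#!/bin/sh
# Hypothetical helper: read the JSON answer of "curl localhost:9200" on stdin
# and print the value of the first "number" field (the Elasticsearch version).
es_version_from_response() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage:
#   curl -s -X GET localhost:9200 | es_version_from_response
# Recent fulltextsearch entries in the Nextcloud log:
#   grep -i fulltextsearch /var/lib/nethserver/nextcloud/nextcloud.log | tail
```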

OK, I reset Elasticsearch…

My settings are exactly the same as yours, with the exception of Tesseract, which I also have installed (and which was also working…)

curl -X GET localhost:9200 always showed a correct response…

OK, will now start a new index, this will take a long time…

Will give some feedback ASAP.

You don’t use Tesseract OCR in your setup?

Update:
The index is now running, and I finally see Tesseract running again. It seems the reset helped!
But the index will take time…

I think it was disabled on this server, I don’t use the full text search feature much…

Great that it seems to work now.

For me, my MacBook finds files just as fast… (Like you, I don’t use it much)

But some of my clients, with their really fast search options in Windows 10, really WANT Fulltextsearch! :slight_smile:

When working, Tesseract OCR actually does a good job. I have 4 languages installed.

I can confirm that after resetting the index, my Elasticsearch instance works without errors. I did this immediately after the 7.9 update. Since then, the smart search has worked without problems.

Hello,
does anyone know how to exclude certain file types from the search based on their extension?

Sincerely, Marko

@Andy_Wismer @mrmarkuz
I reset my index because of errors like those here. Then I re-indexed, and I can monitor the indexing of each newly added file with

sudo -u apache /opt/remi/php73/root/usr/bin/php -d memory_limit=512M /usr/share/nextcloud/occ fulltextsearch:live

But within Nextcloud I don’t get any result when I search any string.
What could have happened there?


If I test the index, all works fine.
GET http://127.0.0.1:9200/_search?q=Dfvekvbewvervbtwt

…shows me the relevant file with the content.

But not in Nextcloud!
Sincerely, Marko
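To separate "Elasticsearch has the content" from "Nextcloud does not show it", the direct query can be reduced to a hit count. A minimal sketch, assuming the compact (non-pretty) JSON that Elasticsearch 7 returns by default, where the count sits in `"hits":{"total":{"value":N,...}}`; `es_hit_count` is an invented helper name.

```shell
#!/bin/sh
# Hypothetical helper: read a compact _search response on stdin and
# print the number of hits reported in hits.total.value.
es_hit_count() {
    grep -o '"total":{"value":[0-9]*' | head -n 1 | grep -o '[0-9][0-9]*$'
}

# Usage:
#   curl -s 'http://127.0.0.1:9200/_search?q=Dfvekvbewvervbtwt' | es_hit_count
```

A count above zero here, combined with an empty result in Nextcloud, points at the Nextcloud side (app state or settings) rather than at the index itself.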

@capote

Hi

At the moment all my Nextclouds do NOT have working FullTextSearch, even though Elasticsearch as such seems to work OK… :frowning:

No idea what is not working…

My 2 cents
Andy

This is a miracle… I investigated this problem for over 3 hours and nothing worked.
After I finished this post… it now works in Nextcloud.

I love this community more and more… The problems solve themselves after you post them here.

After you have posted it here, it should work now :rofl:

Not quite - still seeing stuff like this…

Maybe it’s related to group folders?

I’m not using Group Folders at home…

Is there something in /var/log/elasticsearch/elasticsearch.log ?

Which elasticsearch version do you use?

[root@nethserver ~]# rpm -qa elasticsearch
elasticsearch-7.11.1-1.x86_64

Maybe an update helps.

You may try to reset fulltextsearch

sudo -u apache scl enable rh-php73 -- php -dmemory_limit=512M /usr/share/nextcloud/occ fulltextsearch:reset

and create a new index and check if there are errors:

sudo -u apache scl enable rh-php73 -- php -dmemory_limit=512M /usr/share/nextcloud/occ fulltextsearch:index
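The two steps can also be chained so that indexing only starts when the reset went through. This is just a sketch: `run_occ` and `fts_reindex` are made-up helper names wrapping the occ call from this post, and depending on the app version the reset may ask for confirmation interactively.

```shell
#!/bin/sh
# Hypothetical wrapper around the occ call used above (rh-php73 via scl).
run_occ() {
    sudo -u apache scl enable rh-php73 -- php -dmemory_limit=512M /usr/share/nextcloud/occ "$@"
}

# Hypothetical helper: reset fulltextsearch, then index only if the reset succeeded.
fts_reindex() {
    run_occ fulltextsearch:reset || {
        echo "reset failed, not starting index" >&2
        return 1
    }
    run_occ fulltextsearch:index
}

# Usage:
#   fts_reindex
```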
