Notice to Git Dev/Master branch users: Solr changes + Solr 6.6.2 (2017-10-18) + security notice


#1

If you are using the [Git Dev/Master branch][1], please note that the Solr component (/applications/solr) has received some changes including solr schema tweaks; you will receive these the next time you ‘git pull’. These require reindexing the product data.

The rebuildSolrIndex(Auto) service should automatically execute the next time you restart your instance, thanks to a new trigger added in commit https://github.com/ilscipio/scipio-erp/commit/76ea9070ff663924806602d25cb90df39ab3a197 . So most users testing out the dev/master branch should not have any issues.

2017-09-11: Please note that for the Solr version 6.6.0 update, you will need to rebuild after pulling from git - see the post following this one below.

If you need to defer this execution (large database, custom setup, etc.), you may set solr.index.rebuild.autoRun.ifConfigChange=false in solrconfig.properties. However Solr queries to the product schema will not run correctly until you execute the rebuildSolrIndex service manually.
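For anyone who prefers to script that change, here is a minimal sketch of a helper that sets a key in a Java-style properties file. The function name `set_property` is hypothetical, and the path in the usage note is an assumption about where solrconfig.properties lives in your checkout. It only handles plain `key=value` lines with no spaces around the `=`.

```shell
#!/bin/sh
# set_property FILE KEY VALUE   (hypothetical helper, not part of Scipio)
# Replaces an existing "KEY=..." line, or appends one if the key is absent.
set_property() {
    file=$1 key=$2 value=$3
    tmp=$(mktemp) || return 1
    # awk compares the key as a literal string, so the dots in the key
    # are not treated as regex wildcards.
    awk -v k="$key" -v v="$value" '
        BEGIN { FS = "=" }
        $1 == k { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Usage (path assumed, adjust to your tree): `set_property applications/solr/config/solrconfig.properties solr.index.rebuild.autoRun.ifConfigChange false`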

To run rebuildSolrIndex manually, you can use: https://localhost:8443/admin/control/setSyncServiceParameters?SERVICE_NAME=rebuildSolrIndex&POOL_NAME=pool&RUN_SYNC=Y
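If your instance is not on localhost:8443, the same link can be assembled for another host. This is only a sketch: the function name `rebuild_solr_index_url` is hypothetical, and the parameter values are simply the ones from the link above.

```shell
#!/bin/sh
# rebuild_solr_index_url [BASE_URL]   (hypothetical helper)
# Prints the service-engine URL that runs rebuildSolrIndex synchronously.
# BASE_URL defaults to the local instance used in this post.
rebuild_solr_index_url() {
    base=${1:-https://localhost:8443}
    printf '%s/admin/control/setSyncServiceParameters?SERVICE_NAME=%s&POOL_NAME=%s&RUN_SYNC=%s\n' \
        "$base" rebuildSolrIndex pool Y
}
```

Open the printed URL in a browser where you are already logged in to the Admin app; the service presumably will not run for an unauthenticated request.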

Some changes are work-in-progress, and there may be further enhancements in the next few months.

KNOWN ISSUES (Solr 6.6.0+):

  • FIXED (safe workaround applied): Occasionally when restarting the server, Solr will fail to delete its index write.lock file, causing errors on boot because it thinks it’s locked
    [1]: https://github.com/ilscipio/scipio-erp

UPDATE (2017-10-18):
Master branch has been updated to Solr 6.6.2 which fixes the security issue CVE-2017-12629 below ( https://github.com/ilscipio/scipio-erp/commit/2fbb16e862f6d9398e683f30371917518e9792ba ). Please stop server, git pull, “ant clean build”, and restart server.


#3

UPDATE [2017-09-11: Solr version 6.6.0 is now in master branch; some further changes yet to be added]:

Please note that in the following week, the Solr libraries and files will be updated to version 6.6.0 in the Git dev/master branch.

It contains major changes, but according to our tests, there should not be any problem for dev/master branch users. All that is needed is to

1. Stop server (./ant stop)
2. Pull from git (git pull)
3. Rebuild and start (./ant clean build start ; if you hit any issues with ivy, try ./ant clean-ivy-full first)
4. rebuildSolrIndex should trigger automatically on startup (unless you have disabled this - simply run it manually), as described above.
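
The steps above can be sketched as a small script, to be run from the root of the Scipio checkout. The function names `run` and `upgrade_scipio` and the `DRY_RUN` variable are hypothetical conveniences; the commands themselves are the ones listed in the steps.

```shell
#!/bin/sh
# run CMD...: prints the command, and executes it unless DRY_RUN=1.
run() {
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# upgrade_scipio: stop, pull, rebuild and start; aborts at the first failing step.
# If the build fails on ivy issues, try "./ant clean-ivy-full" first and re-run.
upgrade_scipio() {
    run ./ant stop || return 1
    run git pull || return 1
    run ./ant clean build start
}
```

Setting `DRY_RUN=1` before calling `upgrade_scipio` only prints the three commands, which is a cheap way to check the sequence before running it for real.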

If you experience any issues simply post on the forum and we will fix them. Thank you.


#4

Please note that in Solr 6.6.0 now in dev/master branch, the Solr webapp link has changed from /solr/admin.html to /solr/index.html:
http://localhost:8080/solr/index.html


#5

UPDATE (2017-10-18):
Master branch has been updated to Solr 6.6.2 which fixes the issue below ( https://github.com/ilscipio/scipio-erp/commit/2fbb16e862f6d9398e683f30371917518e9792ba ). Please stop server, git pull, “ant clean build”, and restart server.

PLEASE NOTE: SECURITY NOTICE:

Solr discovered a security flaw in their recent releases (CVE-2017-12629):
https://lucene.apache.org/solr/news.html#12-october-2017-please-secure-your-apache-solr-servers-since-a-zero-day-exploit-has-been-reported-on-a-public-mailing-list

For Scipio, in master branch, a property “solr.disable.configEdit”, corresponding to the one described in that post, can be set to “true” in build.xml (or passed on the command line using ‘ant -Dsolr.disable.configEdit=true’).

Due to the abstraction layers in Scipio, Scipio should not be immediately vulnerable as long as you have not exposed the “/solr” webapp on a public server or server accessible by untrusted users.

We will update the version as soon as it is published by Solr.


#6

I’m having issues after this update to solr. My test environment has many languages, but three seem to be omitted in the existing solr config in the master branch: Korean (ko), Vietnamese (vi), and Chinese (zh).

Errors:
ofbiz.log:Caused by: org.apache.solr.common.SolrException: Unknown fieldType ‘text_ko’ specified on field text_i18n_ko
ofbiz.log:Caused by: org.apache.solr.common.SolrException: Unknown fieldType ‘text_vi’ specified on field text_i18n_vi
ofbiz.log:Caused by: org.apache.solr.common.SolrException: Unknown fieldType ‘text_zh’ specified on field text_i18n_zh

I have already tried the following:

Updated: applications/solr/configsets/product_configs/conf/managed-schema

<field name="text_i18n_vi" type="text_vi" indexed="true" stored="false" multiValued="true"/>
<field name="text_i18n_ko" type="text_ko" indexed="true" stored="false" multiValued="true"/>
<field name="text_i18n_zh" type="text_zh" indexed="true" stored="false" multiValued="true"/>

solrconfig.xml… Updated (example of zh)

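<lst name="spellchecker">
  <str name="name">default_lang_zh</str>
  <str name="field">spell_lang_zh</str>
  <str name="classname">solr.DirectSolrSpellChecker</str>
  <str name="distanceMeasure">internal</str>
  <float name="accuracy">0.5</float>
  <int name="maxEdits">2</int>
  <int name="minPrefix">1</int>
  <int name="maxInspections">5</int>
  <int name="minQueryLength">4</int>
  <float name="maxQueryFrequency">0.01</float>
</lst>

<lst name="spellchecker">
  <str name="name">default_dlang_zh</str>
  <str name="field">spell_dlang_zh</str>
  <str name="classname">solr.DirectSolrSpellChecker</str>
  <str name="distanceMeasure">internal</str>
  <float name="accuracy">0.5</float>
  <int name="maxEdits">2</int>
  <int name="minPrefix">1</int>
  <int name="maxInspections">5</int>
  <int name="minQueryLength">4</int>
  <float name="maxQueryFrequency">0.01</float>
</lst>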

Created in lang/: stopwords_vi.txt, stopwords_ko.txt, stopwords_zh.txt

And of course I reinitialized the solr data by removing the “data” directory (which is re-created), and by updating the “solr_status” database table, setting it back to:

fbiz=# select * from solr_status;
-[ RECORD 1 ]---------+----------------------------
solr_id | SOLR-MAIN
data_status_id | SOLR_DATA_OLD
data_cfg_version | 1.0.14

Which forces a rebuild of the solr indexes. If I remove the three locales (vi,ko,zh), it seems to work fine after I reset stuff. So, it appears that there is “something” else required, possibly SEED DATA? I’ve looked, and I don’t see why it is failing.

Can you guys fix this?



#7

Hi Mike, thanks for reporting.

That issue happens because Solr itself doesn’t provide mappings for those languages out-of-the-box. So we will have to add new field type entries for them (however the analyzer might not be ideal until further review, requires language specialty).
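
To see which languages are affected before the server even starts, a rough check is to list every type referenced by a field in the schema that has no matching fieldType definition. The sketch below is an assumption-laden helper (the name `undefined_field_types` is hypothetical); it is line-based, so it assumes each `<field>`, `<dynamicField>` and opening `<fieldType>` tag sits on a single line.

```shell
#!/bin/sh
# undefined_field_types SCHEMA_FILE   (hypothetical helper)
# Prints each type used by a <field>/<dynamicField> that has no
# <fieldType name="..."> definition in the same file.
undefined_field_types() {
    schema=$1
    # Names of all declared field types, one per line.
    defined=$(sed -n 's/.*<fieldType[^>]*[[:space:]]name="\([^"]*\)".*/\1/p' "$schema")
    # Types referenced by fields, de-duplicated, filtered against the declared names.
    sed -n -e 's/.*<field[[:space:]][^>]*type="\([^"]*\)".*/\1/p' \
           -e 's/.*<dynamicField[[:space:]][^>]*type="\([^"]*\)".*/\1/p' "$schema" |
    sort -u |
    while IFS= read -r t; do
        printf '%s\n' "$defined" | grep -qxF -- "$t" || printf '%s\n' "$t"
    done
}
```

Usage: `undefined_field_types applications/solr/configsets/product_configs/conf/managed-schema`. Any name it prints would be expected to trigger an “Unknown fieldType” error like the ones reported above.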


#8

@mz4wheeler The github was just updated for ko,vi,zh: https://github.com/ilscipio/scipio-erp/commit/3d5baf1e8a76f86ae94b57f8223e0ac8091c3e14

You should revert your changes under /applications/solr related to this, then stop the server, git pull, rebuild (./ant clean build), and run the rebuildSolrIndex service using the link above: https://localhost:8443/admin/control/setSyncServiceParameters?SERVICE_NAME=rebuildSolrIndex&POOL_NAME=pool&RUN_SYNC=Y (it may run automatically at startup because I increased solr.config.version in the commit).
Usually you shouldn’t need to delete the data folder. Running rebuildSolrIndex should be enough, unless something became corrupted.

If you need or find more missing language codes, or run into problems with this, let me know. Unfortunately, Solr schema limitations make them hard to add: they have to be inserted in 20 different places, so it will be easier if we do it on our end. The analyzers for these languages still need review.


#9

Wow… Thanks for the quick turn-around. Just reset things, and now the three new languages work. Thanks!


#10

A couple of notes:

  1. It may be worth the time to create “managed-schema” and “solrconfig.xml” from checked in templates, and create new, DYNAMIC, smaller versions based on the locales in “general.properties” --or-- “solrconfig.properties”. This way solr won’t waste memory and computing resources trying to process locales it does not need to support. Also, if someone needs a new locale, solr will automatically pick it up.

  2. It seems to me that anything solr is CATALOG or STORE specific. There should be solr-related maintenance functions as a sub-set of the catalog (or store) back-end menus. There are some hidden functions that can be called as a service, but it is not convenient. I may want to re-index keywords, or rebuild solr, for instance, and it is difficult to figure out how to do it. It would be nice if these and other useful solr functions were built into the UI.

  3. What if I am running multiple stores (which is common)? It seems to me (I may be wrong) that solr is global in nature. If I have 20 backend stores, they are all under one solr engine. Is it possible to split these? Off the bat it would most likely require a new column in the “solr_status” table to define a store_id or catalog_id.

Thanks


#11

Hey again, thanks for those suggestions.

I agree with 1) in principle but the overhead doesn’t appear significant at the moment (performance-wise). There are both advantages and drawbacks to templating the XMLs toward simplifying the setup but it’s a good suggestion.

  2. Yes at the moment you have to use the service engine run screen to invoke them manually. I agree a screen dedicated to calling the solr services could be helpful. However by default products should re-index mostly automatically through the ECAs defined in solr component (e.g. there is one on Product entity), so the original design was to try to make it automatic rather than manual. Some limitations argue for easier manual access though, I agree.

  3. The rebuildSolrIndex(Auto) currently runs globally for all stores, but it’s quite fast. The solr schema stores the store ID, and it was recently changed to recognize it better, but in terms of running the queries for reindexing it may not make a huge difference. The solr data itself should always be considered volatile (subject to reindexing at almost any time) so in practice it hasn’t come up as an issue.


#12

UPDATE (2017-10-18): SECURITY:
Master branch has been updated to Solr 6.6.2 which fixes the issue reported above ( https://github.com/ilscipio/scipio-erp/commit/2fbb16e862f6d9398e683f30371917518e9792ba ). Please stop server, git pull, “ant clean build”, and restart server.


#13

I am still getting inconsistent (buggy) behavior with solr. I think it has to do with loading products “while” a store is running. This should not be an issue for an ecommerce store.

It seems that the loading process triggers a bunch of events that are solr related, which seems to 1) slow down loading 1000s of products, and 2) seems to corrupt solr.

To fix this, I have to delete the contents of solr_status, and restart. It then takes a long time to completely re-index solr, THEN the solr products look proper. I have already tried to clear the system caches, and it does not fix solr. Is there a way to disable the solr related events during a mass load of products (10,000+), then run a solr update later?


#14

Yes, everything from the ECA-triggered updates to the check at startup can be disabled if needed in production.

They are all flags in solrconfig.properties. If you’re loading huge amounts of products you may want to disable the ECAs by setting:
solr.eca.enabled=false
In that case you have to run rebuildSolrIndex yourself after product loads and modifications. It updates the status table on its own.
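
Before starting a large load, it can be worth confirming that the flag really is set in the file. A minimal sketch, with hypothetical helper names `get_property` and `ecas_disabled`; it only understands plain `key=value` lines and ignores commented-out ones.

```shell
#!/bin/sh
# get_property FILE KEY   (hypothetical helper)
# Prints the value of the last uncommented "KEY=value" line, or nothing if absent.
get_property() {
    awk -v k="$2" '
        BEGIN { FS = "=" }
        $1 == k { sub(/^[^=]*=/, ""); value = $0; found = 1 }
        END { if (found) print value }
    ' "$1"
}

# ecas_disabled FILE: succeeds only if solr.eca.enabled is set to false in FILE.
ecas_disabled() {
    [ "$(get_property "$1" solr.eca.enabled)" = "false" ]
}
```

Usage: `ecas_disabled path/to/solrconfig.properties || echo "Solr ECAs are still enabled"` (the path is a placeholder for wherever the file sits in your checkout). Once the load is done, run rebuildSolrIndex as described above.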

I’m concerned about the “corrupt solr” part. Can you describe how that manifests?


#15

When products are loaded, I see this in the java string:

-Dsolr.lock.type=single

<!--
     This option specifies which Lucene LockFactory implementation
     to use.

     single = SingleInstanceLockFactory - suggested for a
              read-only index or when there is no possibility of
              another process trying to modify the index.
     native = NativeFSLockFactory - uses OS native file locking.
              Do not use when multiple solr webapps in the same
              JVM are attempting to share a single index.
     simple = SimpleFSLockFactory  - uses a plain file for locking

     Defaults: 'native' is default for Solr3.6 and later, otherwise
               'simple' is the default

     More details on the nuances of each LockFactory...
     http://wiki.apache.org/lucene-java/AvailableLockFactories
-->
<!-- SCIPIO: NOTE: 2011-09-11: This is now specified through build.xml -->
<lockType>${solr.lock.type:native}</lockType>

Maybe the solr locking should always be “simple”?


#16

No, lock type must be set to single. The others don’t work. It shouldn’t affect anything as there is only one ofbiz instance running from the folder.


#17

The premise is that the ecommerce app is already running, and I’m trying to import (1000s) products on a separate java process using ./ant load-readers “-Ddata-readers=hot_deploy_store”, so it’s not a single instance. I’m ok re-running the solr sync after, as long as the existing, running store continues to work during the whole process. Still experimenting.

Also, even with -Dsolr.eca.enabled=false on the java command line, I still get:

2017-11-01 16:58:36,295 |main |EntityEcaRule |I| Running Entity ECA Service: addToSolr, triggered by rule on Entity: Product
2017-11-01 16:58:36,297 |main |EntityEcaRule |I| Running Entity ECA Service: addToSolr, triggered by rule on Entity: ProductCategoryMember
2017-11-01 16:58:36,302 |main |EntityEcaRule |I| Running Entity ECA Service: addToSolr, triggered by rule on Entity: ProductPrice
2017-11-01 17:00:53,631 |main |ServiceDispatcher |T| Sync service [entity-default/indexContentKeywords] finished in [9] milliseconds
2017-11-01 17:00:53,631 |main |EntityEcaRule |I| Running Entity ECA Service: indexProductKeywords, triggered by rule on Entity: ProductContent
2017-11-01 17:01:00,486 |main |ServiceDispatcher |T| Sync service [entity-default/indexProductKeywords] finished in [6855] milliseconds
2017-11-01 17:01:00,486 |main |EntityEcaRule |I| Running Entity ECA Service: addToSolr, triggered by rule on Entity: ProductContent

Is it possible to disable all ECAs during a bulk product load?


#18

NOTE: That switch was not designed to work on the command line. You have to set it in solrconfig.properties.

The ECA updates are disabled through the flag I wrote (only when set in solrconfig.properties). Those messages don’t mean that solr is being updated; they are only the service engine being verbose. If you want to stop the messages you can comment out the *ecas.xml includes in /applications/solr/ofbiz-component.xml, but the result is the same.

For Solr what matters is that there be only ever one Solr webapp instance running. I can’t fully comment on that use of load-readers at the moment, but AFAIK that ant target doesn’t load Tomcat so it shouldn’t cause any Solr updates although I’m not certain if it’s safe in other aspects.


#19

NOTICE: Some helper screens for Solr status and reindexing invocation have now been pushed to the Git master branch; they also document some of the functions inline. Simply upgrade/pull using Git, rebuild, restart, access the Admin app and click the Solr sidebar menu item.