Creating a Search with SimpleSearch.ContentRepositoryAdaptor

Hey, I am creating a global search for a website. I am using SimpleSearch and am quite happy with this solution.

After a few easy configurations, I am now stuck.
I can search for page titles with this code, but not for content. Instead, the search returns results whose match is neither in the title nor in the content, but in the database field “persistence_object_identifier”.

For example, a search just for the letter “A” gives me this result:
Home
9ee21871-d5e0-47fb-ac80-58e1ded4df7c
Über uns
580f8e94-a4d2-4e03-8580-4a6bb056e771


What I want is:
The search should ignore this database field and search the content instead.
I would also be satisfied if the search read the page description, if that makes it easier.

I hope someone has experience with SimpleSearch and can help me.

Best regards
Julian

PS: Is there any documentation for SimpleSearch out there?

#Root.fusion
prototype(Flowpack.SimpleSearch.ContentRepositoryAdaptor:Search) < prototype(Neos.Fusion:Template) {
    templatePath = 'resource://WG.Basesite/Private/Fusion/PluginOverwrites/Flowpack.SimpleSearch.ContentRepositoryAdaptor/Resources/Private/Templates/NodeTypes/Search.html'

    searchResults = ${Search.query(site).nodeType('Neos.Neos:Document').log().fulltext(request.arguments.search.word).execute()}
    searchWord = ${request.arguments.search.word}
    searchQuery = ${this.searchWord ? Search.query(site).nodeType('WG.BaseSite:Document.AbstractPage').log().fulltext(request.arguments.search.word + '*') : null}

    searchResultContent = ${Search.query(searchResult).nodeType('Neos.Neos:Content').fulltextMatchResult(request.arguments.search.word + '*')}

    configuration = Neos.Fusion:DataStructure {
        itemsPerPage = 5
        insertAbove = false
        insertBelow = true
        maximumNumberOfLinks = 3
    }

    @cache {
        mode = 'uncached'

        context {
            1 = 'site'
            2 = 'node'
        }

    }
}

#Search.html
{namespace neos=Neos\Neos\ViewHelpers}
{namespace fusion=Neos\Fusion\ViewHelpers}
{namespace search=Neos\ContentRepository\Search\ViewHelpers}
<div class="flowpack-simplesearch-search">
    <form method="POST">
        <input id="searchInput" name="search[word]" value="{searchWord}" />
        <input type="hidden" name="--neos-contentrepository-viewhelpers-widget-paginateviewhelper[currentPage]"
            value="" />
        <button type="submit"></button>
    </form>
    <!-- <hr/> -->
    <f:if condition="{searchWord}">
        <div id="searchResults" class="search-results">
            <f:if condition="{searchQuery}">
                <dl>
                    <search:widget.paginate query="{searchQuery}" as="paginatedNodes" configuration="{configuration}">
                        <f:for each="{paginatedNodes}" as="searchResult">
                            <dt><neos:link.node node="{searchResult}">{searchResult.fullLabel}</neos:link.node></dt>
                            <dd>
                                <fusion:render path="searchResultContent" context="{searchResult: searchResult}" />
                            </dd>
                        </f:for>
                    </search:widget.paginate>
                </dl>
            </f:if>
        </div>
    </f:if>
</div>

Which fields did you enable for the full text search?
Which search backend are you using?

I used it in a project last week and didn’t have those issues. But I’m also not using the fulltextMatchResult method; instead, I render my own result based on the node’s properties.

I haven’t enabled anything because I don’t know where.

Except for changing the NodeType for the searchQuery and a bit of HTML structure, I use SimpleSearch right out of the box.

I tried a few changes in Configuration/NodeTypes.yaml, but they didn’t do anything, even after rebuilding the index.

What do you mean by “search backend”? I haven’t configured a MySQL connection, so I guess SQLite? I’m not very familiar with SQLite.

Can you tell me where I have to make changes?

Best regards
Julian

You can find the configuration to set up SimpleSearch with MySQL instead of SQLite in the Flowpack/Flowpack.SimpleSearch repository on GitHub (“A simple php/sqlite search engine for generic data”).
But this is not a necessity, just good to know. I have used both in the past.

The included node type configuration is in Configuration/NodeTypes.yaml (master branch) of the Flowpack/Flowpack.SimpleSearch.ContentRepositoryAdaptor repository on GitHub.
That should give you some hints on how to adjust your own node types.

The fulltextExtractor that you can see in the node type configuration extracts the HTML of a property and sorts it into seven buckets called h1 to h6 and “text”. Those buckets define the priority of the fields.

You can also do something like this, to tell the indexer how to handle a field:

fulltextExtractor: "${Indexing.extractInto('h6', node.properties.metaKeywords)}"
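
As a rough sketch of how the two extractor styles could sit side by side in a NodeTypes.yaml: the node type 'Vendor.Site:Content.Text' and its property 'text' are made-up names for illustration only, and the comments describe the bucket behaviour as explained above.

```yaml
# Sketch only: 'Vendor.Site:Content.Text' and its 'text' property are
# hypothetical names; replace them with your own node type and property.
'Vendor.Site:Content.Text':
  properties:
    text:
      search:
        # Splits the HTML by tag: heading content goes into the matching
        # h1..h6 bucket, everything else into the "text" bucket.
        fulltextExtractor: "${Indexing.extractHtmlTags(node.properties.text)}"

'Neos.Neos:Document':
  properties:
    metaDescription:
      search:
        # Puts the whole value into one explicitly chosen bucket.
        fulltextExtractor: "${Indexing.extractInto('h6', node.properties.metaDescription)}"
```

Changes like this only show up in the search after the index has been rebuilt.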

Hey Sebastian,

thank you for your help. I was already familiar with these links. But you gave me new ideas. Especially the fulltextExtractor: "${Indexing.extractInto('h6', node.properties.metaKeywords)}" part is really useful to know.

Also, I found the SQLite file, which gave me some more hints.

I added:

'Neos.Neos:Content':
  search:
    fulltext:
      isRoot: true
      enable: true
  properties:
    Text_1:
      search:
        fulltextExtractor: "${Indexing.extractHtmlTags(node.properties.Text_1)}"

And modified:

'Neos.Neos:Document':
  search:
    fulltext:
      isRoot: true
      enable: true
  properties:
    title:
      search:
        fulltextExtractor: "${Indexing.extractInto('h1', node.properties.title)}"
    metaKeywords:
      search:
        fulltextExtractor: "${Indexing.extractInto('h5', node.properties.metaKeywords)}"
    metaDescription:
      search:
        fulltextExtractor: "${Indexing.extractInto('h6', node.properties.metaDescription)}"

So the keywords and the description are added to the index.
But I’m not really satisfied with the “Text_1” solution. Can I just get all properties into the index?

And the search results still give me the __identifier__. It also looks like the search is only based on the title and the identifier. The content of Text_1 is shown, but only if the title or identifier matches the search. If I search for words from Text_1, nothing appears.


I have noticed another problem: if I change the content of Text_1, the index is rebuilt automatically, but the record ends up at the end of the table. As a result, the search result no longer contains the content.


Here is my full NodeTypes.yaml:

#Configuration/NodeTypes.yaml
'Neos.Neos:Node': &node
  properties:
    '__identifier':
      search:
        indexing: "${node.identifier}"

    '__workspace':
      search:
        indexing: "${'#' + node.context.workspace.name + '#'}"

    '__path':
      search:
        indexing: "${node.path}"

    '__parentPath':
      search:
        indexing: "${'#' + Array.join(Indexing.buildAllPathPrefixes(node.parentPath), '#') + '#'}"

    '__sortIndex':
      search:
        indexing: "${node.index}"

    '_removed':
      search:
        # deliberately don't map or index this
        indexing: ''

    '__type':
      search:
        indexing: "${node.nodeType.name}"
    # we index the node type INCLUDING ALL SUPERTYPES
    '__typeAndSupertypes':
      search:
        indexing: "${'#' + Array.join(Indexing.extractNodeTypeNamesAndSupertypes(node.nodeType), '#') + '#'}"
    '__dimensionshash':
      search:
        indexing: "${'#' + String.md5(Json.stringify(node.context.dimensions)) + '#'}"

'unstructured': *node

'Neos.Neos:Hidable':
  properties:
    '_hidden':
      search:
        indexing: "${node.hidden}"

'Neos.Neos:Timable':
  properties:
    '_hiddenBeforeDateTime':
      search:
        indexing: "${(node.hiddenBeforeDateTime ? Date.format(node.hiddenBeforeDateTime, 'U') : null)}"

    '_hiddenAfterDateTime':
      search:
        indexing: "${(node.hiddenAfterDateTime ? Date.format(node.hiddenAfterDateTime, 'U') : null)}"

'Neos.Neos:Content':
  search:
    fulltext:
      isRoot: true
      enable: true
  properties:
    Text_1:
      search:
        fulltextExtractor: "${Indexing.extractHtmlTags(node.properties.Text_1)}"

'Neos.Neos:Document':
  search:
    fulltext:
      isRoot: true
      enable: true
  properties:
    title:
      search:
        fulltextExtractor: "${Indexing.extractInto('h1', node.properties.title)}"
    metaKeywords:
      search:
        fulltextExtractor: "${Indexing.extractInto('h5', node.properties.metaKeywords)}"
    metaDescription:
      search:
        fulltextExtractor: "${Indexing.extractInto('h6', node.properties.metaDescription)}"

'Flowpack.SimpleSearch.ContentRepositoryAdaptor:Search':
  superTypes:
    'Neos.Neos:Content': true
  ui:
    label: 'Search'
    icon: 'icon-search'

I’m pretty sure I’m missing something simple but important.

Best regards
Julian
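
One thing that might be worth trying, as an untested sketch: in the configuration above, both 'Neos.Neos:Document' and 'Neos.Neos:Content' are marked as fulltext roots. As far as I understand the fulltext root concept in Neos.ContentRepository.Search, the text of non-root nodes is collected on the closest fulltext root above them, so a content node that is its own root would keep its text to itself, and a query restricted to document node types would never see it. Assuming the SimpleSearch adaptor behaves the same way, leaving isRoot off the content node type might look like this:

```yaml
# Untested sketch, based on the assumption described above.
'Neos.Neos:Document':
  search:
    fulltext:
      # Documents are the fulltext roots that show up as search results.
      isRoot: true
      enable: true

'Neos.Neos:Content':
  search:
    fulltext:
      # No "isRoot" here, so the extracted text should be collected
      # on the closest document above the content node.
      enable: true
  properties:
    Text_1:
      search:
        fulltextExtractor: "${Indexing.extractHtmlTags(node.properties.Text_1)}"
```

After changing this, the index would need a full rebuild (./flow nodeindex:build) before the search results reflect it.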