Creating a Search with SimpleSearch.ContentRepositoryAdaptor

Hey, I am creating a global search for a website. I am using SimpleSearch and am quite happy with this solution.

After a few easy configurations, I am now stuck.
I can search for page titles with this code, but not for content. Instead, the search returns results that match neither the title nor the content, but the database field “persistence_object_identifier”.

For example, a search just for the letter “A” gives me this result:
Home
9ee21871-d5e0-47fb-ac80-58e1ded4df7c
Über uns
580f8e94-a4d2-4e03-8580-4a6bb056e771


What I want is:
The search should ignore this database field and look at the content instead.
I would also be satisfied if the search used the page description, if that makes it easier.

I hope someone has experience with SimpleSearch and can help me.

Best regards
Julian

PS: Is there any documentation for SimpleSearch out there?

#Root.fusion
prototype(Flowpack.SimpleSearch.ContentRepositoryAdaptor:Search) < prototype(Neos.Fusion:Template) {
    templatePath = 'resource://WG.Basesite/Private/Fusion/PluginOverwrites/Flowpack.SimpleSearch.ContentRepositoryAdaptor/Resources/Private/Templates/NodeTypes/Search.html'

    searchResults = ${Search.query(site).nodeType('Neos.Neos:Document').log().fulltext(request.arguments.search.word).execute()}
    searchWord = ${request.arguments.search.word}
    searchQuery = ${this.searchWord ? Search.query(site).nodeType('WG.BaseSite:Document.AbstractPage').log().fulltext(request.arguments.search.word + '*') : null}

    searchResultContent = ${Search.query(searchResult).nodeType('Neos.Neos:Content').fulltextMatchResult(request.arguments.search.word + '*')}

    configuration = Neos.Fusion:DataStructure {
        itemsPerPage = 5
        insertAbove = false
        insertBelow = true
        maximumNumberOfLinks = 3
    }

    @cache {
        mode = 'uncached'

        context {
            1 = 'site'
            2 = 'node'
        }

    }
}

#Search.html
{namespace neos=Neos\Neos\ViewHelpers}
{namespace fusion=Neos\Fusion\ViewHelpers}
{namespace search=Neos\ContentRepository\Search\ViewHelpers}
<div class="flowpack-simplesearch-search">
    <form method="POST">
        <input id="searchInput" name="search[word]" value="{searchWord}" />
        <input type="hidden" name="--neos-contentrepository-viewhelpers-widget-paginateviewhelper[currentPage]"
            value="" />
        <button type="submit"></button>
    </form>
    <!-- <hr/> -->
    <f:if condition="{searchWord}">
        <div id="searchResults" class="search-results">
            <f:if condition="{searchQuery}">
                <dl>
                    <search:widget.paginate query="{searchQuery}" as="paginatedNodes" configuration="{configuration}">
                        <f:for each="{paginatedNodes}" as="searchResult">
                            <dt><neos:link.node node="{searchResult}">{searchResult.fullLabel}</neos:link.node></dt>
                            <dd>
                                <fusion:render path="searchResultContent" context="{searchResult: searchResult}" />
                            </dd>
                        </f:for>
                    </search:widget.paginate>
                </dl>
            </f:if>
        </div>
    </f:if>
</div>

Which fields did you enable for the full text search?
Which search backend are you using?

I used it in a project last week and didn’t have those issues. But I’m also not using the fulltextMatchResult method; instead I render my own result based on the node’s properties.

I haven’t enabled anything because I don’t know where.

Except for changing the NodeType for the searchQuery and a bit of the HTML structure, I use SimpleSearch right out of the box.

I tried a few changes in Configuration/NodeTypes.yaml, but they had no effect, even after rebuilding the index.

What do you mean by “search backend”? I haven’t configured a MySQL connection, so I guess SQLite? I’m not very familiar with SQLite.

Can you tell me where I have to make changes?

Best regards
Julian

You can find the configuration for setting up SimpleSearch with MySQL instead of SQLite in the GitHub repository Flowpack/Flowpack.SimpleSearch (“A simple php/sqlite search engine for generic data”).
But this is not a necessity, just good to know. I used both in the past.

The included configuration for node types is in Configuration/NodeTypes.yaml on the master branch of the GitHub repository Flowpack/Flowpack.SimpleSearch.ContentRepositoryAdaptor.
That should give you some hints on how to adjust your own nodetypes.

The fulltextExtractor that you can see in the node type configuration extracts the HTML of a property and sorts it into seven buckets, called h1 to h6 and “text”. The buckets define the priority of the fields.

You can also do something like this, to tell the indexer how to handle a field:

fulltextExtractor: "${Indexing.extractInto('h6', node.properties.metaKeywords)}"
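
As a slightly larger sketch of the same idea: the node type name Vendor.Site:Content.Teaser and its properties headline and text are hypothetical and only serve as an illustration. Indexing.extractInto puts a plain value into one named bucket, while Indexing.extractHtmlTags distributes HTML content over the buckets according to its tags.

```yaml
# Hypothetical node type, used only to illustrate the two extractor helpers.
'Vendor.Site:Content.Teaser':
  search:
    fulltext:
      enable: true
  properties:
    headline:
      search:
        # plain value, placed into the "h2" bucket
        fulltextExtractor: "${Indexing.extractInto('h2', node.properties.headline)}"
    text:
      search:
        # HTML value, split into the h1-h6 and "text" buckets by its tags
        fulltextExtractor: "${Indexing.extractHtmlTags(node.properties.text)}"
```

The index has to be rebuilt after a change like this before it shows up in the search.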

Hey Sebastian,

thank you for your help. I was already familiar with these links. But you gave me new ideas. Especially the fulltextExtractor: "${Indexing.extractInto('h6', node.properties.metaKeywords)}" part is really useful to know.

I also found the SQLite file, which gave me some further hints.

I added:

'Neos.Neos:Content':
  search:
    fulltext:
      isRoot: true
      enable: true
  properties:
    Text_1:
      search:
        fulltextExtractor: "${Indexing.extractHtmlTags(node.properties.Text_1)}"

And modified:

'Neos.Neos:Document':
  search:
    fulltext:
      isRoot: true
      enable: true
  properties:
    title:
      search:
        fulltextExtractor: "${Indexing.extractInto('h1', node.properties.title)}"
    metaKeywords:
      search:
        fulltextExtractor: "${Indexing.extractInto('h5', node.properties.metaKeywords)}"
    metaDescription:
      search:
        fulltextExtractor: "${Indexing.extractInto('h6', node.properties.metaDescription)}"

So the keywords and the description are added to the index.
But I’m not really satisfied with the “Text_1” solution. Can I simply get all properties into the index?

The search results also still show the __identifier__. It looks as if the search only considers the title and the identifier: the content of Text_1 is displayed, but only when the title or identifier matches the search word. If I search for words from Text_1, nothing is found.
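
One possible explanation, stated as an assumption and not verified against this setup: in the Neos.ContentRepository.Search adaptors, a node type marked with isRoot: true is treated as its own fulltext root, while the fulltext of nodes without isRoot is collected into the closest fulltext root above them. If SimpleSearch follows that convention, marking Neos.Neos:Content as isRoot would keep the content text out of the page's fulltext entry. A variant that leaves only documents as roots could look like this:

```yaml
# Sketch only: assumes content fulltext is aggregated into the closest fulltext root.
'Neos.Neos:Document':
  search:
    fulltext:
      isRoot: true
      enable: true

'Neos.Neos:Content':
  search:
    fulltext:
      # no isRoot here, so the text can be added to the parent document
      enable: true
  properties:
    Text_1:
      search:
        fulltextExtractor: "${Indexing.extractHtmlTags(node.properties.Text_1)}"
```

With this variant, a search for a word from Text_1 would be expected to match the page that contains the content node, provided the index is rebuilt after the change.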


I have another question: if I change the content of Text_1, the index is rebuilt automatically, but the dataset is at the end of the table. As a result, the search result no longer contains the content.


Here is my full NodeTypes.yaml:

#Configuration/NodeTypes.yaml
'Neos.Neos:Node': &node
  properties:
    '__identifier':
      search:
        indexing: "${node.identifier}"

    '__workspace':
      search:
        indexing: "${'#' + node.context.workspace.name + '#'}"

    '__path':
      search:
        indexing: "${node.path}"

    '__parentPath':
      search:
        indexing: "${'#' + Array.join(Indexing.buildAllPathPrefixes(node.parentPath), '#') + '#'}"

    '__sortIndex':
      search:
        indexing: "${node.index}"

    '_removed':
      search:
        # deliberately don't map or index this
        indexing: ''

    '__type':
      search:
        indexing: "${node.nodeType.name}"
    # we index the node type INCLUDING ALL SUPERTYPES
    '__typeAndSupertypes':
      search:
        indexing: "${'#' + Array.join(Indexing.extractNodeTypeNamesAndSupertypes(node.nodeType), '#') + '#'}"
    '__dimensionshash':
      search:
        indexing: "${'#' + String.md5(Json.stringify(node.context.dimensions)) + '#'}"

'unstructured': *node

'Neos.Neos:Hidable':
  properties:
    '_hidden':
      search:
        indexing: "${node.hidden}"

'Neos.Neos:Timable':
  properties:
    '_hiddenBeforeDateTime':
      search:
        indexing: "${(node.hiddenBeforeDateTime ? Date.format(node.hiddenBeforeDateTime, 'U') : null)}"

    '_hiddenAfterDateTime':
      search:
        indexing: "${(node.hiddenAfterDateTime ? Date.format(node.hiddenAfterDateTime, 'U') : null)}"

'Neos.Neos:Content':
  search:
    fulltext:
      isRoot: true
      enable: true
  properties:
    Text_1:
      search:
        fulltextExtractor: "${Indexing.extractHtmlTags(node.properties.Text_1)}"

'Neos.Neos:Document':
  search:
    fulltext:
      isRoot: true
      enable: true
  properties:
    title:
      search:
        fulltextExtractor: "${Indexing.extractInto('h1', node.properties.title)}"
    metaKeywords:
      search:
        fulltextExtractor: "${Indexing.extractInto('h5', node.properties.metaKeywords)}"
    metaDescription:
      search:
        fulltextExtractor: "${Indexing.extractInto('h6', node.properties.metaDescription)}"

'Flowpack.SimpleSearch.ContentRepositoryAdaptor:Search':
  superTypes:
    'Neos.Neos:Content': true
  ui:
    label: 'Search'
    icon: 'icon-search'

I’m pretty sure I’m missing something simple but important.

Best regards
Julian

I still don’t understand why the identifier is part of your full text search. I have never experienced that. Does the identifier show up in any other column of the SQLite table?

And what do you mean by “but the dataset is at the end of the table”?

The identifier is only in the first column of the table.
I have configured the indexing so that the document description goes into the “text” column and is rendered when the search word matches any word of the description. But if the description is empty, SimpleSearch falls back to the identifier and renders that.

Maybe this is the default behavior, but I don’t know how to avoid it.


As for the last part: I think I made a small mistake in my reasoning.