Hi everyone,
I’m looking for a best practice to keep a particular page (such as a «thank you for contacting us» page) out of the search index. (I’m using flowpack/simplesearch.)
###Questions
Does anyone know a simple way to exclude a bunch of pages? The easiest would be to skip every page where the default page property “_hiddenInIndex == true” is set.
Is there a built-in way to exclude all pages marked «Hide from search engines (noindex)» from internal indexing?
Or does anyone know a way to exclude pages from the indexing process based on a particular node type property? (That would make it possible to extend the default ‘Neos.NodeTypes:Page’ with a property like «Hide from internal indexing».)
###Current approach
I have defined a node type in PageNotInSearchIndex.yaml:
```yaml
# Document type like the default page, but excluded from search indexing
'Vendor.Site:.PageNotInSearchIndex':
  superTypes:
    'Neos.NodeTypes:Page': TRUE
  ui:
    label: 'Page* (prevent from search-indexing)'
    icon: 'icon-search-minus'
    group: general
  search:
    fulltext:
      isRoot: FALSE
      enable: FALSE
```
Maybe this solution is too simplistic (or wrong), but currently it looks like the pages are excluded when running `./flow nodeindex:build`.
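If several document types need the same treatment, the settings could perhaps be moved into a shared mixin. This is only a rough sketch of that variation: the names `Vendor.Site:Mixin.ExcludeFromSearch` and `Vendor.Site:ThankYouPage` are made up for illustration, and it assumes that the mixin's `search.fulltext` values override the ones inherited from `Neos.NodeTypes:Page` because it is listed after it. Whether that holds should be checked by rebuilding the index.

```yaml
# Hypothetical mixin (name is an assumption): carries the same
# fulltext settings as the node type above, nothing else.
'Vendor.Site:Mixin.ExcludeFromSearch':
  abstract: TRUE
  search:
    fulltext:
      isRoot: FALSE
      enable: FALSE

# Hypothetical page type (name is an assumption) that reuses the mixin.
# The mixin is listed last on the assumption that later superTypes win
# when the configuration is merged.
'Vendor.Site:ThankYouPage':
  superTypes:
    'Neos.NodeTypes:Page': TRUE
    'Vendor.Site:Mixin.ExcludeFromSearch': TRUE
  ui:
    label: 'Thank-you page (not in search index)'
    icon: 'icon-search-minus'
    group: general
```

The upside would be a single place to change the search settings. The downside is the same as with the approach above: editors still have to pick a special page type instead of ticking a checkbox.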
Neos.Seo has a `metaRobotsNoindex` property that I use for such purposes, since you usually want to exclude content from internal and external indexing at the same time.
Yes … I basically exclude the same content from internal indexing as from external indexing. Editors only have to learn one thing. I cannot remember a sane reason for separating the two.
Me too.
Maybe you could explain how you exclude those «metaRobotsNoindex == true» pages from internal indexing? Sorry, but I can’t find a solution. Do you add arguments to `./flow nodeindex:build`?
@christianm: thanks for the answer.
I will use my approach, or maybe test filtering in the query manually.
Excluding pages from indexing could cost less: the index is only built once a night, so indexing exactly the needed pages could give a faster build and a smaller index file, and searches could also be faster without query filtering. Maybe.