This section builds on the details of how we build a DocSearch index and gives you the best practices to optimize our crawl. Adopting the following specification is required to let our crawler build the best experience from your website. You will need to update your website to follow these rules.
Note: If your website is generated with one of our supported tools, you do not need to change it, as it already complies with our requirements.
The generic configuration example
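As a rough illustration of the kind of configuration this section refers to, here is a minimal sketch written as a Python dictionary and serialized to JSON. The index name, the start URL, and the mapping of `lvl0` to `lvl4` onto `h1` to `h5` are placeholder assumptions, not values prescribed by DocSearch; adapt the selectors to your own markup.

```python
import json

# Hypothetical values: "example" and the URL below are placeholders.
config = {
    "index_name": "example",
    "start_urls": ["https://www.example.com/docs"],
    "selectors": {
        # Assumed mapping: one title tag per level, scoped to the main container.
        "lvl0": ".DocSearch-content h1",
        "lvl1": ".DocSearch-content h2",
        "lvl2": ".DocSearch-content h3",
        "lvl3": ".DocSearch-content h4",
        "lvl4": ".DocSearch-content h5",
        # Textual content is limited to paragraphs and list items.
        "text": ".DocSearch-content p, .DocSearch-content li",
    },
}

# Serialize to the JSON form a configuration file would contain.
print(json.dumps(config, indent=2))
```

Every selector is scoped to the `.DocSearch-content` container so that navigation, footers, and other page chrome are not indexed as documentation content.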
Overview of a clear layout
A website implementing these good practices will look simple and crystal clear. It can look like this:
The main blue element will be your `.DocSearch-content` container. More details are given in the following guidelines.
You can add some specific static classes to help us identify the role of your content. These classes must not involve any style changes. These dedicated classes will help us create a great learn-as-you-type experience from your documentation.
Add a static class `DocSearch-content` to the main container of your textual content. Most of the time, this tag is a `<main>` or an `<article>` HTML element.
Every `lvl` element outside this main documentation container (for instance in a sidebar) must be matched by `global` selectors. These elements will be picked up globally and injected into every record built from your page. Be careful: the level value matters, and every matching element must have an increasing level along the HTML flow. A level `X` (for `lvlX`) should appear after a level `Y` when `X > Y`.
`lvlX` selectors should use the standard title tags like `h1`, `h2`, `h3`, etc. You can also use static classes. Set a unique `id` or `name` attribute on these elements as detailed below.
Every DOM element matching the `lvlX` selectors must have a unique `id` or `name` attribute. This lets the redirection scroll directly to the exact place of the matching element. These attributes define the right anchor to use.
Every textual element (selector `text`) must be wrapped in a `<p>` or `<li>` tag. This content must be atomic and split into small entities. Be careful never to nest one matching element inside another, as it will create duplicates.
Stay consistent: we need the same structure along the HTML flow on every page, as presented here.
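The guidelines above can be sketched as a small self-check that you run against your own pages. This is only an illustration, not part of DocSearch: `LayoutChecker`, `check_layout`, and `SAMPLE_PAGE` are hypothetical names, and the sketch assumes that levels map to `h1` through `h6`, that anchors come from an `id` or `name` attribute, and that text elements are `<p>` and `<li>` tags. It scans the whole fragment and does not tell content inside the main container from content outside it.

```python
from html.parser import HTMLParser

# Hypothetical page fragment with two deliberate mistakes near the end.
SAMPLE_PAGE = """
<main class="DocSearch-content">
  <h1 id="getting-started">Getting started</h1>
  <p>Install the package.</p>
  <h2 id="configuration">Configuration</h2>
  <ul>
    <li>Set your API key.</li>
    <li>Choose an index name.</li>
  </ul>
  <h2 id="configuration">Duplicate anchor</h2>
  <h3>No anchor</h3>
</main>
"""

HEADINGS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4, "h5": 5, "h6": 6}
TEXT_TAGS = {"p", "li"}


class LayoutChecker(HTMLParser):
    """Collects layout problems while walking the HTML flow."""

    def __init__(self):
        super().__init__()
        self.levels = []      # heading levels in document order
        self.anchors = set()  # id/name values seen so far
        self.open_text = []   # stack of currently open text tags
        self.problems = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in HEADINGS:
            self.levels.append(HEADINGS[tag])
            anchor = attrs.get("id") or attrs.get("name")
            if not anchor:
                self.problems.append(f"<{tag}> has no id or name")
            elif anchor in self.anchors:
                self.problems.append(f"duplicate anchor '{anchor}'")
            else:
                self.anchors.add(anchor)
        elif tag in TEXT_TAGS:
            if self.open_text:
                outer = self.open_text[-1]
                self.problems.append(f"<{tag}> nested inside <{outer}>")
            self.open_text.append(tag)

    def handle_endtag(self, tag):
        # Only explicit closing tags are tracked; unclosed <p> is not handled.
        if self.open_text and self.open_text[-1] == tag:
            self.open_text.pop()


def check_layout(html):
    """Return a list of human-readable problems found in the HTML."""
    checker = LayoutChecker()
    checker.feed(html)
    checker.close()
    # If the first heading is the shallowest one, every deeper heading
    # appears after a shallower heading, as the level rule requires.
    if checker.levels and checker.levels[0] != min(checker.levels):
        checker.problems.append("first heading is not the shallowest level")
    return checker.problems


for problem in check_layout(SAMPLE_PAGE):
    print(problem)
# duplicate anchor 'configuration'
# <h3> has no id or name
```

The sample page passes the ordering and nesting checks but is flagged for a reused anchor and for a heading with no anchor at all, the two mistakes that would prevent a search result from scrolling to the right place.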
Introduce global information as meta tags
Our crawler automatically extracts information from DocSearch-specific meta tags.
The `content` value of the `meta` tag will be added to every record extracted from the page. Given that the name is `docsearch:$NAME`, `$NAME` will be set as an attribute in every record. Its value will be the related `content` value.
You can then turn these attributes into `facetFilters` to filter on them from the UI. We will need to set the `attributesForFaceting` of your Algolia index, exposed via the DocSearch `custom_settings` parameter.
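To make the mechanism concrete, the following sketch mimics how such meta tags could end up as record attributes and facet filters. It is not the crawler's actual code: `MetaExtractor`, `extract_meta_attributes`, and the sample `language` and `version` tags are illustrative assumptions, and the sketch assumes the meta names follow a `docsearch:$NAME` pattern.

```python
from html.parser import HTMLParser

PREFIX = "docsearch:"  # assumed naming pattern: docsearch:$NAME

# Hypothetical page head; only the first two tags use the assumed prefix.
PAGE_HEAD = """
<head>
  <meta name="docsearch:language" content="en" />
  <meta name="docsearch:version" content="1.0.0" />
  <meta name="description" content="Not a DocSearch tag" />
</head>
"""


class MetaExtractor(HTMLParser):
    """Collects $NAME -> content pairs from prefixed meta tags."""

    def __init__(self):
        super().__init__()
        self.attributes = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = attrs.get("name") or ""
        content = attrs.get("content")
        if name.startswith(PREFIX) and content is not None:
            self.attributes[name[len(PREFIX):]] = content


def extract_meta_attributes(html):
    """Return the attributes that would be copied onto every record."""
    parser = MetaExtractor()
    parser.feed(html)
    parser.close()
    return parser.attributes


attributes = extract_meta_attributes(PAGE_HEAD)

# Every record built from the page carries the same extra attributes.
record = {"content": "Install the package.", **attributes}

# Index setting that makes these attributes usable as facets.
custom_settings = {"attributesForFaceting": sorted(attributes)}

# Filters a search UI could send to restrict results to this language/version.
facet_filters = [f"{name}:{value}" for name, value in sorted(attributes.items())]

print(record)
print(custom_settings)
print(facet_filters)  # ['language:en', 'version:1.0.0']
```

The `description` tag is ignored because its name lacks the assumed prefix, so only `language` and `version` become attributes on the record.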
Nice to have
Your website should have an up-to-date sitemap. This is key to letting our crawler know what should be updated. Do not worry, we will still crawl your website and follow embedded hyperlinks to find your great content.
Every page needs to have its full context available. Using global elements might help (see above).
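Make sure your documentation content is also available without client-side JavaScript rendering. If you absolutely need JavaScript turned on, set `js_render: true` in your configuration.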
Any questions? Send us an email.