DocSearch x Algolia Crawler

If you're not finding the answer to your question on this website, this page will help you. If you're still unsure, don't hesitate to send your question to us directly.

You can also read our Crawler FAQ to understand how the Crawler behaves.

For questions related to the DocSearch program, please see our DocSearch program FAQ.

How often will you crawl my website?

Crawls are scheduled at a random time once a week. You can configure this schedule from the config file or trigger one manually from the Crawler interface.
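As a minimal sketch, a fixed weekly schedule in the config file could look like the fragment below. It assumes the Crawler config exposes a `schedule` option that takes a plain-English string. The schedule value, the `startUrls` entry, and the placeholder credentials are illustrative assumptions, so check the Crawler documentation for the accepted syntax before reusing them.

```javascript
// Illustrative config fragment. In the config file, this object would be the
// argument passed to `new Crawler({...})`.
const crawlerConfig = {
  appId: 'YOUR_ALGOLIA_APP_ID', // placeholder
  apiKey: 'YOUR_CRAWLER_API_KEY', // placeholder
  startUrls: ['https://example.com/docs/'], // assumed site
  // Assumed schedule syntax: crawl once a week at a fixed time.
  schedule: 'at 3:00 pm on Friday',
};

console.log(crawlerConfig.schedule);
```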

Why do I have duplicate content in my results?

This can happen when you have more than one URL pointing to the same content, for example with ./docs, ./docs/ and ./docs/index.html.

We recommend configuring canonical URLs on your website. You can read more in Google's "Consolidate duplicate URLs" guide.

As a last resort, you can set exclusionPatterns to the URL patterns you want the Crawler to skip.
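To find out which URL variants are producing duplicates, a small script can group a list of crawled URLs by a normalized form. The sketch below is a hypothetical helper, not part of the Crawler: `normalizeUrl` and `findDuplicateGroups` are illustrative names, and the normalization rules (dropping a trailing `index.html` and a trailing slash) are assumptions that match the example above.

```javascript
// Hypothetical helper (not part of the Crawler): collapse URL variants that
// usually serve the same page, so duplicates are easy to spot.
function normalizeUrl(rawUrl) {
  const url = new URL(rawUrl);
  // Treat /docs/index.html and /docs/ as the same page.
  let path = url.pathname.replace(/\/index\.html$/, '/');
  // Treat /docs/ and /docs as the same page, but keep the root "/".
  if (path.length > 1) path = path.replace(/\/+$/, '');
  return url.origin + path;
}

function findDuplicateGroups(urls) {
  const groups = new Map();
  for (const rawUrl of urls) {
    const key = normalizeUrl(rawUrl);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(rawUrl);
  }
  // Keep only the normalized URLs reached through more than one variant.
  return [...groups.entries()].filter(([, variants]) => variants.length > 1);
}

const duplicates = findDuplicateGroups([
  'https://example.com/docs',
  'https://example.com/docs/',
  'https://example.com/docs/index.html',
  'https://example.com/blog',
]);
console.log(duplicates);
```

Each group lists the variants that resolve to one page. Those variants are candidates for a canonical URL on the website, or for an entry in exclusionPatterns if canonical URLs are not an option.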

Are the docsearch-scraper and docsearch-configs repositories still maintained?

We've deprecated our legacy infrastructure, but you can still use it to run your own instance and plug it into DocSearch v3!

How to migrate

Every owner should have received a migration email from docsearch@algolia.com with the details. If you were not one of the previous index owners, or the maintainer has changed, you can request access via our support page.

All the steps are detailed in the email you've received, but in order to use the new infrastructure you need to:

  • Join the Algolia application with the invite included in the email
  • Update your frontend integration with the credentials received in the email:
docsearch({
  container: '#docsearch',
  appId: 'YOUR_NEW_ALGOLIA_APP_ID',
  apiKey: 'YOUR_NEW_ALGOLIA_SEARCH_API_KEY',
  indexName: 'YOUR_INDEX_NAME', // it does not change
});

What should I do with my legacy config and credentials?

You can forget about them. We will clean them up once all of our users have migrated to the new infrastructure!

You should use the dedicated web interface to make any changes to your index.

Why do I see two Algolia apps in my dashboard?

We did not remove access to the legacy DocSearch application (BH4D9OD16A) to give you the time to get familiar with our new infrastructure. BH4D9OD16A will remain available until the migration has been completed for all the DocSearch users.

Search yields no results

If your search does not yield any results, but there is no error in your browser developer tools, there might be an issue with your index.

Make sure that:

  1. Your Crawler config matches your website structure

We provide config templates for many website generators, and you can use them as a base for your own config. To debug your selectors, we recommend using the URL tester.

  2. Your index settings are up to date (you'll see a banner in the search preview if not)

The Crawler only applies index settings at index creation, to keep the Algolia dashboard as the source of truth. If you have drastically changed your config, or moved to a different website generator, we recommend deleting your index from the Algolia dashboard before starting a new crawl.
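For the first check, it can help to see what "matching your website structure" means in practice. The sketch below shows one Crawler action whose selectors map page elements to DocSearch record levels through the `helpers.docsearch` record extractor. The URL pattern and every CSS selector are assumptions about a hypothetical site where the docs live inside an `article` element, so adapt them to your own markup.

```javascript
// Sketch of one Crawler action. In a real config, this object sits in the
// `actions` array passed to `new Crawler({...})`.
const docsAction = {
  indexName: 'YOUR_INDEX_NAME',
  pathsToMatch: ['https://example.com/docs/**'], // assumed site layout
  recordExtractor: ({ helpers }) =>
    helpers.docsearch({
      recordProps: {
        // Fixed top-level label, used because no selector is given.
        lvl0: { selectors: '', defaultValue: 'Documentation' },
        // Assumed selectors: headings and text inside <article>.
        lvl1: 'article h1',
        lvl2: 'article h2',
        lvl3: 'article h3',
        content: 'article p, article li',
      },
      indexHeadings: true,
    }),
};
```

If a selector matches nothing on your pages, the corresponding level stays empty, so testing a few representative URLs in the URL tester is a quick way to confirm that the selectors fit.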