Version: Beta (v4.x)

Improving Answer Quality with Markdown Indexing

To deliver accurate, context-rich answers at scale, Ask AI needs cleanly structured content. One of the most effective ways to provide it is to use the Markdown indexing helper in your Algolia Crawler configuration. This setup gives Ask AI well-formed, content-focused records, which matters most on larger documentation sites where metadata, navigation elements, or layout artifacts would otherwise dilute the quality of generative responses.

Info: These steps are most valuable for large sites that use DocSearch-generated indices, but they also apply to custom or smaller setups, where you can manually create and upload a Markdown-based index tailored to Ask AI.

Note: For more integration examples (Docusaurus, VitePress, Astro/Starlight, and generic setups), see the section at the bottom of this page.

Overview

To maximize the quality of Ask AI responses, configure your Crawler to create a dedicated index for Markdown content. Ask AI can then work with structured, chunked records sourced from your documentation, support content, or any Markdown-based material, which yields more relevant and precise answers. The steps below walk through setting up your Crawler to index Markdown content specifically for Ask AI.


Step 1: Update your existing DocSearch Crawler configuration

  • In your Crawler config, add the following to your actions: [ ... ] array:
// actions: [ ...,
{
  indexName: "my-markdown-index",
  pathsToMatch: ["https://example.com/docs/**"],
  recordExtractor: ({ $, url, helpers }) => {
    const text = helpers.markdown("main"); // Change "main" to match your content tag (e.g., "main", "article", etc.)
    if (text === "") return [];

    // Extract language or other attributes as needed. Optional
    const language = $("html").attr("lang") || "en";

    return helpers.splitTextIntoRecords({
      text,
      baseRecord: {
        url,
        objectID: url,
        title: $("head > title").text(),
        lang: language, // Add more attributes as needed
      },
      maxRecordBytes: 100000, // Higher = fewer, larger records. Lower = more, smaller records.
      // Note: Increasing this value may increase the token count for LLMs, which can affect context size and cost.
      orderingAttributeName: "part",
    });
  },
},
// ...],
  • Then, add the following to your initialIndexSettings: { ... } object:
// initialIndexSettings: { ...,
"my-markdown-index": {
  attributesForFaceting: ["lang"], // Add more if you extract more attributes
  ignorePlurals: true,
  minProximity: 4,
  removeStopWords: false,
  searchableAttributes: ["unordered(title)", "unordered(text)"],
  removeWordsIfNoResults: "allOptional", // A graceful fallback when the LLM's query finds no results
},
// ...},
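
To make the effect of maxRecordBytes and orderingAttributeName more concrete, here is a minimal sketch of byte-bounded chunking in plain JavaScript. It is not the Crawler's actual implementation of helpers.splitTextIntoRecords: the function name splitTextSketch, the paragraph-based splitting, the objectID suffix scheme, and measuring only the text attribute are all assumptions made for illustration.

```javascript
// Hypothetical sketch: NOT the Crawler's implementation of helpers.splitTextIntoRecords.
// Splits Markdown on blank lines and packs paragraphs into records whose
// `text` stays under maxRecordBytes (measured as UTF-8 bytes of the text only).
// A single paragraph larger than the limit is kept whole in its own record.
function splitTextSketch({ text, baseRecord, maxRecordBytes, orderingAttributeName }) {
  const byteLength = (s) => new TextEncoder().encode(s).length;
  const paragraphs = text.split(/\n{2,}/).filter((p) => p.trim() !== "");

  const chunks = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const candidate = current === "" ? paragraph : `${current}\n\n${paragraph}`;
    if (current !== "" && byteLength(candidate) > maxRecordBytes) {
      chunks.push(current); // current chunk is full; start a new one
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current !== "") chunks.push(current);

  // Each record copies the base attributes and carries its position in the page.
  return chunks.map((chunk, index) => ({
    ...baseRecord,
    objectID: `${baseRecord.objectID}#${index}`, // assumed suffix scheme, for uniqueness
    text: chunk,
    [orderingAttributeName]: index,
  }));
}

const records = splitTextSketch({
  text: "# Install\n\nRun the installer.\n\n# Configure\n\nEdit the config file.",
  baseRecord: {
    url: "https://example.com/docs/setup",
    objectID: "https://example.com/docs/setup",
    title: "Setup",
    lang: "en",
  },
  maxRecordBytes: 40, // deliberately tiny so the sample text splits in two
  orderingAttributeName: "part",
});
console.log(records.map((r) => [r.part, r.text]));
```

With the small limit used here, the sample page yields two records, each holding one heading and its paragraph. Raising the limit merges them into one larger record, which is the trade-off the maxRecordBytes comment describes.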

Step 2: Run the DocSearch Crawler to create a new index optimized for Ask AI

After updating your Crawler configuration:

  1. Publish your configuration in the Algolia Crawler dashboard to save and activate it.
  2. Run the Crawler to index your Markdown content and create the new index.

The Crawler will process your content using the Markdown extraction helper and populate your new index with clean, structured records optimized for Ask AI.

Tip: Monitor the crawl progress in your dashboard to ensure all pages are processed correctly. You can view the indexed records in your Algolia index to verify the structure and content.
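
One way to act on this tip is to run a quick sanity check over a sample of records copied from the index. The sketch below assumes records shaped like those produced by the Step 1 extractor (url, title, text, and a numeric part attribute). findRecordIssues is a hypothetical helper, not part of any Algolia library.

```javascript
// Hypothetical helper: flags records that look malformed.
// Assumes the record shape produced by the Step 1 extractor.
function findRecordIssues(records, { maxRecordBytes = 100000 } = {}) {
  const issues = [];
  for (const record of records) {
    const id = record.objectID ?? "(missing objectID)";
    if (typeof record.url !== "string" || record.url === "") {
      issues.push(`${id}: missing url`);
    }
    if (typeof record.title !== "string" || record.title.trim() === "") {
      issues.push(`${id}: empty title`);
    }
    if (typeof record.text !== "string" || record.text.trim() === "") {
      issues.push(`${id}: empty text`);
    }
    if (!Number.isInteger(record.part)) {
      issues.push(`${id}: missing "part" ordering attribute`);
    }
    // Serialized size is only an approximation of how the record is measured.
    const size = new TextEncoder().encode(JSON.stringify(record)).length;
    if (size > maxRecordBytes) {
      issues.push(`${id}: ${size} bytes exceeds ${maxRecordBytes}`);
    }
  }
  return issues;
}

const sample = [
  { objectID: "a", url: "https://example.com/docs/a", title: "A", text: "Some content", part: 0 },
  { objectID: "b", url: "https://example.com/docs/b", title: "", text: "", part: 0 },
];
console.log(findRecordIssues(sample));
```

An empty result means the sampled records have the attributes the index settings rely on. Any reported issue usually points to a selector in the extractor that does not match the page.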


Step 3: Integrate your new index with Ask AI

Once your Crawler has created your optimized index, you can integrate it with Ask AI in two ways: using DocSearch (recommended for most users) or building a custom integration using the Ask AI API.

Using DocSearch

Configure DocSearch to use both your main keyword index and your Markdown index for Ask AI:

docsearch({
  container: '#docsearch', // Assumed selector for the element that hosts the search box
  indexName: 'YOUR_INDEX_NAME', // Main DocSearch keyword index
  apiKey: 'YOUR_SEARCH_API_KEY',
  appId: 'YOUR_APP_ID',
  askAi: {
    indexName: 'YOUR_INDEX_NAME-markdown', // Markdown index for Ask AI (the indexName used in your Crawler action)
    apiKey: 'YOUR_SEARCH_API_KEY', // (or a different key if needed)
    appId: 'YOUR_APP_ID',
    assistantId: 'YOUR_ALGOLIA_ASSISTANT_ID',
    searchParameters: {
      facetFilters: ['lang:en'], // Optional: filter on a faceted attribute, here the `lang` attribute set in Step 1
    },
  },
});
  • indexName: Your main DocSearch index for keyword search.
  • askAi.indexName: The Markdown index you created for Ask AI context.
  • askAi.assistantId: The ID of your configured Ask AI assistant.
  • askAi.searchParameters.facetFilters: Optional filters to limit Ask AI context (useful for multi-language sites). Each filter must use an attribute that exists on your records and is listed in attributesForFaceting.

Tip: Keep both indexes updated as your documentation evolves to ensure the best search and AI answer quality.
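
For multi-language or multi-version sites, the facetFilters value can be derived at runtime instead of hard-coded. The sketch below is an illustration under stated assumptions: buildAskAiSearchParameters is a hypothetical helper, and the attribute names lang and version must match attributes you actually extract and list in attributesForFaceting.

```javascript
// Hypothetical helper: builds the searchParameters object passed to askAi.
// Only facets with a defined, non-empty value are included.
function buildAskAiSearchParameters(facets) {
  const facetFilters = Object.entries(facets)
    .filter(([, value]) => value !== undefined && value !== null && value !== "")
    .map(([attribute, value]) => `${attribute}:${value}`);
  return facetFilters.length > 0 ? { facetFilters } : {};
}

// In a browser, the language can be read from <html lang="...">.
// Outside a browser (for example in Node.js), fall back to "en".
const pageLang =
  typeof document !== "undefined" ? document.documentElement.lang || "en" : "en";

const searchParameters = buildAskAiSearchParameters({
  lang: pageLang,
  version: undefined, // skipped because no version is known
});
console.log(searchParameters);
```

The returned object can be passed as askAi.searchParameters. Returning an empty object when no facet is known avoids sending a filter that would match no records.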


Best Practices & Tips

  • Use clear, consistent titles in your Markdown content for better searchability.
  • Test your index with Ask AI to ensure relevant answers are returned.
  • Adjust maxRecordBytes based on answer quality: lower it if answers are too broad, raise it if they are too fragmented.
    • Note: Increasing maxRecordBytes may increase the token count for LLMs, which can affect the size of the context window and the cost of each Ask AI response.
  • Keep your Markdown well-structured (use headings, lists, etc.) for optimal chunking.
  • Add attributes like lang, version, or tags to your records, and list them in attributesForFaceting, if you want to filter or facet in your search UI or Ask AI.
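
The last tip can be sketched as a Crawler action that extracts two extra attributes and declares them as facets. The meta tag names used here (docsearch:version and keywords) and the "latest" fallback are assumptions for illustration, so substitute whatever your site actually emits. The helpers calls mirror the ones used in Step 1.

```javascript
// Sketch of a Crawler action that adds `version` and `tags` attributes.
// The meta tag names and the "latest" fallback are assumptions.
const markdownAction = {
  indexName: "my-markdown-index",
  pathsToMatch: ["https://example.com/docs/**"],
  recordExtractor: ({ $, url, helpers }) => {
    const text = helpers.markdown("main");
    if (text === "") return [];

    const language = $("html").attr("lang") || "en";
    const version = $('meta[name="docsearch:version"]').attr("content") || "latest";
    // Turn a comma-separated keywords meta tag into a clean array of tags.
    const tags = ($('meta[name="keywords"]').attr("content") || "")
      .split(",")
      .map((tag) => tag.trim())
      .filter(Boolean);

    return helpers.splitTextIntoRecords({
      text,
      baseRecord: {
        url,
        objectID: url,
        title: $("head > title").text(),
        lang: language,
        version,
        tags,
      },
      maxRecordBytes: 100000,
      orderingAttributeName: "part",
    });
  },
};

// Every attribute you filter on must also be declared as a facet.
const markdownIndexSettings = {
  attributesForFaceting: ["lang", "version", "tags"],
};
```

In a real configuration, the action object goes into the actions array and the settings are merged into the "my-markdown-index" entry of initialIndexSettings.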

FAQ

Q: Why use a separate Markdown index?
A: It allows Ask AI to access content in a format optimized for LLMs, improving answer quality.

Q: Can I use this with other content types?
A: Yes, but Markdown is especially well-suited for chunking and context extraction.

Q: What if I have very large Markdown files?
A: Lower the maxRecordBytes value to split content into smaller, more focused records.
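
To get a feel for this trade-off, a rough back-of-the-envelope estimate can help. The sketch below assumes about 4 bytes of English text per LLM token, which is a common rule of thumb rather than a property of Ask AI or of any specific model. estimateChunking and its parameters are hypothetical.

```javascript
// Hypothetical estimator. BYTES_PER_TOKEN is a rough rule of thumb for English
// text, not a value documented by Algolia; actual tokenization varies by model.
const BYTES_PER_TOKEN = 4;

function estimateChunking(pageBytes, maxRecordBytes) {
  // Number of records needed if the page is split as evenly as the limit allows.
  const records = Math.max(1, Math.ceil(pageBytes / maxRecordBytes));
  // Upper bound on the tokens a single record could contribute to the context.
  const maxTokensPerRecord = Math.ceil(
    Math.min(pageBytes, maxRecordBytes) / BYTES_PER_TOKEN
  );
  return { records, maxTokensPerRecord };
}

// Compare a few limits for a 250,000-byte page.
for (const maxRecordBytes of [100000, 20000, 5000]) {
  console.log(maxRecordBytes, estimateChunking(250000, maxRecordBytes));
}
```

Under these assumptions, lowering the limit from 100000 to 5000 bytes turns 3 large records into 50 small ones, and the most any single record can add to the context drops from roughly 25,000 tokens to about 1,250.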


For more details, see the Ask AI documentation or contact support if you need help configuring your Crawler.


Crawler Configuration Examples by Integration

Below are example configurations for setting up your markdown index with different documentation platforms. Each shows how to extract facets (like language, version, tags) and configure the Crawler for your specific integration:

Generic Example:

// In your Crawler config:

// actions: [ ...,
{
  indexName: "my-markdown-index",
  pathsToMatch: ["https://example.com/**"],
  recordExtractor: ({ $, url, helpers }) => {
    const text = helpers.markdown("main"); // Change "main" to match your content tag (e.g., "main", "article", etc.)
    if (text === "") return [];

    // Customize selectors or meta extraction as needed. Optional
    const language = $("html").attr("lang") || "en";

    return helpers.splitTextIntoRecords({
      text,
      baseRecord: {
        url,
        objectID: url,
        title: $("head > title").text(),
        // Add more optional attributes to the record
        lang: language,
      },
      maxRecordBytes: 100000, // Higher = fewer, larger records. Lower = more, smaller records.
      // Note: Increasing this value may increase the token count for LLMs, which can affect context size and cost.
      orderingAttributeName: "part",
    });
  },
},
// ...],

// initialIndexSettings: { ...,
"my-markdown-index": {
  attributesForFaceting: ["lang"], // List every extra attribute you want to filter on
  ignorePlurals: true,
  minProximity: 4,
  removeStopWords: false,
  searchableAttributes: ["unordered(title)", "unordered(text)"],
  removeWordsIfNoResults: "allOptional", // A graceful fallback when the LLM's query finds no results
},
// ...},

Each example shows how to extract common facets and configure your markdown index for AskAI. Adjust selectors and meta tag names as needed for your site.