Kagi LLM skill
Recently, Kagi released an official API dashboard that included its own MCP server. This made my current Kagi skill for opencode irrelevant, so I took out multiple life insurance policies and holding its funeral here :D
LLM Concepts Issue
It turns out that when you give an LLM search results that don’t include sites infected with ads and auto-generated content, it can stay mostly kinda almost-ish consistent in its tasks.
What I mean by this is that even if you provide an LLM with proper docs for something, if that something doesn’t follow patterns previously set by similar projects before it, the LLM will struggle with complex tasks, for example, a new programming language with a unique syntax.
In my experience, trying to make Opus 4.6 help me construct a virtual key in Kanata whose events trigger on an on-held event but only on the on-release of said trigger, which itself is activated via a chord, it shits the bed. I’m sure that if I had prompted better, added examples, etc, I would have gotten better results, but at that point I’ll just do it myself.
Anyway, rant over here’s the skill.
Kagi Skill
This skill uses Firecrawl for the scraping side but it’s not needed, opencode’s webfetch (or whatever scraper you’re using) should work just fine.
SKILL.MD
---
name: search
description: |
web search, crawling, and page interaction via Kagi search cli -> Firecrawl CLI . Use this skill whenever the user wants to search the web, find articles, research a topic, look something up online, scrape a webpage, grab content from a URL, extract data from a website, crawl documentation, download a site, or interact with pages that need clicks or logins. Also use when they say "fetch this page", "pull the content from", "get the page at https://", or reference scraping external websites. This provides real-time web search with full page content extraction and interact capabilities.
allowed-tools:
- Bash(firecrawl *)
- Bash(kagi *)
---
# search
- what? `skill to search and process the web`
- when? (
user.ask[questions examine proof problem verify]: `user has a question`
user.know[understand comprehend grasp fathom perceive]: `user wants to understand something`
user.do[crawling scraping maping downloading searching]: `user wants to interact with website(s)`
)
- where? `through system shell using cli tools[kagi firecrawl]`
## search.basic
1. kagi.search[`kagi is a better google`]: `kagi "foobar"`
- kagi.search.multiple[`multiple searches at once`]: `kagi "foobar" && kagi "fooobar"`
- kagi.search.output[`example output`]: ```
TITLE1
https://example1.com
SUMMARY1
TITLE2
https://example2.com
SUMMARY2
TITLE2
https://example2.com
SUMMARY2
--- Related Searches ---
examples of examples
list of examples
examples news```
2. redundancy.check[`skip scraping if success`]: `ls .firecrawl | grep -E 'foo|bar'`
3. firecrawl.scrape[`scrape results`]: `firecrawl scrape -u "example.com" -o .firecrawl/example.md`
- firecrawl.scrape.flag.o[`naming convention`]: ```
.firecrawl/search-{query}.json
.firecrawl/search-{query}-scraped.json
.firecrawl/{site}-{path}.md```
- firecrawl.scrape.flag.f[`output format`]: ```
# format options: markdown, html, rawHtml, links, images, screenshot, summary, changeTracking, json, attributes, branding
# no format flag defaults to markdown
firecrawl scrape -u "example.com" -o .firecrawl/example.md
# use format to get content relevant to your task
firecrawl scrape -u "example.com" -f "links" -o .firecrawl/example.md
# multiple formats will return json
firecrawl scrape -u "example.com" -f "links,images" -o .firecrawl/example.json```
- firecrawl.scrape.multiple[`multiple scrapes at once`]: ```bash
firecrawl scrape -u "example.com" -o .firecrawl/1.md &
firecrawl scrape -u "example.com/foo" -o .firecrawl/2.md &
firecrawl scrape -u "bar.example.com" -o .firecrawl/3.md &```
4. redundancy.check[`helps reduce token usage`]: ```bash
# List word count
wc -l .firecrawl/example.md
# With head for quick peek
wc -l .firecrawl/file.md && head -50 .firecrawl/file.md```
5. os.read[`read results`]: ```bash
# Grep keyword
grep -n "keyword" .firecrawl/file.md
# Use jq to parse json
jq -r '.data.web[].url' .firecrawl/example.json
# example for getting titles and urls
jq -r '.data.web[] | "\(.title): \(.url)"' .firecrawl/example.json```
## firecrawl.commands
- firecrawl.scrape[`have a url, page is static or js-rendered`]: `get a pages content`
- firecrawl.download[`only use if kagi returns nothing`]: `Find pages on a topic`
- firecrawl.crawl[`bulk extract a site section`]: `Need many pages (e.g., all /docs/)`
- firecrawl.map[`need to locate a specific subpage`]: `find urls within a site`
- firecrawl.search[`only use if kagi returns nothing`]: `Find pages on a topic`
- firecrawl.agent[`need structured data from complex sites`]: `AI-powered data extraction`
- firecrawl.interact[`content requires clicks, scrolls, or typing, use with interact for best results`]: `interact with a page`
- firecrawl.download[`download a site to files`]: `save an entire site as local files`
My approach
This skill uses a type of pseudo code approach for guiding the LLM towards what I want. Why do I do this? Well, I hop LLMs frequently, and certain prompts work better with different LLMs. Gemini works better with XML prompts, while Claude likes Markdown. My source? My brain :) So don’t take this approach too seriously, it’s just me experimenting but with consistent results.
I recommend trying it out if your prompts seem to keep missing their mark.
search.sh
#!/usr/bin/env bash
# Kagi Search CLI
# Usage: tool <search query>
# Add your API key below
API_KEY="APIKEY_HERE"
if [ -z "$1" ]; then
echo "Usage: $0 <search query>"
exit 1
fi
QUERY="$*"
RESPONSE=$(curl -s -G \
--data-urlencode "q=$QUERY" \
-H "Authorization: Bot $API_KEY" \
"https://kagi.com/api/v0/search")
# Check for API errors
if echo "$RESPONSE" | jq -e '.error' &> /dev/null; then
echo "API Error:"
echo "$RESPONSE" | jq -r '.error'
exit 1
fi
# Print search results (t=0)
echo "$RESPONSE" | jq -r '.data[] | select(.t == 0) | "\n\(.title)\n\(.url)\n\(.snippet // "")"'
# Show related searches if any (t=1)
RELATED=$(echo "$RESPONSE" | jq -r '.data[] | select(.t == 1) | .list[]' 2>/dev/null)
if [ -n "$RELATED" ]; then
echo -e "\n--- Related Searches ---"
echo "$RELATED"
fi