clawdbites
Extract recipes from Instagram reels.
Installation
npx clawhub@latest install clawdbites
View the full skill documentation and source below.
Documentation
Instagram Recipe Extractor
Extract recipes from Instagram reels using a multi-layered approach: caption parsing first, then audio transcription, then frame analysis.
No Instagram login required. Works on public reels.
When to Use
- User sends an Instagram reel link
- User mentions "recipe from Instagram" or "save this reel"
- User wants to extract recipe details from a video post
How It Works (MANDATORY FLOW)
ALWAYS follow this complete flow — do not stop after caption if instructions are missing:
1. Extract the caption (yt-dlp --dump-json)
2. Check completeness: does the caption contain both ingredients and instructions?
   - ✅ YES: Present the recipe
   - ❌ NO (missing instructions or incomplete): Automatically proceed to audio transcription; do NOT stop or ask the user
3. If the caption is incomplete:
   - Download video:
     yt-dlp -o "/tmp/reel.mp4" "URL"
   - Extract audio:
     ffmpeg -y -i /tmp/reel.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/reel.wav
   - Transcribe:
     whisper /tmp/reel.wav --model base --output_format txt --output_dir /tmp
   - Merge caption ingredients with audio instructions
Completeness check heuristics:
- Has ingredients = contains 3+ quantity+item patterns (e.g., "1 cup flour", "2 lbs chicken")
- Has instructions = contains action verbs (blend, cook, bake, mix, pour, add) + sequence OR numbered steps
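The two heuristics above could be sketched in Python roughly as follows. This is a minimal illustration, not the skill's actual implementation: the unit list, the sequence words, and all function names are assumptions chosen for the example.

```python
import re

# Quantity + item, e.g. "1 cup flour" or "24oz chicken" (unit list is an assumption).
QUANTITY_ITEM = re.compile(
    r"(?:\d+(?:[./]\d+)?|[¼½¾])\s*(?:cups?|tbsp|tsp|oz|lbs?|g|kg|ml|cloves?)\b\s+\w+",
    re.IGNORECASE,
)
# Action verbs taken from the heuristic above.
ACTION_VERB = re.compile(r"\b(?:blend|cook|bake|mix|pour|add)\b", re.IGNORECASE)
# Words that suggest a sequence of steps (assumed list).
SEQUENCE_WORD = re.compile(r"\b(?:first|then|next|after|finally|until)\b", re.IGNORECASE)
# Lines such as "1. Preheat" or "2) Stir".
NUMBERED_STEP = re.compile(r"^\s*\d+[.)]\s+\S", re.MULTILINE)


def has_ingredients(text: str) -> bool:
    """True when the text contains 3 or more quantity+item patterns."""
    return len(QUANTITY_ITEM.findall(text)) >= 3


def has_instructions(text: str) -> bool:
    """True for numbered steps, or for action verbs combined with sequence words."""
    if NUMBERED_STEP.search(text):
        return True
    return bool(ACTION_VERB.search(text) and SEQUENCE_WORD.search(text))


def caption_is_complete(text: str) -> bool:
    """A caption is complete only when both checks pass."""
    return has_ingredients(text) and has_instructions(text)
```

When `caption_is_complete` returns False, the flow would continue to audio transcription without prompting the user.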
Extraction Command
yt-dlp --dump-json "" 2>/dev/null
Key fields from JSON output:
- description: The caption containing the recipe
- uploader: Creator's name
- channel: Creator's handle
- webpage_url: Original URL
- like_count: Popularity indicator
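As a rough sketch, the same fields can be read from Python by running the command above and parsing its JSON output. The function names `fetch_metadata` and `pick_fields` are hypothetical helpers introduced only for this example.

```python
import json
import subprocess

# The fields listed above; anything else in the yt-dlp output is ignored.
KEY_FIELDS = ("description", "uploader", "channel", "webpage_url", "like_count")


def fetch_metadata(url: str) -> dict:
    """Run `yt-dlp --dump-json` for one URL and return the parsed JSON."""
    result = subprocess.run(
        ["yt-dlp", "--dump-json", url],
        capture_output=True,
        text=True,
        check=True,  # raises CalledProcessError if yt-dlp fails (e.g. private reel)
    )
    return json.loads(result.stdout)


def pick_fields(info: dict) -> dict:
    """Keep only the key fields; missing ones become None."""
    return {key: info.get(key) for key in KEY_FIELDS}
```

Keeping `pick_fields` separate from the network call makes the parsing step easy to check against a saved JSON dump.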
Recipe Parsing
Look for these patterns in the caption:
Macros:
- "X Calories | Xg P | Xg C | Xg F"
- "Macros per serving"
- "Cal/Protein/Carbs/Fat"
Ingredients:
- Lines starting with quantities (1 cup, 2 tbsp, 24oz)
- Lines with measurement units
- Emoji bullet points (🥩 🌽 🧀 etc.)
Sections:
- "For the [component]:"
- "Ingredients:"
- "Instructions:"
- "Directions:"
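For the first macro pattern, a regular expression along these lines might work. It is only a sketch: it assumes whole-number values and the pipe-separated layout shown above, and `parse_macros` is a hypothetical name.

```python
import re

# Matches e.g. "585 Calories | 56g P | 25g C | 28g F" (also "585 cal | ...").
MACRO_LINE = re.compile(
    r"(?P<calories>\d+)\s*(?:calories|cals?)\s*\|\s*"
    r"(?P<protein>\d+)\s*g\s*P\s*\|\s*"
    r"(?P<carbs>\d+)\s*g\s*C\s*\|\s*"
    r"(?P<fat>\d+)\s*g\s*F\b",
    re.IGNORECASE,
)


def parse_macros(caption: str):
    """Return a dict of integer macros, or None when no macro line is found."""
    match = MACRO_LINE.search(caption)
    if match is None:
        return None
    return {name: int(value) for name, value in match.groupdict().items()}
```

Captions that use the other layouts ("Macros per serving", "Cal/Protein/Carbs/Fat") would need additional patterns.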
Output Format
Present extracted recipe cleanly:
## [Recipe Name]
*From @[handle]*
**Macros (per serving):** X cal | Xg P | Xg C | Xg F
### Ingredients
- [ingredient 1]
- [ingredient 2]
...
### Instructions
1. [step 1]
2. [step 2]
...
---
Source: [original URL]
User Actions After Extraction
Let the user decide what to do:
- "Save to my recipes" → Save to Apple Notes (if meal-planner skill available)
- "Add to wishlist" → Save to memory/recipe-wishlist.json
- "Just show me" → Display only, no save
- "Plan this for next week" → Hand off to meal-planner skill
Wishlist Storage
Optional storage for recipes user wants to try later:
memory/recipe-wishlist.json:
{
"recipes": [
{
"name": "Recipe Name",
"source": "instagram",
"sourceUrl": "",
"handle": "@creator",
"addedDate": "2026-01-26",
"tried": false,
"macros": {
"calories": 585,
"protein": 56,
"carbs": 25,
"fat": 28,
"servings": 3
},
"ingredients": [...],
"instructions": [...]
}
]
}
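A minimal sketch of how entries could be added to this file follows. Skipping duplicates by `sourceUrl` and defaulting `tried` to false are assumptions made for the example, and the helper names are hypothetical.

```python
import json
from pathlib import Path


def load_wishlist(path: Path) -> dict:
    """Read the wishlist, or start an empty one if the file does not exist yet."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"recipes": []}


def add_recipe(wishlist: dict, recipe: dict) -> bool:
    """Append a recipe unless its sourceUrl is already stored. Returns True if added."""
    url = recipe.get("sourceUrl")
    if url and any(r.get("sourceUrl") == url for r in wishlist["recipes"]):
        return False
    # New entries default to tried=False unless the recipe says otherwise.
    wishlist["recipes"].append({"tried": False, **recipe})
    return True


def save_wishlist(path: Path, wishlist: dict) -> None:
    """Write the wishlist back as indented JSON, creating the folder if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(wishlist, indent=2, ensure_ascii=False), encoding="utf-8")
```

A caller would typically load `memory/recipe-wishlist.json`, call `add_recipe`, and save only when it returns True.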
Error Handling
If yt-dlp fails:
- Check if URL is valid Instagram reel format
- May be a private account — inform user
- Suggest user paste caption text manually as fallback
If no recipe found in caption (IMPORTANT):
After extracting, scan the caption for recipe indicators:
- Ingredient quantities (numbers + units like oz, cups, tbsp, lbs)
- Recipe sections ("For the...", "Ingredients:", "Instructions:")
- Cooking verbs (bake, cook, sauté, mix, combine)
- Macro information (calories, protein, carbs, fat)
If none found, tell the user clearly:
"I pulled the caption but it doesn't look like the recipe is there — it might just be a teaser or the recipe is only shown in the video itself. Here's what the caption says:
[show caption]
A few options:
1. Check the comments — sometimes creators post recipes there
2. Check their bio link — might lead to the full recipe
3. Describe what you saw in the video and I can help find a similar recipe"
Recipe detection heuristics:
HAS_RECIPE if caption contains:
- 3+ ingredient-like patterns (quantity + food item)
- OR "recipe" + ingredient list
- OR macro breakdown + ingredients
- OR numbered/bulleted instructions
NO_RECIPE if caption is:
- Mostly hashtags
- Just a description/teaser
- Under 100 characters
- No quantities or measurements
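One way to turn these rules into a classifier is sketched below. The order in which the rules are applied, the hashtag threshold (more than half of the words), and the reading of "ingredient list" as "at least one quantity" are all assumptions; the source does not specify them.

```python
import re

# A number followed by a unit, e.g. "2 tbsp" or "24oz" (unit list is an assumption).
QUANTITY = re.compile(
    r"\b\d+(?:[./]\d+)?\s*(?:cups?|tbsp|tsp|oz|lbs?|g|kg|ml)\b", re.IGNORECASE
)
# Numbered or bulleted lines such as "1. Mix" or "- Stir".
LIST_STEP = re.compile(r"^\s*(?:\d+[.)]|[-•*])\s+\S", re.MULTILINE)
MACRO_WORD = re.compile(r"\b(?:calories|protein|carbs|fat)\b", re.IGNORECASE)


def classify_caption(caption: str) -> str:
    """Return "HAS_RECIPE" or "NO_RECIPE" for a caption."""
    text = caption.strip()
    words = text.split()
    hashtags = [w for w in words if w.startswith("#")]

    # Negative signals first: very short captions and hashtag-heavy captions.
    if len(text) < 100:
        return "NO_RECIPE"
    if words and len(hashtags) > len(words) / 2:
        return "NO_RECIPE"

    quantity_count = len(QUANTITY.findall(text))
    if quantity_count >= 3:
        return "HAS_RECIPE"
    if quantity_count >= 1 and ("recipe" in text.lower() or MACRO_WORD.search(text)):
        return "HAS_RECIPE"
    if LIST_STEP.search(text):
        return "HAS_RECIPE"
    return "NO_RECIPE"
```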
Integration with meal-planner
The meal-planner skill can reference this skill:
- When planning meals, check wishlist for untried recipes
- Suggest wishlist recipes that match pantry items
- Mark recipes as "tried" after they're used in a meal plan
Audio Transcription (V2) — MANDATORY FALLBACK
When caption is missing instructions, ALWAYS transcribe the audio automatically. Do not stop and ask the user — just do it. This is the most common case since creators often put ingredients in captions but speak the instructions.
Step 1: Download video
yt-dlp -o "/tmp/reel.mp4" ""
Step 2: Extract audio
ffmpeg -y -i /tmp/reel.mp4 -vn -acodec pcm_s16le -ar 16000 -ac 1 /tmp/reel.wav
Step 3: Transcribe with Whisper
whisper /tmp/reel.wav --model base --output_format txt --output_dir /tmp
Step 4: Parse transcript for recipe
Look for cooking instructions and any ingredients mentioned verbally.
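Steps 1 to 3 can be chained from a script. The sketch below assumes that `yt-dlp`, `ffmpeg`, and `whisper` are all on the PATH, and that the installed openai-whisper names its output after the audio file without the extension (`reel.txt`); older releases may use a different file name. The function names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(url: str, workdir: str = "/tmp") -> list[list[str]]:
    """Return the download, audio-extraction, and transcription commands in order."""
    video = f"{workdir}/reel.mp4"
    audio = f"{workdir}/reel.wav"
    return [
        ["yt-dlp", "-o", video, url],
        # -y overwrites an existing file so the run never waits for a prompt.
        ["ffmpeg", "-y", "-i", video, "-vn", "-acodec", "pcm_s16le",
         "-ar", "16000", "-ac", "1", audio],
        ["whisper", audio, "--model", "base",
         "--output_format", "txt", "--output_dir", workdir],
    ]


def transcribe_reel(url: str, workdir: str = "/tmp") -> str:
    """Run all three commands and return the transcript text."""
    for command in build_commands(url, workdir):
        subprocess.run(command, check=True)  # stop at the first failing step
    # Assumed output name; adjust if the installed whisper names it differently.
    return Path(workdir, "reel.txt").read_text(encoding="utf-8")
```

Passing the commands as argument lists rather than shell strings avoids quoting problems with URLs that contain `&` or `?`.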
Inference for Missing Measurements
ALWAYS infer quantities when not provided. Never present a recipe without amounts — estimate based on context and standard package sizes.
Vague Language → Specific Amounts
| What they say | Infer |
| --- | --- |
| "some chicken" | ~1 lb |
| "a bit of garlic" | 2-3 cloves |
| "handful of spinach" | ~2 cups |
| "drizzle of oil" | 1-2 tbsp |
| "season to taste" | ½ tsp salt, ¼ tsp pepper |
| "splash of soy sauce" | 1-2 tbsp |
| "a few tablespoons" | 2-3 tbsp |
| "some rice" | 1 cup dry |
| "cheese on top" | ½ - 1 cup shredded |
| "diced onion" | 1 medium onion |
| "bell peppers" | 2 peppers |
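A lookup like the table above could be expressed as a small dictionary. This sketch covers only a few rows and uses simple substring matching, which is an assumption; a real parser would likely need fuzzier matching. `infer_amount` is a hypothetical name.

```python
# A subset of the table above: vague phrase -> inferred amount.
VAGUE_AMOUNTS = {
    "some chicken": "~1 lb",
    "a bit of garlic": "2-3 cloves",
    "handful of spinach": "~2 cups",
    "drizzle of oil": "1-2 tbsp",
    "splash of soy sauce": "1-2 tbsp",
    "some rice": "1 cup dry",
}


def infer_amount(spoken: str):
    """Return the inferred amount for a spoken phrase, or None if nothing matches."""
    text = spoken.strip().lower()
    for phrase, amount in VAGUE_AMOUNTS.items():
        if phrase in text:
            return amount
    return None
```

Any amount returned this way is an estimate and should be presented as such.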
Standard Package Sizes (when item mentioned without amount)
| Ingredient | Standard Package | Infer |
| --- | --- | --- |
| Puff pastry | 17oz sheet | 1 sheet |
| Ground beef/turkey | 1 lb pack | 1 lb |
| Chicken breast | ~1.5 lb pack | 1.5 lbs |
| Sausage links | 14oz / 4-5 links | 1 package |
| Bacon | 12oz / 12 slices | ½ package (6 slices) |
| Shredded cheese | 8oz bag | 1-2 cups |
| Tortillas | 8-10 count | 1 package |
| Canned beans | 15oz can | 1 can |
| Broth/stock | 32oz carton | 1-2 cups |
| Pasta | 16oz box | 8oz (half box) |
| Rice | 2 lb bag | 1-2 cups dry |
Context-Aware Scaling
By recipe type:
- Stir fry for 2 → 1 lb protein, 4 cups veggies
- Soup/stew → 1.5-2 lbs protein, 4 cups broth
- Sheet pan meal → 1.5 lbs protein, 3-4 cups veggies
- Appetizers → smaller portions, estimate ~12-15 pieces per batch
By servings mentioned:
- "Serves 4" → Scale standard amounts for 4
- "Meal prep for the week" → Assume 5-8 servings
- No servings mentioned → Default to 4 servings
By protein target (if user has macro goals):
- 40-50g protein per serving → ~6-8oz cooked meat per portion
- Scale recipe protein accordingly
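The servings rules above can be sketched as a small function that returns a (low, high) range, plus a linear scaling helper. Matching only the phrases "serves N" and "meal prep" is an assumption, and both function names are hypothetical.

```python
import re

SERVES = re.compile(r"\bserves\s+(\d+)\b", re.IGNORECASE)


def infer_servings(text: str) -> tuple[int, int]:
    """Return a (low, high) servings range inferred from caption or transcript."""
    match = SERVES.search(text)
    if match:
        count = int(match.group(1))
        return (count, count)
    if "meal prep" in text.lower():
        return (5, 8)  # "Meal prep for the week" is treated as 5-8 servings
    return (4, 4)  # default when no servings are mentioned


def scale_amount(amount: float, from_servings: int, to_servings: int) -> float:
    """Scale an ingredient amount linearly between serving counts."""
    return amount * to_servings / from_servings
```

For example, 1 lb of protein in a 4-serving recipe scales to 1.5 lb for 6 servings.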
Output Format
Always present inferred amounts clearly:
### Ingredients
- 1 lb ground turkey *(estimated)*
- 1 medium onion, diced *(estimated)*
- 2 cups broth *(estimated based on typical soup)*
Mark inferred quantities with (estimated) so the user knows which amounts came from the source and which were inferred.
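A formatter for this section might look like the sketch below. The item structure (a dict with `text`, an optional `estimated` flag, and an optional `note`) is an assumption made for the example.

```python
def format_ingredients(items: list[dict]) -> str:
    """Render the ingredients section, tagging inferred amounts as estimated."""
    lines = ["### Ingredients"]
    for item in items:
        line = f"- {item['text']}"
        if item.get("estimated"):
            note = item.get("note")
            marker = f"estimated {note}" if note else "estimated"
            line += f" *({marker})*"
        lines.append(line)
    return "\n".join(lines)
```

Items without the `estimated` flag are printed unchanged, so amounts taken directly from the caption or video stay unmarked.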
Combined Extraction Flow
1. TRY CAPTION (instant)
└── yt-dlp --dump-json → parse description
└── Recipe found? → DONE ✅
└── Check for "pinned" / "in comments" / "check comments" → FLAG
2. IF FLAGGED: CHECK FOR CREATOR COMMENT
└── Look through comments for creator's username
└── If creator comment found with recipe → DONE ✅
└── If not found → continue + notify user
3. TRY AUDIO (30-60 sec)
└── Download video
└── Extract audio with ffmpeg
└── Transcribe with Whisper (base model)
└── Parse transcript for recipe
└── Infer missing measurements
└── Recipe found? → DONE ✅
4. PRESENT RESULTS + PROMPT IF NEEDED
└── Show what was extracted from audio
└── If "pinned" was flagged, tell user:
"The creator mentioned the full recipe is pinned in the comments.
I extracted what I could from the audio, but if you want the
exact measurements, paste the pinned comment here and I'll
merge it with what I found."
5. TRY FRAME ANALYSIS (if audio incomplete)
└── Extract 5-8 key frames with ffmpeg
└── Send to Claude vision
└── Ask: "Extract any recipe text, ingredients, or measurements shown"
└── Merge findings with audio transcript
6. FALLBACK (nothing found)
└── Inform user: "Recipe wasn't in caption or audio/video"
└── Offer: search for similar recipe based on video title/description
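The control flow above can be sketched as a loop over stages that stops as soon as both ingredients and instructions have been collected. This is a simplified illustration: each stage is an injected callable (hypothetical), and the pinned-comment flag and user prompts are left out.

```python
from typing import Callable, Optional

# A stage returns a partial recipe dict, or None when it finds nothing.
Stage = Callable[[], Optional[dict]]


def run_stages(stages: list[tuple[str, Stage]]) -> tuple[dict, list[str]]:
    """Run stages in order, filling in missing parts, until the recipe is complete."""
    recipe = {"ingredients": [], "instructions": []}
    tried = []
    for name, stage in stages:
        tried.append(name)
        found = stage() or {}
        for key in recipe:
            if not recipe[key]:  # earlier stages win; later ones only fill gaps
                recipe[key] = found.get(key, [])
        if recipe["ingredients"] and recipe["instructions"]:
            break
    return recipe, tried
```

If the returned recipe still has an empty part after every stage has run, the caller falls through to the final fallback step.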
Frame Analysis
Extract key frames and analyze with vision model.
Extract frames:
# Extract 1 frame every 5 seconds
ffmpeg -i /tmp/reel.mp4 -vf "fps=1/5" /tmp/frame_%02d.jpg
# Or keep every 30th frame (roughly one frame per second for 30 fps video)
ffmpeg -i /tmp/reel.mp4 -vf "select='not(mod(n,30))'" -vsync vfr /tmp/frame_%02d.jpg
Send to vision model:
Use Claude's image analysis to read each frame:
- Recipe cards / title screens
- Ingredient lists shown on screen
- Measurements in text overlays
- Step-by-step instructions displayed
Vision prompt:
Analyze this frame from a cooking video. Extract any:
- Recipe name or title
- Ingredients with quantities
- Cooking instructions
- Nutritional information / macros
- Any other recipe-related text shown
If no recipe text is visible, respond with "No recipe text found."
Merge strategy:
- Audio transcript = primary source (spoken instructions)
- Frame analysis = supplement (exact measurements, recipe cards)
- Combine both, prefer specific measurements from visual over inferred from audio
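The merge rule can be illustrated with a small function. The data shapes are assumptions: audio results map an ingredient name to an (amount, estimated) pair, and frame results map a name to the amount read on screen.

```python
def merge_ingredients(
    audio: dict[str, tuple[str, bool]],
    frames: dict[str, str],
) -> dict[str, tuple[str, bool]]:
    """Combine audio and frame findings, preferring on-screen amounts over inferred ones."""
    merged = dict(audio)
    for name, amount in frames.items():
        spoken = merged.get(name)
        # Replace only missing or estimated amounts; explicitly spoken ones stay.
        if spoken is None or spoken[1]:
            merged[name] = (amount, False)
    return merged
```

Amounts that were spoken explicitly are kept because the transcript is the primary source; the frames only fill gaps and replace estimates.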
Pinned Comment Detection
Scan caption for these phrases (case-insensitive):
- "recipe pinned"
- "pinned in comments"
- "check comments"
- "in the comments"
- "comment below"
- "recipe below"
- "full recipe in comments"
If detected, flag and notify user after extraction:
"Heads up — the creator said the recipe is pinned in the comments.
I got what I could from the audio, but yt-dlp can't access pinned comments
without login. If you want the exact recipe, copy the pinned comment and
send it to me — I'll format it properly."
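Detecting these phrases amounts to a case-insensitive substring check, as in this minimal sketch (the function name is hypothetical):

```python
# Phrases from the list above, stored in lowercase.
PINNED_PHRASES = (
    "recipe pinned",
    "pinned in comments",
    "check comments",
    "in the comments",
    "comment below",
    "recipe below",
    "full recipe in comments",
)


def mentions_pinned_recipe(caption: str) -> bool:
    """True when the caption points the viewer to the comments for the recipe."""
    text = caption.lower()
    return any(phrase in text for phrase in PINNED_PHRASES)
```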
Requirements
- yt-dlp: brew install yt-dlp
- ffmpeg: brew install ffmpeg
- whisper: pip3 install openai-whisper (runs locally, no API key)
- No Instagram login required for public reels