feat: adopt scrapegraph-py 2.0.0 resource-based client #8

Open
VinciGit00 wants to merge 2 commits into main from feat/scrapegraph-py-v2-api

@VinciGit00
Member

Summary

  • Migrates all LangChain tools to the new ScrapeGraphAI client surface introduced in scrapegraph-py 2.0.0.
  • Tools now build typed request models (ScrapeRequest, ExtractRequest, SearchRequest, CrawlRequest, MonitorCreateRequest, HistoryFilter) and hit the new client.crawl.*, client.monitor.*, and client.history.* resources.
  • Results are returned as the serialized ApiResult envelope (status / data / error / elapsed_ms).
  • Breaking: CrawlStartTool uses max_depth; MonitorCreateTool uses interval (cron or shorthand) and optional prompt+output_schema for JSON extraction; HistoryTool takes service/page/limit; minimum Python bumped to 3.12 to match the SDK.
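The envelope shape named in the third bullet can be pictured with a small stand-in. This is only a sketch: `ApiResultSketch`, `serialize`, and `unwrap` are hypothetical names, not the SDK's classes, and the `"success"` / `"error"` status strings are assumptions. Only the four field names come from the summary above.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class ApiResultSketch:
    """Hypothetical stand-in for the SDK's ApiResult envelope."""

    status: str  # assumed values: "success" or "error"
    data: Optional[dict[str, Any]] = None
    error: Optional[str] = None
    elapsed_ms: Optional[int] = None


def serialize(result: ApiResultSketch) -> dict[str, Any]:
    """Turn the envelope into a plain dict, as a tool might return it."""
    return asdict(result)


def unwrap(envelope: dict[str, Any]) -> dict[str, Any]:
    """Return the payload, or raise if the envelope carries an error."""
    if envelope.get("error") is not None:
        raise RuntimeError(envelope["error"])
    return envelope.get("data") or {}


ok = serialize(
    ApiResultSketch(status="success", data={"title": "Example"}, elapsed_ms=120)
)
failed = serialize(ApiResultSketch(status="error", error="Invalid API key"))
```

Callers that previously received a bare payload would, under this assumption, need a step like `unwrap` to reach `data` and to surface `error`.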

Test plan

  • poetry run pytest tests/unit_tests/ (47 passed)
  • Request-model serialization verified for all endpoints
  • Live smoke test against api.scrapegraphai.com/api/v2 (blocked on a valid v2 API key; client returns HTTP 403 Invalid API key with the key provided during review)
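A serialization check of the kind the second item describes might look roughly like the following. It is a sketch under stated assumptions: `MonitorCreateRequestSketch` is a hypothetical dataclass, not the SDK's `MonitorCreateRequest`; the `url` field and the choice to omit unset optional fields are assumptions. Only `interval`, `prompt`, and `output_schema` are named in this PR.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class MonitorCreateRequestSketch:
    """Hypothetical stand-in; the real MonitorCreateRequest may differ."""

    url: str  # assumed field, not confirmed by the PR text
    interval: str  # cron expression or shorthand, per the PR summary
    prompt: Optional[str] = None
    output_schema: Optional[dict[str, Any]] = None

    def to_payload(self) -> dict[str, Any]:
        """Serialize to a dict, omitting optional fields left unset."""
        return {k: v for k, v in asdict(self).items() if v is not None}


def test_monitor_request_omits_unset_optionals() -> None:
    req = MonitorCreateRequestSketch(url="https://example.com", interval="0 * * * *")
    assert req.to_payload() == {
        "url": "https://example.com",
        "interval": "0 * * * *",
    }


def test_monitor_request_keeps_extraction_fields() -> None:
    schema = {"type": "object", "properties": {"price": {"type": "number"}}}
    req = MonitorCreateRequestSketch(
        url="https://example.com",
        interval="0 * * * *",
        prompt="Extract the price",
        output_schema=schema,
    )
    payload = req.to_payload()
    assert payload["prompt"] == "Extract the price"
    assert payload["output_schema"] == schema
```

Checks like these need no network access, which is why they can pass while the live smoke test stays blocked on a valid key.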

🤖 Generated with Claude Code

VinciGit00 and others added 2 commits April 1, 2026 07:29
Port the LangChain integration to the new ScrapeGraph v2 API, mirroring
scrapegraph-py PR #82.

Breaking changes:
- SmartScraperTool -> ExtractTool (client.extract)
- SearchScraperTool -> SearchTool (client.search)
- SmartCrawlerTool -> CrawlStartTool/CrawlStatusTool/CrawlStopTool/CrawlResumeTool
- Scheduled job tools -> MonitorCreateTool/MonitorListTool/MonitorGetTool/MonitorPauseTool/MonitorResumeTool/MonitorDeleteTool
- ScrapeTool now uses url+format params instead of website_url+render_heavy_js
- GetCreditsTool now uses client.credits() instead of client.get_credits()
- MarkdownifyTool preserved but now uses client.scrape(format="markdown") internally
- Removed: AgenticScraperTool (removed in v2)
- New: HistoryTool for request history
- Version bumped to 2.0.0, scrapegraph-py dependency updated to >=1.31.0

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Tools now construct typed request models (ScrapeRequest, ExtractRequest,
SearchRequest, CrawlRequest, MonitorCreateRequest, HistoryFilter) and call
the new ScrapeGraphAI client surface (including crawl/monitor/history
resources). Results are returned as serialized ApiResult dicts. Minimum
Python bumped to 3.12 to match the SDK.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
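
For downstream code that still imports the old tool names, the renames listed in the first commit message can be collected into a lookup table. This is a sketch for illustration, not part of the package: `TOOL_RENAMES` and `replacements_for` are hypothetical helpers, and only the tool names themselves come from the commit message.

```python
# Renames taken from the first commit message. A tool with no
# one-to-one successor maps to a tuple of replacements; a removed
# tool maps to an empty tuple.
TOOL_RENAMES: dict[str, tuple[str, ...]] = {
    "SmartScraperTool": ("ExtractTool",),
    "SearchScraperTool": ("SearchTool",),
    "SmartCrawlerTool": (
        "CrawlStartTool",
        "CrawlStatusTool",
        "CrawlStopTool",
        "CrawlResumeTool",
    ),
    "AgenticScraperTool": (),  # removed in v2, no replacement listed
}


def replacements_for(old_name: str) -> tuple[str, ...]:
    """Return the v2 tool names for old_name.

    Names absent from the table (for example ScrapeTool, whose name is
    kept) are returned unchanged.
    """
    return TOOL_RENAMES.get(old_name, (old_name,))
```

For example, `replacements_for("SmartCrawlerTool")` yields the four crawl tools, while `replacements_for("AgenticScraperTool")` yields an empty tuple, signalling that callers need a different approach.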