Scraping Flow
Creating a Scraping Flow with NotexAI
This guide demonstrates how to use NotexAI to scrape an HBS news article and summarize it using OpenAI.
1. Use the Scrape Endpoint
Send a request to scrape a webpage:
curl --location 'https://api.notexai.pro/env/scrape/' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer your-api-key' \
--data '{
"url": "https://www.hbs.edu/news/articles/Pages/awa-ambra-seck-profile-2024.aspx",
"scrape_images": false,
"screenshot": false
}'Response:
{
"session_id": "31f613a4-068e-464d-88b1-b8eb5d4d5c6f",
"error": null,
"title": "New Faculty Profiles: Awa Ambra Seck - News - Harvard Business School",
"url": "https://www.hbs.edu/news/articles/Pages/awa-ambra-seck-profile-2024.aspx",
"timestamp": "2025-01-08T14:23:32.969858",
"screenshot": null,
"data": {
"markdown": "...",
"images": null,
"structured": null
},
"space": null
}2. Extracted Data
The response includes structured data extracted from the page, such as the article title, metadata, and content. Here is an example:
3. Summarize with OpenAI
Use OpenAI’s API to summarize the extracted content:
Summary Output:
That’s it! You’ve successfully created a scraping flow and summarized the content using NotexAI and OpenAI. 🌌
Last updated