Function fetch_news_content 

pub async fn fetch_news_content(
    url: &str,
) -> Result<String, Box<dyn Error + Send + Sync>>

What: Fetch the full article content from an Arch news URL.

Inputs:

  • url: The news article URL (e.g., https://archlinux.org/news/...)

Output:

  • Ok(String) with the article text content; Err on network/parse failure.

Errors:

  • Network fetch failures
  • HTML parsing failures
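
Both failure classes surface through the same boxed error type, Box<dyn Error + Send + Sync>. The sketch below illustrates what that type allows: any error implementing Error + Send + Sync can be boxed into it, the result can cross a thread boundary, and a caller can still recover the concrete type by downcasting. ParseFailure, extract_body, and extract_on_worker are hypothetical names invented for this illustration; they are not part of the documented crate, and the real function may use different error types.

```rust
use std::error::Error;
use std::fmt;
use std::thread;

/// Same error type as in the signature of `fetch_news_content`.
type BoxError = Box<dyn Error + Send + Sync>;

/// Hypothetical error standing in for the "HTML parsing failure" case.
#[derive(Debug)]
struct ParseFailure(String);

impl fmt::Display for ParseFailure {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "HTML parsing failed: {}", self.0)
    }
}

impl Error for ParseFailure {}

/// Hypothetical parsing step: rejects an empty document, otherwise
/// returns the input unchanged.
fn extract_body(html: &str) -> Result<String, BoxError> {
    if html.is_empty() {
        // `.into()` boxes any type that is `Error + Send + Sync + 'static`.
        return Err(ParseFailure("empty document".into()).into());
    }
    Ok(html.to_string())
}

/// Runs the step on a worker thread. This compiles only because
/// `BoxError` is `Send`, which is what the `Send + Sync` bounds buy.
fn extract_on_worker(html: String) -> Result<String, BoxError> {
    thread::spawn(move || extract_body(&html))
        .join()
        .map_err(|_| BoxError::from("worker thread panicked"))?
}
```

A caller that needs to tell the two failure classes apart could try err.downcast_ref::<ParseFailure>() on the returned error, assuming the concrete error types are known.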

Details:

  • For AUR package URLs, fetches and renders AUR comments instead.
  • For Arch news URLs, checks cache first (15-minute in-memory, 14-day disk TTL).
  • Applies rate limiting for archlinux.org URLs to prevent aggressive fetching.
  • Fetches the HTML page and extracts content from the article body.
  • Strips HTML tags and normalizes whitespace.
  • Caches successful fetches in both in-memory and disk caches.
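
The cache-first flow and the tag stripping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the crate's implementation: MemoryCache, strip_tags, and cached_content are hypothetical names, the network request is replaced by a caller-supplied closure, only the in-memory tier is modelled (no disk cache, no rate limiting, no AUR branch), and the tag stripper is a naive character scan rather than a real HTML parser.

```rust
use std::collections::HashMap;
use std::error::Error;
use std::time::{Duration, Instant};

type BoxError = Box<dyn Error + Send + Sync>;

/// Hypothetical in-memory cache whose entries expire after `ttl`.
struct MemoryCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>,
}

impl MemoryCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Returns the cached text only if it is younger than the TTL.
    fn get(&self, url: &str) -> Option<&str> {
        self.entries
            .get(url)
            .filter(|(stored_at, _)| stored_at.elapsed() < self.ttl)
            .map(|(_, text)| text.as_str())
    }

    fn insert(&mut self, url: &str, text: String) {
        self.entries.insert(url.to_string(), (Instant::now(), text));
    }
}

/// Naive tag stripper: drops everything between '<' and '>', then
/// collapses runs of whitespace into single spaces.
fn strip_tags(html: &str) -> String {
    let mut text = String::with_capacity(html.len());
    let mut in_tag = false;
    for ch in html.chars() {
        match ch {
            '<' => in_tag = true,
            '>' if in_tag => {
                in_tag = false;
                // Keep words on either side of a tag separated.
                text.push(' ');
            }
            _ if !in_tag => text.push(ch),
            _ => {}
        }
    }
    text.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// Cache-first lookup. `fetch` stands in for the real network request;
/// only successful fetches are stored.
fn cached_content<F>(cache: &mut MemoryCache, url: &str, fetch: F) -> Result<String, BoxError>
where
    F: FnOnce(&str) -> Result<String, BoxError>,
{
    if let Some(hit) = cache.get(url) {
        return Ok(hit.to_string());
    }
    let text = strip_tags(&fetch(url)?);
    cache.insert(url, text.clone());
    Ok(text)
}
```

With a 15-minute TTL (Duration::from_secs(15 * 60)), a second call for the same URL inside that window returns the stored text without invoking the fetch closure. A failed fetch returns early through the ? operator and leaves the cache untouched.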