this post was submitted on 03 Jun 2024
1468 points (97.9% liked)

[email protected] 3 points 5 months ago* (last edited 5 months ago)

Python web scraping works just fine. With LLMs, you have two options: either extract the HTML and have the LLM read over it, or have a vision AI OCR the page and make its own decision about what to extract.
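
For the first option, something like this rough sketch is what I mean. It only uses the standard library to boil the HTML down to visible text before handing it to a model. The names `html_to_text`, `extract_with_llm`, and `ask_llm` are made up for illustration, and `ask_llm` is a placeholder for whatever LLM client you actually use, not a real API.

```python
from html.parser import HTMLParser
from typing import Callable


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__()
        self.chunks: list[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def html_to_text(html: str) -> str:
    """Reduce an HTML document to its visible text, one chunk per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "\n".join(parser.chunks)


def extract_with_llm(html: str, question: str, ask_llm: Callable[[str], str]) -> str:
    """Build a prompt from the page text and pass it to a caller-supplied LLM.

    `ask_llm` is a hypothetical stand-in: any function that takes a prompt
    string and returns the model's reply.
    """
    prompt = f"{question}\n\nPage text:\n{html_to_text(html)}"
    return ask_llm(prompt)
```

You would get the HTML with `urllib.request.urlopen(url).read().decode()` or the `requests` library, then call `extract_with_llm(html, "What is the price?", ask_llm)`. Stripping scripts and styles first mostly just keeps the prompt small. The vision/OCR route would swap the HTML step for a screenshot sent to a multimodal model, which isn't shown here.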