I did a lot of research and nothing relevant worked. I am trying to scrape an RSS feed and display the data as a table on a webpage built with Python Flask. I have scraped the data into a dictionary, but the webpage does not refresh the data in real time (or every 5 seconds).
Here is the code for scraping the RSS feed using feedparser, rss_feed.py:
import feedparser
import time

def feed_data():
    RSSFeed = feedparser.parse("https://www.upwork.com/ab/feed/jobs/rss?sort=recency&paging=0%3B10&api_params=1&q=&securityToken=2c2762298fe1b719a51741dbacb7d4f5c1e42965918fbea8d2bf1185644c8ab2907f418fe6b1763d5fca3a9f0e7b34d2047f95b56d12e525bc4ba998ae63f0ff&userUid=424312217100599296&orgUid=424312217104793601")
    feed_dict = {}
    for i in range(len(RSSFeed.entries)):
        feed_list = []
        feed_list.append(RSSFeed.entries[i].title)
        feed_list.append(RSSFeed.entries[i].link)
        feed_list.append(RSSFeed.entries[i].summary)
        published = RSSFeed.entries[i].published
        feed_list.append(published[:len(published)-6])
        feed_dict[i] = feed_list
    return feed_dict

if __name__=='__main__':
    while True:
        feed_dict = feed_data()
        #print(feed_dict)
        #print("==============================")
        time.sleep(5)
Using time.sleep() works when I run this script directly. But when I import it in app.py, the data is not reloaded every 5 seconds. Here is the code to run the Flask app, app.py:
from flask import Flask, render_template
import rss_feed

feed_dict = rss_feed.feed_data()

app = Flask(__name__)

@app.route("/")
def hello():
    return render_template('home.html', feed_dict=feed_dict)
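Note that app.py above calls rss_feed.feed_data() once, at import time, so every request renders the same dictionary. As a rough illustration of the behaviour being asked for, here is a minimal sketch of a time-based cache that re-runs a fetch function once its value is older than 5 seconds. The TTLCache class and its clock parameter are hypothetical names introduced for this example, not part of Flask or feedparser, and the sketch is not thread-safe.

```python
import time
from typing import Any, Callable, Optional


class TTLCache:
    """Hypothetical helper: re-runs `fetch` when the cached value is older than `ttl` seconds."""

    def __init__(self, fetch: Callable[[], Any], ttl: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch          # e.g. rss_feed.feed_data
        self._ttl = ttl              # maximum age of the cached value, in seconds
        self._clock = clock          # injectable so the cache can be tested without sleeping
        self._value: Any = None
        self._fetched_at: Optional[float] = None

    def get(self) -> Any:
        now = self._clock()
        # Fetch on first use, or when the cached value has expired.
        if self._fetched_at is None or now - self._fetched_at >= self._ttl:
            self._value = self._fetch()
            self._fetched_at = now
        return self._value


# Assumed wiring inside app.py (illustrative only):
#   cache = TTLCache(rss_feed.feed_data, ttl=5)
#
#   @app.route("/")
#   def hello():
#       return render_template('home.html', feed_dict=cache.get())
```

With this shape the feed is fetched inside the request handler rather than at import, so a page reload (manual or via a meta refresh) would see data that is at most about 5 seconds old, without hitting the feed on every single request.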
I tried using BackgroundScheduler from APScheduler as well, but nothing seems to work. feedparser's 'etag' and 'modified' are not being recognized for some reason (are they deprecated?). I even tried using a 'refresh' meta tag, but that of course only reloads the rendered Jinja2 template and does not re-run the scraping code:
<meta http-equiv="refresh" content="5">
I am really stuck on this.
Here is a link to the (half complete) app: https://rss-feed-scraper.herokuapp.com/