Early narrative explaining results as of May 13, 2022 (presentation for OPA Research):
The slide set:
Whiteboard explainer video:
Download the PowerPoint slide set, which includes the video:
Below is the automated daily data cultivation, with automated daily aggregate analysis, in Wolfram Mathematica:
To download or use a copy of the code immediately above, click the three lines in the bottom left and choose "Make your own copy" or "Download". A free, time-limited version of Mathematica can be used with the download; alternatively, making your own copy opens an online version of the software that you can use for free.
Finally, use the Python code below to cultivate the daily counts of asteroid-namesake articles:
Python code for cultivation of daily data
import requests
from bs4 import BeautifulSoup
import pandas as pd
from datetime import date
from datetime import timedelta

# Dataset containing 1200 names
df = pd.read_csv('names to search Fiverr A.csv')
# Creating a list of all 1200 names
all_names = df['NAME'].values

# Yesterday's date, in yyyy-mm-dd format
input_date0 = date.today() - timedelta(days=1)
input_date = input_date0.isoformat()

# Generating the search URL for a given name
def get_url(name):
    url_template = 'https://news.google.com/search?q={}'
    return url_template.format(name)

# Extracting the article date; returns (date, name) only when the
# article was published on the target date, otherwise None
def get_news(article, name):
    title_date = article.div.div.time.get('datetime').split('T')[0]
    if title_date == input_date:
        return (title_date, name)
    return None

main_list = []

def Main_task():
    for news_name in all_names:
        records = []
        url = get_url(news_name)
        response = requests.get(url)
        soup = BeautifulSoup(response.text, 'html.parser')
        # 'redacted' stands in for the class name the author removed
        articles = soup.find_all('article', 'redacted')
        for article in articles:
            try:
                all_data = get_news(article, news_name)
            except AttributeError:
                continue
            # Only count articles that matched the target date
            if all_data is not None:
                records.append(all_data)
        main_list.append((news_name, len(records)))

Main_task()
mynamedata = pd.DataFrame(main_list, columns=['NAMES', input_date])
mynamedata.to_csv(input_date + '.csv')
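The scraper above writes one CSV of name counts per day. As a minimal sketch of how those daily files could then be combined for aggregate analysis (the author does this step in Mathematica, not Python; the function name and file pattern here are assumptions for illustration):

```python
import glob
import pandas as pd

def aggregate_daily_counts(pattern='20??-??-??.csv'):
    """Combine the daily YYYY-MM-DD.csv files into one table:
    one row per name, one column of counts per date."""
    frames = []
    for path in sorted(glob.glob(pattern)):
        # index_col=0 drops the unnamed row-index column written by to_csv
        daily = pd.read_csv(path, index_col=0)
        frames.append(daily.set_index('NAMES'))
    # Align on the name index; each file contributes one date column
    return pd.concat(frames, axis=1)
```

Each daily file has columns `NAMES` and the date itself, so joining on the name column yields a names-by-dates matrix of article counts ready for plotting or statistics.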
Author: Renay Oshop - teacher, searcher, researcher, immerser, rejoicer, enjoying the interstices between Twitter, Facebook, and journals.