Ayurvedic Astrology
  • Home
  • Astrology Readings
    • Bookings
    • Vintage Reviews
    • World Renown
  • Resources
    • Media >
      • Articles
    • Calculators >
      • Astrology Centric >
        • Any Asteroid
        • Calendar Convert
        • Find Location DMS
        • Katapayadi Calculator
        • Real-time Gold & Silver Prices
        • Simple Star Calculator
        • Varsha Phala and Tithi Pravesh Location Finder
        • Earthquake Predictor
      • AI Centric >
        • Ask Charak
        • Handwriting Analysis
        • History Predictor
        • Personality Trait Predictor
        • TruthBot
    • Research >
      • Bibliography
      • Coding
      • NASA Research Tool
      • Degrees Spreadsheet
    • Privacy Policy
  • Store
    • Astrology
    • Jewelry
    • Lifestyle
  • About
    • Location
    • Contact
    • Newsletter

Ridiculously Good Results Again for the Amazon Misspelling Rates Data Using BigML's DeepNet

2/19/2018

12 Comments

 
[Edit: The latest and greatest on this project, including source files, can be seen at the publication link here.]
I have posted a few times here about a rich dataset that I have.

Data were obtained from Stanford's SNAP data repository of Amazon.com reviews that gave daily misspelling rates; astronomical data were from Wolfram's Mathematica software and its astronomy resources.

Here is what the dataset looks like.
Picture
Each of the 5296 rows represents one sequential day in the 14.5-year span of Amazon review misspelling rates from Jan 1, 2000 to Jul 1, 2014.
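As a quick sanity check on the row count, the date arithmetic can be reproduced in a few lines of Python. This is only a sketch: it assumes that both the first and the last day are counted as rows.

```python
from datetime import date

start = date(2000, 1, 1)
end = date(2014, 7, 1)

# Inclusive count of calendar days (assumes both endpoints are rows).
n_days = (end - start).days + 1
span_years = n_days / 365.25

print(n_days)                # 5296
print(round(span_years, 1))  # 14.5
```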

Across the top are the labels. In each column is a simple, stable, linear function of the right ascension (i.e., the astrological Tropical degree) of the planet, moon, or star at midnight at the start of that day in London, UK. Retrogressions of the planets are also included.

The final column is the log of the difference between the day's misspelling rate and the 27-day SNIP baseline. (The Moon's right ascension completes its cycle every 27 and change days, which is the shortest cycle of any of the right ascensions.) Thus, it is the data over time minus its background noise. The following is a graph of this column's data over time.
Picture
SNIP stands for Statistics-sensitive Nonlinear Iterative Peak-clipping algorithm. This method preserves any cyclic patterns, such as the planetary placements and retrogressions, while discarding "background noise" in the data, which would tend to obscure the patterns. The SNIP method is not subjective: it comes out of signal processing, where it is used to estimate the background within spectra, and it is unprejudiced.
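For readers who want to see the idea in code, here is a minimal sketch of the peak-clipping step at the core of SNIP. It is not the exact routine used for this dataset. The function name `snip_baseline`, the `max_half_width` parameter, the mapping of "27-day" onto that parameter, and the synthetic series are all assumptions for illustration. Published descriptions of SNIP often apply a variance-stabilizing transform first, which is left out here.

```python
import numpy as np

def snip_baseline(y, max_half_width):
    """Estimate a slowly varying baseline by iterative peak clipping."""
    baseline = np.asarray(y, dtype=float).copy()
    n = baseline.size
    for p in range(1, max_half_width + 1):
        if 2 * p >= n:
            break
        # Average of the two neighbours p samples away on either side.
        neighbour_mean = 0.5 * (baseline[:-2 * p] + baseline[2 * p:])
        # Clip each interior point down to that average where it rises above it.
        baseline[p:n - p] = np.minimum(baseline[p:n - p], neighbour_mean)
    return baseline

# Synthetic stand-in for a daily rate series (not the Amazon data).
days = np.arange(400)
rate = 0.05 + 0.01 * np.abs(np.sin(2 * np.pi * days / 27.3))

baseline = snip_baseline(rate, max_half_width=27)  # 27 is an assumed setting
excess = rate - baseline  # never negative, because clipping only lowers values
```

Because the excess can be exactly zero on some days, taking its log needs some care. How that was handled for the real data is not shown here.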

The apparent cyclicity hidden within this data is revealed via a correlogram:
Picture
The thin bands represent the start and end of Mercury retrograde across the 14.5 years. Mercury retrograde analysis was the original motivation for acquiring this data.
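A correlogram is a plot of a series' autocorrelation against lag. As a rough sketch of the computation (the helper `autocorrelation` and the pure 27-day sine wave are made up for illustration and are not the real data), it might look like this:

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of x for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([
        np.dot(x[:x.size - k], x[k:]) / denom
        for k in range(max_lag + 1)
    ])

# Synthetic series with a 27-day cycle (an assumption, not the Amazon data).
t = np.arange(540)
series = np.sin(2 * np.pi * t / 27)

acf = autocorrelation(series, max_lag=60)
# A cyclic series shows peaks in acf near multiples of its period
# and troughs near the half-periods.
```

Plotting `acf` against lag gives the correlogram. Peaks that recur at regular lags are what suggest cyclicity.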

For today's study, the data for the first 80% of days were used as a training group, and the data for the subsequent 20% of days were set aside as a test group for prediction.
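A split by time order like this one can be sketched as follows. The helper name `chronological_split` is hypothetical, and the exact rounding that BigML applies at the boundary is an assumption.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split rows in their existing order: earliest part trains, the rest tests."""
    cut = int(len(rows) * train_fraction)  # rounding rule is an assumption
    return rows[:cut], rows[cut:]

# Stand-in for the 5296 daily rows, already sorted by date.
days = list(range(5296))
train, test = chronological_split(days)

print(len(train), len(test))  # 4236 1060
```

Unlike a shuffled split, no test day precedes any training day, so the model is always evaluated on dates later than anything it saw.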

What did the training and testing? Both were done entirely by an automated machine learning (AI) algorithm from BigML.com called DeepNet. DeepNet* was trained on the first 80% of days and then evaluated on the last 20% of days. DeepNet is a hands-off technique offered to anyone for free.

The chart below displays ridiculously good results in the usual AI industry way: the error rates for DeepNet's predictions on the 20% test group (in green) are dramatically smaller than those of the standard baseline predictions (in gray), which are based either on the mean (average) rate of the training data or on random chance. Moreover, the strong R-squared suggests good correlation of predicted misspelling rates to actual values only for the DeepNet trained on the astronomical data.
Picture
Let me summarize my take: future misspelling rates in Amazon reviews were successfully predicted using only basic astronomy data, with far less error than random values or the repeatedly applied average (mean) value. Moreover, the R-squared shows a fine correlation between the model's predicted values and the actual values.
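To make the comparison concrete, here is a small sketch of how error metrics and R-squared can be computed for a model against a mean baseline. All the numbers are made up for illustration and are not results from this study, and the function names are hypothetical. The random-chance baseline is left out because its exact construction is not described here.

```python
import numpy as np

def mae(actual, predicted):
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(predicted))))

def rmse(actual, predicted):
    return float(np.sqrt(np.mean((np.asarray(actual) - np.asarray(predicted)) ** 2)))

def r_squared(actual, predicted):
    actual = np.asarray(actual, dtype=float)
    ss_res = np.sum((actual - np.asarray(predicted)) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Made-up numbers for illustration only.
train_target = np.array([0.10, 0.30, 0.20, 0.40, 0.25])
test_actual = np.array([0.15, 0.35, 0.20, 0.30])
model_pred = np.array([0.18, 0.30, 0.22, 0.31])  # hypothetical model output

# Mean baseline: predict the training mean for every test day.
mean_pred = np.full_like(test_actual, train_target.mean())

for name, pred in [("model", model_pred), ("mean baseline", mean_pred)]:
    print(name, mae(test_actual, pred), rmse(test_actual, pred),
          r_squared(test_actual, pred))
```

A baseline that always predicts a constant cannot track day-to-day variation, so its R-squared sits at or below zero on the test set. That is why a clearly positive R-squared for the model is the interesting part.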
Here are the DeepNet fields in order of importance.
Picture
Picture
I am not even sure what to do next, but in case you do, here is the spreadsheet.
fullamazonmisspellingdata.csv (csv, 783 kb)

Please let me know what is up, and please reference this post if you use the data set.


*  For an instructable on exactly how I did this, see here. Note that a linear split was used.
12 Comments
Robin
2/21/2018 11:10:27 pm

You've done it again! Excellent. I greatly admire your work!

Renay
2/23/2018 10:29:00 am

Thanks Robin. That means a lot!

Bobilon
2/22/2018 07:47:41 pm

No need to post this. The farthest I went mathematically was econometrics 30 years ago, and I wasn't particularly good at it then, so what follows is likely misinformed nonsense. But you did cattle-call for ideas, so here's my feeble attempt at this sort of thinking: I'd run the trial you ran in different ways, both on your initial data and in the same and different ways on like data sets. By doing so, you can confirm your initial findings while figuring out the best numerical routine and sampling methods to feed DeepNet a quasi-optimal data set for generating forecasts. Performing and saving those multiple trials, noting the ways the error function varies across them, will likely give you a quasi-organic feel for which methods and parts of the maths toolbox best map between the astrological and terrestrial elements of your data set. One lucky optimization run plus AI is not a sufficiently plausible basis to consider what this means, which I likely will never comprehend beyond believing the butterfly's wings is true, which I do. My limitations aside, if what you believe you've discovered tests as statistically significant when attacked with neurotic, mathematically informed trial-and-error rigor, you'll have proven astrology is as valid a place to seek the whys of human behavior as anyplace else. That would be some Nobel-prize-level Harvard(?) girl and/or the most radical extension of Veblen's Theory of the Leisure Class in human history. Fine work. Best, B

Renay
2/22/2018 08:50:44 pm

Thanks B. I appreciate your intelligent comment.

It's true, I have a lot more work to do. I'm on it. A quick note, I did run a linear regression (RMSE 0.1272 vs 0.1420 baseline) and a decision tree (RMSE 0.1189 vs 0.1485 baseline), both also non-optimized for loss or anything else, as shown in a presentation from last year at https://www.dropbox.com/s/cn33tibz0gznx92/Kepler%202017%20Mercury%20Retrograde%20%20Presentation1.pptx?dl=0.

You are totally right about the optimizations.

Renay
3/1/2018 07:59:53 am

"To consider what this means": I take it you mean establishing astrology in general?

Regarding this particular dataset: "one lucky optimization".... but wouldn't other optimizations, as the name suggests, just improve the situation? That is, things can only get better?

louise
2/22/2018 08:36:09 pm

Oh how silly smart!

Robin
2/22/2018 08:47:15 pm

I am glad Bobilon was able to comment. It is not easy, as sometimes it's not clear where to put comments! But if one tries placing the cursor here and there experimentally, it can be done.

Thanks for all your research and hard work, Renay!

Renay
2/22/2018 08:55:10 pm

Thanks Robin! You are right. The place in particular to put one's cursor to answer the CAPTCHA is right below the sentence after the picture that asks you for your answers. I have contacted host server tech support to no avail.

Robin
2/22/2018 08:52:09 pm

Sorry! Actually, the problem is not the commenting itself, but proving that one is not a robot. It's not readily apparent where to type in to copy the words from the images. Experimentation and persistence tend to win out!

Adrian Fourie
3/23/2018 01:57:15 pm

Try using probability calculators. I knew there was a bias from Sun-Jupiter aspects on the Dow but didn't know what the odds were of reproducing it experimentally. The bias is only a couple of percent, but the odds of reproducing it over 600 samples is about 1%. (I used calculator.tutorvista.com for http://luckydays.tv/djAnalysis.txt)

Renay
3/24/2018 12:28:40 am

Hi Adrian! I like your Dow Jones analysis. Maybe a good way of reproducing your probability would be through a Monte Carlo simulation? You would need a little bit of programming, but not much! If you want me to help you, I will!

For the above study, the cool thing is that it is not stochastic, i.e., it is not based on probability. The results are simply differences. Machine learning really is radical stuff.

Renay
3/24/2018 12:38:32 am

Oh yeah, I should add that strictly speaking: the gold standard is not just the probability of your particular result but the SUM of getting your result plus all the more extreme (>48 out of 83 in your case) results. This is going to put you at closer to 50% likelihood, I believe. So, you may need a different approach.
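
As a rough sketch of that sum, the upper tail of a binomial distribution can be added up directly. The function name `binomial_upper_tail` is made up, the null probability `p = 0.5` is only a placeholder assumption, and whether 48 itself belongs in the tail depends on how the result is defined.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the result plus everything more extreme."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Placeholder null hypothesis: each sample is a fair coin flip (an assumption).
tail = binomial_upper_tail(48, 83, 0.5)
print(tail)
```

The answer depends heavily on the null probability chosen, so that choice deserves as much care as the arithmetic.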


    ARTICLES

    Author

    Renay Oshop - teacher, searcher, researcher, immerser, rejoicer, enjoying the interstices between Twitter, Facebook, and journals.

© 2008–2022  Renay Oshop  AyurAstro®