[Edit: this study was subsequently repeated with NFL player data, and the same negative results held in the same way.]
It was a topic of conversation: does body mass index (BMI) or even height or weight vary among the sun signs?
Ayurveda would say that they may vary by Ascendant, not particularly the sun sign, but getting good data on the Ascendant is a notable difficulty, because the Ascendant depends on the birth time of day.
Getting tens of thousands of charts with that level of precision is nearly unheard of.
A psychology prof from France named Michel Gauquelin compiled thousands of charts in the 1950's and 1960's that included birth time and hence Ascendant information. I am a little skeptical of the quality of these charts, as many are from the 1800's and, more worryingly, many use Local Mean Time, which, if you back-engineer it, gives the birth time as 10 am or 11 am on the dot.
So, I feel I cannot use Gauquelin's data. I am always on the lookout, then, for credible birth or event data, scouring data science competition sites like Kaggle for the elusive ideal data set.
I came across SoFIFA.com in that way. It gives very good data on the various thousands of professional male soccer players associated with FIFA, including their birth date and height and weight. Birth time and place are not given.
However, if I could just "scrape" that data, that would give me a way to test the conjecture that BMI or even height or weight may vary by sun sign. So, that is what I did.
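A minimal sketch of how such a test could be set up, assuming each scraped record has a birth date, a height in centimeters, and a weight in kilograms. The function names and the sign boundary dates are my own illustrative assumptions. The boundaries are approximate, since the exact day the Sun changes sign shifts slightly from year to year.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Approximate tropical sun-sign start dates as (month, day).
# Assumption: fixed boundaries; the true ingress can differ by about a day.
SIGN_STARTS = [
    ((1, 20), "Aquarius"), ((2, 19), "Pisces"), ((3, 21), "Aries"),
    ((4, 20), "Taurus"), ((5, 21), "Gemini"), ((6, 21), "Cancer"),
    ((7, 23), "Leo"), ((8, 23), "Virgo"), ((9, 23), "Libra"),
    ((10, 23), "Scorpio"), ((11, 22), "Sagittarius"), ((12, 22), "Capricorn"),
]


def sun_sign(birth: date) -> str:
    """Return the approximate sun sign for a birth date."""
    sign = "Capricorn"  # Jan 1 to Jan 19 belongs to the sign that began Dec 22
    for start, name in SIGN_STARTS:
        if (birth.month, birth.day) >= start:
            sign = name  # keep the latest sign whose start date has passed
    return sign


def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight in kilograms over height in meters squared."""
    height_m = height_cm / 100
    return weight_kg / height_m ** 2


def mean_bmi_by_sign(players):
    """Group (birth, height_cm, weight_kg) records by sign and average the BMI."""
    groups = defaultdict(list)
    for birth, height_cm, weight_kg in players:
        groups[sun_sign(birth)].append(bmi(weight_kg, height_cm))
    return {sign: mean(values) for sign, values in groups.items()}
```

The per-sign averages from `mean_bmi_by_sign` could then be compared with a standard test for differences among group means.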
[Edit: The latest and greatest on this project, including source files, can be seen at the publication link here.]
I have posted a few times here about a rich dataset that I have.
Data were obtained from Stanford's SNAP data repository of Amazon.com reviews that gave daily misspelling rates; astronomical data were from Wolfram's Mathematica software and its astronomy resources.
Here is what the dataset looks like.
Each of the 5296 rows represents a sequential day in a 14.5 year span of Amazon review misspelling rates during Jan 1, 2000 to Jul 1, 2014.
Across the top are the labels. In each column is a simple, stable, linear function of the right ascension (i.e., the astrological Tropical degree) of the planet, moon, or star at midnight at the start of that day in London, UK. Retrogressions of the planets are also included.
The final column is the log of the difference between the day's misspelling rate and the 27-day SNIP baseline. (The Moon's right ascension completes its cycle every 27 and change days. That is the shortest cycle for any of the right ascensions.) Thus, it is the data over time minus its background noise. The following is a graph of this column's data over time.
SNIP stands for Statistics-sensitive Non-linear Iterative Peak-clipping algorithm. This method preserves any cyclic patterns, such as the planetary placements and retrogressions, while discarding "background noise" in the data, which would tend to obfuscate the patterns. The SNIP method is not subjective. It comes out of processing signals within spectra and is unprejudiced.
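To make the idea of peak clipping concrete, here is a minimal sketch of the basic clipping loop only. It is not the code used for this dataset. The function name and the `max_half_width` parameter are hypothetical, and how the 27-day span enters the actual calculation is an assumption (a half width of 13 gives a 27-day window).

```python
def snip_baseline(values, max_half_width):
    """Estimate a background baseline by iterative peak clipping.

    For each half width p, every interior point is replaced by the smaller
    of itself and the mean of its two neighbors p steps away, so narrow
    peaks are clipped while the slowly varying background remains.
    """
    baseline = list(values)
    n = len(baseline)
    for p in range(1, max_half_width + 1):
        previous = baseline[:]  # clip against the previous pass, not in place
        for i in range(p, n - p):
            neighbor_mean = (previous[i - p] + previous[i + p]) / 2
            baseline[i] = min(previous[i], neighbor_mean)
    return baseline


# Made-up illustration: a flat series with one spike in the middle.
series = [1.0] * 21
series[10] = 5.0
background = snip_baseline(series, max_half_width=13)
```

In this toy case the spike is clipped away and the baseline never rises above the data. Published versions of SNIP usually add a smoothing transform before clipping, which is left out here.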
The apparent cyclicity hidden within this data is revealed via a correlogram:
The thin bands mark the start and end of each Mercury retrograde across the 14.5 years. Mercury retrograde analysis was the original motivation for acquiring this data.
For today's study, the data for the first 80% of days were used as a training group, and the data for the subsequent 20% of days were set aside as a test group for prediction.
What was doing the training and testing? They were done entirely by an automated machine learning (AI) algorithm from BigML.com called DeepNet. DeepNet* was applied to the training set of the first 80% of days. This DeepNet was then tested or evaluated on the last 20% of days. The DeepNet is a hands-off technique offered to anyone for free.
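The 80/20 split by time can be sketched as follows. This is only an illustration of a chronological split, not BigML's own procedure, and the function name is hypothetical. The point is that the rows are cut in date order rather than shuffled, so the test days all come after the training days.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split time-ordered rows into an earlier training part and a later test part."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


# Stand-in for the 5296 daily rows, indexed in date order.
days = list(range(5296))
train, test = chronological_split(days)
print(len(train), len(test))
```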
The chart below displays ridiculously good results, given in the usual AI industry way: the error rates for DeepNet's predictions on the 20% test group (in green) are dramatically smaller than those of other standard methods of prediction (in gray), which are based on the mean (average) rate of the training data or on an approach assuming random chance. Moreover, the strong R-squared suggests good correlation of predicted misspelling rates to actual values only for the astronomical data of the DeepNet.
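For readers unfamiliar with these measures, here is a minimal sketch of a root-mean-square error, an R-squared, and the "predict the training mean" baseline. The formulas are the textbook ones and are assumed, not confirmed, to match what BigML reports. All names and numbers are illustrative.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between actual and predicted values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def r_squared(actual, predicted):
    """1 minus residual sum of squares over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mean_baseline(train_targets, n_test):
    """Baseline that always predicts the mean of the training targets."""
    m = sum(train_targets) / len(train_targets)
    return [m] * n_test


# Made-up numbers, only to show how a model is compared with the baseline.
actual = [1.0, 2.0, 3.0]
baseline_predictions = mean_baseline([2.0, 2.0], len(actual))
baseline_error = rmse(actual, baseline_predictions)
```

A model is doing useful work when its error on the test days is clearly below `baseline_error` and its R-squared is clearly above the baseline's.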
Astro-databank is the resource for researchers in astrology. It is a repository of birth information of many thousands of people and events and includes biographic data as well as the birth time, place, and date. Of high utility is the included Rodden Rating which tells us the accuracy of each chart.
An AA rating is "Data as recorded by the family or state". The expectation of many researchers is that AA data is of the highest accuracy possible and can be used freely. In the following, I show statistically that it is extremely unlikely that the AA rating charts, taken altogether, are accurate. I will be considering charts of people only, and only those born in or after 1930.
From here on, I will try to state things as explicitly as possible.
The common assumption (the null hypothesis) is that AA rating charts are all of high accuracy and hence, taken altogether, exhibit behavior of high accuracy. My assertion (the alternate hypothesis) is that AA rating charts do not exhibit behavior of accurate birth data.
One behavior of accurate birth data is that the minute of birth is evenly distributed. That is to say, a birth time of 8 minutes after the hour is not expected to happen much more or less than a birth time of 9 minutes after the hour, for example.
The following is a simulation of a uniform distribution so that you know what its plot looks like.
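A minimal sketch of such a simulation, together with one common way to measure departure from evenness, a chi-square statistic against equal expected counts for all 60 minutes. The choice of statistic and the function names are assumptions for illustration, not necessarily the test used in this study.

```python
import random
from collections import Counter


def simulate_birth_minutes(n_births, seed=0):
    """Draw the minute-of-the-hour for each birth uniformly from 0 to 59."""
    rng = random.Random(seed)
    return [rng.randrange(60) for _ in range(n_births)]


def chi_square_uniform(minutes):
    """Chi-square statistic comparing minute counts with a uniform expectation."""
    counts = Counter(minutes)
    expected = len(minutes) / 60
    return sum((counts.get(m, 0) - expected) ** 2 / expected for m in range(60))


simulated = simulate_birth_minutes(6000)
even_statistic = chi_square_uniform(simulated)

# Extreme made-up case: every birth recorded exactly on the hour.
rounded_statistic = chi_square_uniform([0] * 600)
```

For evenly distributed minutes the statistic stays near its 59 degrees of freedom. Birth times rounded to the hour or half hour push it far higher.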
When it comes to modeling real-life phenomena for astrological research, earthquakes are one of the most widely studied.
And why not? After all, the exact time, place, and day are known as well as the strength of the effect (in magnitude).
However, a mapping of earthquake strength to solar system events has, not to be dramatic, proven elusive until now.
Inspired by this Kaggle post, I decided to try my hand at this perhaps age-old problem, and I found that yes, earthquake magnitude correlates with the moon phase at the time of the event. (Moon phase has been looked at quite often but not with the model I will present today.)
First off, I went to https://earthquake.usgs.gov for the earthquake data. (Thanks to Joe Ritrovato for the link.)
I wanted to look particularly at all earthquakes of any depth between Jan 1, 1975 and Jan 1, 2005. Those years were chosen because a uniform seismograph was finally used throughout the world by the mid-1970's, and hydraulic fracturing with its associated quakes was not yet in widespread practice. The search was further restricted to earthquakes of magnitude greater than 5.5, following this system of what counts as a serious earthquake. (Some lower limit to the magnitudes was necessitated by the search limit on the USGS site.)
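The same criteria could also be written as a query to the USGS event web service instead of the search form. This is a sketch under that assumption; I used the form, not this URL. The helper name is hypothetical. Note that `minmagnitude` sets a lower bound, so whether events of exactly 5.5 are included should be checked against the service documentation.

```python
from urllib.parse import urlencode

# USGS FDSN event service endpoint.
BASE_URL = "https://earthquake.usgs.gov/fdsnws/event/1/query"


def build_query_url(start, end, min_magnitude):
    """Build a CSV query URL for earthquakes in a date range above a magnitude."""
    params = {
        "format": "csv",
        "starttime": start,
        "endtime": end,
        "minmagnitude": min_magnitude,
        "eventtype": "earthquake",  # excludes non-earthquake events such as blasts
    }
    return BASE_URL + "?" + urlencode(params)


url = build_query_url("1975-01-01", "2005-01-01", 5.5)
print(url)
```

The URL is only built here, not fetched. The service caps how many events a single query returns, which is one more reason for a lower limit on magnitude.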
Here is what my search looked like (be sure to also choose earthquakes only below the fold):
And here is what you will see if you press enter:
Many of us who read Cymatics, that gem of a book from the late 1960's, thrilled to the idea of vibration made visible.
With a few lines of code I have decided to plot the equations which go to the heart of, and could be said to generate, these beautiful forms.
More motivated me than just the chance to look directly at and witness the imagery. I have seen some claims that the Shri Chakra could be seen from these "tonoscopes".
I decided to test a recent theoretical development of how football game winners can be seen in the day, place, and starting time of the event.
Accordingly, my assistant drew up a table of all Super Bowls so far and their event information.
I then drew up each chart and made a prediction. These were then checked against the real winners.
I should preface by saying that I am a football agnostic. I do not know much about football, only watching socially for a few minutes here and there and receiving the good-natured teasing of friends for knowing so little.
I want to admit that I have seen some games at some times, enough to develop the model, yet I feel I can judge these past charts fairly, truly without any a priori knowledge of the winner.
Thirty-five out of forty-five Super Bowls were evaluated correctly. (Five early Super Bowls did not have recorded start times available.)
Final 2-sided p-value is 0.0002.
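For anyone who wants to check a figure like this, here is a minimal sketch assuming the p-value comes from an exact binomial test of 35 correct calls out of 45 against a 50% chance of guessing each winner. The choice of test is an assumption, since it is not named above, and the function name is illustrative.

```python
from math import comb


def two_sided_binomial_p(successes, trials):
    """Exact two-sided p-value against a fair coin (chance of success 0.5).

    With a fair coin the distribution is symmetric, so the two-sided value
    is twice the probability of a result at least this far from the middle.
    """
    k = max(successes, trials - successes)
    upper_tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * upper_tail)


p_value = two_sided_binomial_p(35, 45)
print(round(p_value, 4))  # 0.0002
```

Under that assumption the result rounds to the 0.0002 given above.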
To do this project even more fairly, I would recommend the following:
Astrological research is presently a tough row to hoe.
There is no money in it, and my clients actually don’t like it.
They ask me why I do it. These are intelligent well-educated people, but they tell me they come to astrology because they are sick of science. ("One year coffee is good for you, one year it is bad for you…")
The work of astrological science is lonely, scary, and frustrating and really tough on me physically.
I do it instead out of love, out of passion, out of honestly wanting to know the answer.
That is why I say it is a hobby for me. I think that is a good thing.
I went to college full-time at fifteen, actually being able to emancipate based on my scholarship stipend. Some of my friends were getting PhDs at that age from schools like Harvard and Princeton.
One thing united us in order to work so hard and give up so much, so young. We all shared an immense personal love for science, actually being in love with Mother Nature Herself.
Then, 5 or 10 years later, we all made it into the profession, the industry, of science, and we almost all dropped out.
Research into fundamental astrology returns my gaze back to Mother Nature, and it is that worship which is really why I do it, and yet I certainly do keep all the numbers real. You must, to get really close to Her.
So, even though I do a particular methodological approach which is very big data and AI oriented, using pretty advanced mathematics, I do it for intensely personal reasons, and actually as an artistic expression, I feel.
The methodologies that I would like to see at large in the future of astrology research would be ones that would speak to our fact-based culture as a whole, and maybe along the way, some money could be sent toward astrological researchers, because a need of society at large is met, some pressing practical societal need is answered.
Richard Feynman once said: “People who wish to analyse nature without using mathematics must settle for a reduced understanding.”
So, I think we have to increase our mathematics chops, even if we are doing hermeneutics, perhaps learning from all the good work happening in the digital humanities in the past decade.
We also have an opportunity to go beyond what even regular science provides society, and that is to do our work with love, for love.
It is not just an opportunity, it is a necessity, for astrology is and we are psychology and medicine and football games and politics and money and families, everything altogether, and that is love. I know it is.
Remarks prepared for The Kepler Conference, 2017.
Click on image to see the presentation from The Kepler Conference for Astrological Research, Jan 2017.
[Edit: a crazy further reduction in RMSE was achieved by finally using a neural net. The quick write-up of that can be seen here.
A full journal article was just approved for publication (March 2018). It will be referenced in the bibliography.]
To hear the audio of the misspellings, download the original file below and mouse over on the second red graph.
Renay Oshop - teacher, searcher, researcher, immerser, rejoicer, enjoying the interstices between Twitter, Facebook, and journals.