Debunking YouTube's archaeoastronomy skeptics

                                    Image from Scijinks.gov

Some YouTube commentators, like Milo Rossi of MiniMinuteman and Stefan Milo, have made ignorant comments about the astronomical interpretation of Gobekli Tepe and Pillar 43. Below, I make the scientific case for this interpretation.


Q: What is science?

A: Science attempts to create consistent models of reality.


Q: What is a good model?

A: One that has a lot of explaining power relative to its inputs. This is also known as Occam's razor, parsimony, and model efficiency.


Q: How do we measure the 'explaining power'?

A: There are many ways, but the best is to use statistical methods, including hypothesis testing.


Examples of statistics in science

  • Suppose 100 pills are given to a random sample of people, and 50 of the pills are placebos. Suppose everyone who took the real pill is dead by the next day, but no one who took the placebo is. Conclusion: the pill is deadly. The hypothesis “The pill is deadly” is made a posteriori and is possible. The conclusion is based entirely on the statistics. No prior knowledge or evidence is needed, other than that this is the only experiment performed.
  • Suppose you enter a room and find 100 people standing on caricatures painted on the floor that look just like them. Is this situation deliberately arranged? Again, the hypothesis is made a posteriori and is possible. Again, the statistics tell us the situation is almost certainly arranged, as the probability it occurred by pure chance is tiny. Again, no prior knowledge or evidence is needed, other than that this is the only room examined.

What if only 20 of the people who took the pill died, and not all 50? And what if only 50 of the people were standing on their own caricatures? What then? In general, the stronger the correlation, the less likely it occurred by pure chance and the more likely it is telling us something interesting. This is a general rule in science. It is just as valid for Pillar 43 as anything else.
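To put rough numbers on these two examples, here is a minimal Python sketch. It assumes the simplest possible null hypotheses, which are my own framing for illustration: deaths fall on a random selection of the 100 participants, and people are assigned to caricatures uniformly at random. The function names are hypothetical.

```python
from math import comb, factorial


def prob_deaths_all_in_pill_group(n_pill, n_placebo, deaths):
    """Chance that all `deaths` fall in the pill group if the pill does nothing.

    Under this null hypothesis the people who die are a random draw from
    everyone in the trial, so this is a hypergeometric probability.
    """
    return comb(n_pill, deaths) / comb(n_pill + n_placebo, deaths)


def derangements(n):
    """Number of permutations of n items that leave no item in its own place."""
    if n == 0:
        return 1
    d_prev, d = 1, 0  # D(0) = 1, D(1) = 0
    for k in range(2, n + 1):
        # Standard recurrence: D(k) = (k - 1) * (D(k - 1) + D(k - 2))
        d_prev, d = d, (k - 1) * (d + d_prev)
    return d


def prob_at_least_k_matches(n, k):
    """Chance that at least k of n people stand on their own caricature
    when the n people are placed on the n caricatures at random."""
    favourable = sum(comb(n, m) * derangements(n - m) for m in range(k, n + 1))
    return favourable / factorial(n)


if __name__ == "__main__":
    print("all 50 pill-takers die:", prob_deaths_all_in_pill_group(50, 50, 50))
    print("only 20 pill-takers die:", prob_deaths_all_in_pill_group(50, 50, 20))
    print("all 100 on own caricature:", prob_at_least_k_matches(100, 100))
    print("at least 50 on own caricature:", prob_at_least_k_matches(100, 50))
```

In both examples the weaker version of the correlation is still very unlikely to arise by chance, just less overwhelmingly so than the perfect version.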


Permutation type tests

Suppose you enter a room and find 100 people standing on caricatures painted on the floor that look somewhat like them, although most are clearly not strong likenesses. Is this situation deliberately arranged? There seems to be more uncertainty here. Again, the hypothesis is made a posteriori and is possible. We can make a decision about what most likely happened by ranking the people vs the caricatures (lower rank is better, e.g. 1st, 2nd, 3rd etc.).

For every permutation of the people, create a score by summing the ranks of that permutation. Count the number of permutations, x, that have a score ≤ the one actually observed. If the total number of possible permutations is y, then a good estimate for the probability the situation occurred by chance is pch = x/y. The probability it is arranged is then parr = 1 – pch = 1 – x/y.
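The procedure just described can be sketched in a few lines of Python. This is only an illustration of the general idea, not the code used in any of the papers mentioned here; the function name and the small 3 x 3 rank table are made up for the example.

```python
from itertools import permutations


def permutation_test(ranks, observed=None):
    """Exhaustive permutation test on a square table of ranks.

    ranks[i][j] is the rank of caricature j as a likeness of person i
    (1 = best likeness). observed[i] is the caricature person i actually
    stands on; by default person i stands on caricature i.
    Returns (pch, parr), where pch = x / y and parr = 1 - pch.
    """
    n = len(ranks)
    if observed is None:
        observed = tuple(range(n))
    observed_score = sum(ranks[i][observed[i]] for i in range(n))

    x = 0  # permutations that score at least as well as the observed one
    y = 0  # all possible permutations
    for perm in permutations(range(n)):
        y += 1
        if sum(ranks[i][perm[i]] for i in range(n)) <= observed_score:
            x += 1

    pch = x / y
    return pch, 1 - pch


# Hypothetical example: three people, three caricatures.
example_ranks = [
    [1, 2, 3],
    [2, 1, 3],
    [3, 2, 1],
]
pch, parr = permutation_test(example_ranks)
print(f"pch = {pch:.3f}, parr = {parr:.3f}")
```

Checking every permutation is only practical for a handful of items, since the number of permutations grows as n!. For 100 people one would have to estimate x/y some other way, for example by sampling random permutations.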

Sweatman and Tsikritsis (2017) used this kind of permutation method to test Pillar 43. Essentially, they measured the correlation between the constellations as drawn in Stellarium and the animal symbols on Pillar 43, and found it to be extremely strong.


The correlation of Pillar 43 with Stellarium

The Figure above gives an impression of how strong this correlation is, but some might dispute this. Nevertheless, we can use the permutation (or combinatorial) method to test it. Let's consider each animal symbol on the pillar and compare them to the Greek constellations visible to people at Gobekli Tepe (of which there are ~40). Note that in order for this test to be valid we must use a pre-defined constellation set, like those in Stellarium, or else the process is statistically meaningless. We cannot make up our own constellations. Nor can we just use the dots without lines. That would be absurd; comparing 2-d shapes with just some dots makes no sense at all. How could the comparison be made? Note also that the relevant constellations defined by Stellarium are pretty much like those in other popular astronomical software. There's little variation in them, but even if there were, the correlation with Stellarium would still be relevant since it is one of the most popular astronomical software packages available.

First, we pin the scene by assuming the eagle/vulture = Sagittarius, since it is the only Greek constellation on the path of the sun (the circular disk at the centre of the scene) that has the shape of the head and wings of the vulture/eagle. This also orients the scene to dusk (a setting sun), which is therefore the orientation we should consider for all our pattern matches. This assumption frames the hypothesis. Logically, we do not need to show this assumption is true in advance (requiring this would be anti-scientific), although if other evidence is found that supports it, that would clearly enhance the hypothesis via Occam's razor. 

Scorpion: we should find a scorpion-like constellation below Sagittarius. Of course, this is exactly what we find. We could have found anything at all in this position, but we find exactly what is expected. Although the scorpion has the wrong vertical orientation, this is still highly significant (~1/40).

Canid: we should find a canid to the left of Scorpius. Again, this is exactly what we find, as Lupus is in exactly the right pose. We could have found anything in this position, but what we actually find is exactly what is expected. Perfect. The other Greek canid constellations are either invisible (Canis Major is too far south to be observed at the time) or are not nearly as good a match to the pose (Canis Minor) (~1/40).

Tall bird with snake/fish: we should find a tall bird + fish/snake constellation to the right of Scorpius. Instead we find Ophiuchus with his serpent, albeit slightly out of position. Although this is only a partial symbolic match, it is still highly significant as there could have been anything in this position. The shape of Ophiuchus is also quite a good match for the shape of the symbol (~4/40).

Duck/goose: we should find a water bird, a duck or goose, below the scorpion. Instead we find Libra, which was represented either as the claws of the scorpion or as the scales in ancient Greece and Babylon. So, not a good symbolic match. Moreover, the duck/goose on Pillar 43 is mostly obscured, making a shape match tricky, so we should probably omit this one from our analysis. Fortunately, Libra in Stellarium has the shape of a rubber duck, so a match is still plausible.

Bending bird: we should find a bird bent into a right-angle shape in the position of Pisces. Instead we have the two fish of Pisces. Although there are a few Greek bird constellations, none have the right-angle shape shown on Pillar 43. So even though the species is wrong, the shape is an excellent match, the best among all the Greek constellations. This is highly significant (~1/40).

Vertically splayed quadruped: we should find a vertically splayed quadruped constellation in the position of Virgo. And this is exactly what we find. Although Virgo is human, it is one of only a few visible Greek constellations with the shape of a splayed quadruped in this vertical orientation (~4/40).

Horizontal quadruped: we should find a horizontal quadruped constellation in the position of Gemini or Taurus (the switch from Gemini to Taurus for the winter solstice constellation occurs around 10,700 BCE). Although the Greek constellation Gemini is a pair of human twins with linked arms, that fits the description quite well. Taurus is a quadruped viewed from the side, but it's usually represented as just the head and shoulders, so not such a good fit. But there are several other Greek constellations that also have the shape of a horizontal quadruped with legs underneath (~8/40).

When we put all these correlations together we find an extremely strong correlation overall. My recently submitted manuscript (with my colleague Dr Dimitrios Gerogiorgis) goes into much more detail, and finds the odds of such a match arising by pure chance to be in the region of 1 in 10 million overall. This is extremely significant and deserves an explanation.
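As a rough sanity check on the order of magnitude, the approximate per-symbol figures quoted above can simply be multiplied together. The sketch below assumes the matches are independent, which is my simplification for illustration only; it is not the permutation analysis used in the papers or the submitted manuscript. The eagle/vulture is left out because it pins the scene, and the duck/goose is left out as suggested above.

```python
from fractions import Fraction
from math import prod

# Rough per-symbol chance estimates quoted in the text above.
per_symbol = {
    "scorpion": Fraction(1, 40),
    "canid": Fraction(1, 40),
    "tall bird with snake/fish": Fraction(4, 40),
    "bending bird": Fraction(1, 40),
    "vertically splayed quadruped": Fraction(4, 40),
    "horizontal quadruped": Fraction(8, 40),
}

# Naive combination: treat each match as an independent event.
combined = prod(per_symbol.values())

print(f"combined chance estimate: about 1 in {int(1 / combined):,}")
```

This naive product comes out at 1 in 32 million, within a small factor of the figure from the more detailed analysis, so the rough estimates above are at least consistent with it.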

Does it matter that some of the animal symbols are not in exactly the right position? Not much. In fact, the positional correlation of the symbols on Pillar 43 is actually quite good and adds further confidence to the hypothesis. Essentially, the strength of the permutation correlation should be multiplied by around 3.5. This was analysed rigorously already in my paper with Alistair Coombs (Athens Journal of History, 2019).


Bayes' theorem

Recall that for the permutation test we have parr = 1 – x/y. 

Bayes’ theorem tells us this can be biased by prior knowledge: parr = 1 – a·x/y.

If it is already known the hypothesis is wrong, then a = y/x.

If it is already known the hypothesis is correct, then a = 0.

In general, without any prior knowledge, it is fairest to set a = 1. This is an unbiased test.

It follows that any evidence that supports the hypothesis, besides the statistical test, will cause a < 1 (and a > 1 for counter-evidence). This agrees with Occam’s razor.

For example, the positional correlation of the main part of Pillar 43 analysed by Sweatman and Coombs (2019), in the absence of any other evidence, suggests setting a = 1/3.5.
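The effect of the factor a is easy to see numerically. The following Python sketch just evaluates parr = 1 – a·x/y for the cases listed above. The function name is hypothetical, and the value x/y = 1 in 10 million is used purely as an illustrative input.

```python
def p_arranged(x, y, a=1.0):
    """Evaluate parr = 1 - a * x / y.

    x: number of permutations scoring at least as well as the observed one
    y: total number of permutations
    a: weighting for other evidence (a = 1 means no other evidence is used)
    """
    return 1 - a * x / y


# Illustrative input only: x / y = 1 in 10 million.
x, y = 1, 10_000_000

print("unbiased test, a = 1:            ", p_arranged(x, y, a=1))
print("supporting evidence, a = 1/3.5:  ", p_arranged(x, y, a=1 / 3.5))
print("hypothesis known wrong, a = y/x: ", p_arranged(x, y, a=y / x))
print("hypothesis known correct, a = 0: ", p_arranged(x, y, a=0))
```

Supporting evidence (a < 1) pushes parr closer to 1, while the two limiting cases recover parr = 0 and parr = 1 exactly.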

Sweatman and Tsikritsis used an unbiased statistical test for Pillar 43.


The philosophy of hypothesis testing

This kind of analysis is known as ‘hypothesis testing’. It plays an important role in science, but science is more general than just hypothesis testing. Science considers all kinds of evidence, not just statistical tests. It also relies heavily on mathematics and logical deduction, as well as more qualitative inference, although then the conclusions must be more qualitative too.

The concept of ‘confidence’ is crucial in science; nothing is ever proven right or wrong. There are only degrees of confidence, and they depend entirely on the kind and strength of evidence provided.

You will probably have heard: ‘scientific theories can only be proven wrong, they cannot be proven right’. This is wrong, as there is no proof in science, either for or against a hypothesis. There are only degrees of confidence. Proof is valid in mathematics and logic, but not science.

So what about this: ‘scientific theories can only be proven almost certainly wrong, they cannot be proven almost certainly right’. Again, this is wrong. The ‘pill is deadly’ hypothesis is an obvious counter-example, as it shows that some kinds of scientific hypothesis can be shown to be ‘almost certainly right’.

You will probably have also heard ‘Correlation does not imply causation’. This quote is often misunderstood. Clearly, without knowing any context, it is true that correlation does not automatically imply causation. However, where one explanation for the observed correlation far outweighs all the others, then correlation does imply causation, with an associated confidence level. Again, the 'pill is deadly' hypothesis is a suitable example of this.

Clearly, it all depends on the kind of hypothesis being tested, and whether alternative explanations (models) are likely. The key here is to think about how many explanations there are, or mechanisms or models, that could conceivably explain the correlation, and how they compare to each other (i.e. use Occam's razor).

If we consider the ‘pill is deadly’ problem, it seems there is only one explanation for the observed correlation – the pill is deadly. It seems that nothing else can explain the data (I’m assuming the experiment is carried out perfectly). If we ask a different question, ‘how is the pill deadly?’, we cannot answer it with this data. This is because there are an unknown number of mechanisms by which the pill could be deadly (biochemistry is very complicated), and this data tells us nothing about them. We would need to collect different data to test different kinds of hypothesis about the mechanism.


Testing other models

Although it might seem there is only one explanation for the 'pill is deadly' problem, this is not true. In fact, there are other explanations for the observed correlation. It might be that N people died of other causes, and the other 50-N people that died were killed by taking the pill. Each one of these scenarios, for a different value of N, is a different explanation, or model, for the observed correlation. Therefore, each model should be tested separately, and the results compared. However, because it is rare for anyone to die suddenly, the most likely result is that the pill is very deadly, although we cannot rule out a tiny survival rate.
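One simple way to compare these models is to ask how likely it is that N of the 50 pill-takers would die of unrelated causes within a day. The Python sketch below assumes a background daily death rate of 1 in 10,000 per person. That rate is a made-up figure for illustration, not taken from any dataset, and the function name is hypothetical.

```python
from math import comb


def prob_background_deaths(n_deaths, group_size=50, daily_rate=1e-4):
    """Binomial probability that exactly n_deaths people in the group die
    of unrelated causes in one day, given an assumed background rate."""
    return (
        comb(group_size, n_deaths)
        * daily_rate**n_deaths
        * (1 - daily_rate) ** (group_size - n_deaths)
    )


for n in (0, 1, 2, 5, 50):
    print(f"N = {n:2d}: {prob_background_deaths(n):.3g}")
```

With any realistic background rate the N = 0 model dominates and the probability falls off very quickly as N grows, which is why the data points so strongly to a very deadly pill.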

The same is true for the 'people in a room' and Pillar 43 permutation tests. It is possible that some of the observed correlation in each case is caused by a different mechanism while the rest of it is due to the hypothesis being correct. This doesn't matter. Even if only a subset of the matches is due to the hypothesis being correct, it still counts.

On the other hand, other kinds of model should also be tested. This is where some or all of the observed correlation in each case is caused by a different mechanism, while the rest of it is pure luck. We should use Occam's razor to decide between the different models, i.e. which has the greatest explaining power. 

However, when one correlation is extremely significant and it has only one reasonable explanation, it in general rules out other explanations inconsistent with it to the same level of confidence. This is because parr = 1 - x/y. This is the case for the astronomical interpretation. The strength of the astronomical explanation tends to rule out other explanations that are inconsistent with it since it is the only reasonable explanation for that correlation.

The philosophy of pattern matching

Note that comparing artefacts by eye is routine in archaeology. Whether it is bones or stones or pieces of pottery, comparing them by eye to judge their provenance is standard practice, so the above procedure is in line with general methods accepted in archaeology. Nevertheless, a more rigorous digital analysis is carried out in my recently submitted manuscript with Dr Gerogiorgis.

But this digital analysis is not essential because our brain is brilliant at detecting correlations in patterns. This is how we sense and navigate the world, recognise faces, read and communicate. Everything we do involves detecting correlations - matching patterns we sense with those we are already familiar with. The stronger the correlation, the better we can recognise what the object, face, text or sound is. So the statistics of pattern matching is fundamental to our existence - it is not just an arcane topic in archaeoastronomy.

Indeed, to deny that this kind of pattern-matching exercise is possible is to deny the very nature of our existence. We operate according to the same principles.

Let's put this another way. Let's consider an extreme example - known as a 'limit' in physics jargon. Suppose instead the patterns on Pillar 43 consisted of dots joined by lines. And suppose they were incredibly similar to the expected ones in Stellarium. What then? Could anyone then deny that Pillar 43 encoded a set of constellations? Suppose the correlation was so good it was measured in the region of 1 in 1 decillion. Would that be good enough? Yes? Good (if your answer was no, then I think you are irrational).

So it must be admitted that it is possible to match patterns with confidence. This means it must be possible, in principle, to interpret ancient artworks with confidence too. Moreover, it doesn't matter that the astronomical hypothesis was invented after Pillar 43 was viewed because, clearly, it would be logically impossible to invent a hypothesis before Pillar 43 was viewed. Requiring a hypothesis a priori, i.e. before the Pillar is viewed, is anti-science. It would also invalidate all decoding of ancient languages, including the decipherment of Egyptian hieroglyphs, which were decoded only after the Rosetta stone was found. And it would invalidate the decoding of Palaeolithic lunar calendars by Bacon et al. (Cambridge Archaeological Journal, 2022), but I don't see the YouTube sceptics complaining about that paper.

So then we should ask, what level of correlation is acceptable for Pillar 43? Or equivalently, how much noise can we tolerate? Is it OK that the patterns on Pillar 43 are animal symbols rather than points and lines? Is it OK that they are not perfect matches to the expected constellations in Stellarium? Is it OK if some of their positions are slightly out?

The answer is up to you, but from a scientific perspective it's the statistics that tell us how confident we can be in the hypothesis. And Occam's razor tells us we should compare hypotheses to find the most likely explanation - the more we can explain with less information as input, the better the hypothesis.


Other models for Pillar 43

Notroff et al. proposed that the animal symbols on Pillar 43 are mythological creatures or guardians instead, and that perhaps different animals were important to different groups of people. They also made this proposal a posteriori, i.e. after first making their observations. However, this proposal cannot be tested statistically and it explains nothing about the specific arrangement of the animal symbols on Pillar 43. But the proposal that the animals were mythological creatures is not inconsistent with them also being constellations. So their proposal really adds very little to the debate, and it says nothing about the possibility the astronomical explanation is incorrect.

Regarding the possibility Pillar 43 signifies a skull cult (a more recent a posteriori suggestion by the site's more recent archaeological leadership; Gresky et al. (2017)), not only does the circular disk at the centre of Pillar 43 not resemble a skull (there is clear evidence that the artists could have made it look like a skull or head if they wanted to - simply look at the other carvings), but again this proposal cannot be tested statistically and it explains nothing about the specific arrangements of animals on Pillar 43 or any of the other geometric symbols. It is also almost certainly wrong, because for the astronomical interpretation we find parr is extremely close to 1, and the only reasonable explanation for that correlation is the astronomical hypothesis. Occam's razor also says we should overwhelmingly prefer the astronomical interpretation.

So, we can determine how deadly the pill is, but not why it is deadly, given only the death-rate. And we can determine whether it is likely the people in the room were deliberately arranged, but not how. And we can determine how likely it is the animal symbols on Pillar 43 represent constellations similar to the Greek ones. But this analysis does not explain how this happened.

Regarding this latter issue, the probability of two unrelated cultures inventing similar constellation sets is so small, it is reasonable to assume there was some degree of cultural transmission of constellation data between Gobekli Tepe and us. This is a far more likely explanation for the observed correlation. Occam's razor to the rescue again.

However, Bayes’ theorem tells us that other evidence can influence the statistical test for Pillar 43, i.e. Occam’s razor applies. Sweatman and Tsikritsis made an unbiased test; a = 1. This is fine, provided there are reasonable grounds for believing the symbolism at Gobekli Tepe could be astronomical.

Fortunately, there is now plenty of good evidence that Sweatman and Tsikritsis were right; there are strong grounds for believing the symbolism at Gobekli Tepe is mainly astronomical. See my previous post about the 'lunisolar calendar' paper submitted for peer review, which discusses this extensive evidence. See also my previous post about the 'Origins of the Greek constellations' paper submitted with Dr Gerogiorgis for peer review.

This then justifies Sweatman and Tsikritsis' unbiased statistical test. In fact, given the strength of the evidence in those two submitted papers, it's likely that a << 1. This evidence also reveals how this correlation happened. As expected, it is very likely it was culturally transmitted.


Notroff et al.'s rebuttal

As well as suggesting an alternative explanation for the animal symbols (see above), Notroff et al. (2017) proposed four other arguments against our 2017 paper, as follows:
  1. They thought Gobekli Tepe’s enclosures were roofed, and that would limit astronomical observations.
  2. They suggested that the ~1000 year gap between the Younger Dryas impact and the dating of Enclosure D was a problem.
  3. They thought our selection of pillars was arbitrary.
  4. They were sceptical that Greek constellations could be recognised on a Pillar that preceded them by something like 9000 years.
In our counter-rebuttal (Sweatman and Tsikritsis (2017*)), we responded along the following lines:
  1. It makes no difference to the decoding of the symbols whether GT was roofed or not. It’s totally irrelevant.
  2. The ~1000 year gap between the YD impact and the earliest radiocarbon date of Enclosure D is entirely expected. After all, we would not expect a grand structure to be built immediately after the impact event, since at the time of the impact dwellings were typically much more primitive – they were basically relatively small semi-subterranean round-houses. There was no sign of the monumentality we observe at Gobekli Tepe then. Tell Qaramel probably had the largest known structures at that time, but even these don’t come close to Gobekli Tepe's enclosures, and the prominent symbolism isn’t present there either. No, it must take a long time to go from Early Natufian architecture to Gobekli Tepe's enclosures – perhaps 1000 years. And it takes time for a religion or cult based on the Younger Dryas impact to develop and then motivate the construction of such grand enclosures with such fantastic carvings. So a significant time gap between the YD impact and the temple-like structures of Gobekli Tepe is entirely expected.
  3. The accusation that our selection of pillars is arbitrary is nonsense. In what sense is it arbitrary? We selected the only pillar that had enough information to decode with confidence – the most highly decorated pillar where there is an obvious solar symbol. This is not arbitrary. Everything else follows from that.
  4. As for the stability of constellations, Notroff et al. offer no evidence to support their view. It’s just their opinion. However, Klaus Schmidt, who led excavations initially, suggested the boar symbol could be related to the Erymanthian Boar of classical Greece, and he suggested the snakes could be precursors to the Uraeus symbol of ancient Egypt. If these symbols have endured, then why not constellations? But for some unspecified reason Klaus didn’t think the scorpion was related to Scorpius, despite the obvious solar disk on Pillar 43. This seems to be inconsistent, but is a common bias against ancient astronomy among archaeologists (see here). Given all the likely astronomical symbolism at Gobekli Tepe, this bias is unsupportable. Also, consider European Palaeolithic cave art in caves such as Lascaux and Chauvet. The art in those caves is separated by over 20,000 years, and yet there is very little difference in style and technique. So that artistic tradition must have been culturally transmitted down the generations for over 20,000 years. If artistic culture can be transmitted for tens of thousands of years, so can constellations. Furthermore, there is good evidence (see D'Huy, Berezkin and related papers) that some constellations, like Orion and the Pleiades, are extremely old and might have originated in the middle Palaeolithic (> 50,000 years ago). Thus, the possibility that some constellations, which reference patterns in the sky that are relatively fixed and that everyone can see, have come down to us from Gobekli Tepe is entirely plausible. Of course, I’m not saying there haven’t been any changes to these constellations. There clearly have been. All I’m saying is that many of them are still recognisable.

Archaeologists' biases

Perhaps the bias against archaeoastronomy displayed by Notroff et al., and some other archaeologists and YouTube commentators, is simply because they prefer to look down and not up. In fact, to my knowledge, no courses on archaeoastronomy are included in any archaeological or anthropological undergraduate degree programmes in any university anywhere in the world. This problem is harming the archaeology and anthropology professions because it is clear our ancient ancestors were very interested in the sky. And this is obvious to everyone else.

-------------------------------------------------------------------------
Sweatman and Tsikritsis (2017), Mediterranean Archaeology and Archaeometry vol. 17(1), pp. 233-250.
Notroff et al. (2017), Mediterranean Archaeology and Archaeometry vol. 17(2), pp. 57-74.
Sweatman and Tsikritsis (2017*), Mediterranean Archaeology and Archaeometry vol. 17(2), pp. 57-74.
Gresky, Haelm and Clare (2017), Science Advances vol. 3(6), e1700564.

