Stellar Exploratory Data Analysis or How to create the HR Diagram with R

I have recently started refreshing my skills with the R programming language. I am doing the Harvard course on Data Science on edX, using RStudio for all the exercises. In the second part of the course, Visualisation, which is an area of research interest for me, there is an exercise on the stars dataset. This exercise was available only to those taking the course for credit. Since I was only auditing, I left the exercise as it was. After a week or so I looked at the stars dataset again and thought I should explore it. For this we have to load the R package dslabs, specially designed for this course. This post details the exploratory data analysis of this dataset. (Disclaimer: I have used help from ChatGPT in writing this post for both content and code.)

> library(dslabs)

Once this is loaded, we load the stars dataset

data(stars)

Structure of the dataset

To understand what data this dataset contains and how it is structured, we can use several approaches. The head(stars) command gives us the first few lines of the dataset.

> head(stars)
            star magnitude temp type
1            Sun       4.8 5840    G
2        SiriusA       1.4 9620    A
3        Canopus      -3.1 7400    F
4       Arcturus      -0.4 4590    K
5 AlphaCentauriA       4.3 5840    G
6           Vega       0.5 9900    A

While tail(stars) gives the last few lines of the dataset:

> tail(stars)
           star magnitude  temp type
91  *40EridaniA       6.0  4900    K
92  *40EridaniB      11.1 10000   DA
93  *40EridaniC      12.8  2940    M
94 *70OphiuchiA       5.8  4950    K
95 *70OphiuchiB       7.5  3870    K
96   EVLacertae      11.7  2800    M

To understand structure further we can use the str(stars) command

> str(stars)
'data.frame': 96 obs. of 4 variables:
 $ star     : Factor w/ 95 levels "*40EridaniA",..: 87 85 48 38 33 92 49 79 77 47 ...
 $ magnitude: num  4.8 1.4 -3.1 -0.4 4.3 0.5 -0.6 -7.2 2.6 -5.7 ...
 $ temp     : int  5840 9620 7400 4590 5840 9900 5150 12140 6580 3200 ...
 $ type     : chr  "G" "A" "F" "K" ...

In RStudio we can also see the data in a much nicer tabular form with the View(stars) function (the dataset name is lowercase, and R is case-sensitive). It opens the data in another pane, as shown below.

Thus we see that it has 96 observations of four variables, namely star, magnitude, temp and type. The str(stars) command also tells us the data type of each column, and they are all different: factor, num, int and chr. Let us understand what each of the columns represents.

Name of stars

The star variable has the names of the stars, as seen in the table above. Many of the names are of ancient and mythological origin, while some are modern. Most are of Arabic origin, while a few come from Latin. Have a look at Star Lore of All Ages by William Olcott to learn some of the mythologies associated with these names. Typically, a letter after a star name indicates that the star is part of a stellar system; for example, Alpha Centauri is a triple star system. The nomenclature is such that A represents the brightest member of the system, B the second brightest, and so on. Also notice that some names have Greek prefixes, as in the case of Alpha Centauri. This Greek letter scheme was introduced by Bayer in 1603 and is known as the Bayer designation. The Greek letters roughly rank the stars of a given constellation by visual magnitude or brightness (we will come to the meaning of this next). So Alpha Centauri would be the brightest star in the constellation Centaurus. Before the invention of the telescope, the number of observable stars was limited by the faintest magnitude the human eye can detect, which is about +6. With the invention of the telescope, and the steady increase in light-gathering power since, we discovered more and more stars. Galileo was the first to observe new stars through a telescope and publish them, in his Sidereal Messenger. He showed that, seen through the telescope, there are many more stars in the Pleiades cluster than can be seen with the naked eye (limiting magnitude ~+6 to at most +7, with about 4200 stars possibly visible).

Soon, so many new stars were discovered that it was not possible to name them all, so stars began to be given coded designations. Large telescopes with big and powerful lenses swept the sky and produced catalogues of stars. Some of the names in the dataset point to these catalogues; for example, HD denotes the Henry Draper Catalogue.

Magnitudes of stars

Now let us look at the other three columns, which present us with observations of these stars, and understand what they mean. The second column represents the magnitude of the stars. Stellar magnitude is of two types: apparent and absolute. The apparent magnitude is a measure of the brightness of the star as we see it, and depends on its actual brightness, its distance from us and any loss of brightness due to intervening media. The magnitude scale goes back to antiquity: it is usually credited to Hipparchus and was popularised by Claudius Ptolemy in the second century. The first magnitude stars were the brightest in the sky, with the sixth being the dimmest. The modern scale follows this classification and has made it mathematical. The scale is reverse logarithmic, meaning that the lower the magnitude, the brighter the object. A magnitude difference of 1.0 corresponds to a brightness ratio of $\sqrt[5]{100}$, or about 2.512. Now if you are wondering why the magnitude scale is logarithmic, the answer lies in the physiology of our visual system. As with the auditory system, the response of our visual system is not linear but logarithmic. What this means is that equal steps in perceived brightness correspond to equal ratios in actual brightness (as measured by a photometer): one magnitude step is a factor of about 2.5. This fact is encapsulated well in the Weber-Fechner law. The apparent magnitude of the Sun is about -26.7; it is, after all, the brightest object in the sky for us. Venus, when it is brightest, is about -4.9. The apparent magnitude of Neptune is +7.7, which explains why it was undiscovered until the invention of the telescope.
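The magnitude arithmetic above is easy to check numerically. The following is a minimal sketch in base R; the helper name brightness_ratio is made up for this illustration and is not part of dslabs or the course material.

```r
# Brightness ratio corresponding to a magnitude difference.
# Five magnitudes are defined as a factor of exactly 100,
# so one magnitude is 100^(1/5), about 2.512.
brightness_ratio <- function(delta_mag) {
  100^(delta_mag / 5)
}

one_step   <- brightness_ratio(1)   # about 2.512
five_steps <- brightness_ratio(5)   # 100

# Sun (-26.7) versus Venus at its brightest (-4.9), the values quoted above
sun_vs_venus <- brightness_ratio(-4.9 - (-26.7))

print(c(one_step, five_steps, sun_vs_venus))
```

Because the scale is logarithmic, magnitude differences add while brightness ratios multiply, which is why the Sun comes out several hundred million times brighter than Venus in our sky.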

But looking at the table above, the very first entry lists the Sun’s magnitude as +4.8. This is because the dataset contains the absolute magnitude and not the apparent magnitude. Absolute magnitude is defined as the “apparent magnitude that the object would have if it were viewed from a distance of exactly 10 parsecs (32.6 light-years), without dimming by interstellar matter and cosmic dust.” As we know, the brightness of an object is inversely proportional to the square of the distance (inverse square law). Due to this fact, very bright objects can appear very dim if they are very far away, and vice versa. Thus if we place the Sun at a distance of about 32.6 light years it will be not-so-bright and will be an “average” star with magnitude +4.8. The difference between these two magnitudes is -31.57, and this translates to a huge brightness ratio of about $4.2 \times 10^{12}$. And of course this definition does not take into account the interstellar matter, which further dims the stars. Thus to find the absolute magnitude of a star we also need to know its distance. This is possible for some nearby stars for which the parallax has been detected. But for a vast majority of stars, the parallax is too small to be detected because they are too far away. The distance measure parsec we saw earlier is defined on the basis of parallax: one parsec is the distance at which 1 AU (astronomical unit: the distance between the Earth and the Sun) subtends an angle of one arcsecond, or 1/3600 of a degree.

Thus finding the distance to the stars is crucial if we want to know their actual magnitudes. Various techniques are used for finding cosmic distances; we will not go into their details. For our current purpose, it is enough to know that the stars dataset has the absolute magnitudes of stars. The range of magnitudes in the dataset is
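To make the two definitions above concrete, here is a small sketch in base R. It uses the standard distance-modulus relation $M = m - 5 \log_{10}(d / 10\,\mathrm{pc})$, which is not stated in this post but follows from the inverse square law and the magnitude scale; the function and variable names are hypothetical.

```r
# One parsec in AU: the distance at which 1 AU subtends one arcsecond.
arcsec_in_radians <- (1 / 3600) * pi / 180
au_per_parsec <- 1 / tan(arcsec_in_radians)   # about 206265 AU

# Absolute magnitude from apparent magnitude m and distance in parsecs
# (distance modulus; ignores dimming by interstellar matter).
absolute_magnitude <- function(m, distance_pc) {
  m - 5 * log10(distance_pc / 10)
}

# The Sun: apparent magnitude -26.7 (as quoted above) at a distance of 1 AU.
sun_distance_pc <- 1 / au_per_parsec
sun_abs_mag <- absolute_magnitude(-26.7, sun_distance_pc)

print(au_per_parsec)
print(sun_abs_mag)   # close to the +4.8 listed in the dataset
```

Moving the Sun from 1 AU out to 10 parsecs adds roughly 31.6 magnitudes, which is the difference between the two solar magnitudes discussed above.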

> range(stars$magnitude)
[1] -8 17

Thus the stars in the dataset span 25 magnitudes, that is, a brightness ratio of $10^{10}$! Which are the brightest and dimmest stars? And how many stars of each magnitude are there in the dataset? We can answer these types of questions with simple queries to our dataset. For starters, let us find the brightest and dimmest stars in the dataset. Each row in the dataset has an index, which is the first column in the table from RStudio above. Thus if we were to write:

> stars[1]

it will give us all the entries of the first column, star

             star
1             Sun
2         SiriusA
3         Canopus
4        Arcturus
5  AlphaCentauriA
6            Vega
7         Capella
8           Rigel
9        ProcyonA
10     Betelgeuse
... ...

But if we want only a single row instead of a column, we have to say so by putting a comma after the index 1. Thus for the first row we write

> stars[1,]
  star magnitude temp type
1  Sun       4.8 5840    G

Thus to find the brightest or dimmest star we have to find its index, and then we can read its name from the corresponding row. So how do we do that? For this we have the functions which.max and which.min. Since a larger magnitude means a dimmer star, which.max gives the index of the dimmest one:

> which.max(stars$magnitude)
[1] 76

We feed this index to the dataset and get

> stars[76,]
     star magnitude temp type
76 G51-I5        17 2500    M

Finding the index and looking up the row can also be done in a single line. Here we do so for the brightest star, using which.min:

> stars[which.min(stars$magnitude), ]
                star magnitude temp type
45 DeltaCanisMajoris        -8 6100    F

Now let us check the distribution of these magnitudes. The simplest way to do this is to create a histogram using the hist function.

> hist(stars$magnitude)

This gives the following output

As we can see, hist has by default binned the magnitudes in bins of 5 units, and the distribution is bimodal, with one peak between -5 and 0 and another between 10 and 15. We can reduce the width of the bars to get a much finer picture of the distribution. For this the hist function has a breaks argument for setting the bin edges manually. We use the seq function here to generate breaks from -10 to 20 in steps of 1.

> hist(stars$magnitude, breaks = seq(-10, 20, by = 1))

And this gives us:

Thus we see that the fullest bin, with 9 stars, is the one between -1 and 0, that three bins have one star each, and that the bin between +3 and +4 has no stars at all. This histogram could be made more reader friendly if we add the count on top of each bar. For this we need some coordinates and numbers. We first get the counts by computing the histogram without plotting it:

> mag_data <- hist(stars$magnitude, breaks = seq(-10, 20, by = 1), plot = FALSE)

The counts component of this object holds the actual number of stars in each bin:

> mag_data$counts
[1] 0 1 2 1 7 6 4 3 3 9 6 4 4 0 2 5 2 2 2 1 5 7 3 7 5 3 2 0 0 0

Now, to place the counts above the middle of each bar of the histogram we need the midpoints of the bars. We use mag_data$mids for the positions and mag_data$counts for the labels.

> text(mag_data$mids, mag_data$counts, labels = mag_data$counts, pos = 3, cex = 0.8, col = "black")

To get the desired graph

Thus we have a fairly large distribution of stellar magnitudes.
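To see what the object returned by hist() with plot = FALSE actually holds, here is a small self-contained sketch. It uses only the twelve magnitudes printed by head(stars) and tail(stars) earlier, so it runs without dslabs; the names mag_sample and bins are made up for this illustration.

```r
# Absolute magnitudes of the twelve stars shown by head(stars) and tail(stars)
mag_sample <- c(4.8, 1.4, -3.1, -0.4, 4.3, 0.5,
                6.0, 11.1, 12.8, 5.8, 7.5, 11.7)

# Compute the histogram without drawing it
h <- hist(mag_sample, breaks = seq(-10, 20, by = 5), plot = FALSE)

# Each bin is the interval (lower, upper]: by default hist() closes
# the bins on the right, so a value equal to a break goes in the lower bin.
bins <- data.frame(
  lower = head(h$breaks, -1),
  upper = tail(h$breaks, -1),
  mid   = h$mids,
  count = h$counts
)
print(bins)
```

The mid column is what text() uses as the x position of each label, and the count column is both the y position and the label itself.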

Now suppose we ask ourselves this question: how many stars in this dataset are visible to the naked eye? What can we say? We know that the limiting magnitude for the naked eye is +6. So a simple query should suffice:

> library(dplyr)
> count(stars %>% filter(magnitude <= 6))
   n
1 57

(Here we have used the pipe operator %>%, available through the dplyr package, to pass the data from one function to the next.) This query shows that we have 57 stars with magnitude less than or equal to 6. Hence this many should be visible… But wait, it is the absolute magnitude that we have in this dataset, so the question itself cannot be answered unless we have the apparent magnitudes of the stars. Though computationally correct, this answer has no physical meaning, as absolute magnitude cannot be treated the same as the apparent magnitude which we experience while watching the stars.

Temperature of Stars

The third column in the dataset is temp, the temperature. Now, at one point in the history of astronomy people believed that we would never be able to understand the structure or the content of the stars. But the invention of spectroscopy as a discipline and its application to astronomy made this possible. With the spectroscope attached to the end of the telescope (astronomical spectroscopy), we could now understand the composition of the stars, their speed and their temperature. The information about the composition came from the various emission and absorption lines in the spectra of the stars, which were then compared with similar lines produced in the laboratory by heating various elements. Helium was first discovered in this manner: first in the spectrum of the Sun and then in the laboratory. For the detailed story of stellar spectroscopy one can see the book Astronomical Spectrographs and Their History by John Hearnshaw. An exact understanding of the origin of the spectral lines, though, came only after the advent of quantum mechanics in the early part of the 20th century.

But the spectrum also tells us about the surface temperature of the stars. How is this so? For this we need to invoke one of the fundamental ideas in physics: blackbody radiation. If we measure the intensity of radiation from a body at different wavelengths (or frequencies) we get a curve. This curve has a typical shape, and for each temperature we get a unique curve (they don’t intersect). Of course this is true for an ideal blackbody, which is an idealized opaque, non-reflective body. A stellar spectrum is like that of an ideal blackbody, except that the continuous spectrum is punctuated with absorption and emission lines, as shown on the book cover above.

The wavelength (or frequency) at which the radiation has maximum intensity is related to the temperature of the body; typical curves are shown above. Stars behave almost as ideal black bodies. Notice that as the temperature of the body increases, the peak of the radiation shifts to shorter wavelengths (higher frequencies), as shown in the diagram above; this is Wien’s displacement law. The total energy radiated, in turn, is given by the Stefan-Boltzmann formula

$$L = 4 \pi R^{2} \sigma T^{4}$$

where $L$ is the luminosity, $R$ is the radius, $\sigma$ is the Stefan-Boltzmann constant and $T$ is the temperature. This equation tells us that $L$ depends much more strongly on $T$ (fourth power) than on $R$ (second power), so for a given size hotter stars are much brighter.
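A quick numerical sketch may help here. It assumes ideal blackbody behaviour and uses Wien's displacement law, $\lambda_{\max} = b / T$ with $b \approx 2.898 \times 10^{-3}$ m K, for the peak wavelength, together with the luminosity formula above. The function names are hypothetical; the temperatures are the Sun's value and the two extremes that appear in the stars dataset.

```r
# Wien displacement constant in metre kelvin
wien_b <- 2.897771955e-3

# Peak wavelength of an ideal blackbody, in nanometres
peak_wavelength_nm <- function(temp_k) {
  wien_b / temp_k * 1e9
}

# Luminosity ratio of two stars from L = 4 * pi * R^2 * sigma * T^4;
# the constants cancel, leaving (R1/R2)^2 * (T1/T2)^4
luminosity_ratio <- function(temp_ratio, radius_ratio = 1) {
  radius_ratio^2 * temp_ratio^4
}

# Coolest star in the dataset, the Sun, and the hottest star
peaks <- peak_wavelength_nm(c(2500, 5840, 33600))
print(round(peaks, 1))

# Same radius, twice the temperature: 2^4 = 16 times the luminosity
double_temp <- luminosity_ratio(2)
print(double_temp)
```

Under these assumptions the coolest stars peak in the infrared (beyond 1000 nm), the Sun peaks in the visible at around 500 nm, and the hottest star peaks far in the ultraviolet.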

It was the failure of the classical ideas of radiation and thermodynamics to explain the nature of blackbody radiation that led Max Planck to the quantisation of energy, in the form of Planck’s law, and thus to the beginnings of quantum mechanics. For a detailed look at this path-breaking episode in the history of science, one of the classics is Thomas Kuhn’s Black-Body Theory and the Quantum Discontinuity, 1894-1912.

To return to the stars: hotter bodies have shorter peak wavelengths. In other words, blue stars are hotter than red ones. (The hot and cold colour symbols on our plumbing fixtures need to change: we have it completely wrong!) Thus the spectrum of a star gives us its absolute temperature, along with all the other information that we can obtain about it. The spectrum is our only source of information about stars. This temperature is what is represented in the third column of our data. Our dataset covers a wide range of stellar temperatures.

> range(stars$temp)
[1]  2500 33600

Let us explore this column a bit. If we plot a histogram with default options we get:

> hist(stars$temp)

This shows that most stars have a temperature below 10000. We can use bins of 1000 and add labels to get a much better sense. Which star has 0 temperature?

> hist(stars$temp, breaks = seq(0, 35000, 1000))
> temp_data <- hist(stars$temp, breaks = seq(0, 35000, 1000), plot = FALSE)
> text(temp_data$mids, temp_data$counts, labels = temp_data$counts, pos = 3, cex = 0.8, col = "black")

This plot gives us a much better sense of the distribution of stellar temperatures, with the most populated bin being the 2000-3000 kelvin range. The table() function also provides useful information about the distribution of temperatures in the column.

> table(stars$temp)

 2500  2670  2800  2940  3070  3200  3340  3480  3750  3870  4130  4590
    1    10     7     5     1     3     4     1     1     2     3     3

 4730  4900  4950  5150  5840  6100  6580  6600  7400  7700  8060  9060
    1     5     1     2     2     2     1     1     2     1     2     1

 9300  9340  9620  9700  9900 10000 11000 12140 12400 13000 13260 14800
    1     2     3     1     4     1     1     1     1     1     1     1

15550 20500 23000 25500 26950 28000 33600
    1     4     2     5     1     2     1

While the summary() function provides the basic statistics:

> summary(stars$temp)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   2500    3168    5050    8752    9900   33600

Type of Stars

The fourth and final column of our data is type. This category is again based on the spectral data of stars and is a spectral classification of stars. “The spectral class of a star is a short code primarily summarizing the ionization state, giving an objective measure of the photosphere’s temperature.” The categories of stars and their physical properties are summarised in the table below. The type of a star and its temperature are related, with “O” type stars being the hottest, while “M” type stars are the coolest. The Sun is an average “G” type star. There are several mnemonics that can help one remember the ordering of the stars in this classification. One that I still remember from my astrophysics class is Oh Be A Fine Girl/Guy Kiss Me Right Now. Also notice that this “type” classification is related to the size of the stars in terms of solar radius. In our dataset, we can see what types of stars we have with

> stars$type
 [1] "G"  "A"  "F"  "K"  "G"  "A"  "G"  "B"  "F"  "M"  "B"  "B"  "A"  "K"
[15] "B"  "M"  "A"  "K"  "A"  "B"  "B"  "B"  "B"  "B"  "B"  "A"  "M"  "B"
[29] "K"  "B"  "A"  "B"  "B"  "F"  "O"  "K"  "A"  "B"  "B"  "F"  "K"  "B"
[43] "B"  "K"  "F"  "A"  "A"  "F"  "B"  "A"  "M"  "K"  "M"  "M"  "M"  "M"
[57] "M"  "A"  "DA" "M"  "M"  "K"  "M"  "M"  "M"  "M"  "K"  "K"  "K"  "M"
[71] "M"  "G"  "F"  "DF" "M"  "M"  "M"  "M"  "K"  "M"  "M"  "M"  "M"  "M"
[85] "M"  "DB" "M"  "M"  "A"  "M"  "K"  "DA" "M"  "K"  "K"  "M"

Our Sun is a G-type star in this classification (first entry). If we use the table() function on this column we get the frequency of each type of star in the dataset.

> table(stars$type)

 A  B DA DB DF  F  G  K  M  O
13 19  2  1  1  7  4 16 32  1

And to see a barplot of this table we will use the ggplot2 package. Load the package using library(ggplot2) and then

> stars %>% ggplot(aes(type)) + geom_bar() + geom_text(stat = "count", aes(label = after_stat(count)), vjust = -0.5, size = 4)

Thus we see that “M” type stars are the most numerous in our dataset. But we can do better: we can sort this data according to the frequency of the types. For this we use the code:

> # count the frequencies
> type_count <- table(stars$type)
> # sort them
> sorted_type <- names(sort(type_count))
> # reorder the factor levels and plot
> stars$type <- factor(stars$type, levels = sorted_type)
> stars %>% ggplot(aes(type)) + geom_bar(fill = "darkgray") + geom_text(stat = "count", aes(label = after_stat(count)), vjust = -0.5, size = 4)

And we get

To plot HR Diagram

Now, given my training in astronomy and astrophysics, my first reaction on seeing this data was: this is the data for the HR diagram! The HR diagram presents the fundamental relationship between the luminosity (or absolute magnitude) of stars and their temperature (or spectral type). It was a crucial step in understanding stellar evolution. The initials HR stand for the two astronomers who independently found this relationship: the diagram was created in 1911 by Ejnar Hertzsprung and, independently, in 1913 by Henry Norris Russell.

By the early part of the 20th century several star catalogues were available, but nothing about stellar evolution or structure was known. Stellar spectrographs revealed what elements were present in the stars, but the energy source of the stars was still an unresolved question. Classical physics had no answer to this fundamental question of how stars were able to create so much energy (for example, see Stars: A Very Short Introduction by James Kaler on Lord Kelvin and the idea that charcoal powers the Sun). Added to this was the age of the stars: from geological data and the idea of geological deep time, the Sun was estimated to be 4 billion years old, as was the Earth. So stars had been producing so much energy for such a long time! But that is not the point of this post. The HR diagram definitely helped astronomers think about the idea that stars might not be static but evolve in time. The International Astronomical Union conducted a special symposium titled The HR Diagram in 1977. The proceedings of the symposium have several articles of interest on the history of the creation and interpretation of the HR diagram.

I think it was only natural that astronomers tried to find correlations between the various properties of the thousands of stars in these catalogues. And when they did, they found a (co-)relationship between them. The HR diagram exists in many versions, but the basic idea is to plot the absolute magnitude against the temperature (or colour index). Let us plot these two to see the correlation; for this we again use the ggplot2 package and its scatterplot function geom_point().

> stars %>% ggplot(aes(temp, magnitude)) + geom_point()

This gives us the basic plot of HR diagram.

Immediately we can see that the stars are not randomly scattered on this plot, but are grouped in clusters, and most of them lie in a band, which is called the “Main Sequence”. There are outliers, though: some at low temperature and low magnitude, and some at high magnitude with temperatures in the 10-15 thousand range. We can try to fit a curve to this plot using the geom_smooth() function from the ggplot2 package:

> stars %>% ggplot(aes(temp, magnitude)) + geom_point() + geom_smooth(se = FALSE, color = "red")

Of course this smooth curve is a very crude (perhaps wrong?) approximation of the data, but it certainly points us towards some sort of correlation between the two quantities for most of the stars. But wait, we have another categorical variable in our dataset, the type of stars. How are the different types of stars distributed on this curve? For this we introduce the type variable in the aesthetics argument of ggplot() to colour the stars on our plot according to this category:

> stars %>% ggplot(aes(temp, magnitude, color = type)) + geom_smooth( se = FALSE, color = "red") + geom_point()

This produces the plot

Thus we see that the stars group by type. Of course the colours in the palette here are not true representations of the star colours. The HR diagram was first published around 1911-13, when quantum mechanics was in its nascent stages and Rutherford’s model of the atom had only just come out. The fact that this diagram indicated a relationship between magnitude and temperature led to thinking about stellar structure itself and its ways of producing energy, using fundamentally new ideas about matter and energy from quantum mechanics and their transformation from relativistic physics. But that is a story for the future. For now, let us come back to our HR diagram. We have one more variable in the dataset, the star name, which could be used in this plot. We can name all the stars in the plot (there are only 96). For this we use the geom_text() function from ggplot2:

> stars %>% ggplot(aes(temp, magnitude, color = type)) + geom_smooth(se = FALSE, color = "red") + geom_point() + geom_text(aes(label = star), nudge_y = 0.5, size = 3)

This produces a rather messy plot, where most of the star names are on top of each other and not readable:

To overcome this clutter we use another package, ggrepel, with the following code:

> library(ggrepel)
> stars %>% ggplot(aes(temp, magnitude, color = type)) + geom_smooth(se = FALSE, color = "red") + geom_text_repel(aes(label = star))

This produces the plot with the warning "Warning message: ggrepel: 13 unlabeled data points (too many overlaps). Consider increasing max.overlaps ". To overcome this we increase the max.overlaps to 50.

> stars %>% ggplot(aes(temp, magnitude, color = type)) + geom_point() + geom_smooth(se = FALSE, color = "red") + geom_text_repel(aes(label = star), max.overlaps = 50)

This still appears a bit cluttered. Scaling the plot up while exporting gives this plot, though one would need to zoom in to read the labels.

Of course with a different data set, with larger number and type of stars we would see slightly different clustering, but the general pattern is the same.
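One convention worth noting: published HR diagrams are usually drawn with both axes reversed, so that hot stars sit on the left and bright stars (low magnitude) at the top. The sketch below shows one way to do this in base R. It is built only from the twelve rows printed by head(stars) and tail(stars) above, so it runs without dslabs; the name hr_sample and the axis labels are made up for this illustration.

```r
# Twelve stars copied from the head(stars) and tail(stars) output
hr_sample <- data.frame(
  star = c("Sun", "SiriusA", "Canopus", "Arcturus", "AlphaCentauriA", "Vega",
           "*40EridaniA", "*40EridaniB", "*40EridaniC",
           "*70OphiuchiA", "*70OphiuchiB", "EVLacertae"),
  magnitude = c(4.8, 1.4, -3.1, -0.4, 4.3, 0.5,
                6.0, 11.1, 12.8, 5.8, 7.5, 11.7),
  temp = c(5840, 9620, 7400, 4590, 5840, 9900,
           4900, 10000, 2940, 4950, 3870, 2800)
)

# Reversed limits: hottest on the left, brightest at the top
x_limits <- rev(range(hr_sample$temp))
y_limits <- rev(range(hr_sample$magnitude))

plot(hr_sample$temp, hr_sample$magnitude,
     xlim = x_limits, ylim = y_limits, pch = 19,
     xlab = "Temperature (K)", ylab = "Absolute magnitude")
text(hr_sample$temp, hr_sample$magnitude,
     labels = hr_sample$star, pos = 3, cex = 0.7)
```

With ggplot2 the same effect should be obtainable by adding scale_x_reverse() and scale_y_reverse() to the plotting pipelines used in this post.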

We thus see that starting from the basic data wrangling we can generate one of the most important diagrams in astrophysics. I learned a lot of R in the process of creating this diagram. Next task is to

How big is the shadow of the Earth?

The Sun is our ultimate light source on Earth. The side of the Earth facing the Sun is bathed in sunlight, and due to the Earth’s rotation this side changes continuously. The side which faces the Sun has day, and the other side has night, in the shadow of the entire Earth. The Sun being an extended source (and not a point source), the Earth’s shadow has both an umbra and a penumbra. The umbra is the region where no light falls, while the penumbra is a region where some light falls. In the case of an extended source like the Sun, this means that light from some part of the Sun does fall in the penumbra. Occasionally, when the Moon falls in this shadow, we get a lunar eclipse. Sometimes it is a total lunar eclipse, while many other times it is a partial lunar eclipse. A total lunar eclipse occurs when the whole Moon falls in the umbra, while a partial one occurs when only a part of the Moon is in the umbra. On the other hand, when the Moon is between the Earth and the Sun, we get a solar eclipse. In the places where the umbra of the Moon’s shadow falls, which form a narrow path on the surface of the Earth, we get a total solar eclipse, and in places where the penumbra falls a partial solar eclipse is visible. But how big is this shadow? How long is it? How big is the umbra and how big is the penumbra? We will do some rough calculations to estimate these answers, and some more to understand the phenomena of eclipses.

We will start with the reasonable assumption that both the Sun and the Earth are spheres. The radii of the Sun, the Earth and the Moon, and the respective distances between them, are known. The Sun-Earth-Moon system being a dynamic one, the distances change depending on the configuration, but we can assume average distances for our purpose.

[The image above is interactive: move the points to see the changes. This construction is not to scale! The simulation was created with Cinderella.]

The diameter of the Earth is approximately 12,742 kilometers, and the diameter of the Sun is about 1,391,000 kilometers, hence the ratio is about 109, while the distance between the Sun and the Earth is about 149 million kilometers. A couple of illustrations depict it on the correct scale.

The Sun’s (with center A) diameter is represented by DF, while EG represents the Earth’s (with center C) diameter. We connect the centers of the Earth and the Sun. The umbra is limited in extent to the cone with base EG and height HC, while the penumbra is infinite in extent, expanding from EG to infinity. The shadow changes in intensity gradually from umbra to penumbra. If we take a projection of the system on a plane bisecting the spheres, we get two similar triangles HDF and HEG. We have made the assumption that the properties of similar triangles from Euclidean geometry are valid here.

In the schematic diagram above (not to scale) the umbra of the Earth terminates at point H. H is the point from which the tangents drawn to one circle are also tangent to the other. (How do we find a point which gives tangents to both the circles? Is this point unique?) Now by the simple ratio of similar triangles, we get

$$\frac{DF}{EG} = \frac{HA}{HC} = \frac{HC+AC}{HC}$$

Therefore,

$$HC = \frac{AC}{DF/EG -1}$$

Now, $DF/EG = 109$ and $AC$ = 149 million km; substituting the values we get the length of the umbra, $HC \approx$ 1.38 million km. The Moon, which is at an average distance of 384,400 kilometers, sometimes falls in this umbra, and then we get a total lunar eclipse. The composite image of different phases of a total lunar eclipse below depicts this beautifully. One can “see” the round shape of the Earth’s umbra in the central three images of the Moon (red coloured) when it is completely in the umbra of the Earth (Why is it red?).
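The similar-triangles result can be wrapped in a small function to check the arithmetic. This is a sketch in base R using the diameters and distances quoted in this post; the function name and the handling of a source that is not larger than the object (returning Inf) are assumptions made for illustration.

```r
# Length of the umbra from similar triangles: HC = AC / (DF/EG - 1)
umbra_length <- function(source_diameter, object_diameter, separation) {
  ratio <- source_diameter / object_diameter
  if (ratio <= 1) {
    # Source not larger than the object: the shadow cone never converges
    return(Inf)
  }
  separation / (ratio - 1)
}

# Values quoted in the post, all in kilometres
sun_diameter_km   <- 1391000
earth_diameter_km <- 12742
sun_earth_km      <- 149e6
moon_distance_km  <- 384400

earth_umbra_km <- umbra_length(sun_diameter_km, earth_diameter_km, sun_earth_km)

print(earth_umbra_km)                     # roughly 1.38 million km
print(earth_umbra_km > moon_distance_km)  # the Moon orbits well inside the umbra's reach
```

With these numbers the Earth's umbra extends to more than three times the Moon's average distance, which is why the Moon can be completely immersed in it.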

When only a part of umbra falls on the moon we get a partial lunar eclipse as shown below. Only a part of Earth’s umbra is on the Moon.

If the Moon were a bit further away, let’s say at 500,000 km, it would still pass through the Earth’s umbra, but, as we will see below, we would not get a total solar eclipse. Due to the tilt of the Moon’s orbit, not every full moon brings a lunar eclipse: most of the time the Moon passes outside both the umbra and the penumbra.

The observations of the lunar eclipse can also help us estimate the diameter of the Moon.

A similar principle applies (though the numbers change) for solar eclipses, when the Moon is between the Earth and the Sun. In the case of the Moon, the ratio of the diameters of the Sun and the Moon is about 400, and the distance between them is approximately equal to the distance between the Earth and the Sun. Hence the length of the Moon’s umbra, using the above formula, is 0.37 million km, or about 370,000 km. Since this is close to the Moon’s distance from the Earth, only the narrow tip of the umbra reaches the Earth, which makes the total eclipse visible only in a small region; even the penumbra is not large (How wide are the umbra and the penumbra of the Moon on the surface of the Earth?).

When only penumbra is falling on a given region, we get the partial solar eclipse.

You can explore when solar eclipse will occur in your area (or has occurred) using the Solar Eclipse Explorer.

This is what the umbra of the Moon looks like from space.

And the same thing would happen to a globe held in sunlight: its shadow would be given by the same ratio.

Thus we see that the numbers are almost matched to give us total solar eclipses. Sometimes, when the Moon is a bit further away, we instead get what is called an annular solar eclipse, in which the Sun is not covered completely by the Moon. Total lunar eclipses are relatively common (on average twice a year) compared to total solar eclipses (once every 18 months to 2 years). Another coincidence is that the angular diameters of the Moon and the Sun are almost matched in the sky: both are about half a degree (the diameter-to-distance ratio is about 1/110). Combined with the ratio of distances, we are fortunate to get total solar eclipses.
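The near match in angular size can be checked with a few lines of base R. The Sun's diameter and the two distances are the values quoted in this post; the Moon's diameter of about 3,474 km is not given in the post and is an assumed value, as is the function name.

```r
# Angular diameter in degrees of a sphere seen from a given distance
angular_diameter_deg <- function(diameter, distance) {
  2 * atan(diameter / (2 * distance)) * 180 / pi
}

sun_deg  <- angular_diameter_deg(1391000, 149e6)   # values from the post, in km
moon_deg <- angular_diameter_deg(3474, 384400)     # Moon diameter is an assumed value

print(c(sun = sun_deg, moon = moon_deg))
print(moon_deg < sun_deg)
```

Both come out close to half a degree. At its average distance the Moon appears very slightly smaller than the Sun with these numbers, which fits the occurrence of annular eclipses when the Moon is further away.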

Seeing and experiencing a total solar eclipse is overwhelming even when we understand why and how it happens. It was more so in the past, when the Sun, considered a god, went out in broad daylight. This was considered (and is still considered by many) a bad omen. But how did ancient people understand eclipses? There is a certain periodicity in eclipses, which can be found by collecting a large number of observations and finding patterns in them. This was done by the ancient Babylonians, who had continuous data about eclipses spanning several centuries. Of course, sometimes the eclipse would happen in some other part of the Earth and not be visible in the given region; still, it could be predicted. To be able to predict eclipses was a great power, and people who could do that became the priestly class. But the Babylonians did not have a model to explain such observations. The next stage came in ancient Greece, where models were developed to explain (and predict) the observations. This continues to our present age.

The discussion we have had applies to the case when the light source (here the Sun) is larger than the opaque object (here the Earth). If the light source is smaller than the object, what will happen to the umbra? It turns out that the umbra is infinite in extent. You see this effect when you bring your hand close to a candle flame and the shadow of your hand becomes ridiculously large! See what happens in the interactive simulation above.


A Short History of Initial Development of Quantum Mechanics

A timeline created using h5p. The reference book used is Lev Tarasov – Basic Concepts of Quantum Mechanics.

Examples of physics quiz with H5P content type

This post has two examples from the same problem using H5P multiple choice and fill in the blank content types.

This content is released under Creative Commons BY SA 4.0 License

What is mass renormalization?

A simple explanation of mass renormalization:

The difficulties associated with an infinite rest mass can be successfully overcome in calculation of various effects with the help of mass renormalization which essentially consists in the following. Suppose that it is required to calculate a certain effect and an infinite rest mass appears in the calculations. The quantity obtained as a result of calculations is infinite and is consequently devoid of any physical meaning. In order to obtain a physically reasonable result, another calculation is carried out, in which all factors, except those associated with the phenomenon under consideration, are present. This calculation also includes an infinite rest mass and leads to an infinite result. Subtraction of the second infinite result from the first leads to the cancellation of infinite quantities associated with the rest mass. The remaining quantity is finite and characterizes the phenomenon being considered. Thus, we can get rid of the infinite rest mass and obtain physically reasonable results which are confirmed by experiment. Such a method is used, for example, to calculate the energy of an electric field.

– A. N. Matveev, Electricity and Magnetism, p. 16

Rotating Earth: the proofs or significance of Leon Foucault’s pendulum – Part 1

In an earlier post, we discussed proofs of the round shape of the Earth. These included some ancient and some modern proofs. There was, in general, a consensus that the shape of the Earth is spherical and not flat, and proofs had been given since the time of the ancient Greeks. Only in the middle ages does there seem to have been some doubt regarding the shape of the Earth, but amongst the learned people there was never a doubt about it. Counter-intuitive as it may seem when you look at the near horizon, a spherical Earth is not that counter-intuitive. We can find direct proofs of it by looking around and observing keenly.
But the rotation of the Earth proved to be a more difficult beast to tame and is highly counter-intuitive. Your daily experience does not tell you the Earth is rotating; rather, intuition tells you that it is fixed and stationary. Though the idea of a moving Earth is not new, the general acceptance of the idea took a very long time. And even some 300 years after Copernicus proposed his heliocentric model, a direct proof of Earth’s rotation was lacking. This absence of definitive proof was not due to a lack of trying. Some of the greatest minds in science, mathematics and astronomy worked on this problem after Copernicus but were unable to solve it. They included the likes of Galileo, Newton, Descartes, and a host of incredibly talented mathematicians since the scientific revolution. Then, in the mid-1800s, Leon Foucault provided not one but two direct proofs of the rotation of the Earth. In this series of posts, we will see how this happened.
When we speak of the movement of the Earth, we have to distinguish between the two motions that it has: first, its motion along its orbit around the Sun, and second, its rotational motion about its own axis. So what possible observational proofs or direct evidence will allow us to detect the two motions? In this post, we will explore how our ideas regarding these two motions of the Earth evolved over time and what types of proofs were given for and against them.

Even more, there was a simple geometrical fact directly opposed to the Earth’s annual motion around the Sun and there was nothing that could directly prove its diurnal rotation. (Mikhailov, 1975)

Let us consider the two components of Earth’s motion. The first is the movement around the Sun along the orbit. The simplest proof for this component of Earth’s motion is from the parallax that we can observe for distant stars. Parallax is the relative change in position of objects when they are viewed from different locations. The simplest example of this can be seen with our own eyes.
Straighten your hand, and hold your thumb out. Observe the thumb with both eyes open. You will see your thumb at a specific location with respect to the background objects. Now close your left eye, and look at how the position of the thumb has changed with respect to the background objects. Now open the left eye, and close the right one. What you will see is a shift of the thumb against the background. This shift is related by simple geometry to the distance between our eyes, called the baseline in astronomical parlance. If even a baseline of the order of a few centimetres causes parallax, then, assuming that the Earth is moving around the Sun, its motion should definitely cause an observable parallax in the fixed stars. And this was precisely one of the major roadblocks: no such parallax was observed.

Earth moving around an orbit raised mechanical objections that seemed even more serious in later ages; and it raised a great astronomical difficulty immediately. If the Earth moves in a vast orbit, the pattern of fixed stars should show parallax changes during the year. (Rogers, 1960)
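To get a feel for why the stellar parallax was so hard to see, here is a rough numerical sketch of the "simple geometry" of the thumb experiment. It assumes the object sits straight ahead of the midpoint of the baseline, and all the numbers (eye separation, arm length, distance to the nearest star system) are approximate values assumed for illustration, not taken from the sources quoted here.

```python
import math

def parallax_shift_deg(baseline, distance):
    """Apparent angular shift (in degrees) of an object at `distance`
    when it is viewed from the two ends of `baseline`.

    Both arguments must be in the same units. The object is assumed to
    lie straight ahead of the midpoint of the baseline.
    """
    return math.degrees(2 * math.atan((baseline / 2) / distance))

# Thumb at arm's length: eyes ~6.5 cm apart, thumb ~60 cm away (assumed).
thumb_shift = parallax_shift_deg(6.5, 60.0)

# Nearest star system: the baseline is the diameter of Earth's orbit (2 AU)
# and the distance is roughly 276,000 AU, about 4.4 light years (assumed).
star_shift = parallax_shift_deg(2.0, 276_000.0)

print(f"thumb: {thumb_shift:.1f} degrees")
print(f"star:  {star_shift * 3600:.2f} arcseconds")
```

The thumb shifts by several degrees, while even the nearest stars shift by only about an arcsecond over half a year, far too small to detect with the naked eye.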

The history of cosmic theories … may without exaggeration be called a history of collective obsessions and controlled schizophrenias.
– Arthur Koestler, The Sleepwalkers

Though it is widely believed that Copernicus was the first to suggest a moving Earth, it is not the case. One of the earliest proponents of the rotating Earth was a Greek philosopher named Aristarchus. One of the books by Heath on Aristarchus is indeed titled Copernicus of Antiquity (Aristarchus of Samos). A longer version of the book is Aristarchus of Samos: The Ancient Copernicus. In his model of the cosmos, Aristarchus imagined the Sun at the centre and the Earth and other planets revolving around it. At the time it was proposed, it was not received well. There were philosophical and scientific reasons for rejecting the model.

First, let us look at the philosophical reasons. In ancient Greek cosmology, there was a clear and insurmountable distinction between the celestial and the terrestrial. The celestial order and bodies were believed to be perfect, as opposed to the imperfect terrestrial. After watching and recording the uninterrupted waltz of the sky over many millennia, it was believed that the heavens were unchangeable and perfect. The observations revealed that there are two types of “stars”. First the so-called “fixed stars” do not change their positions relative to each other. That is to say, their angular separation remains the same. They move together as a group across the sky. Imagination coupled with a group of stars led to the conceiving of constellations. Different civilizations imagined different heroes, animals, objects in the sky. They formed stories about the constellations. These became entwined with cultures and their myths.

The second type of stars did change their positions with respect to the “fixed stars”. That is to say, their angular distances from the “fixed stars” changed. These stars, the planets, came to be called “wandering stars” as opposed to the “fixed stars”.

Ancient Greeks called these lights πλάνητες ἀστέρες (planētes asteres, “wandering stars”) or simply πλανῆται (planētai, “wanderers”), from which today’s word “planet” was derived.
Planet

So how does one make sense of these observations? For the fixed stars, the solution is simple and elegant. One observes the set of stars rising in the east and setting in the west. This set of stars changes across the year (which is also evidenced by the changing seasons around us), and this change was found to be cyclical. Year after year, with observations spanning centuries, it was found that the stars seem to be embedded on the inside of a sphere, and that this sphere rotates at a constant speed. This “model” explains the observed phenomena of the fixed stars very well.
The unchanging nature of this cyclical process, as opposed to the chaotic nature of things on Earth, perhaps led to the idea that celestial phenomena are perfect. The religious notion of associating the heavens with gods perhaps added to their being considered perfect. So, in the case of perfect unchanging heavens, the speeds of celestial bodies, as evidenced by observing the celestial sphere of “fixed stars”, also had to be constant. And since celestial objects were considered perfect, the two geometrical objects that were regarded as perfect, the sphere and the circle, were included in the scheme of the heavens. To explain the observed motion of the stars through the sky, their rising in the east and setting in the west, it was hypothesized that the stars are embedded on the inside of a sphere, and that this sphere rotates at a constant speed. We, being fixed on the Earth, observe this rotating sphere as the rising and setting of stars. This model of the world works perfectly and formed the template for explaining the “wandering stars” also.
These two ideas, namely celestial objects placed on a circle or sphere and rotating with constant speed, formed the philosophical basis of Greek cosmology, which would dominate the Western world for nearly two thousand years. And why would one consider the Earth to be stationary? This is perhaps because the idea of a moving Earth is highly counter-intuitive. All our experience tells us that the Earth is stationary. The metaphors that we use, like rock-solid, refer to the idea of an immovable and rigid Earth. There seems to be no need even to speculate about a movement of the Earth that is so obviously not there. But as the history of science shows us, most scientific ideas, with a few exceptions, are highly counter-intuitive. And that the Earth moves and rotates is one of the most counter-intuitive things that we encounter in nature.
The celestial observations were correlated with happenings on the Earth. One could, for example, predict seasons as per the rising of certain stars, as was done by ancient Egyptians. Tables containing continuous observations of stars and planets covering several centuries were created and maintained by the Babylonian astronomers. It was this wealth of astronomical data, continuously covering several centuries, that became available to the ancient Greek astronomers as a result of Alexander’s conquest of Persia. Having such a wealth of data led to the formation of better theories, but with the two constraints of circles/spheres and constant speeds mentioned above.
With this background, next, we will consider the progress in these ideas.
A stabilised image of the Milky Way as seen from a moving Earth.

Science, a humanistic approach

Science is an adventure of the whole human race to learn to live in and perhaps to love the universe in which they are. To be a part of it is to understand, to understand oneself, to begin to feel that there is a capacity within man far beyond what he felt he had, of an infinite extension of human possibilities . . .
I propose that science be taught at whatever level, from the lowest to the highest, in the humanistic way. It should be taught with a certain historical understanding, with a certain philosophical understanding, with a social understanding and a human understanding in the sense of the biography, the nature of the people who made this construction, the triumphs, the trials, the tribulations.
I. I. RABI
Nobel Laureate in Physics

via Project Physics Course, Unit 4 Light and Electromagnetism Preface
Do see the Project Physics Course, which is now in the public domain and hosted at the Internet Archive, thanks to F. James Rutherford.

Flying Circus of Physics…

The Flying Circus of Physics began one dark and dreary night in 1968 while I was a graduate student at the University of Maryland. Well, actually, to most graduate students nearly all nights are dark and dreary, but I mean that that particular night was really dark and dreary. I was a full-time teaching assistant, and earlier in the day I had given a quiz to Sharon, one of my students. She did badly and at the end turned to me with the challenge, “What has anything of this to do with my life?”
I jumped to respond, “Sharon, this is physics! This has everything to do with your life!”
As she turned more to face me, with eyes and voice both tightened, she said in measured pace, “Give me some examples.”
I thought and thought but could not come up with a single one. I had spent at least six years studying physics and I could not come up with even a single example. That night I realized that the trouble with Sharon was actually the trouble with me: This thing called physics was something people did in a physics building, not something that was connected with the real world of Sharon or me. So, I decided to collect some real-world examples and, to catch her attention, I called the collection The Flying Circus of Physics.

Archimedes and the Law of Lever

Archimedes & The Law Of The Lever

The lever is one of the simplest machines that humans have invented. In fact the lever is one of the six simple machines, which are the building blocks of any complicated mechanical equipment that we produce. The six simple machines are:
▪ Lever
▪ Wheel and axle
▪ Pulley
▪ Inclined plane
▪ Wedge
▪ Screw

Simple machines are devices which use mechanical advantage to multiply force. A simple machine can be used to increase the output force, but this comes at the cost of a proportional decrease in the distance moved by the load. With the mechanical advantage that you get from a lever, you can lift large loads by applying much less force. We use levers in a variety of ways in our daily life. To name just a few: the weighing balance, the see-saw, and even your own arm when you lift a weight with your hand! How many can you identify?

But being a simple machine does not mean that its secret can be derived very easily. Given just a lever, some weights and no other knowledge, it is hard for us to derive the mathematical expression for the law of the lever. Newtonian mechanics gives us a proof, but here we will see how the law was first proved. The proof is by Archimedes, who was the first person to reason and build a theory about the lever, amongst the many other things he did besides crying Eureka!! We will sketch the outline of his proof of the law of the lever.

Suppose two weights w and W are placed on a horizontal weightless stick which rests on a support called the fulcrum. One of the following three scenarios will occur: the stick will tilt towards the right, it will tilt towards the left, or it will remain horizontal. When the stick remains horizontal it is said to be in the equilibrium position. Archimedes considered the question:

“If W is at a distance D from the fulcrum and w is at a distance d from the fulcrum, what condition on W, D, w, d corresponds to equilibrium?”

The answer known to almost all students of physics is that the products wd and WD should be equal. Archimedes did not give a proof of the law in this form, for it is said he would have been offended by multiplying two entirely different quantities such as weight and length. He expressed the balance as an equality of two ratios, W : w = d : D. The statement is that the weights balance at distances inversely proportional to their magnitudes.

Archimedes made some assumptions, which were supposed to be self evident. The assumptions are:

1. Equal weights at equal distances from the fulcrum balance. Equal weights at unequal distances do not balance, but the weight at the greater distance tilts the lever towards itself.
2. When two weights are balancing and we add some weight to one of the weights, the weights no longer balance. The weight to which we add goes down.
3. When two weights balance and we take some weight away from one, the weights no longer balance. The side holding the weight we did not change goes down.

Archimedes uses these assumptions to prove propositions which lead us to the law of the lever. The point in the proofs is that the assumptions should not contradict each other.

Proposition 1: Weights that balance at equal distances from the fulcrum are equal.

Proof: If they are not equal, remove from the greater weight the difference of the two weights. We now have two equal weights at equal distances from the fulcrum. But according to assumption 3 they do not balance. This contradicts assumption 1. Hence the proof.

Proposition 2: Unequal weights at equal distances do not balance, but the side holding the higher weight goes down.

Proof: Take away the difference between the weights. By assumption 1, the remaining two weights balance. If we put back the weight that we have taken away, then by assumption 2 the weights no longer balance, and the side to which we added the weight goes down.

Proposition 3: Unequal weights balance at unequal distances from the fulcrum, the heavier weight being at the shorter distance.

Proof: Suppose that the heavier weight is W which is placed at A, and lighter weight w placed at B and the fulcrum is at C, consider the case that they balance each other. Remove W – w from the heavier weight W, thus we have two equal weights. Now by assumption 3, the remaining weights do not balance, but w goes down.

But this is not possible for the following reasons:
Either AC = CB, AC greater than CB, or AC less than CB. Archimedes rules out the first two possibilities. If AC = CB, by assumption 1 the remaining weights balance. If AC is greater than CB then again by assumption 1 weight at A will go down. Thus AC must be less than CB.

Proposition 4: If two equal weights have different centers of gravity, then the center of gravity of the two together is the midpoint of the line segment joining their centers of gravity.

[Archimedes does not define the term “center of gravity”, but approaches the term axiomatically. For a mathematical definition, knowledge of integral calculus is required. But we can still understand the term in physical terms: the entire weight acts as if it were concentrated at one point, which we know as the center of gravity.]

Proof: Let the equal weights be w and W, with centers of gravity located at A and B. Let M be the midpoint of the segment AB. Assume that the weights balance at C, a point different than M.

The distance AC is not equal to the distance CB. Hence by Assumption 1 the weights do not balance each other, no matter how C is chosen, as long as it is different from M. This implies that M must be the balancing point.

Here a tacit assumption is made that any two weights have a center of gravity, that is, a balancing point. It is a restatement of Assumption 1 in terms of center of gravity.

Corollary: If an even number of equal weights have their centers of gravity situated along a straight line such that the distances between consecutive weights are all equal, then the center of gravity of the entire system is located at the midpoint of the line segment joining the centers of gravity of the two weights in the middle.

The figure below illustrates the corollary. The center of gravity of the six weights is at C.

Proposition 5: Commensurable weights balance at distances from the fulcrum that are inversely proportional to the magnitudes of the weights. More precisely, if commensurable weights W and w are at distances D and d from the fulcrum, then:

D/d = (1/W)/(1/w) = w/W

[By commensurable it is meant that the ratio of w/W is rational, i.e. there is a third number m such that W = pm, and w = qm and w/W = q/p, and p and q are whole numbers.]

Proof: For convenience let us take a specific case in which the ratio of the weights is known, let’s say 2:5, so that w/W = 2/5. Let w and W be located at A and B, and let M be their balancing point. What is needed for the proof is that AM : BM = 5:2.

AM/BM = [1/2] / [1/5] = 5/2

We cut the segment AB into 5 + 2 = 7 equal parts. Divide the weight W into 2*5 = 10 equal parts, and place 5 of them at the midpoints of the five sections just to the right of B, one in each section, and five of them in congruent sections to the left of B. Similarly divide w into 2*2 = 4 equal parts and place 2 of them at the midpoints of two congruent sections to the left of A and 2 of them to the right of A.

Thus we have a collection of 10 weights, each W/10, whose centre of gravity is the same as that of W, i.e. B. Similarly the centre of gravity of the 4 weights, each w/4, is the same as that of w, i.e. A. Since w/W = 2/5, each little weight w/4 is equal to W/10, so we now have a system of 14 equal weights, which are equally spaced. By the corollary to Proposition 4, this collection of 14 weights balances around the midpoint M of the segment holding the 14 weights. This implies that the ratio of the lever arms AM : BM is 5:2, since AM spans 5 of the equal sections and BM spans 2 of them. This completes the proof.
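The construction in this proof can be checked mechanically. The sketch below is a modern illustration, not Archimedes’ own procedure: the function name and the coordinate system (A at 0, B at p + q, sections of unit length) are assumptions made for the example. It lays out the little weights for whole-number weights W : w = p : q, confirms that they form one equally spaced row, and takes the midpoint of that row as the balancing point M, as the corollary to Proposition 4 prescribes.

```python
from fractions import Fraction

def archimedes_construction(p, q):
    """Return the lever arms (AM, BM) for weights W : w = p : q.

    A carries w and sits at 0; B carries W and sits at p + q.
    W is cut into 2p equal pieces centred on B and w into 2q equal
    pieces centred on A, one piece at the midpoint of each unit section.
    """
    a, b = Fraction(0), Fraction(p + q)
    half = Fraction(1, 2)
    pieces_of_w = [a - q + half + k for k in range(2 * q)]
    pieces_of_W = [b - p + half + k for k in range(2 * p)]
    row = sorted(pieces_of_w + pieces_of_W)

    # The pieces must form a single equally spaced row of equal weights.
    gaps = {right - left for left, right in zip(row, row[1:])}
    assert gaps == {Fraction(1)}, "pieces should be equally spaced"

    # Corollary to Proposition 4: the row balances at its midpoint.
    m = (row[0] + row[-1]) / 2
    return m - a, b - m

am, bm = archimedes_construction(5, 2)
print(f"AM : BM = {am} : {bm}")
```

For the 2:5 case used in the proof, the lever arms come out as AM = 5 and BM = 2 sections, so the heavier weight W sits at the shorter distance, in the inverse ratio of the weights.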

Proposition 6: Incommensurable weights balance at distances inversely proportional to their magnitudes.

Proof: Let the weights be W and w at respective distances D and d from the fulcrum. Assume that WD = wd and that the two weights do not balance each other. Assume that weight W goes down.

Remove a small amount of weight from W to obtain W’ such that W’ still goes down but W’ and w are commensurable. Since W’D < wd, Proposition 5 (together with assumption 3) tells us that W’ rises. This contradiction – that W’ both rises and goes down – proves the proposition.

Is that what prompted him to say

“Give me a place to stand and with a lever I will move the whole world.”

Given the things that he asks for i.e. a place to stand and a lever, will a single person be able to lift the earth?

Let us do some order-of-magnitude physics for this. We assume that we have a strong rigid rod, sufficiently long, which will act as the lever. Let’s say that we have a strong person who can apply a force of $10^3 \, N$. We take the weight of the earth to be of the order of $10^{25} \, N$. We then calculate how far Archimedes will have to move in order to displace the earth by 1 cm: $\frac{10^{25} \, N}{10^3 \, N} \times 1 \, cm \, = \, 10^{22} \, cm$. Now $10^{22} \, cm \, = \, 10^{20} \, m$. This is too large a distance. Even if Archimedes moved at a fantastic rate of 100 m/s, it would still take $10^{18} \, s$ to complete the arc of $10^{20} \, m$. This is equal to about $3 \times 10^{10}$ years, or 30 billion years. So to do this fantastic job, Archimedes would have to be incredibly powerful, supplying $10^3 \, N$ of force for that long, and would also have to reach the fantastic age of 30 billion years!!

But anyway, Archimedes was a great man, no doubt about that. He is referred to as the greatest mathematician, physicist and engineer of antiquity. We will get to other wonderful discoveries of Archimedes soon…
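The same estimate can be re-run in a few lines, which makes it easy to try other assumptions. This is only a sketch: the constant names are made up for the example, the input figures are the round numbers used in the estimate, and the length of a year in seconds is an approximate value.

```python
# Order-of-magnitude inputs (round figures, as in the estimate above).
FORCE_APPLIED = 1e3        # N, push from a strong person
EARTH_WEIGHT = 1e25        # N, order-of-magnitude weight of the earth
LOAD_DISPLACEMENT = 1e-2   # m, we want to lift the earth by 1 cm
WALKING_SPEED = 100.0      # m/s, the "fantastic rate" of the lever's end
SECONDS_PER_YEAR = 3.15e7  # approximate number of seconds in a year

# An ideal lever trades force for distance: the effort end must move
# (load / effort) times as far as the load moves.
mechanical_advantage = EARTH_WEIGHT / FORCE_APPLIED
effort_distance = mechanical_advantage * LOAD_DISPLACEMENT  # metres

travel_time_seconds = effort_distance / WALKING_SPEED
travel_time_years = travel_time_seconds / SECONDS_PER_YEAR

print(f"distance to move: {effort_distance:.0e} m")
print(f"time needed:      {travel_time_seconds:.0e} s")
print(f"                  about {travel_time_years:.1e} years")
```

Changing FORCE_APPLIED or WALKING_SPEED by a factor of ten shifts the answer by the same factor, so no reasonable choice of inputs brings the time down to a human lifetime.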

Mach gives a somewhat different account of the derivation of this law.

References:

Archimedes: What Did He Do Besides Crying Eureka?
Sherman Stein

The Science of Mechanics
Ernst Mach

PS: Typing math without LaTeX is a pain….
PPS: Finally $\LaTeX$ on the blog $\ldots$