I traced my mathematical lineage back into the XIV century at The Mathematics Genealogy Project. Imagine my surprise when I discovered that a big branch in the tree of my scientific ancestors is composed not only by mathematicians but also by big names in the fields of Physics, Chemistry, Physiology and even Anatomy.
There is some blue blood in my family: Garrett Birkhoff, William Burnside (both algebraists). Archibald Hill, who shared the 1922 Nobel Prize in Medicine for his elucidation of the production of mechanical work in muscles. He is regarded, along with Hermann Helmholtz, as one of the founders of Biophysics.
Thomas Huxley (a.k.a. “Darwin’s Bulldog”, biologist and paleontologist) participated in a famous debate in 1860 with Samuel Wilberforce, Lord Bishop of Oxford. This was a key moment in the wider acceptance of Charles Darwin’s Theory of Evolution.
My namesake Franciscus Sylvius, another professor in Medicine, discovered the cleft in the brain now known as Sylvius’ fissure (circa 1637). One of his advisors, Jan Baptist van Helmont, is the founder of Pneumatic Chemistry and disciple of Paracelsus, the father of Toxicology (for some reason, the Mathematics Genealogy Project does not list any of these two in my lineage—I wonder why).
There are other big names among the branches of my scientific genealogy tree, but I will postpone this discovery towards the end of the post, for a nice punchline.
Posters with your genealogy are available for purchase from the pages of the Mathematics Genealogy Project, but they are not very flexible neither in terms of layout nor design in general. A great option is, of course, doing it yourself. With the aid of
GraphViz and the
networkx, this becomes a straightforward task. Let me show you a naïve way to accomplish it:
Let us start by searching for a name in the database online. Once in screen, note the string of numbers at the end of the URL obtained:
Every individual in the database has a unique ID in this fashion. Note also, in the source of the page, the field
Advisor: If the advisor of an individual is not
Unknown, the page will link to the corresponding page. We can easily retrieve the ID of the advisor from the source code. Even better, we can code a small script in
python to recursively go upwards in the database gathering the ID’s of your ancestors in a dictionary. One fast way to accomplish this could be as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 import urllib def augment_genealogy(subject_id,subject_tree): # This function assumes that subject_id in not in subject_tree f = urllib.urlopen("http://genealogy.math.ndsu.nodak.edu/id.php?id="+subject_id) subject = f.read() f.close() if not subject.count("Advisor: Unknown"): # How many advisors did subject have? # For each advisor, retrieve their information, and attach # them to subject as parents advisor_list =  for advisor in range(subject.count("Advisor")): subject = subject.partition("Advisor").partition("id=") advisor_id = subject[0:subject.index("\"")] advisor_list.append(advisor_id) if advisor_id not in subject_tree: augment_genealogy(advisor_id, subject_tree) subject_tree[subject_id] = advisor_list return subject_tree else: subject_tree[subject_id]= return subject_tree
But what good is a genealogy tree if we cannot see the names of the ancestors? The simple script below takes care of retrieving the names online for each of the IDs in our previously created dictionary. Of course, one could instead include the appropriate string manipulation in the retrieval script above, and kill two birds with one stone.
1 2 3 4 5 6 7 8 # Create a dictionary that maps to each id, its name names = dict() for id in data: f = urllib.urlopen("http://genealogy.math.ndsu.nodak.edu/id.php?id="+id) s = f.read() f.close() name = s.partition("The Mathematics Genealogy Project - ") names[id] = unicode(name[0:name.index(">")].strip(), "ascii", "ignore").encode()
These are the names in my tree, for example: Do you recognize any of them?
Let us move to
sage and plot a graph of what we have obtained. Let us try with my ID (
"113998"). We have different graphing options with
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 import networkx import matplotlib.pyplot data = augment_genealogy("113998",dict()) Tree = networkx.MultiDiGraph(data) # First graph matplotlib.pyplot.figure(figsize=(50,50)) pos = networkx.spring_layout(Tree, iterations=900) networkx.draw(Tree, pos, node_size=0, alpha=0.4, font_size=24) matplotlib.pyplot.savefig('/Users/blanco/Desktop/tree.png') # Second graph matplotlib.pyplot.figure(figsize=(50,50)) pos = networkx.shell_layout(Tree) networkx.draw(Tree, pos, node_size=0, alpha=0.4, font_size=24) matplotlib.pyplot.savefig('/Users/blanco/Desktop/tree.png')
Because of the cooperative nature among scientists, it is not unusual to encounter several ancestors over different generations linked to the same advisor. This usually destroys the structure of binary tree that one would expect for this type of graph, making the plotting of the data quite a challenge.
An easy workaround is to install
GraphViz, and use interaction to this software from the
networkx libraries. The following script attaches to each label the corresponding name, and produces the desired genealogy tree:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 T = DiGraph(data) T.graphviz_to_file_named('/Users/blanco/Desktop/tree.dot') # Here is where we change the labels from numbers to names # We need to do it this way to avoid the issue "different people with same name" fr = open("/Users/blanco/Desktop/tree.dot", "r") workingString = fr.read() fr.close() newString = str() while workingString.count("[label=\""): breakString = workingString.partition("[label=\"") newString += breakString + breakString stepString = breakString newString += names[stepString[0:stepString.index("\"")]]+"\"" workingString = stepString.partition("\"") newString += workingString fw = open("/Users/blanco/Desktop/newtree.dot", "w") fw.write(newString) fw.close() os.system('open -a /Applications/GraphViz.app /Users/blanco/Desktop/newtree.dot')