Sunday, June 12, 2016

[FIELDS_MEDALISTS]

On the Wikipedia page for the "Fields Medal", there is a table of Fields Medalists who were awarded the very high honor in the field of Mathematics. For fun, I took the data from this table and started playing around with the content in the "citation" column.

That is to say, the table on the Fields Medal Wikipedia page has "fields" of "metadata", if you will: Year, ICM Location, Medalists, Affiliation (when awarded), Affiliation (current/last), and Citation. An example of one of the "records" in the table, written out in a comma-separated-values style, is:

1936, "Oslo, Norway", "Lars Ahlfors", "University of Helsinki, Finland", "Harvard University, US", "Awarded medal for research on covering surfaces related to Riemann surfaces of inverse functions of entire and meromorphic functions. Opened up new fields of analysis."

What I wanted to do, in Python, was make a list of all of the data from the table, with each record stored as a dict (Python's "associative array"). I would then play with random selections of the "citation" data in the table.
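A minimal sketch of how that structure might be built, assuming the table has been saved as text in the same comma-separated style as the record above. The key names in FIELD_NAMES and the helper parse_records are hypothetical, chosen only for illustration, and RAW holds just the one record quoted in this post.

```python
import csv
import io

# Hypothetical key names, one per column described in the post.
FIELD_NAMES = [
    "year",
    "location",
    "medalist",
    "affiliation_when_awarded",
    "affiliation_current_or_last",
    "citation",
]

# Only the single record quoted in the post; a real run would hold every row.
RAW = (
    '1936, "Oslo, Norway", "Lars Ahlfors", "University of Helsinki, Finland", '
    '"Harvard University, US", "Awarded medal for research on covering surfaces '
    'related to Riemann surfaces of inverse functions of entire and meromorphic '
    'functions. Opened up new fields of analysis."'
)


def parse_records(text):
    """Turn CSV-style text into a list of dicts, one dict per medalist."""
    # skipinitialspace=True lets a quoted field start after ", " rather than ",".
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(FIELD_NAMES, row)) for row in reader if row]


FIELDS_MEDALISTS = parse_records(RAW)
print(FIELDS_MEDALISTS[0]["medalist"])
print(FIELDS_MEDALISTS[0]["citation"])
```

A list of dicts keeps each field addressable by name, so pulling out just the citations later is a one-line list comprehension.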

I did this and I had a lot of fun. Without going into too many details, let's say that I only want the citations, so I create a list with just those. I would write a statement starting with:

FIELDS_MEDALISTS = [...]

...with the citations nested inside the FIELDS_MEDALISTS list (or whatever I want to call it: "CITATIONS", etc.).

Once I had access to the text of the citations, I thought that I could choose citations at random. Ideally, I'd like to be able to do something slightly more sophisticated with that text, like make a word cloud of the most common words, minus stop-words.
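Both ideas can be sketched with the standard library alone. This is only an illustration under a few assumptions: CITATIONS holds just the one citation quoted in this post, STOP_WORDS is a tiny hand-picked set rather than a real stop-word list, and word_counts is a hypothetical helper. The resulting counts are the kind of input a word-cloud tool would need.

```python
import random
import re
from collections import Counter

# Only the citation quoted in the post; the full list would have one per medalist.
CITATIONS = [
    "Awarded medal for research on covering surfaces related to Riemann "
    "surfaces of inverse functions of entire and meromorphic functions. "
    "Opened up new fields of analysis.",
]

# A small hand-picked stop-word set (an assumption, not a standard list).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "up"}


def word_counts(citations):
    """Count lowercase words across all citations, skipping stop-words."""
    words = []
    for citation in citations:
        words.extend(re.findall(r"[a-z]+", citation.lower()))
    return Counter(word for word in words if word not in STOP_WORDS)


# Pick one citation at random.
print(random.choice(CITATIONS))

# Show the most frequent non-stop-words.
counts = word_counts(CITATIONS)
print(counts.most_common(5))
```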

I think that you can see where I am trying to go with this: CREATE A RANDOM FIELDS MEDALIST CITATION (probably using Markov chains, not sure yet). The idea, then, would be to write a kind of "FIELDS MEDALIST PREDICTOR" that ideally would be able to predict future winners of the award. Honestly, though, I doubt I can do a very good job, since I'm not yet all that masterful when it comes to machine learning and so forth. And the dataset itself, the "table" as I called it, is pretty small. There haven't been all that many Fields Medalists in human history.
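For the random-citation part, one possible approach is a first-order, word-level Markov chain: record which words follow each word in the existing citations, then walk those transitions at random. The sketch below is an assumption about how that could look, not a finished design. build_chain and generate are hypothetical names, and CITATIONS again contains only the citation quoted in this post, so the output will mostly echo it until more citations are added.

```python
import random
from collections import defaultdict

# Only the citation quoted in the post; more citations give more varied output.
CITATIONS = [
    "Awarded medal for research on covering surfaces related to Riemann "
    "surfaces of inverse functions of entire and meromorphic functions. "
    "Opened up new fields of analysis.",
]


def build_chain(texts):
    """Map each word to the list of words seen immediately after it."""
    chain = defaultdict(list)
    for text in texts:
        words = text.split()
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, start, max_words=20):
    """Walk the chain from `start`, choosing each next word at random."""
    word = start
    output = [word]
    for _ in range(max_words - 1):
        followers = chain.get(word)
        if not followers:
            break  # dead end: this word was never followed by anything
        word = random.choice(followers)
        output.append(word)
    return " ".join(output)


chain = build_chain(CITATIONS)
print(generate(chain, "Awarded"))
```

Duplicates are kept in each follower list on purpose, so a word that follows another more often is proportionally more likely to be chosen.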

[more to come]...
