Professor Tony Smith has taught in the MUSA program since its inception. He is recognized as an international expert in the fields of spatial statistics and spatial analysis. Professor Smith's 'Notebook on Spatial Data Analysis' from his ESE 502 class was one of the first comprehensive, free, and open statistics textbooks on the web. If you are interested in learning hypothesis testing across spatial phenomena, Tony's book is a must-read.
We asked Professor Smith to tell us about his work and reflect on his decades working with data at Penn.
Ken: I once heard that you began your career as an architect and then found spatial statistics. What got you interested in statistics and what made you change fields?
Tony: While studying architecture at Berkeley I became interested in spatial behavior, and wanted to learn other ways of analyzing such behavior. So I began to search the literature for fields where spatial behavior was being studied. This led me to Regional Science at Penn, where Walter Isard and his colleagues were actively involved in developing mathematical models of spatial behavior.
Ken: At one point, you were part of a department at Penn called Regional Science. How is/was Regional Science different from our modern City Planning program?
Tony: During my graduate years, Regional Science was closely associated with the City Planning program at Penn, but was more focused on quantitative methods of analyzing spatial problems. From that perspective, the current City Planning program is now much closer to the field of Regional Science in many ways. This is particularly true of the MUSA program, which is very much in the spirit of the Regional Science program that I knew at Penn.
Ken: How long have you been working with data? In that time, what is the single greatest innovation that changed how you analyze data? How did your day-to-day work change after that innovation?
Tony: I should start by saying that data analysis itself has undergone dramatic changes since my graduate days at Penn in the 1960s. Even doing a simple regression on mainframe computers at that time was a major task, usually taking several days depending on “turn-around” times. So my analytical efforts focused almost entirely on theoretical studies requiring only pencil and paper. It was not until the 1980s that all of this changed with the advent of the desktop computer, which, needless to say, has been the “single greatest innovation” in my professional lifetime. For me, this opened up the possibility of doing spatial data analysis in a truly interactive way, and has profoundly shaped my career ever since.
Ken: If you could analyze any dataset (or linked datasets) in existence, what would it be and what questions would you ask of it?
Tony: In my early architecture days, I wanted to study how people actually use space, and in particular, how their behavior is influenced by the space around them. Only now with the advent of GPS and individual receptors (like smartphones) is it becoming possible to study spatial behavior of populations at the individual scale. As one example, traffic jams typically result from spatial interactions of drivers that create “wave effects” in flow conditions. While computer simulations have made some progress in replicating these effects, it is now becoming possible to observe how the interaction behavior of individual drivers leads to such “emergent” phenomena, and hopefully to learn what interventions might best avoid them. More generally, we are now seeing a revolution in spatial data analysis at the individual level.
Ken: What is one big research question that you have wrestled with in your career that you feel remains unanswered? Why do you think the question is so hard to answer?
Tony: It is difficult to point to a single dominant research question, but again going back to my architecture days, I have maintained a strong interest in visual representations of data. As my colleague Dana Tomlin likes to say: “Your eyes are your best analyzers”. This is particularly challenging in multidimensional data where we are invariably restricted to at most three-dimensional “slices” through the data. I am always interested in new approaches to this problem, and am currently looking at the dynamical “grand tour” method in the GGobi software. Much in the same way that one studies three-dimensional objects by turning them around to get a continuous range of two-dimensional views, the grand tour method dynamically projects n-dimensional data patterns onto a continuously varying sequence of two-dimensional planes. I believe that this general method can be refined to address a broad range of spatial questions, and I am currently investigating such possibilities.
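For readers curious about what a grand tour looks like in practice, here is a minimal sketch in Python of the idea described above: pick a random two-dimensional plane in n-dimensional space, drift gradually toward another random plane, and project the data onto each intermediate plane. It is not GGobi's implementation. The function names (`orthonormal_pair`, `random_plane`, `tour_planes`, `project`) are hypothetical, and the blending step is a simplification, using straight-line interpolation followed by re-orthonormalization where a full grand tour would typically follow a smoother path between planes.

```python
import math
import random


def orthonormal_pair(u, v):
    """Gram-Schmidt: turn two vectors into an orthonormal basis of their plane."""
    norm_u = math.sqrt(sum(x * x for x in u))
    e1 = [x / norm_u for x in u]
    dot = sum(a * b for a, b in zip(e1, v))
    w = [b - dot * a for a, b in zip(e1, v)]  # remove the part of v along e1
    norm_w = math.sqrt(sum(x * x for x in w))
    e2 = [x / norm_w for x in w]
    return e1, e2


def random_plane(n_dims, rng):
    """Draw a random 2-D plane in n_dims-dimensional space."""
    u = [rng.gauss(0, 1) for _ in range(n_dims)]
    v = [rng.gauss(0, 1) for _ in range(n_dims)]
    return orthonormal_pair(u, v)


def tour_planes(n_dims, n_targets=3, steps=10, seed=0):
    """Return a sequence of planes that drifts between random target planes.

    Simplification (an assumption of this sketch): basis vectors are blended
    linearly and then re-orthonormalized.
    """
    rng = random.Random(seed)
    current = random_plane(n_dims, rng)
    planes = []
    for _ in range(n_targets):
        target = random_plane(n_dims, rng)
        for k in range(steps):
            t = k / steps  # t runs from 0 up to, but not including, 1
            u = [(1 - t) * a + t * b for a, b in zip(current[0], target[0])]
            v = [(1 - t) * a + t * b for a, b in zip(current[1], target[1])]
            planes.append(orthonormal_pair(u, v))
        current = target
    return planes


def project(points, plane):
    """Project n-dimensional points onto a plane, giving 2-D coordinates."""
    e1, e2 = plane
    return [
        (sum(p * a for p, a in zip(pt, e1)), sum(p * a for p, a in zip(pt, e2)))
        for pt in points
    ]
```

Calling `tour_planes(5)` and passing each resulting plane to `project` along with a list of five-dimensional points yields a sequence of two-dimensional views that could be drawn one after another as an animation.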