Applying Visualization to Classroom Seating Habits

By Ali Almossawi, SDM ’11
June 18, 2012

The System Design and Management (SDM) Program has been everything I expected it to be and more; my cohort is wonderfully rich and the courses are both practical and relevant. I have even learned some rather unexpected, yet applicable, lessons during my time here.

In my 15.514 Financial Accounting course, I started noticing that the space between my seat and the professor seemed to always be occupied by the same set of heads. That meant that not only were those students always sitting in the same seats, but also that I was doing so as well.

Because I had just taken a course in data visualization, had worked on a data analysis and visualization project with former MIT professor Alan MacCormack, and wanted to try out a new visualization toolkit, it seemed an interesting exercise to investigate whether or not students did indeed like to sit in the same seats. Looking at it through the SDM lens, the society of the classroom seemed to be a system complex enough to warrant analysis. Therefore, fueled solely by my curiosity and funded by about a week’s worth of effort, I decided to use the visualization toolkit to examine the seating habits of my 15.514 colleagues throughout the semester.

Data extraction
A matrix created in a spreadsheet representing the classroom 1-390 served as the initial canvas on which students’ coordinates were marked. Each dot in the matrix was labeled with a student’s name; its position indicated a row and column, and hence the particular seat that he or she was in. These were noted down through observation during class via a sophisticated set of methods, namely, the legendary cough-and-turn-to-the-side method and the more subtle turn-to-the-back-and-pretend-to-look-at-the-clock method.

Given that the semester was already underway, data for past lectures were acquired by the laborious, albeit trivial, process of watching taped lectures on Stellar.
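As a rough illustration of how a spreadsheet matrix like the one described above could be turned into coordinates, here is a minimal sketch. It assumes each lecture's matrix is available as a two-dimensional JavaScript array in which a cell holds either a student's name or null for an empty seat. The function name matrixToRecords, the sample names, and the layout are all hypothetical and are not taken from the original project.

```javascript
// Hypothetical seating matrix for one lecture: each cell holds a student's
// name, or null for an empty seat. The names and layout are invented.
const lectureOneMatrix = [
  ["Ana", null, "Ben"],
  [null, "Cy", null],
];

// Turn one lecture's matrix into a flat list of records, where x is the
// row index and y is the column index of the seat.
function matrixToRecords(matrix, lecture) {
  const records = [];
  matrix.forEach((row, x) => {
    row.forEach((name, y) => {
      if (name !== null) {
        records.push({ name, lecture, x, y });
      }
    });
  });
  return records;
}

const lectureOneRecords = matrixToRecords(lectureOneMatrix, 1);
```

Each record keeps the lecture number so that the same list can later be used either for counting seats or for building a timeline.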

Data transformation and loading
The data were then transformed into two representations and persisted in two data files. The first data file was used to show the frequency with which students sat in particular seats, as shown in Figure 1.

Figure 1.
Herein, x and y are the rows and columns, respectively, and the "count" field indicates the number of times the student sat in that particular seat throughout the semester.
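The collapsing step that produces the "count" field might look something like the sketch below. It assumes the raw observations are a list of objects with name, x, and y fields, one per student per lecture. The function name collapseSeats and the sample data are hypothetical; the original processing script is not shown in this article.

```javascript
// Collapse per-lecture observations into one element per (student, seat)
// pair, counting how often the student sat in that seat.
function collapseSeats(observations) {
  const byKey = new Map();
  for (const { name, x, y } of observations) {
    const key = `${name}|${x}|${y}`;
    const entry = byKey.get(key);
    if (entry) {
      entry.count += 1;
    } else {
      byKey.set(key, { name, x, y, count: 1 });
    }
  }
  // A Map preserves insertion order, so elements appear in the order
  // each seat was first observed.
  return Array.from(byKey.values());
}

// Invented sample: "Ana" sat in seat (0, 0) twice and in seat (1, 2) once.
const sampleObservations = [
  { name: "Ana", x: 0, y: 0 },
  { name: "Ana", x: 1, y: 2 },
  { name: "Ana", x: 0, y: 0 },
];
const collapsedSeats = collapseSeats(sampleObservations);
```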

The second data file was used to show a time-series for an arbitrary set of students, that is to say, a timeline of where they sat in each lecture throughout the semester. The rationale behind it is that instead of just seeing a static view of who sat where and with what frequency, a user may want to see how students move about, lecture by lecture, in order to perhaps see if there are sets of students who move around together, or conversely if there are ones who always avoid each other.

Here, the data file had an additional field called "missing" that indicated whether or not a student attended a particular lecture. In the case of him or her being absent, the coordinates from the previous lecture were used. The "count" field was eliminated since elements were no longer consolidated. Instead, each student’s set had 20 elements representing the course’s 20 lectures.
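A sketch of how one student's time-series set could be built, including the carry-forward rule for absences, is given below. The function name buildTimeSeries and the shape of its input (an object mapping lecture numbers to seats) are hypothetical. One case the text does not cover is a student who is absent from the very first lecture; the sketch simply leaves the coordinates as null there, which is an assumption.

```javascript
// Build one student's time series: one element per lecture. Absences are
// flagged as missing and reuse the coordinates from the previous lecture.
// `seatsByLecture` maps a 1-based lecture number to {x, y}; lectures the
// student missed have no entry.
function buildTimeSeries(name, seatsByLecture, lectureCount) {
  const series = [];
  let previous = null;
  for (let lecture = 1; lecture <= lectureCount; lecture++) {
    const seat = seatsByLecture[lecture];
    if (seat) {
      previous = seat;
      series.push({ name, lecture, x: seat.x, y: seat.y, missing: false });
    } else {
      // Assumption: with no earlier seat to fall back on, x and y stay null.
      series.push({
        name,
        lecture,
        x: previous ? previous.x : null,
        y: previous ? previous.y : null,
        missing: true,
      });
    }
  }
  return series;
}

// Invented sample: present in lectures 1 and 3, absent in 2 and 4.
const anaSeries = buildTimeSeries("Ana", { 1: { x: 0, y: 0 }, 3: { x: 2, y: 1 } }, 4);
```

For the course described here, lectureCount would be 20, giving each student a set of 20 elements.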

The sets in both data files were then trivially converted into JavaScript arrays and fed into the visualization toolkit described below. Though the toolkit can work with a variety of data files, such as comma-separated-value (CSV) files and JavaScript Object Notation (JSON) files, plain-old arrays were used for this project since the dataset was small.

Visualization
D3 is a powerful visualization toolkit that came out of Stanford’s Visualization Group. For anyone who has done Web development, it is easy to use since it borrows concepts and metaphors from the style sheet language CSS and feels a lot like the popular JavaScript library jQuery; after all, it is JavaScript.

Scalable Vector Graphics (SVG) is a set of specifications for rendering two-dimensional vector graphics within the Web browser. D3 works with SVG elements, allowing one to easily bind arbitrary data to a set of SVG elements, which can then be manipulated or animated. The advantage of SVG is that it is supported natively in all major browsers (except versions of IE before 9) and works in iOS, which is one area where Flash-based applications break.

For the visualization, an abstract view of the classroom was rendered on screen with rows being indicated by hairline horizontal lines. The top of the screen constituted the front of the classroom and the left was where the two doors were located. The seats were shown as being the same distance from each other and along parallel rows. The default view of the visualization used the first data file (the static one) and, for each student, bound each element to a different circle with its radius correlated to the "count" field of that particular element. Hence, a student who sat in a total of four seats during the semester would have four circles rendered on screen, each at the coordinates of the corresponding seat. The more often he or she sat in a seat, the larger the circle around that seat. Circles were semi-opaque so that overlaps were more apparent.
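To make the mapping from a data element to a circle concrete, here is a small sketch in plain JavaScript. It deliberately does not use D3, so it is not the project's actual rendering code; it only illustrates the arithmetic. The seat spacing, the radius values, the linear growth of the radius with "count", and the opacity of 0.5 are all assumed values chosen for illustration.

```javascript
// Assumed layout constants, in pixels. None of these come from the project.
const SEAT_SPACING = 40; // distance between neighboring seats
const MIN_RADIUS = 6;    // radius of a circle with count 1
const RADIUS_STEP = 3;   // extra radius for each additional visit

// Map one element of the static data file to circle geometry. Rows (x) run
// down the screen, since the top is the front of the classroom, and
// columns (y) run across it.
function seatToCircle({ x, y, count }) {
  return {
    cx: SEAT_SPACING * (y + 1),
    cy: SEAT_SPACING * (x + 1),
    r: MIN_RADIUS + RADIUS_STEP * (count - 1),
  };
}

// Serialize the geometry as an SVG circle element. The partial fill
// opacity lets overlapping circles show through one another.
function circleToSvg({ cx, cy, r }) {
  return `<circle cx="${cx}" cy="${cy}" r="${r}" fill-opacity="0.5" />`;
}

const sampleCircle = circleToSvg(seatToCircle({ x: 0, y: 0, count: 2 }));
```

With D3, the same geometry would typically be set as attributes on circles bound to the data array rather than built as strings.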

After refreshing the browser window and seeing the circles pop up via "fade-in" animations, it was observed that there were three groups of students: 1) those who preferred to sit in the same zone, that is, the same set of seats, be they in tight clusters in some part of the classroom or across some particular row; 2) those who jumped around a lot; and 3) those who preferred to sit in the same seat. The frequency was in that order, from most to least. So for the SDM ’11 cohort, and going by the data for this one class, it appears that most students didn’t in fact sit in the same seats, but rather in the same zones.

Three colors were used to shade the circles belonging to each of the groups. Hovering over a circle adds strokes around all the circles belonging to that student and highlights the student’s name to the right of the screen.

An option on the webpage called "Play time-series" allows users to pick an arbitrary number of students and see how they sit relative to each other throughout the semester. In that mode, circles are all the same size and, as mentioned earlier, missing data (i.e., days on which students were absent) are faded out and assume the coordinates of the previous lecture.

Even with a dataset as small as this one, the visualization revealed that it may be possible to categorize students’ seating habits.

Some areas of future exploration include: determining how the relationships between students influence where they sit and investigating correlations between where people sit and their gender, what time they arrive to class, their personality type, whether or not they have a class immediately before, and their final grade. A researcher from Europe has since done the same experiment with her class and we’re in the process of comparing findings.

Screenshots

The default view of the classroom seating habits visualization

The time-series data for all students are shown at once. The animation remains smooth throughout. Overlaps may be noticed when one student has missing data.

The seating habits for two students are compared.

The time-series data for the same two students are compared

After starting the time-series player for a student, a static view of his or her seating habits may be displayed too; it is overlaid on top of the other elements.

GitHub-inspired buttons decorate the top of the controls pane. The rest of the pane is made up of students’ names, categorized based on their seating habit. Each category is colored differently, as shown in the vertical bar that stretches the height of each group. Custom checkboxes ensure that they are rendered the same in all browsers.

A script took care of processing the data by collapsing and normalizing them so that they could be fed into the d3 toolkit. Another script processed the time-series data.

The initial capturing of the data was done by adding student names to matrices that represent the spatial layout of the classroom.

The actual classroom: 1-390 (source and copyright)

Ali Almossawi, SDM ’11, came to the program from the software engineering industry, having worked for the government and then as part of a startup that he co-founded. He holds an M.S. in software engineering.

Ali Almossawi
Photo by Kathy Tarantola Photography