<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Emilee Rader &#187; visualization</title>
	<atom:link href="http://bierdoctor.com/category/visualization/feed/" rel="self" type="application/rss+xml" />
	<link>http://bierdoctor.com</link>
	<description>Assistant Professor, Technology &#38; Social Behavior @ Northwestern University</description>
	<lastBuildDate>Thu, 02 Sep 2010 04:50:39 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>happy new year!</title>
		<link>http://bierdoctor.com/2010/01/01/happy-new-year-2/</link>
		<comments>http://bierdoctor.com/2010/01/01/happy-new-year-2/#comments</comments>
		<pubDate>Sat, 02 Jan 2010 05:04:14 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[reflection]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=431</guid>
		<description><![CDATA[Continuing the yearly tradition I started last January, here is my 2009 word cloud, made using wordle.net from the text of the blog posts I wrote in 2009. The most common words don&#8217;t look too different from the 2008 wordle! It&#8217;s an SVG image, so if you&#8217;re having problems viewing it try using a more [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing the yearly tradition I started last January, here is my 2009 word cloud, made using <a href="http://wordle.net">wordle.net</a> from the text of the blog posts I wrote in 2009. The most common words don&#8217;t look too different from the <a href="http://bierdoctor.com/2009/01/01/210/">2008 wordle</a>! </p>
<p><embed src="http://bierdoctor.com/images/svg/2009wordle.svg" height="350" width="450"></embed></p>
<p>It&#8217;s an SVG image, so if you&#8217;re having problems viewing it try using a more recent browser version. It works for me in both the latest Safari and Firefox. See both the <a href="http://bierdoctor.com/images/svg/madmissionWordle2008-3.svg">2008</a> and <a href="http://bierdoctor.com/images/svg/2009wordle.svg">2009</a> wordles, big.</p>
Copyright &copy; 2010 <strong><a href="http://bierdoctor.com/">Emilee Rader</a></strong>]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2010/01/01/happy-new-year-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>productivity</title>
		<link>http://bierdoctor.com/2009/06/02/productivity/</link>
		<comments>http://bierdoctor.com/2009/06/02/productivity/#comments</comments>
		<pubDate>Tue, 02 Jun 2009 18:27:48 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[administrivia]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[reflection]]></category>
		<category><![CDATA[software tools]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://madmission.bierdoctor.com/2009/06/02/productivity/</guid>
		<description><![CDATA[well, the paper is submitted. but man, i NEVER want to do that again. and by &#8220;that&#8221; i mean write a single-author paper in about a week. i&#8217;d been working on analysis (along with all my other dissertation- and work-related stuff) for months, but when we returned from the holiday weekend &#8212; where i tried [...]]]></description>
			<content:encoded><![CDATA[<p>well, the paper is submitted. but man, i NEVER want to do that again. and by &#8220;that&#8221; i mean write a single-author paper in about a week. i&#8217;d been working on analysis (along with all my other dissertation- and work-related stuff) for months, but when we returned from the holiday weekend &#8212; where i tried and failed to write &#8212; all i had done was a bunch of statistics, graphs, and notes.</p>
<p>i&#8217;ve been using this service called <a href="http://www.rescuetime.com/">RescueTime</a> for the past several weeks as a way to track my hours for different projects i am working on, and as an indicator of my productivity in general. basically, you install a little app on your computer, and it sends data about what applications are active to the RescueTime server. you can log in and see reports of how much time you are spending looking at which apps and web pages (for $8/mo. you can get reports broken down by window title, not just application).</p>
<p>i have been happy to learn that i don&#8217;t &#8220;waste&#8221; as much time as i might have thought. but this past week isn&#8217;t a very accurate indication of my normal work habits. i went from notes and graphs to a 10-page ACM-format paper in a week:</p>
<p><a href="http://bierdoctor.com/images/gif/rescuetime.gif" target="_blank"><img src="http://bierdoctor.com/images/png/0526.png" border="0" height="481" width="391" /></a><br />
(<a href="http://bierdoctor.com/images/gif/rescuetime.gif" target="_blank">click for animated gif</a> showing May 26 &#8211; June 1)</p>
<p>it&#8217;s nice to see that i&#8217;ve still &#8220;got it&#8221;, i guess. but that was not a fun week.</p>
<p>i highly recommend RescueTime, if like me you want to be more meta about how you spend your time, and like looking at data.</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2009/06/02/productivity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>hierarchy structure comparison</title>
		<link>http://bierdoctor.com/2009/05/05/hierarchy-structure-comparison/</link>
		<comments>http://bierdoctor.com/2009/05/05/hierarchy-structure-comparison/#comments</comments>
		<pubDate>Tue, 05 May 2009 17:41:04 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[analysis]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[dissertation]]></category>
		<category><![CDATA[measures]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://madmission.bierdoctor.com/2009/05/05/hierarchy-structure-comparison/</guid>
		<description><![CDATA[in my dissertation experiment, participants recruited from two different intellectual communities on campus organized a set of documents into file-and-folder hierarchies (single-categorization tree structures), using an online system created specifically for the experiment. as part of my analysis, i&#8217;m interested in comparing the hierarchies they created, to find out whether there are any reliable differences [...]]]></description>
			<content:encoded><![CDATA[<p>in my dissertation experiment, participants recruited from two different intellectual communities on campus organized a set of documents into file-and-folder hierarchies (single-categorization tree structures), using an online system created specifically for the experiment. as part of my analysis, i&#8217;m interested in comparing the hierarchies they created, to find out whether there are any reliable differences between the &#8220;community membership&#8221; groups.</p>
<p>i needed to select a couple of dimensions on which to compare the hierarchies. so far, i&#8217;ve picked three:</p>
<ol>
<li>file path and label (do people pick the same words for the same files)</li>
<li>file grouping (which files are grouped together in the same folder)</li>
<li>breadth vs. depth (how structurally &#8220;complex&#8221; are the hierarchies)</li>
</ol>
<p>i calculated the dissimilarity between all possible pairs of participants, on the first two of the measures above (i have not yet come up with a measure i am happy with for the third). the measures represent the percent of the time two users did not choose the same exact words (#1), or group the same two files together in a folder (#2).</p>
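<p>a minimal sketch of a grouping dissimilarity like #2, for illustration only. the input format (a dict mapping folder names to file lists), the function names, and the example data are all made up, and the exact formula is an assumption: here it is one minus the share of co-grouped file pairs that both participants have in common, which may differ from the calculation actually used in the analysis.</p>
<pre>
```python
from itertools import combinations


def co_grouped_pairs(folders):
    """Return the set of file pairs that share a folder.

    `folders` maps a folder path to the list of files placed in it
    (a hypothetical input format, not the experiment's actual data).
    """
    pairs = set()
    for files in folders.values():
        for a, b in combinations(sorted(files), 2):
            pairs.add((a, b))
    return pairs


def grouping_dissimilarity(folders_a, folders_b):
    """Share of co-grouped file pairs that only one participant made."""
    pairs_a = co_grouped_pairs(folders_a)
    pairs_b = co_grouped_pairs(folders_b)
    union = pairs_a.union(pairs_b)
    if not union:
        return 0.0  # neither participant grouped anything together
    shared = pairs_a.intersection(pairs_b)
    return 1.0 - len(shared) / len(union)


# two made-up participants organizing the same four files
p1 = {"methods": ["f1", "f2"], "theory": ["f3", "f4"]}
p2 = {"stats": ["f1", "f2", "f3"], "misc": ["f4"]}
print(grouping_dissimilarity(p1, p2))
```
</pre>
<p>running this for every pair of participants would fill in the dissimilarity matrix that a scaling technique like MDS takes as input.</p>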
<p>i used multidimensional scaling (MDS) to explore the structure in these dissimilarity values. this produces a visual representation that plots each participant in relation to every other participant, according to the similarity/dissimilarity information. a really nice, informative web page about MDS can be found <a href="http://www.analytictech.com/networks/mds.htm">here</a>.</p>
<p>MDS takes a set of proximities that can represent any number of dimensions (i.e., many factors might have contributed to the particular pattern of proximities observed), repeatedly transforms the information such that only the most important dimensions (mathematically speaking) are retained, and then smushes it into 2 dimensions so that we humans can make sense of the resulting graph (3d plots are just hard to parse). each MDS solution has an associated stress value, indicating how much distortion occurred as part of this &#8220;smushing&#8221; process. this is kind of similar to the distortion in the size of various continents apparent in maps when moving from a 3d representation (a globe) to a 2d representation like the <a href="http://en.wikipedia.org/wiki/Mercator_projection">Mercator projection</a>. the lower the stress value in MDS, the less distortion.</p>
<p>due to all this dimensional reduction and smushing, the specific coordinate system in an MDS plot usually has no relationship to the real world (unless you started with data that could already be represented accurately in 2 dimensions); when interpreting the MDS plot, it is the relative distances between the points, not the absolute distances in the coordinate system, that are important. MDS is a complicated mathematical analysis technique, but interpreting the results is usually a qualitative activity; one looks at the plots and tries to identify clusters or patterns that are meaningful in the context of the data and research questions.</p>
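<p>to make the stress idea concrete, here is a rough sketch of one common definition, Kruskal-style stress-1. this is an assumption: the post does not say which stress formula the software reported. the function name, the input format, and the three-point example are invented, and the sketch compares raw dissimilarities to plot distances directly, skipping the rescaling step that non-metric MDS would apply first.</p>
<pre>
```python
import math
from itertools import combinations


def stress_1(dissimilarities, coords):
    """Kruskal-style stress-1 for a 2d layout.

    `dissimilarities[(i, j)]` is the input dissimilarity for a pair and
    `coords[i]` is that point's (x, y) position in the plot. Simplifying
    assumption: the raw dissimilarities are compared to plot distances
    directly, with no monotone rescaling step.
    """
    squared_error = 0.0
    squared_total = 0.0
    for i, j in combinations(sorted(coords), 2):
        plotted = math.dist(coords[i], coords[j])
        squared_error += (dissimilarities[(i, j)] - plotted) ** 2
        squared_total += plotted ** 2
    return math.sqrt(squared_error / squared_total)


# three made-up points laid out as a 3-4-5 right triangle
coords = {"a": (0, 0), "b": (3, 0), "c": (0, 4)}
exact = {("a", "b"): 3, ("a", "c"): 4, ("b", "c"): 5}
distorted = {("a", "b"): 3, ("a", "c"): 4, ("b", "c"): 6}

print(stress_1(exact, coords))      # layout reproduces every dissimilarity
print(stress_1(distorted, coords))  # one pair cannot be matched in the plot
```
</pre>
<p>the first layout reproduces its input perfectly, so its stress is zero; the second has to misplace one pair, so its stress is higher. only the relative sizes matter here, which matches how the stress values are read in the plots.</p>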
<p>ok, so, <a href="http://bierdoctor.com/images/svg/groupingMDS.svg">below is an MDS plot</a> for the file grouping measure (there are no axis labels because coordinates are not meaningful) &#8212; these are SVG images; apologies if your browser cannot display them:</p>
<p><a href="http://bierdoctor.com/images/svg/groupingMDS.svg"><img class="alignnone" src="http://bierdoctor.com/images/svg/groupingMDS.svg" alt="" width="578" height="578" /></a></p>
<p>to me, it seems like there aren&#8217;t any clear patterns in this graph. i interpret this to mean there aren&#8217;t any consistent similarities or differences that are unique to one community or the other, in the files participants chose to group together into the same folder. the stress on this one is a little high, but not horrible.</p>
<p>now take a look at the <a href="http://bierdoctor.com/images/svg/labelsMDS.svg">MDS plot</a> for the labeling measure:</p>
<p><a href="http://bierdoctor.com/images/svg/labelsMDS.svg"><img class="alignnone" src="http://bierdoctor.com/images/svg/labelsMDS.svg" alt="" width="578" height="578" /></a></p>
<p>this one looks more like it has a pattern; there&#8217;s some overlap but i can definitely see more blue circles on top, and more red triangles toward the bottom. and, the stress on this one is much lower.</p>
<p>so from these graphs, it seems like participants from the same community are more similar in the vocabulary they use, than participants from different communities. however, when deciding which files &#8220;go together&#8221; in the same folder, there doesn&#8217;t seem to be a clear pattern based on community membership. i also did some statistical tests, which confirm this qualitative assessment.</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2009/05/05/hierarchy-structure-comparison/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>lattice graphics in R</title>
		<link>http://bierdoctor.com/2008/06/08/lattice-graphics-in-r/</link>
		<comments>http://bierdoctor.com/2008/06/08/lattice-graphics-in-r/#comments</comments>
		<pubDate>Mon, 09 Jun 2008 02:15:36 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[software tools]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://madmission.bierdoctor.com/2008/06/08/lattice-graphics-in-r/</guid>
		<description><![CDATA[if you&#8217;ve decided to switch over to doing stats in R because the graphs R generates look so nice, you&#8217;ve probably experienced frustration with the lattice graphics package. it is highly customizable, which also means it is crazy complicated. and, the documentation doesn&#8217;t seem to be written for non-expert users. while searching for documentation on [...]]]></description>
			<content:encoded><![CDATA[<p>if you&#8217;ve decided to switch over to doing stats in R because the graphs R generates look so nice, you&#8217;ve probably experienced frustration with the lattice graphics package. it is highly customizable, which also means it is crazy complicated. and, the documentation doesn&#8217;t seem to be written for non-expert users. while searching for documentation on the densityplot function earlier today, i came across <a href="http://osiris.sunderland.ac.uk/~cs0her/Statistics/UsingLatticeGraphicsInR.htm">this webpage</a>, which provides a fairly straightforward overview of the lattice package, and some nice examples of the types of graphs it is able to produce.</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2008/06/08/lattice-graphics-in-r/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>scatterplots and timelines</title>
		<link>http://bierdoctor.com/2008/04/09/scatterplots-and-timelines/</link>
		<comments>http://bierdoctor.com/2008/04/09/scatterplots-and-timelines/#comments</comments>
		<pubDate>Wed, 09 Apr 2008 05:19:15 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://madmission.bierdoctor.com/2008/04/09/scatterplots-and-timelines/</guid>
		<description><![CDATA[i have finished a few new visualizations for the ctools data. i&#8217;ve been working on a python script that creates two different timelines for each site, one showing activity (events) broken down by user and the other activity broken down by file. i recommend using Opera to view the files below (or Adobe Illustrator), and [...]]]></description>
			<content:encoded><![CDATA[<p>i have finished a few new visualizations for the ctools data. i&#8217;ve been working on a python script that creates two different timelines for each site, one showing activity (events) broken down by user and the other activity broken down by file.</p>
<p>i recommend using <a href="http://www.opera.com/">Opera</a> to view the files below (or Adobe Illustrator), and NOT Firefox. SVG support in Firefox is apparently still too beta. i switched to Opera a couple of days ago, because it renders the graphics much, much faster and it supports zooming in and out of images.</p>
<p>DSO ctools site: <a href="http://bierdoctor.com/images/svg/1084809818012-1151812_file_timeline.svg">file timeline</a>, <a href="http://bierdoctor.com/images/svg/1084809818012-1151812_user_timeline.svg">user timeline</a>, and <a href="http://bierdoctor.com/images/svg/1084809818012-1151812_2006-12-31_2008-01-01.svg">activity scatterplot</a></p>
<p>PhD Program ctools site: <a href="http://bierdoctor.com/images/svg/c44a578d-f2bc-4958-00fd-c209cb8bcade_file_timeline.svg">file timeline</a>, <a href="http://bierdoctor.com/images/svg/c44a578d-f2bc-4958-00fd-c209cb8bcade_user_timeline.svg">user timeline</a>, and <a href="http://bierdoctor.com/images/svg/c44a578d-f2bc-4958-00fd-c209cb8bcade_2006-12-31_2008-01-01.svg">activity scatterplot</a></p>
<p>i&#8217;ve removed the usernames from the user timelines &#8212; if you want to know which line is you, send me mail and i can send you a personalized image file.</p>
<p>in the activity scatterplot, circle size represents how many users accessed that file, and the color represents how deep the file is in the resources hierarchy. for an interesting contrast, see the scatterplots and file timelines for two other sites: [ site a: <a href="http://bierdoctor.com/images/svg/6660a471-1c00-4331-80e1-bc74698fd8e2_file_timeline.svg">timeline</a> | <a href="http://bierdoctor.com/images/svg/6660a471-1c00-4331-80e1-bc74698fd8e2_2007-04-30_2007-09-01.svg">scatterplot</a>] and [site b: <a href="http://bierdoctor.com/images/svg/cddeea35-039b-4a5e-002e-a475cc5593c5_file_timeline.svg">timeline</a> | <a href="http://bierdoctor.com/images/svg/cddeea35-039b-4a5e-002e-a475cc5593c5_2007-04-30_2007-09-01.svg">scatterplot</a>]. i&#8217;ve removed the titles for privacy reasons. it is fairly obvious from these graphs that the activity patterns and hierarchy structure for these two sites are very different.</p>
<p>my script is designed to create timelines for specific site id&#8217;s, or for a random sample of sites having events in 2005-2007. i&#8217;ve run several random samples of 100 sites, and my main observation is that most of these sites do not get continuous use &#8212; many are used for only a few months and then never looked at again. of course, there were over 8000 sites active in 2007 alone, and there&#8217;s no way for me to look at visualizations for all of them. i need to narrow down the list before i start recruiting for my field study. i believe i should be looking for sites with more than just 2 or 3 active users, and which seem to have steady activity for longer than one semester.</p>
<p>i should also mention that these visualizations were inspired by, although definitely not as beautiful as, those by Viegas [ <a href="http://alumni.media.mit.edu/~fviegas/">MIT page</a> | <a href="http://www.research.ibm.com/visual/fernanda.html">IBM page</a> ].</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2008/04/09/scatterplots-and-timelines/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>another scatterplot</title>
		<link>http://bierdoctor.com/2008/03/24/another-scatterplot/</link>
		<comments>http://bierdoctor.com/2008/03/24/another-scatterplot/#comments</comments>
		<pubDate>Mon, 24 Mar 2008 17:30:03 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://madmission.bierdoctor.com/2008/03/24/another-scatterplot/</guid>
		<description><![CDATA[here&#8217;s another scatterplot i&#8217;ve been working on [ look it scales! ]: scatterplots themselves aren&#8217;t that exciting. each circle in this chart is one project site, and what this visualization illustrates is how many events took place on each site during a given time interval, and over how many days those events were spread [...]]]></description>
			<content:encoded><![CDATA[<p>here&#8217;s another scatterplot i&#8217;ve been working on [ <a href="http://bierdoctor.com/images/svg/scatterplot_2007-08-31_2007-12-31.svg">look it scales!</a> ]:</p>
<p><embed src="http://bierdoctor.com/images/2008/03/scatterplot_2007-08-31_2007-12-31.svg"/></p>
<p>scatterplots themselves aren&#8217;t that exciting. each circle in this chart is one project site, and what this visualization illustrates is how many events took place on each site during a given time interval, and over how many days those events were spread out. so for the purple circle at the top, 150 events took place over ~50 different days out of a 4-month time window. the size of the circles reflects how many different users were involved in those events, and the color of the circles indicates the age of the site (orange is younger, purple is older). these scatterplots are primarily just for practice; it is hard to see much on a scatterplot with <a href="http://bierdoctor.com/images/svg/scatterplot_2007-08-31_2007-12-31all.svg">ALL sites included</a> (the one above has only 50 sites); it&#8217;s just too &#8220;busy&#8221;. so my next task is to start generating visualizations for individual sites, which is something that is really impractical to do by hand. it&#8217;s been a long time since i did any programming &#8212; this is kind of fun!</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2008/03/24/another-scatterplot/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>visualization</title>
		<link>http://bierdoctor.com/2008/03/20/visualization/</link>
		<comments>http://bierdoctor.com/2008/03/20/visualization/#comments</comments>
		<pubDate>Thu, 20 Mar 2008 06:35:52 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[software tools]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://madmission.bierdoctor.com/2008/03/20/visualization/</guid>
		<description><![CDATA[i&#8217;ve wanted to create some visualizations of event log data from my ctools project for quite a while now, but every time i thought about getting started, the idea of learning a new programming language or toolkit seemed like too much work for too little payoff. but when i decided to scale back my aspirations [...]]]></description>
			<content:encoded><![CDATA[<p>i&#8217;ve wanted to create some visualizations of event log data from my ctools project for quite a while now, but every time i thought about getting started, the idea of learning a new programming language or toolkit seemed like too much work for too little payoff. but when i decided to scale back my aspirations &#8212; i.e., admit to myself that i&#8217;m not the fastest programmer so producing a fully interactive looks-like-art visualization is beyond the scope of my dissertation project &#8212; i figured out an approach that just might turn out to be useful.</p>
<p>my objective is to make some pictures that will help me see patterns in the data, and help me communicate those patterns to others in presentations and papers. i&#8217;m NOT trying to create standalone interactive visualization applets to host on the web and allow others to explore the data.</p>
<p>what i really need is a way to produce some static visualizations that allow me to explore the data in more customizable ways than the usual graphing functions in statistical software allow. even the types of visualizations in <a href="http://services.alphaworks.ibm.com/manyeyes/page/Visualization_Options.html">manyeyes</a> are too restrictive (plus i can&#8217;t exactly just up and put all this data online). for example, i&#8217;d like to create a series of images that show how a site&#8217;s file and folder hierarchy grows and changes over time, or perhaps illustrating which users access which files on a site most frequently, or where specific files are located in the hierarchy at different points in time.</p>
<p>i looked into a couple of options before settling on my approach: using <a href="http://www.python.org/">Python</a> to connect to MySQL and obtain query results, munge them appropriately, and then generate static <a href="http://www.w3.org/Graphics/SVG/">SVG (scalable vector graphics)</a> images. one nice thing about this approach is that i&#8217;ve been wanting to play around with Python for a while (and there&#8217;s a lot of documentation on the web), and SVG images can be opened and edited in Adobe Illustrator to prep for use in presentations.</p>
<p>some other approaches i considered:</p>
<p>- <a href="http://processing.org/">Processing</a>, which i ultimately ruled out because i don&#8217;t really need interactivity, and just spitting out some static pictures seems to require the same kinds of steps as python/svg.</p>
<p>- <a href="http://nodebox.net/code/index.php/Home">NodeBox</a>, which seems like a really cool project, but i ruled it out because it isn&#8217;t clear how much documentation is out there, and i&#8217;m not sure how comprehensive the libraries are</p>
<p>my first attempt at an SVG file generated using python&#8230; not too fancy, but so far so good! each circle represents one ctools project site in use during 2007, and the size of the circle indicates the number of active users on each site. if you&#8217;re not using Firefox, you may not be able to see the <a href="http://bierdoctor.com/images/2008/03/scatterplot10.svg">image below</a> without an SVG viewer plugin.</p>
<p><embed src="http://bierdoctor.com/images/2008/03/scatterplot10.svg"/></p>
<p>here&#8217;s one with 50 sites rather than 10 [ <a href="http://bierdoctor.com/images/svg/scatterplot50.svg">link</a> ]</p>
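<p>for anyone curious what the python-to-SVG step can look like, here is a minimal sketch using only the standard library. it is an illustration, not the script that produced the images above: the function name, the pixel coordinates, the user counts, the color, and the choice to scale circle area (rather than radius) by user count are all assumptions, and the hard-coded list stands in for rows that would come back from a database query.</p>
<pre>
```python
import math
import xml.etree.ElementTree as ET


def scatterplot_svg(sites, width=400, height=300):
    """Build an SVG scatterplot as a string.

    `sites` is a list of (x, y, active_users) tuples in pixel coordinates;
    these values are invented, standing in for rows from a database query.
    """
    svg = ET.Element(
        "svg",
        xmlns="http://www.w3.org/2000/svg",
        width=str(width),
        height=str(height),
    )
    for x, y, active_users in sites:
        ET.SubElement(
            svg,
            "circle",
            cx=str(x),
            cy=str(y),
            # scale the area, not the radius, by the user count
            r=str(round(3 * math.sqrt(active_users), 1)),
            fill="steelblue",
            opacity="0.6",
        )
    return ET.tostring(svg, encoding="unicode")


# one circle per hypothetical site: (x, y, number of active users)
sites = [(50, 200, 4), (150, 120, 25), (300, 60, 9)]
print(scatterplot_svg(sites))
```
</pre>
<p>saving the printed output to a file ending in .svg gives a static image that a browser or Adobe Illustrator can open. building the markup with ElementTree instead of string formatting means attribute quoting and escaping are handled by the library.</p>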
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2008/03/20/visualization/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
