<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Emilee Rader</title>
	<atom:link href="http://bierdoctor.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://bierdoctor.com</link>
	<description>Post-Doctoral Fellow, Center for Technology and Social Behavior @ Northwestern University</description>
	<lastBuildDate>Fri, 05 Mar 2010 09:00:25 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>&#8220;getting the most out of twitter&#8221;</title>
		<link>http://bierdoctor.com/2010/03/05/getting-the-most-out-of-twitter/</link>
		<comments>http://bierdoctor.com/2010/03/05/getting-the-most-out-of-twitter/#comments</comments>
		<pubDate>Fri, 05 Mar 2010 09:00:25 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[in the news]]></category>
		<category><![CDATA[social filtering]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=478</guid>
		<description><![CDATA[There was an interesting article the other day on nytimes.com titled &#8220;Getting the Most out of Twitter&#8221; which essentially argues that lurking (i.e., reading others&#8217; posts and not posting yourself) is the way to go:
Even the most prolific users say Twitter has become more useful as a way to tap in to the discussions of [...]]]></description>
			<content:encoded><![CDATA[<p>There was an interesting article the other day on nytimes.com titled &#8220;<a href="http://www.nytimes.com/2010/03/04/technology/04basics.html">Getting the Most out of Twitter</a>&#8221; which essentially argues that lurking (i.e., reading others&#8217; posts and not posting yourself) is the way to go:</p>
<blockquote><p>Even the most prolific users say Twitter has become more useful as a way to tap in to the discussions of the day than to broadcast their own thoughts. And once you get pulled in, you might just find you have something to say after all.</p>
<p>Biz Stone, Twitter’s co-founder, suggests that naysayers simply log on to <a title="Link to the site." href="http://twitter.com/">Twitter’s home page</a> and search for a topic they are interested in, whether it’s their favorite sports team, the name of their company or a topic in the news.</p>
<p>Within a minute, they understand the appeal, he said.</p></blockquote>
<p>This is interesting to me, for a couple of reasons. First, it seems like up to now, most of the hype in the press about Twitter has been focused on production&#8212;posting Tweets&#8212;rather than consumption. This article seems to take the information available on Twitter as a given, and focuses on the potential benefit information consumers might receive by using Twitter for &#8220;social filtering&#8221;. It even provides some advice for how to get better results from one&#8217;s &#8220;social filtering&#8221; endeavors on Twitter.</p>
<p>Second, in taking the information on Twitter as a given, it sidesteps questions about the incentives mismatch&#8212;if the real benefit of Twitter is in consumption and &#8220;social filtering&#8221;, who are the information producers and what are *their* motivations? What influences their choices about what to contribute, and how do these influences shape the information available via Twitter?</p>
<p>The article quotes someone named <a href="http://danzarrella.com/">Dan Zarrella</a>, who has a blog called &#8220;The Social Media Scientist&#8221;. On March 1st he posted a comparison of link-sharing on Twitter and Facebook, <a href="http://danzarrella.com/data-shows-twitter-centric-stories-are-not-heavily-shared-on-facebook.html">Data Shows: “Twitter”-Centric Stories are Not Heavily Shared on Facebook</a>. I was initially excited about this&#8212;I&#8217;m all for cross-site comparisons, and I&#8217;m really interested in comparing &#8220;social filtering&#8221; across different systems. However, Dan does not reveal where his data came from, and only briefly mentions sampling:</p>
<blockquote><p>I’ve begun by capturing links posted to social media sites from 10 extremely popular news outlets. Some of the top blogs, both mainstream and geeky, as well as a handful of the most web-enabled newspapers of record. Then I’m counting the number of times those links are shared on Facebook (in three different ways) and on Twitter (through good old ReTweets).</p></blockquote>
<p>This was disappointing. A social media dataset is only as good as one&#8217;s data collection and sampling methods, and without detailed information about such things, the results and any conclusions based on them are suspect. Even more disappointing: only two commenters (out of 23) and zero &#8220;tweeters&#8221; (out of 189, as tracked by <a href="http://disqus.com/comments/">DISQUS Comments</a>) ask about where the data came from.</p>
Copyright &copy; 2010 <strong><a href="http://bierdoctor.com/">Emilee Rader</a></strong>]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2010/03/05/getting-the-most-out-of-twitter/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>adventures in social filtering</title>
		<link>http://bierdoctor.com/2010/03/02/adventures-in-social-filtering/</link>
		<comments>http://bierdoctor.com/2010/03/02/adventures-in-social-filtering/#comments</comments>
		<pubDate>Tue, 02 Mar 2010 09:00:59 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[research]]></category>
		<category><![CDATA[social filtering]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=476</guid>
		<description><![CDATA[I&#8217;ve been thinking a lot lately about &#8220;social filtering&#8221;, or the practice of discovering information by paying attention to what others are paying attention to. Evidence of the attention of others is explicitly captured and aggregated by various social media applications like Digg, delicious, and Twitter / TweetMeme. This is hardly a new concept (see [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been thinking a lot lately about &#8220;social filtering&#8221;, or the practice of discovering information by paying attention to what others are paying attention to. Evidence of the attention of others is explicitly captured and aggregated by various social media applications like <a href="http://digg.com/">Digg</a>, <a href="http://delicious.com/">delicious</a>, and <a href="http://twitter.com/">Twitter</a> / <a href="http://tweetmeme.com/">TweetMeme</a>. This is hardly a new concept (see &#8220;<a href="http://portal.acm.org/citation.cfm?id=142751">Edit wear and read wear</a>&#8221;, Hill et al., CHI 1992); however, rather than tracking passive traces of use, these applications collect and aggregate explicit actions&#8212;posting a link to delicious or Twitter is essentially a user endorsement of the content.</p>
<p>I can&#8217;t quite put my finger on it yet, but I feel like there&#8217;s a fundamental difference between social filtering as a side effect of saving a story, bookmark, or reference for oneself and social filtering that comes from posting it so that it will be broadcast to others.</p>
<p>For example, for the past 6 months I have been using <a href="http://www.mendeley.com/how-it-works/">Mendeley</a>, which is both a desktop application and a social media system, as my reference manager. Today I received the March newsletter from Mendeley, which pointed out the &#8220;<a href="http://www.mendeley.com/blog/academic-features/the-top-10-journal-articles-published-in-2009-by-readership-on-mendeley/?utm_source=Newsletter_Feb_2010&amp;utm_medium=Email&amp;utm_content=Top_10_most_read_headline&amp;utm_campaign=Newsletter_Feb_2010">Top 10 most read articles on Mendeley published in 2009</a>&#8221;. Number one on the list is the following: Alon, Uri (2009). <a href="http://dx.doi.org/10.1016/j.molcel.2009.09.013">How to Choose a Good Scientific Problem</a>. Molecular Cell, 35(6), 726-8. I was intrigued by the title, so I looked up the paper, and came across another by the same author titled &#8220;<a href="http://dx.doi.org/10.1016/j.molcel.2010.01.011">How to Build a Motivated Research Group</a>&#8221;. Both of these papers contain interesting and valuable insights, and I expect that I will return to them multiple times as I progress in my career.</p>
<p>It seems like the primary use case for Mendeley is storing and organizing references, and sharing them with a group of collaborators. The Mendeley Desktop app synchronizes automatically with the server, so the usage data that was aggregated to produce the &#8220;Top 10 in 2009&#8221; list is more like the &#8220;read wear&#8221; of Hill et al. than the active endorsements of Twitter posts. I doubt I follow anyone on Twitter who reads the journal &#8220;Molecular Cell&#8221;, so I probably would never have come across these papers if I hadn&#8217;t seen them in the newsletter email today. Were the Mendeley users who read this paper even aware that their actions might contribute to the information discovery of others? Are these the kind of content items that anyone would choose to post to Twitter?</p>
<p>Endorsements are different from &#8220;read wear&#8221; in that they require an extra action on the part of the user, beyond reading it or saving it for themselves, to share the content. How does the &#8220;read wear&#8221; vs. endorsement distinction, as incorporated in a social media system, affect the content available to users via the system?</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2010/03/02/adventures-in-social-filtering/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>supplemental statistics</title>
		<link>http://bierdoctor.com/2010/02/27/supplemental-statistics/</link>
		<comments>http://bierdoctor.com/2010/02/27/supplemental-statistics/#comments</comments>
		<pubDate>Sat, 27 Feb 2010 07:51:47 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[analysis]]></category>
		<category><![CDATA[in the news]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=472</guid>
		<description><![CDATA[I came across a really interesting paper recently after seeing it referred to in a news story: Female teachers&#8217; math anxiety affects girls&#8217; math achievement  (Beilock et al., 2010, PNAS, with Supplemental information)
The researchers recruited 17 first- and second-grade teachers (all female) and assessed the math achievement of the students in their classrooms at the [...]]]></description>
			<content:encoded><![CDATA[<p>I came across a really interesting paper recently after seeing it referred to in a news story: <a href="http://www.pnas.org/content/early/2010/01/14/0910967107.abstract">Female teachers&#8217; math anxiety affects girls&#8217; math achievement</a> (Beilock et al., 2010, PNAS, with <a href="http://www.pnas.org/content/early/2010/01/14/0910967107/suppl/DCSupplemental">Supplemental information</a>)</p>
<p>The researchers recruited 17 first- and second-grade teachers (all female) and assessed the math achievement of the students in their classrooms at the beginning and end of the year, as well as the teachers&#8217; anxiety level. They also measured &#8220;students’ beliefs about gender and academic success in domains like math&#8221;. They found that higher teacher math anxiety was associated with an increase in girls&#8217; tendency to adhere to &#8220;boys are good at math, girls are good at reading&#8221; gender stereotypes. They also found that girls who were more likely to hold &#8220;boys are good at math, girls are good at reading&#8221; stereotypes had lower end-of-year math achievement scores. Interestingly, when they put these predictors together in one regression (teacher anxiety and gender stereotypes predicting math achievement), teacher anxiety was &#8220;no longer a significant predictor&#8221; (and the coefficient decreased from -3.33 to -2.48). The paper presents a &#8220;mediation analysis&#8221; called &#8220;bias-corrected bootstrapping&#8221; that suggests math anxiety in female teachers affects girls&#8217; gender stereotypes, which affects math achievement scores. I don&#8217;t know much about this analysis method, so I dug up a couple of papers so I can learn more about it. Yay, stats!</p>
<p>I have two issues with the way the results are presented in this paper. First, it took me way too long to figure out what they actually did. I didn&#8217;t notice the supplemental material initially, which is where all of the analyses are described, and the text of the actual article is too vague about the statistics for me to believe the results from just that part of the text. I realize that <a href="http://www.pnas.org/site/misc/iforc.shtml#length">PNAS limits submissions to 6 pages</a>, but I feel that for this particular paper the supplemental material is not supplemental at all&#8212;it is essential. After reading the supplement, it is pretty clear that the analysis was adequate.</p>
<p>But, my second issue is that the interpretation of the result seems more concerned with sign and significance than with effect size. The paper doesn&#8217;t ground the numbers in real-world implications, nor does it present descriptive statistics on the instruments used (these are relegated to the online appendix as well). For example, it is impossible to interpret the coefficient in this statement without having some idea what the units mean for both teacher anxiety and math achievement: &#8220;In addition, the more girls at the end of the year endorsed the notion that boys are good at math and girls are good at reading, the lower was their math achievement (r = −0.28, P = 0.025).&#8221; This oversight surprises me. So what if gender stereotype belief is a significant predictor of math achievement for girls, if this is only associated with very small differences in test scores? Yes, it is still interesting that the effect was present in girls and not in boys, but if the magnitude of the effect is small, in my mind the implications of this particular study are more about gender stereotypes and behavior modeling, and less about figuring out how to help girls do better in math. (There&#8217;s a brief acknowledgement of effect size in the third to last paragraph: &#8220;It is important to note that the effects reported in the current work, although significant, are small.&#8221;)</p>
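<p>One rough, unit-free way to get a feel for the size of the quoted correlation is to square it, which gives the share of variance in one measure that is linearly associated with the other. The sketch below only does that arithmetic. The value r = -0.28 comes from the sentence quoted above; the function name is made up for illustration, and this is a standard reading of a correlation coefficient, not an analysis taken from the paper.</p>

```python
def variance_explained(r):
    """Share of variance in one variable that is linearly associated with
    the other, given a Pearson correlation r (illustrative helper)."""
    return r ** 2


# r = -0.28 is the correlation quoted from the paper; the rest is illustrative.
r = -0.28
share = variance_explained(r)
print(f"r = {r:.2f}, r squared = {share:.4f}, about {share * 100:.1f}% of variance")
```

<p>Squaring -0.28 gives roughly 0.078, so on this reading stereotype endorsement is associated with about 8% of the variation in girls&#8217; math achievement scores. That is in line with the paper&#8217;s own caveat that the effects are small.</p>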
<p>Finally, it&#8217;s interesting to me, and a bit depressing, that in a paper about math anxiety and achievement, the complicated statistics are relegated to an appendix. There is no way to know if the authors expected the statistical analyses to be transparent enough that they didn&#8217;t need to include the details in the paper, or if they felt the paper would be more understandable for readers without the stats. This is something I struggle with&#8212;how to appropriately describe complicated quantitative analyses for a multi-disciplinary audience that may or may not understand what I&#8217;m talking about, or even want to learn. I&#8217;m not sure I like the stats appendix solution, but I like it a lot more than two other alternatives I&#8217;ve seen: 1) the &#8220;sink or swim&#8221; approach&#8212;describing the analyses as if to an expert, and leaving less experienced readers to flounder; and 2) only using stats one believes most members of the community should be familiar with.</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2010/02/27/supplemental-statistics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>web applications</title>
		<link>http://bierdoctor.com/2010/02/25/web-applications/</link>
		<comments>http://bierdoctor.com/2010/02/25/web-applications/#comments</comments>
		<pubDate>Thu, 25 Feb 2010 06:50:36 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[advice]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[software tools]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=467</guid>
		<description><![CDATA[Alina Lungeanu and I started collecting data last week on our experiment! I don&#8217;t want to say too much about the hypotheses, etc. in case potential participants google me and find this blog post, so instead today I&#8217;m writing about why I&#8217;m glad I&#8217;m not a web application developer.
For the experiment we are using the [...]]]></description>
			<content:encoded><![CDATA[<p>Alina Lungeanu and I started collecting data last week on our experiment! I don&#8217;t want to say too much about the hypotheses, etc. in case potential participants google me and find this blog post, so instead today I&#8217;m writing about why I&#8217;m glad I&#8217;m not a web application developer.</p>
<p>For the experiment we are using the same web application created for my dissertation research, with a few small tweaks, and a new set of materials. Whenever you&#8217;re doing a study that involves participants using a prototype or other system built specifically for the experiment, it is imperative to do a lot of testing. The last thing you want is for the results of the study to reflect bugs or usability problems and not the actual phenomena of interest. So, before using the experiment app for my dissertation research, I set aside plenty of time for testing and recruited people to bang on the system and try to break it.</p>
<p>This time around, the tweaks to the system were so minor that I basically tested use cases that involved the new features, and nothing else. I figured not much had changed, so I could assume what worked before would still be working. This, as it turns out, is an assumption that doesn&#8217;t hold true in the wonderful world of web application development. With a web application, it isn&#8217;t just the application code itself you have to worry about. About a year has gone by since my initial data collection, and in that time web browsers have gone through several rounds of updates and major releases. Also, we&#8217;re using a different web server this time around. And finally, there&#8217;s been an update to one of the toolkits the application uses for the file-and-folder interface. So in reality, a LOT has changed from a year ago.</p>
<p>Fortunately, in the first experiment session we uncovered a minor &#8220;<a href="http://en.wikipedia.org/wiki/Race_condition">race condition</a>&#8221; bug that hadn&#8217;t presented itself in either my dissertation data collection or the testing for this experiment (I say &#8220;fortunately&#8221; because we discovered the problem early). In a web application, a race condition can arise when multiple related (but separate) requests are sent from the client to the web server and the result depends on the order in which they are processed. Because these are *separate* requests, there&#8217;s no explicit sequencing, and unpredictable or undesirable application behavior can result if/when these requests are processed in the wrong order. This was a simple bug to fix, and so far no other bugs have presented themselves.</p>
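<p>To make the ordering problem concrete, here is a minimal Python sketch. It is not the experiment app&#8217;s code: the <code>FakeServer</code> class, the folder and file names, and the delays are all made up for illustration, and the delays simply stand in for network and processing latency. Two related requests are sent at once, the second one happens to be processed first, and it fails. Sending the second request only after the first has completed removes the problem.</p>

```python
import asyncio


class FakeServer:
    """Hypothetical stand-in for the web server; it only tracks folders and files."""

    def __init__(self):
        self.folders = set()
        self.files = {}

    async def handle(self, action, name, folder=None, delay=0.0):
        # The delay stands in for network and processing latency.
        await asyncio.sleep(delay)
        if action == "create_folder":
            self.folders.add(name)
            return "ok"
        if action == "move_file":
            if folder not in self.folders:
                return "error: no such folder"
            self.files[name] = folder
            return "ok"
        return "error: unknown action"


async def unsequenced():
    # Both requests are in flight at once. The move has the shorter delay,
    # so it is processed before the folder exists.
    server = FakeServer()
    results = await asyncio.gather(
        server.handle("create_folder", "papers", delay=0.05),
        server.handle("move_file", "draft.txt", folder="papers", delay=0.01),
    )
    return list(results)


async def sequenced():
    # The second request is only sent after the first one has completed.
    server = FakeServer()
    first = await server.handle("create_folder", "papers", delay=0.05)
    second = await server.handle("move_file", "draft.txt", folder="papers", delay=0.01)
    return [first, second]


def run_demo():
    return asyncio.run(unsequenced()), asyncio.run(sequenced())
```

<p>In this toy version the unsequenced run reports an error for the move, while the sequenced run succeeds for both requests. In a real application the delays vary from run to run, which is why a bug like this can stay hidden through a whole round of data collection.</p>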
<p>The reason I am glad I&#8217;m not a web application developer is that, with all these infrastructural components that can change (browsers, servers, toolkits&#8230;), keeping a web application working seems like hitting a moving target. Firefox 3.6 included optimizations to <a href="http://hacks.mozilla.org/2010/01/javascript-speedups-in-firefox-3-6/">speed up JavaScript</a>, for example, which may have contributed to the race condition bug in the experiment app. A new version of Internet Explorer was released, and the toolkit the experiment app uses also released a new version with changes based on the changes to IE. It amazes me that Gmail and all those other web apps I use on a daily basis continue to work at all!</p>
<p>So my advice to anyone considering using a home-grown web application in their research is this: come up with a <a href="http://en.wikipedia.org/wiki/Test_suite">test suite</a>, document it, and run through all the test cases *every time* you intend to use the application in a new study, even if the application itself hasn&#8217;t changed.</p>
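<p>A documented test suite does not have to be elaborate. As a rough sketch, assuming each test case can be written as a small function that returns true when the behavior is correct, a checklist runner in Python might look like the following. The check names and the placeholder lambdas are hypothetical; real checks would drive the actual application, for example through a browser automation tool.</p>

```python
def run_checklist(checks):
    """Run every named check and return the names of the ones that failed."""
    failures = []
    for name, check in checks.items():
        try:
            passed = bool(check())
        except Exception:
            # A check that crashes counts as a failure rather than stopping the run.
            passed = False
        if not passed:
            failures.append(name)
    return failures


# Hypothetical checks; the lambdas are placeholders for real test steps.
checks = {
    "create folder": lambda: True,
    "move file into folder": lambda: True,
    "rename file": lambda: 1 / 0,  # simulates a check that crashes
}
failed = run_checklist(checks)
print("failed checks:", failed)
```

<p>The point of keeping the checks in one named list is that the whole list gets run before every study, not just the cases that touch new features.</p>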
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2010/02/25/web-applications/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>large datasets and threats to validity</title>
		<link>http://bierdoctor.com/2010/02/23/large-datasets-and-threats-to-validity/</link>
		<comments>http://bierdoctor.com/2010/02/23/large-datasets-and-threats-to-validity/#comments</comments>
		<pubDate>Tue, 23 Feb 2010 16:05:29 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[analysis]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[in the news]]></category>
		<category><![CDATA[research design]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=464</guid>
		<description><![CDATA[I just read &#8220;Limits of Predictability in Human Mobility&#8221; by Chaoming Song, et al. (Science, Vol. 327, 2010). The paper reports an analysis of a really amazing dataset: three months of cell phone records for ~10 million customers of a large European carrier (&#8220;anonymized by the data source&#8221;). These records include information about the cell [...]]]></description>
			<content:encoded><![CDATA[<p>I just read &#8220;<a href="http://www.sciencemag.org/cgi/content/abstract/327/5968/1018">Limits of Predictability in Human Mobility</a>&#8221; by Chaoming Song, et al. (Science, Vol. 327, 2010). The paper reports an analysis of a really amazing dataset: three months of cell phone records for ~10 million customers of a large European carrier (&#8220;anonymized by the data source&#8221;). These records include information about the cell phone towers people connected to when they placed calls, what time calls were placed, and how long the calls lasted.</p>
<p>The research question/motivation for the analysis stated in the paper is, &#8220;What is the role of randomness in human behavior and to what degree are individual human actions predictable?&#8221; The main finding is that people&#8217;s daily mobility patterns are very predictable, even when they travel quite far on a daily basis.</p>
<p>I suppose researchers have not been able to model human mobility quite like this before&#8212;that&#8217;s a LOT of data, even the sample of 50,000 they used for the analysis reported in this paper. But I&#8217;m not sure I understand the claim that this finding is surprising: &#8220;Yet it is not the 93% predictability that we find the most surprising. Rather, it is the lack of variability in predictability across the population.&#8221; How surprising is it that humans are creatures of routine and habit? What was more surprising to me about this paper was the idea that &#8220;current models of human activity are fundamentally stochastic&#8221;, i.e. the previous status quo was the assumption that there is an important random component to human activity.</p>
<p>The paper seems to be saying &#8220;look, the assumptions made by people who study this kind of thing are WRONG, and we can prove it&#8221;. I typically like those kinds of papers. But my feeling about this one is that, while it is useful to know that people&#8217;s mobility patterns aren&#8217;t random if you study these kinds of things, it also seems like a giant WELL, DUH. I suspect this paper is receiving attention because of the sexy dataset, and not because it presents counterintuitive results. (I first heard about the paper on Twitter.)</p>
<p>For me, the most interesting aspect of the paper is how it glosses over what could very well be a huge sampling bias. This is not unique to this particular paper&#8212;many analyses of large &#8220;social computing&#8221; datasets have similar sampling biases due to technical constraints or dataset limitations. When one analyzes an existing dataset, one must make do with the information one is given. (A related example is using <a href="http://www.ip2location.com/">IP geolocation</a> to identify users&#8217; locations when the only information you have about them is their IP address&#8212;there&#8217;s systematic bias in that data for sure, but in many cases there&#8217;s just no better way to do it.)</p>
<p>So for example, the cell phone mobility dataset contains location information only for instances when phone calls were placed (or received?)&#8212;i.e., the phone had to be in communication with a tower for the tower location to be recorded. This means the dataset contains locations for people *only when they&#8217;re using their phones*. If a person doesn&#8217;t make any calls, their location is not captured in the dataset. Are there systematic differences in mobility behavior between people who make calls and those who don&#8217;t? I&#8217;m someone who makes maybe one phone call a day, although I send a lot of text messages and use the packet data service quite a lot. Could it be possible that people who make a lot of calls have more predictable mobility patterns? While I am *sure* this possibility has occurred to the authors, the paper doesn&#8217;t address that question. I also wonder what systematic differences exist between people who have cell phones and those who don&#8217;t. But the paper doesn&#8217;t include a discussion of sampling bias, or any other potential threats to validity.</p>
<p>This is a big problem, I think, in the reporting of results from super large datasets. The datasets are SO big that they are thought of as more like population data than sample data, and the results are taken for &#8220;truth&#8221; without being subjected to appropriate scrutiny. Take this blurb written about the article, from an NPR story:</p>
<blockquote><p>A new study used cell phone billing data for 50,000 people in a European country to show that people&#8217;s travel patterns are extremely predictable. That&#8217;s true for both homebodies and jet setters. Regardless of age, language group, etc, people&#8217;s movements were predictable 93 percent of the time. The study shows the emerging power of using cell phone data for social science research. (from <a href="http://www.npr.org/templates/story/story.php?storyId=123879603">http://www.npr.org/templates/story/story.php?storyId=123879603</a>)</p></blockquote>
<p>I think it is extremely important when reporting analyses of large datasets to be exceedingly clear about issues like sampling bias and generalizability, and I&#8217;d like to see a requirement that papers address these issues. For example, this particular paper might have reported statistics on what proportion of the dataset had to be excluded due to lack of location data. Or, the authors could have undertaken a secondary data collection to try to find out whether those excluded people differed from the analyzed sample in some systematic way.</p>
<p>I&#8217;m not saying I think the findings of this particular paper are invalid&#8212;the results make perfect sense, and perhaps that&#8217;s why the paper doesn&#8217;t even mention threats to validity. But then, how is the finding counterintuitive if it makes so much sense we don&#8217;t even question it a little bit?</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2010/02/23/large-datasets-and-threats-to-validity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>google buzz</title>
		<link>http://bierdoctor.com/2010/02/14/google-buzz/</link>
		<comments>http://bierdoctor.com/2010/02/14/google-buzz/#comments</comments>
		<pubDate>Sun, 14 Feb 2010 21:08:14 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[advice]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[in the news]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[tangential]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=448</guid>
		<description><![CDATA[I am dismayed by the way Google has rolled out Buzz, and I am not alone. Many bloggers and news organizations have raised issues with Google&#8217;s misguided assumption that email contacts form the same kind of social network as users of Facebook and Twitter (etc.) have built up over time. For example, a NY Times [...]]]></description>
			<content:encoded><![CDATA[<p>I am dismayed by the way Google has rolled out Buzz, and I am not alone. Many bloggers and news organizations have raised issues with Google&#8217;s misguided assumption that email contacts form the same kind of social network as users of Facebook and Twitter (etc.) have built up over time. For example, a NY Times article, <a href="http://www.nytimes.com/2010/02/13/technology/internet/13google.html">Critics Say Google Invades Privacy With New Service,</a> makes the following point:</p>
<blockquote>
<div>
<p>“People thought what they had was an address book for an e-mail program, and Google decided to turn that into a friends list for a new social network,” said Marc Rotenberg, executive director of the Electronic Privacy Information Center, an advocacy group in Washington. “E-mail is one of the few things that people understand to be private.”</p>
<p>Mr. Rotenberg said that his organization planned to file a complaint with the Federal Trade Commission claiming that Google’s use of e-mail conversations to build a social network was unfair and deceptive.</p>
</div>
</blockquote>
<div>
<p>I use Gmail and many other Google products. In fact, several times a week I get unsolicited email from strangers that is NOT spam &#8212; it is more like &#8220;wrong number&#8221; email. I suppose Google Buzz would include those people in my social network, eh?</p>
<p>Whenever I thought about all the data about me that was in Google&#8217;s possession, I always felt a twinge of discomfort. But I believed them when they said protecting my privacy was of the utmost importance. In fact, Google lists five privacy principles on its <a href="http://www.google.com/privacy.html">Privacy Center</a> webpage, that sound pretty good:</p>
</div>
<blockquote>
<div>1. Use information to provide our users with valuable products and services.</div>
<div>2. Develop products that reflect strong privacy standards and practices.</div>
<div>3. Make the collection of personal information transparent.</div>
<div>4. Give users meaningful choices to protect their privacy.</div>
<div>5. Be a responsible steward of the information we hold.</div>
</blockquote>
<div>Unfortunately, it seems to me that Google has violated pretty much all of their privacy principles with the rollout of Buzz. I rationalized my discomfort with allowing Google access to pretty much every type of private, personal data I can think of by telling myself that they could be trusted with this responsibility.
</div>
<div>However, their choice to jump-start critical mass for Buzz seems to have been motivated by a desire to compete with Twitter and Facebook, NOT to provide a valuable service while protecting privacy. Disappointing, to say the least. I no longer feel like I can trust Google with my data. I wonder how many other people feel this way too, and how much time it would take to extract myself from all the Google services I use&#8230;
</div>
<div>If you want to stop using Buzz, Gmail Help has <a href="http://mail.google.com/support/bin/answer.py?hl=en&amp;answer=171460">some instructions</a>, which have changed at least once in the past 24 hours as Google responds to the public outcry (<a href="http://bierdoctor.com/website/wp-content/uploads/2010/02/disabling_buzz-feb-12.png">Feb 12 2010</a> | <a href="http://bierdoctor.com/website/wp-content/uploads/2010/02/disabling-buzz_feb-13.png">Feb 13 2010</a>). Simply hiding the Buzz link in Gmail is NOT enough &#8212; the key is modifying one&#8217;s Google Profile in 4 steps, or deleting the profile altogether. And for those of you who have a public Google <em>Groups</em> profile, this seems to be a <em>separate</em> profile from the Google capital-P Profile. Confusing? You betcha.</div>
Copyright &copy; 2010 <strong><a href="http://bierdoctor.com/">Emilee Rader</a></strong>]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2010/02/14/google-buzz/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>happy new year!</title>
		<link>http://bierdoctor.com/2010/01/01/happy-new-year-2/</link>
		<comments>http://bierdoctor.com/2010/01/01/happy-new-year-2/#comments</comments>
		<pubDate>Sat, 02 Jan 2010 05:04:14 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[reflection]]></category>
		<category><![CDATA[visualization]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=431</guid>
		<description><![CDATA[Continuing the yearly tradition I started last January, here is my 2009 word cloud, made using wordle.net from the text of the blog posts I wrote in 2009. The most common words don&#8217;t look too different from the 2008 wordle! 

It&#8217;s an SVG image, so if you&#8217;re having problems viewing it try using a more [...]]]></description>
			<content:encoded><![CDATA[<p>Continuing the yearly tradition I started last January, here is my 2009 word cloud, made using <a href="http://wordle.net">wordle.net</a> from the text of the blog posts I wrote in 2009. The most common words don&#8217;t look too different from the <a href="http://bierdoctor.com/2009/01/01/210/">2008 wordle</a>! </p>
<p><embed src="http://bierdoctor.com/images/svg/2009wordle.svg" height="350" width="450"></embed></p>
<p>It&#8217;s an SVG image, so if you&#8217;re having problems viewing it try using a more recent browser version. It works for me in both the latest Safari and Firefox. See both the <a href="http://bierdoctor.com/images/svg/madmissionWordle2008-3.svg">2008</a> and <a href="http://bierdoctor.com/images/svg/2009wordle.svg">2009</a> wordles, big.</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2010/01/01/happy-new-year-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>CHI 2010 paper accepted</title>
		<link>http://bierdoctor.com/2009/12/13/chi-2010-paper-accepted/</link>
		<comments>http://bierdoctor.com/2009/12/13/chi-2010-paper-accepted/#comments</comments>
		<pubDate>Sun, 13 Dec 2009 06:42:23 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[conference]]></category>
		<category><![CDATA[publications]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=413</guid>
		<description><![CDATA[I am happy to announce that my submission to CHI 2010, &#8220;The Effect of Audience Design on Labeling, Organizing, and Finding Shared Files&#8221;, was accepted! This paper presents the results from the quantitative part of my dissertation. Here&#8217;s the final version. 
ABSTRACT: In an online experiment, I apply theory from psychology and communications to find out [...]]]></description>
			<content:encoded><![CDATA[<p>I am happy to announce that my submission to <a href="http://www.chi2010.org/">CHI 2010</a>, &#8220;The Effect of Audience Design on Labeling, Organizing, and Finding Shared Files&#8221;, was accepted! This paper presents the results from the quantitative part of my <a href="http://bierdoctor.com/papers/ejr-thesis.pdf">dissertation</a>. Here&#8217;s the <a href="http://bierdoctor.com/papers/rader-chi2010-final.pdf">final version</a>. <a href="http://bierdoctor.com/papers/rader-chi2010-final.pdf"><img class="alignnone" src="http://bierdoctor.com/images/icons/PDF-FILE.GIF" alt="" width="16" height="16" /></a></p>
<p>ABSTRACT: In an online experiment, I apply theory from psychology and communications to find out whether group information management tasks are governed by the same communication processes as conversation. This paper describes results that replicate previous research, and expand our knowledge about audience design and packaging for future reuse when communication is mediated by a co-constructed artifact like a file-and-folder hierarchy. Results indicate that it is easier for information consumers to search for files in hierarchies created by information producers who imagine their intended audience to be someone similar to them, independent of whether the producer and consumer actually share common ground. This research helps us better understand packaging choices made by information producers, and the direct implications of those choices for other users of group information systems.</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2009/12/13/chi-2010-paper-accepted/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>beyond significance testing</title>
		<link>http://bierdoctor.com/2009/12/12/beyond-significance-testing/</link>
		<comments>http://bierdoctor.com/2009/12/12/beyond-significance-testing/#comments</comments>
		<pubDate>Sat, 12 Dec 2009 06:33:42 +0000</pubDate>
		<dc:creator>emilee</dc:creator>
				<category><![CDATA[analysis]]></category>
		<category><![CDATA[methods]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=405</guid>
		<description><![CDATA[I&#8217;ve been reading a book lately before bed, a little bit at a time: Beyond Significance Testing, by Rex B. Kline. It isn&#8217;t exactly a suspenseful page-turner; maybe if I tried reading it some other time of day than when I am already sleepy I might be able to get through it faster.
The purpose of [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been reading a book lately before bed, a little bit at a time: <a href="http://www.amazon.com/Beyond-Significance-Testing-Reforming-Behavioral/dp/1591471184/">Beyond Significance Testing</a>, by Rex B. Kline. It isn&#8217;t exactly a suspenseful page-turner; maybe if I tried reading it some other time of day than when I am already sleepy I might be able to get through it faster.</p>
<p>The purpose of the book is to convince readers that Null Hypothesis Significance Testing (NHST) should no longer be practiced, and to suggest alternatives like using confidence intervals and always reporting effect sizes. I think my favorite quote from the book so far is this one, in a paragraph devoted to the suggestion that one way to fix the NHST problem is just to use more careful, less overreaching language in talking about p-values and significance tests (like phasing out the word &#8220;significant&#8221;):</p>
<blockquote><p>You can put candles in a cow pie, but that does not make it a birthday cake.</p></blockquote>
<p>You can tell what the author thinks of that idea. Ouch.</p>
<p>Part I of the book also makes an interesting argument that NHST is not only bad social science, it is bad FOR social science. The idea is that because p-values are colloquially understood to mean something they actually do not, researchers believe the findings of a single study are more robust and reliable than they actually are. For example, a p-value represents the conditional probability of data at least as extreme as what was observed, given that the null hypothesis is true, NOT the probability that the null hypothesis is true given the data. According to the book, this and other misinterpretations of the logic of significance testing bias the literature toward results that are never replicated, results &#8220;about fad topics that clutter the research literature but have little scientific value&#8221;:</p>
<blockquote><p>&#8230;if one believes that <em>p</em> &lt; .01 implies that the result is likely to be repeated more than 99 times out of 100, why bother to replicate? A related cognitive error is the belief that statistically significant findings should be replicated, but not ones for which [the null hypothesis] was not rejected (F. Schmidt &amp; Hunter, 1997).</p></blockquote>
<p>This bias perpetuates research for which the practical, meaningful significance of the results is not clear.</p>
<p>A lot of these arguments make sense to me&#8212;it seems like the process of NHST hides a lot of the error and uncertainty that are part of doing science, making the results of individual studies seem more definitive and certain than they actually are. I&#8217;m looking forward to making it through the rest of the book and starting to practice the alternatives it suggests.</p>
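<p>To make the difference between the probability of the data given the null hypothesis and the probability of the null hypothesis given the data concrete, here is a minimal sketch in Python. It is my own illustration, not something taken from the book. The function names and the input values (a 50% prior chance that the null hypothesis is true, a significance threshold of .01, and 50% statistical power) are all assumptions chosen only for the example; different assumptions give different numbers.</p>

```python
def prob_null_given_significant(prior_null, alpha, power):
    """P(null is true | the test rejected it), via Bayes' rule.

    prior_null: assumed prior probability that the null hypothesis is true
    alpha:      significance threshold, i.e. P(reject | null is true)
    power:      P(reject | null is false)
    """
    false_positive = alpha * prior_null
    true_positive = power * (1.0 - prior_null)
    return false_positive / (false_positive + true_positive)


def prob_replication(prior_null, alpha, power):
    """P(an identical second study also rejects | the first one rejected)."""
    p_null = prob_null_given_significant(prior_null, alpha, power)
    return p_null * alpha + (1.0 - p_null) * power


if __name__ == "__main__":
    # Hypothetical inputs, chosen only for illustration.
    prior_null, alpha, power = 0.5, 0.01, 0.5
    print("P(null | significant):", prob_null_given_significant(prior_null, alpha, power))
    print("P(replication):", prob_replication(prior_null, alpha, power))
```

<p>Under these made-up inputs, the chance that an identical follow-up study is also significant comes out just under one half, nowhere near the &#8220;99 times out of 100&#8221; that the misreading of <em>p</em> &lt; .01 would suggest.</p>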
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2009/12/12/beyond-significance-testing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>group information repositories</title>
		<link>http://bierdoctor.com/2009/10/28/group-information-repositories/</link>
		<comments>http://bierdoctor.com/2009/10/28/group-information-repositories/#comments</comments>
		<pubDate>Thu, 29 Oct 2009 03:00:19 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[past projects]]></category>
		<category><![CDATA[research]]></category>

		<guid isPermaLink="false">http://bierdoctor.com/?p=384</guid>
		<description><![CDATA[This post is the fourth in a series of posts covering projects I worked on before my current postdoc position. Below I have included the abstract for my dissertation, Social Influences on User Behavior in Group Information Repositories. I have produced two papers from my thesis; one was published in CHI 2009, and the other [...]]]></description>
			<content:encoded><![CDATA[<p>This post is the fourth in a series of posts covering projects I worked on before my current postdoc position. Below I have included the abstract for my dissertation, <em><a href="http://bierdoctor.com/papers/ejr-thesis.pdf">Social Influences on User Behavior in Group Information Repositories</a></em>. I have produced two papers from my thesis; one was <a href="http://portal.acm.org/citation.cfm?id=1518701.1519019">published in CHI 2009</a>, and the other was accepted to <a href="http://www.chi2010.org/">CHI 2010</a>.</p>
<p>Rader, E. (2010). The Effect of Audience Design on Labeling, Organizing, and Finding Shared Files. <em>To appear in CHI 2010</em>. <a href="http://bierdoctor.com/papers/rader-chi2010-final.pdf"><img style="border: 0px initial initial;" src="http://bierdoctor.com/images/icons/PDF-FILE.GIF" alt="" width="16" height="16" /></a></p>
<p>Rader, E. (2009). Yours, Mine, and (Not) Ours: Social Influences on Group Information Repositories. <em>Proceedings of CHI 2009</em>. <a href="http://bierdoctor.com/papers/rader-ctools-chi-v5.pdf"><img style="border: 0px initial initial;" src="http://bierdoctor.com/images/icons/PDF-FILE.GIF" alt="" width="16" height="16" /></a></p>
<p><strong>Dissertation Abstract</strong>: Group information repositories are systems for organizing and sharing files kept in a central location that all group members can access. These systems are often assumed to be tools for storage and control of files and their metadata, not tools for communication. The <em>storage</em> approach focuses on providing users with detailed information about the objects in the system&#8212;where they are, which users have been looking at them, how they&#8217;ve been used in the past, etc. However, group information repositories tend to grow and become disorganized over time, such that users have difficulty finding what they need. A different approach is to think of these systems as <em>social</em> tools that could be governed by the same processes as face-to-face communication, like grounding and audience design.</p>
<p>The purpose of this research is to better understand user behavior in group information repositories, and to determine whether social factors might shape users&#8217; choices when labeling and organizing information. While the functionality and capabilities of these systems are essentially the same as the desktop metaphor of personal information management (PIM) systems, I argue that social pressures and processes affect the information structure of the repository, and how it grows and evolves over time. Through a series of interviews with users of a typical group information repository system and an analysis of system log data, I found that users tend to restrict their activities in a repository to files they &#8220;own&#8221;, are reluctant to delete files that could potentially be useful to others, dislike the clutter that results, and can become demotivated if no one views files they uploaded.</p>
<p>I also conducted a two-part online experiment in which participants labeled and organized short text files into a file-and-folder hierarchy. Eighty-four participants were recruited from two intellectual communities (41 Computer Science graduate students, and 43 Information Science graduate students), such that some participants would share community membership common ground with each other, and some would not. Participants were instructed to organize the files for one of three different audiences: themselves, someone from the same intellectual community, and someone from the other community. Forty-eight participants returned four to six weeks later and completed a series of search tasks, in which they browsed hierarchies created by other participants to find specific files. Including both labeling/organizing and finding tasks in the experiment allowed me to detect potential performance differences when participants searched hierarchies created by others from the same community (or not), and tailored for different audiences. I found that when participants created hierarchies for an audience they imagined was like them, everyone found files in fewer clicks, regardless of whether they were from the same community as the person who created the hierarchy. Further, quantitative analyses of three aspects of the hierarchies (topology, vocabulary, and semantics) helped to explain these results. Users performed better when file and folder labels were more similar to the text of the documents they represented; this correlation was significantly stronger when participants organized the documents for someone who was similar to them.</p>
<p>These results confirm that <em>audience design</em>, a communication process, can in fact impact group information management tasks. The findings from both studies suggest that sharing files via a group information repository is more complicated than simply making them available on a server so that others might access them. My research indicates that processes which have been shown to affect spoken communication also impact word choices when the &#8220;interaction&#8221; is mediated by a repository. Social factors affect users&#8217; choices regarding how files in the repository are organized and labeled and what information is retained over time; this in turn affects access to information. Knowing that repositories are social systems will allow system designers to incorporate information that makes the users more salient and familiar to each other, so the process of negotiating shared meaning is better supported by the repository system.</p>
]]></content:encoded>
			<wfw:commentRss>http://bierdoctor.com/2009/10/28/group-information-repositories/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
