Average SFN Member


DJBruce

As excited as I am about the prospect of data analysis, I disagree passionately with your choice of bar graph format.

 

I would have liked to do some other analysis, but the way the data was formatted makes other graphical work hard to do.


The post counts gathered from the survey appear to suggest an approximately bimodal distribution

 

Maybe not. There may be the superficial appearance of bimodality, but I'd be more impressed with that thought if there was more than one response option between the peaks, or if there were any kind of smoother approach to the peaks on any side. (Indeed, it actually appears almost trimodal, given the large peak at 5000-10000). Of course, you quickly recognized all of this, and noted:

 

I believe that this may suggest a bias in the data collected, as a quick look at SFN’s member list suggests a post count distribution that is unimodal and skewed right. This is drawn from the fact that, out of a 1331-page member list, only 27 pages contain members who have more than 100 posts on SFN.1 This bias is more than likely caused by the fact that the most active members, who are therefore more likely to notice and respond to the survey, are also the ones with the most posts.

 

Quite right. Phrased a little differently than the way you did, this is a variable in which, interestingly, the people taking the survey are likely to differ dramatically from the target population, since both # of posts and willingness to take the survey are driven strongly by interest & investment in the site.
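
To make that concrete, here is a rough sketch of how the bias shifts the mean. Everything in it is made up for illustration: the `population` of post counts, and especially `response_weight`, which simply assumes that the chance of answering the survey grows with the number of posts. It computes the expected mean among respondents as a weighted average rather than simulating anything.

```python
# Hypothetical population of post counts: most members post very little.
population = [0] * 70 + [10] * 20 + [200] * 8 + [5000] * 2


def population_mean(posts):
    """Plain average over every member."""
    return sum(posts) / len(posts)


def response_weight(posts):
    """Assumed relative chance of answering the survey (illustrative only)."""
    return 1 + posts


def expected_respondent_mean(posts, weight):
    """Average post count among respondents, weighting each member
    by their relative chance of responding."""
    weights = [weight(x) for x in posts]
    weighted_total = sum(w * x for w, x in zip(weights, posts))
    return weighted_total / sum(weights)


pop_mean = population_mean(population)
resp_mean = expected_respondent_mean(population, response_weight)
print(f"population mean:        {pop_mean:.1f}")
print(f"expected respondent mean: {resp_mean:.1f}")
```

With any weight that increases with post count, the respondent mean lands above the population mean; how far above depends entirely on the assumed weight.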

 

Nonetheless, that sort of thing really explains the difference in sample mean from the population mean more than anything else. The two peaks near the right of the graph pique (puns? really?) my interest a little more, and I don't think they're totally accounted for by what you said. I think what's going on there is actually likely to be some of what we call "anchoring." In short, "more than 1___, more than 5___, or more than 10___" are common heuristics used to guess numbers in societies with base 10 number systems. You see this sort of thing a lot in surveys where continuous variables are assessed ordinally, and the response categories have the sort of numbering system like the one you threw down.

 

 

Bruce, besides hitting us with some basic descriptives, do you have any plans ("do you have enough data/power" may be the better question) to run, perhaps, a few correlations, or anything more?


First, thank you very much for taking the time to look through what I did and for giving some of your thoughts. :)

 

Quite right. Phrased a little differently than the way you did, this is a variable in which, interestingly, the people taking the survey are likely to differ dramatically from the target population, since both # of posts and willingness to take the survey are driven strongly by interest & investment in the site.

 

Very nicely said.

 

Bruce, besides hitting us with some basic descriptives, do you have any plans ("do you have enough data/power" may be the better question) to run, perhaps, a few correlations, or anything more?

 

I have been trying to do a lot of chi-square tests of independence. Sadly, every one I try fails some of the conditions: mainly that the data was not collected through a simple random sample, but more importantly, and frustratingly, that many of the expected values are less than 1. My statistics background only consists of AP Statistics, so at this point I am not sure what test I could possibly do with the limited data that I have. :( It also doesn't help that I collected much of it as categorical data.
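
For anyone following along, the expected-value condition mentioned above can be checked by hand. Below is a minimal sketch, assuming a made-up contingency table `observed` (not the survey data). It uses the standard formula for a test of independence, expected count = row total × column total / grand total, and flags cells that fall below the threshold of 1 named in the post.

```python
def expected_counts(table):
    """Expected cell counts under independence:
    row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical observed counts (rows and columns are two categorical answers).
observed = [
    [8, 3, 1],
    [5, 6, 2],
    [1, 2, 0],
]

expected = expected_counts(observed)

# Collect every cell whose expected count is below 1.
small_cells = [
    (i, j, value)
    for i, row in enumerate(expected)
    for j, value in enumerate(row)
    if value < 1
]

for i, j, value in small_cells:
    print(f"cell ({i}, {j}) has expected count {value:.2f}")
```

Sparse rows and columns are what drive expected counts down, so merging adjacent sparse categories before building the table is one common way to raise them, at the cost of some resolution.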

 

If anyone with more stats experience has any suggestions on possible tests or methods I could try, be my guest and suggest them.


Here's one of the best pieces of advice for selling science you'll ever get: a nicely-colored five-minute plot will always impress people more than the 0.235(4) you pull out of a two-day calculation (except people in your field who can judge the amount of work). Take a correlation you are interested in (e.g. "academic level" and "number of posts") and plot a 2D histogram N(academic level, number of posts).

 

Histograms for dummies (dunno your level so no offense meant): You start with N(...,...)=0. Then, for each question form, you read off A, the bar for academic level, and B, the bar for number of posts. Then, you increase the respective entry in the histogram: N(A,B) -> N(A,B)+1. Repeat that for all forms, then draw the thing.
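
The counting recipe above can be sketched in a few lines. The `forms` list is invented purely for illustration (the category labels are not the survey's real options), and the table is printed as plain text rather than drawn as a colored plot.

```python
from collections import Counter

# Hypothetical survey forms: (academic level, post-count bracket).
forms = [
    ("High school", "0-100"),
    ("Undergraduate", "100-500"),
    ("Undergraduate", "0-100"),
    ("Graduate", "1000-5000"),
    ("Graduate", "100-500"),
    ("Undergraduate", "100-500"),
]


def histogram_2d(pairs):
    """Count how often each (A, B) combination occurs."""
    counts = Counter()           # every cell starts at N(A, B) = 0
    for a, b in pairs:
        counts[(a, b)] += 1      # N(A, B) -> N(A, B) + 1
    return counts


def print_table(counts, rows, cols):
    """Print the counts as a simple text grid, rows by columns."""
    width = max(len(label) for label in rows + cols) + 2
    print("".ljust(width) + "".join(c.ljust(width) for c in cols))
    for r in rows:
        cells = "".join(str(counts[(r, c)]).ljust(width) for c in cols)
        print(r.ljust(width) + cells)


counts = histogram_2d(forms)
print_table(
    counts,
    rows=["High school", "Undergraduate", "Graduate"],
    cols=["0-100", "100-500", "1000-5000"],
)
```

Feeding the same counts into any plotting tool as a heat map gives the colored version; the counting step stays the same.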

 

EDIT: The "B)" smiley is a pain in the ass.

Edited by timo

I added mine even though it is too late. I thought I might help dilute the "scientists are not religious" chestnut.

same here.

think that can be translated to "religious scientists are not welcome enough in science forums to log on at intervals of less than 17 days"

..

just kidding ok, so don't y'all get your panties in a knot.
