# Cluster vs Stratified Sampling

I'm working on a homework assignment here, but I'm not actually looking for an answer to the problem, just a little help understanding which approach to take. My team is tasked with building a detailed sampling plan, but we're arguing about which approach to use. I was hoping I could get some insight as to why one method is better than the other for this situation.

We've narrowed it down to either stratified or cluster sampling, but there is a debate within my team as to whether our data sets are homogeneous or heterogeneous. In our population we have thousands of students staying in various university dormitories, and we want to determine how long they remain resident there on average. The dormitories are grouped into six regions around campus (north campus dormitories, south campus dormitories, etc.).

Now, I've been over our textbook quite a bit, and, using Wikipedia as a starting point, I've also been through various articles on stratified sampling and cluster sampling. As I mentioned above, the problem seems to come down to whether or not my sample is homogeneous.

Looked at one way, cluster sampling seems correct, because we have a heterogeneous group (students in different degree programs), but homogeneous average stay lengths (everyone's in a bachelor's program which is typically four years long).

But looked at another way, stratified sampling seems correct, because we have a completely homogeneous population: there is insufficient heterogeneity between the students, since they're all in four-year degree programs.

My general feeling is that we should go with stratified sampling (specifically, diversified) because of all the shortcomings of cluster sampling that you read about everywhere.

The biggest problem I've run into in researching these two methods is just understanding what exactly constitutes homogeneity in the data. Any insight or suggestions would be appreciated. Thanks!

I'm not sure if you already solved this problem, given the date you posted this, but I will help you out anyway. Given the info above, you are probably much better off using stratified sampling, because there is likely a lot of variation in the students' permanent addresses, degrees, economic status, loan status, sex, race, etc. (I may be wrong on this; I would need to see the data to be certain.) Stratifying the sample will allow you to make comparisons between those groups much more easily. The problem I see with clustering is that you might end up with sampling bias if the handful of regions you happen to select do not represent the rest of campus.
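As for what "homogeneity" means here, one common way to make it concrete is to split the total variation in stay length into a within-group part and a between-group part. The notation below is my own assumption for illustration, not something from your textbook: $y_{hi}$ is the stay length of student $i$ in region $h$, $N_h$ is the number of students in region $h$, $\bar{y}_h$ is the region mean, and $\bar{y}$ is the overall mean.

$$
\sum_{h}\sum_{i=1}^{N_h}\left(y_{hi}-\bar{y}\right)^2
= \underbrace{\sum_{h}\sum_{i=1}^{N_h}\left(y_{hi}-\bar{y}_h\right)^2}_{\text{within regions}}
+ \underbrace{\sum_{h} N_h\left(\bar{y}_h-\bar{y}\right)^2}_{\text{between regions}}
$$

Stratification pays off when the within part is small relative to the between part, i.e. students inside a region look alike and the regions differ from each other. Cluster sampling costs little precision in the opposite case, when each region looks like a miniature copy of the whole campus. For one-stage cluster sampling with clusters of equal size $m$, the usual rough guide is the design effect

$$
\text{deff} \approx 1 + (m-1)\,\rho ,
$$

where $\rho$ is the intraclass correlation of stay lengths within a cluster. With $\rho$ near zero you lose almost nothing relative to a simple random sample of the same size; with large clusters, even a small positive $\rho$ inflates the variance considerably.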

Thanks, I appreciate the input. We've decided to go with a stratified model.
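For anyone who finds this thread later, here is a rough sketch of how the two designs differ in practice. Everything in it is an assumption made up for illustration: the region names other than north and south, the 500 students per region, the simulated stay lengths, and the helper names (`make_population`, `stratified_sample`, `cluster_sample`, `spread`). It is not our actual data or our final sampling plan.

```python
import random
import statistics


def make_population(seed=0):
    """Build a synthetic campus: 6 regions, 500 students each (hypothetical)."""
    rng = random.Random(seed)
    regions = ["north", "south", "east", "west", "central", "hill"]
    population = {}
    for k, region in enumerate(regions):
        # Assumed: the typical stay (in years) drifts a little from region to region.
        mean_stay = 3.0 + 0.2 * k
        population[region] = [max(0.5, rng.gauss(mean_stay, 1.0)) for _ in range(500)]
    return population


def stratified_sample(population, n_total, rng):
    """Proportional allocation: draw from every region in proportion to its size."""
    total = sum(len(stays) for stays in population.values())
    sample = []
    for stays in population.values():
        n_h = round(n_total * len(stays) / total)
        sample.extend(rng.sample(stays, n_h))
    return sample


def cluster_sample(population, n_clusters, rng):
    """One-stage cluster sample: pick whole regions at random, keep everyone in them."""
    chosen = rng.sample(sorted(population), n_clusters)
    return [stay for region in chosen for stay in population[region]]


def spread(draw, reps=200, seed=1):
    """Standard deviation of the estimated mean stay over repeated samples."""
    rng = random.Random(seed)
    estimates = [statistics.mean(draw(rng)) for _ in range(reps)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    pop = make_population()
    strat_sd = spread(lambda rng: stratified_sample(pop, 120, rng))
    clust_sd = spread(lambda rng: cluster_sample(pop, 2, rng))
    print(f"stratified: spread of estimated mean = {strat_sd:.3f}")
    print(f"cluster:    spread of estimated mean = {clust_sd:.3f}")
```

In this made-up setup the regions differ in their average stay, so an estimate built from two whole regions swings a lot depending on which two are drawn, even though it includes far more students than the 120 in the stratified sample. If the regions really were interchangeable, the gap would shrink, which is why the within-region versus between-region question matters more than the labels "homogeneous" and "heterogeneous" on their own.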