plot() and ggplot2 with RStudio

The Vision Problems in the U.S. data set provides state-level estimates of the prevalence of eye disorders among US adults aged 40 and older.

Here are the initial outputs of my work with R in RStudio:

  1. When plot() is invoked on the whole data set in RStudio, it produces a matrix of plots, one panel for each pair of columns. The function ran very slowly, so I limited the plotted data to entries from state: “OHIO”. Most notably, my values for vp (vision problem), age, race and sex are all categorical.
  2. I then installed the ggplot2 library for more customizable plots. The color graph plots age against rate, with color determined by vp (vision problem). Each of my categorical variables has a “total/all” level that is mixed in with the individual state, race, sex or age values rather than separated out.
  3. I then tried a box plot, which did not yield any additional insights.
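
The three steps above can be sketched roughly as follows. This is not the original code. The column names (state, vp, age, race, sex, rate) follow the ones mentioned in this post, but the data frame `vision`, its category labels and all of its values are made up for illustration, and the choice of vp on the box plot's x-axis is an assumption.

```r
library(ggplot2)

# Hypothetical stand-in for the real data set: column names follow the post,
# but every level and every value below is invented for illustration.
vision <- expand.grid(
  state = c("OHIO", "TEXAS"),
  vp    = c("Cataract", "Glaucoma"),
  age   = c("40-49", "50-64", "65+", "All"),
  race  = c("Black", "White", "All"),
  sex   = c("Female", "Male", "All"),
  stringsAsFactors = TRUE
)
set.seed(1)
vision$rate <- round(runif(nrow(vision), min = 0, max = 20), 1)

# Step 1: keep one state only, then call plot() on the data frame.
# With more than two columns this draws a scatterplot matrix; factor
# columns are shown as their integer codes.
ohio <- droplevels(subset(vision, state == "OHIO"))
plot(ohio[, c("vp", "age", "race", "sex", "rate")])

# Step 2: age against rate, coloured by vision problem.
# The "All" age level is still mixed in with the individual age bands.
p_color <- ggplot(ohio, aes(x = age, y = rate, colour = vp)) +
  geom_point()
print(p_color)

# Step 3: a box plot of rate for each vision problem.
p_box <- ggplot(ohio, aes(x = vp, y = rate)) +
  geom_boxplot()
print(p_box)
```

Subsetting before plotting keeps the scatterplot matrix small, which is probably the simplest way around the slow plot() call on the full data.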

Further steps to visualize this data clearly will need to address these questions:

  • What strategies are best to see averages for “all” categorical data alongside individual categorical data (e.g. female, or white, or 55-64 yrs)?
  • How can multiple categorical variables be presented at the same time (40-50yr Hispanic male vs. 51-60yr Black female vs. etc.)? Shapes, colors, other?
  • Should the data be presented as exploratory or tailored to meet a specific idea or perspective?
  • Which of these diseases/disorders result in a prescription for eyeglasses or vision correction?
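
One possible approach to the first two questions, offered as a sketch rather than an answer: pull the “all” rows out into their own data frame and draw them as a reference line, then show the individual groups using color, shape and one panel per vision problem. The data frame `vision`, its category labels and its values are again made up, and the assumption that totals are stored as rows labeled "All" may not match the real file.

```r
library(ggplot2)

# Hypothetical data: invented levels and values, with totals stored as
# rows labeled "All" (an assumption about how the real file is laid out).
vision <- expand.grid(
  vp   = c("Cataract", "Glaucoma"),
  age  = c("40-49", "50-64", "65+"),
  race = c("Black", "Hispanic", "White", "All"),
  sex  = c("Female", "Male", "All"),
  stringsAsFactors = FALSE
)
set.seed(2)
vision$rate <- round(runif(nrow(vision), min = 0, max = 20), 1)

# Separate the overall rows from the individual race-by-sex groups.
totals <- vision[vision$race == "All" & vision$sex == "All", ]
groups <- vision[vision$race != "All" & vision$sex != "All", ]

# Individual groups: colour encodes race, shape encodes sex, and each
# vision problem gets its own panel. The dashed line is the "All" rate.
p <- ggplot(groups, aes(x = age, y = rate, colour = race, shape = sex)) +
  geom_point(size = 2.5) +
  geom_line(
    data = totals,
    aes(x = age, y = rate, group = 1),
    inherit.aes = FALSE,
    colour = "grey40",
    linetype = "dashed"
  ) +
  facet_wrap(~ vp)
print(p)
```

Color and shape together stay readable for only a handful of levels each. Beyond that, moving a variable into facets (for example with facet_grid(sex ~ vp)) is usually easier to read than adding another visual encoding.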
