Usernames are not included. Rnw Sweave document was compiled using the knitr package.


For instance, this analysis by OkCupid preselected a fixed number of men and women. Ethnicity of OkCupid users: ethnicity as reported below is too complex for analysis.

It's hard to find comparative references elsewhere for this information. Cognitive ability is found to be negatively related to all measures of religious belief (latent correlations).

Pulling Data FindUsers. To further validate the dataset, we examined the relationship between Zodiac sign and every other variable. Her body type is average and she is 5' 4" tall.

Fitted probabilities p-hat of each user being female, along with a decision threshold in red used to predict whether a user is female or not. I pulled it from some CMU on the OkCupid, but I forget where. The above outcomes translate into a coherent story about the male and female MPHs. He eats mostly anything, drinks socially, but never takes drugs.
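The fitted probabilities and decision threshold described above can be sketched as follows. This is a minimal illustration, not the original authors' code: the function name `fit_logistic`, the tiny height sample, and the 0.5 threshold are all assumptions made for the example, and the fit uses plain gradient descent rather than whatever routine the original analysis used.

```python
import math

def fit_logistic(heights, is_female, lr=0.1, epochs=5000):
    """Fit P(female | height) by gradient descent on the logistic loss.

    Hypothetical helper for illustration; heights are standardized
    internally so a fixed learning rate behaves reasonably.
    """
    n = len(heights)
    mean = sum(heights) / n
    sd = math.sqrt(sum((h - mean) ** 2 for h in heights) / n)
    xs = [(h - mean) / sd for h in heights]
    b0 = b1 = 0.0
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, is_female):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += p - y
            g1 += (p - y) * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n

    def predict_proba(height):
        z = b0 + b1 * (height - mean) / sd
        return 1.0 / (1.0 + math.exp(-z))

    return predict_proba

# Made-up sample: heights in inches, 1 = female, 0 = male.
heights = [60, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 74]
is_female = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0]

p_hat = fit_logistic(heights, is_female)
THRESHOLD = 0.5  # assumed decision threshold

def predict_is_female(height):
    return 1 if p_hat(height) > THRESHOLD else 0
```

Unlike a straight linear fit, the logistic curve keeps every p-hat between 0 and 1, which is why a single threshold on p-hat is a sensible decision rule.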

Data – the okcupid blog

To meet speed and simplicity goals, I stripped off everything except the first descriptor. This is a key question for any dating site. A stacked bar chart is a visually appealing and useful way to compare the. R - R script to read in profile data, produce a mosaicplot cross-classifying gender and sexual orientation, and produce a histogram of heights split by gender. ReadingLevel.
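As a rough sketch of what keeping only the first descriptor might look like, the snippet below assumes that a multi-valued profile field is a comma-separated string. That format, the function name `first_descriptor`, and the sample values are assumptions for illustration; the original stripping code is not shown in the text.

```python
def first_descriptor(field):
    """Return only the first comma-separated descriptor of a profile field.

    Assumes descriptors are separated by commas; empty or missing
    fields come back as an empty string.
    """
    if not field:
        return ""
    return field.split(",")[0].strip()

# Hypothetical field values.
simplified = [first_descriptor(v) for v in ["asian, white", "white", "", None]]
```

Collapsing each field to a single value keeps the number of categories small, at the cost of discarding information about users who listed more than one descriptor.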

I have used it to compute reading levels for essay responses, plot them, etc., though I don't have any of that code checked in. A majority of OkCupid users are found to be white, with small differences between the male and female populations of other ethnicities.
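Since the reading-level code is not checked in, here is one way such a score could be computed. The choice of the Flesch-Kincaid grade level is an assumption (the text does not say which measure was used), and the syllable counter is a crude vowel-group heuristic rather than a dictionary lookup.

```python
import re

def count_syllables(word):
    """Crude estimate: one syllable per run of consecutive vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level of a text; 0.0 if it has no words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)

# Hypothetical essay fragments.
simple_essay = "I like cats. I like dogs."
dense_essay = ("Exploratory statistical investigation necessitates "
               "considerable methodological sophistication.")
```

Longer sentences and longer words both push the grade up, so the dense fragment scores well above the simple one.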

Exploratory analysis of okcupid dataset

Limitations of the dataset are discussed. How many more men than women use OkCupid?

DOI: Note that both the x-axis (height) and the y-axis (is female: 1 if user is female, 0 if user is male) have random jitter added to better visualize the number of points involved for each height x gender pair.


She speaks English. He speaks only English. She eats mostly anything, drinks socially, but never takes drugs.

Preview

Distribution of Male and Female Heights

Distribution of Sex and Sexual Orientation: a mosaicplot of the cross-classification of the users' sex and sexual orientation.

Logistic Regression to Predict Gender: linear regression in red and logistic regression in blue compared.

Anonymized user ids will map to users in the other file, if the user existed in both.

How do male and female ages stack up? Compute differences between male and female populations by age.
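A minimal sketch of the per-age comparison, assuming each user is available as a `(sex, age)` pair with sex coded as `"m"` or `"f"`. The record layout, the function name `age_differences`, and the sample rows are hypothetical.

```python
from collections import Counter

def age_differences(users):
    """Return {age: male_count - female_count} for every age present.

    `users` is an iterable of (sex, age) pairs with sex "m" or "f".
    """
    users = list(users)
    males = Counter(age for sex, age in users if sex == "m")
    females = Counter(age for sex, age in users if sex == "f")
    ages = sorted(set(males) | set(females))
    return {age: males[age] - females[age] for age in ages}

# Made-up sample rows.
sample = [("m", 25), ("m", 25), ("f", 25), ("f", 30), ("m", 41)]
diffs = age_differences(sample)
```

A positive value means more men than women at that age, a negative value the reverse.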

The okcupid dataset: a very large public dataset of dating site users - openpsych

Julius D. For instance, coverage of the Ashley Madison hack reported that many more men than women used the site. As an example of the analyses one can do with the dataset, a cognitive ability test is constructed from 14 suitable items.

It is tangible and can be compared to existing data. We found scant evidence of any influence (the distribution of p-values from chi-square tests was flat).
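To make the chi-square check concrete, the sketch below computes the Pearson chi-square statistic and degrees of freedom for one pair of categorical variables, for example Zodiac sign against another profile field. The function name `chi_square_statistic` and the sample pairs are assumptions; turning the statistic into a p-value needs a chi-square distribution function from a statistics library, which is left out here to keep the example dependency-free.

```python
from collections import Counter

def chi_square_statistic(pairs):
    """Pearson chi-square statistic and degrees of freedom.

    `pairs` is an iterable of (category_a, category_b) observations,
    one per user. Expected counts come from the row and column totals
    under the independence assumption.
    """
    pairs = list(pairs)
    n = len(pairs)
    row_totals = Counter(a for a, _ in pairs)
    col_totals = Counter(b for _, b in pairs)
    observed = Counter(pairs)
    stat = 0.0
    for row, row_total in row_totals.items():
        for col, col_total in col_totals.items():
            expected = row_total * col_total / n
            stat += (observed[(row, col)] - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof

# Made-up data: sign unrelated to sex, then sign fully determining sex.
unrelated = ([("aries", "m")] * 10 + [("aries", "f")] * 10
             + [("leo", "m")] * 10 + [("leo", "f")] * 10)
related = [("aries", "m")] * 20 + [("leo", "f")] * 20

stat_unrelated, dof_unrelated = chi_square_statistic(unrelated)
stat_related, dof_related = chi_square_statistic(related)
```

Repeating this for Zodiac against every other variable gives one p-value per test; if Zodiac truly has no influence, those p-values should be spread roughly uniformly between 0 and 1, which is what a flat distribution means.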

