$$Z(\theta) = \sum_{\xi} \exp\left( \sum_{i=1}^{D} \theta_i \, \mu_i(\xi) \right)$$

Note that the sum over ξ in the partition function runs over all possible configurations ξ, not just those that have been observed. This makes computing the partition function intractable, so we must approximate it. Following a sampling-based learning technique [link], we obtain:

$$\ln Z(\theta) \approx \ln\left( \frac{1}{T} \sum_{t=1}^{T} \exp\left( \sum_{i=1}^{D} \left(\theta_i - \theta_i^0\right) \mu_i(\xi_t) \right) \right) + \ln Z(\theta^0)$$

where θ^0 is some set of parameters from which T samples are drawn (T = 20 in our case). Since ln Z(θ^0) is a constant, we can leave it out of the optimization's objective function, and we solve the MLE problem via gradient ascent.
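As an illustration, here is a minimal sketch of this estimator in Python (NumPy/SciPy); the function name and the precomputed feature matrix are our assumptions, not part of the original text:

```python
import numpy as np
from scipy.special import logsumexp

def log_partition_estimate(theta, theta0, mu_samples, log_Z0=0.0):
    """Sampling-based estimate of ln Z(theta).

    theta, theta0 : (D,) parameter vectors
    mu_samples    : (T, D) array; row t holds mu(xi_t) for a sample xi_t
                    drawn from the model at theta0 (T = 20 in the text)
    log_Z0        : ln Z(theta0); a constant, so it may be dropped when
                    optimizing over theta
    """
    w = mu_samples @ (theta - theta0)   # (T,) log-ratios of the two models
    T = mu_samples.shape[0]
    # ln( (1/T) * sum_t exp(w_t) ) + ln Z(theta0), computed stably
    return logsumexp(w) - np.log(T) + log_Z0
```

Differentiating the sampled objective yields the gradient-ascent updates: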

$$\nabla \theta_i^{\,t-1} = \sum_{j=1}^{13} \mu_i(\xi_j) \,-\, 13 \cdot \frac{\displaystyle \sum_{t=1}^{T} \mu_i(\xi_t) \exp\left( \sum_{r=1}^{D} \left(\theta_r^{t-1} - \theta_r^{t-2}\right) \mu_r(\xi_t) \right)}{\displaystyle \sum_{t=1}^{T} \exp\left( \sum_{r=1}^{D} \left(\theta_r^{t-1} - \theta_r^{t-2}\right) \mu_r(\xi_t) \right)}$$

$$\theta^t = \theta^{t-1} + s \cdot \nabla \theta^{t-1}$$

where s is some small step size. We update θ^0 on each iteration to be θ^{t-2}, because the partition function approximation is only reasonable in a neighborhood of θ^0 [link]. It follows that the ξ's indexed by t are drawn from a model with parameters θ^{t-2}, while the ξ's indexed by j still represent the historical data.
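Putting the update rule together, a compact sketch of the loop follows, assuming a hypothetical sampler draw_samples(theta, T) that returns the (T, D) feature matrix of T draws from the model at theta; all names are illustrative:

```python
import numpy as np

def fit_mrf(mu_hist, draw_samples, D, T=20, s=1e-3, iters=100):
    """Gradient ascent on the sampled MLE objective.

    mu_hist      : (13, D) features mu_i(xi_j) of the historical elections
    draw_samples : callable(theta, T) -> (T, D) features of T samples
                   drawn from the model at theta (hypothetical sampler)
    """
    theta_prev = np.zeros(D)          # theta^{t-2}: the sampling point theta^0
    theta = np.zeros(D)               # theta^{t-1}
    data_term = mu_hist.sum(axis=0)   # sum_j mu_i(xi_j) over the 13 elections
    n = mu_hist.shape[0]              # 13
    for _ in range(iters):
        mu_s = draw_samples(theta_prev, T)           # (T, D) sample features
        w = np.exp(mu_s @ (theta - theta_prev))      # (T,) importance weights
        model_term = (w[:, None] * mu_s).sum(axis=0) / w.sum()
        grad = data_term - n * model_term
        theta_prev, theta = theta, theta + s * grad  # refresh theta^0 each step
    return theta
```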

Correcting for lack of data

Given the small number of historical observations (13) and the large number of possible state combinations for any edge (60 states × 60 states = 3,600 combinations), we need a more compact way to learn the relationships between counties. To that end, we look not at the absolute voting percentages of counties but at the difference in voting percentage between each pair of neighboring counties. This approach has the added benefit of factoring out overall shifts that affect every county. Unfortunately, there are still 119 possible differences (-59, -58, ..., 0, ..., 58, 59) and only 13 elections from which to estimate the frequency of each difference. We therefore place each difference into a cluster, e.g. [-9, -6]. We use 11 clusters in total, and since the differences between counties are fairly consistent across years, the 13 observations should suffice to approximate the marginal probabilities for each edge, as sketched below. These approximation techniques do not change how we solve the problem via gradient ascent; however, once gradient ascent finishes, we must convert our small θ into standard long form (as displayed in Section 2.1).
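As a concrete sketch of the bucketing step in Python: the exact cluster boundaries are not given in the text, so the bin edges below are an illustrative assumption, chosen only so that one bin matches the example cluster [-9, -6]:

```python
import numpy as np

# Illustrative bin edges (an assumption; the text does not list them).
# 10 edges define 11 clusters; the bin [-9, -5) covers the integer
# differences -9..-6, matching the example cluster [-9, -6].
edges = np.array([-21, -13, -9, -5, -1, 2, 6, 10, 14, 22])

def cluster_of(diff):
    """Map a neighbor vote-percentage difference (-59..59) to a cluster 0..10."""
    return int(np.digitize(diff, edges))

# Placeholder usage (values are illustrative, not actual county data):
diffs = [-8, -7, -9, -6, -8, -10, -7, -5, -9, -8, -7, -6, -8]  # 13 elections
counts = np.bincount([cluster_of(d) for d in diffs], minlength=11)
p_edge = counts / counts.sum()  # approximate cluster marginals for this edge
```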

Performing MAP inference

Due to our approximation techniques in the learning process, we face a problem when attempting to predict the 2012 election. Since the entire model is built on relative differences, any outcome for a particular county is equally likely as long as the rest of the model shifts with it. To ensure we do not get extremely low or high results, we must fix some subset of the counties as a starting point for the model. To do this, we use linear regression techniques (as discussed in the next section). Once the model is partially filled in, we solve the binary program stated above with our learned θ (in standard long form) via Gurobi Optimizer.
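As an illustration, a MAP query of this form can be posed to Gurobi through its gurobipy interface. The sketch below is not the paper's exact binary program (which is not reproduced here); it is a generic linearization of pairwise MRF MAP inference, and every name (map_inference, the theta layout, fixed) is our assumption:

```python
import gurobipy as gp
from gurobipy import GRB

def map_inference(counties, edges, theta, K, fixed):
    """MAP assignment of a state k in {0..K-1} (K = 60 here) to each county.

    counties : list of county ids
    edges    : list of neighboring pairs (c1, c2)
    theta    : dict {(c1, c2): (K, K) array} of learned edge weights
               (the "standard long form" of the text)
    fixed    : dict {county: k} of counties pinned by the regression step
    """
    m = gp.Model("election_map")
    x = m.addVars(counties, range(K), vtype=GRB.BINARY, name="x")
    y = m.addVars(((c1, c2, k1, k2) for (c1, c2) in edges
                   for k1 in range(K) for k2 in range(K)),
                  vtype=GRB.BINARY, name="y")

    # Each county takes exactly one state.
    m.addConstrs(x.sum(c, "*") == 1 for c in counties)
    # Edge indicators must agree with the county indicators.
    m.addConstrs(y.sum(c1, c2, k1, "*") == x[c1, k1]
                 for (c1, c2) in edges for k1 in range(K))
    m.addConstrs(y.sum(c1, c2, "*", k2) == x[c2, k2]
                 for (c1, c2) in edges for k2 in range(K))
    # Pin the counties fixed via linear regression.
    for c, k in fixed.items():
        m.addConstr(x[c, k] == 1)

    m.setObjective(gp.quicksum(theta[c1, c2][k1, k2] * y[c1, c2, k1, k2]
                               for (c1, c2) in edges
                               for k1 in range(K) for k2 in range(K)),
                   GRB.MAXIMIZE)
    m.optimize()
    return {c: next(k for k in range(K) if x[c, k].X > 0.5) for c in counties}
```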

Multivariate regression

Multivariate linear regression is commonly used in the social sciences as a means of predicting future outcomes from known data. It provides us with both a baseline for comparison and a starting point for our Markov random field model. Our model has incumbent-party vote % as the dependent variable; that is, if a Democratic president is currently in office, then we predict the vote percentages earned by this year's Democratic candidate.
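As a baseline sketch of this regression in Python (the covariates below are placeholders; the text does not list the actual predictors):

```python
import numpy as np

def fit_incumbent_model(X, y):
    """Ordinary least squares: regress incumbent-party vote % on predictors.

    X : (n, p) design matrix for n past elections, including a column of ones
    y : (n,) incumbent-party vote percentages
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Placeholder usage: 13 past elections, 3 hypothetical covariates (values
# are random stand-ins, not actual election data).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(13), rng.random((13, 3))])
y = 50 + 10 * rng.random(13)
beta = fit_incumbent_model(X, y)
predicted_2012 = np.concatenate([[1.0], rng.random(3)]) @ beta
```

The fitted values for a subset of counties then seed the Markov random field before MAP inference, as described above.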
