The Conversation

David Tuffley, Griffith University and Bela Stantic, Griffith University

With just a few days to go before the postal vote closes on the same-sex marriage issue, there are plenty of strong opinions on all sides of the debate.

Our detailed study of the opinions expressed on Twitter shows the result could be a narrow defeat of the Yes campaign, with 49.17% support.

That figure is at odds with early opinion polls, some of which predicted up to 60% support and more for the Yes campaign. So how did we reach this lower figure?




Big data

We used advanced data analytics, developed at Griffith University’s Big Data and Smart Analytics Lab, which have proven accurate at predicting the outcomes of hard-to-call polls. Despite strong polling to the contrary, our method predicted the outcome of the 2016 US presidential election.

We looked at the publicly available data from 458,565 anonymised Australian Tweets making reference to same-sex marriage over October 2017.

We gauged the sentiment of these Tweets with a rule-based model that combines a domain-specific lexicon (a dictionary of terms with assigned sentiment weightings) with a series of intensifiers (punctuation, emoticons and other heuristics). Together, these make it possible to estimate which side of the debate the person sits on, and how strongly they feel about it.
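To make the idea of a lexicon plus intensifiers concrete, here is a minimal sketch in Python. It is not the lab's model: the lexicon entries, their weights and the exclamation-mark rule are all invented for illustration, and `score_tweet` is a hypothetical name.

```python
import re

# Hypothetical toy lexicon: term -> sentiment weight.
# Positive weights lean Yes, negative weights lean No (assumed values).
LEXICON = {
    "equality": 1.0,
    "love": 0.8,
    "voteyes": 1.5,
    "voteno": -1.5,
    "tradition": -0.6,
}


def score_tweet(text, lexicon=LEXICON):
    """Return a signed score: above 0 leans Yes, below 0 leans No."""
    # Lower-case the text and keep runs of letters, so "#VoteNo" -> "voteno".
    tokens = re.findall(r"[a-z]+", text.lower())
    base = sum(lexicon.get(tok, 0.0) for tok in tokens)
    # Assumed intensifier rule: each "!" adds 10% emphasis, capped at three.
    boost = 1.0 + 0.1 * min(text.count("!"), 3)
    return base * boost
```

In this sketch the sign of the score stands for the side of the debate and the magnitude for strength of feeling, so `score_tweet("Love and equality!!")` comes out positive and `score_tweet("#VoteNo")` negative. A real lexicon would be far larger and would also handle emoticons, negation and sarcasm.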

Going beyond the sentiment of the Tweet, machine learning determines the gender, age and even educational level of the sender, along with the general intention of the Tweet. All of this is deduced from the person’s writing style, vocabulary and various other factors.

Digging deeper

On the face of it, when all the captured Tweets were considered, there appears to be overwhelming support for Yes, with 72% in favour.

But digging deeper, we see that some individuals sent more than 1,000 Tweets in support of Yes. Of the 458,565 Tweets we examined, the number of unique users came down to just 207,287.

Taking the sentiment of the unique users into account, the adjusted figure in support of Yes comes down to 57%. It is acknowledged that the campaign Tweets will have influenced public opinion to some degree.
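The gap between counting Tweets and counting people can be illustrated with a short Python sketch. The function names and the sample data are hypothetical, and summing each user's scores to decide their side is an assumed rule, not necessarily the one used in the study.

```python
from collections import defaultdict


def yes_share_by_tweet(tweets):
    """Share of Yes among scored Tweets; tweets is a list of (user, score)."""
    decided = [score for _, score in tweets if score != 0]
    return sum(score > 0 for score in decided) / len(decided)


def yes_share_by_user(tweets):
    """Share of Yes among unique users, each counted once."""
    totals = defaultdict(float)
    for user, score in tweets:
        totals[user] += score  # assumed rule: a user's side is the sign of the sum
    decided = [total for total in totals.values() if total != 0]
    return sum(total > 0 for total in decided) / len(decided)


# Invented example: one prolific Yes account and two single-Tweet No accounts.
sample = [("a", 1.0)] * 8 + [("b", -1.0), ("c", -1.0)]
print(yes_share_by_tweet(sample))  # 0.8
print(yes_share_by_user(sample))   # one Yes user out of three
```

In the invented sample, Yes accounts for 80% of Tweets but only one in three users, which is the same kind of shift, in exaggerated form, as the move from 72% to 57% described above.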

Over-55s under-represented

Looking carefully at the demographics, it emerges that less than 15% of the total Tweets were sent by people over the age of 55. Of these over-55s, only 34% expressed support for Yes.

According to the Australian Bureau of Statistics (ABS), which is conducting the postal vote, around 36% of the people in Australia who are eligible to vote are over 55.

If we assume that over-55s and under-55s abstain from voting at the same rate, then based on the opinions of the 207,287 unique social media users, total support for the Yes position comes down to 49%.
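The reweighting step can be sketched as a two-group weighted average. This is a simplification under stated assumptions: it uses only an over-55 and an under-55 group, the function names are hypothetical, and because the article does not report support among under-55s, that figure is back-solved here from the published numbers rather than measured.

```python
def reweight(support_by_group, population_share):
    """Weighted average of group support, using population shares as weights."""
    return sum(support_by_group[g] * population_share[g] for g in population_share)


def implied_support(target, known_support, known_share):
    """Support the other group must have for the weighted total to hit target."""
    return (target - known_support * known_share) / (1 - known_share)


# Figures from the article: over-55s are 36% of eligible voters and 34% of
# them back Yes; the reweighted total is reported as 49%.
under55 = implied_support(target=0.49, known_support=0.34, known_share=0.36)

total = reweight(
    {"over55": 0.34, "under55": under55},
    {"over55": 0.36, "under55": 0.64},
)
print(round(under55, 3), round(total, 2))
```

Under this two-group assumption, the reported 49% corresponds to under-55 support of roughly 57%. The real analysis may have used finer age bands or further adjustments, so this is a consistency check on the arithmetic, not a reconstruction of the method.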

So it is likely to be a close-run result, much closer than the earlier polls suggested, and leaning in the direction of No.

How reliable is the result?

One of the problems with predicting poll outcomes is that people are often reluctant to say out loud what they really think about issues. What people say online can often be more accurate than what they say to each other in this age of political correctness.

In the lead-up to the recent US presidential election, the polls pointed to a Hillary Clinton win because many people were publicly saying “No” to Trump when asked by pollsters. But in the privacy of the booth, people quietly voted according to what they actually thought.

Improvements in big data analytics are made possible by cheaper, faster computers, exponentially greater volumes of data, and more advanced deep learning algorithms. This winning trifecta is creating possibilities and value that did not exist even a few years ago.

The Big Data and Smart Analytics Lab uses the “human sensor approach” that has greatly improved prediction quality. This takes the whole person into consideration, including the ways they create the meaning that is transmitted on the web.

Privacy concerns are paramount, so all data are legally required to be anonymised.




So how accurate is our result? We will know on November 15 when the ABS announces the result of the postal vote.

As of October 27 the ABS said it had received around 12.3 million survey responses, amounting to 77% of the 16 million eligible voters.

Forms must be received by the ABS by 6pm (local time) November 7 to be included in the count, so there is still time to cast your vote.

David Tuffley, Senior Lecturer in Applied Ethics and SocioTechnical Studies, School of ICT, Griffith University and Bela Stantic, Professor, Director of Big Data and Smart Analytics Lab, Griffith University

This article was originally published on The Conversation. Read the original article.