FROM DATA BREACH TO BREACH OF TRUST

This blog, "From Data Breach to Breach of Trust", gives a brief overview of the data scandal involving Facebook and Cambridge Analytica that came to light in March 2018. The scandal affected organizations and economies and forced new approaches to data security.

REWIND:

On 17 March 2018, the New York Times and the Guardian reported that a data mining firm named Cambridge Analytica had improperly obtained access to more than 50 million Facebook user profiles in a major data scandal. Experts believed that the firm could have used that data to gain an unfair advantage in targeting voters. The explosive exposé initially put the number of harvested profiles at 50 million; the figure was later revised to around 87 million.

WHAT HAPPENED AND HOW?

Cambridge Analytica was a company that offered services to businesses and political parties who wanted to change audience behavior. It analyzed huge amounts of consumer data and combined that with behavioral science to identify people who could be targeted with marketing material.

Data is often called the most valuable asset on earth. The firm collected data from a wide range of sources, including social media platforms such as Facebook and its own polling. It claimed to have about 5,000 data points on each American voter and used them to predict the personality of every adult in the US. As personality drives behavior and behavior influences the way people think, Cambridge Analytica targeted people by personality rather than simply as voters. The people whose minds it believed it could change were called “the persuadables”.

Cambridge Analytica assembled information through an app on Facebook that collected details of Americans who took a personality test and also gathered data on those people’s Facebook friends without authorization. The exposed data included status updates, likes and, in some cases, private messages. The scandal was a breach of users’ trust, as their data was exposed without their consent or awareness.

HOW WERE FACEBOOK AND CAMBRIDGE ANALYTICA CONNECTED?

Aleksandr Kogan, a data scientist at Cambridge University, was hired by Cambridge Analytica to develop an app called "This Is Your Digital Life". He provided the app to Cambridge Analytica, which in turn arranged an informed consent process for research in which around 270,000 users agreed to complete a paid survey meant only for academic use. However, Facebook allowed the app to collect personal information not only from survey respondents but also from respondents’ Facebook friends. Below is a snapshot of the type of question that was asked in the quiz.

BUT HOW DID CAMBRIDGE ANALYTICA GET ACCESS TO USERS’ DATA?

When a website or app asks whether you want to log in with your Facebook ID and you accept, you are using Facebook Login. Many people prefer this because they don’t have to create and remember another username and password. In this scandal, however, that convenience came at a price and showed technology at its worst: the app’s developers got access to users’ accounts, their data and their friends’ data. This technique is called “seeding”. This is how Cambridge Analytica acquired data from millions of Facebook users.
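To see why access to friends’ data multiplied the reach of a single login, here is a minimal sketch in Python. The friend graph, the user names and the function `reachable_profiles` are all hypothetical; this is not Facebook’s API, only an illustration of how consent from a few users can expose many profiles.

```python
def reachable_profiles(friends, consenting_users):
    """Return every profile visible to an app that can read the
    profiles of consenting users and of their friends."""
    exposed = set()
    for user in consenting_users:
        exposed.add(user)                      # the user who logged in
        exposed.update(friends.get(user, ()))  # friends who never consented
    return exposed


# Hypothetical toy friend graph: user -> set of friends.
friends = {
    "ana": {"ben", "cara", "dev"},
    "ben": {"ana", "eli"},
    "cara": {"ana", "fay", "gus"},
}

exposed = reachable_profiles(friends, consenting_users=["ana", "cara"])
print(sorted(exposed))
print(len(exposed), "profiles exposed from 2 consenting users")
```

With the figures reported in this article, around 270,000 survey respondents and around 87 million affected profiles, the same effect works out to roughly 320 profiles per respondent.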

WHAT DATA DID THEY GET AND HOW THE MODEL WORKED?

Cambridge Analytica harvested personal information such as where users lived and what pages they liked. Combined with the quiz answers, this helped build psychological profiles that described users’ characteristics and personality traits.

The firm rolled out a long-form quantitative instrument to probe the underlying traits that make up personality. It was based on the OCEAN model, an acronym for:

  1. Openness,
  2. Conscientiousness,
  3. Extroversion,
  4. Agreeableness,
  5. Neuroticism

It is also called the “Big Five” model of personality. It describes each person along the five dimensions, or traits, above. Determining someone’s personality directly, however, takes hundreds of questions, which is cumbersome and too much to ask of most people. Instead, the firm used Facebook likes as input to machine learning models. The people who consented to take the personality test were stored in a database, and the model identified correlations between their personality types and their Facebook likes. Those correlations then allowed the firm to infer the personality types of people outside the database from their likes alone.
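The idea of learning from quiz takers and then scoring everyone else can be sketched in a few lines of Python. This is a deliberately simple averaging scheme chosen for illustration, not the firm’s actual model; the page names, the scores and the functions `fit_like_scores` and `predict_trait` are all made up.

```python
from collections import defaultdict
from statistics import mean


def fit_like_scores(training_users):
    """For each page, average the trait score of the quiz takers who liked it."""
    scores_by_page = defaultdict(list)
    for likes, trait_score in training_users:
        for page in likes:
            scores_by_page[page].append(trait_score)
    return {page: mean(scores) for page, scores in scores_by_page.items()}


def predict_trait(like_scores, likes):
    """Estimate the trait for someone who never took the quiz,
    using only the pages they liked. Returns None if no page is known."""
    known = [like_scores[page] for page in likes if page in like_scores]
    return mean(known) if known else None


# Hypothetical quiz takers: (pages liked, openness score between 0 and 1).
training_users = [
    ({"poetry", "hiking"}, 0.9),
    ({"poetry", "museums"}, 0.8),
    ({"tv_reruns", "hiking"}, 0.5),
    ({"tv_reruns"}, 0.2),
]

like_scores = fit_like_scores(training_users)

# A person outside the database, known only by their likes.
print(predict_trait(like_scores, {"poetry", "museums"}))
```

The key point is the last call: the person being scored never answered a single question, yet receives a trait estimate because their likes overlap with those of people who did.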

ROLE OF PSYCHOGRAPHICS:

Psychographics are behavioral: a means to segment people by personality. The psychographic model used by Cambridge Analytica worked very much like the one Netflix uses to recommend movies. The masterminds of the app designed their own proprietary model, which they called “a multi-step co-occurrence approach”. This approach was similar to SVD (Singular Value Decomposition) and other matrix factorization methods. Dimensionality reduction of Facebook like data was the core of the model.

The main idea of SVD is to reduce a dataset containing a large number of values to one containing significantly fewer values that still captures a large fraction of the variability in the original data. Matrix factorization techniques are often effective because they uncover the latent features underlying the interactions between users and items. The model used these two concepts to find correlations between groups of people and groups of things. In this way it grouped users and the things they liked, which in turn helped predict personality.

The whole point of using a dimension reduction model was to mathematically represent the data in a simpler form.
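As a rough illustration of dimensionality reduction with SVD, the sketch below uses NumPy on a tiny, invented user-by-page like matrix. The data and the choice of two latent dimensions are assumptions for the example; the firm’s proprietary model is not public in this form.

```python
import numpy as np

# Hypothetical user-by-page matrix: rows are users, columns are pages,
# and 1 means the user liked the page.
likes = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
], dtype=float)

# Thin SVD: likes == U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(likes, full_matrices=False)

k = 2  # number of latent dimensions to keep (an assumption for this toy data)
user_factors = U[:, :k] * s[:k]     # each user is now described by k numbers
approx = user_factors @ Vt[:k, :]   # rank-k reconstruction of the like matrix

# Share of the matrix's total squared magnitude kept by the first k components.
explained = (s[:k] ** 2).sum() / (s ** 2).sum()
print(user_factors.shape, round(float(explained), 3))
```

Here four columns of likes are compressed into two numbers per user while keeping most of the structure, which is the "simpler form" described above. Users with similar likes end up with similar factor values.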

OUTPUT OF CAMBRIDGE ANALYTICA’S OCEAN MODEL:

Cambridge Analytica worked as a full-service propaganda machine during the 2016 US presidential election. As a micro-targeting technique, the campaign team combined the exposed data with behavioral science to determine users’ personality traits from their Facebook activity. This was Cambridge Analytica’s strategy for Facebook advertisements meant to persuade users to vote in favor of its client. It would then display to different US voters, on various digital platforms, the customized messages most likely to convince them and change their behavior.

EFFECTS:

The scandal, a grossly unethical experiment, played with the psychology of an entire nation and came with real-world consequences:

  1. Around 87 million Facebook profiles were caught up in the scandal.
  2. Market fluctuation: Facebook’s market value fell by about $120 billion.

HOW THIS DATA BREACH COULD HAVE BEEN AVOIDED:

  1. Users should get an alert about any unusual activity, just as Gmail sends by default. A notification can be sent to users whenever their accounts are accessed by someone else.
  2. Data encryption helps protect data that is sent, received or stored on a device. It scrambles readable text so that it can be read only by someone who has the secret code or decryption key.
  3. Facebook users’ private data could be stored in blocks on a blockchain network, since each block is cryptographically linked to the one before it and is therefore difficult to modify unnoticed. Any information related to a user can link to the block holding that user’s private data. This would help track any modification or manipulation of the data attempted by a third party.
  4. Users should have more control over the Facebook platform so they can decide what information is displayed and what is kept confidential.
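The tamper-tracking idea behind the blockchain suggestion can be sketched with a simple hash chain using only Python’s standard library. This is a single-process illustration, not a distributed blockchain, and it shows tamper evidence rather than encryption. The record strings and the helper names `block_hash`, `build_chain` and `is_valid` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first block


def block_hash(record, previous_hash):
    """Hash a record together with the hash of the block before it."""
    payload = json.dumps({"record": record, "prev": previous_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(records):
    """Link records so that each block commits to everything before it."""
    chain, previous_hash = [], GENESIS
    for record in records:
        current = block_hash(record, previous_hash)
        chain.append({"record": record, "prev": previous_hash, "hash": current})
        previous_hash = current
    return chain


def is_valid(chain):
    """Recompute every hash; any edited record or broken link fails the check."""
    previous_hash = GENESIS
    for block in chain:
        if block["prev"] != previous_hash:
            return False
        if block["hash"] != block_hash(block["record"], block["prev"]):
            return False
        previous_hash = block["hash"]
    return True


# Hypothetical access log for one user's data.
chain = build_chain(["app A read profile 17", "app B read profile 17"])
print(is_valid(chain))   # True: nothing has been altered

chain[0]["record"] = "nothing happened"  # a third party rewrites history
print(is_valid(chain))   # False: the stored hash no longer matches
```

A real deployment would also need encryption for confidentiality and multiple independent parties holding copies of the chain; the sketch only shows why a modification is hard to hide.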

CONCLUSION:

In this rapidly developing world, technology moves fast, and people struggle to keep up with the pace of change. If they take a second to ponder, they can see that their personal data is out there and being misused. The Cambridge Analytica scandal served as a perfect poster child for this worrying picture. It proved to be not only a data breach but also a breach of trust. Platforms like Facebook, which were created to connect people and bring the world closer, were weaponized, with damage that will never be fully repaired.
