
Thing - Demystifying data


July 1st, 2021 by Charlie Category: All 23 Things, Working with data

Did you know that data can be biased as well? Everyone holds some sort of unconscious belief about various social groups, and anyone can be led to make assumptions based on the data presented to them. As artificial intelligence and machine learning continue to grow and develop, these human biases can be learned by machines as well!

Simply put, machine learning is a type of AI that focuses on using data and algorithms to imitate the way humans learn. Because these systems learn from human-produced data, they can pick up the mistakes and biases in that data, much as people do. The problems these learned biases cause emerge when the systems are put into practice.

Here are some common types of bias that show up in data and machine learning (but that we may also face when coming to our own decisions about the world!):

  1. Confirmation bias: being less critical of data that confirms what we already believe
  2. Selection bias: when data is collected in a way that makes it unrepresentative (or the sample is too small), so it misrepresents the true population
  3. Outliers: extreme data points that can distort results such as averages
  4. Overfitting and underfitting: when a model fits its training data too closely (treating noise as if it were a trend) or too loosely (missing the real trend), producing inaccurate results on new data
  5. Demographic bias: when data primarily reflects a certain demographic, affecting outcomes and assumptions
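
To make two of these concrete, here is a minimal sketch in Python using only the standard library. All of the numbers are made up for illustration: the `salaries` list and the `population` survey are hypothetical, not real data. The first part shows how a single outlier pulls the mean while the median barely moves; the second shows how sampling from only one group (selection bias) misrepresents the whole population.

```python
from statistics import mean, median

# --- Outliers ---
# Hypothetical salaries (in thousands); the last value is an extreme outlier.
salaries = [28, 30, 31, 33, 35, 36, 400]

mean_with_outlier = mean(salaries)          # pulled upwards by the 400
median_with_outlier = median(salaries)      # the middle value, robust to the 400
mean_without_outlier = mean(salaries[:-1])  # mean once the outlier is dropped

print("mean with outlier:   ", round(mean_with_outlier, 1))
print("median with outlier: ", median_with_outlier)
print("mean without outlier:", round(mean_without_outlier, 1))

# --- Selection bias ---
# Hypothetical survey: each row is (where the person lives, 1 = "yes", 0 = "no").
population = (
    [("city", 1)] * 30 + [("city", 0)] * 10
    + [("rural", 1)] * 10 + [("rural", 0)] * 50
)


def approval_rate(rows):
    """Share of 'yes' answers among the given rows."""
    return sum(answer for _, answer in rows) / len(rows)


# A sample that only reaches city residents is not representative.
city_only = [row for row in population if row[0] == "city"]

print("true approval rate:     ", approval_rate(population))
print("city-only approval rate:", approval_rate(city_only))
```

In this toy data the median salary stays at 33 even though the mean jumps well above every typical value, and the city-only survey reports 75% approval when the real figure across everyone is 40%.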

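One of my favourite types of data bias is survivorship bias: drawing conclusions only from the cases that made it through some selection process, while overlooking the ones that did not. We need to consider what we don’t see, and learn from this as well. There is a lot of worth in learning from failure and experience, rather than just success.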
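
Survivorship bias can be sketched the same way. The figures below are invented purely for illustration (a hypothetical list of startups with their growth and whether they are still operating); the point is only that averaging over the survivors gives a much rosier picture than averaging over everyone.

```python
from statistics import mean

# Hypothetical startups: (annual growth in percent, still operating?)
startups = [
    (40, True), (25, True), (30, True), (5, True),
    (-50, False), (-80, False), (-20, False), (-60, False),
]

# What we see if we only look at companies that still exist.
survivors_only = mean(growth for growth, alive in startups if alive)

# What actually happened across every company that started.
everyone = mean(growth for growth, _ in startups)

print("average growth, survivors only:", survivors_only)
print("average growth, all startups:  ", everyone)
```

Looking only at the survivors suggests healthy growth of 25%, while the full picture, failures included, averages out to a loss.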

It’s important to be aware of the datasets we use, in terms of both what is and isn’t included, as well as our own unconscious beliefs when we collect, process and interpret data. The effects of bias in data can be serious and can impact people’s lives.
