If you follow the news, it’s impossible to miss the frequent stories about data breaches, shady tech company behavior, and the risks emerging AI technology poses to employment and other areas of human life.
At Maven, we’re quite optimistic about the future and believe that the benefits of our data-driven world far outweigh the risks. Part of the reason is that we believe many of these risks and scandals can be mitigated, or avoided entirely, so long as the professionals working in these fields have an ethical framework to guide their decision-making.
If I had to guess, I’d wager that only a small percentage of data professionals have ever taken a course on data or AI ethics. More of us have likely completed training on legal compliance, but the law is only a bare-minimum standard of ethical behavior.
The Law Isn’t a Substitute for Ethics
A society’s ethical values often inform the laws it creates, but there are plenty of unethical and harmful behaviors that aren’t regulated by the law.
When it comes to regulating technology, it can take years or even decades before a technology’s impacts are properly understood and enough harm has accumulated to motivate government action, especially since governments are often trying to strike a balance between regulation and innovation & growth.
For example, it wasn’t until the 1970s, with the creation of the EPA and passage of bills like the Clean Water Act, that the US Government got serious about regulating the pollution caused by industries that were born a century prior.
While we should certainly follow the law, the main point is that it isn’t always a North Star in terms of ethics. The other benefit of holding high ethical standards is that when new laws and regulations are implemented, organizations that already operate ethically are far less likely to see their business disrupted or to require a costly transformation.
Ethical Data Stewardship
Our digital footprints continue to expand as our lives are increasingly documented and conducted online. I honestly have no idea how many companies have my email address, home address, phone number, birth date, and other sensitive personal information.
I should have a better idea, but quite frankly I’ve given up on having any sort of digital privacy. That said, I do place an implicit level of trust in companies when I pass over my information, whether it’s to communicate with a doctor, transfer money, or buy something on Amazon.
And that’s a very important point - when users give you their personal information, they are trusting you to keep it secure and confidential, and they assume it will only be used for the purposes they expect.
If you fail to invest in proper data security, share data with third parties without the user’s consent, or pry into personal information in ways that would surprise users, you’re violating the trust they’ve placed in you.
Ultimately, being an ethical steward of data boils down to the Golden Rule - treat user data how you would like your own to be treated by others.
The Risks of Generative AI
In the past few years, the sophistication of AI models has increased dramatically. I distinctly remember seeing GPT-2, an early version of the model behind ChatGPT, demonstrated by a student. “That’s cute,” I thought, as it struggled to write a coherent poem on a given topic.
I had no idea that only a few years later we would have AI assistants that could write expert-level code, pass the LSAT, and generate photo-realistic images. But here we are.
These tools are incredibly powerful and can make just about any white-collar worker more efficient at some aspect of their job. We like to say that “You won’t be replaced by AI, but you will be replaced by someone using AI”. In other words, these tools are going to become ingrained in our daily work sooner rather than later, and this is largely a good thing.
But we still need to remember that these tools aren’t perfect. They are capable of making up facts - a phenomenon known as “hallucination” - and if we trust them blindly, we can be led astray or make mistakes that are costly to our reputations and our organizations.
At the end of the day, users of these tools are responsible for how they use them. While the creators of these tools may someday be taken to task for the outputs their models produce, users still need to engage their critical thinking and fact-check model output - no one will accept an employee pointing a finger at an AI tool as a valid excuse.
These are just a few examples of the ethical issues we face as data professionals in our increasingly AI-driven world. If you’d like to dive deeper, check out our new course on Data & AI Ethics, which tackles these topics, as well as data bias, algorithmic harm, and more.
SUPER EARLY BIRD IS HERE!
For a limited time, save 25% on our upcoming Python & Power BI immersive programs!
Explore how our immersive programs with direct instructor access, weekly live sessions, and collaborative environments can elevate your skills and accelerate your career.
Chris Bruehl
Lead Python Instructor & Growth Engineer
Chris is a Python expert, certified Statistical Business Analyst, and seasoned Data Scientist, having held senior-level roles at large insurance firms and financial services companies. He earned a Master’s in Analytics at NC State’s Institute for Advanced Analytics, where he founded the IAA Python Programming club.