
Welcome to Data Science in the Age of AI!

It is a very exciting time to work in Data Science. Each day brings announcements about major breakthroughs in AI. Multiple companies are competing with each other to innovate. New products appear every day. Social media is flooded with proclamations about the transformation of every detail of modern life. So, what does this mean for Data Scientists, our jobs and our industry? That’s the question that I hope to tackle with you in this series of blog posts, and I’m thrilled that you’ve decided to join me for the first installment.

The future is here

While we are in the middle of a period of enormous change and complexity, I believe that this series will illustrate a relatively simple thesis: In the age of AI, Data Science becomes more important than ever. Building new products and features will only get cheaper, but that means the new bottleneck is the core Data Science feature set: defining success, measuring it, and improving on it. AI can’t do this on its own. It can help capable individuals do it better, and it will be your job to seize that opportunity.

The story of AI thus far has been one of remarkable progress, and we should be planning for further growth. Three changes will impact us simultaneously:

  • Scaling laws describe a predictable trajectory for model improvement as data and compute increase, so we should anticipate stronger models being released at a regular cadence over the next few years.
  • While interacting with state-of-the-art AI is notably expensive right now, we are also witnessing an incredible race to efficiency, paralleling Moore’s Law, that is driving down compute costs; the current generation of cheap models is as powerful as the last generation’s frontier models.
  • AI has already transformed multiple industries and will bring along with it a proliferation of new products. The increase in the rate of new product and feature creation is downstream of the previous two trends, with more powerful models at lower costs providing greater accessibility to new entrepreneurs.
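
To make the first of these trends a little more concrete, here is a minimal sketch of what a "predictable trajectory" can look like. It assumes the simple power-law form L(N) = (N_c / N)^alpha discussed in the scaling-laws literature, where N is the number of model parameters. The constants N_C and ALPHA below are illustrative assumptions, not measurements from any particular model, and the function names are my own.

```python
# Illustrative power-law scaling curve: loss falls as a power of model size.
# N_C and ALPHA are assumed values chosen for illustration only.
N_C = 8.8e13    # hypothetical scale constant, in parameters
ALPHA = 0.076   # hypothetical scaling exponent


def predicted_loss(n_params: float) -> float:
    """Power-law form L(N) = (N_C / N) ** ALPHA."""
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    return (N_C / n_params) ** ALPHA


def loss_ratio(scale_factor: float) -> float:
    """Factor by which predicted loss changes when model size is multiplied by scale_factor."""
    return scale_factor ** -ALPHA


if __name__ == "__main__":
    for n in (1e8, 1e9, 1e10, 1e11):
        print(f"{n:.0e} parameters -> predicted loss {predicted_loss(n):.3f}")
    print(f"10x more parameters multiplies predicted loss by {loss_ratio(10):.3f}")
```

The point of the sketch is the shape rather than the numbers: under a power law, every tenfold increase in model size multiplies the predicted loss by the same factor, which is what makes the trajectory something we can plan around.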

Why this series?

This series is for professional Data Scientists looking to better understand the future of their profession. It is one of the first projects of its kind to take on this topic, and it is written in a way that lets it keep pace with an evolving landscape. First and foremost, this is an open-source project, licensed under the CC BY-NC-ND 3.0 License. If you are interested in talking more, you can always reach out to me on

The open-source, reader-supported format brings a couple of advantages:

  • The series itself will evolve over time and will change along with the AI landscape
  • The series will always remain open to contributions from readers like you

Just as important, this series is written to be AI-friendly. Yes, AI tools were used extensively in writing it, and you can easily add the content of this blog to a GPT and chat with it. I find it much easier to locate and unpack key points through chat. You can experience this too, and I expect it to become a hallmark of how we read in the future.

Last but not least, this series will serve as a guide to many of the new tools emerging on the market today. Relevant posts will mention the tools that are used throughout.

Tools used today

  • A local GitHub devcontainer.
  • I talked about moving the blog to a devcontainer in this post.
  • Copilot Chat, for quick edits and suggestions.

References

  1. Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling Laws for Neural Language Models. arXiv. https://arxiv.org/abs/2001.08361
  2. Chen, X., Liu, S., Lawrence, N. D., & Xu, Y. (2024). The Race to Efficiency: A New Perspective on AI Scaling Laws. arXiv. https://arxiv.org/abs/2501.02156
  3. Lee, T. (2024). How human translators are coping with competition from powerful AI. Understanding AI. https://www.understandingai.org/p/how-human-translators-are-coping
