Data Science Zero to Hero - 2.2 Data ETL (Extract, Transform, Load)

Where does data come from? Data can come from many different sources: it might be generated by users, collected from sensors, retrieved from databases, or even scraped from websites. The method of data collection depends on the nature of the data source, and the process of managing this data and making it usable for analysis often involves ETL (Extract, Transform, Load). In many cases, extracted raw data is not loaded directly into a database; it has to be cleaned and transformed before it is suitable for machine learning....

June 27, 2023 · 4 min · Steven McGown

Data Science Zero to Hero - 2.1 The Machine Learning Cycle

Data Collection and Preparation: ML Concepts Data, data, data! If there’s one thing you should take away from this series, it’s that data is super important to data scientists and machine learning engineers alike. In the previous posts, we talked about different ways of transforming and visualizing data with Python, and those libraries are certainly powerful tools, but where does this data come from anyway? How do we collect it? Where do we store it?...

June 26, 2023 · 3 min · Steven McGown

Data Science Zero to Hero - 1.2 Pandas

Pandas are nature’s adorable, bamboo-munching... wait, not that kind of panda… Much less cute but much more useful for data science, Pandas is a popular open-source Python library that provides powerful data manipulation and analysis tools. The name “Pandas” is derived from “Panel Data,” reflecting its original focus on handling and analyzing financial data with panel data structures. It is built on top of NumPy and offers easy-to-use data structures and data analysis functionalities....

June 25, 2023 · 11 min · Steven McGown

Data Science Zero to Hero - 1.1 NumPy

NumPy is a Python library used for working with arrays, linear algebra, matrices, and much more. It’s a fantastic tool for anyone who wants to work with numerical data in Python, particularly in the context of data science and machine learning. NumPy is used extensively in machine learning algorithms, so it’s good to have some experience with it if you want to be able to create any ML solutions. This post will assume that you already have some programming experience with Python....

June 24, 2023 · 5 min · Steven McGown

Data Science Zero to Hero - 1.0 Foreword

A foreword on the “Data Science Zero to Hero” series

June 23, 2023 · 2 min · Steven McGown