Create Dataset Using Apache Parquet

Working with Dataset — Part 1: Create Dataset Using Apache Parquet

Sung Kim

I have been in the data science profession since before the term “Data Science” became popular. Then as now, the most widely used commercial data analytics software has been SAS by SAS Institute. Like most people, I have since moved to open-source software. One thing I really miss about SAS is the convenience of the SAS dataset, where all your intermediate and final datasets can be saved and accessed at a…

Sung Kim

A business analyst at heart who dabbles in AI engineering, machine learning, data science, and data engineering. Threads: @sung.kim.mw