Starting My Journey into Data Science
It's time to start studying again, and I've decided against pursuing .NET/Web Programming in favor of Data Science. I've only begun researching what I need to learn to get better acquainted with Data Science, and for now, I will focus on Python Programming.
I'm already pretty good with SQL and Relational Databases (SQL Server, Oracle), but there's much more to explore. Beyond Math and Statistics, I want to understand how to work with unstructured data.
Subject List
My initial study list (subject to change as I learn what I need) includes the following topics, presented in no particular order:
- Python Language
- R Language
- MongoDB / NoSQL
- Big Data (Hadoop, Hive)
- Cloud Tools (Amazon S3)
Additionally, I will need to brush up on my Math and Statistics skills, as it has been a few years since university.
Reading Estimate
- 4,000 Pages [5 to 6 books, each 500 to 800 pages]
- 10 months [100 pages per week] // Estimated Completion Time: October 2018
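As a quick sanity check on that estimate, the arithmetic can be sketched in a few lines of Python. The page total and weekly pace come from the list above; the variable names and the average of 52/12 weeks per month are my own assumptions for illustration.

```python
import math

total_pages = 4000      # rough total from the estimate above
pages_per_week = 100    # planned reading pace

# Whole weeks needed to get through every page at that pace.
weeks = math.ceil(total_pages / pages_per_week)

# Convert weeks to months using an average month length (assumption: 52 / 12 weeks).
months = weeks / (52 / 12)

print(f"{weeks} weeks, roughly {months:.1f} months")
```

That works out to 40 weeks, a little over nine months, so ten months leaves a small buffer for slow weeks.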
Resources
Python
- Learning Python, 5th Edition - Mark Lutz
- Programming Python - Mark Lutz
R
- The Art of R Programming - Matloff
Certifications
There are several certifications available for Python and R:
- 70-773 - Analyzing Big Data with Microsoft R [for MCSE - Data Management & Analytics], $165 USD
- 98-381 - Introduction to Programming Using Python [for MTA], $127 USD
- MongoDB DBA Associate, $150 USD
Certifications are not my primary goal; rather, they serve as a measuring stick and pace-setter.
Updates
Update (12/13/17) - Python
I am currently studying the 5th Edition of Learning Python by Mark Lutz. It is a large volume with 40 chapters. I am taking my time and have read about 10 chapters over 17 days. I will try to pick up the pace as I delve deeper into this book and hope to finish it by mid-January 2018.
The content in this book is substantial; I've already filled out a 70-page notebook and had to re-ink three fountain pens. At this rate, I will need three more notebooks.
Writing out the code manually helps me grasp it better and allows me to see differences, such as lists versus dictionaries.
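To give a sense of the kind of difference I mean, here is a minimal sketch contrasting the two. The variable names and values are made up for illustration and are not taken from the book.

```python
# A list is an ordered sequence; items are reached by integer position.
topics = ["Python", "R", "MongoDB"]
first_topic = topics[0]        # index 0 is the first item
topics.append("Hadoop")        # new items go on the end

# A dictionary maps keys to values; items are reached by key, not position.
chapters_read = {"Learning Python": 10}
chapters_read["The Art of R Programming"] = 0   # assigning to a new key adds an entry
lutz_chapters = chapters_read["Learning Python"]

print(first_topic, topics[-1], lutz_chapters, len(chapters_read))
```

The list answers "what is in position n?", while the dictionary answers "what belongs to this name?", which is the distinction that clicked for me once I wrote both out by hand.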
Update (5/30/18) - Art of R Programming
I purchased The Art of R Programming by Matloff and am currently studying it. I have installed RStudio on my i7 Windows 10 and i3 Linux systems.
Update (11/11/18) - Study Statistics First
This process is taking longer than anticipated. I have completed textbooks for both Python and R. Now, I am focusing my efforts on studying statistics itself rather than just tools or platforms. I aim to improve and hone my analytical mindset first, then augment it with the necessary tools.