Learning Path that I took
Purpose
Fran created this guide for anyone who needs direction and wants to take shortcuts, drawing on Fran’s personal experience of learning how to think analytically.
The fastest way to find out whether you might enjoy a field is to read a book on the subject and speak with people who work in it. Once that’s done, and if possible, get hands-on exposure.
As such, I’m sharing the signs I followed and the winding paths I took while studying to become proficient in technical skills.
- Business Problem Framing
- People, Process, Technology (PPT) Framework
- Data Language
Business Problem Framing
Key Ideas:
Defining the elephant
: What “elephant” are we trying to solve?
Data Languages (e.g., SQL, Python, R, etc.)
: How to communicate with the elephant? How to describe the elephant to others?
Audience-based Resolution/Perspective
: How to describe the elephant to different people?
People, Process, Technology (PPT) Framework
When you are designing/developing new things, always remember this framework of People, Process, and Technology.
There is no technology or process without people (i.e., human operators). As creators, we have to remember that we are helping the people who run the processes do so better, using technology as a tool. Technology should not be the end goal, but a means of bettering the human experience.
I can go into more depth at a later time, but remember how Starbucks has “extra hot” while Amazon Go had to remove the human element to make the process work.
Data Language
Why are programming languages called languages? Because, just as we communicate with people from other countries in their languages, we speak to computers in languages that they understand.
I remember that one of the introductory computer science courses, McGill’s infamous COMP202, taught this concept by asking you to tell a computer how to tie your shoes. What is a left? What is a right? What is a shoelace? What is a shoe? Each instruction depends on definitions that have to be spelled out first.
Deeper Dive
Why querying languages?
: Proficiency in SQL and other querying languages is foundational to exploring massive amounts of data. You need to speak an analytical language that a computer can understand. As a multilingual speaker, I can search for the same thing in different languages and get different results. The same goes for data: the same question asked in different languages can return different results (remember, we are the blind men and the elephant).
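To make the "same question, different results" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table and its values are hypothetical, made up purely for illustration. It asks "what is the average amount?" once in SQL and once in plain Python, and the two answers differ because each side handles a missing value differently.

```python
import sqlite3

# Hypothetical table and values, made up for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany(
    "INSERT INTO orders (amount) VALUES (?)",
    [(10.0,), (20.0,), (None,)],  # None is stored as SQL NULL
)

# SQL: AVG skips NULL values, so it averages only the two known amounts.
sql_avg = conn.execute("SELECT AVG(amount) FROM orders").fetchone()[0]

# Python: a naive mean that treats a missing amount as zero.
amounts = [row[0] for row in conn.execute("SELECT amount FROM orders")]
naive_avg = sum(a or 0.0 for a in amounts) / len(amounts)

print(sql_avg)    # 15.0
print(naive_avg)  # 10.0
conn.close()
```

Neither number is wrong on its own. They answer slightly different questions, which is why it helps to know how each language treats missing data before comparing results.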
Data Lifecycle
: Imagine a person pressing a button. The signal is translated into a data point, which is then collected. Remember that your snapshot of the database may not be representative of reality. Always try to see the source and understand the process by which the data is collected. This is why it’s important to understand the data lifecycle.
Interoperability
: This is more difficult to understand, so I’ll keep it short and cover it in a separate post. Remember that both syntactic and semantic interoperability need to be considered (data types and structures). Ever had issues communicating your ideas to someone else? It’s the same with data. Your date format might be mistaken for a different date format, or even for something else entirely (which is why we explicitly state the date format we use, such as ISO 8601).
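The date-format confusion can be sketched in a few lines with Python's standard datetime module. The sample string `03/04/2025` is an assumption chosen for illustration. Read with a month-first convention it means one day, read with a day-first convention it means another, while the ISO 8601 form leaves no room for interpretation.

```python
from datetime import date, datetime

raw = "03/04/2025"  # hypothetical value received from another system

# The same text parsed under two different conventions.
month_first = datetime.strptime(raw, "%m/%d/%Y").date()  # read as March 4
day_first = datetime.strptime(raw, "%d/%m/%Y").date()    # read as April 3

print(month_first == day_first)  # False

# ISO 8601 (YYYY-MM-DD) has a single reading.
iso = date.fromisoformat("2025-03-04")
print(iso == month_first)        # True
print(month_first.isoformat())   # 2025-03-04
```

Both parses succeed without any error, which is what makes this kind of mismatch easy to miss. Agreeing on the format up front is what makes the exchange interoperable.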
References
This is not an exhaustive list, but these are the key resources that I used to learn.
- Roadmap.sh for AI Data Scientist (this is a new one that I found)
- Google Colab
- Kaggle
- Postgresqltutorial
Disclosure
All opinions are my own.