Coursera
EMC Corporation's data analytics life cycle is cyclical with six steps:
1. Discovery
2. Pre-processing data
3. Model planning
4. Model building
5. Communicate results
6. Operationalize
An iterative life cycle was created by a company called SAS, a leading data analytics
solutions provider. It can be used to produce repeatable, reliable, and predictive results:
1. Ask
2. Prepare
3. Explore
4. Model
5. Implement
6. Act
7. Evaluate
Authors Thomas Erl, Wajid Khattak, and Paul Buhler proposed a big data analytics life
cycle in their book, Big Data Fundamentals: Concepts, Drivers & Techniques. Their life
cycle suggests phases divided into nine steps:
Skill: Analytical skills
Skill: A technical mindset
Skill: Data design
Skill: Understanding context
Description: The analytical skill that has to do with how you group things into categories
Skill: Data strategy
Phase 1 - Ask: Define the problem and confirm stakeholder expectations
Phase 2 - Prepare: Collect and store data for analysis
Phase 3 - Process: Clean and transform data to ensure integrity
Phase 4 - Analyze: Use data analysis tools to draw conclusions
Phase 5 - Share: Interpret and communicate results to others to make data-driven decisions
Phase 6 - Act: Put your insights to work in order to solve the original problem
Imagine that you interview for a data analyst role at a local ice cream company. The hiring
manager explains that the company needs a data analyst because they want to learn more about
their customers. First, they want to understand their customers’ ice cream flavor preferences.
Then, they will use this customer data to help make important decisions.
The hiring manager explains that they do not collect any customer data, and they don’t know
where to begin. The hiring manager asks you: Can you please explain how you would approach
this task?
We need to collect sales data for a day, a week, and a month, then compare the results to see which
flavors sell the most. We can also hand out small, optional review cards to learn why customers
choose a flavor and which flavors they would like to see on the menu. We can put all of this
information into a spreadsheet and create graphs to visualize the data more clearly. Finally, we
will analyze the results and act on them, for example by making more of the flavors that received
the most votes and discontinuing the ones that received the fewest.
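The tallying step of this plan could be sketched in Python as shown here. This is only an
illustration under stated assumptions: the flavor names and sales records are made-up sample data,
and `rank_flavors` is a hypothetical helper name, not something from the course material.

```python
from collections import Counter

# Hypothetical sample data: one entry per scoop sold during a single day.
# A real shop would record these at the register or in a spreadsheet.
daily_sales = [
    "vanilla", "chocolate", "vanilla", "strawberry",
    "chocolate", "vanilla", "mint", "chocolate", "vanilla",
]


def rank_flavors(sales):
    """Return (flavor, count) pairs sorted from most sold to least sold."""
    return Counter(sales).most_common()


ranking = rank_flavors(daily_sales)
best_seller = ranking[0][0]  # flavor with the highest count

for flavor, count in ranking:
    print(f"{flavor}: {count}")
```

The same function could be run on a week or a month of records to compare how the ranking changes
over time.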
Key data analyst tools
As you are learning, the most common programs and solutions used by data analysts include
spreadsheets, query languages, and visualization tools. In this reading, you will learn more about
each one. You will cover when to use them, and why they are so important in data analytics.
Spreadsheets
Data analysts rely on spreadsheets to collect and organize data. Two popular spreadsheet
applications you will probably use a lot in your future role as a data analyst are Microsoft Excel and
Google Sheets.
Query languages
A database is a collection of structured data stored in a computer system. Some popular Structured
Query Language (SQL) programs include MySQL, Microsoft SQL Server, and BigQuery.
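To give a feel for what a SQL query looks like, here is a minimal sketch that uses Python's built-in
`sqlite3` module. SQLite is not one of the programs named in this reading; it is assumed here only
because it ships with Python and needs no setup. The `sales` table and its rows are hypothetical.

```python
import sqlite3

# In-memory database, so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (flavor TEXT, scoops INTEGER)")
conn.executemany(
    "INSERT INTO sales (flavor, scoops) VALUES (?, ?)",
    [("vanilla", 3), ("chocolate", 2), ("vanilla", 1), ("mint", 1)],
)

# The SQL query: total scoops per flavor, best seller first.
query = """
    SELECT flavor, SUM(scoops) AS total
    FROM sales
    GROUP BY flavor
    ORDER BY total DESC, flavor ASC
"""
totals = conn.execute(query).fetchall()
conn.close()

for flavor, total in totals:
    print(flavor, total)
```

The query text itself is standard SQL, so a similar statement should also work in the programs
listed in this reading, although each one has its own dialect details.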
Visualization tools
Data analysts use a number of visualization tools, like graphs, maps, tables, charts, and more. Two
popular visualization tools are Tableau and Looker.
These tools
worksheets
- Looker communicates directly with a database, allowing you to connect your data right to the
visual
A career as a data analyst also involves using programming languages, like R and Python, which are
used a lot for statistical analysis, visualization, and other data analysis.
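As a small taste of statistical analysis in Python, the sketch below summarizes a week of sales with
the standard-library `statistics` module. The daily figures are invented for illustration only.

```python
import statistics

# Hypothetical data: scoops sold on each day of one week.
scoops_per_day = [120, 135, 128, 150, 160, 210, 190]

mean_scoops = statistics.mean(scoops_per_day)      # arithmetic average
median_scoops = statistics.median(scoops_per_day)  # middle value when sorted
spread = statistics.stdev(scoops_per_day)          # sample standard deviation

print(f"mean={mean_scoops:.1f} median={median_scoops} stdev={spread:.1f}")
```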
Key takeaway
You have a lot of tools as a data analyst. This is a first glance at the possibilities, and you will explore
many of these tools in-depth throughout this program.