Coursera

This document outlines the steps in the data analysis life cycle according to EMC Corporation, SAS, and common project-based and big data analytics approaches. EMC's cycle has 6 steps: discovery, pre-processing, model planning, model building, communicating results, and operationalizing. SAS's iterative cycle has 7 steps: ask, prepare, explore, model, implement, act, and evaluate. A typical project-based cycle has 5 steps: defining the problem, designing data requirements, pre-processing, analyzing, and visualizing data. Big data analytics follows a 9-step cycle including business evaluation, data identification, acquisition, extraction, validation, aggregation, analysis, visualization, and utilizing results.

Uploaded by ingerabalu123

ASK

What you will learn:

• How data analysts solve problems with data
• The use of analytics for making data-driven decisions
• Spreadsheet formulas and functions
• Dashboard basics, including an introduction to Tableau
• Data reporting basics

Skill sets you will build:

• Asking SMART and effective questions
• Structuring how you think
• Summarizing data
• Putting things into context
• Managing team and stakeholder expectations
• Problem-solving and conflict-resolution

PREPARE

What you will learn:

• How data is generated
• Features of different data types, fields, and values
• Database structures
• The function of metadata in data analytics
• Structured Query Language (SQL) functions

Skill sets you will build:

• Ensuring ethical data analysis practices
• Addressing issues of bias and credibility
• Accessing databases and importing data
• Writing simple queries
• Organizing and protecting data
• Connecting with the data community (optional)

PROCESS

What you will learn:

• Data integrity and the importance of clean data
• The tools and processes used by data analysts to clean data
• Data-cleaning verification and reports
• Statistics, hypothesis testing, and margin of error
• Resume building and interpretation of job postings (optional)

Skill sets you will build:

• Connecting business objectives to data analysis
• Identifying clean and dirty data
• Cleaning small datasets using spreadsheet tools
• Cleaning large datasets by writing SQL queries
• Documenting data-cleaning processes

ANALYZE

What you will learn:

• Steps data analysts take to organize data
• How to combine data from multiple sources
• Spreadsheet calculations and pivot tables
• SQL calculations
• Temporary tables
• Data validation

Skill sets you will build:

• Sorting data in spreadsheets and by writing SQL queries
• Filtering data in spreadsheets and by writing SQL queries
• Converting data
• Formatting data
• Substantiating data analysis processes
• Seeking feedback and support from others during data analysis

SHARE

What you will learn:

• Design thinking
• How data analysts use visualizations to communicate about data
• The benefits of Tableau for presenting data analysis findings
• Data-driven storytelling
• Dashboards and dashboard filters
• Strategies for creating an effective data presentation

Skill sets you will build:

• Creating visualizations and dashboards in Tableau
• Addressing accessibility issues when communicating about data
• Understanding the purpose of different business communication tools
• Telling a data-driven story
• Presenting to others about data
• Answering questions about data

ACT

What you will learn:

• Programming languages and environments
• R packages
• R functions, variables, data types, pipes, and vectors
• R data frames
• Bias and credibility in R
• R visualization tools
• R Markdown for documentation, creating structure, and emphasis

Skill sets you will build:

• Coding in R
• Writing functions in R
• Accessing data in R
• Cleaning data in R
• Generating data visualizations in R
• Reporting on data analysis to stakeholders

CAPSTONE

What you will learn:

• How a data analytics portfolio distinguishes you from other candidates
• Practical, real-world problem-solving
• Strategies for extracting insights from data
• Clear presentation of data findings
• Motivation and ability to take initiative

Skill sets you will build:

• Building a portfolio
• Increasing your employability
• Showcasing your data analytics knowledge, skill, and technical expertise
• Sharing your work during an interview
• Communicating your unique value proposition to a potential employer

1. Ask questions and define the problem.
2. Prepare data by collecting and storing the information.
3. Process data by cleaning and checking the information.
4. Analyze data to find patterns, relationships, and trends.
5. Share data with your audience.
6. Act on the data and use the analysis results.

EMC's data analysis life cycle

EMC Corporation's data analytics life cycle is cyclical with six steps:

1. Discovery
2. Pre-processing data
3. Model planning
4. Model building
5. Communicate results
6. Operationalize

SAS's iterative life cycle

An iterative life cycle was created by a company called SAS, a leading data analytics
solutions provider. It can be used to produce repeatable, reliable, and predictive results:

1. Ask
2. Prepare
3. Explore
4. Model
5. Implement
6. Act
7. Evaluate

Project-based data analytics life cycle

A project-based data analytics life cycle has five simple steps:

1. Identifying the problem
2. Designing data requirements
3. Pre-processing data
4. Performing data analysis
5. Visualizing data

Big data analytics life cycle

Authors Thomas Erl, Wajid Khattak, and Paul Buhler proposed a big data analytics life
cycle in their book, Big Data Fundamentals: Concepts, Drivers & Techniques. Their life
cycle suggests phases divided into nine steps:

1. Business case evaluation
2. Data identification
3. Data acquisition and filtering
4. Data extraction
5. Data validation and cleaning
6. Data aggregation and representation
7. Data analysis
8. Data visualization
9. Utilization of analysis results

Skill: Analytical skills
Description: The qualities and characteristics associated with solving problems using facts

Skill: A technical mindset
Description: The analytical skill that involves breaking processes down into smaller steps and
working with them in an orderly, logical way

Skill: Data design
Description: The analytical skill that involves how you organize information

Skill: Understanding context
Description: The analytical skill that has to do with how you group things into categories

Skill: Data strategy
Description: The analytical skill that involves managing the processes and tools used in data
analysis

5 aspects of thinking analytically

1. Visualization: The graphical representation of information, such as graphs, maps, or other
design elements.
2. Strategy: Maintaining a strategic mindset, recognizing that all data is valuable.
3. Problem-orientation: Keeping the problem in focus in order to identify, describe, and solve it.
4. Correlation: How two or more pieces of data relate to each other. Not all data is
correlated.
5. Big-picture and detail-oriented thinking: Big-picture thinking means looking at the complete
puzzle rather than the individual pieces. To execute a plan, a data analyst uses
detail-oriented thinking and considers the specifics.
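As a rough illustration of the correlation aspect above, the sketch below computes a Pearson correlation coefficient in plain Python. The `pearson` helper and the temperature and sales figures are made up for this example and are not part of the course material; a value near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little or none.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical daily figures, for illustration only.
temperature = [18, 21, 24, 27, 30]
cones_sold = [40, 52, 61, 75, 88]

r = pearson(temperature, cones_sold)
print(round(r, 3))
```

A high coefficient here would only show that the two made-up series move together; it would not by itself show that one causes the other.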

Phase 1
Ask: Define the problem and confirm stakeholder expectations
Phase 2
Prepare: Collect and store data for analysis
Phase 3
Process: Clean and transform data to ensure integrity
Phase 4
Analyze: Use data analysis tools to draw conclusions
Phase 5
Share: Interpret and communicate results to others to make data-driven decisions
Phase 6
Act: Put your insights to work in order to solve the original problem

Variations of the data life cycle


1. Plan: Decide what kind of data is needed, how it will be managed, and who will be
responsible for it.
2. Capture: Collect or bring in data from a variety of different sources.
3. Manage: Care for and maintain the data. This includes determining how and where it is
stored and the tools used to do so.
4. Analyze: Use the data to solve problems, make decisions, and support business goals.
5. Archive: Keep relevant data stored for long-term and future reference.
6. Destroy: Remove data from storage and delete any shared copies of the data.

• Plan: What plans and decisions do you need to make? What data do you need to
answer your question?
• Capture: Where does your data come from? How will you get it?
• Manage: How will you store your data? What should it be used for? How do you
keep this data secure and protected?
• Analyze: How will the company analyze the data? What tools should they use?
• Archive: What should they do with their data when it gets old? How do they
know when it's time?
• Destroy: Should they ever dispose of any data? If so, when and how?

The scenario: interview for a data analyst position

Imagine that you interview for a data analyst role at a local ice cream company. The hiring
manager explains that the company needs a data analyst because they want to learn more about
their customers. First, they want to understand their customers’ ice cream flavor preferences.
Then, they will use this customer data to help make important decisions.

The hiring manager explains that they do not collect any customer data, and they don’t know
where to begin. The hiring manager asks you: Can you please explain how you would approach
this task?

We need to collect sales data for a day, a week, and a month, and then compare the results to see
which flavors sell the most. We can also offer customers small cards with optional review
questions to learn why they choose a flavor and which flavors they would like to see on the menu.
We can put all this information into a spreadsheet and make graphs to visualize the data.
Finally, we will analyze the results and act on them, for example by making more of the flavors
that received the most votes and discontinuing the least popular ones.
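The tallying step in the answer above could also be sketched in a few lines of Python rather than a spreadsheet. This is only an illustration under assumptions: the sales records and flavor names are invented, and `count_flavor_sales` is a hypothetical helper, not something from the course.

```python
from collections import Counter

# Hypothetical sales log: one entry per scoop sold (made-up data).
sales = ["vanilla", "chocolate", "vanilla", "mint", "chocolate", "vanilla"]


def count_flavor_sales(records):
    """Return (flavor, count) pairs ordered from most sold to least sold."""
    return Counter(records).most_common()


ranking = count_flavor_sales(sales)
for flavor, count in ranking:
    print(f"{flavor}: {count}")
```

The same ranking could then be compared across the daily, weekly, and monthly collection periods to see whether the most popular flavors stay the same.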
Key data analyst tools

As you are learning, the most common programs and solutions used by data analysts include
spreadsheets, query languages, and visualization tools. In this reading, you will learn more about
each one. You will cover when to use them, and why they are so important in data analytics.

Spreadsheets

Data analysts rely on spreadsheets to collect and organize data. Two popular spreadsheet
applications you will probably use a lot in your future role as a data analyst are Microsoft Excel and
Google Sheets.

Spreadsheets structure data in a meaningful way by letting you:

• Collect, store, organize, and sort information
• Identify patterns and piece the data together in a way that works for each specific data
project
• Create excellent data visualizations, like graphs and charts

Databases and query languages

A database is a collection of structured data stored in a computer system. Some popular Structured
Query Language (SQL) programs include MySQL, Microsoft SQL Server, and BigQuery.

Query languages:

• Allow analysts to isolate specific information from one or more databases
• Make it easier for you to learn and understand the requests made to databases
• Allow analysts to select, create, add, or download data from a database for analysis
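To make the idea of a query concrete, here is a minimal sketch that runs SQL from Python. It uses SQLite through Python's built-in `sqlite3` module as a stand-in for the database programs named above, and the `sales` table, its columns, and its rows are invented for illustration.

```python
import sqlite3

# In-memory database, so the example leaves no files behind.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (flavor TEXT, scoops INTEGER)")
conn.executemany(
    "INSERT INTO sales (flavor, scoops) VALUES (?, ?)",
    [("vanilla", 3), ("chocolate", 2), ("vanilla", 2), ("mint", 4)],
)

# Isolate specific information: total scoops per flavor, largest total first.
rows = conn.execute(
    "SELECT flavor, SUM(scoops) AS total "
    "FROM sales GROUP BY flavor ORDER BY total DESC"
).fetchall()
conn.close()

for flavor, total in rows:
    print(flavor, total)
```

The `SELECT` statement is the query itself; the surrounding Python only creates the sample table and prints the result.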

Visualization tools

Data analysts use a number of visualization tools, like graphs, maps, tables, charts, and more. Two
popular visualization tools are Tableau and Looker.

These tools:

• Turn complex numbers into a story that people can understand
• Help stakeholders come up with conclusions that lead to informed decisions and effective
business strategies
• Have multiple features
- Tableau's simple drag-and-drop feature lets users create interactive graphs in dashboards and
worksheets
- Looker communicates directly with a database, allowing you to connect your data right to the
visual tool you choose

A career as a data analyst also involves using programming languages, like R and Python, which are
used a lot for statistical analysis, visualization, and other data analysis.

Key takeaway

You have a lot of tools as a data analyst. This is a first glance at the possibilities, and you will explore
many of these tools in-depth throughout this program.
