Getting Started with Data Science: Making Sense of Data with Analytics

eBook (Watermarked)

  • Your Price: $25.59
  • List Price: $31.99
  • Includes EPUB and PDF
  • About eBook Formats
  • This eBook includes the following formats, accessible from your Account page after purchase:

    EPUB: The open industry format known for its reflowable content and usability on supported mobile devices.

    PDF: The popular standard, used most often with the free Adobe® Reader® software.

    This eBook requires no passwords or activation to read. We customize your eBook by discreetly watermarking it with your name, making it uniquely yours.

Also available in other formats.

Register your product to gain access to bonus material or receive a coupon.

Description

  • Copyright 2016
  • Dimensions: 7" x 9-1/8"
  • Pages: 608
  • Edition: 1st
  • eBook (Watermarked)
  • ISBN-10: 0-13-399125-3
  • ISBN-13: 978-0-13-399125-3

Master Data Analytics Hands-On by Solving Fascinating Problems You’ll Actually Enjoy!

Harvard Business Review recently called data science “The Sexiest Job of the 21st Century.” It’s not just sexy: For millions of managers, analysts, and students who need to solve real business problems, it’s indispensable. Unfortunately, there’s been nothing easy about learning data science–until now.

Getting Started with Data Science takes its inspiration from worldwide best-sellers like Freakonomics and Malcolm Gladwell’s Outliers: It teaches through a powerful narrative packed with unforgettable stories.

Murtaza Haider offers informative, jargon-free coverage of basic theory and technique, backed with plenty of vivid examples and hands-on practice opportunities. Everything’s software and platform agnostic, so you can learn data science whether you work with R, Stata, SPSS, or SAS. Best of all, Haider teaches a crucial skillset most data science books ignore: how to tell powerful stories using graphics and tables. Every chapter is built around real research challenges, so you’ll always know why you’re doing what you’re doing.

You’ll master data science by answering fascinating questions, such as:
• Are religious individuals more or less likely to have extramarital affairs?
• Do attractive professors get better teaching evaluations?
• Does the higher price of cigarettes deter smoking?
• What determines housing prices more: lot size or the number of bedrooms?
• How do teenagers and older people differ in the way they use social media?
• Who is more likely to use online dating services?
• Why do some purchase iPhones and others BlackBerry devices?
• Does the presence of children influence a family’s spending on alcohol?

For each problem, you’ll walk through defining your question and the answers you’ll need; exploring how
others have approached similar challenges; selecting your data and methods; generating your statistics;
organizing your report; and telling your story. Throughout, the focus is squarely on what matters most:
transforming data into insights that are clear, accurate, and can be acted upon.

The book’s website (www.ibmpressbooks.com/title/9780133991024) offers additional pages and software codes to illustrate every method from the book in R, SPSS, Stata, and SAS. The additional content and code files will be available for download by 1/26.



Downloads

Download the code files:

Chapter 4 (444 KB .zip)

Chapter 5 (2.37 MB .zip)

Chapter 6 (769 KB .zip)

Chapter 7 (813 KB .zip)

Chapter 8 (909 KB .zip)

Chapter 9 (3.51 MB .zip)

Chapter 10 (1.89 KB .zip)

Chapter 11 (294 KB .zip)

Chapter 12 (43 KB .zip)

Sample Content

Table of Contents

Preface xix
Chapter 1 The Bazaar of Storytellers 1
Data Science: The Sexiest Job in the 21st Century 4
Storytelling at Google and Walmart 6
Getting Started with Data Science 8
Do We Need Another Book on Analytics? 8
Repeat, Repeat, Repeat, and Simplify 10
Chapters’ Structure and Features 10
Analytics Software Used 12
What Makes Someone a Data Scientist? 12
Existential Angst of a Data Scientist 15
Data Scientists: Rarer Than Unicorns 16
Beyond the Big Data Hype 17
Big Data: Beyond Cheerleading 18
Big Data Hubris 19
Leading by Miles 20
Predicting Pregnancies, Missing Abortions 20
What’s Beyond This Book? 21
Summary 23
Endnotes 24
Chapter 2 Data in the 24/7 Connected World 29
The Liberated Data: The Open Data 30
The Caged Data 30
Big Data Is Big News 31
It’s Not the Size of Big Data; It’s What You Do with It 33
Free Data as in Free Lunch 34
FRED 34
Quandl 38
U.S. Census Bureau and Other National Statistical Agencies 38
Search-Based Internet Data 39
Google Trends 40
Google Correlate 42
Survey Data 44
PEW Surveys 44
ICPSR 45
Summary 45
Endnotes 46
Chapter 3 The Deliverable 49
The Final Deliverable 52
What Is the Research Question? 53
What Answers Are Needed? 54
How Have Others Researched the Same Question in the Past? 54
What Information Do You Need to Answer the Question? 58
What Analytical Techniques/Methods Do You Need? 58
The Narrative 59
The Report Structure 60
Have You Done Your Job as a Writer? 62
Building Narratives with Data 62
“Big Data, Big Analytics, Big Opportunity” 63
Urban Transport and Housing Challenges 68
Human Development in South Asia 77
The Big Move 82
Summary 95
Endnotes 96
Chapter 4 Serving Tables 99
2014: The Year of Soccer and Brazil 100
Using Percentages Is Better Than Using Raw Numbers 104
Data Cleaning 106
Weighted Data 106
Cross Tabulations 109
Going Beyond the Basics in Tables 113
Seeing Whether Beauty Pays 115
Data Set 117
What Determines Teaching Evaluations? 118
Does Beauty Affect Teaching Evaluations? 124
Putting It All on (in) a Table 125
Generating Output with Stata 129
Summary Statistics Using Built-In Stata 130
Using Descriptive Statistics 130
Weighted Statistics 134
Correlation Matrix 134
Reproducing the Results for the Hamermesh and Parker Paper 135
Statistical Analysis Using Custom Tables 136
Summary 137
Endnotes 139
Chapter 5 Graphic Details 141
Telling Stories with Figures 142
Data Types 144
Teaching Ratings 144
The Congested Lives in Big Cities 168
Summary 185
Endnotes 185
Chapter 6 Hypothetically Speaking 187
Random Numbers and Probability Distributions 188
Casino Royale: Roll the Dice 190
Normal Distribution 194
The Student Who Taught Everyone Else 195
Statistical Distributions in Action 196
Z-Transformation 198
Probability of Getting a High or Low Course Evaluation 199
Probabilities with Standard Normal Table 201
Hypothetically Yours 205
Consistently Better or Happenstance 205
Mean and Not So Mean Differences 206
Handling Rejections 207
The Mean and Kind Differences 211
Comparing a Sample Mean When the Population SD Is Known 211
Left Tail Between the Legs 214
Comparing Means with Unknown Population SD 217
Comparing Two Means with Unequal Variances 219
Comparing Two Means with Equal Variances 223
Worked-Out Examples of Hypothesis Testing 226
Best Buy–Apple Store Comparison 226
Assuming Equal Variances 227
Exercises for Comparison of Means 228
Regression for Hypothesis Testing 228
Analysis of Variance 231
Significantly Correlated 232
Summary 233
Endnotes 234
Chapter 7 Why Tall Parents Don’t Have Even Taller Children 235
The Department of Obvious Conclusions 235
Why Regress? 236
Introducing Regression Models 238
All Else Being Equal 239
Holding Other Factors Constant 242
Spuriously Correlated 244
A Step-By-Step Approach to Regression 244
Learning to Speak Regression 247
The Math Behind Regression 248
Ordinary Least Squares Method 250
Regression in Action 259
This Just In: Bigger Homes Sell for More 260
Does Beauty Pay? Ask the Students 272
Survey Data, Weights, and Independence of Observations 276
What Determines Household Spending on Alcohol and Food 279
What Influences Household Spending on Food? 285
Advanced Topics 289
Homoskedasticity 289
Multicollinearity 293
Summary 296
Endnotes 296
Chapter 8 To Be or Not to Be 299
To Smoke or Not to Smoke: That Is the Question 300
Binary Outcomes 301
Binary Dependent Variables 301
Let’s Question the Decision to Smoke or Not 303
Smoking Data Set 304
Exploratory Data Analysis 305
What Makes People Smoke: Asking Regression for Answers 307
Ordinary Least Squares Regression 307
Interpreting Models at the Margins 310
The Logit Model 311
Interpreting Odds in a Logit Model 315
Probit Model 321
Interpreting the Probit Model 324
Using Zelig for Estimation and Post-Estimation Strategies 329
Estimating Logit Models for Grouped Data 334
Using SPSS to Explore the Smoking Data Set 338
Regression Analysis in SPSS 341
Estimating Logit and Probit Models in SPSS 343
Summary 346
Endnotes 347
Chapter 9 Categorically Speaking About Categorical Data 349
What Is Categorical Data? 351
Analyzing Categorical Data 352
Econometric Models of Binomial Data 354
Estimation of Binary Logit Models 355
Odds Ratio 356
Log of Odds Ratio 357
Interpreting Binary Logit Models 357
Statistical Inference of Binary Logit Models 362
How I Met Your Mother? Analyzing Survey Data 363
A Blind Date with the Pew Online Dating Data Set 365
Demographics of Affection 365
High-Techies 368
Romancing the Internet 368
Dating Models 371
Multinomial Logit Models 378
Interpreting Multinomial Logit Models 379
Choosing an Online Dating Service 380
Pew Phone Type Model 382
Why Some Women Work Full-Time and Others Don’t 389
Conditional Logit Models 398
Random Utility Model 400
Independence From Irrelevant Alternatives 404
Interpretation of Conditional Logit Models 405
Estimating Logit Models in SPSS 410
Summary 411
Endnotes 413
Chapter 10 Spatial Data Analytics 415
Fundamentals of GIS 417
GIS Platforms 418
Freeware GIS 420
GIS Data Structure 420
GIS Applications in Business Research 420
Retail Research 421
Hospitality and Tourism Research 422
Lifestyle Data: Consumer Health Profiling 423
Competitor Location Analysis 423
Market Segmentation 423
Spatial Analysis of Urban Challenges 424
The Hard Truths About Public Transit in North America 424
Toronto Is a City Divided into the Haves, Will Haves, and Have Nots 429
Income Disparities in Urban Canada 434
Where Is Toronto’s Missing Middle Class? It Has Suburbanized Out of Toronto 435
Adding Spatial Analytics to Data Science 444
Race and Space in Chicago 447
Developing Research Questions 448
Race, Space, and Poverty 450
Race, Space, and Commuting 454
Regression with Spatial Lags 457
Summary 460
Endnotes 461
Chapter 11 Doing Serious Time with Time Series 463
Introducing Time Series Data and How to Visualize It 464
How Is Time Series Data Different? 468
Starting with Basic Regression Models 471
What Is Wrong with Using OLS Models for Time Series Data? 473
Newey–West Standard Errors 473
Regressing Prices with Robust Standard Errors 474
Time Series Econometrics 478
Stationary Time Series 479
Autocorrelation Function (ACF) 479
Partial Autocorrelation Function (PCF) 481
White Noise Tests 483
Augmented Dickey Fuller Test 483
Econometric Models for Time Series Data 484
Correlation Diagnostics 485
Invertible Time Series and Lag Operators 485
The ARMA Model 487
ARIMA Models 487
Distributed Lag and VAR Models 488
Applying Time Series Tools to Housing Construction 492
Macro-Economic and Socio-Demographic Variables Influencing Housing Starts 498
Estimating Time Series Models to Forecast New Housing Construction 500
OLS Models 501
Distributed Lag Model 505
Out-of-Sample Forecasting with Vector Autoregressive Models 508
ARIMA Models 510
Summary 522
Endnotes 524
Chapter 12 Data Mining for Gold 525
Can Cheating on Your Spouse Kill You? 526
Are Cheating Men Alpha Males? 526
UnFair Comments: New Evidence Critiques Fair’s Research 527
Data Mining: An Introduction 527
Seven Steps Down the Data Mine 529
Establishing Data Mining Goals 529
Selecting Data 529
Preprocessing Data 530
Transforming Data 530
Storing Data 531
Mining Data 531
Evaluating Mining Results 531
Rattle Your Data 531
What Does Religiosity Have to Do with Extramarital Affairs? 533
The Principal Components of an Extramarital Affair 539
Will It Rain Tomorrow? Using PCA For Weather Forecasting 540
Do Men Have More Affairs Than Females? 542
Two Kinds of People: Those Who Have Affairs, and Those Who Don’t 542
Models to Mine Data with Rattle 544
Summary 550
Endnotes 550
Index 553


