Statistics for Data Analysis Using R

Rating: 4.6 (2664 reviews)

Learn Programming in R & R Studio • Descriptive, Inferential Statistics • Plots for Data Visualization • Data Science

Highest Rated

Created by Sandeep Kumar, Quality Gurus Inc.

Last updated 9/2021

English

What you'll learn

Learn R programming from the ground up, starting from the basics and progressing to advanced data analysis techniques.
Learn the basic statistical concepts first, followed by practical application using R Studio, combining theory and practice for effective learning.
Master descriptive statistics, including mean, mode, median, skewness, and kurtosis, and how to apply these concepts to your data analysis.
Understand and perform inferential statistics such as one- and two-sample z-tests, t-tests, Chi-Square tests, F-tests, ANOVA, TukeyHSD, and more.
Explore probability distributions, including normal, binomial, and Poisson, and their applications in data analysis.
Develop the skills to perform data manipulation, visualization, and statistical analysis using R.
Apply statistical concepts in real-world scenarios, enhancing your problem-solving abilities and decision-making skills.
Boost your career prospects with a strong foundation in statistics and R programming, valuable skills in today’s data-driven job market.
Equip yourself with the tools to handle large datasets and perform complex statistical analyses with confidence.
Enhance your ability to make data-driven decisions by mastering the use of R for statistical analysis.

Course content

10 sections • 111 lectures • 12h 25m total length

Introduction - Section 1 • 1:31
Install and set up R and RStudio, save files, explore basic functions and data types, learn simple operations, and access downloadable scripts, assignments, and solutions.
Installing R and R Studio (Windows) • 6:04
Install and reinstall R and R Studio on Windows, guiding you through downloading R 3.4.1, choosing 32 and 64 bit options, and creating a desktop shortcut for quick access.
The First Look of R and R Studio. R you ready? • 11:40
Explore the basics of R and R Studio, including the console, script window, environment, history, plots, and packages, and learn how to run, save, clean, and customize the interface.
The First Look at the Functions in R • 6:43
Examine how functions in R use arguments to perform tasks like print, create vectors with c, and compute the mean, laying the groundwork for statistics and histograms.
Saving the R Script File • 6:14
Learn how to save your R work by writing code in a script, using comments, assigning variables, and setting the working directory, then loading and running the script.
Data Types in R • 4:45
Identify the three core data types in R: numeric, character, and logical. Learn to store text in quotes, use TRUE or FALSE for logical values, and check the type with class().
Simple Mathematical Operations • 2:11
Explore basic mathematical operations in R, including plus, minus, multiplication, division, power, remainder, and quotient, then apply descriptive statistics—mean, mode, median, standard deviation, and range—with R.
Download - Section 1 Notes and Codes • 0:06
Section 1 - Practice Assignment • 0:04

Introduction - Section 2 • 1:30
Explore descriptive statistics theory, focusing on population and sample concepts. Learn measures of central tendency and variation, including mean, mode, median, range, standard deviation, and interquartile range, with downloadable slides.
Understanding Basic Statistical Terms (Theory) • 6:57
Clarify population and sample, and define parameters and statistics, including mean, standard deviation, proportion, with sample size n and population size N.
Descriptive Statistics (Theory) • 4:29
Explore descriptive statistics and its difference from inferential statistics, and summarize data using central tendency and variability. Learn mean, median, mode, percentiles, quartiles, range, and standard deviation with practical examples.
Measurement of Central Tendency (Theory) • 12:15
Explore central tendency measures—mean, mode, and median—along with their sensitivities to outliers, and learn to compute percentiles and quartiles using simple data examples for population and sample contexts.
Measurement of Variation (Theory) • 11:36
Analyze measures of variation—range, interquartile range, and standard deviation—covering outlier effects, box-and-whisker plots, and the sample versus population distinction (S vs Sigma).
Download - Section 2 Slides • 0:02

Introduction - Section 3 • 1:42
Apply descriptive statistics in R by computing mean, mode, median, range, interquartile range, and standard deviation using code; learn how to access help in R and review notes and assignment.
Getting Help • 2:50
Explore descriptive statistics in R by calculating mean, median, mode, range, interquartile range, and standard deviation. Learn to access help in R with help, ?, and ?? for function details.
Measurement of Central Tendency - Mean (Using R) • 6:45
Calculate the mean of a height vector in R. Handle missing values (NA) with na.rm, then use trim to discard extremes.
Measurement of Central Tendency - Median and Mode (Using R) • 4:11
Learn to measure central tendency in R by computing the median from a height vector via sorting, and derive the mode indirectly from a frequency table.
Measurement of Variation - Range, IQR and Standard Deviation (Using R) • 4:24
Explore variation measurement in R by computing range, quantiles and quartiles, and interquartile range. Learn to distinguish and calculate sample versus population standard deviation using length and n to adjust.
Download - Section 3 Notes and Codes • 0:02
Section 3 - Practice Assignment • 0:04
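
As a rough illustration of the Section 3 topics, the sketch below computes the same summaries with base R. The height values are hypothetical and are not taken from the course files; the variable names are illustrative only.

```r
# Hypothetical height data (cm) with one missing value; not from the course.
height <- c(150, 160, 160, 170, 180, NA)

# na.rm = TRUE drops the NA before computing each summary.
mean_height   <- mean(height, na.rm = TRUE)
median_height <- median(height, na.rm = TRUE)

# Base R has no function for the statistical mode, so derive it
# from a frequency table (table() ignores NA by default).
freq        <- table(height)
mode_height <- as.numeric(names(freq)[freq == max(freq)])

value_range <- range(height, na.rm = TRUE)  # smallest and largest value
iqr_height  <- IQR(height, na.rm = TRUE)    # third quartile minus first quartile

# sd() uses the sample formula (divides by n - 1);
# rescale it to get the population standard deviation.
n             <- sum(!is.na(height))
sample_sd     <- sd(height, na.rm = TRUE)
population_sd <- sample_sd * sqrt((n - 1) / n)
```

The population version is always slightly smaller than the sample version, and the gap shrinks as n grows.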

Introduction - Section 4 • 1:58
Explore vectors, factors, lists, matrices, and data frames in R, and see how Excel columns relate to vectors and data frames to worksheets for importing external data.
Introduction • 8:49
Introduce data organization in R by exploring vectors and data types, and demonstrate type checking and conversion with is.numeric, is.character, is.logical, as.character, and as.numeric.
Vectors Explained • 8:59
Explore creating and manipulating vectors in R, including numeric and character vectors, using c and seq, indexing by position or name, and assigning names to vectors.
Factors Explained • 11:46
Explore how vectors are created with c() and transformed into factors to reveal categories and levels. Understand nominal versus ordinal data and define ordered factors with explicit levels for summaries.
Lists Explained • 5:38
Learn how a list bundles vectors, factors, matrices, and data frames into a single structure, name its elements, and access them using indexing or the dollar sign.
Matrix Explained • 13:35
Understand how to create and name a two-dimensional matrix in R, combining hours and marks into columns, and read data from csv with read.csv, then access items with [row, column].
Data Frames Explained • 12:33
Explore how to create and manipulate data frames in R, mixing numeric, character, and logical data, and access, summarize, and analyze them with mean, summary, and indexing.
Download - Section 4 Notes and Codes • 0:05
Section 4 - Practice Assignment • 0:02
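
The data structures listed in Section 4 can be sketched in a few lines of base R. All values and names below (students, hours, marks, sizes) are made up for illustration and are not the course's own data.

```r
# Named numeric vector: elements can be accessed by position or by name.
hours <- c(2, 4, 6)
names(hours) <- c("amy", "ben", "cy")
ben_hours <- hours["ben"]

# Ordered factor: ordinal categories with explicit levels (S < M < L).
size <- factor(c("M", "S", "L"), levels = c("S", "M", "L"), ordered = TRUE)
medium_below_large <- size[1] < size[3]

# Matrix: two numeric columns bound together, indexed as [row, column].
marks <- c(55, 70, 85)
score_matrix <- cbind(hours, marks)
ben_marks <- score_matrix[2, "marks"]

# Data frame: columns of different types side by side.
students <- data.frame(
  student = names(hours),
  hours   = hours,
  marks   = marks,
  passed  = marks >= 60
)
average_marks <- mean(students$marks)

# List: bundles unlike objects; elements are reached with $ or [[ ]].
bundle <- list(scores = score_matrix, sizes = size, table = students)
cy_passed <- bundle$table[3, "passed"]
```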

Introduction - Section 5 • 1:07
Explore data visualization in R by learning basic commands to plot scatter plots, histograms, and box-and-whisker plots. Use these graphs to convey data clearly to management.
Your first plot in R • 5:01
Learn to plot data in R with the plot function, producing scatter plots, histograms, and box and whisker plots. Compare descriptive statistics for data summaries, and try the nycflights13 dataset.
*** Scatter Plot *** • 9:01
Plot an R scatter plot from nycflights13 data to show how arrival delay relates to departure delay, with arrival delay on the x-axis and departure delay on the y-axis.
Add the Plot Main and Axis Label Text • 6:23
Export and zoom the scatter plot to study the relationship between arrival delays and departure delays, and add a main title with arrival delay and departure delay labels.
Let's Draw Some Lines on the Plot • 7:12
Draw lines on a scatter plot with abline by setting y intercept and slope, exploring 45-degree and horizontal lines, then re-plot to refine visuals.
Change the Plot Characters (pch) from Circles to Plus Signs • 3:30
Change the plot characters (pch) from circles to plus signs by setting pch = 3 on the arrival versus departure delay scatter plot, and explore other shapes to differentiate categories.
Let's Look at Filtered Data • 6:49
Explore how to customize plots in R by changing pch symbols, filtering data by carrier, and layering points, colors, and text to compare UA and AA on a single plot.
One is not enough, I want more plots on a single page! • 6:54
Learn to create side-by-side plots in R with par and mfrow, displaying two graphs for UA and AA with identical x and y limits for clear comparison.
Add text to the plot • 8:56
Explore adding text to plots in R with text and margin text using mtext, placing labels by x, y and adj, and drawing lines with h and v.
Make plot colorful, and text bigger and bold • 7:56
Learn to customize a scatter plot in R by adjusting point size with cex, colors with col and col.main, and fonts with font.main and font.lab, plus adding text and mtext.
Multiple pairs of scatter diagrams - when one plot is not enough! • 8:23
Explore how to create multiple scatter plots in one diagram with the pairs function, examining relationships between arrival delay, departure delay, distance, and air time in the nycflights13 flights data.
Time Series Plot • 3:54
Plot the temperature against time using a single variable time series in R, demonstrated with the built-in weather dataset and its hourly measurements.
*** Histogram *** • 4:56
Master histogram creation for the nycflights13 distance data by drawing histograms, adjusting breaks, and filtering by carrier (UA or AA) to compare flight distances, with x-axis distance and y-axis frequency.
*** Box and Whisker Plot *** • 12:41
Learn to create box and whisker plots in R using the nycflights13 dataset, plotting distance by carrier and making side-by-side comparisons with partitioning.
Download - Section 5 Notes and Codes • 0:03
Section 5 - Practice Assignment • 0:02

Introduction - Section 6 • 1:21
Revisit descriptive statistics, reviewing central tendency and variation, and demonstrate using R packages like psych to compute and present data summaries.
Descriptive Statistics Using psych Package • 11:09
Learn to use the psych package in R to compute descriptive statistics, summarize data frames with describe, and produce group-level summaries with describeBy, using the nycflights13 dataset.
Download - Section 6 Notes and Codes • 0:03

Introduction - Section 7 • 1:02
Define basic probability and explore unions, intersections, and conditional probability, along with addition and multiplication laws, factorials, permutations, and combinations. Lay the groundwork for probability distribution.
Probability Definition • 7:48
Define probability as the ratio of favorable outcomes to total outcomes, illustrated using a dice example. Explain sample space, experiment, trial, event, and the probability range between 0 and 1.
Probability - Union and Intersection • 9:36
Explore probability through union and intersection using Venn diagrams, illustrating A and B on a dice roll to show A∪B and A∩B, and explain mutually exclusive, independent, and complementary events.
Probability - The Law of Addition, Multiplication and Conditional Probability • 16:18
Explore addition and multiplication rules in probability, using Venn and tree diagrams to compute P(A∪B) and P(A∩B), with independent and dependent cases like dice and marbles.
Factorial, Permutations and Combinations • 6:25
Explore factorials, permutations, and combinations, including factorial 0 = 1. Distinguish when order matters and apply nPr and nCr formulas, e.g., 5P2 = 20 and 5C2 = 10, in probability.
Download - Section 7 Slides • 0:02
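
Although Section 7 is theory only, the counting results quoted in it (0! = 1, 5P2 = 20, 5C2 = 10) can be checked with base R. This is a small sketch rather than course material; the dice event at the end is an assumed example.

```r
n <- 5  # items available
r <- 2  # items chosen

zero_factorial <- factorial(0)  # 0! is defined as 1

# Permutations: order matters, nPr = n! / (n - r)!
permutations <- factorial(n) / factorial(n - r)

# Combinations: order does not matter, nCr = n! / (r! * (n - r)!)
combinations <- choose(n, r)

# Probability as favorable outcomes over total outcomes:
# rolling an even number on a fair six-sided die.
sample_space <- 1:6
p_even <- sum(sample_space %% 2 == 0) / length(sample_space)
```

Each combination of two items can be arranged in 2! = 2 orders, which is why the permutation count is twice the combination count here.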

Introduction - Section 8 • 1:46
Introduce the central limit theorem and three key distributions (normal, binomial, and Poisson), using R and the visualize package to plot and illustrate their properties for inferential statistics.
Central Limit Theorem (Theory) • 4:59
Explain the central limit theorem: the sampling distribution of the mean becomes approximately normal for large samples, regardless of population shape, with standard error sigma over sqrt(n), demonstrated in R.
Central Limit Theorem Demonstration Using R • 15:13
Demonstrate the central limit theorem by generating 10,000 uniform numbers with runif, drawing sample means of 4, 9, and 100 items, and illustrating their normal distribution and shrinking standard error.
*** Normal Probability Distribution (Theory) *** • 19:35
Explore the normal probability distribution, a symmetric bell-shaped curve defined by mean and standard deviation. Find areas under the curve using sigma levels and z-scores for the standard normal distribution.
R Functions for Normal Distribution - rnorm, pnorm, qnorm and dnorm • 11:31
Explore how to use R’s four functions for the standard normal distribution—rnorm, pnorm, qnorm, and dnorm—to generate random numbers, compute probabilities, and plot the curve.
Plotting Normal Distribution Using R Functions • 7:23
Plot the normal distribution using dnorm with a -4 to 4 z-value vector. Explain creating a smooth line plot and assessing areas beyond ±3 sigma with pnorm.
Introducing "visualize" Package • 9:26
Learn to visualize distributions in R with the visualize package, using visualize.norm to explore the standard normal and custom mu and sigma, including left, right, and tail areas.
*** Binomial Probability Distribution (Theory) *** • 15:51
Explore the binomial probability distribution for discrete data, with independent n trials, two outcomes, and constant p, illustrated by coin flips; derive mean, variance, and the binomial formula.
R Functions for Binomial Distribution - rbinom, pbinom, qbinom and dbinom • 14:49
Explore binomial distribution fundamentals with fixed trials and independent trials, and master rbinom, pbinom, qbinom, and dbinom for probabilities, quantiles, and visualize using the visualize package.
Plotting Binomial Distribution Using R Functions • 3:26
Explore how dbinom computes the probability of each number of heads in ten coin flips and visualize these binomial probabilities with a barplot using the visualize package.
Binomial Distribution using Visualize Package • 5:57
Load the visualize package and plot a binomial distribution with n=10 and p=0.5, showing five or less and five or more, and relate mean and variance to np and np(1-p).
*** Poisson Distribution (Theory) *** • 6:16
Explore the Poisson distribution for discrete data, compare it with binomial, and apply the mean mu and the Poisson formula to model rare events.
R Functions for Poisson Distribution - rpois, ppois, qpois and dpois • 6:18
Explore Poisson distribution in R, using rpois, ppois, qpois, and dpois to compute probabilities, with lambda 3.6 and seven people (about 4.24%) in a booking-counter scenario.
Plotting Poisson Distribution Using R Functions • 2:51
Plot the Poisson distribution in R by computing 0-10 probabilities with dpois (lambda 3.6) and visualize the results with a bar chart.
Poisson Distribution using Visualize Package • 5:18
Explore Poisson distribution plotting in R with the visualize package, using lambda 3.6 to compute probabilities for seven or fewer, five or more, and tails or bounded options.
Download - Section 8 Notes and Codes • 0:03
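
The d/p/q/r function families named in Section 8 follow one pattern across distributions. The sketch below uses only base R (not the visualize package) and reuses the parameters quoted in the lecture summaries: lambda = 3.6 for the Poisson case and ten fair coin flips for the binomial case. It is an illustration, not the course's script.

```r
# Poisson with lambda = 3.6
lambda       <- 3.6
p_exactly_7  <- dpois(7, lambda)                      # P(X = 7), roughly 0.042
p_at_most_7  <- ppois(7, lambda)                      # P(X <= 7)
p_at_least_5 <- ppois(4, lambda, lower.tail = FALSE)  # P(X >= 5) = P(X > 4)

# Binomial: number of heads in 10 fair coin flips
p_heads         <- dbinom(0:10, size = 10, prob = 0.5)  # P(X = 0), ..., P(X = 10)
p_five_or_fewer <- pbinom(5, size = 10, prob = 0.5)     # P(X <= 5)
binom_mean      <- 10 * 0.5                             # n * p
binom_variance  <- 10 * 0.5 * (1 - 0.5)                 # n * p * (1 - p)

# Standard normal: area within three sigma of the mean, and the
# z value that leaves 5% in the upper tail.
within_3_sigma <- pnorm(3) - pnorm(-3)
z_upper_5pct   <- qnorm(0.95)
```

The prefix selects the quantity: d for density or point probability, p for cumulative probability, q for quantiles, and r for random draws.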

Introduction - Section 9 • 1:25
Delve into hypothesis testing in inferential statistics by examining samples to judge populations, and compare means and variances using tests like z, t, chi-square, F, and ANOVA with R demonstrations.
Types of Mean and Variance Tests • 3:52
Explore hypothesis tests for mean and variance, including one-sample z and t tests, two-sample and paired t tests, and ANOVA.
Hypothesis Testing - Types of Errors (Theory) • 15:39
Compare type 1 and type 2 errors in hypothesis testing using a perfume example, and explain alpha, beta, confidence level, and how sample size affects power.
What is p value? (Theory) • 4:10
Learn what a p-value is: the probability of observing data at least as extreme as the sample, assuming the null hypothesis is true. A p-value below the chosen alpha leads to rejecting the null; a higher p-value means failing to reject it.
*** Hypothesis Testing - One Sample Z Test (Theory) *** • 13:06
Master the one-sample z test to assess mean shifts, formulating null and alternate hypotheses, choosing alpha, computing the z statistic, and interpreting one- and two-tail results.
One Sample z Test Using R • 10:17
Explore performing a one-sample z test in R on perfume volumes (n=100, x̄=152, σ=2) to test if the mean exceeds 150 at 95% confidence, via z statistics and p-value approaches.
One Sample z Test using BSDA Package • 4:30
Install and load the BSDA package, run z.test on perfume_volumes machine 1 with mu 150, sigma.x 2, and alternative greater, and reject the null with a reported p-value below 2.2e-16.
*** One Sample t Test (Theory) *** • 6:18
Apply the one sample t test to assess if the mean volume differs from 150 cc, using the t statistic with sample s and two-tailed 95% confidence; fail to reject.
One Sample t Test Using R • 5:01
Plot t distributions for dof 19, 9, and 4 with dt in R to illustrate the one sample t test and assess whether a perfume’s mean volume changes.
Visualizing One Sample t Test Results using Visualize Package • 5:58
Perform a one-sample t test in R with bottle volumes, using the visualize package to plot the t distribution and reject the null that the mean is 150 cc.
*** One Sample Variance Test - Chi Square Test (Theory) *** • 5:46
Perform a one-sample chi-square test for variance, comparing sample variance to the population variance, and decide via a 95% one-tailed test of null vs alternative.
One Sample Variance Test Using EnvStats Package • 10:52
Learn to perform a one-sample variance test in R with EnvStats, comparing a sample variance of 5 to historical 4. Interpret the chi-square result to assess variance increase.
Chi Square Distribution for One Sample Variance Test • 7:49
Learn to conduct a one-sample chi-square test for variance in R, using built-in functions and the EnvStats package, compute the statistic and critical value, and interpret a fail-to-reject outcome.
*** Two Sample Z Test (Theory) *** • 17:28
Apply the two-sample z test to compare two means, formulating mu1 = mu2 versus mu1 not equal to mu2 and computing z from X1bar and X2bar using sigma1^2/n1 and sigma2^2/n2.
Two Sample Z Test Using R • 9:15
Learn how to perform a two-sample z test in R using the BSDA package, comparing mean volumes from two machines, interpreting a p-value, and visualizing results.
Visualizing Two Sample Z Test Using Visualize Package • 10:10
Visualize two sample z test results with the visualize package by comparing mean volumes of two machines, using box plots and overlapping histograms for clear presentation.
Two Sample Z Test for Populations with Different Means • 2:22
Perform a two-sample z-test to detect a 1cc difference between machines by setting mu = -1. The null hypothesis is that the mean difference equals -1 (p = 0.94).
*** Two Sample t Test (Theory) *** • 8:45
Explain the theory of two-sample t tests, distinguishing independent and dependent samples, compare with paired t-test, and derive the t statistic with pooled variance when variances are equal.
Two Sample t Test (Equal Variance) Using R • 8:48
Compare two-sample t tests with equal variances in R using a perfume volume example, check variance with var.test, perform t.test, and visualize with a boxplot.
Two Sample t Test (Unequal Variance) Using R • 6:06
Learn to perform a two-sample t test with unequal variances in R using var.test and t.test on mc1 and mc3, interpret the p-value, and compare box plots to assess differences.
*** Paired t Test (Theory) *** • 8:20
Explore the paired t test for dependent samples by analyzing before-and-after measurements, computing differences, and testing whether the mean difference significantly deviates from zero.
Paired t Test Using R • 6:23
Explore how to perform a paired t test in R using bp.before and bp.after, interpret a p-value of 0.53, and visualize differences with a bp.diff box plot.
*** Two Sample Variance Test Using F Test (Theory) *** • 11:03
Use a two-sample variance F-test to test whether sigma1^2 equals sigma2^2; compute F from the two sample variances and compare it to the critical F value; reject when F falls in the rejection region.
Two Sample Variance Test (F Distribution) Using R • 11:27
Perform a two-sample variance test in R using F distribution to compare two machines' variances via raw data and var.test, rejecting equal variances at 90% confidence.
Visualizing Two Sample Variance Test Results using Visualize Package • 5:04
Visualize two-sample variance test by plotting F distribution with degrees of freedom and comparing f calculated to f critical, compare machine A and B variances with a box plot.
*** ANOVA Introduction (Theory) *** • 9:02
Explains what ANOVA is, as analysis of variance using the f-test to assess equality of several means, and why it replaces multiple t-tests when comparing more than two groups.
Understanding the concept behind ANOVA without doing any calculation • 11:29
Grasp the intuition behind ANOVA by comparing machine groups, using box and whisker plots, and learning how variation between and within drives mean differences.
Formulas and calculations in ANOVA (Theory) • 5:12
Explore ANOVA by defining variance and sum of squares, separating between and within variation (treatment and error), and using mean squares with degrees of freedom to form the F value.
ANOVA Example Using Manual Calculations (Theory) • 14:04
Explore manual ANOVA across three machines, computing within and between sum of squares, mean sum of squares, and the F value; reject the null hypothesis by comparing F with the critical value.
Analysis of Variance (ANOVA) Using R • 13:41
Perform a complete ANOVA in R by reshaping data into volume and machine columns, running aov, interpreting the summary, and planning a Tukey HSD post-hoc test.
Post-hoc Test - TukeyHSD • 3:25
Following a one-way ANOVA, TukeyHSD post-hoc tests compare pairs and show that machine 3 differs from machines 1 and 2, with adjusted p-values indicating significance, while machines 1 and 2 do not differ from each other.
*** Goodness of Fit Test (Theory) *** • 7:47
Apply the chi-square goodness-of-fit test to decide if data fit a specified distribution, using a coin-toss example. Compare observed and expected counts and understand null and alternative hypotheses.
Goodness of Fit Test Using R - Example 1 • 7:35
Apply a chi-square goodness-of-fit test in R to assess a 40/60 coin flip, using observed versus expected counts, p-value, and the decision to reject the null of an unbiased coin.
Goodness of Fit Test Using R - Example 2 • 5:10
Apply a chi-square goodness-of-fit test in R to compare expected and observed shirt-size distributions. The example uses 0.2, 0.4, 0.3, 0.1 with observed counts and visualizes the result.
*** Contingency Tables (Theory) *** • 9:20
Explore contingency tables to test relationships between two discrete variables using chi-square, by formulating null and alternative hypotheses and interpreting observed versus expected counts.
Contingency Table Using R - Example 1 • 9:04
Build a three-by-three contingency table in R from operator production across three shifts, perform a chi-square test, and reject the null of independence based on the p-value.
Contingency Table Using R - Example 2 • 12:30
Generate and interpret a contingency table in R using the CrossTable function from the gmodels package with nycflights13 data, comparing airlines by month and performing a Pearson chi-square test.
Download - Section 9 Notes and Codes • 0:05
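
The one-sample z test summarized in Section 9 (n = 100, sample mean 152, sigma 2, testing whether the mean exceeds 150 at 95% confidence) can be worked by hand in base R. This sketch uses only the summary figures quoted above, not the course's perfume_volumes data, and the variable names are illustrative.

```r
n     <- 100   # sample size
x_bar <- 152   # sample mean
sigma <- 2     # known population standard deviation
mu0   <- 150   # mean under the null hypothesis
alpha <- 0.05  # significance level (95% confidence)

# Test statistic: distance of the sample mean from mu0 in standard errors.
standard_error <- sigma / sqrt(n)
z <- (x_bar - mu0) / standard_error

# Critical-value approach for a right-tailed test.
z_critical  <- qnorm(1 - alpha)
reject_null <- z > z_critical

# p-value approach: probability of a z at least this large if the null is true.
p_value <- pnorm(z, lower.tail = FALSE)
```

Both approaches lead to the same decision: the statistic is far beyond the critical value of about 1.645, and the p-value is far below alpha.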

Requirements

Basic school level mathematics will be helpful.
You will need to download and install R and R Studio on your PC or laptop. Both R and R Studio are free software.

Description

Perform simple or complex statistical calculations using R Programming! - You don't need to be a programmer for this :)

Learn statistics, and apply these concepts in your workplace using R.

The course will teach you the basic concepts related to Statistics and Data Analysis, and help you in applying these concepts. Various examples and data sets are used to explain the application.

I will explain the basic theory first, and then I will show you how to use R to perform these calculations.

The following areas of statistics are covered:

Descriptive Statistics - Mean, Mode, Median, Quartile, Range, Inter Quartile Range, Standard Deviation. (Using base R functions and the psych package)

Data Visualization - 3 commonly used charts: Histogram, Box and Whisker Plot and Scatter Plot (using base R commands)

Probability - Basic Concepts, Permutations, Combinations (Basic theory only)

Population and Sampling - Basic concepts (theory only)

Probability Distributions - Normal, Binomial and Poisson Distributions (Base R functions and the visualize package)

Hypothesis Testing - One Sample and Two Samples - z Test, t-Test, F Test, Chi-Square Test

ANOVA - Perform Analysis of Variance (ANOVA) step by step doing the manual calculation and by using R.

What are other students saying about this course?

This course is a perfect mix of theory and practice. I highly recommend it for those who want to not only get good with R, but to also become proficient in statistics. (5 stars by Aaron Verive)
You get both the “how” and “why” for both the statistics and R programming. I’m really happy with this course. (5 stars by Elizabeth Crook)
Sandeep has such a clear approach, pedagogic and explains everything he does. Perfect for a novice like myself. (5 stars by Hashim Al-Haboobi)
Very clear explanation. Coming from a non-technical background, it is immensely helpful that Prof. Sandeep Kumar is explaining all the minor details to prevent any scope for confusion. (5 stars by Ann Mary Biju)
I had a limited background in R and statistics going into this course. I feel like this gave me the perfect foundation to progress to more complex topics in both of those areas. I'm very happy I took this course. (5 stars by Thach Phan)
Dr. Kumar is a fantastic teacher who takes you step by step. Can't say enough about his approach. Detailed. Not only clear descriptions of statistics but you will learn many details that make R easier to use and understand. (5 stars by James Reynolds)
This is a wonderful course, I do recommend it. The best Udemy course I took. (5 stars by Joao Alberto Arantes Do Amaral)
The course exceeded my expectations and i would like to thank the instructor Mr Sandeep Kumar for creating such an amazing course. The best thing about this course is the Theory incorporated that helps you understand what you are going to code in R. I have really learnt a lot. If you a looking for the best course for R then look no further because this is the best there can be. (5 stars by Kipchumba Brian)

What are you waiting for?

This course comes with Udemy's 30-day money-back guarantee. If you are not satisfied with the course, get your money back.

I hope to see you in the course.

Who this course is for:

Anyone who wants to use statistics to make fact-based decisions.
Anyone who wants to learn R and R Studio for a career in data science.
Anyone who thinks Statistics is confusing and wants to learn it in plain and simple language.

Course content

1. Getting Started with R and R Studio • 9 lectures • 39min

2. Bonus Section: Descriptive Statistics Theory (lessons from my other course) • 6 lectures • 37min

3. Descriptive Statistics Using R • 7 lectures • 20min

4. Vectors, Factors, Lists, Matrix and Data Frames in R • 9 lectures • 1hr 3min

5. Data Visualization • 16 lectures • 1hr 33min

6. Descriptive Statistics Re-visited • 3 lectures • 13min

7. Bonus Section: Basic Probability Theory (lessons from my other course) • 6 lectures • 41min

8. Probability Distributions • 16 lectures • 2hr 11min

9. Inferential Statistics - Hypothesis Tests • 38 lectures • 5hr 8min

Bonus Section • 1 lecture • 1min