Statistics Free Essay Example from StudyTiger

Data

The information we gather from experiments and surveys.

Variables

Any characteristic that varies from one person or thing to another and is observed for the subjects in a study.

Population

The collection of all individuals or items of interest.

Sample

The part of the population from which the data is obtained.

Inferential statistics

Involves making decisions or predictions about a population based on information obtained from a sample of that population.

Inferential statistics

Then you use __________ to make predictions about the population.

Descriptive statistics

Involves gathering, organizing and summarizing data.

Parameter

A numerical summary of the population. ______ values are usually unknown.

Statistic

A numerical summary of the sample.

Simple random sampling

Randomness is crucial to ensuring that the sample is representative of the population so that powerful inferences can be made. Each subject in the population has the same chance of being included in the sample.
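A minimal Python sketch of drawing a simple random sample, assuming a made-up population of 1,000 student ID numbers (the population, the sample size of 50, and the seed are illustrative choices, not part of the definition):

```python
import random

# Hypothetical population: ID numbers for 1,000 students.
population = list(range(1, 1001))

# A fixed seed is used only so that this sketch is repeatable.
rng = random.Random(42)

# sample() draws without replacement, and every subject has the same
# chance of being included in the sample.
sample = rng.sample(population, k=50)

print(len(sample), "subjects drawn from", len(population))
```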

Quantitative

Numerical; measures how much of something. Ex) Age, IQ, GPA, height, weight.

Population Parameter

A numerical summary of the population. ______ values are usually unknown.

Sample Statistic

A numerical summary of the sample.

Categorical data

Non-numerical; some numeric data can also be classified as categorical (e.g. phone number, zip code, year born). Ex) Blood type, gender, major, dating status.

Quantitative Data

Numerical; Measures how much of something. Ex) Age, IQ, GPA, height, weight.

Discrete Data

Possible values form a set of separate numbers, such as 0, 1, 2, and so on. A _____ variable is usually a count. Ex) The number of pets in a household, the number of children in a family.

Continuous Data

Possible values form an interval of numbers, like [0,10]. __________ variables have infinitely many possible values. Ex) Time, height, weight, age.

Histogram

A _________ uses bars to portray the frequencies or the relative frequencies of the possible outcomes for a quantitative variable.
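A minimal Python sketch of the frequencies and relative frequencies that the bars portray, assuming a made-up list of pet counts for 8 households (the data and variable names are illustrative only):

```python
from collections import Counter

# Hypothetical data: number of pets in each of 8 households.
pets = [0, 1, 1, 2, 0, 1, 3, 1]

# Frequency: how many observations take each value.
frequencies = Counter(pets)

# Relative frequency: the share of all observations taking each value.
relative = {value: count / len(pets) for value, count in frequencies.items()}

# Crude text version of the bars, one row per possible outcome.
for value in sorted(frequencies):
    print(value, "#" * frequencies[value], relative[value])
```

The relative frequencies always add up to 1, which makes them easier to compare across data sets of different sizes.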

Pie Chart

A circle having a "slice of the pie" for each category. The size of a slice corresponds to the percentage of observations in the category. USED FOR CATEGORICAL VARIABLES.

Bar Graph

Displays a vertical bar for each category. The height of the bar is the percentage of observations in the category. IT IS EASIER TO COMPARE CATEGORIES WITH A ____ GRAPH.

Symmetric

The side of the distribution below a central value is a mirror image of the side above that central value.

Left skewed

left tail is longer than the right tail

Right skewed

right tail is longer than the left tail

Mean

The sum of the observations divided by the number of observations. Also called the "average". The sample mean is written x̄ ("x bar").

Median

The midpoint of the observations when they are ordered from smallest to largest; the point that splits the data in two, with half the data below it and half the data above it.
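A short Python sketch of the mean and the median, assuming a made-up list of five observations and the standard library's statistics module:

```python
from statistics import mean, median

observations = [2, 3, 5, 7, 8]  # made-up data

# Mean: sum of the observations divided by the number of observations.
x_bar = mean(observations)      # (2 + 3 + 5 + 7 + 8) / 5

# Median: the middle value once the observations are ordered.
mid = median(observations)

# With an even number of observations, the median is the average of
# the two middle values.
mid_even = median([2, 3, 5, 7])  # (3 + 5) / 2

print(x_bar, mid, mid_even)
```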

Outlier

an observation that falls well above or well below the overall bulk of the data; the mean can be highly influenced by an _____.

Resistant

A numerical summary of the observations is called _________ if extreme observations have little, if any, influence on its value. Median, IQR, 1st & 3rd quartiles. UNAFFECTED BY OUTLIERS
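To see what "resistant" means in practice, here is a small Python sketch with made-up numbers: one observation is replaced by an outlier, and the mean moves a lot while the median does not move at all.

```python
from statistics import mean, median

original = [10, 12, 13, 15, 16]       # made-up data
with_outlier = [10, 12, 13, 15, 100]  # largest value replaced by an outlier

# The mean is pulled toward the outlier (non-resistant).
mean_before = mean(original)
mean_after = mean(with_outlier)

# The median stays where it was (resistant).
median_before = median(original)
median_after = median(with_outlier)

print("mean:", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
```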

Symmetrical

mean = median

Right-skewed

mean is larger than the median

Left-skewed

mean is smaller than the median

median

if a distribution is very highly skewed, the ______ is usually preferred over the mean because it better represents what is typical.

mean

if the distribution is close to symmetric or only mildly skewed, the ______ is usually preferred because it uses the numerical values of all the observations.

mode

The value that occurs most frequently. There can be more than one mode. It is the highest bar in the histogram, and it is most often used with categorical data.
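A minimal Python sketch of finding the mode (or modes) of categorical data, assuming a made-up list of blood types; multimode from the standard library returns every value tied for most frequent:

```python
from statistics import multimode

# Hypothetical categorical data: blood types of six people.
blood_types = ["O", "A", "O", "B", "A", "AB"]

# "O" and "A" each occur twice, so there are two modes.
modes = multimode(blood_types)

print(modes)
```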

standard deviation

Gives a measure of variation by summarizing the deviations of each observation from the mean: it is the square root of an adjusted average of the squared deviations. Describes how far the data fall from the mean. It is the most important measure of spread. The symbol for the _____________ of a sample is 's'. (Sx: sample standard deviation; σx: population standard deviation)
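A hedged Python sketch of the sample and population standard deviations, assuming made-up data. The "adjusted average" for a sample divides the sum of squared deviations by n - 1, while the population version divides by n:

```python
from math import sqrt
from statistics import mean, pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

# Sum of the squared deviations of each observation from the mean.
x_bar = mean(data)
squared_deviations = sum((x - x_bar) ** 2 for x in data)

# Sample standard deviation s: divide by n - 1, then take the square root.
s_by_hand = sqrt(squared_deviations / (len(data) - 1))

# Population standard deviation: divide by n instead.
sigma_by_hand = sqrt(squared_deviations / len(data))

# The standard library gives the same two values.
s = stdev(data)
sigma = pstdev(data)

print(s, sigma)
```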

Range

Maximum observation minus minimum observation.

percentiles

The pth ______ is a value such that p percent of the observations fall below or at the value. Three useful _______ are the quartiles.
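One way to read this definition, sketched in Python with made-up test scores: for a given value, count the percent of observations that fall at or below it. The function name percent_at_or_below is hypothetical, chosen for this example.

```python
def percent_at_or_below(observations, value):
    """Return the percent of observations that fall at or below value."""
    count = sum(1 for x in observations if x <= value)
    return 100 * count / len(observations)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # made-up test scores

# 6 of the 10 scores are at or below 80, so 80 sits at the 60th percentile.
p_80 = percent_at_or_below(scores, 80)

print(p_80)
```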

Quartiles

______ split the distribution into four parts, each containing one quarter (25%) of the observations.

Interquartile range (IQR)

The distance between the third and first quartiles: IQR = Q3 - Q1. Gives the spread of the middle 50% of the data.
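A minimal Python sketch of the quartiles and the IQR, assuming made-up data and one common textbook convention: Q1 is the median of the lower half of the ordered data and Q3 is the median of the upper half. Other conventions exist and can give slightly different quartiles, so the helper below (a hypothetical name) is an illustration rather than the only method.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + (n % 2):]  # skip the middle value when n is odd
    return median(lower), median(ordered), median(upper)

data = [1, 3, 5, 7, 9, 11, 13, 15]  # made-up data

q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # spread of the middle 50% of the data

print(q1, q2, q3, iqr)
```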

z-score

Data can be standardized so that different data sets can be compared, or so that values within the same data set can be compared. The z-score of an observation is the number of standard deviations that it falls from the mean: z = (observation - mean) / standard deviation. MOST z-scores will fall between -3 and 3.
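A short Python sketch of the z-score formula, assuming a made-up distribution with mean 100 and standard deviation 15 (the function name z_score is hypothetical):

```python
def z_score(observation, mean, standard_deviation):
    """Number of standard deviations the observation falls from the mean."""
    return (observation - mean) / standard_deviation

# An observation of 130 is 2 standard deviations above the mean.
z_high = z_score(130, mean=100, standard_deviation=15)

# An observation of 85 is 1 standard deviation below the mean.
z_low = z_score(85, mean=100, standard_deviation=15)

print(z_high, z_low)
```

A positive z-score means the observation is above the mean; a negative z-score means it is below.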

Response variable

The dependent variable, the y-variable, the outcome variable. Ex) Blood alcohol level, when studied against beers consumed.

Explanatory variable

The independent variable, also known as the predictor variable; the x-variable. Ex) Amount of study time, when used to explain grade on test.

upper & lower limits

An observation is a potential outlier if it falls more than 1.5 x IQR below the first quartile or more than 1.5 x IQR above the third quartile.
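The 1.5 x IQR rule can be sketched in Python as follows, assuming made-up data whose quartiles (Q1 = 4, Q3 = 12) are taken as already known; the function names are hypothetical:

```python
def outlier_limits(q1, q3):
    """Return the (lower, upper) limits from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def potential_outliers(observations, q1, q3):
    """Return the observations that fall outside the limits."""
    lower, upper = outlier_limits(q1, q3)
    return [x for x in observations if x < lower or x > upper]

data = [1, 3, 5, 7, 9, 11, 13, 40]  # made-up data

# IQR = 12 - 4 = 8, so the limits are 4 - 12 = -8 and 12 + 12 = 24.
lower, upper = outlier_limits(q1=4, q3=12)
flagged = potential_outliers(data, q1=4, q3=12)

print(lower, upper, flagged)
```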

non-resistant

Mean (average), range, standard deviation (SD), correlation. Of the measures of spread, the range and SD are non-resistant.

linear correlation

The quantity r, called the ____________ coefficient, measures the strength and the direction of a linear relationship between two variables. Sometimes referred to as the Pearson product-moment correlation coefficient. Takes a value between -1 and 1.
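A hedged Python sketch of computing r by hand from its usual formula, assuming made-up paired data (study hours and test grades are illustrative, and pearson_r is a hypothetical helper name):

```python
from math import sqrt

def pearson_r(xs, ys):
    """Correlation between two equal-length lists of numbers."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]          # made-up explanatory variable
grades_up = [2, 4, 6, 8, 10]     # rises exactly in step with hours
grades_down = [10, 8, 6, 4, 2]   # falls exactly in step with hours

r_up = pearson_r(hours, grades_up)      # perfect positive linear relationship
r_down = pearson_r(hours, grades_down)  # perfect negative linear relationship

print(r_up, r_down)
```

The sign of r gives the direction of the relationship, and values closer to -1 or 1 indicate a stronger linear relationship.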

regression equation
