PROBABILITY AND STATISTICS

Academic year and teacher
Academic year
2017/2018
Teacher
ALESSIA ASCANELLI
Credits
9
Didactic period
Single annual course
SSD
MAT/06

Training objectives

This is a course of 72 hours (9 credits); it takes place partly in the first didactic period (48 hours of probability) and partly in the second didactic period (24 hours of statistics).

The first goal of the course is to familiarize students with the mathematics of random events, especially through problems, so that they can reduce a real problem to a mathematical model and then solve it on their own using the theorems studied.

The second goal of the course, second only in chronological order and not in importance, is to introduce the first elements of descriptive and inferential statistics. The joint use of techniques from the two disciplines, probability and statistics, is widely applied to interpret and solve problems arising in many fields: economics, biology, medicine, pharmacology, quality control in production processes, demography.

The main knowledge acquired will be:
- knowledge of probability measures and their main features
- knowledge of the rules of combinatorics
- knowledge of the concept of conditional probability
- knowledge of the concept of random variable, density, distribution function, mean and variance
- knowledge of the most common random variables, both discrete and absolutely continuous
- knowledge of Bernoulli processes and Poisson processes
- knowledge of the main limit theorems: laws of large numbers and central limit theorem
- knowledge of the main techniques of descriptive statistics for a graphic representation of a set of data
- knowledge of the main techniques of descriptive statistics to compute indices of position and variability
- knowledge of the concepts of sampling distribution of a statistic, estimator, efficiency and unbiasedness
- knowledge of the main techniques of inferential statistics: linear regression and statistical hypothesis tests.

The main skills acquired will be the ability to model real problems in the language of probability and statistics, to represent and summarize sample information, and to apply inferential techniques to support decisions.

Specifically, students will:
- be able to define a probability measure and apply its properties
- be able to work with conditional probability, in particular with the law of total probability (law of alternatives) and Bayes' formula
- know how to count the elements of a finite set by the techniques of combinatorics
- recognize random variables and know how to operate with them: compute density and distribution (both joint and marginal), mean and variance
- know how to deal with a Bernoulli process and make predictions about the process
- know how to deal with a Poisson process and make predictions about the process
- know how to apply the limit theorems in various contexts
- be able to summarize sample information through graphs and indices and to interpret the results
- be able to apply the method of linear regression to a set of observed data
- know how to test a hypothesis

Prerequisites

Two year-long courses in mathematical analysis.

Course programme

The program of the course is the following:

- Introduction (2 hours).
Historical notes, the problem of points (division of the stakes), classical definition, frequentist definition, subjective definition, axiomatic definition.

- Axioms of probability and combinatorics (8 hours).
Definition of σ-algebra, probability measure, probability space, and their properties. Equiprobable sample space. Combinatorics: basic counting rule, arrangements (k-permutations), permutations, combinations.

- Conditional probability and independence (6 hours).
Conditional probability. Law of total probability (law of alternatives), graphs, Bayes' theorem. The Monty Hall and Monty Hell paradoxes. Independent events.

- Discrete random variables (8 hours).
Definition of real-valued random variable: discrete, finite, countable, uncountable; distribution function and its properties. Discrete probability density; the most common discrete densities: binomial, hypergeometric, geometric, negative binomial, Poisson.
Joint distributions and joint density of random variables, marginal densities. Independent, dependent, and identically distributed random variables. Expected value of a discrete random variable, properties, conditional expectation. Variance and standard deviation, properties, moments. Chebyshev's inequality.

- Stochastic processes (8 hours).
Introduction to stochastic processes. Bernoulli process: definition, a success occurs sooner or later (almost surely), average number of trials until the first success and until k consecutive successes, weak law of large numbers for the binomial density. The gambler's ruin. Poisson random variable and Poisson process: coincidences, the Poisson density as the limit of the binomial density, the Poisson paradigm, the Poisson process, the number of events occurring in a time interval.

- Random variables with density (8 hours).
Definition of random variable with density, comparison between discrete and continuous variables, mean, variance and properties. Standardization. The most common densities: uniform, exponential, gamma, normal (Gaussian). Use of the tables for the normal distribution and properties. Waiting time until n events occur, the time between two successive events in a Poisson process. Joint distribution and density, marginal densities. Independence. The De Moivre-Laplace theorem. Continuity correction.

- Limit theorems (6 hours).
Moment generating function, properties. The central limit theorem of Laplace, convergence in distribution. Weak law of large numbers, strong law of large numbers, almost sure convergence, comparison between the two laws.

- Descriptive statistics (6 hours).
Graphical representation of data: histograms, frequency polygons, pie charts, bar charts, box plots. Calculation of summary measures of position (mean, median, mode, quartiles), variability (range, interquartile range, variance, standard deviation, coefficient of variation) and shape (symmetry).

- Sampling distributions (2 hours).
Sampling distribution of a statistic, estimators and estimates, efficiency and unbiasedness.

- Statistical tests (10 hours).
General principles. Confidence intervals. One-sample and two-sample Z tests and t tests, one-tailed and two-tailed tests, p-value.

- Linear regression (6 hours).
The regression line and the scatter diagram. The linear correlation coefficient. The regression model and parameter estimation. The coefficient of determination. Residual analysis. Tests on the parameters and on the goodness of fit of the model. Multiple linear regression.

Didactic methods

The course consists of both lectures and exercise sessions. Usually, after a few hours of lectures introducing new concepts or important theorems, a couple of hours of exercises on the same topic follow. At the beginning of every chapter, students receive an exercise sheet containing the text of many exercises. They have time to look through the sheet on their own and to try to solve some of the exercises. In class, the most important exercises are solved, together with those explicitly requested by the students.

Learning assessment procedures

The exam consists of two parts:

- a 3-hour written examination on probability, in which the student is asked to solve some exercises (usually 3 or 4) and to discuss one or two theoretical topics in probability.

- a one-hour written examination on statistics, in which students are asked to solve descriptive and inferential statistics exercises in order to answer multiple-choice questions.

The student receives a mark p between 0 and 31 for the probability examination and a mark s between 0 and 20 for the statistics examination. The student passes the exam if the probability mark is at least 16/30, the statistics mark is at least 10/20, and the quantity 2p/3+s/2+1 is greater than or equal to 18.

The final mark depends on both examinations and is given by 2p/3+s/2+1, rounded to the nearest integer.
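
The pass conditions and the final-mark formula can be sketched as a short calculation. This is only an illustration, not an official grading tool: the function names are hypothetical, and because the text does not say how a value ending in .5 is rounded, the sketch assumes such ties are rounded up.

```python
from fractions import Fraction
from math import floor


def raw_score(p, s):
    """Weighted score 2p/3 + s/2 + 1, computed exactly with fractions."""
    return Fraction(p) * 2 / 3 + Fraction(s) / 2 + 1


def passes(p, s):
    """Pass conditions: p >= 16, s >= 10 and weighted score >= 18."""
    return p >= 16 and s >= 10 and raw_score(p, s) >= 18


def final_mark(p, s):
    """Final mark rounded to the nearest integer, or None if the exam is failed.

    Assumption (not stated in the syllabus): ties such as 26.5 round up.
    """
    if not passes(p, s):
        return None
    return floor(raw_score(p, s) + Fraction(1, 2))


print(final_mark(24, 14))  # 16 + 7 + 1 = 24
print(final_mark(16, 10))  # both minimums met, but 32/3 + 5 + 1 < 18: None
```

The second call shows that meeting both minimum marks is not sufficient on its own: with p = 16 and s = 10 the weighted score is about 16.67, below the threshold of 18.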

The student can take the probability test, the statistics test, or both several times; however, handing in a new paper means giving up the grade obtained on the previously submitted paper.

Reference texts

Probability: lecture notes and exercises are drawn mainly from the following books (in alphabetical order):
- Baldi, Paolo: Calcolo delle probabilità, 2nd edition, McGraw Hill, 2011
- Caravenna, Francesco, Dai Pra, Paolo: Probabilità. Un'introduzione attraverso modelli e applicazioni, Springer, 2013
- Dall'Aglio, Giorgio: Calcolo delle probabilità, 3rd edition, Zanichelli, 2003
- Ross, Sheldon M.: Calcolo delle probabilità, 2nd edition, Apogeo, 2007

Statistics: lecture notes and exercises are drawn mainly from the following books (in alphabetical order):
- Bonnini, Stefano, Grassi, Angela: Esercizi svolti di Statistica e Calcolo delle Probabilità, Voltalacarta, 2015
- Levine, David M., Krehbiel, Timothy C., Berenson, Mark L.: Statistica, 5th edition, Pearson, 2010