Data Sets

We need a way to organize statistical data.

Definition: Data Set

A data set is a [[Multisets|multiset]].

Each element of a data set represents a value that was measured for a [[TODO|random variable]], and the element's multiplicity is the number of times that value was measured.
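
As a minimal sketch of this definition, a data set can be modelled in Python with `collections.Counter`, which maps each distinct value to its multiplicity. The measurements and the names `measurements` and `data_set` are made up for illustration.

```python
from collections import Counter

# Hypothetical measurements of some random variable (illustrative values only).
measurements = [2, 3, 3, 5, 3, 2]

# A Counter stores each distinct value together with its multiplicity,
# so it behaves like a multiset.
data_set = Counter(measurements)

print(data_set[3])  # multiplicity of the value 3
print(data_set[4])  # a value that was never measured has multiplicity 0
```

Storing multiplicities instead of the raw list keeps repeated measurements compact while losing only the order in which they were taken.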

Definition: (Absolute) Frequency

The (absolute) frequency of the \(i\)-th data point in a [[Data Sets|data set]] is its [[Multisets|multiplicity]].

Notation

\[ f_i \]
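
The following sketch shows one way to read off the absolute frequencies \(f_i\) in Python. It assumes the distinct values are indexed in sorted order; that ordering, the sample data, and all variable names are choices made for this example only.

```python
from collections import Counter

# Hypothetical measurements (illustrative values only).
measurements = ["a", "b", "a", "c", "a", "b"]
data_set = Counter(measurements)

# Fix an order on the distinct values so that "the i-th data point" is well defined.
# Sorted order is an assumption of this sketch.
values = sorted(data_set)

# f_i is the multiplicity of the i-th distinct value.
absolute_frequencies = [data_set[v] for v in values]

for i, (value, f_i) in enumerate(zip(values, absolute_frequencies), start=1):
    print(f"f_{i} = {f_i} (value {value!r})")
```

The absolute frequencies always add up to the number of measurements, which is 6 here.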

Definition: Relative Frequency

The relative frequency of the \(i\)-th data point in a [[Data Sets|data set]] \(S\) is the ratio of its [[Data Sets|absolute frequency]] to the [[Multisets|cardinality]] of the data set.

\[ \tilde{f}_i \overset{\text{def}}{=} \frac{f_i}{|S|} \]
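
As an illustration of the formula, the sketch below computes relative frequencies for a small made-up data set. The cardinality \(|S|\) counts elements with their multiplicities, and `fractions.Fraction` is used so the ratios stay exact. The data and variable names are assumptions of this example.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical measurements (illustrative values only).
measurements = [1, 1, 2, 4, 4, 4, 4, 6]
data_set = Counter(measurements)

# |S|: the cardinality of a multiset is the sum of all multiplicities.
cardinality = sum(data_set.values())

# Relative frequency = absolute frequency / cardinality.
relative_frequencies = {
    value: Fraction(f_i, cardinality) for value, f_i in data_set.items()
}

for value, rel in sorted(relative_frequencies.items()):
    print(f"value {value}: {rel}")
```

Because every absolute frequency is divided by the same cardinality, the relative frequencies of a non-empty data set sum to 1.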