Unit#6 Arrays - Elementary Statistics
To find the maximum, we designate the first element of the data as the temporary maximum and then sweep through the remaining elements one by one, comparing each entry to the temporary maximum. If the current entry is greater than the temporary maximum, we replace the temporary maximum with this entry; otherwise we move on. When the sweep ends, the temporary maximum is the actual maximum. For the minimum we follow exactly the same procedure, except that we replace the temporary minimum with the current entry only if the entry is less than the temporary minimum.

For the arithmetic mean we start with sum = 0, accumulate the sum of all the data using sum = sum + entry, and at the end divide the sum by the number of entries.

Another elementary statistical characteristic of a given set of data is the standard deviation. Look at the two sets of data given below.
Both have arithmetic mean 65.86, but they differ in character. In the first set all the numbers are close to 65 or 66, whereas in the second set the maximum is 90 and the minimum is 40. How can we quantify the difference between the two sets? We can calculate the standard deviation using the formula:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}$$

Here x with a bar ($\bar{x}$) denotes the arithmetic mean (65.86 for the present examples), the $x_i$ are the data entries, and N is the number of data entries (7 in these examples). Calculating the standard deviation with a calculator produces 2.23 for the first set but 17.65 for the second set.

Using arrays makes the calculation of the standard deviation very simple. Here is an example program which performs such a calculation:

/* An example on the use of arrays:
   Calculating the standard deviation of a list of scores in a data file.

   file: 6ex4.cpp
   FALL 1998
   ___________________________________
   Jacob Y. Kazakia   jyk0

   Purpose: This program reads a column of 20 float numbers from a file
            named 6ex4data.txt into the array a[20]. It outputs these
            numbers to a file named 6ex4rep.txt. It then outputs to the
            same file the average score and the standard deviation.

   Algorithm: The average score avg is obtained by calculating the sum of
              the twenty scores and then dividing by 20. For the standard
              deviation we first calculate the sum of the squares of
              (score - average score), i.e.
                  sumsq = sum (from m=0 to m=19) of { (a[m] - avg)^2 },
              and then we calculate the standard deviation stdev by:
                  stdev = sqrt( sumsq / 20 )
*/

#include <iostream>
#include <iomanip>
#include <fstream>
#include <cmath>
using namespace std;

int main()
{
    // declare the variables of the main function
    float a[20];         // This is an array of twenty entries named as:
                         // a[0], a[1], a[2], ......., a[18], a[19].
    int m;
    float sum = 0.0;
    float avg = 0.0;     // the average score
    float sumsq = 0.0;   // the sum of the squared deviations from avg
    float stdev = 0.0;   // the standard deviation

    ifstream FinalExam("6ex4data.txt", ios::in);
    ofstream report("6ex4rep.txt", ios::out);

    // stop early if either file could not be opened
    if (!FinalExam || !report)
    {
        cerr << "Could not open 6ex4data.txt or 6ex4rep.txt" << endl;
        return 1;
    }

    // read the scores, echo them to the report and accumulate their sum
    for (m = 0; m <= 19; m++)
    {
        FinalExam >> a[m];
        report << setiosflags(ios::fixed) << setprecision(1);
        report << " a( " << setw(2) << m << " ) = " << setw(5) << a[m];
        sum = sum + a[m];
        report << endl;
    }

    // calculation of the average
    avg = sum / 20;
    report << setprecision(4);
    report << "\n\n the average score is: " << avg << endl << endl;

    // calculation of the standard deviation
    for (m = 0; m <= 19; m++)
        sumsq = sumsq + (a[m] - avg) * (a[m] - avg);

    stdev = sqrt(sumsq / 20);
    report << " \n\n The standard deviation is: " << stdev << endl << endl;

    cout << " \n\n DONE ! The output is in the file 6ex4rep.txt \n\n";
    cout << " \n\n enter e (exit) to terminate the program....";
    char hold;
    cin >> hold;

    return 0;
}

/* THIS IS THE FILE 6ex4rep.txt :

 a(  0 ) =  78.5
 a(  1 ) =  67.3
 a(  2 ) =  90.9
 a(  3 ) =  89.6
 a(  4 ) =  23.4
 a(  5 ) =   0.0
 a(  6 ) =  67.5
 a(  7 ) =  89.6
 a(  8 ) =  79.5
 a(  9 ) =  77.3
 a( 10 ) =  94.9
 a( 11 ) =  89.6
 a( 12 ) =  45.4
 a( 13 ) =  10.0
 a( 14 ) =  67.5
 a( 15 ) =  89.6
 a( 16 ) =  78.5
 a( 17 ) =  67.3
 a( 18 ) =  90.9
 a( 19 ) =  89.6

 the average score is: 69.3450

 The standard deviation is: 27.3840
*/
Jacob Y. Kazakia © 2001 All rights reserved