Friday, April 01, 2005

I have a data mining assignment to do as I upgrade my statistics education. The assignment is to filter spam. I have been given 4601 observations, each one an email. There are 57 variables based on word frequency, character frequency, and capital-letter frequency and length. The 58th variable is 1 or 0 depending on whether the email is spam or not, so this is supervised learning. I was able to open the data set in both R and SAS. In SAS I ran a proc means and a proc chart, which gave me 58 histograms, one per variable. I am printing the histograms four to a page, double-sided, along with the results of the proc means.
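For anyone following along without SAS, the same kind of first look (per-variable summary statistics plus a histogram) can be roughly sketched in plain Python. This is only an illustration under stated assumptions: the four rows and the column names below are invented for the example and are not taken from the assignment data, and `summarize` and `text_histogram` are hypothetical helper names, not anything from R or SAS.

```python
import statistics

# Hypothetical toy rows: two made-up frequency variables plus the 0/1 spam label.
# The real data set has 4601 rows and 57 predictors; these values are invented.
rows = [
    (0.00, 1.2, 0),
    (0.64, 3.5, 1),
    (0.21, 2.0, 0),
    (0.50, 4.1, 1),
]
names = ["word_freq", "capital_length", "is_spam"]  # assumed column names


def summarize(rows, names):
    """Return {name: (n, mean, std, min, max)} per column, like a basic proc means."""
    summary = {}
    for i, name in enumerate(names):
        col = [r[i] for r in rows]
        summary[name] = (
            len(col),
            statistics.mean(col),
            statistics.stdev(col),  # sample standard deviation
            min(col),
            max(col),
        )
    return summary


def text_histogram(values, bins=4):
    """Count values into equal-width bins between the minimum and maximum."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all values are equal
    counts = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[idx] += 1
    return counts


summary = summarize(rows, names)
for name, stats in summary.items():
    print(name, stats)

hist = text_histogram([r[0] for r in rows])
print(hist)
```

The mean of the 0/1 label column is simply the share of emails marked as spam, which is a quick check on how balanced the two classes are before fitting anything.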

I am still reading Tomasina Borkman on self-help groups and experiential knowledge, and I read the chapter on professionals overnight. I have also started a book on African telecommunication development. I read more about educational reform in the USA in the 1990s from a book on urban ecosystems and teaching urban ecosystems. Last night I read a little more about a feminist Supreme Court justice.
