## Software Evolution

**Introduction**

Software is part of nearly every activity today: it runs in our homes, cars, mobile phones, and more. One constant is that software invariably evolves. In this post we discuss some aspects of software evolution.

**Dimensions of Software Evolution**

According to Mens and Demeyer (2008), software evolution can be classified into the following types of maintenance:

**Perfective maintenance** is any modification of a software product after delivery to improve performance or maintainability;

**Corrective maintenance** is the reactive modification of a software product performed after delivery to correct discovered faults;

**Adaptive maintenance** is the modification of a software product performed after delivery to keep a computer program usable in a changed or changing environment;

**Preventive maintenance** refers to software modifications performed for the purpose of preventing problems before they occur.
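As a rough illustration, the four categories above can be modeled as an enumeration and used to tag change descriptions. This is only a sketch: the `MaintenanceType` enum, the `classify` function, and the keyword table are hypothetical names and heuristics invented for this example. They do not come from Mens and Demeyer (2008), and a real classification would need human judgment.

```python
from enum import Enum


class MaintenanceType(Enum):
    """The four maintenance types listed above."""
    PERFECTIVE = "improve performance or maintainability"
    CORRECTIVE = "correct discovered faults"
    ADAPTIVE = "keep the program usable in a changed environment"
    PREVENTIVE = "prevent problems before they occur"


# Hypothetical keyword table (an assumption made for illustration only).
KEYWORDS = {
    MaintenanceType.CORRECTIVE: ("fix", "bug", "crash"),
    MaintenanceType.ADAPTIVE: ("port", "upgrade", "migrate"),
    MaintenanceType.PREVENTIVE: ("harden", "guard", "deprecat"),
    MaintenanceType.PERFECTIVE: ("refactor", "optimize", "speed"),
}


def classify(message):
    """Return the first maintenance type whose keywords appear in the
    change description, or None when no keyword matches."""
    text = message.lower()
    for kind, words in KEYWORDS.items():
        if any(word in text for word in words):
            return kind
    return None


if __name__ == "__main__":
    for msg in ("Fix crash on startup", "Migrate to a new database driver"):
        print(msg, "->", classify(msg))
```

Keyword matching is deliberately naive here; the point is only that each change falls into one of the four categories.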

## Levels of Evolution

Mens and Demeyer (2008) describe a non-exhaustive list of levels at which software evolves.

**Requirements evolution**: Changes to accommodate the needs of the stakeholders. Requirements often change because, midway through development, the client may not yet have a clear idea of what she wants.

**Architecture evolution**: The architecture defines the initial model of the system in conjunction with the software requirements. These artifacts are also subject to change.

**Data evolution**: The data schema of the software can also evolve.

**Runtime evolution**: Many systems must run continuously, so they need to evolve while they are in use.

**Language evolution**: Companies may decide to adopt a new language or software paradigm.

**References**

Tom Mens and Serge Demeyer. 2008. *Software Evolution* (1 ed.). Springer Publishing Company, Incorporated.

## Datasets

# Introduction

This page lists datasets available for research.

# Datasets

## Data Science

- Colorado datasets: http://data.opencolorado.org/
- Datasets for Data Mining and Data Science [KDnuggets]: http://www.kdnuggets.com/datasets/index.html
- Finding Data on the Internet [Inside R]: http://www.inside-r.org/howto/finding-data-internet

## Computer Vision

The CVonline: Image Databases page contains a list of datasets that may interest you. The list includes:

- Action Databases
- Biological/Medical
- Face Databases
- Fingerprints
- General Images
- Gesture Databases
- Image, Video and Shape Database Retrieval
- Object Databases
- People, Pedestrian, Eye/Iris, Template Detection/Tracking Databases
- Segmentation
- Surveillance
- Textures
- General Videos
- Other Collection Pages
- Miscellaneous Topics

## R Resources

# Introduction

This page lists resources for solving problems with the R statistical language.

## One Page R: A Survival Guide to Data Science with R

“*A collection of useful one-page resources for a data miner, data scientist, and/or a decision scientist. The modules include code, lectures, and one-page recipes for getting things done.*” (Kdnuggets, 2014).

Link: http://www.kdnuggets.com/2014/02/one-page-r-survival-guide-data-science-with-r.html

# References

Kdnuggets. http://www.kdnuggets.com/2014/02/one-page-r-survival-guide-data-science-with-r.html. Available 2014.

## Courses

# Introduction

This page lists courses that you can follow online.

## Learning From Data

Introductory Machine Learning course covering theory, algorithms and applications. Our focus is on real understanding, not just “knowing.” (Edx, 2014).

Link: http://goo.gl/J5BIaF

## Data Analysis

The course has ended, but you can still access the course videos on YouTube, tagged by week.

Link: http://goo.gl/aIDmUS

## Introduction to Data Science

Introduction to Data Science is a class at Columbia University in the Department of Statistics. The course was designed and taught by Dr. Rachel Schutt in the Fall of 2012 (DataScience, 2014).

Link: http://columbiadatascience.com/

## Machine Learning Carnegie Mellon University

“This course covers the theory and practical algorithms for machine learning from a variety of perspectives. We cover topics such as Bayesian networks, decision tree learning, Support Vector Machines, statistical learning methods, unsupervised learning and reinforcement learning. The course covers theoretical concepts such as inductive bias, the PAC learning framework, Bayesian learning methods, margin-based learning, and Occam’s Razor. Short programming assignments include hands-on experiments with various learning algorithms, and a larger course project gives students a chance to dig into an area of their choice. This course is designed to give a graduate-level student a thorough grounding in the methodologies, technologies, mathematics and algorithms currently needed by people who do research in machine learning” (MLCarnegie, 2011).

Link: http://www.cs.cmu.edu/~tom/10701_sp11/

## CS109 Data Science

The course page describes the course as: “Learning from data in order to gain useful predictions and insights. This course introduces methods for five key facets of an investigation: data wrangling, cleaning, and sampling to get a suitable data set; data management to be able to access big data quickly and reliably; exploratory data analysis to generate hypotheses and intuition; prediction based on statistical methods such as regression and classification; and communication of results through visualization, stories, and interpretable summaries.” (DSHarvard, 2014).

Link: http://www.cs109.org/

# References

Edx. http://edx.org. Available 2014.

DataScience. http://columbiadatascience.com/2013/09/16/introduction-to-data-science-version-2-0/. Available 2014.

MLCarnegie. http://www.cs.cmu.edu/~tom/10701_sp11/. Available 2014.

DSHarvard. http://www.cs109.org/. Available 2014.

## Machine Learning Books

# Introduction

In this post we list books that can be useful to students of machine learning.

# Books

## A Course in Machine Learning

A Course in Machine Learning is a set of introductory materials that covers most major aspects of modern machine learning (supervised learning, unsupervised learning, large margin methods, probabilistic modeling, learning theory, etc.). Its focus is on broad applications with a rigorous backbone (Hal Daumé III, 2012).

Link: http://ciml.info/

## The Elements of Statistical Learning: Data Mining, Inference, and Prediction

[…] With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. […] This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics (Hastie et al., 2001).

Link: http://statweb.stanford.edu/~tibs/ElemStatLearn/

## Data Mining and Analysis: Fundamental Concepts and Algorithms

This book is an outgrowth of data mining courses at RPI and UFMG; the RPI course has been offered every Fall since 1998, whereas the UFMG course has been offered since 2002. While there are several good books on data mining and related topics, we felt that many of them are either too high-level or too advanced. Our goal was to write an introductory text which focuses on the fundamental algorithms in data mining and analysis (Mohammed Zaki and Wagner Meira Jr, 2014).

Link: http://www2.dcc.ufmg.br/livros/miningalgorithms/DokuWiki/doku.php

## An Introduction to Data Science

Introduction to Data Science, by Jeffrey Stanton, provides non-technical readers with a gentle introduction to essential concepts and activities of data science. For more technical readers, the book provides explanations and code for a range of interesting applications using the open source R language for statistical computing and graphics (Jeffrey M. Stanton, 2014).

Link: http://blog.revolutionanalytics.com/2013/02/free-e-book-on-data-science-with-r.html

## Advanced Data Analysis from an Elementary Point of View

These are the notes for 36-402, Advanced Data Analysis, at Carnegie Mellon. The class presumes a firm grasp on linear algebra and multivariable calculus, and that you can read and write simple functions in R (Cosma R. Shalizi, 2012).

Link: http://www.stat.cmu.edu/~cshalizi/ADAfaEPoV/


## Bayesian Reasoning and Machine Learning

This hands-on text opens these opportunities to computer science students with modest mathematical backgrounds. It is designed for final-year undergraduates and master’s students with limited background in linear algebra and calculus. Comprehensive and coherent, it develops everything from basic reasoning to advanced techniques within the framework of graphical models (David Barber, 2012).

Link: http://web4.cs.ucl.ac.uk/staff/D.Barber/pmwiki/pmwiki.php?n=Brml.HomePage


## Applied Data Science

The purpose of this course is to take people with strong mathematical/statistical knowledge and teach them software development fundamentals (Ian Langmore and Daniel Krasner, 2014).

Link: http://goo.gl/gZXSLK

## A First Encounter with Machine Learning

“Much of machine learning is build upon concepts from mathematics such as partial derivatives, eigenvalue decompositions, multivariate probability densities and so on. I quickly found that these concepts could not be taken for granted at an undergraduate level. The situation was aggravated by the lack of a suitable textbook. Excellent textbooks do exist for this field, but I found all of them to be too technical for a first encounter with machine learning. This experience led me to believe there was a genuine need for a simple, intuitive introduction into the concepts of machine learning. A first read to wet the appetite so to speak, a prelude to the more technical and advanced textbooks” (Welling, 2011).

Link: http://goo.gl/khJ0Q

# References

Hastie, T.; Tibshirani, R. & Friedman, J. (2001). The Elements of Statistical Learning. Springer New York Inc., New York, NY, USA.

Hal Daumé III. (2012). A Course in Machine Learning.

Mohammed J. Zaki and Wagner Meira. (28 February 2014). Data Mining and Analysis: Fundamental Concepts and Algorithms.

Jeffrey M. Stanton. Introduction to Data Science: Using the R Language for Statistical Computing and Graphics.

Cosma R. Shalizi. (2012). Advanced Data Analysis from an Elementary Point of View.

David Barber. (2012). *Bayesian Reasoning and Machine Learning*. Cambridge University Press, New York, NY, USA.

Ian Langmore and Daniel Krasner. (Accessed 2014). Applied Data Science. Columbia University.

Max Welling. (2011). A First Encounter with Machine Learning.

## Stanford Scientists put Free Text-Analysis Tool on the Web.

## CS281: Advanced Machine Learning

http://www.seas.harvard.edu/courses/cs281/

This is the most complete Machine Learning course material that I have ever seen.

## Stanford Algorithm Analyzes Sentence Sentiment, Advances Machine Learning

Link: http://goo.gl/jrZu1f

## David Mease Videos Statistical Aspects of Data Mining

Many researchers point to David Mease's videos as a good source for learning statistics applied to data mining using R. If you want to try them out, take a look at his playlist on YouTube.

## Deep Learning

Deep learning is attracting attention from the machine learning community. Two sets of course material are selected here for learning the model. The first, **Deep Learning and Unsupervised Feature Learning**, gives a close look at the model. The second, Unsupervised Feature Learning and Deep Learning, is still under construction, but its first lectures are posted.