A blog about software and making.

Feature Extraction

Notes from a talk on turning raw data into features for learning and modelling, followed by a short presentation on linear modelling.

  • Add new features that describe how the data changes over time (e.g. position/velocity/acceleration/jerk).
  • Figure out which variables are important so you can reduce the number of dimensions. Fewer dimensions can reduce error.
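  • PCA (principal component analysis) - Find new axes (components) for the data, each pointing in the direction of greatest remaining variance.
  • Choose how many components to keep based on proportion of variance (How much of the total variance does this component account for?)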
  • PCA may make things worse! There may be too many relationships between variables and we don’t want to lose any.
  • Over-fitting - Fitting too closely to the local data you happen to have, including its noise. The model becomes fitted to the data you are seeing instead of the relationships between variables, so it does poorly on new data.
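  • The log function can be used to separate data: taking the log spreads out values that are bunched together near zero and pulls in very large ones.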
  • Linear modelling - Fit a line that minimizes the error between the line and the data. Works best if the errors are normally distributed (most errors are close to zero).
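
To make a few of these notes concrete, here is a rough sketch in plain Python. It is not code from the talk: the function names (differences, variance_proportions_2d, fit_line) and the sample numbers are made up for illustration, the PCA part only handles two variables, and the line fit is ordinary least squares, which is one common way to "minimize the error".

```python
import math


def differences(values, dt=1.0):
    """Rate of change between consecutive samples (finite differences)."""
    return [(b - a) / dt for a, b in zip(values, values[1:])]


def variance_proportions_2d(xs, ys):
    """Share of total variance captured by each principal component.

    Only handles two variables, and assumes the data is not all one point.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Eigenvalues of the covariance matrix [[sxx, sxy], [sxy, syy]]
    # are the variances along the two principal components.
    mean = (sxx + syy) / 2
    spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    eigenvalues = [mean + spread, mean - spread]
    total = sxx + syy
    return [ev / total for ev in eigenvalues]


def fit_line(xs, ys):
    """Least-squares slope and intercept (assumes xs are not all equal)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


# Made-up position samples, one per time step.
position = [0.0, 1.0, 4.0, 9.0, 16.0]
velocity = differences(position)       # [1.0, 3.0, 5.0, 7.0]
acceleration = differences(velocity)   # [2.0, 2.0, 2.0]
jerk = differences(acceleration)       # [0.0, 0.0]

# Two perfectly correlated variables: one component explains almost everything.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
print(variance_proportions_2d(xs, ys))
print(fit_line(xs, ys))

# Log transform: values spanning orders of magnitude become evenly spaced.
print([math.log10(v) for v in [1, 10, 100, 1000]])
```

When the first proportion is close to 1, as with the sample data here, the second component could probably be dropped without losing much. With real data, each extra derivative shortens the series by one sample and amplifies noise, so jerk is usually much rougher than velocity.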

Meetup Event