A blog about software and making.

Ensemble Learning Basics

An interesting talk on how ensemble learning was used in the Netflix Prize contest, plus a short presentation on statistical bootstrapping.

  • The basic idea is to use multiple models, each fitted to the part of the data it works best with.
  • Usually, we compare models and take the best one, but what if we instead combine different models and take the best characteristics of each?
  • Cross-validation - using a held-out portion of the sample data to check for over-fitting.
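
To make the notes above concrete, here is a minimal sketch in plain Python. It is not what the talk showed: the two toy models (`fit_mean`, `fit_line`), the averaging ensemble, and the sample data are all made up for illustration. It combines models by averaging their predictions and uses k-fold cross-validation to score each one on data it was not trained on.

```python
from statistics import mean

def fit_mean(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Least-squares straight line through the training points."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def fit_ensemble(xs, ys, fitters=(fit_mean, fit_line)):
    """Combine models by averaging their predictions."""
    models = [fit(xs, ys) for fit in fitters]
    return lambda x: mean(model(x) for model in models)

def cross_val_mse(fit, xs, ys, k=4):
    """k-fold cross-validation: train on k-1 folds, score on the held-out fold."""
    errors = []
    pairs = list(zip(xs, ys))
    for fold in range(k):
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        test = [p for i, p in enumerate(pairs) if i % k == fold]
        model = fit([x for x, _ in train], [y for _, y in train])
        errors.extend((model(x) - y) ** 2 for x, y in test)
    return mean(errors)

# Made-up, roughly linear sample data.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.2, 5.8, 7.1]

for name, fit in [("mean", fit_mean), ("line", fit_line), ("ensemble", fit_ensemble)]:
    print(name, round(cross_val_mse(fit, xs, ys), 3))
```

Averaging with a weak baseline will not beat the better single model on data like this. Ensembles tend to pay off when the combined models make different kinds of errors, and the held-out score is how you find out whether they do.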

Meetup Event

Odds & Ends - September 2015

Thoughts, terms, and ideas I’ve come across over the last few months.

  • Mobile experiences fill gaps while we wait. Nobody wants to wait while they wait.
    1. Perform actions optimistically
      • Show +1/Like/Comment before the request is even sent. Show an error if it fails.
    2. Adaptively pre-load content
      • Load data before it’s needed.
      • Re-prioritize what to load based on user interest.
    3. Move bits when no one is watching
      • Send contact list while signing in.
      • Start uploading files while the user is filling out the details.
      • Send data as soon as part of it is ready to go and match it up on the server later.
  • Code reviews are like having a shared brain.
  • Shed load on social media sites by having shorter feeds.
  • Achieve loose coupling using notifications, events, signals, etc.
  • Project scaling
    • Code standards so everything looks the same
    • Unit tests and design documents so it’s easy to switch implementations and make changes.
    • All you need is a prioritized work queue of ideas and bugs.
    • Be transparent.
    • Don’t be date-focused! Fixed dates are too idealistic.
    • Have short cycles, quick deliverables, and frequent estimates and re-estimates.
  • Web crawling techniques
    1. Start at a seed page and recursively follow all, or a subset, of its links.
    2. Identify a pattern in the URLs of the pages you want (e.g. resource/id/) and check every valid id in a range.
    3. Read the sitemap and choose which links to follow.
  • Append only event stores
    • Observations about the world are recorded in perpetuity, and the results derived from those observations are calculated on demand.
    • Example: Changes -> Append to transaction log -> DB is a roll-up view of the changes captured in the transaction log.
    • Don’t delete or update anything. Just accrete new knowledge and distil new implications based upon your increasing knowledge.
    • It’s like a warehouse. Shipments come in, shipments go out, and at any point, we can check the current inventory levels.
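
The warehouse analogy for append-only event stores can be sketched in a few lines of Python. This is only an illustration under my own assumptions: `Shipment`, `EventStore`, and `inventory` are hypothetical names, and a real store would persist the log rather than keep it in memory. The point is that nothing is updated or deleted, and the "current state" is a roll-up computed from the log on demand.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Shipment:
    """An immutable observation: stock came in (+) or went out (-)."""
    item: str
    quantity: int

class EventStore:
    """Append-only log: events can be added and read, never changed or removed."""
    def __init__(self):
        self._log = []

    def append(self, event):
        self._log.append(event)

    def events(self, upto=None):
        # Return a read-only snapshot; `upto` limits it to the first N events.
        return tuple(self._log[:upto])

def inventory(events):
    """Roll-up view: derive current stock levels by replaying the events."""
    levels = defaultdict(int)
    for event in events:
        levels[event.item] += event.quantity
    return dict(levels)

store = EventStore()
store.append(Shipment("widget", 10))   # shipment in
store.append(Shipment("gadget", 5))    # shipment in
store.append(Shipment("widget", -3))   # shipment out

print(inventory(store.events()))         # {'widget': 7, 'gadget': 5}
print(inventory(store.events(upto=2)))   # {'widget': 10, 'gadget': 5}
```

Because the log is never rewritten, the second call can answer "what did the inventory look like after the first two shipments?" without any extra bookkeeping.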