The Future of MongooseJS

Two weeks ago marked a big milestone: mongoose 3.9.0 was released. Be warned: mongoose's versioning practice is that even-numbered branches are stable and odd-numbered branches are unstable. While all our tests check out on 3.9.0, I recommend sticking to 3.8.x releases in production for now. 3.9.0 was mongoose's first unstable release since October 2013. While the changes in 3.9.0 were relatively minor, they open the door to getting some interesting features into 4.0. Here are some of the high-level features I think should make it into 4.0:

1) Update() with Validators

Mongoose right now doesn't run validators on calls to Model.update(). I've often found that it's more elegant and performant to call update() directly instead of loading the document, modifying it, and then saving it. Mongoose should have better support for this paradigm in the future.

2) Browser-friendly and browserify-friendly schema validation module

Currently, there’s no good way to send your schemas to the browser to do client-side validation. While introducing an API endpoint for validation is quite possible, hooking up mongoose schema validation directly to a tool like AngularJS in the browser can open up some incredibly cool opportunities.

3) Better integration with Koa.js and Harmony in general

Fair warning, I’m not well versed in the particulars of ES6 or Koa just yet, but I have noticed some people opening Github issues related to these subjects. As more people start moving to ES6, mongoose needs to have its A-game ready.


4) Per-document events

The general idea is that mongoose doesn't scope document events to a particular document. That is, doc1.on('event') will get triggered by doc2.emit('event') if doc1 and doc2 are instances of the same model. This is expected behavior now, but it's very counterintuitive. At the very least, in 4.0, doc1.on('event') will only get triggered by doc2.emit('event') if doc1 and doc2 are the same JS object. However, we may introduce behavior where doc1.on('event') gets triggered by doc2.emit('event') if doc1 and doc2 have the same _id.

5) Reworking Population

Populate is extremely useful, but also has some very unfortunate dark corners and counter-intuitive behavior that I’d like to rework. There are numerous features, such as caching integration, manual population, and populating on fields other than _id that the current implementation makes very difficult. I’m hoping to get all these features into 4.0.

I’m still very much in the planning stages for mongoose 4.0, so comments, concerns, and feature suggestions are very much welcome. Feel free to open up issues on Github with features you’d like to see in 4.0.

What’s New in Mongoose 3.8.9

I have an important announcement to make: over the last couple of weeks I've been taking over maintaining mongoose, the popular MongoDB/NodeJS ODM. I have some very big shoes to fill; Aaron Heckmann has done an extraordinary job building mongoose into an indispensable part of the NodeJS ecosystem. As an avid user of mongoose over the last two years, I look forward to continuing mongoose's storied tradition of making dealing with data elegant and fun. However, mongoose isn't perfect, and I'm already looking forward to the next major stable release, 4.0.0. Suggestions are most welcome, but please be patient; I'm still trying to catch up on the backlog of issues and pull requests.

On to what’s new in 3.8.9

On that note, mongoose 3.8.9 was (finally) released yesterday. This was primarily a maintenance release; the major priority was to clean up several test failures against the new stable version of the MongoDB server, 2.6.x, without any backward-breaking API changes. I'm proud to say that 3.8.9 should be compatible with MongoDB 2.2.x, 2.4.x, and 2.6.x. In addition, I added improved support for a couple of key MongoDB 2.6 features:

Support for Text Search in MongoDB 2.6.x

As I mentioned in my post on text search, mongoose 3.8.8 didn’t quite support text search yet: mongoose prevented you from sorting by text score. This commit, which went into mquery 0.6.0, allows you to use the new $meta operator in sort() calls. Here’s an example of how you would use text search with sorting in mongoose:

/* Blog post collection with two documents:
 * { title : 'text search in mongoose' }
 * { title : 'searching in mongoose' }
 * and a text index on the 'title' field */
BlogPost.find(
    { $text : { $search : 'text search' } },
    { score : { $meta : 'textScore' } }).
  sort({ score : { $meta : 'textScore' } }).
  exec(function(error, documents) {
    assert.equal(2, documents.length);
    assert.equal('text search in mongoose', documents[0].title);
    assert.equal('searching in mongoose', documents[1].title);
  });
The relevant test case can be found here (there's also test coverage for text search without sorting). Please note that you're responsible for making sure you're running MongoDB >= 2.6.0; running text queries against older versions of MongoDB will not give you the expected behavior. MongoDB's docs on text search can be found here.

Aggregation helper for $out:

As I mentioned in my post about the aggregation framework’s $out pipeline stage (which pipes the aggregation output to a collection), mongoose’s aggregate() function doesn’t prevent you from using $out. However, mongoose also supports syntactic sugar for chaining helper functions onto aggregate() for building an aggregation pipeline:

// `MyModel` is a stand-in for your model; group() adds a $group stage.
MyModel.aggregate().
  group({ _id : '$field' }).
  exec(function (err, res) {
    // res contains the aggregation results
  });

This commit adds a .out() helper function that you can use to add a $out stage to your pipeline. Note that you’re responsible for making sure that the .out() function is the last stage of your pipeline, because the MongoDB server will return an error if it isn’t. The relevant test case can be found here. Here’s how the new helper function looks in action:

var outputCollection = 'my_output_collection';

// `MyModel` is a stand-in for your model.
MyModel.aggregate().
  group({ _id : '$field' }).
  out(outputCollection).
  exec(function(error, result) {
    // the aggregation output now lives in 'my_output_collection'
  });

A Minor Caveat For 2.6.x Compatibility

There is still one unfortunate edge case remaining in 3.8.9 which only affects MongoDB 2.6.x. MongoDB 2.6.x unfortunately no longer allows empty $set operators to be passed to update() and findAndModify(). This change only affects mongoose in the case where you set the upsert flag to true. This commit attempts to mitigate this API inconsistency, but there is still one case where you will get an error on MongoDB 2.6.x but not in 2.4.x: if the query passed to your findAndModify() only includes an _id field. For example,

// `MyModel` is a stand-in for your model; note the empty update document.
MyModel.findOneAndUpdate(
  { _id: 'MY_ID' },
  { },
  { upsert: true },
  function(error, document) {
  });

will return a server error on MongoDB 2.6.1 but not on 2.4.10. Right now, there is no good way to handle this case in both 2.4 and 2.6 without either doing an if-statement on the version or breaking the existing API. You can track the progress of this issue on Github.


Hope y'all are as excited about mongoose's future as I am. There are lots of exciting ideas that I'm looking forward to getting into mongoose 4.0. You're more than welcome to add suggestions for new features or behavior changes on Github issues. I'm looking forward to seeing what y'all come up with for improving mongoose and what y'all will be able to do with future versions.


A NodeJS Perspective on What’s New in MongoDB 2.6, Part II: Aggregation $out

From a performance perspective as well as a developer productivity perspective, MongoDB really shines when you only need to load one document to display a particular page. A traditional hard drive only needs one sequential read to load a single MongoDB document, which limits your performance overhead. In addition, much like how Nas says life is simple because all he needs is one mic, grouping all the data for a single page into one document makes understanding and debugging the page much simpler.

A place where the one document per page heuristic is particularly relevant is on pages that display historical data. Loading a single user object is fast and simple, but running an aggregation to compute the average number of times per month a user performed a certain action over the last 6 months is a costly operation that you don’t necessarily want to do on-demand. NodeJS devs are spoiled in this regard, because scheduling in NodeJS is extremely simple. You can easily schedule these aggregations to run once per day and avoid the performance overhead of running the aggregation every time a user hits the particular page.

However, before MongoDB 2.6, shipping the results of an aggregation into a separate collection required pulling the aggregation results in through the NodeJS driver and inserting them back into MongoDB. Furthermore, aggregation results were limited to 16MB in size, which made doing aggregations that would output one document per user impossible. MongoDB 2.6, however, introduced a $out aggregation pipeline stage, which writes the output of the aggregation to a separate collection, and removed the 16MB aggregation limit.

Getting transformed data $out of aggregation

Let's take a look at how this can be used in practice in NodeJS. Recall the food journal app from the first part of this series: let's add a route that will display the user's average calories per day broken down on a per-week basis. This involves a slow and complex aggregation, so we'll schedule this aggregation to run once per day and dump its data to a new collection using $out. The data for this route will get recomputed for all users using one aggregation, and each time the user hits the API endpoint all the server will do is read one document. Here's what the aggregation looks like in NodeJS (you can copy/paste this aggregation pipeline into a mongo shell and get the same result). This code is also available on Github.

// `Days` stands in for the model backing the food journal's daily entries.
Days.aggregate([
  // Pull out week of the year, day of the week, and year from the date
  {
    $project : {
      week : { $week : "$date" },
      dayOfWeek : { $dayOfWeek : "$date" },
      year : { $year : "$date" },
      user : "$user",
      foods : "$foods"
    }
  },
  // Generate a document for each food item
  { $unwind : "$foods" },
  // And for each nutrient
  { $unwind : "$foods.nutrients" },
  // Only care about calories
  {
    $match : {
      'foods.nutrients.tagname' : 'ENERC_KCAL'
    }
  },
  // Add up calories for each week, keeping track of how many days in that
  // week the user recorded eating something. Output one document per
  // user and week.
  {
    $group : {
      _id : {
        week : "$week",
        user : "$user",
        year : "$year"
      },
      days : { $addToSet : '$dayOfWeek' },
      calories : {
        $sum : {
          $multiply : [
            { $divide : ['$foods.selectedWeight.grams', 100] },
            '$foods.nutrients.amountPer100G'
          ]
        }
      }
    }
  },
  // Aggregate all the documents on a per-user basis.
  {
    $group : {
      _id : "$_id.user",
      weeks : { $push : "$_id.week" },
      yearForWeek : { $push : "$_id.year" },
      daysPerWeek : { $push : "$days" },
      caloriesPerWeek : { $push : "$calories" }
    }
  },
  // Output to the 'weekly_calories' collection
  {
    // Hardcode the string here so we can copy/paste this aggregation into
    // the shell for instructional purposes.
    $out : 'weekly_calories'
  }
], callback);

The particular details of the aggregation aren't that important; what really matters is the $out stage at the end. The $out stage does something very cool: not only will the resulting documents get inserted into a new collection called weekly_calories, but $out will also overwrite the existing collection once the aggregation completes. In other words, if this aggregation runs for an hour, the weekly_calories collection will remain unchanged until the aggregation is done. After the aggregation finishes, the weekly_calories collection is atomically replaced by the result of the aggregation. Note that, right now, $out doesn't have any way of appending to the output collection; it can only overwrite it. Design your aggregations accordingly.

Taking a look at the results

Using a bit of NodeJS magic, we can wrap this aggregation in a service that uses node-cron to schedule itself to run once per day at 0030 (12:30 am) server time:
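In the app, node-cron does the cron-expression parsing; stripped of that dependency, the core of such a service is just computing the delay until the next 0030 and re-arming a timer. Here's a minimal stdlib-only sketch of that idea (`runWeeklyCalorieAggregation` is a hypothetical stand-in for kicking off the aggregation above):

```javascript
// Milliseconds from `now` until the next occurrence of hour:minute.
function msUntil(now, hour, minute) {
  var next = new Date(now.getTime());
  next.setHours(hour, minute, 0, 0);
  if (next <= now) {
    // Already past hour:minute today; schedule for tomorrow.
    next.setDate(next.getDate() + 1);
  }
  return next.getTime() - now.getTime();
}

// Run `task` every day at hour:minute, re-arming the timer after each run.
function scheduleDaily(hour, minute, task) {
  setTimeout(function run() {
    task();
    setTimeout(run, msUntil(new Date(), hour, minute));
  }, msUntil(new Date(), hour, minute));
}

// scheduleDaily(0, 30, runWeeklyCalorieAggregation);
```
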


We can then inject this service into an ExpressJS route and expose the route as a GET /api/weekly JSON API endpoint:

// app.js
app.get('/api/weekly', checkLogin, api.byWeek.inject(di));

// api.js
exports.byWeek = function(weeklyCalorieAggregator) {
  return function(req, res) {
    weeklyCalorieAggregator.get(req.user.username, function(error, doc) {
      if (error) {
        return res.json(500, { error : error });
      }
      res.json(doc);
    });
  };
};

A little extra work (git diff) to put together a UI that displays the data from GET /api/weekly gives a very satisfying result.


NodeJS Project Version Compatibility

Good news: this time around, the latest versions of node-mongodb-native (1.4.2), mquery (0.6.0), and mongoose (3.8.8) all support $out in aggregation. I've also run the above aggregation with versions 1.3 and 1.2 of node-mongodb-native and version 3.6 of mongoose, and those handle $out correctly too.


MongoDB 2.6’s improvements to the aggregation framework are a quantum leap forward, and enable you to do some amazing things. While scheduled analytics calculations certainly aren’t the only use case of $out, I hope this post showed you at least one way in which $out allows you to play to MongoDB’s strengths in a new way.

This is Part II of a 3-part series on using new MongoDB 2.6 features in NodeJS. Part III of this series is coming up in 2 weeks, in which I’ll take a look at some of MongoDB 2.6’s query framework improvements, primarily index filters.

A NodeJS Perspective on What’s New in MongoDB 2.6, Part I: Text Search

MongoDB shipped the newest stable version of its server, 2.6.0, this week. This new release is massive: there were about 4000 commits between 2.4 and 2.6. Unsurprisingly, the release notes are a pretty dense read and don’t quite convey how cool some of these new features are. To remedy that, I’ll dedicate a couple posts to putting on my NodeJS web developer hat and exploring interesting use cases for new features in 2.6. The first feature I’ll dig in to is text search, or, in layman’s terms, Google for your MongoDB documents.

Text search was technically in 2.4, but it was an experimental feature and not part of the query framework. Now, in 2.6, $text is a full-fledged query operator, enabling you to search for documents by text in 15 different languages.

Getting Started With Text Search

Let’s dive right in and use text search on the USDA SR-25 data set described in this post. You can download a mongorestore-friendly version of the data set here. The data set contains 8194 food items with associated nutrition data, and each food item has a human-readable description, e.g. “Kale, raw” or “Bison, ground, grass-fed, cooked”. Ideally, as a client of this data set, we shouldn’t have to remember whether we need to enter “Bison, grass-fed, ground, cooked” or “Bison, ground, grass-fed, cooked” to get the data we’re looking for. We should just be able to put in “grass-fed bison” and get reasonable results.

Thankfully, text search makes this simple. To do text search, we first need to create a text index on our copy of the USDA nutrition collection. Let's create one on the food item's description:

db.nutrition.ensureIndex({ description : "text" });

Now, we can search the data set for our “raw kale” and “grass-fed bison”, and see what we get:

db.nutrition.find(
  { $text : { $search : "grass-fed bison" } },
  { description : 1 });

db.nutrition.find(
  { $text : { $search : "raw kale" } },
  { description : 1 });


Unfortunately, the results we got aren’t that useful, because they’re not in order of relevance. Unless we explicitly tell MongoDB to sort by the text score, we probably won’t get the most relevant documents first. Thankfully, with the help of the new $meta keyword (which is currently only useful for getting the text score), we can tell MongoDB to sort by text score as described here:

db.nutrition.find(
    { $text : { $search : "raw kale" } },
    { description : 1, textScore : { $meta : "textScore" } }).
  sort({ textScore : { $meta : "textScore" } });
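For intuition about why sorting by score matters, here's a toy bag-of-words scorer in plain JavaScript. This is a deliberate simplification, not MongoDB's actual text-scoring algorithm, and the documents below are made up:

```javascript
// Toy relevance score: count how many query tokens appear in the text.
function score(query, text) {
  var tokens = query.toLowerCase().split(/\W+/);
  var words = text.toLowerCase().split(/\W+/);
  return tokens.filter(function (t) {
    return words.indexOf(t) !== -1;
  }).length;
}

var docs = ['Kale, raw', 'Kale, frozen, cooked', 'Cabbage, raw'];
docs.sort(function (a, b) {
  return score('raw kale', b) - score('raw kale', a);
});
console.log(docs[0]); // 'Kale, raw' matches both tokens, so it sorts first
```
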

Using Text Search in NodeJS

First, an important note on the compatibility of text search with NodeJS community projects: the MongoDB NodeJS driver is compatible with text search going back to at least 1.3.0. However, only the latest version of mquery, 0.6.0, is compatible with text search. By extension, the popular ODM Mongoose, which relies on mquery, unfortunately doesn't have a text-search-compatible release at the time of this blog post. I pushed a commit to fix this, and the next version of Mongoose, 3.8.9, should allow you to sort by text score. In summary, to use MongoDB text search, here are the version restrictions:

MongoDB NodeJS driver: >= 1.4.0 is recommended, but it seems to work going back to at least 1.2.0 in my personal experiments.

mquery: >= 0.6.0.

Mongoose: >= 3.8.9 (unfortunately not released yet as of 4/9/14)

Now that you know which versions are supported, let's demonstrate how to actually do text search with the NodeJS driver. I created a simple food journal app (i.e. an app that counts calories for you when you enter in how much of a certain food you've eaten) that is meant to tie in to the SR-25 data set. This app is available on GitHub here, so feel free to play with it.

The LeanMEAN app exposes an API endpoint, GET /api/food/search/:search, that runs text search on a local copy of the SR-25 data set. The implementation of this endpoint is here. For convenience, here is the actual implementation, where the foodItem variable is a wrapper around the Node driver’s connection to the SR-25 collection.

/* Because MongooseJS doesn't quite support sorting by text search score
 * just yet, just use the NodeJS driver directly */
exports.searchFood = function(foodItem) {
  return function(req, res) {
    var search = req.params.search;
    foodItem.find(
        { $text : { $search : search } },
        { score : { $meta: "textScore" } }).
      sort({ score: { $meta : "textScore" } }).
      toArray(function(error, foodItems) {
        if (error) {
          res.json(500, { error : error });
        } else {
          res.json(200, { foodItems : foodItems });
        }
      });
  };
};

Unsurprisingly, this code looks pretty similar to the shell version, so it shouldn’t look unfamiliar to you NodeJS pros :)

Looking Forward

And that’s all on text search for now. In the next post (scheduled for 4/25), we’ll tackle some of the awesome new features in the aggregation framework, including text search in aggregation.


Plugging USDA Nutrition Data into MongoDB

As much as I love geeking out about basketball stats, I want to put a MongoDB data set out there that’s a bit more app-friendly: the USDA SR25 nutrient database. You can download this data set from my S3 bucket here, and plug it into your MongoDB instance using mongorestore. I’m very meticulous about nutrition and have, at times, kept a food journal, but sites like FitDay and DailyBurn have far too much spam and are far too poorly designed to be a viable option. With this data set, I plan on putting together an open source web-based food journal in the near future. However, I encourage you to use this data set to build your own apps.

Data Set Structure

The data set contains one collection, ‘nutrition’. The documents in this collection contain merged data from the SR25 database’s very relational FOOD_DES, NUTR_DEF, NUT_DATA, and WEIGHT files. In more comprehensible terms, the documents contain a description of a food item, a list of nutrients with measurements per 100g, and a list of common serving sizes for that food. Here’s what the top level document for grass-fed ground bison looks like in RoboMongo, a simple MongoDB GUI:

The top level document is fairly simple: the description is a human-readable description of the food, the manufacturer is the company that manufactures the product, and survey indicates whether or not the data set has values for the 65 nutrients used for some government survey. However, the real magic happens in the nutrients and weights subdocuments. Let's see what happens when we open up nutrients:

You'll see that there are an incredible number of nutrients. The nutrients data is in an array, where each subdocument in the array has a tagname, which is a common scientific abbreviation for the nutrient; a human-readable description; and an amountPer100G with corresponding units. In the above example, you'll see that 100 grams of cooked grass-fed ground bison contains about 25.45 g of protein.

(Note: the original data set includes some more detailed data, including standard deviations and sample sizes for the nutrient measurements, but that’s outside the scope of what I want to do with this data set. If you want that data, feel free to read through the government data set’s documentation and fork my converter on github.)

Finally, the weights subdocument is another array containing subdocuments that describe common serving sizes for the food item and their mass in grams. In the grass-fed ground bison example, the weights list contains a single serving size, 3 oz, which is approximately 85 grams:
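Putting the pieces together, one document in the nutrition collection looks something like the abridged sketch below. The field values (including the PROCNT tagname) are illustrative, modeled on the bison example above, not copied from the real data set:

```javascript
// Abridged sketch of one document in the 'nutrition' collection.
var foodItem = {
  description : 'Bison, ground, grass-fed, cooked',
  manufacturer : '',
  survey : 'Y',
  nutrients : [
    {
      tagname : 'PROCNT',        // common scientific abbreviation
      description : 'Protein',
      amountPer100G : 25.45,
      units : 'g'
    }
    // ...more nutrient subdocuments
  ],
  weights : [
    { description : '3 oz', grams : 85 }
  ]
};

console.log(foodItem.nutrients[0].amountPer100G); // 25.45
```
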

Exploring the Data Set

First things first: since the nutrients for each food are in an array, it's not immediately obvious what nutrients this data set has. Thankfully, MongoDB's distinct command makes this very easy:

db.nutrition.distinct('nutrients.description');

There are a lot of different nutrients in this data set. In fact, there are 145.

So how are we going to find nutrient data for a food that we're interested in? Suppose we're looking to find how many carbs are in raw kale. This is pretty easy to do because MongoDB's shell supports JavaScript regular expressions, so let's just find documents where the description includes 'kale':

db.nutrition.find({ description : /kale/i }, { description : 1 });

Of course, this doesn't include the carbohydrate content, so let's add an $elemMatch to the projection to limit output to the carbohydrates in raw kale:

db.nutrition.find(
  { description : /kale, raw/i },
  { description : 1,
    nutrients : { $elemMatch : { description : /Carbohydrate/ } } });

Running Aggregations to Test Nutritional Claims

My favorite burger joint in Chelsea, brgr, claims that grass-fed beef has as much omega-3 as salmon. Let's see if this advertising claim holds up to scrutiny.

Right now, this is a bit tricky. Since I imported the data from the USDA as-is, total omega-3 fatty acids is not tracked as a single nutrient. The amounts for individual omega-3 fatty acids, such as EPA and DHA, are recorded separately. However, the different types of omega-3 fatty acids all have n-3 in the description, so it should be pretty easy to identify which nutrients we need to sum up to get total omega-3 fatty acids. Of course, when you need to significantly transform your data, it's time to bust out the MongoDB aggregation framework.

The first aggregation we're going to do is find the salmon item that has the least amount of total omega-3 fatty acids per 100 grams. To do that, we first need to transform the documents to include the total amount of omega-3s, rather than the individual omega-3 fats like EPA and DHA. With the $group pipeline stage and the $sum operator, this is pretty simple. Keep in mind that the nutrient amounts for omega-3 fatty acids are always in grams in this data set, so we don't have to worry about unit conversions.
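The per-document computation behind that $group/$sum step (match nutrients whose description contains n-3, then add up their amounts) can be expressed in plain JavaScript. The sample document below is hypothetical, not a real SR25 entry:

```javascript
// Sum the amountPer100G of every nutrient whose description contains 'n-3'.
// This mirrors the $match + $group/$sum steps of the aggregation.
function totalOmega3Per100G(doc) {
  return doc.nutrients.
    filter(function (n) { return /n-3/.test(n.description); }).
    reduce(function (sum, n) { return sum + n.amountPer100G; }, 0);
}

// Hypothetical document for illustration only:
var salmon = {
  description : 'Fish, salmon, example',
  nutrients : [
    { description : '18:3 n-3 c,c,c (ALA)', amountPer100G : 0.25 },
    { description : '20:5 n-3 (EPA)', amountPer100G : 0.75 },
    { description : 'Protein', amountPer100G : 20 }
  ]
};

console.log(totalOmega3Per100G(salmon)); // 1
```
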

You can get a text version of this aggregation on Github. To verify brgr's claim, let's run the same aggregation for grass-fed ground beef, but with the sort order reversed.

Looks like brgr’s claim doesn’t quite hold up to a cursory glance. I’d be curious to see what the basis for their claim is, specifically if they assume a smaller serving size for salmon than for grass-fed beef.


Phew, that was a lot of information to cram into one post. The data set, as provided by the USDA, is a bit complex and could really benefit from some simplification. Thankfully, MongoDB 2.6 is coming out soon and, with it, the $out aggregation operator. The $out operator will enable you to pipe output from the aggregation framework to a separate collection, so I'll hopefully be able to add total omega-3 fatty acids as a nutrient, among other things. Once again, feel free to download the data set here (or check out the converter repo on Github) and use it to build some awesome nutritional apps.



Why Math is Necessary for CS Majors

While math and computer science have been lumped together for about as long as the latter has existed, there’s a lot of backlash recently toward the idea that a solid math background is integral to being a good developer. The relationship between the two was something that I struggled to grasp as an undergraduate in Computer Science. The relationship between math and CS isn’t as direct as, say, math and physics, or even philosophy and CS. However, taking a rigorous pure math course as an undergraduate will help you significantly, whether you choose to be an ivory tower academic, a developer for the latest hip startup out of Silicon Valley, or an engineer for a big NYC bank.

The reason why has nothing to do with learning what most people would call "practical skills." Even as an undergraduate specializing in theory and theoretical computer vision, I realized that the limit of my practical use of mathematics was a high school-level understanding of linear algebra, some basic graph theory, and whatever I needed for big-O notation. While some advanced mathematics, like Galois theory, has CS-related applications, you probably won't use it in CS outside of the most cloistered of ivory towers. I can honestly say that, in the 8 years since I got my first software engineering internship back in high school, I've never had to use anything I learned in undergrad Real Analysis or Galois Theory (thankfully, because I honestly deserved to fail Galois Theory). So clearly, when I completed the required courses to graduate as a math major, I was wasting my time, right? Wrong!

The fallacy in the above reasoning is that learning CS isn't a video-game-style tech tree. Just because understanding a proof of Stokes' Theorem isn't a strict prerequisite for being an effective developer doesn't mean that it doesn't help. As the Irish poet W.B. Yeats once wisely said, "education is not the filling of a pail, but the lighting of a fire." Similarly, learning to be a developer isn't about crossing off a checklist of practical skills and making your resume look like a buzzword bingo board. Learning to be a developer is about practice (which is why Allen Iverson wasn't a software developer), and what a pure math class gives you is a slightly different environment in which to practice your skills. Once you have spent a little time looking at both, you'll realize that going through a pure math textbook and remembering the correct theorems and lemmas to use in a homework proof is pretty damn similar to figuring out which modules you need to effectively add some new functionality to your codebase.

Many engineers bemoan the lack of unit testing instruction in undergrad CS curricula, but they forget that unit tests are only useful if you have the rigor to write them in the first place. Not only is the process of figuring out which theorems to use an exercise in dependency management, but the process of proving a theorem is similar to writing unit tests to prove the correctness of your module. A lot of developers nowadays have copped a bad attitude when it comes to writing proper unit tests, saying that their code is trivially correct. I bet these people haven't sat down to prove that there are no rational numbers satisfying the equation x^2 = 2 either, and I think that a solid grounding in pure math can nip this weakness in the bud. Seen this way, Rudin's Principles of Mathematical Analysis, the bane of every freshman math major's existence, is essentially a large codebase for you to practice on.
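For reference, the classic argument that no rational number squares to 2 is only a few lines:

```latex
% Classic proof by contradiction that x^2 = 2 has no rational solution.
Suppose $x = p/q$ in lowest terms with $x^2 = 2$. Then
\[ p^2 = 2q^2, \]
so $p^2$ is even, hence $p$ is even; write $p = 2k$. Then
\[ 4k^2 = 2q^2 \quad\Rightarrow\quad q^2 = 2k^2, \]
so $q$ is even too, contradicting that $p/q$ was in lowest terms.
```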

Similarly, graph theory is absolutely integral to the day-to-day of being a software developer, even if you're not working with graphs directly. Speaking of modules, to an experienced developer, a well-organized codebase looks a lot like a graph. Pieces of code are bits of logic with dependencies, which are references to other bits of logic; this intuitively maps to nodes and edges. All refactoring work comes down to thinking about graphs and how to make your code graph comprehensible. Beyond this simple example of refactoring and managing dependencies in code, the applications of reasoning about graphs in software development are endless, from breaking a bulky task down into components to thinking about points of failure in a sophisticated network topology. While I've never had to think about Hadwiger's conjecture in a professional context, my undergrad Graph Theory course gave me a lot of valuable practice reasoning about graphs in a rigorous way. This practice continues to serve me well to this day, whether I'm trying to organize my dependencies in AngularJS, thinking about the topology of my MongoDB cluster, or just figuring out what tasks I need to get done today.

Bottom line, kids out there, if you really want to be a successful software developer, taking proof-based math (and proof-based graph theory in particular) is an excellent step in the right direction. It won’t be easy, but becoming good at something never is.


The Optimal Setup for Listening to Talks at 2x Playback Speed

If you’re an avid podcast listener and online courseware consumer like I am, odds are you’ve gotten frustrated with how long it takes to listen to a single lecture. An hour-long podcast on Bulletproof Executive? 20 minutes listening to a TEDTalk from a HackDesign lesson? No offense to these awesome content creators, but ain’t nobody got time for that.


Thankfully, you can listen to Youtube videos and mp3’s at 2x speed pretty easily. While processing speech at twice the speed may seem intimidating, with a little preparation and a simple biohack, you can absorb information from 2x playback speed as well as you do at 1x.

Technical Details

So how do you actually take all these talks and listen to them at 2x playback speed? Currently, I rely on either finding the talk on Youtube or getting a downloadable mp3 version. Most podcasts I’ve seen link to downloadable mp3 versions, and most TED talks are pretty easy to find on Youtube, so I haven’t found this limitation to be significant.

To listen to Youtube videos at 2x, opt in to using Youtube's html5 player here. Obviously, you need an html5-enabled browser, but if you're using a recent version of Chrome, Firefox, or Opera like a civilized human being, you should be fine. Once you've opted in, you should see a cog icon on Youtube videos that lets you change the playback speed. As a first experiment, try watching Andrew Stanton's excellent TEDTalk about storytelling at 2x.


Listening to mp3's at 2x is also extremely simple. My preferred approach uses VLC media player, but, if you're willing to take the questionable risk of allowing Apple products on your computer, QuickTime Player works just as well. In VLC's top menu bar, Playback -> Speed -> Faster increases the playback speed by 50% of normal speed, so make sure you select it twice to get to our desired 2x playback speed.


Get Focused

One of the obvious difficulties inherent to listening at 2x playback speed is that you miss more when you lose focus. You can get away with distractions when listening to talks with a lot of fluff, like the storytelling TEDTalk above, but if you lose focus for 30 seconds during a Bulletproof Exec podcast because of a gchat notification, you're going to be lost. When listening to 2x audio, you should channel the guy from My New Haircut and not let anything or anyone interrupt you while you're in the zone. Here are a couple of tips to cut down on your distractions:

1) Exit out of all email tabs, IM clients, Facebook, and any other notification-generating apps. This includes putting your phone on silent.

2) Don’t actually watch the corresponding video. Unless somebody’s drawing a diagram, the visuals of the talk don’t contribute much to the actual content, and can be a source of distraction. Instead, point your browser to a very static and very boring page, like my personal favorite,

3) Binaural beats are a simple and powerful biohack that really help get your mind in the proper state for absorbing information. At a high level, binaural beats consist of two tones played at slightly different frequencies through your headphones. For example, one ear hears a 310 Hz tone, the other a 300 Hz tone, which helps entrain 10 Hz brain waves, i.e. alpha and mu waves. The theory is simple enough, so I recommend you head over to this Youtube channel and try it!

Personally, I usually start a 12 Hz binaural beat shortly before listening to a talk or podcast at 2x and keep it playing throughout. Not only do binaural beats help optimize your mental state, but they also provide a consistent baseline of sound to block out extraneous noise from your home, office, or crowded commuter train. Conventional wisdom around binaural beats usually says that a 8-10 Hz beat is optimal for learning new information, but 12 Hz works better in my own highly unquantified N=1 experiment.


I hope this information helps you get started in optimizing your information consumption. As a developer, I’m all about efficiency. And after starting this routine, I’ve been able to regularly digest my favorite online audio content in half the time, which has been a huge win.