Yesterday I presented at NoSQL Now! 2013 in San Jose, CA, on how Castlight Health uses MongoDB to support low latency geospatial searches on huge amounts of health care pricing data. Given the number of parallel tracks and the total number of people attending sessions, I was pleased to get about 50 engaged people in my session.
When I presented a very similar talk at MongoDB Days SF in May, we had about 600 million prices. In August we went over 1 billion. That’s a lot of JSON documents.
The main update in this version of the presentation is that I described my process of switching from the haystack index to the 2DSphere index. The thing that has changed is our data. It is now more common for us to have very large collections of prices in which the majority of rates in an area are for provider networks that are not in-network for the user. The 2DSphere index allows me to add the provider network id as a third part of the composite index. The performance I’m now seeing is anywhere from 10% slower to 5 times faster. My admittedly statistically insignificant testing so far suggests I’m going to get an average speedup of about 50%.