MongoDB aggregation queries for ‘counts per day’ (part 1)

I need a MongoDB query to retrieve document counts per day to feed a heatmap display (using https://kamisama.github.io/cal-heatmap/), for my Amateur Radio received signals historical visualization service, SpotViz.

The data to feed Cal-heatmap looks like this:

{
"946721039":4,
"946706853":2,
"946706340":7,
...
}

What’s interesting about this data structure is the property name is variable, and I’m not sure how to project a result into a property name in a MongoDB query. I asked this question on StackOverflow: “Return a computed value as field name in MongoDB query?” – so far I haven’t had any answers or suggestions, so I’m not sure this is possible.

There doesn’t seem to be a way to do exactly what I need, so my next challenge was how to group documents per day (ignoring the time part of a date), and return a count per day.

I started with a working Aggregation query from the shell, and then took that and implemented using the MongoDB Java api. The challenge with this query is that there doesn’t seem to be any out of the box feature that allows you to select matching documents based on a date and exclude the time portion of new Date(). What I need is the equivalent of ‘find counts of documents that are grouped by the same day’. The catch is to not group docs by exactly the same yyyy/MM/dd hh:mm:ss values, but to group by only the same yyyy/MM/dd values.

Since there is a way to extract the year, month and day values from a date with the aggregation $year, $month, $dayOfMonth operators, these could be used to get the result I need (the counts per day), but this format doesn’t help me get the property name for the counts in a seconds past 1/1/1970, e.g. “946721039”.

A query using this approach would look like this:

db.Spot.aggregate(
[
  {$match: {spotter: "kk6dct"}},
  {$group: { _id : {
    year:{$year:"$spotReceivedTimestamp"},
    month:{$month:"$spotReceivedTimestamp"},
    day:{$dayOfMonth:"$spotReceivedTimestamp"}
    },
    count:{$sum: 1 }
  }
}
])

… this approach follows a suggestion from this SO post.

This approach to group the document counts by day is good, but it doesn’t return the docs in the format I need with each day represented by seconds since 1/1/1970.

A better approach would be to group by millis for the date, and return that value. Converting a date in mongo to another format however seems to be somewhat challenging – I spent probably far too much time to work out a query to do this, getting close, but still not what I wanted, and ended up with this rather complex query:

db.Spot.aggregate(
[
  {$match: {spotter: "kk6dct"}},
  {$group: { _id : {
    yearval:{$year:"$spotReceivedTimestamp"},
    monthval:{$month:"$spotReceivedTimestamp"},
    dayval:{$dayOfMonth:"$spotReceivedTimestamp"},
    "h" : {
      "$hour" : "$spotReceivedTimestamp"
      },
    "m" : {
      "$minute" : "$spotReceivedTimestamp"
    },
    "s" : {
      "$second" : "$spotReceivedTimestamp"
    },
    "ml" : {
      "$millisecond" : "$spotReceivedTimestamp"
    }
  },
  count:{$sum: 1 }
}
},

{$project :

{
  "date" : {
  "$subtract" : [
  "$spotReceivedTimestamp",
  {
    "$add" : [
      "$ml",
      { "$multiply" : [ "$s", 1000 ] },
      { "$multiply" : [ "$m", 60, 1000 ] },
      { "$multiply" : [ "$h", 60, 60, 1000 ] }
    ]
  }
]
}
}
}
])

What I was attempting to do with this approach was to use the $project stage to subtract the $hour, $minute and $second values converted to millis from each of the timestamp values to get just the millis value of the yyyy/MM/dd but ignoring the time part. This is about as close as I got, but I couldn’t get the math to work, or at least convert between types so the calculations would work the way I wanted.

My next attempt was based on the suggestion in this SO post. This is a much simpler approach to the problem – my new query looks like this:

db.Spot.aggregate( [
  {$match: {spotter: "kk6dct"}},
  {"$group": {
    "_id": {
      "$subtract": [
        { "$subtract": [ "$spotReceivedTimestamp", new Date("1970-01-01") ] },
        { "$mod": [ { "$subtract": [ "$spotReceivedTimestamp", new Date("1970-01-01") ] }, 1000 * 60 * 60 * 24 ] }
      ]
    },
    count:{$sum: 1 }
  }
}
])

If I try and break this down into words, then what I’m doing is:

– for date x, calculate millis since 1/1/1970 (the epoch date)

– subtract from this the number of millis since the start of the day (this is the millis since 1/1/1970 mod number of millis in a day, the remainder of one divided by the other)

… the result is the millis of each date at midnight, i.e. excluding the time part.

Ok, almost there! How I then took this query and converted into the MongoDB Java Drvier API is coming in part 2.

Windows Phone – is it dead yet?

In recent weeks, Microsoft has axed Stephen Elop, former CEO from acquired Nokia and recent lead of their Mobile Devices Group, written off $7.6 billion as a loss from their failed acquisition of Nokia, cut 7,800 jobs, mostly in their Windows Phone business, is getting slammed with bold statements in the press such as “Windows Phone is dead. Microsoft Lumia is dead”, and still, pro-Microsoft industry bloggers like Paul Therrott are holding on to beliefs that maybe it’s not dead yet, with statements like “Analysis: Microsoft is scaling back on Windows Phone Drastically“.

Uhuh. Really? It’s Dead. Let it go. It sucked and no-one bought it. Apart from that one guy somewhere.

Getting Started with Cal-HeatMap and the AngularJS Cal-HeatMap directive

Working on my AngularJS app that visualizes historical playback of received Amateur Radio signals (www.spotviz.info) I wanted to have a ‘heatmap’ type display that shows density of the received signals per day (or hour). Searching around, the most common library to display a heatmap seems to be Cal-HeatMap. And since this is an AngularJS based app, it makes sense to use a directive to display the heatmap in my view where needed – Angular Cal-HeatMap seemed to be what I needed.

Starting with anything new it’s far easier to build something small that works first, and then include it into something larger/more complicated. Here’s my standalone test app using Angular Cal-HeatMap with Cal-HeatMap. It has hardcoded data, just to show the heatmap rendering.

The first issue I ran into, was that my standalone app was rendering fine, but copying the same directive usage into my SpotViz app, I could not get the directive to recognize my config options, or render any data.

As with everything I’ve found so far with AngularJS, sometimes stuff that should be easy turns out to be hard, but in most cases only because I don’t understand what’s going on, or how I assume it to be working is completely wrong 🙂 At least I’m not the only one who finds this about AngularJS, to quote Nathan LeClair from one of his posts:

"As I’ve gotten a little into AngularJS I’ve been surprised
by how often my assumptions about how things will work 
have turned out to be wrong".

Lesson #1:

I don’t know if this is an issue specifically with how Angular Cal-HeatMap directive is implemented or not, but the way I was trying to use it, it appears it initializes before the point where I retrieve and setup the data for display. Once I retrieve the data, setting it in the $scope variable that the directive is using seems to have no effect; it doesn’t see the data and display as I’d expect.

My work around for this was to use ng-if to only initialize the directive when I toggle a flag after the data is ready. Previously I was using ng-show to toggle the display of the directive, but that apparently causes the directive to initialize even though it’s not visible. I don’t know if there’s something else I can do to re-initialize the directive, but this works for now.

Here’s what I ended up with:

[code]<cal-heatmap ng-if="search.showDataDensity"
config="search.heatmap.config"></cal-heatmap>[/code]

For context, here’s an example of what the heatmap looks like with some example received signal data from 2014:

heatmap_example

 

 

Lesson #2:

Cal-HeatMap uses 5 CSS styles by default for coloring according to how many data points are for each date, from .q1 through .q5. For data with a count > that whatever the top range is set to, apparently css class .q6 gets applied. Without any other change however, if you haven’t defined your own .q6, these dates will be displayed as if there’s no data on those dates. This wasn’t obvious to me, but is discussed in this post on SO. Add a custom CSS class for .q6 with a slightly darker color than .q5, problem solved.

Summary

I’ve some polishing and cleanup to do, but my heatmap display is almost ready to go!