Data Vis Tutorial: D3 Bar Chart + NYC Open Data

Bar charts are some of the simplest forms of visualization and can be a good place to start when making your first vis. Inspired by some of the Socrata and D3 examples, I put together a simple chart showing the daily distribution of my dataset from the SODA tutorial (311 noise complaints, in zip code 11231, for September 2015). As you can see, most complaints occur on the weekend.

This chart is based on Mike Bostock’s simple bar chart tutorial, modified to read the relevant NYC Open Data. As with many visualizations of this type, its code is written in CSS, HTML, and JavaScript. This tutorial assumes proficiency with CSS and HTML, and a basic understanding of JavaScript.

CSS

In the style section, we set up the fills, strokes, fonts, and shape rendering styles for our bars and axes. We also specify that there will be no x-axis line.

.bar {
 fill: orange;
}

.bar:hover {
 fill: lightpink;
}

.axis {
 font: 10px sans-serif;
}

.axis path,
.axis line {
 fill: none;
 stroke: #000;
 shape-rendering: crispEdges;
}

.x.axis path {
 display: none;
}

HTML

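Our HTML body needs only two lines to load the JavaScript libraries D3.js and Moment.js. It also contains our custom JS code, which we’ll discuss below. Note that the code in this tutorial uses the D3 version 3 API (for example, d3.scale.ordinal and d3.svg.axis), so keep the 3.x script shown here.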

<script src="https://cdnjs.cloudflare.com/ajax/libs/d3/3.5.5/d3.min.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.10.6/moment.min.js"></script>

JavaScript

First, we declare and initialize our spatial variables: the margins, width, and height of the vis; the x and y scales; the x-axis and y-axis; and the SVG (Scalable Vector Graphics) element into which we’ll draw our vis.

Note: I’ll add more detail later about the D3 functions used to initialize these variables, and about the D3 functions used throughout the script. In the meantime, you can start with this info, and you can also drill down into the D3 documentation from Mike Bostock’s simple bar chart tutorial.

var margin = {top: 20, right: 20, bottom: 30, left: 40},
 width = 960 - margin.left - margin.right,
 height = 500 - margin.top - margin.bottom;

var x = d3.scale.ordinal()
 .rangeRoundBands([0, width], .1);

var y = d3.scale.linear()
 .range([height, 0]);

var xAxis = d3.svg.axis()
 .scale(x)
 .orient("bottom");

var yAxis = d3.svg.axis()
 .scale(y)
 .orient("left")
 .ticks(10);

var svg = d3.select("body").append("svg")
 .attr("width", width + margin.left + margin.right)
 .attr("height", height + margin.top + margin.bottom)
 .append("g")
 .attr("transform", "translate(" + margin.left + "," + margin.top + ")");
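
To make the arithmetic behind these declarations concrete, here is a small sketch in plain JavaScript (no D3 required). The helper name innerSize is made up for illustration and is not a D3 function; it only repeats the subtraction above and builds the same translate string that is applied to the g element.

```javascript
// Hypothetical helper (not part of D3): computes the drawing area that is
// left after the margins are subtracted from the outer SVG size.
function innerSize(outerWidth, outerHeight, margin) {
    return {
        width: outerWidth - margin.left - margin.right,
        height: outerHeight - margin.top - margin.bottom,
        // the same string the tutorial passes to .attr("transform", ...)
        transform: "translate(" + margin.left + "," + margin.top + ")"
    };
}

var exampleMargin = {top: 20, right: 20, bottom: 30, left: 40};
var inner = innerSize(960, 500, exampleMargin);
console.log(inner.width, inner.height, inner.transform);
// prints: 900 450 translate(40,20)
```

So the bars and axes are drawn in a 900 by 450 area, shifted right by 40 and down by 20 to leave room for the axis labels.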

The rest of the code loads the NYC Open Data CSV, parses it, processes it, and visualizes it.

Input it

Now we’ll call D3’s CSV function to load our NYC data. Notice that my query from Tutorial 1 has an additional parameter: “&$select=created_date.” I added it to return ONLY the created_date for each record, because this is the only field that matters for this visualization. If the data cannot be read, “if (error) throw error” throws the error and stops the script. Everything in the sections below runs inside this callback function.

d3.csv("https://data.cityofnewyork.us/resource/erm2-nwe9.csv?$where=starts_with(complaint_type,'Noise') AND created_date >='2015-08-01T00:00:00' AND incident_zip='11231' &$select=created_date", function (error, data) {
    if (error) throw error;
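
The query string above contains spaces and operators that browsers will usually encode for you. If you would rather build the URL explicitly, the sketch below shows one possible way. The helper name buildSodaUrl is hypothetical (it is not part of D3 or of the Socrata API); the endpoint and parameter values are the ones from the query above.

```javascript
// Hypothetical helper: joins SODA query parameters onto an endpoint,
// percent-encoding each value so spaces and operators are safe in a URL.
function buildSodaUrl(endpoint, params) {
    var parts = Object.keys(params).map(function (key) {
        return key + "=" + encodeURIComponent(params[key]);
    });
    return endpoint + "?" + parts.join("&");
}

var url = buildSodaUrl("https://data.cityofnewyork.us/resource/erm2-nwe9.csv", {
    "$where": "starts_with(complaint_type,'Noise') AND created_date >='2015-08-01T00:00:00' AND incident_zip='11231'",
    "$select": "created_date"
});
console.log(url);
```

The resulting string can be passed to d3.csv in place of the literal URL, which also makes it easier to swap in your own zip code or date.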

Parse and Process it

Now we’ll set up an array “week” to represent Sunday (represented as week[0]) through Saturday (represented as week[6]). We will iterate through our NYC dataset with a “forEach” loop, and use the array to count the occurrence of noise complaints for each day.

This is where we parse each date with Moment.js. Because this NYC Open Data CSV does NOT output dates in ISO 8601 format, we pass Moment an explicit format string. We then use Moment’s “day()” function, which returns 0 for Sunday through 6 for Saturday, to get the day of the week.

var week = [0, 0, 0, 0, 0, 0, 0]; 

data.forEach(function (d) {
    // day() returns 0 (Sunday) through 6 (Saturday), matching the indexes of week
    var day = moment(d["Created Date"], "M/DD/YYYY hh:mm:ss a").day();
    // skip rows whose date could not be parsed (day() is NaN for an invalid date)
    if (day >= 0 && day <= 6) {
        week[day]++;
    }
});
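
If you want to test the counting logic on its own, without loading Moment.js or the live data, a rough stand-in looks like this. It swaps Moment for a regular expression and the built-in Date object, and it assumes timestamps shaped like "09/05/2015 11:42:10 PM". The function names and the sample rows are made up for illustration; they are not real 311 records.

```javascript
// Hypothetical parser: turns "MM/DD/YYYY hh:mm:ss AM" into a local Date,
// or returns null when the text does not match that shape.
// It does no range checking (for example, it will not reject month 13).
function parseCreatedDate(text) {
    var m = /^(\d{1,2})\/(\d{1,2})\/(\d{4}) (\d{1,2}):(\d{2}):(\d{2}) (AM|PM)$/i.exec(text);
    if (!m) return null;
    var hour = Number(m[4]) % 12;                 // 12 AM becomes hour 0
    if (m[7].toUpperCase() === "PM") hour += 12;  // 1 PM becomes hour 13
    return new Date(Number(m[3]), Number(m[1]) - 1, Number(m[2]),
                    hour, Number(m[5]), Number(m[6]));
}

// Counts rows per weekday: index 0 is Sunday, index 6 is Saturday.
function countByWeekday(rows) {
    var counts = [0, 0, 0, 0, 0, 0, 0];
    rows.forEach(function (d) {
        var date = parseCreatedDate(d["Created Date"]);
        if (date) counts[date.getDay()]++;        // unparseable rows are skipped
    });
    return counts;
}

// Made-up sample rows, for illustration only.
var sampleRows = [
    {"Created Date": "09/05/2015 11:42:10 PM"},   // a Saturday
    {"Created Date": "09/06/2015 01:15:00 AM"},   // a Sunday
    {"Created Date": "09/12/2015 10:05:33 PM"},   // a Saturday
    {"Created Date": "not a date"}
];
var sampleWeek = countByWeekday(sampleRows);
console.log(sampleWeek.join(","));
// prints: 1,0,0,0,0,0,2
```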

I threw in a few debug statements to check the contents of the final array, as well as check the day with the largest number of complaints. If you’re modifying this tutorial with your own data, these debug statements may be useful to you.

console.log("week:");
for (var index = 0; index < week.length; index++) {
    console.log(week[index]);
}
console.log("most complaints in a day:", Math.max.apply(Math, week));
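
The statements above print the largest count but not which day it belongs to. If that would help while you debug, a small addition along these lines could report the day name as well. The helper indexOfMax and the counts in exampleWeek are made up for illustration.

```javascript
var dayNames = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];

// Returns the index of the largest value in counts (the first one if tied).
function indexOfMax(counts) {
    var best = 0;
    for (var i = 1; i < counts.length; i++) {
        if (counts[i] > counts[best]) best = i;
    }
    return best;
}

// Made-up counts for illustration only, not real 311 data.
var exampleWeek = [12, 3, 4, 2, 5, 9, 15];
var busiest = indexOfMax(exampleWeek);
console.log("busiest day:", dayNames[busiest], "with", exampleWeek[busiest], "complaints");
// prints: busiest day: Saturday with 15 complaints
```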

Note: If you’re using Chrome and running the code for this bar chart as a browser page, you can see the output for these statements in View > Developer > JavaScript Console. Firefox and Safari have similar JavaScript consoles.

Visualize it!

Now it’s time to visualize the array. First, we’ll set up our domains: the sets of input values that our axes must span. Domain values will be displayed as labels on our axes.

  • For the y-axis, the domain runs from 0 to the largest number of complaints counted for any single day of the week.
  • For the x-axis, the domain is the days of the week.

y.domain([0, Math.max.apply(Math, week)]);
x.domain(["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"]);
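
To see what the y domain does once it is paired with the range [height, 0] from our variable declarations, here is a simplified stand-in for a linear scale in plain JavaScript. It is not D3’s implementation, only a straight-line interpolation that illustrates the mapping. The function name makeLinearScale and the counts are made up for illustration.

```javascript
// Simplified stand-in for a linear scale (not D3's implementation):
// maps a value from the domain onto the range by straight-line interpolation.
function makeLinearScale(domain, range) {
    return function (value) {
        var t = (value - domain[0]) / (domain[1] - domain[0]);
        return range[0] + t * (range[1] - range[0]);
    };
}

var chartHeight = 450;                        // 500 minus the top and bottom margins
var exampleCounts = [12, 3, 4, 2, 5, 9, 16];  // made-up counts, not real data
var maxCount = Math.max.apply(Math, exampleCounts);
var yScale = makeLinearScale([0, maxCount], [chartHeight, 0]);

console.log(yScale(0));        // 450: a count of zero sits on the baseline
console.log(yScale(maxCount)); // 0: the largest count reaches the top
console.log(yScale(8));        // 225: half the maximum lands halfway up
```

The range is written as [height, 0] rather than [0, height] because SVG y coordinates grow downward, so larger counts need smaller y values.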

Next we will draw the x-axis.

svg.append("g")
    .attr("class", "x axis")
    .attr("transform", "translate(0," + height + ")")
    .call(xAxis);

Now we’ll draw the y-axis. We will label this axis “Number of Complaints.”

svg.append("g")
    .attr("class", "y axis")
    .call(yAxis)
    .append("text")
    .attr("transform", "rotate(-90)")
    .attr("y", 6)
    .attr("dy", ".71em")
    .style("text-anchor", "end")
    .text("Number of Complaints");

Finally, we will draw a vertical bar for each day of the week. We iterate through the array week for this process.

  • To determine the x position, we use each element’s index in week to look up the corresponding value of x.domain (which will be Sunday or another day of the week) and pass it to the x scale. The width of each bar comes from x’s “rangeBand,” which follows from the rangeRoundBands call in our variable declarations.
  • To determine the y position, we pass the value of week’s element to the y scale, which gives the top of the bar. The height of the bar is the total height of the vis minus that scaled value.

svg.selectAll(".bar")
    .data(week)
    .enter().append("rect")
    .attr("class", "bar")
    .attr("x", function (d, i) {return x(x.domain()[i]);})
    .attr("width", x.rangeBand())
    .attr("y", y)
    .attr("height", function (d) {return height - y(d);});

}); // closes the d3.csv callback opened in the "Input it" step
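
For intuition about where the x positions and bar widths come from, the sketch below lays out seven bands across 900 pixels with a padding of 0.1, the same numbers used in this tutorial. It is a simplified approximation of the kind of calculation rangeRoundBands performs, written in plain JavaScript, and it is not D3’s source code. The function name roundBands is made up for illustration, and it assumes the outer padding equals the inner padding.

```javascript
// Simplified sketch of a rounded band layout (an approximation for
// illustration, not D3's source code). Assumes outer padding == padding.
function roundBands(names, start, stop, padding) {
    var n = names.length;
    // one step = one band plus the gap that follows it
    var step = Math.floor((stop - start) / (n - padding + 2 * padding));
    // center the bands by splitting the leftover space between both ends
    var offset = start + Math.round((stop - start - (n - padding) * step) / 2);
    var positions = {};
    names.forEach(function (name, i) {
        positions[name] = offset + i * step;
    });
    return {positions: positions, bandWidth: Math.round(step * (1 - padding))};
}

var days = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];
var layout = roundBands(days, 0, 900, 0.1);
console.log(layout.bandWidth);           // 113
console.log(layout.positions.Sunday);    // 15
console.log(layout.positions.Monday);    // 141
console.log(layout.positions.Saturday);  // 771
```

Each bar starts at its day’s position and is bandWidth pixels wide, so the gap between neighboring bars is the step (126) minus the band width (113).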

Checking and Troubleshooting

You can fork and fiddle this vis using JSFiddle. Because I will require a standalone file from you for the midterm, it might be helpful to see the vis as a standalone HTML file. You can copy the file to your own server and modify the code with your own data. From there, you can debug and view console output in View > Developer > JavaScript Console (if you’re using Chrome).